Modes of communication continued - RPC
When discussing communication patterns in relation to Kafka, we focused predominantly on messaging. There are other paradigms, however, and one of the most common is the Remote Procedure Call (RPC).
RPC is a technique where one computer program executes a procedure of another, remotely, as if it were a normal local procedure call. The paradigm is more abstract than messaging: it may be implemented on top of messaging, but the programmer does not explicitly implement the details of the remote interaction between the programs. Of course, this holds only within reason. A remote call will never behave exactly like a local one (it can, for instance, fail because of the network, or take much longer), but the result of a successful call is the same.
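To make the "looks like a local call" idea concrete, here is a minimal sketch of a client stub. Everything in it is hypothetical and for illustration only: the names `CalculatorStub`, `handle_request` and `PROCEDURES` are invented, JSON stands in for a real wire format, and the "network" is just a function call within the same process.

```python
import json

# "Remote" side: ordinary procedures plus a dispatcher that decodes a request,
# runs the named procedure, and encodes the reply.
PROCEDURES = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
}


def handle_request(raw: bytes) -> bytes:
    request = json.loads(raw)
    result = PROCEDURES[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode()


# "Local" side: a stub that hides encoding and transport behind methods
# that look like any other local method.
class CalculatorStub:
    def __init__(self, transport):
        # transport: any callable taking request bytes and returning reply bytes.
        # Here it is an in-process function; in reality it would be a network hop.
        self._transport = transport

    def _call(self, method, *args):
        raw = json.dumps({"method": method, "args": args}).encode()
        reply = self._transport(raw)
        return json.loads(reply)["result"]

    def add(self, a, b):
        return self._call("add", a, b)

    def sub(self, a, b):
        return self._call("sub", a, b)


if __name__ == "__main__":
    calc = CalculatorStub(transport=handle_request)
    print(calc.add(2, 3))
    print(calc.sub(10, 4))
```

The caller only ever sees `calc.add(2, 3)`. Serialization and transport live inside the stub, which is the part an RPC framework would normally generate for us.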
RPC is quite old, having first been conceptualized all the way back in the 1960s, although it would take another twenty years for practical implementations to start popping up. Another fifteen or so years later, when Object Oriented Programming became the whole new sensation sweeping the nation (and the world at large), the concept was extended into Remote Method Invocation, which, after we sobered up from the drunken stupor of Object Oriented Programming all the way (tm), largely fell out of popularity, being significantly less versatile and less portable between languages.
Over the years, a number of implementations have popped up. At Braiins, we are mostly concerned with Google's gRPC. We have already partially investigated the topic in one of our Rust chapters; feel free to revisit Protobufs and gRPC.
There, we reiterated that gRPC is very tightly coupled with Protocol Buffers. Unlike communication over Kafka, which prescribes no particular way to format our data and leaves more of the configuration up to us, gRPC is quite prescriptive. Protocol Buffers are its default format and, for good reason, the one you will encounter practically everywhere. gRPC can technically carry other encodings, but nearly all tooling assumes protobufs.
This ensures compatibility, and it has some implications with regard to the evolution of interfaces, which we shall discuss later.
Use of RPC
The major use case of RPC is to establish direct communication between a server and a client, without the need for any intermediary. Whereas with Kafka our architecture is fairly centralized (on a conceptual level, as we know that we can scale the cluster horizontally), here there is no central broker that we have to go through. Nodes can form a peer-to-peer network in which they talk to each other directly as needed.
Furthermore, there is a slight conceptual gap between RPC and messaging. In messaging, we dealt mostly with events or notifications. With RPC, we execute a remote call, which may well serve to perform an action on the remote host rather than to hand over data for processing and storage, or to announce that an event has occurred.
Modern RPC implementations have many features which let us get quite sophisticated with our inter-program communication.
To kick off these chapters, let us recall an example of a gRPC service from our Rust chapter:
syntax = "proto3";

package calc;

message CalcInput {
    int32 a = 1;
    int32 b = 2;
}

message CalcOutput {
    bool is_error = 1;
    int32 result = 2;
}

service Calculator {
    rpc Add (CalcInput) returns (CalcOutput);
    rpc Sub (CalcInput) returns (CalcOutput);
    rpc Div (CalcInput) returns (CalcOutput);
    rpc Mul (CalcInput) returns (CalcOutput);
}
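To show what a service behind this definition might actually do, here is a hand-written sketch of the handler logic in plain Python. It is not code generated by any protobuf or gRPC tooling. The classes merely mirror the messages above, and the decisions about when to set `is_error` (division by zero, results outside the `int32` range) and about rounding division toward zero are assumptions made for the example.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


@dataclass
class CalcInput:
    a: int
    b: int


@dataclass
class CalcOutput:
    is_error: bool
    result: int = 0


def _checked(value: int) -> CalcOutput:
    # The proto fields are int32, so anything outside that range is an error
    # (assumed policy for this sketch).
    if INT32_MIN <= value <= INT32_MAX:
        return CalcOutput(is_error=False, result=value)
    return CalcOutput(is_error=True)


class Calculator:
    """Plain-Python stand-in for the Calculator service (no networking)."""

    def add(self, req: CalcInput) -> CalcOutput:
        return _checked(req.a + req.b)

    def sub(self, req: CalcInput) -> CalcOutput:
        return _checked(req.a - req.b)

    def mul(self, req: CalcInput) -> CalcOutput:
        return _checked(req.a * req.b)

    def div(self, req: CalcInput) -> CalcOutput:
        if req.b == 0:
            return CalcOutput(is_error=True)
        # Integer division truncating toward zero (assumed semantics).
        quotient = abs(req.a) // abs(req.b)
        if (req.a < 0) != (req.b < 0):
            quotient = -quotient
        return _checked(quotient)


if __name__ == "__main__":
    calc = Calculator()
    print(calc.add(CalcInput(2, 3)))
    print(calc.div(CalcInput(1, 0)))
```

Note that signalling failure through an `is_error` field is a choice made by this particular interface. gRPC also has its own status codes, which are touched upon further on.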
Much like Kafka, gRPC is one of the technologies that help us transform our large monolithic applications into smaller, more manageable microservices.
The cons of using gRPC
However, unlike Kafka, gRPC has no persistence of communication, and so we might lose data. Such loss would typically occur when one side of the connection is unavailable. An improperly designed service could crash or stop functioning if its gRPC counterpart is no longer available (and a properly designed one might do so deliberately, if that is the better behavior given the severity of the failure).
In Kafka, messages would simply continue to be appended to the topic, ready to be processed when a consumer starts accepting messages again. Keep this in mind when designing your application.
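The difference can be sketched with a toy model. Nothing below talks to Kafka or gRPC. `Service`, `send_direct` and `Buffer` are hypothetical names, and the "broker" is just an in-memory queue that illustrates the delivery semantics described above.

```python
from collections import deque


class PeerUnavailable(Exception):
    pass


class Service:
    def __init__(self):
        self.online = True
        self.processed = []

    def handle(self, message):
        if not self.online:
            raise PeerUnavailable("service is down")
        self.processed.append(message)


def send_direct(service, message) -> bool:
    """RPC-style delivery: the message arrives now or not at all."""
    try:
        service.handle(message)
        return True
    except PeerUnavailable:
        # Nothing stored the message for us; what happens next is up to the caller.
        return False


class Buffer:
    """Broker-style delivery: messages wait until the consumer is back."""

    def __init__(self):
        self.pending = deque()

    def publish(self, message):
        self.pending.append(message)

    def drain(self, service):
        while self.pending and service.online:
            service.handle(self.pending.popleft())


if __name__ == "__main__":
    service = Service()
    service.online = False
    buffer = Buffer()
    print(send_direct(service, "m1"))   # the direct call fails
    buffer.publish("m2")                # the buffered message waits
    service.online = True
    buffer.drain(service)
    print(service.processed)
```

With the direct call, the sender has to decide what to do about the failure (retry, drop the message, or fail itself). With the buffer, the message simply waits.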
Furthermore, gRPC is a bit problematic on the web browser side, as it depends heavily on features of the HTTP/2 protocol at a level lower than browsers expose to web pages. Proxies exist for usage on the web (gRPC-Web being the usual approach), but of course, limitations apply.
Lastly, there is no consistent error handling. While gRPC describes the concept of a status code and message, there is no clear and consistent way to properly catch the errors across programming languages. While there are recommendations and guides related to error-handling, no universal consensus exists. Error handling is discussed in a later chapter.
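As an illustration of the "status code and message" concept, here is one possible way to surface a status as an exception in Python. The numeric values match the codes defined by gRPC (only a subset is listed), but the `Status`, `RpcError` and `unwrap` names are hypothetical helpers written for this sketch, not part of any gRPC library.

```python
from dataclasses import dataclass
from enum import IntEnum


class StatusCode(IntEnum):
    # A subset of the gRPC status codes, with their standard numeric values.
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    UNAVAILABLE = 14


@dataclass
class Status:
    code: StatusCode
    message: str = ""


class RpcError(Exception):
    def __init__(self, status: Status):
        super().__init__(f"{status.code.name}: {status.message}")
        self.status = status


def unwrap(status: Status, value):
    """Return the call's value on OK, otherwise raise an RpcError."""
    if status.code is StatusCode.OK:
        return value
    raise RpcError(status)


if __name__ == "__main__":
    print(unwrap(Status(StatusCode.OK), 42))
    try:
        unwrap(Status(StatusCode.NOT_FOUND, "no such user"), None)
    except RpcError as err:
        print(err)
```

Other languages would map the same status onto their own idiom (a `Result` type, an error return value, and so on), which is exactly where the inconsistency comes from.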
Performance and bandwidth
gRPC streams consume relatively little bandwidth and perform well. Furthermore, since gRPC is more peer-to-peer and does not have to go through a broker, the bandwidth used for communication between two services is less likely to affect communication between a different pair of services.
This could happen with Kafka if you have only one broker, or enough traffic to fully saturate the connections of all of your brokers. With gRPC, it is less likely that two pairs of services would have their connections routed through the same machine.
gRPC is also better suited as a public API (although even there, better options exist). With gRPC, accepting thousands of connections or more is fairly unproblematic. It would be much harder to set up Kafka so that tens of thousands of external clients could connect to it safely, without access issues and other unsavory things.
Deadlines and cancelling
To ensure reliable throughput, gRPC supports the concepts of deadlines and cancelling.
A deadline is essentially a timeout for a particular call (and you may hear it referred to as such). By default, gRPC calls have no deadline, so a call may wait indefinitely. The deadline is sent to the service along with the gRPC call and is tracked independently by both the client and the service. It is therefore possible for a call to complete on the server and still fail on the client, because the deadline has passed by the time the response arrives.
If a deadline is exceeded, the client immediately aborts the underlying HTTP request and reports a deadline-exceeded status, while the server aborts the request it is executing and signals cancellation to the handler.
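The following sketch imitates this behavior with threads instead of a network. It is a simplified model and not the gRPC API: `slow_handler` and `call_with_deadline` are hypothetical names, and both sides share one process clock here, whereas a real client and server each track the deadline on their own machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class DeadlineExceeded(Exception):
    pass


def slow_handler(work_seconds: float, deadline: float) -> str:
    """Server side: tracks the deadline itself and stops working once it has passed."""
    finish_at = time.monotonic() + work_seconds
    while time.monotonic() < finish_at:
        if time.monotonic() >= deadline:
            raise DeadlineExceeded("server gave up")
        time.sleep(0.005)  # pretend to do a slice of work
    return "done"


def call_with_deadline(pool, work_seconds: float, timeout_seconds: float) -> str:
    """Client side: sends the deadline with the call and also enforces it locally."""
    deadline = time.monotonic() + timeout_seconds
    future = pool.submit(slow_handler, work_seconds, deadline)
    try:
        remaining = max(0.0, deadline - time.monotonic())
        return future.result(timeout=remaining)
    except FutureTimeout:
        # The client stops waiting even if the server is still busy.
        raise DeadlineExceeded("client gave up") from None


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(call_with_deadline(pool, work_seconds=0.01, timeout_seconds=2.0))
        try:
            call_with_deadline(pool, work_seconds=1.0, timeout_seconds=0.05)
        except DeadlineExceeded as err:
            print("deadline exceeded:", err)
```

Because the handler checks the deadline on its own, it stops doing work shortly after the client has given up instead of running to completion for a caller that is no longer listening.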
gRPC clients may also choose to cancel long-running calls when they are no longer needed. This is important in calls that use streams with no set bound, as they would otherwise be essentially running endlessly.
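Cancellation of an unbounded stream can be sketched in a similar spirit. The example below is again a toy model and not the gRPC API: `ticker` is a hypothetical server-streaming handler, and a `threading.Event` plays the role of the cancellation signal that the client would send.

```python
import itertools
import threading


def ticker(cancelled: threading.Event):
    """Streaming handler with no natural end: yields numbers until cancelled."""
    for n in itertools.count():
        if cancelled.is_set():
            return  # stop producing as soon as the client has cancelled
        yield n


def consume_until(limit: int) -> list:
    """Client: reads from the stream and cancels once it has enough items."""
    cancelled = threading.Event()
    received = []
    for value in ticker(cancelled):
        received.append(value)
        if len(received) == limit:
            cancelled.set()  # the client no longer needs the stream
    return received


if __name__ == "__main__":
    print(consume_until(3))
```

Without the cancellation check, `ticker` would keep producing values forever, which is the "essentially running endlessly" situation described above.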