Application inner workings - Communication

Speaking bubble — Even applications need to communicate a lot with one another.

This is the first article on some aspects of web development that are not directly linked to coding, and that many developers are already aware of. However, I often need to talk to people who handle the more “business” side of the development process, and I regularly see that they do not know some of these items, which are quite basic practices in the developer’s eyes. Mind you, it’s also not their job to know. If they do make the effort to learn it, however, great credit to them, and this is why I wrote this article. I am hoping that it will be helpful to business analysts who want to know more about web development, so that they immediately earn the respect of the developers they work with, and so that very patient translators are no longer required.

This article will cover the way that applications communicate with one another, by briefly going into the three most common forms of application communication. What is asynchronous messaging? What are webservices? Kafka - is it a writer, or also some new shiny gadget? And so on.

Need for communication
Common language
Synchronous communication
Asynchronous communication
Web sockets

Need for communication

Right, first off, we need to handle the first, basic question - why does an application need to communicate with third parties in the first place?

Well, it’s fairly simple. Applications should do what they are built for. For example, the application I’m working on has as its primary goal to facilitate money management. This entails having the option to add multiple types of investments, accounts, etc. For this goal, the application won’t need any help from outside.

There are, however, parts that are crucial to making the application usable but that are not directly under its control. For instance, the application will allow accounts in different currencies, yet users may want to see their net worth in their preferred currency, hence a conversion is required. Since foreign exchange prices fluctuate constantly, these values cannot simply be hardcoded. Therefore, the application will request that information from some third party, for example the World Bank. That communication needs to happen almost instantaneously, otherwise the users will flock in droves to some other app that’s more performant.

Data enrichment will also often require API calls to different applications - often even in the same company. In a microservice architecture, API communication becomes even more crucial than it would in a typical monolithic application.

So, now that the requirement for communication is established, I will quickly go over the options that are available.

Common language

I already noted the importance of common language in my article on DDD. This is not only the case for multiple developers that are working on the same codebase. It is also especially important when you are talking to non-developers, who may not know that several words actually mean the exact same thing, but who may be hesitant to ask in order to not seem ignorant. I have already often seen anxious faces when people ask about some REST interface, and the reply from the developer is something along the lines of “We have a webservice for this”. It is the exact same thing, yet the developer has lost the attention of the person he was communicating with. So, in order to avoid this, I will specify whenever some different phrases are basically just synonyms.

Another such example is when someone discusses interfaces between applications, and someone else mentions APIs. They are the exact same thing, obviously, right? Well, for people who don’t know that API stands for Application Programming Interface, it may not be that obvious, so always pay attention to your audience.

Synchronous communication

What does synchronous mean?

Okay, let’s start with the simplest form - synchronous communication.

While synchronous technically means “happening at the same time”, this is not really the case in application communication. However, it’s not far from it. Essentially, synchronous communication stands for the fact that the client (one application / the requestor) makes a request, and then waits until the server (another application / the requestee) answers with a response. The client is waiting until that response has come back, and will not go to the next line of code before that has happened.

This is the most straightforward way of communication, because it will allow logic to go from A to B, then C, then D, and so on. Everything is nicely sequential.

It’s also important to note that any communication happens point-to-point. One request goes to one destination.

Of course, if the response does not come in time, then the request is aborted. This is called a timeout, and should be handled appropriately.

Similarly, if the request has, for whatever reason, not made it to the server, the client has the responsibility to send the request again.
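To make the timeout and the resending a bit more tangible, here is a minimal Python sketch. The names fetch and call_with_retry are made up for this illustration, and the two-second timeout and three attempts are arbitrary assumptions rather than recommendations.

```python
import urllib.request


def fetch(url: str, timeout_seconds: float = 2.0) -> bytes:
    """Synchronous GET: this call blocks until the server answers or the timeout hits."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read()


def call_with_retry(send, attempts: int = 3):
    """Run send() and try again if it fails, up to the given number of attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return send()  # the client waits right here for the response
        except OSError as error:
            # OSError covers timeouts and connection failures
            # (and, with urlopen, HTTP error statuses as well).
            last_error = error
    raise last_error  # every attempt failed, so the caller has to handle it
```

A call could then look like call_with_retry(lambda: fetch("https://myapp.com/persons/42/addresses")), where the id 42 is again just a placeholder.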

An analogy to synchronous communication would be making a phone call. You call someone (the request), and wait until that person picks up the phone (the response), or until you hang up (the timeout). The feedback is immediate however - you do not need to guess whether the information truly has arrived.

The most common examples of synchronous HTTP communication include REST (Representational State Transfer) and SOAP (Simple Object Access Protocol). Since REST is now considered the standard, I will not cover SOAP.

What is a Webservice? What is REST?

A web service is an API that must be accessed over a network (typically the internet). In a web service, you have two entities - the client and the server. REST is a type of web service, but not every web service is REST. Generally speaking though, the two terms are used almost interchangeably.

REST is, as we’ve now defined, a type of web service. It is the most commonly used in modern applications, and almost exclusively the communication method from any browser (no matter whether you use Chrome, Opera, Edge, Brave, whatever) to servers.

REST communication (usually) uses JSON (JavaScript Object Notation). The example below contains two simple values, a list of entries (phoneNumbers) and a nested object (address). Note that JSON itself does not allow comments.

{
  "name": "Hello",
  "lastname": "World",
  "phoneNumbers": [
    "12345678",
    "98765432"
  ],
  "address": {
    "city": "Barcelona",
    "streetName": "Avenida something"
  }
}

The part of the request that is in JSON is called the body. A request body is the JSON sent in the request, and a response body is the JSON sent back in the response. Neither is mandatory by default - it all depends on the API definition.
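For the curious, this is roughly how an application reads and writes such a body. It is a small Python sketch using the standard json module; the variable names are my own, and the body is the example person from above.

```python
import json

# A response body as it would arrive over the wire: plain text in JSON syntax.
response_body = """
{
  "name": "Hello",
  "lastname": "World",
  "phoneNumbers": ["12345678", "98765432"],
  "address": {"city": "Barcelona", "streetName": "Avenida something"}
}
"""

person = json.loads(response_body)       # JSON text -> Python dictionary
first_phone = person["phoneNumbers"][0]  # list entries are accessed by position
city = person["address"]["city"]         # nested objects are accessed by key

# The other direction: build a request body from a dictionary.
request_body = json.dumps({"name": "Hello", "lastname": "World"})
```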

There are several items that define each REST endpoint (the place that’s called to make a request).

The URI
The HTTP method
The media type

The combination of these three attributes must be unique. If the API is following RESTful principles, one can often even already guess what happens at each endpoint.

REST call - the URI (also called the endpoint)

First off, you may be confused now. What the hell is a URI? Well, any URL is a URI, but not every URI is a URL. They are often used interchangeably though, so I will not get too technical to explain the difference here.

The URI consists of several parts again. For example, the API definition may show a URI such as https://myapp.com/persons/{personId}/addresses?zip-code=8000. Let’s go into detail what all these parts are:

https:// - This is the protocol, so here it’s a secured HTTP connection
myapp.com - The host / the human-readable version of the physical server
/persons/{personId}/addresses - the path that tells the application where the request should go, once it’s already on the server
{personId} - The curly braces indicate that this is a variable, so its value can change on every request
?zip-code=8000 - some additional (usually optional) query parameters that help the filtering. Query parameters are separated by &.
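The same breakdown can be done by code. Below is a small Python sketch using the standard urllib.parse module on the example URI; the person id 42 is a made-up value that fills in the variable.

```python
from urllib.parse import parse_qs, urlsplit

template = "https://myapp.com/persons/{personId}/addresses?zip-code=8000"
url = template.format(personId=42)  # 42 is a made-up id for illustration

parts = urlsplit(url)
protocol = parts.scheme        # "https"
host = parts.netloc            # "myapp.com"
path = parts.path              # "/persons/42/addresses"
query = parse_qs(parts.query)  # {"zip-code": ["8000"]}
```

Each query parameter comes back as a list, because the same parameter is allowed to appear more than once in a URI.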

REST call - HTTP methods

Most REST APIs follow the CRUD convention. It stands for Create, Read, Update, Delete. There are more than 4 HTTP methods however.

GET: The most “basic” method, as these endpoints are typically only defined by their URI, and no request bodies are required. GET requests are used to get information.
POST: The method typically used to create new resources. POST is not idempotent, and, as such, making the same POST request twice will result in two created entities.
PUT: This method typically updates resources. It often does not provide any response body. It is idempotent, so performing the same request 50 times will still give the same result.
DELETE: Probably don’t need to really say what this does…
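The word idempotent is easier to grasp with a tiny example. The sketch below is not a real server, just an in-memory Python stand-in (the class name PersonStore and its methods are invented here) that behaves the way the four methods typically do.

```python
import itertools


class PersonStore:
    """In-memory stand-in for a server that exposes /persons."""

    def __init__(self):
        self._persons = {}
        self._next_id = itertools.count(1)

    def post(self, body: dict) -> int:
        """Create: every call stores a new person under a fresh id."""
        person_id = next(self._next_id)
        self._persons[person_id] = dict(body)
        return person_id

    def get(self, person_id: int) -> dict:
        """Read: returns the stored person without changing anything."""
        return self._persons[person_id]

    def put(self, person_id: int, body: dict) -> None:
        """Update: overwrites the whole person, so repeating it changes nothing."""
        self._persons[person_id] = dict(body)

    def delete(self, person_id: int) -> None:
        """Delete: removes the person if it is still there."""
        self._persons.pop(person_id, None)

    def count(self) -> int:
        return len(self._persons)


store = PersonStore()
first_id = store.post({"name": "Hello"})
second_id = store.post({"name": "Hello"})   # same request, second entity
for _ in range(50):
    store.put(first_id, {"name": "World"})  # same request, same end state
```

After this runs, the store holds two persons (because of the two POSTs), and the first one is simply named "World", no matter how often the PUT was repeated.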

Some other, less common methods are:

PATCH: Also an update method, but usually rather to fill in the blanks, as opposed to the full override of a PUT.
OPTIONS: More used in frameworks, or to check which methods are allowed on an endpoint.
HEAD: Never actually seen this in use. It’s the same as GET, except without the response body, so purely for meta information.

Also note that whenever you enter a URL into your browser, it actually makes a GET request whenever you press on Enter.

Any REST communication can be verified in the DevTools of most browsers, which you can access by pressing Ctrl + Shift + I or F12, or by right-clicking and choosing Inspect. Under the Network tab of the DevTools, you can see any request that happens after you have opened the DevTools.

DevTools in Brave — The DevTools are full of helpful clues for debugging and testing!

After clicking on any request, you can see the response in the “response” tab, and the request values under the strangely named “headers” tab. Whenever you log a bug, if you can, attach the DevTools for the failed request in the evidence screenshot. Trust me, your developer colleague will be very grateful.

REST call - media types

This is less fun, but still important. You can technically have several endpoints with the same URI and HTTP method, but one would accept JSON, whereas the other would accept simple text. This is called the media type, or MIME type. There are quite a few; here are just some examples:

text/plain
text/html
application/json
multipart/form-data
image/jpeg
image/gif
… (really, there are soooo many)

There you go. The combination of all three is unique, all over the internet!
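One way to picture why the combination has to be unique is a lookup table on the server, keyed by all three attributes. The Python sketch below is a simplified illustration, not how any particular framework does it; the path /notes and the function names register and dispatch are hypothetical.

```python
from typing import Callable, Dict, Tuple

EndpointKey = Tuple[str, str, str]  # (HTTP method, path, media type)
routes: Dict[EndpointKey, Callable[[bytes], str]] = {}


def register(method: str, path: str, media_type: str, handler) -> None:
    """Add an endpoint; the same combination cannot be defined twice."""
    key = (method, path, media_type)
    if key in routes:
        raise ValueError(f"endpoint already defined: {key}")
    routes[key] = handler


def dispatch(method: str, path: str, media_type: str, body: bytes) -> str:
    """Find the one handler that matches all three attributes and call it."""
    return routes[(method, path, media_type)](body)


# Same URI and same method, but two different media types.
register("POST", "/notes", "application/json", lambda body: "parsed as JSON")
register("POST", "/notes", "text/plain", lambda body: "stored as plain text")
```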

Asynchronous communication

The counterpart to synchronous communication is of course asynchronous communication. The most typical use case is a message queue, or MQ for short. Lately however, Kafka is becoming very popular, for some reasons I will delve into later.

Keep in mind that this kind of asynchronous messaging is only used between backends, so from server to server. The browser will never write to a message queue or to Kafka directly.

What does asynchronous mean?

In short, asynchronous means that communication happens without waiting for a reply. So the application will send something, and immediately continue its processing, without any regard of what the destination will do. The destination can process the information on its own time, it’s not our concern anymore. This is why asynchronous communication is often described as fire and forget.

In our analogy of before, with the phone call being synchronous, sending a text message would be the asynchronous counterpart. While it may be true that too many of us actually do wait a long time for a reply, that’s not the required behavior. ;)

A major difference is also that in asynchronous communication, the message does not get lost when the destination is unavailable. If the destination is down, for whatever reason, the message will simply wait in the queue until it is picked up.

Just to confuse you a little - web services can technically be asynchronous also. However, it’s not the web service request itself that is asynchronous, but instead how the application handles it. If the application performs the REST call in a separate thread, it can continue its business on the main thread. The newly spun up thread WILL wait for a response though - always!
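Here is what that looks like in a short Python sketch. The function slow_rest_call is a stand-in I invented: it only sleeps for a moment to play the role of a real REST call, so nothing actually goes over the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_rest_call() -> str:
    """Stand-in for a blocking REST call; the sleep plays the role of network latency."""
    time.sleep(0.2)
    return "response received"


def run_demo() -> list:
    events = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_rest_call)    # the worker thread starts waiting
        events.append("main thread continues")  # the main thread is not blocked
        events.append(future.result())          # collect the answer once it is needed
    return events
```

The worker thread still sits and waits for its response, exactly like a synchronous client would. Only the main thread is free to carry on in the meantime.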

What are MQs?

A message queue is essentially a funnel, where one application writes messages (bodies of text, in JSON, XML, or some other syntax), and another application listens to the queue to read them. If the listener fails to process the message, it is usually put on a fail queue, or error queue, where it can be requeued once it has been determined why the message has failed (or deleted/archived, of course).

MQs do not operate on the HTTP protocol, but instead on dedicated messaging protocols, such as AMQP in the case of RabbitMQ.

Some notable MQ brokers include:

RabbitMQ (open source)
ActiveMQ (open source)
IBM MQ (enterprise)

Once a message has been read and processed successfully, it is removed from the queue.
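The flow can be mimicked in a few lines of Python. Be aware that this is only a sketch: the standard queue.Queue lives inside a single program and stands in for a real broker, and the names main_queue, error_queue, send and consume_all are my own.

```python
import json
import queue

main_queue = queue.Queue()
error_queue = queue.Queue()


def send(message: str) -> None:
    """Producer: put the message on the queue and move on (fire and forget)."""
    main_queue.put(message)


def consume_all(handle) -> None:
    """Listener: take each message off the queue; failures go to the error queue."""
    while not main_queue.empty():
        message = main_queue.get()  # reading removes it from the main queue
        try:
            handle(json.loads(message))
        except Exception:
            # Kept aside so it can be inspected and requeued later.
            error_queue.put(message)


processed = []
send('{"orderId": 1}')
send("this is not valid JSON")
consume_all(processed.append)
```

Afterwards the first message has been processed, the broken one sits on the error queue, and the main queue is empty again.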

MQs can also operate point-to-point (one sender, one destination), or they can be divided into topics, where multiple applications listen to the same message. This is called Pub/Sub, as one application publishes messages, and other applications subscribe to messages that belong to a certain topic.

Pub/Sub MQ — The Pub/Sub mechanism of queues
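As a rough illustration of Pub/Sub, here is a toy broker in Python. It is an assumption-laden sketch: the class TopicBroker and the topic names are invented, and the subscribers are called directly, whereas a real broker would deliver the messages over the network and in the background.

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-memory broker: every subscriber of a topic gets its own copy."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message) -> None:
        for callback in self._subscribers[topic]:
            callback(message)


broker = TopicBroker()
billing_inbox, shipping_inbox = [], []
broker.subscribe("orders", billing_inbox.append)
broker.subscribe("orders", shipping_inbox.append)

broker.publish("orders", {"orderId": 1})       # both subscribers receive it
broker.publish("payments", {"paymentId": 7})   # nobody subscribed, nobody receives it
```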

Kafka - Why is it better than MQ?

One could argue that point-to-point MQs, where a response is expected on some reply queue, are essentially synchronous also - not in their design, but in their usage. With Pub/Sub, this problem has notably already been resolved, as it really becomes fire and forget, and each listener can do with the information what it wants.

Apache Kafka is a new framework that builds on top of MQ principles though. They call this communication event streaming, in that they stream events instead of messages. It is only a small difference in nomenclature, but events do sound more asynchronous. An application simply notifies the surrounding world of an event that has happened in its domain, and the others stream that information. It actually even sounds a little creepy, if I put it like that…

A bigger difference though is that Kafka does NOT remove the event once it has been consumed by an application. This means that consumers can rewind to an old offset to replay old events. This is often done after a problem has been fixed, in order to reprocess the affected events.
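The idea of offsets and replaying is easiest to see in a toy model. The Python sketch below is not the Kafka client API; EventLog and Consumer are invented classes that only imitate the principle of a log that keeps its events and a consumer that remembers its position.

```python
class EventLog:
    """Toy append-only log: reading an event does not delete it."""

    def __init__(self):
        self._events = []

    def append(self, event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # the offset of the new event

    def read_from(self, offset: int) -> list:
        return self._events[offset:]


class Consumer:
    """Remembers how far it has read, and can move that position back."""

    def __init__(self, log: EventLog):
        self._log = log
        self.offset = 0  # position of the next event to read

    def poll(self) -> list:
        events = self._log.read_from(self.offset)
        self.offset += len(events)
        return events

    def rewind(self, offset: int) -> None:
        self.offset = offset


log = EventLog()
for name in ["created", "paid", "shipped"]:
    log.append(name)

consumer = Consumer(log)
first_pass = consumer.poll()  # all three events
consumer.rewind(1)            # e.g. after fixing a bug in the "paid" handling
replayed = consumer.poll()    # "paid" and "shipped" once more
```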

Due to some other features of Kafka, it is also a lot more scalable and performant than traditional MQs. These features are for example a more efficient storage format, replication of partitions, but I will not go into detail, as that is outside of the scope of this simple article.

However, Kafka in general does require some overhead, and should therefore be used mostly when dealing with enormous amounts of data for stream processing.

Web sockets

Web sockets are the final protocol in this article. As mentioned before, asynchronous communication with MQs or Kafka is used between backends, and never from a browser. Traditionally, an application running in a browser makes synchronous requests and handles the responses. If the browser wants an update, it needs a refresh (or some programming to poll certain endpoints).

Polling is vastly inefficient. Let’s say you poll every 5 minutes. If the update comes after 5 minutes and 5 seconds, so just after a poll, you’d need to wait almost the full 5 minutes to see that update reflected.
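The arithmetic behind that example fits in a few lines of Python. The function polling_delay is my own helper, and it assumes that polls happen exactly at 0, 5, 10, ... minutes.

```python
def polling_delay(update_at: int, interval: int) -> int:
    """Seconds between an update happening and the next poll noticing it.

    Assumes polls happen at 0, interval, 2 * interval, ... seconds.
    """
    remainder = update_at % interval
    return 0 if remainder == 0 else interval - remainder


# Update after 5 minutes and 5 seconds, polling every 5 minutes.
delay = polling_delay(update_at=5 * 60 + 5, interval=5 * 60)  # 295 seconds
```

So the update is only seen 4 minutes and 55 seconds after it happened, at the poll on the 10 minute mark.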

Now, web sockets provide a solution to this. In essence, the browser opens a channel to some server, and then just waits for updates, which it could then handle instantly. A typical example of this would be some messaging app such as WhatsApp. Can you imagine how unfriendly it would be to the user if we’d have to refresh some chat to see whether we’ve had a reply? Instead, the app is simply connected to the server, which then pushes a notification to your phone when a new message has come for you.
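The difference with polling is that the client just sits on an open channel and reacts when something arrives. The Python sketch below only imitates that behaviour inside one program: an asyncio.Queue stands in for the open connection, and a real web socket would need a browser or a dedicated library instead. All names in it are made up.

```python
import asyncio


async def server(channel: asyncio.Queue, messages: list) -> None:
    """Pushes each message as soon as it exists; the client never has to ask."""
    for message in messages:
        await asyncio.sleep(0.01)  # pretend the message shows up a bit later
        await channel.put(message)
    await channel.put(None)        # None marks the end of the conversation


async def client(channel: asyncio.Queue) -> list:
    """Waits on the open channel and handles every message right away."""
    received = []
    while True:
        message = await channel.get()  # no polling: this simply waits
        if message is None:
            break
        received.append(message)
    return received


async def main() -> list:
    channel = asyncio.Queue()  # stand-in for the open connection
    server_task = asyncio.create_task(server(channel, ["hi", "are you there?"]))
    received = await client(channel)
    await server_task
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```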

Web sockets are generally used when you need real-time information. Stock brokers would be interested in this for instance, to ensure they always have real-time prices of their tickers. It’s not exactly the same as using Kafka - they’d rather use Kafka to process news about a certain stock. The difference here becomes harder to understand, but still, it’s not quite as time-sensitive.

Of course, there are some challenges that come along with Web Sockets also. Since an open connection is always required between the client and the server, this can quickly lead to issues if you have too many clients. Again, there are potential solutions to this (different middleware, horizontal scaling, etc.), but it’s not the golden solution either.

Software architects need to carefully evaluate which solution is the best option for their requirement. This is not always an easy task, even though it often seems like one (especially to less technical people).

Of course, there are solutions to mix and match all three types of communication. An existing application landscape often uses multiple of them anyway, but not usually with one single goal. With the correct middleware, that may be possible, but it’d be more of a curiosity than a real need.

These are the main forms of communication between applications over the internet. I hope that you will find this article useful, and if so, please do not hesitate to share it with your business colleagues who may need a refresher course on what the technologies are, what they allow, and how to use them. Them understanding these fine points will be a big timesaver for both of you, and who knows, the next requirements in your stories may just be a little bit better defined. :)

Published Oct 27, 2021