When I first thought about writing an “AMQP vs HTTP” article, I planned to cover the differences between both protocols: describe every single header and why it is there, how each message/packet flows in each protocol, etc, etc, etc… But then I realized that all that dry information is already out there; anyone who wants to read about the protocols themselves can just “google it” and find plenty of information about each one. If you were looking for that information here, just “let me google that for you”: AMQP, HTTP. If you were looking for a real use case and a discussion about when to use each protocol and why, then you are in the right place.
Case of discussion
Some time ago, when we started decentralizing our huge monolith, we needed a way to synchronize data between the core and the regions. Core is the name we gave to the place where the central database and the e-commerce part live, and the regions are the parts of the world where we deploy our components (Europe, America, Asia…). The publish-subscribe pattern fit our project perfectly because all the data that had to be synchronized always flowed in one direction. The pattern also lets you turn it around and make the subscriber the publisher of new information, which leaves the door open for bi-directional synchronization should we need it in the future.
Back then we had to build a worldwide publish-subscribe system in which the core talks to services that may or may not be up and running. The most important requirement was reliability, given the high amount of traffic we expected (not only core-to-region traffic but also traffic generated in the regions). We needed some queuing mechanism that would hold the communications while the receivers were too busy to respond but, at the same time, we did not want to block the producer either (by retrying requests again and again). We could not afford to lose any communication and, with it, any synchronization of data. At that moment, the idea of using message queues came to mind. Message queues give you a middleware that persists all communications and, depending on how you build them, you can add the reliability you need by forcing the acknowledgement of each message and using a chain of trust in the communication.
So there we were, knowing what to do, but not how. Of course each of us had a great idea! You can always build a very basic “write the communications that were not acked somewhere and try again later until it works”; we even tried to use Redis as a message queue (and it works well, but it did not give us the reliability we needed). Then, among others, we found RabbitMQ and the AMQP protocol. I have to say that, before this, most of the team did not know this technology at all. We had no experience whatsoever with message queues, so we had to learn a lot. We went through a trial-and-error phase until we bent this technology to our will, and the rest is history. For internal communications that have to happen asynchronously we use RabbitMQ. We have a broker per region plus one in the core, and we use shovels to transfer messages between the core and the regions.
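To give an idea of what that very basic “write it down and try again later” approach looks like, here is a minimal sketch in Python. It is not what we ran in production; `FileOutbox`, the `deliver` callback and the JSON-lines file are all hypothetical names and choices made up for the illustration.

```python
import json
import os
import tempfile


class FileOutbox:
    """Naive outbox: messages that were not acked are written to a file
    and retried later. Hypothetical helper, for illustration only."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    def _store(self, messages):
        with open(self.path, "w", encoding="utf-8") as fh:
            for message in messages:
                fh.write(json.dumps(message) + "\n")

    def send(self, message, deliver):
        """Try once; if `deliver` returns False (no ack), keep the message."""
        if not deliver(message):
            self._store(self._load() + [message])

    def retry_pending(self, deliver):
        """Re-send every pending message; return how many are still pending."""
        still_pending = [m for m in self._load() if not deliver(m)]
        self._store(still_pending)
        return len(still_pending)


received = []


def region_down(message):
    return False  # simulates a receiver that never acknowledges


def region_up(message):
    received.append(message)
    return True  # simulates a successful, acknowledged delivery


outbox = FileOutbox(os.path.join(tempfile.mkdtemp(), "outbox.jsonl"))
outbox.send({"id": 1}, region_down)      # not acked, written to the file
outbox.send({"id": 2}, region_down)      # not acked, written to the file
remaining = outbox.retry_pending(region_up)  # later on, the region is back
```

Even this toy version already raises questions about concurrent writers, ordering and duplicate deliveries, which is exactly the kind of code a broker saves you from maintaining.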
But still, after almost an entire year of using this technology in production with great and undoubted performance, and with the RabbitMQ brokers at the core of the communication between our components, “AMQP vs HTTP” is a daily (maybe weekly) discussion in our office. People sometimes even get angry at each other over this topic, and we have long, pointless discussions about whether or not it is better to use RabbitMQ and AMQP. There are four kinds of opinions (at least among those I have found or heard) on this topic (internal asynchronous communication between components):
- I prefer to use HTTP.
- I prefer to use AMQP.
- I do not have a preference between them, I just use whichever is already in use.
- I don’t care which protocol is used, I just want to synchronize data.
I am going to leave the last two kinds of opinions for the comments section of this article (we can discuss them there if you want, but it would drift from the main subject), so I will focus the rest of the article on the first two.
And now comes the tricky part of this article for me, because “prefer” is probably the wrong word. Should we (as developers) have “opinions” on technical matters? Or should we rather analyse the facts and use what is convenient in each case? Our profession is not about instincts but more about facts (even though instincts are important too!). So, let’s go to the facts:
- Both protocols will allow you to have communication between components.
- Both protocols are traceable.
- Both protocols are well documented.
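- AMQP is asynchronous: the publisher hands the message to the broker and does not wait for the consumer.
- HTTP is synchronous: the client waits for the response to each request.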
- HTTP is easy to debug.
- People are familiar with HTTP.
- HTTP is well mapped to an interface.
- AMQP is easy to maintain and easy to scale.
- AMQP can guarantee message delivery (through acknowledgements and persisted messages).
- With HTTP you need to have some kind of service discovery.
- With AMQP you need to know the broker to reach the queue you read from or write to.
- RabbitMQ provides fanout mechanisms, shoveling and federation of queues (“out of the box”).
- If a service consuming from RabbitMQ is restarted while it is processing a message that it has not acknowledged yet, RabbitMQ requeues the message automatically and another backend service picks it up and processes it.
- HTTP is supported in almost every programming language.
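To make the acknowledgement and requeue facts from the list above more tangible, here is a minimal in-memory sketch in Python. It is a toy model and not RabbitMQ's API: `ToyQueue`, `deliver`, `ack` and `consumer_lost` are hypothetical names, and a real broker does much more (persistence, prefetch limits, routing).

```python
from collections import deque
from itertools import count


class ToyQueue:
    """In-memory model of a queue with manual acknowledgements.
    Hypothetical names; this is not how RabbitMQ is implemented."""

    def __init__(self):
        self._ready = deque()   # messages waiting for a consumer
        self._unacked = {}      # delivery tag -> (consumer, message)
        self._tags = count(1)

    def publish(self, body):
        # The publisher returns immediately; no consumer has to be up yet.
        self._ready.append({"body": body, "redelivered": False})

    def deliver(self, consumer):
        """Hand the next ready message to `consumer`, or None if empty."""
        if not self._ready:
            return None
        message = self._ready.popleft()
        tag = next(self._tags)
        self._unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        # Only an acknowledgement removes the message for good.
        del self._unacked[tag]

    def consumer_lost(self, consumer):
        """Requeue everything `consumer` received but never acknowledged."""
        lost = [t for t, (c, _) in self._unacked.items() if c == consumer]
        for tag in sorted(lost, reverse=True):
            _, message = self._unacked.pop(tag)
            message["redelivered"] = True
            self._ready.appendleft(message)  # back to the front, order kept


orders = ToyQueue()
orders.publish("order-1")
orders.publish("order-2")
tag, message = orders.deliver("worker-a")  # worker-a takes order-1 ...
orders.consumer_lost("worker-a")           # ... and goes down before acking
tag, message = orders.deliver("worker-b")  # worker-b receives order-1 again
orders.ack(tag)                            # now order-1 is really done
```

The point of the sketch is that the producer never learns about the failure of `worker-a`: the queue keeps the message until somebody acknowledges it.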
The list of facts could go on for at least a couple of pages, but I think these are enough key facts to choose one or the other depending on your situation. In practice, they translate into the following arguments:
- Debugging an HTTP request is really easy and repeatable, whereas an AMQP message is harder to debug (you need a connection to the queue, libraries, maybe some scripting, etc…).
- HTTP is a familiar technology for developers, so a new developer on the project needs no extra training.
- HTTP is the most widely supported protocol on the internet, so exposing your APIs over HTTP is good practice.
- Delivering messages with AMQP gives you reliability, and being asynchronous means the sender does not have to worry about the delivery itself.
- Knowing the host/IP of the AMQP broker cluster is enough to deliver/receive messages, whereas with HTTP you may have different hosts and IPs depending on the region.
- You can use fanouts, meaning that one message will be enough to inform several different components, reducing the amount of communications.
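The fanout argument is probably the easiest one to picture in code. The following Python sketch is a toy in-memory model, not RabbitMQ's API; `FanoutExchange` and the queue names are made up for the example.

```python
from collections import deque


class FanoutExchange:
    """Toy fanout exchange: every bound queue gets its own copy of a message.
    Hypothetical names; a real broker does this on the server side."""

    def __init__(self):
        self._queues = {}

    def bind(self, queue_name):
        """Create (or reuse) a queue bound to this exchange and return it."""
        return self._queues.setdefault(queue_name, deque())

    def publish(self, body):
        """Copy `body` into every bound queue; return the number of copies."""
        for bound_queue in self._queues.values():
            bound_queue.append(body)
        return len(self._queues)


exchange = FanoutExchange()
europe = exchange.bind("region-europe")   # hypothetical queue names
america = exchange.bind("region-america")
asia = exchange.bind("region-asia")

# One publish call from the core informs all three regions.
copies = exchange.publish({"product": 42, "price": "9.99"})
```

With plain HTTP the core would have to know the three endpoints and send three requests itself; here it publishes once and the bindings decide who gets a copy.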
Considerations when using AMQP
To deliver and receive AMQP messages you need a broker. A broker is nothing more than a server that receives, stores and delivers messages. Depending on your budget and infrastructure, this may add extra complexity to your project, since you have to configure and maintain one broker, several brokers, a federation of them…
In our case we outsourced this part to CloudAMQP. This meant we did not have to worry about maintenance, but it added an extra cost to the final solution.
I would say that, if you want to talk with the world (provide an API for third-party usage), HTTP gives you all you need. It is supported, well known and widely used. Also, you “do not have to care” if there is a problem in the communication, because it is the duty of the client of your API to execute the request again if something goes wrong.
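That “duty of the client” usually ends up as a retry loop on the caller's side. Here is a minimal Python sketch with exponential backoff; `call_with_retry`, `TransientError` and the default values are assumptions for the illustration, and `send` stands for whatever function performs the actual HTTP request.

```python
import time


class TransientError(Exception):
    """Raised by `send` for failures worth retrying (timeouts, 5xx, ...)."""


def call_with_retry(send, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it succeeds or `attempts` calls have failed.

    Waits base_delay, then twice that, and so on between attempts.
    The last failure is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Usage with a fake request that fails twice before succeeding.
outcomes = iter([TransientError("timeout"), TransientError("503"), "200 OK"])


def fake_request():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


# `sleep` is replaced so the example does not actually wait.
result = call_with_retry(fake_request, sleep=lambda seconds: None)
```

Note that the caller is blocked while it retries, and that the request should be safe to repeat, because the server may have processed an earlier attempt whose response got lost.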
If you want internal communication, where you control every single request, I would use AMQP, because it is easy to use, supported in most of the languages used nowadays, reliable, scalable and fast.
I think both protocols are great, no doubt about that. But only one of them adds reliability to your communications “out of the box”, and that is key. Also, only one of them is scalable almost out of the box, and the fact that communications do not block the services allows you to keep using your resources for other processes. Furthermore, why reinvent the wheel by creating our own reliability process “just because”? Why add extra code that has to be maintained only because “we can”? For me it is clear: if it is asynchronous, use AMQP with RabbitMQ.
You have probably already realized that the entire discussion makes no sense, haven’t you? We are comparing shoes with t-shirts here, and that is a mistake we have been making for a long time.
We should focus on what is really important: why we decide to go for one technology or another, which pros and cons come with the choices we make and, once we have chosen, putting all our effort into making the best of it.
Of course, reviewing the chosen infrastructure is always good, but we have to take into consideration when and why.