Apache Camel: How to perform load balancing with Ribbon
We have already seen in other posts how to increase the resilience of web services with Hystrix. Today we will delve a little deeper with the help of a load balancer such as Netflix Ribbon, which lets us distribute calls among several web services and provides mechanisms in case of error.
Traditionally, in a monolithic architecture, there has been a load balancer with a known IP/DNS that distributes incoming invocations among the nodes it has configured. For resilience and resource-consumption reasons, this load balancer is usually kept on a separate server.
This approach leads to several problems, especially with microservices deployed in the cloud:
- It is harder to manage when the target nodes increase or decrease dynamically.
- Keeping it on a separate server increases the cost of the solution.
- It unifies the entry point into a single entity in our architecture, which can harm the system if that entity suffers connection or performance problems.
- It increases network traffic to a single point, creating a bottleneck.
Client-side load balancers such as Netflix Ribbon were created to solve these problems, allowing us to:
- Have more than one entry point.
- Dynamically associate nodes to the load balancer.
- Reduce infrastructure costs.
- Reduce the cost of network traffic: inbound, by avoiding the bottleneck; and outbound, by making calls between peers and reducing the latency between the balancer and the client.
Although we could statically configure the target nodes through the Camel/Spring configuration, given the advantages of a client-side load balancer, and since Ribbon also belongs to the Spring Cloud Netflix framework, we are going to use the Service Discovery pattern: we will call services by name with the help of a Netflix Eureka server. We have already covered some of this topic here.
Now we will briefly see how to set up the server and the clients; if you want more detail, you can look at the code here. The Eureka server is a plain Spring Boot application with the spring-cloud-starter-netflix-eureka-server dependency and the @EnableEurekaServer annotation.
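A minimal sketch of such a server could look like this; the class name is illustrative, and it assumes spring-cloud-starter-netflix-eureka-server is on the classpath:

```java
// Minimal Eureka server entry point. The @EnableEurekaServer annotation
// activates the registry provided by spring-cloud-starter-netflix-eureka-server.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

@SpringBootApplication
@EnableEurekaServer
public class EurekaServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(EurekaServerApplication.class, args);
    }
}
```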
Of course, since this is a local proof of concept with a single Service Registry, we have to modify the application's configuration file to avoid configuration errors and excessive logging. The Service Registry is designed to work in a cluster, so on startup it tries to find other instances and register with them. The configuration file will therefore look as follows:
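A possible standalone configuration, as a sketch (the port and logging levels are illustrative):

```yaml
server:
  port: 8761

eureka:
  client:
    # Standalone instance: do not register with or fetch from other registries.
    register-with-eureka: false
    fetch-registry: false

logging:
  level:
    # Quiet the periodic warnings a single-node registry would otherwise log.
    com.netflix.eureka: WARN
    com.netflix.discovery: WARN
```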
Now we will create the clients. They will be two mocks; the only differences will be a small value in the response (so we can clearly identify which one answers our REST calls) and the start port. These clients will be built with Apache Camel:
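One of the mock clients could be sketched as follows; the route, path, and payload are illustrative (the original code differs only in the instance value and the port):

```java
// Hypothetical sketch of one mock client: a Camel REST endpoint that
// returns a fixed payload identifying the instance that answered.
import org.apache.camel.builder.RouteBuilder;
import org.springframework.stereotype.Component;

@Component
public class BookRoute extends RouteBuilder {

    @Override
    public void configure() {
        restConfiguration().component("servlet");

        rest("/book")
            .get()
            .route()
            // The "instance" field lets us see which client answered the call;
            // the second mock would return "client-2" here.
            .setBody(constant("{\"book\":\"Camel in Action\",\"instance\":\"client-1\"}"));
    }
}
```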
In their properties file, we will configure a few things:
- Eureka server: through the eureka.client.serviceUrl.defaultZone property.
- Name in the service registry: through spring.application.name.
- Port to deploy on: through server.port.
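Put together, the properties file of one client might look like this; the service name, port values, and registry URL are illustrative:

```properties
# Name under which this client registers in Eureka (assumed name).
spring.application.name=book-service
# The second mock client would only change this port (e.g. 9091).
server.port=9090
# Where to find the Eureka server started above.
eureka.client.serviceUrl.defaultZone=http://localhost:8761/eureka/
```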
If we think of a real environment where these clients were deployed with, for example, docker-compose, the only thing differentiating the instances would be the deployment port. And if they were deployed on different machines, the configuration of each microservice could be exactly the same. In this way, with the help of Netflix Eureka and Netflix Ribbon, we can create a cluster of microservices.
Finally, we are left with creating the client-side balancer. The same applies here as we said about the microservices' configuration: as we will see now, it is very simple.
Regarding the load balancer configuration, we only need to include the camel-ribbon dependency and set the server.ribbon.eureka.enabled property in the configuration file; that's all.
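The Maven dependency would look roughly like this (the version is typically managed by the Camel BOM, so it is omitted here):

```xml
<!-- Camel's Ribbon integration for the ServiceCall EIP -->
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-ribbon</artifactId>
</dependency>
```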
Regarding the route, we have to tell the endpoint which service we are going to call (which, when using Netflix Eureka, is the name the clients registered under) and that we want to use the Netflix Ribbon balancer.
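A sketch of such a route, using Camel's ServiceCall EIP; the service name "book-service" and the paths are assumptions carried over from the client sketch:

```java
// Hedged sketch of the balancer route: the ServiceCall EIP resolves the
// service name in the registry and, with camel-ribbon on the classpath,
// lets Ribbon pick a concrete instance on each call.
import org.apache.camel.builder.RouteBuilder;
import org.springframework.stereotype.Component;

@Component
public class BalancerRoute extends RouteBuilder {

    @Override
    public void configure() {
        restConfiguration().component("servlet");

        rest("/book")
            .get()
            .route()
            // "book-service" is the spring.application.name the clients
            // registered under in Eureka; "/book" is their context path.
            .serviceCall("book-service/book")
            .convertBodyTo(String.class);
    }
}
```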
With this, and once all the servers are running, we can test that successive calls to http://localhost:9092/book alternately receive the answer of one client or the other.
But what happens if one of the clients goes down? We said that Ribbon provides mechanisms to avoid failures and increase resilience, but if we stop one of the two clients, we see that 1 out of every 2 calls fails. This is expected: Ribbon does round-robin scheduling, so it tries one server and then the other.
But why does this happen? Basically, because the client is down and the Service Registry has not yet noticed. But doesn't Ribbon have some mechanism to detect that a node is not active? Yes, it does, but by default it uses a dummy implementation that always considers the server healthy.
The solution to this problem is to create a class that implements the IPing interface and pass it when constructing the balancer. This way, Ribbon can verify by itself whether the server it is about to call is available, and avoid calling it if it is not.
For the example, we will create a very basic implementation based on whether the actuator/health endpoint of the client is reachable and returns UP.
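Such an implementation might be sketched as follows; it assumes the Netflix Ribbon API (com.netflix.loadbalancer.IPing), the class name is illustrative, and the health check is deliberately crude:

```java
// A deliberately simple IPing sketch: consider a server alive only if its
// Spring Boot actuator/health endpoint answers 200 with a status of UP.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

import com.netflix.loadbalancer.IPing;
import com.netflix.loadbalancer.Server;

public class HealthCheckPing implements IPing {

    @Override
    public boolean isAlive(Server server) {
        try {
            // server.getId() is "host:port" in Ribbon.
            URL url = new URL("http://" + server.getId() + "/actuator/health");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(1000);
            conn.setReadTimeout(1000);
            if (conn.getResponseCode() != 200) {
                return false;
            }
            try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                // Crude check: the health JSON should contain "UP".
                String body = reader.readLine();
                return body != null && body.contains("UP");
            }
        } catch (Exception e) {
            // Any connection error means the node is considered down.
            return false;
        }
    }
}
```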
On the other hand, we will create the Ribbon configuration programmatically, passing it an instance of the class created in the previous step, and hand it to our balancer as follows.
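A sketch of this wiring, following the class names of the Camel 2.x camel-ribbon API (RibbonConfiguration, RibbonServiceLoadBalancer); adapt the names to your Camel version, and note that HealthCheckPing is the illustrative IPing from the previous step:

```java
// Hedged sketch: build the Ribbon configuration programmatically, hand it
// our custom IPing, and wire it into the ServiceCall EIP of the route.
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.ribbon.RibbonConfiguration;
import org.apache.camel.component.ribbon.cloud.RibbonServiceLoadBalancer;
import org.springframework.stereotype.Component;

@Component
public class BalancerWithPingRoute extends RouteBuilder {

    @Override
    public void configure() {
        // Pass our actuator-based health check to Ribbon so that dead
        // instances are skipped instead of being called round-robin.
        RibbonConfiguration configuration = new RibbonConfiguration();
        configuration.setPing(new HealthCheckPing());

        restConfiguration().component("servlet");

        rest("/book")
            .get()
            .route()
            .serviceCall()
                .name("book-service/book")
                .loadBalancer(new RibbonServiceLoadBalancer(configuration))
            .end()
            .convertBodyTo(String.class);
    }
}
```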
Now we only have to start it with the new configuration and check how Ribbon keeps returning a correct answer even though one of the clients went down just seconds before.
Finally, although we have discussed the benefits of using a Service Registry like Eureka, we should also know that Apache Camel has its own Service Discovery implementation; in that case, we have to indicate the services' IPs manually. It is then even more important to configure an IPing implementation and pass it to the load balancer. You can also see an example of that here.