In this section, we will examine how to enhance microservices developed in Chapter 5, Scaling Microservices with Spring Cloud, for autoscaling. We need a component to monitor certain performance metrics and trigger autoscaling. We will call this component the life cycle manager.
The service life cycle manager, or the application life cycle manager, is responsible for detecting scaling requirements, adjusting the number of instances accordingly, and starting and shutting down instances dynamically.
In this section, we will take a look at a primitive autoscaling system to understand the basic concepts, which will be enhanced in later chapters.
A typical autoscaling system has capabilities as shown in the following diagram:
The components involved in the autoscaling ecosystem in the context of microservices are explained as follows:
The life cycle manager introduced in this section is a minimal implementation to understand autoscaling capabilities. In later chapters, we will enhance this implementation with containers and cluster management solutions. Ansible, Marathon, and Kubernetes are some of the tools useful in building this capability.
In this section, we will implement an application-level autoscaling component using Spring Boot for the services developed in Chapter 5, Scaling Microservices with Spring Cloud.
The following diagram shows a sample deployment topology of BrownField PSS microservices:
As shown in the diagram, there are four physical machines, from which eight VMs are created. Each physical machine is capable of hosting two VMs, and each VM is capable of running two Spring Boot instances, assuming that all services have the same resource requirements.
Four VMs, VM1 through VM4, are active and are used to handle traffic. VM5 to VM8 are kept as reserve VMs to handle scalability. VM5 and VM6 can be used for any of the microservices and can also be switched between microservices based on scaling demands. Redundant services use VMs created from different physical machines to improve fault tolerance.
Our objective is to scale out any service when there is an increase in traffic, using the four reserve VMs, VM5 through VM8, and to scale down when there is not enough load. The architecture of our solution is as follows.
Have a look at the following flowchart:
As shown in the preceding diagram, the following activities are important for us:
In this example, we will use TPM (Transactions Per Minute) or RPM (Requests Per Minute) as the sample metric for decision making. If the Search service registers more than 10 TPM, the life cycle manager will spin up a new Search service instance. Similarly, if the TPM falls below 2, one of the instances will be shut down and released back to the pool.
When starting a new instance, the following policies will be applied:
These policies could be further enhanced. The life cycle manager ideally provides options to customize these rules through REST APIs or Groovy scripts.
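As a hint of what such customization could look like, the following hypothetical REST endpoint lets an operator tune the TPM thresholds per service at runtime. PolicyController and PolicyThresholds are illustrative names, not part of the chapter's code:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical endpoint for adjusting scaling thresholds without a redeploy,
// e.g. PUT /policies/search-service with {"scaleUpTpm": 10, "scaleDownTpm": 2}.
@RestController
@RequestMapping("/policies")
public class PolicyController {

    // Illustrative in-memory store; a real life cycle manager would persist these.
    private final Map<String, PolicyThresholds> registry =
            new ConcurrentHashMap<String, PolicyThresholds>();

    @RequestMapping(value = "/{serviceId}", method = RequestMethod.PUT)
    public void updatePolicy(@PathVariable String serviceId,
                             @RequestBody PolicyThresholds thresholds) {
        registry.put(serviceId, thresholds);
    }
}

// Hypothetical payload carrying the per-service TPM thresholds.
class PolicyThresholds {
    public double scaleUpTpm;
    public double scaleDownTpm;
}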
We will take a look at how a simple life cycle manager is implemented. This section will be a walkthrough of the code to understand the different components of the life cycle manager.
Perform the following steps to implement the custom life cycle manager:
1. Create a new Spring Boot application named chapter6.lifecyclemanager. The project structure is shown in the following diagram:

The flowchart for this example is shown in the following diagram:

The components of this diagram are explained in detail here.
2. Create a MetricsCollector class with the following method. At the startup of the Spring Boot application, this method will be invoked using CommandLineRunner, as follows:

public void start() {
    while (true) {
        // Iterate over every service registered with the Eureka server.
        eurekaClient.getServices().forEach(service -> {
            System.out.println("discovered service " + service);
            // Pull the service's current metrics from its actuator endpoint.
            Map metrics = restTemplate.getForObject("http://" + service + "/metrics", Map.class);
            // Hand the metrics to the decision engine for a scaling decision.
            decisionEngine.execute(service, metrics);
        });
    }
}
The preceding method looks for the services registered in the Eureka server and gets all the instances. In the real world, rather than polling, the instances should publish metrics to a common place, where metrics aggregation will happen.
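For completeness, the following is a minimal sketch of how the collector could be started through CommandLineRunner, as mentioned in the previous step; the application class name is an assumption:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Sketch of the startup wiring; the class name is assumed, not the chapter's code.
@SpringBootApplication
public class LifecycleManagerApplication implements CommandLineRunner {

    @Autowired
    MetricsCollector metricsCollector;

    public static void main(String[] args) {
        SpringApplication.run(LifecycleManagerApplication.class, args);
    }

    // Called by Spring Boot once the context is up; kicks off the
    // continuous collection loop shown above.
    @Override
    public void run(String... args) throws Exception {
        metricsCollector.start();
    }
}

As a rough sketch of the push-based alternative just described, each instance could publish its own TPM reading to a shared RabbitMQ queue (RabbitMQ is already part of this setup); the queue name, schedule, and wiring here are assumptions:

import java.util.HashMap;
import java.util.Map;

import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Hypothetical push-based publisher running inside each service instance.
// @Scheduled also requires @EnableScheduling on a configuration class.
@Component
public class MetricsPublisher {

    @Autowired
    RabbitTemplate rabbitTemplate;

    // In a real setup this would be the same counter the REST controller
    // increments (the TPMCounter class is shown later in this section).
    private final TPMCounter tpm = new TPMCounter();

    @Scheduled(fixedRate = 5000)
    public void publish() {
        Map<String, Object> metrics = new HashMap<String, Object>();
        metrics.put("serviceId", "search-service");
        metrics.put("gauge.servo.tpm", tpm.count.intValue());
        rabbitTemplate.convertAndSend("metrics.queue", metrics);
    }
}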
3. The DecisionEngine code accepts the metrics and applies certain scaling policies to determine whether the service requires scaling up:

public boolean execute(String serviceId, Map metrics) {
    // Ask the policy registered for this service whether it wants to scale up.
    if (scalingPolicies.getPolicy(serviceId).execute(serviceId, metrics)) {
        // If so, delegate the actual instance startup to the deployment engine.
        return deploymentEngine.scaleUp(deploymentRules.getDeploymentRules(serviceId), serviceId);
    }
    return false;
}
4. In this case, the scaling policy is implemented in TpmScalingPolicy, as follows:

public class TpmScalingPolicy implements ScalingPolicy {
    public boolean execute(String serviceId, Map metrics) {
        // Scale up once the service crosses 10 transactions per minute.
        if (metrics.containsKey("gauge.servo.tpm")) {
            Double tpm = (Double) metrics.get("gauge.servo.tpm");
            System.out.println("gauge.servo.tpm " + tpm);
            return (tpm > 10);
        }
        return false;
    }
}
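The chapter's policy only covers scaling up. A scale-down counterpart, mirroring the 2 TPM threshold described earlier, could look like the following minimal sketch; how a true result would be routed to a scale-down action on DeploymentEngine is an assumption, since the chapter does not show it:

import java.util.Map;

// Hypothetical scale-down counterpart to TpmScalingPolicy: release an
// instance back to the pool when traffic drops below 2 TPM.
public class TpmScaleDownPolicy implements ScalingPolicy {
    public boolean execute(String serviceId, Map metrics) {
        if (metrics.containsKey("gauge.servo.tpm")) {
            Double tpm = (Double) metrics.get("gauge.servo.tpm");
            return (tpm < 2); // threshold from the policy described earlier
        }
        return false;
    }
}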
5. If TpmScalingPolicy returns true, DecisionEngine then invokes DeploymentEngine to spin up another instance. DeploymentEngine makes use of DeploymentRules to decide how to execute the scaling. The rules can enforce the minimum and maximum number of instances, the region or machine on which the new instance has to be started, the resources required for the new instance, and so on. DummyDeploymentRule simply makes sure that the maximum number of instances is not more than 2.
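A minimal sketch of that rule might look like the following; the chapter does not show this class's real shape, so the method name and signature here are assumptions:

// Illustrative sketch of the rule described above: cap every service
// at two running instances.
public class DummyDeploymentRule {
    private static final int MAX_INSTANCES = 2;

    // Returns true only while the service is below the instance cap.
    public boolean canScaleUp(String serviceId, int currentInstanceCount) {
        return currentInstanceCount < MAX_INSTANCES;
    }
}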
6. DeploymentEngine, in this case, uses the JSch (Java Secure Channel) library from JCraft to SSH into the destination server and start the service. This requires the following additional Maven dependency:

<dependency>
    <groupId>com.jcraft</groupId>
    <artifactId>jsch</artifactId>
    <version>0.1.53</version>
</dependency>
7. DeploymentEngine sends the following command over the SSH library to run on the target machine:

String command = "java -jar -Dserver.port=8091 ./work/codebox/chapter6/chapter6.search/target/search-1.0.jar";
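For illustration, the SSH call itself could be made along the following lines with JSch; the host, user, password, and class name are placeholders rather than the chapter's actual DeploymentEngine code:

import com.jcraft.jsch.ChannelExec;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

// Minimal JSch sketch: open an SSH session to the target VM and run
// the startup command. Host, user, and password are placeholders.
public class SshCommandRunner {

    public void run(String host, String user, String password, String command) throws Exception {
        JSch jsch = new JSch();
        Session session = jsch.getSession(user, host, 22);
        session.setPassword(password);
        // Skip host-key verification for this demo setup only.
        session.setConfig("StrictHostKeyChecking", "no");
        session.connect();

        ChannelExec channel = (ChannelExec) session.openChannel("exec");
        channel.setCommand(command); // e.g. the java -jar command shown above
        channel.connect();
        // A real implementation would read the channel's output and exit status here.
        channel.disconnect();
        session.disconnect();
    }
}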
8. Integration with Nexus happens from the target machine, using Linux scripts with the Nexus CLI or curl. In this example, we will not explore Nexus.
We will only examine the Search service in this chapter, but in order to complete the implementation, all the services have to be updated. In order to get the gauge.servo.tpm metric, we have to add a TPMCounter to all the microservices.
The following code counts the transactions over a window of 1 minute, resetting the counter when the window expires:

import java.util.Calendar;
import java.util.concurrent.atomic.LongAdder;

class TPMCounter {
    LongAdder count;
    Calendar expiry = null;

    TPMCounter() {
        reset();
    }

    // Start a fresh one-minute window with a zeroed counter.
    void reset() {
        count = new LongAdder();
        expiry = Calendar.getInstance();
        expiry.add(Calendar.MINUTE, 1);
    }

    boolean isExpired() {
        return Calendar.getInstance().after(expiry);
    }

    // Roll over to a new window if the current one has expired,
    // then count the transaction.
    void increment() {
        if (isExpired()) {
            reset();
        }
        count.increment();
    }
}
The following code is added to SearchRestController to set the tpm value:

class SearchRestController {
    TPMCounter tpm = new TPMCounter();

    @Autowired
    GaugeService gaugeService;

    //other code
The following code is added to the request-handling method of SearchRestController, which submits the tpm value as a gauge to the actuator endpoint:

tpm.increment();
gaugeService.submit("tpm", tpm.count.intValue()); // appears as gauge.servo.tpm in /metrics
Perform the following steps to run the life cycle manager developed in the previous section:
1. Open DeploymentEngine.java and update the password to reflect the machine's password, as follows. This is required for the SSH connection:

session.setPassword("rajeshrv");

2. Build all the projects by running Maven from the root folder (Chapter 6) via the following command:

mvn -Dmaven.test.skip=true clean install

3. Start the RabbitMQ server:

./rabbitmq-server

4. Run the following commands from the respective project folders, each in a separate terminal window:

java -jar target/config-server-0.0.1-SNAPSHOT.jar
java -jar target/eureka-server-0.0.1-SNAPSHOT.jar
java -jar target/lifecycle-manager-0.0.1-SNAPSHOT.jar
java -jar target/search-1.0.jar
java -jar target/search-apigateway-1.0.jar
java -jar target/website-1.0.jar

5. Open the website at http://localhost:8001 and fire a few search transactions to generate traffic.

6. Open the Eureka console (http://localhost:8761) and watch for a second SEARCH-SERVICE. Once the server is started, the instances will appear as shown here: