In production, we might need to handle large amounts of traffic to our application. In that case, we need to scale up our service.
1. Update your service with a new number of replicas.

We update the NGINX service that we created previously to run 5 replicas. This defines a new desired state for the service:

$ docker service update --replicas=5 --detach=true nginx1

The following events occur with this command:

- The desired state of the service is updated to 5 replicas, which is stored in the swarm's internal storage.
- Docker Swarm recognizes that the number of replicas currently scheduled does not match the declared state of 5. As a result, it schedules 4 more tasks (containers) to meet the declared state. The swarm actively checks whether the actual state equals the desired state and reconciles the two if needed. (A sketch for verifying the replica counts follows this list.)

2. Check the running instances:

$ docker service ps nginx1

We will see that the swarm successfully started 4 more containers, scheduled across all three nodes of the cluster. (A sketch for a more compact view of task placement follows this list.)

3. Send a lot of requests to http://localhost:80:

$ curl localhost:80

Note that the --publish 80:80 parameter is still in effect for this service. Now, however, when you send requests on port 80, the routing mesh has multiple containers to route requests to, and it acts as a load balancer for these containers. It does not matter which node you send the requests to: there is no connection between the node that receives a request and the node that the request is routed to. Try it out by curling multiple times (a loop sketch follows this list). Because of the --mount option you used earlier, we can see which node is serving each request.

Routing mesh limitation: the routing mesh can publish only one service on port 80. If you want multiple services exposed on port 80, you can use an external application load balancer outside of the swarm.

4. Check the aggregated logs for the service:

$ docker service logs nginx1

This command aggregates the output from every running container of the service. We can observe that each request was served by a different container.
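For step 1, if you want to confirm that the declared and actual replica counts match, the following standard Docker CLI commands are one way to do it (a minimal sketch; docker service scale nginx1=5 would have been an equivalent way to set the replica count):

$ docker service ls --filter name=nginx1
# the REPLICAS column should read 5/5 once reconciliation completes
$ docker service inspect --pretty nginx1
# the output includes the declared replica count for the service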
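For step 2, docker service ps also accepts a --format flag with Go-template placeholders, which gives a compact view of where each task landed (a minimal sketch; .Name, .Node, and .CurrentState are the standard placeholders for this command):

$ docker service ps --format "{{.Name}} -> {{.Node}} ({{.CurrentState}})" nginx1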
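For step 3, a simple shell loop is enough to watch the routing mesh spread requests (a minimal sketch, assuming the earlier --mount bound the node's /etc/hostname as the page nginx serves, so each response prints the hostname of the node that handled it):

$ for i in 1 2 3 4 5; do curl -s localhost:80; done
# repeated runs should show responses coming from different nodes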
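For step 4, if the full aggregated log is too noisy, docker service logs supports the standard --tail and --follow flags (a minimal sketch):

$ docker service logs --tail 10 nginx1
# show only the most recent log lines
$ docker service logs --follow nginx1
# stream new log output as further requests arrive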