Zero-downtime reloads are now expected of most systems, especially ones that are accessed around the clock. Stakeholders demand high availability, so a reload that takes the system offline is a problem even if the outage lasts only milliseconds. Socketmaster is there to help your system reload with zero downtime.
What is socketmaster
Socketmaster is an application that lets us reload our application without downtime. It works by running our application as its child process. On reload, socketmaster starts a new child process and sends a termination signal to the old one. The new process handles incoming requests while the old process finishes its active connections. Because of this, no request is lost, and zero-downtime reload is achieved.
To install socketmaster you can go to https://github.com/zimbatm/socketmaster to download the binary or compile it yourself.
As written in the socketmaster README, there are a few things we need to do to integrate socketmaster with our service.
Your server is responsible for:
- opening the socket passed on fd 3
- not crashing
- gracefully shutdown on SIGTERM (close the listener, wait for the child connections to close)
We will cover them one by one.
To experiment with socketmaster, I created a simple web server in Go. I send continuous requests to the server and reload the service while they are running. Every request should get a successful response, even when the server is reloaded. Note that even though socketmaster is written in Go, it can also run services written in other languages.
My web server in Go
I use the following code for the web server. This code will run a simple web server and we will integrate it with socketmaster to enable zero downtime reload.
On lines 7 to 11, you can see that our web server listens on fd 3, as stated in the first requirement: opening the socket passed on fd 3.
For the next requirement, not crashing: by default, the Go HTTP server has panic recovery. But if you want to use your own panic recovery, you can check my post here.
Then, for the last requirement, gracefully shutdown on SIGTERM (close the listener, wait for the child connections to close), see lines 22 to 23 of the code above. We listen for SIGTERM and deliver it to a channel. On line 26, we wait until the termination signal is received on the channel; code execution is blocked until then. Once the signal is received, execution continues to line 28, where the server is gracefully shut down. By then, socketmaster has already spawned a new process to handle incoming requests.
I use this handler for testing. It will sleep for 3 seconds, then return a response with the current pid.
Run the socketmaster & test it
Build the Go app. Then use this command to run it with socketmaster.
socketmaster -listen tcp://:7008 -command=./mygoapp
-listen tcp://:7008: socketmaster will listen on TCP port 7008.
-command=./mygoapp: the command that socketmaster executes. On reload, socketmaster executes the command again to start a new process, then signals the old process.
When run, it produces a log like this:
socketmaster 2021/02/10 15:39:25 Listening on tcp://:7008
socketmaster 2021/02/10 15:39:25 Starting ./mygoapp [./mygoapp]
socketmaster 2021/02/10 15:39:26  listening...
To reload it, we can send a HUP signal to socketmaster’s pid. In my example, the pid is
kill -HUP <pid>
To test it, I send continuous requests to the server, then reload it. Below is the result.
I reloaded the server while it was serving the request on line 3.
The first thing to check in the test is that every request got a response from the server. Notice that the pid in the response on line 6 changed, because that request was handled by a different process after the reload.
Run it with systemd
Usually, like other server applications, socketmaster is run as a daemon, which means it runs in the background. On Linux, systemd can run it in the background. This is a sample file to run socketmaster as a daemon. This file is saved in
[Unit]
Description=<description about this service>

[Service]
ExecStart=socketmaster -listen tcp://:7008 -command=./mygoapp
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
To start it
systemctl start myservice
To reload it
systemctl reload myservice
To get its status
systemctl status myservice
root@d6bc11757efd:/# systemctl status myservice
* myservice.service - <description about this service>
   Loaded: loaded (/etc/systemd/system/myservice.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-02-10 04:21:27 UTC; 1min 7s ago
  Process: 378 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
 Main PID: 352 (socketmaster)
    Tasks: 14 (limit: 2227)
   Memory: 2.7M
   CGroup: /docker/d6bc11757efdb63c8a359353764b29e72e752d39b2c6a8a2ca7514d56b6bcb86/system.slice/myservice.service
           |-352 /usr/local/bin/socketmaster -listen tcp://:7008 -command=./mygoapp
           `-379 ./mygoapp
With socketmaster, we can reload our application without downtime. There are a few things your application must handle to integrate with socketmaster, but they are simple and should not require much change. Socketmaster is ready to use in production. If you have any questions, leave a comment below.