WordPress / EasyEngine failed to start after server reboot
Imagine the following scenario: you use EasyEngine to manage your WordPress sites on a VPS. Your server recently had a status change: maybe it got rebooted, or some services went offline and came back online. Suddenly none of your WordPress sites work, and you go into panic mode.
I encountered this nasty bug last week, and it took me a whole night to figure out the problem. A large part of the difficulty was that I could not find any concrete info online describing all facets of this problem, so I had to stitch the pieces together.
I hope this article helps people who encounter similar problems in the future. Let’s dive in.
My VPS was restarted as part of routine maintenance. I’ve also encountered this bug when I migrated my VPS to a different node / datacenter. In short, this bug is very likely to happen whenever the status of your server changes.
Before we talk about solutions, let’s talk about symptoms. Suppose yoursite.net is the name of the WordPress site. Your first instinct may be to make sure the site is enabled:
root@li1984-106:~# ee site enable yoursite.net
Error: yoursite.net is already enabled!
root@li1984-106:~# ee site enable yoursite.net --refresh
Enabling site yoursite.net.
Error: There was error in enabling yoursite.net. Please check logs.
Now let’s try restarting the site via EasyEngine’s CLI. You may encounter this error:
root@li1984-106:~# ee site restart yoursite.net
No container found for nginx_1
Error: Nginx test failed
That’s strange. No container for nginx? Let’s take a look at all the Docker containers, including the stopped ones:
root@li1984-106:~# docker ps -a
CONTAINER ID   IMAGE                               COMMAND                  CREATED         STATUS                           PORTS      NAMES
e11e3494df25   easyengine/nginx:v4.1.4             "/usr/bin/openresty …"   3 months ago    Exited (128) About an hour ago   80/tcp     yoursitenet_nginx_1
9db46d17e382   easyengine/php7.2:v4.1.6            "docker-entrypoint.s…"   3 months ago    Up About an hour                 9000/tcp   yoursitenet_php_1
fc98027815e9   easyengine/postfix:v4.1.5           "postfix start-fg"       3 months ago    Exited (128) About an hour ago   25/tcp     yoursitenet_postfix_1
d9102ce8619d   easyengine/nginx-proxy:v4.1.4       "/app/docker-entrypo…"   3 months ago    Exited (2) About an hour ago                services_global-nginx-proxy_1
1a934a1b06ae   easyengine/newrelic-daemon:v4.0.0   "sh -c '/usr/local/b…"   3 months ago    Up About an hour                            services_global-newrelic-daemon_1
845e81b2e13d   easyengine/mariadb:v4.1.3           "docker-entrypoint.s…"   3 months ago    Exited (1) About an hour ago     3306/tcp   services_global-db_1
ebc58a590bd4   easyengine/redis:v4.1.4             "docker-entrypoint.s…"   3 months ago    Up About an hour                 6379/tcp   services_global-redis_1
de2358afdf25   easyengine/mariadb:v4.0.0           "docker-entrypoint.s…"   3 months ago    Restarting (1) 8 seconds ago                ee_global-db_1
f5b161945cd0   easyengine/redis:v4.0.0             "docker-entrypoint.s…"   3 months ago    Up About an hour                 6379/tcp   ee_global-redis_1
385f9bddd415   easyengine/cron:v4.0.0              "/usr/bin/ofelia dae…"   22 months ago   Up About an hour                            ee-cron-scheduler
The important thing here is that the yoursitenet_nginx_1 container lists port 80, while the services_global-nginx-proxy_1 container, which has exited, lists no port at all.
Wait a minute… no port? That’s when I noticed something was fishy.
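The wide table above is easier to scan when it is reduced to names, states, and ports. Here is a minimal sketch, assuming Docker’s --format option with the .Names, .Status and .Ports placeholders; the helper name not_running is made up for illustration and is not part of EasyEngine:

```shell
# Hypothetical helper: read "name<TAB>status<TAB>ports" lines on stdin and
# print every container whose status does not start with "Up",
# together with its ports (or a marker when no port is listed).
not_running() {
  awk -F'\t' '$2 !~ /^Up/ { printf "%s\t%s\n", $1, ($3 == "" ? "(no ports)" : $3) }'
}

# Usage on the server (needs Docker):
#   docker ps -a --format '{{.Names}}\t{{.Status}}\t{{.Ports}}' | not_running
```

Containers in the Exited or Restarting state are both reported, since neither status starts with "Up".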
There are several facets to this problem. First, we need to disable the WordPress sites so that we can debug safely and start from a clean slate when we eventually bring them back up:
root@li1984-106:~# ee site disable yoursite.net
Disabling site yoursite.net.
cd /Success: Site yoursite.net disabled.
The first and most likely problem is that something is occupying port 80, which prevents EasyEngine from binding its global nginx proxy container to that port.
The solution would be to find out what’s using port 80 of your server with the following command:
root@li1984-106:~# lsof -i :80 | grep LISTEN
nginx   623   root       6u   IPv4   17941   0t0   TCP   *:http (LISTEN)
nginx   623   root       7u   IPv6   17942   0t0   TCP   *:http (LISTEN)
nginx   624   www-data   6u   IPv4   17941   0t0   TCP   *:http (LISTEN)
nginx   624   www-data   7u   IPv6   17942   0t0   TCP   *:http (LISTEN)
As you can see, in my case the server’s default nginx service is causing a conflict with EasyEngine.
To be specific, the server’s default nginx service runs directly on the host as a native UNIX process and is already listening on port 80 by the time EasyEngine’s (supposedly) global nginx proxy, which runs inside a Docker container, tries to start. Docker cannot publish a port that is already bound on the host, so EasyEngine’s nginx proxy cannot use port 80.
I did some digging, and it turns out one of the maintainers of EasyEngine also advised against running the server’s nginx alongside EasyEngine.
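When several worker processes share the port, the raw lsof output gets repetitive. As a rough sketch, the listing can be collapsed to one line per process; the helper name port_listeners is hypothetical, and the lsof flags in the usage comment (-nP, -iTCP:80, -sTCP:LISTEN) are a stricter alternative to the grep used above:

```shell
# Hypothetical helper: read lsof output on stdin and print each distinct
# "COMMAND PID USER" triple that holds a listening socket.
port_listeners() {
  awk '/\(LISTEN\)/ { print $1, $2, $3 }' | sort -u
}

# Usage on the server (needs lsof, run as root to see all processes):
#   lsof -nP -iTCP:80 -sTCP:LISTEN | port_listeners
```

If this prints anything other than a Docker-related process, that process is what keeps the EasyEngine proxy away from port 80.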
Let’s stop the server’s nginx process and check on the usage of port 80 again:
root@li1984-106:~# service nginx stop
root@li1984-106:~# lsof -i :80 | grep LISTEN
# now it prints nothing, meaning nothing's using port 80
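Stopping the service only frees the port until something starts it again. If the host nginx is enabled at boot, it will likely grab port 80 again on the next restart. The sketch below assumes a systemd-based host where the unit is named nginx; the function name disable_host_nginx and the DRY_RUN switch are made up for illustration:

```shell
# Hypothetical helper: stop the host nginx and keep it from starting at boot.
# Assumes a systemd host where the unit is called "nginx".
disable_host_nginx() {
  # With DRY_RUN=1 the commands are printed instead of executed.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  else
    run=""
  fi
  $run systemctl stop nginx
  $run systemctl disable nginx
}

# Usage: preview first, then run for real as root.
#   DRY_RUN=1; disable_host_nginx
#   DRY_RUN=0; disable_host_nginx
```

Only do this if nothing else on the server depends on the host nginx.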
Now with the ports sorted out, we need to restart the EasyEngine global services that are stuck in a problematic state. This is done by stopping and removing the global service containers and then recreating them from inside EasyEngine’s service directory:
root@li1984-106:~# cd /opt/easyengine/services/
root@li1984-106:/opt/easyengine/services# docker-compose down
Stopping services_global-newrelic-daemon_1 ... done
Stopping services_global-nginx-proxy_1 ... done
Stopping services_global-redis_1 ... done
Stopping services_global-db_1 ... done
Removing services_global-newrelic-daemon_1 ... done
Removing services_global-nginx-proxy_1 ... done
Removing services_global-redis_1 ... done
Removing services_global-db_1 ... done
Network ee-global-frontend-network is external, skipping
Network ee-global-backend-network is external, skipping
root@li1984-106:/opt/easyengine/services# docker-compose up -d
Creating services_global-db_1 ... done
Creating services_global-redis_1 ... done
Creating services_global-nginx-proxy_1 ... done
Creating services_global-newrelic-daemon_1 ... done
For reference, this is what happens when you try to execute this step while port 80 is still occupied:
root@li1984-106:~# cd /opt/easyengine/services/
root@li1984-106:/opt/easyengine/services# docker-compose down
Stopping services_global-newrelic-daemon_1 ... done
Stopping services_global-redis_1 ... done
Removing services_global-nginx-proxy_1 ... done
Removing services_global-newrelic-daemon_1 ... done
Removing services_global-db_1 ... done
Removing services_global-redis_1 ... done
Network ee-global-frontend-network is external, skipping
Network ee-global-backend-network is external, skipping
root@li1984-106:/opt/easyengine/services# docker-compose up -d
Creating services_global-newrelic-daemon_1 ...
Creating services_global-nginx-proxy_1 ... error
Creating services_global-redis_1 ...
Creating services_global-newrelic-daemon_1 ... done
Creating services_global-redis_1 ... done
Creating services_global-db_1 ... done
ERROR: for global-nginx-proxy Cannot start service global-nginx-proxy: driver failed programming external connectivity on endpoint services_global-nginx-proxy_1 (2e44b6924f68439bfe437381d90ee5b38d55d91b8c367ff6710eae72e2df51bb): Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use
ERROR: Encountered errors while bringing up the project.
Now let’s verify the Docker container status, specifically the previously problematic services_global-nginx-proxy_1. We can see that it is now up and publishing ports 80 and 443 on the host as expected:
root@li1984-106:~# docker ps -a
CONTAINER ID   IMAGE                           COMMAND                  CREATED         STATUS         PORTS                                      NAMES
def4bc5316aa   easyengine/nginx-proxy:v4.1.4   "/app/docker-entrypo…"   2 minutes ago   Up 2 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   services_global-nginx-proxy_1
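The same check can be scripted instead of read off the table. This is a small sketch, assuming the proxy container keeps the name services_global-nginx-proxy_1 seen above and that Docker’s --format option is used to print names and ports; the function name proxy_publishes_80 is hypothetical:

```shell
# Hypothetical helper: read "name<TAB>ports" lines on stdin and succeed only
# if the EasyEngine proxy container publishes host port 80 to container port 80.
proxy_publishes_80() {
  awk -F'\t' '
    $1 == "services_global-nginx-proxy_1" && $2 ~ /:80->80\/tcp/ { found = 1 }
    END { exit !found }
  '
}

# Usage on the server (needs Docker):
#   docker ps --format '{{.Names}}\t{{.Ports}}' | proxy_publishes_80 \
#     && echo "proxy is bound to port 80" \
#     || echo "proxy is NOT bound to port 80"
```

The function communicates only through its exit status, so it can be dropped into a larger health-check script.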
The last step is to re-enable your WordPress sites:
root@li1984-106:~# ee site enable yoursite.net
Enabling site yoursite.net.
Success: Site yoursite.net enabled.
Running post enable configurations.
Starting site's services.
Global auth exists on admin-tools. Use `ee auth list global` to view credentials.
Success: admin-tools enabled for yoursite.net site.
You should see your WordPress site up and running again. Congrats if that’s the case. If not, please leave a comment below to share the specific problem you encountered.