Mastodon admin experiences

2022-11-18

Mastodon has really taken off this month, as a result of Twitter collapsing.

The instance I run with Colin Wright, mathstodon.xyz, has grown from about a thousand active users to just over 5,000 as I write.

There are lots of people running small Mastodon instances who suddenly need to support lots more activity than they're used to. I've had to learn a lot about making a webserver run at scale, so I thought it would be worth writing down what I've learnt.

I didn't take proper notes while fixing things, so I've probably forgotten some important non-obvious stuff. Soz!

These are just the things that have stuck in my mind recently.

Better places to look for help

First of all, the documentation on joinmastodon.org describes in broad terms what's involved in running a Mastodon instance, and gives instructions on how to set up a server.

Once you've got an instance running and you hit a resource bottleneck, there's a page on scaling up your server that describes some of the things you'll need to do.

There are quite a few other pages floating around on GitHub describing bits and bobs, but most of them are starting to get a bit out of date, describing old versions of Mastodon.

Eugen Rochko has just written a page on scaling a Mastodon instance, which I'd expect to be folded into the joinmastodon.org documentation eventually.

Server specs

mathstodon.xyz runs on a single VPS hosted by Vultr. We've gradually upgraded the server over the years, as more users have joined and we've hit different resource bottlenecks.

At the moment, it has 6 vCPUs, 16GB of RAM and a 320GB hard disk. We have a separate 200GB block storage device which holds user media.

It handles the current traffic very well. The last time we upgraded, it was because everything had slowed down, and I think we could instead have got going again by tuning the services. But I didn't know that at the time!

Eventually, I'll have to move the different services on to separate servers, and get my head around something like Kubernetes to manage them. I'm wary of accidentally running up a huge bill by using an automatically-scaling cloud service, so I've stuck with something that has a predictable price.

First bottleneck - Disk space

The first problem we had was that we ran out of disk space on the server.

Mastodon makes a local copy of any media attached to any posts that the instance sees. Preview cards for links in posts often have images attached, which also take up space.

This will grow indefinitely, so you have to regularly run the tools to clear out old files. They can be re-fetched when someone looks at the post they're attached to, so this doesn't ruin the experience.

We have the following hourly cron jobs scheduled:

bin/tootctl media remove --days 2           # Remove media older than 2 days
bin/tootctl preview_cards remove --days 2   # Remove preview card images older than 2 days
bin/tootctl accounts cull                   # Cull remote accounts that no longer exist
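
If you'd rather have one cron entry than three, the commands can be wrapped in a small script. This is only a sketch, not our exact setup: the names MASTODON_DIR, RETENTION_DAYS and run_cleanup are made up for illustration, and the paths assume Mastodon lives in /home/mastodon/live with Ruby installed through rbenv.

```shell
#!/bin/sh
# Hypothetical clean-up wrapper for cron. MASTODON_DIR, RETENTION_DAYS and
# run_cleanup are illustrative names, not Mastodon settings.
MASTODON_DIR="${MASTODON_DIR:-/home/mastodon/live}"
RETENTION_DAYS="${RETENTION_DAYS:-2}"

# The body is in parentheses, so it runs in a subshell and the cd and
# environment changes don't leak into the caller.
run_cleanup() (
    cd "$MASTODON_DIR" || exit 1
    # cron starts with a minimal PATH; this assumes an rbenv install.
    PATH="/home/mastodon/.rbenv/shims:$PATH"
    export PATH RAILS_ENV=production
    # No "set -e": if one command fails, the others still get their turn.
    bin/tootctl media remove --days "$RETENTION_DAYS"
    bin/tootctl preview_cards remove --days "$RETENTION_DAYS"
    bin/tootctl accounts cull
)

# Only do the work when asked to, e.g. "cleanup.sh --run" from a crontab.
if [ "${1:-}" = "--run" ]; then
    run_cleanup
fi
```

An hourly crontab line for the mastodon user could then look like `0 * * * * /home/mastodon/cleanup.sh --run`, where the script path is again just an example.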

There are other things that fill up space, such as custom emoji from other servers. I can't remember if they're cleared up by the media remove command, but either way they don't seem to be as much of a problem as they used to be.

When the server completely runs out of disk space, pretty much everything stops working, and you often can't run the commands you need to make space. My usual technique for getting out of that situation is to stop all the mastodon services, then run du -h -d 1 in the mastodon/public/system directory to see what's taking up space, and delete things until there's just enough space to run bin/tootctl media remove.
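
As a rough sketch of that routine: the helper below is a hypothetical convenience (biggest_dirs is not a Mastodon tool) that uses du -k instead of du -h so that the output can be sorted numerically. The commented steps assume the services are managed by supervisord in a group called mastodon, and that Mastodon is installed in /home/mastodon/live.

```shell
# Print the size in kilobytes of each immediate subdirectory of $1,
# smallest first, so the biggest space hogs end up at the bottom.
# The final line is the total for $1 itself.
biggest_dirs() {
    du -k -d 1 "$1" | sort -n
}

# One possible order of operations (shown as comments so that nothing
# destructive runs by accident):
#
#   supervisorctl stop 'mastodon:*'
#   biggest_dirs /home/mastodon/live/public/system
#   (delete just enough by hand to get some breathing room)
#   cd /home/mastodon/live && RAILS_ENV=production bin/tootctl media remove --days 2
#   supervisorctl start 'mastodon:*'
```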

The PostgreSQL database takes up a lot of disk space, and you can never really reclaim that: it just grows and grows! At the moment, our database is about 58GB on the server's main disk, while media takes up about 140GB on the separate block storage device. The database needs to be very quick to access, so I don't think it would make sense to move it out to a separate storage device.

Second bottleneck - Sidekiq

I think the hardest part of scaling Mastodon is managing the Sidekiq processes.

These handle the behind-the-scenes jobs that keep the site running: sending and receiving content to and from other instances, tidying things up in the database, converting and scaling media files, and so on.

Sidekiq operates on several queues: the web service adds a job to a queue, and when that job gets to the front, one of the Sidekiq workers picks it up and executes it.

If the queue grows faster than the Sidekiq workers can process it, then everything grinds to a halt. The web interface is still responsive, but nothing new appears.

The standard installation instructions only set up one Sidekiq process, with something like 5 worker threads. A single Sidekiq process prioritises the queues: it will always pick a job from a queue with a higher priority if one is available, and queues with lower priority are only handled once the other queues are empty.

This caused me loads of headaches!

My rule of thumb now is that I should have at least one Sidekiq process for each queue, so that when times are busy, each queue at least makes some progress.
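
To see why strict priority is such a problem, here's a toy model of how a single process chooses its next job. It isn't Sidekiq's real code, just an illustration: pick_queue is a made-up function that takes name=backlog pairs, highest priority first, and prints the queue the next job would come from.

```shell
# Toy model of strict queue ordering (not Sidekiq's actual implementation).
# Arguments are name=backlog pairs, highest priority first.
# Prints the first queue that has any jobs waiting.
pick_queue() {
    for pair in "$@"; do
        name=${pair%%=*}      # text before the first "="
        backlog=${pair#*=}    # text after the first "="
        if [ "$backlog" -gt 0 ]; then
            echo "$name"
            return 0
        fi
    done
    return 1                  # every queue is empty
}
```

With this model, `pick_queue default=3 push=10 pull=500` prints default, and pull is only chosen once both default and push have dropped to zero. On a busy day that might never happen, which is why giving each queue a process that puts it first helps.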

The next mistake I made was that I didn't realise that a process will only handle queues you specifically give it. The result was that we had one queue building up a huge backlog, and a lot of Sidekiq workers sitting idle instead of processing it.

The instructions say that only one process should handle the scheduler queue, but every other queue should appear somewhere in the priority list of each Sidekiq process.

To manage all of this, I used supervisord to handle running each of the mastodon services, with several Sidekiq processes. I find this easier to work with than the systemd services that the installation instructions recommend.

Here's the current config file, /etc/supervisor/conf.d/mastodon.conf:

[group:mastodon]
programs=web,sidekiq_default,sidekiq_scheduler,sidekiq_push,sidekiq_pull,streaming

[program:web]
command=/home/mastodon/.rbenv/shims/bundle exec puma -C config/puma.rb
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/puma.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=RAILS_ENV=production,PORT=3000,RAILS_LOG_LEVEL=warn,MAX_THREADS=10,WEB_CONCURRENCY=4
stopasgroup=true

[program:sidekiq_default]
command=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 30 -q default -q mailers -q push -q pull -q ingress
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/sidekiq.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=RAILS_ENV=production,DB_POOL=30
stopasgroup=true

[program:sidekiq_scheduler]
command=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 5 -q scheduler -q default -q mailers -q push -q pull -q ingress
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/sidekiq.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=RAILS_ENV=production,DB_POOL=5
stopasgroup=true

[program:sidekiq_pull]
command=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 30 -q pull -q ingress -q push -q default -q mailers
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/sidekiq.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=RAILS_ENV=production,DB_POOL=30
stopasgroup=true

[program:sidekiq_push]
command=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 30 -q push -q pull -q ingress -q default -q mailers
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/sidekiq.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=RAILS_ENV=production,DB_POOL=30
stopasgroup=true

[program:streaming]
command=/usr/bin/npm run start
user=mastodon
directory=/home/mastodon/live
stdout_logfile=/home/mastodon/live/log/streaming.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=5
redirect_stderr=true
environment=NODE_ENV=production,PORT=4000
stopasgroup=true
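
A quick way to check a config like this against the rule about queues is to count how many Sidekiq processes list each one. The function below is a small sketch: queue_coverage is a made-up name, and it assumes queues are always given as separate -q arguments, as they are above.

```shell
# Read Sidekiq command lines on stdin and print "<queue> <count>" for each
# queue, where count is the number of command lines that mention it.
queue_coverage() {
    grep -oE -e '-q +[a-z_]+' |
        awk '{ count[$2]++ } END { for (q in count) print q, count[q] }' |
        sort
}
```

Running `grep '^command=.*sidekiq' /etc/supervisor/conf.d/mastodon.conf | queue_coverage` against the file above should report 1 for scheduler and 4 for every other queue. A queue that's missing from the output entirely is one that no process will ever work on.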

Each Sidekiq worker consumes RAM, and potentially quite a lot of it, so you can't just stick a massive number in the -c argument.
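
When adjusting those numbers, it can help to know the total number of worker threads across all processes. The sketch below adds up the -c values; total_sidekiq_threads is a hypothetical helper name. As far as I understand it, each thread may also hold a database connection (hence DB_POOL matching -c above), so the total is worth comparing with what PostgreSQL is configured to accept as well as with the available RAM.

```shell
# Read Sidekiq command lines on stdin and print the sum of their
# -c (concurrency) values. Prints 0 if there are none.
total_sidekiq_threads() {
    grep -oE -e '-c +[0-9]+' | awk '{ total += $2 } END { print total + 0 }'
}
```

For example, `grep '^command=.*sidekiq' /etc/supervisor/conf.d/mastodon.conf | total_sidekiq_threads` adds up the four Sidekiq processes in the config above.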

Colin and I spent a few days obsessively staring at the Sidekiq monitoring tool from the Mastodon admin interface. If the Latency column in the "Queues" tab ever gets to more than a couple of minutes, that's a sign that you need to make a change quite urgently.
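
If you'd rather not stare at a web page, the same numbers can be pulled out on the command line. This is a sketch under a few assumptions: sidekiq_queue_stats and slow_queues are made-up names, the first has to be run from the Mastodon directory, and it relies on Sidekiq's Ruby API (Sidekiq::Queue) being loadable through bin/rails runner. The 120 second default is just "a couple of minutes" turned into a number.

```shell
# Print "<queue> <size> <latency in seconds>" for every Sidekiq queue.
# Run from the Mastodon directory; nothing calls this automatically.
sidekiq_queue_stats() {
    RAILS_ENV=production bin/rails runner '
        require "sidekiq/api"
        Sidekiq::Queue.all.each { |q| puts "#{q.name} #{q.size} #{q.latency.round}" }
    '
}

# Read lines in that format on stdin and print the names of the queues
# whose latency is above a threshold in seconds (default 120).
slow_queues() {
    awk -v limit="${1:-120}" '$3 > limit { print $1 }'
}
```

Something like `sidekiq_queue_stats | slow_queues` would then print nothing on a good day, and the names of the struggling queues on a bad one.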

Third bottleneck - nginx's worker limit

This really annoyed me, and I wonder how many of our earlier problems were caused by this!

We use nginx as a reverse proxy for the Puma web service that serves the web interface, and the streaming API. This means that when someone makes a connection to mathstodon.xyz, nginx accepts it, and acts as a go-between with whichever service is required.

For each connection, nginx opens a system file handle. The default limit for how many open connections nginx can have is very low.

When we grew past 1,000 active users, lots of people started reporting mysterious "Server Error 500" messages. The problem was that nginx couldn't open any more file handles, so it rejected new connections.

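It seems that you can hugely increase both the file handle limit, set by the worker_rlimit_nofile directive, and the connection limit, set by worker_connections, without any bad consequences. In /etc/nginx/nginx.conf, I set the first at the top level of the file and the second inside the events block, which is the only place nginx accepts it:

worker_rlimit_nofile 100000;

events {
    worker_connections 100000;
}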

Everything's been fine since I did that!

I think that nginx running out of file handles has knock-on effects for the other processes: jobs fail, so they stay in the queue, and everything ends up grinding to a halt.
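
To check how close a process is to its file handle limit, you can compare the number of open handles with the limit the kernel reports. This is a Linux-only sketch that reads /proc; fd_usage is a hypothetical helper name, and it needs to run as root (or as the process's owner) to see another process's handles.

```shell
# Print "<pid> <open file handles> <soft limit>" for one process.
# Linux only: relies on /proc/<pid>/fd and /proc/<pid>/limits.
fd_usage() {
    pid=$1
    # One entry in /proc/<pid>/fd per open file handle.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # The "Max open files" row lists the soft limit, then the hard limit.
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits" 2>/dev/null)
    echo "$pid $open $limit"
}
```

For nginx, something like `for pid in $(pgrep nginx); do fd_usage "$pid"; done` prints one line per process. If the second number is creeping up towards the third, it's time to raise the limit.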