arches.io Restart Heroku Dynos Instead of Letting Them Swap to Disk

08 Aug 2013

Here's what Heroku says about dyno memory usage:

Dynos are available in 1X or 2X sizes and are allocated 512MB or 1024MB respectively.

Dynos whose processes exceed their memory quota are identified by an R14 error in the logs. This doesn’t terminate the process, but it does warn of deteriorating application conditions: memory used above quota will swap out to disk, which substantially degrades dyno performance.

If the memory size keeps growing until it reaches three times its quota, the dyno manager will restart your dyno with an R15 error.

From https://devcenter.heroku.com/articles/dynos on 8/8/13

In other words, a dyno keeps swapping to disk until it reaches three times its quota: 1.5GB on a 1X dyno, 3GB on a 2X dyno. That is not good.

Heroku exposes dyno size to our logs through the [log-runtime-metrics beta feature](https://devcenter.heroku.com/articles/log-runtime-metrics). Get that up and running and you'll see dyno size info in your logs like this:

2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=load_avg_1m val=2.46
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=load_avg_5m val=1.06
2013-03-11T20:23:40+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=load_avg_15m val=0.99
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_total val=21.22 units=MB
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_rss val=21.22 units=MB
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_cache val=0.00 units=MB
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_swap val=0.00 units=MB
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_pgpgin val=348836 units=pages
2013-03-15T23:10:13+00:00 heroku[web.1]: source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 measure=memory_pgpgout val=343403 units=pages
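
Each of those lines carries three things a monitor cares about: the dyno name, the measure, and the value. As a minimal sketch of pulling them out (the `METRIC_LINE` pattern and the `parse_metric` helper are names made up for this illustration, not part of any Heroku library):

```ruby
# Matches the runtime-metrics lines shown above, e.g.
# "... heroku[web.1]: source=... measure=memory_total val=21.22 units=MB"
METRIC_LINE = /heroku\[(?<dyno>[\w.]+)\]: .*\bmeasure=(?<measure>\w+) val=(?<val>[\d.]+)/

# Returns { dyno:, measure:, value: } for a metrics line, or nil for any other line.
def parse_metric(line)
  m = METRIC_LINE.match(line)
  return nil unless m
  { dyno: m[:dyno], measure: m[:measure], value: m[:val].to_f }
end

# One of the sample lines from above.
line = "2013-03-15T23:10:13+00:00 heroku[web.1]: " \
       "source=heroku.2808254.web.1.d97d0ea7-cf3d-411b-b453-d2943a50b456 " \
       "measure=memory_total val=21.22 units=MB"
p parse_metric(line)
```

Note that the value is parsed as a float: the metrics are reported with two decimal places, so matching digits alone would drop the fractional part.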

Now we know how big each dyno is. We need a way to monitor the size and restart when we run up against our RAM limit, rather than swapping to disk. There are a few pieces to this puzzle:

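require 'heroku-api'  # legacy API client gem
require 'open-uri'    # lets open() fetch a URL (use URI.open on newer Rubies)

# connect to heroku's legacy API
heroku = Heroku::API.new(:api_key => API_KEY)

# get the logs: the response body is a URL, so open it and read the lines
response = open(heroku.get_logs(app_name, 'ps' => 'web', 'num' => '1000').body).readlines

# do some parsing to find out which dynos are too big:
# take the most recent memory_total line for web.1 and read its value in MB
line = response.reverse.detect { |l| l.include?("heroku[web.1]:") && l.include?("measure=memory_total") }
size = line.match(/val=([\d.]+)/)[1].to_f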

# restart big dynos
heroku.post_ps_restart(app_name, "ps" => "web.1")
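
The parsing step above looks at a single dyno. A sketch of the same idea across every web dyno might look like the following. It assumes the log lines arrive oldest first, and the `LIMIT_MB` constant, the `MEMORY_LINE` pattern, the `oversized_dynos` helper, and the sample lines are all made up for illustration:

```ruby
# Hypothetical threshold: the 1X quota quoted above. Use 1024 for 2X dynos.
LIMIT_MB = 512

# Captures the dyno name and the memory_total value from a metrics line.
MEMORY_LINE = /heroku\[(web\.\d+)\]: .*\bmeasure=memory_total val=([\d.]+)/

# Returns the names of web dynos whose most recent memory_total exceeds limit_mb.
# Assumes lines are in chronological order, oldest first.
def oversized_dynos(lines, limit_mb = LIMIT_MB)
  latest = {}
  lines.each do |line|
    dyno, val = line.match(MEMORY_LINE)&.captures
    latest[dyno] = val.to_f if dyno  # later lines overwrite earlier readings
  end
  latest.select { |_dyno, mb| mb > limit_mb }.keys
end

# Made-up sample lines in the log-runtime-metrics format.
sample = [
  "2013-03-15T23:10:13+00:00 heroku[web.1]: source=web.1 measure=memory_total val=530.10 units=MB",
  "2013-03-15T23:10:13+00:00 heroku[web.2]: source=web.2 measure=memory_total val=210.45 units=MB",
]
p oversized_dynos(sample)
```

Each name it returns can be passed to `post_ps_restart` as the `"ps"` value.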

Put all those pieces in a loop, then run the loop in its own Heroku dyno by adding a new process type to your Procfile (and scaling it to one dyno with `heroku ps:scale monitor=1`):

monitor: bundle exec ruby monitor.rb
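
As a rough sketch of what the loop inside `monitor.rb` could look like (this is not the production monitor): `check_once` and `run_monitor` are hypothetical names, the three arguments are callables you would write around the `get_logs`, parsing, and `post_ps_restart` pieces, and the 60-second interval is an assumption.

```ruby
# One pass of the monitor: fetch recent log lines, pick the dynos that are
# over the limit, and restart each one. All three arguments are hypothetical
# callables wrapping the get_logs / parsing / post_ps_restart pieces.
def check_once(fetch_lines, find_oversized, restart)
  find_oversized.call(fetch_lines.call).each { |dyno| restart.call(dyno) }
end

# Runs check_once forever, pausing between passes. The 60-second default
# interval is an assumption, not a value from the original monitor.
def run_monitor(fetch_lines, find_oversized, restart, interval: 60)
  loop do
    begin
      check_once(fetch_lines, find_oversized, restart)
    rescue StandardError => e
      # Keep the monitor dyno alive if one pass fails (e.g. an API error).
      warn "monitor pass failed: #{e.message}"
    end
    sleep interval
  end
end
```

The script would end with a call to `run_monitor`, passing in the three callables. One thing this sketch leaves out: right after a restart, the most recent log lines may still show the old memory reading, so a real monitor probably needs some way to avoid restarting the same dyno twice in a row.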

The full monitor that I'm using in production is in this gist: https://gist.github.com/arches/6187697