
Phusion Blog

Duplicity + S3: easy, cheap, encrypted, automated full-disk backups for your servers

By Hongli Lai on November 11th, 2013


Backups are one of those things that are important, but that a lot of people don’t do. The thought of setting up backups always raised a mental barrier for me for a number of reasons:

  • I have to think about where to back up to.
  • I have to remember to run the backup on a periodic basis.
  • I worry about the bandwidth and/or storage costs.

I still remember the days when a 2.5 GB harddisk was considered large, and when I had to spend a few hours splitting MP3 files and putting them on 20 floppy disks to transfer them between computers. Backing up my entire harddisk would have cost me hundreds of dollars and hours of time. Because of this, I tend to worry about the efficiency of my backups. I only want to back up things that need backing up.

I tended to tweak my backup software and rules to be as efficient as possible. However, this made setting up backups a total pain, and made it very easy to procrastinate on backups… until it was too late.

I learned to embrace Moore’s Law

Times have changed. Storage is cheap, very cheap. Time Machine — Apple’s backup software — taught me to stop worrying about efficiency. Backing up everything not only makes backing up a mindless and trivial task, it also makes me feel safe. I don’t have to worry about losing my data anymore. I don’t have to worry that my backup rules missed an important file.

Backing up desktops and laptops is easy and cheap enough. A 2 TB harddisk costs only $100.

What about servers?

  • Most people can’t go to the data center and attach a hard disk. Buying or renting another harddisk from the hosting provider can be expensive. Furthermore, if your backup device resides in the same location as the data center, then destruction of the data center (e.g. by fire) will destroy your backup as well.
  • Backup services provided by the hosting provider can be expensive.
  • Until a few years ago, bandwidth was relatively expensive, making backing up the entire harddisk to a remote storage service an unviable option for those with a tight budget.
  • And finally, do you trust that the storage provider will not read or tamper with your data?

Enter Duplicity and S3

Duplicity is a tool for creating incremental, encrypted backups. “Incremental” means that each backup only stores data that has changed since the last backup run. This is achieved by using the rsync algorithm.

What is rsync? It is a tool for synchronizing files between machines. The cool thing about rsync is that it only transfers changes. If you have a directory with 10 GB of files, and your remote machine has an older version of that directory, then rsync only transfers new files or changed files. Of the changed files, rsync is smart enough to only transfer the parts of the files that have changed!

At some point, Ben Escoto authored the tool rdiff-backup, an incremental backup tool which uses an rsync-like algorithm to create filesystem backups. Rdiff-backup also saves metadata such as permissions, owner and group IDs, ACLs, etc. Rdiff-backup stores past versions as well and allows easy rollback to a point in time. It even compresses backups. However, rdiff-backup has one drawback: you have to install it on the remote server as well. This makes it impossible to use rdiff-backup to backup to storage services that don’t allow running arbitrary software.

Ben later created Duplicity, which is like rdiff-backup but encrypts everything. Duplicity works without needing special software on the remote machine and supports many storage methods, for example FTP, SSH, and even S3.

On the storage side, Amazon has consistently lowered the prices of S3 over the past few years. The current price for the US-west-2 region is only $0.09 per GB per month.

Bandwidth costs have also dropped tremendously. Many hosting providers these days allow more than 1 TB of traffic per month per server.

This makes Duplicity and S3 the perfect combination for backing up my servers. Using encryption means that I don’t have to trust my service provider. Storing 200 GB only costs $18 per month.
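As a quick sanity check of that figure, here is a tiny Python sketch of the arithmetic. The function name is made up for illustration, the default price is simply the $0.09 per GB per month quoted above, and request and transfer charges are ignored.

```python
def monthly_storage_cost(stored_gb, price_per_gb_month=0.09):
    """Estimate the monthly S3 storage bill in dollars.

    The default price is the figure quoted in this article (an assumption
    that will go stale); check the current AWS price list before relying on it.
    """
    return stored_gb * price_per_gb_month


# 200 GB at the quoted price.
print(f"${monthly_storage_cost(200):.2f} per month")
```

Plugging in your own disk usage gives a rough upper bound, since Duplicity compresses data before uploading it.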

Setting up Duplicity and S3 using Duply

Duplicity in itself is still a relative pain to use. It has many options — too many if you’re just starting out. Luckily there is a tool which simplifies Duplicity even further: Duply. It keeps your settings in a profile, and supports pre- and post-execution scripts.

Let’s install Duplicity and Duply. If you’re on Ubuntu, you should add the Duplicity PPA so that you get the latest version. If not, you can just install an older version of Duplicity from the distribution’s repositories.

# Replace 'precise' with your Ubuntu version's codename.
echo deb precise main | \
sudo tee /etc/apt/sources.list.d/duplicity.list
sudo apt-get update


# python-boto adds S3 support
sudo apt-get install duplicity duply python-boto

Create a profile. Let’s name this profile “test”.

duply test create

This will create a configuration file in $HOME/.duply/test/conf. Open it in your editor. You will be presented with a lot of configuration options, but only a few are really important. Two of them are GPG_KEY and GPG_PW. Duplicity supports asymmetric public-key encryption, or symmetric password-only encryption. For the purposes of this tutorial we’re going to use symmetric password-only encryption because it’s the easiest.

Let’s generate a random, secure password:

openssl rand -base64 20

Comment out GPG_KEY and set a password in GPG_PW:

GPG_PW='<the password you just got from openssl>'

Scroll down and set the TARGET options:

TARGET='s3://s3-<region endpoint name><bucket name>/<folder name>'
TARGET_USER='<your AWS access key ID>'
TARGET_PASS='<your AWS secret key>'

Substitute “region endpoint name” with the host name of the region in which you want to store your S3 bucket. You can find a list of host names at the AWS website. For example, for US-west-2 (Oregon):


Set the base directory of the backup. We want to back up the entire filesystem:

SOURCE='/'

It is also possible to set a maximum time for keeping old backups. In this tutorial, let’s set it to 6 months:

MAX_AGE=6M

Save and close the configuration file.

There are also some things that we never want to back up, such as /tmp, /dev and log files. So we create an exclusion file $HOME/.duply/test/exclude with the following contents:

- /dev
- /home/*/.cache
- /home/*/.ccache
- /lost+found
- /media
- /mnt
- /proc
- /root/.cache
- /root/.ccache
- /run
- /selinux
- /sys
- /tmp
- /u/apps/*/current/log/*
- /u/apps/*/releases/*/log/*
- /var/cache/*/*
- /var/log
- /var/run
- /var/tmp

This file follows the Duplicity file list syntax. The - sign here means “exclude this directory”. For more information, please refer to the Duplicity man page.

Notice that this file excludes Capistrano-deployed Ruby web apps’ log files. If you’re running Node.js apps on your server then it’s easy to exclude your Node.js log files in a similar manner.

Finally, go to the Amazon S3 control panel, and create a bucket in the chosen region:

[Screenshots: creating a bucket on S3 and entering the bucket name]

Initiating the backup

We’re now ready to initiate the backup. This can take a while, so let’s open a screen session so that we can terminate the SSH session and check back later.

sudo apt-get install screen
screen

Initiate the backup:

sudo duply test backup

Press Ctrl-A followed by D to detach the screen session.

Check back a few hours later. Log in to your server and reattach your screen session:

screen -x

You should see something like this, which means that the backup succeeded. Congratulations!

--------------[ Backup Statistics ]--------------
Errors 0

--- Finished state OK at 16:48:16.192 - Runtime 01:17:08.540 ---

--- Start running command POST at 16:48:16.213 ---
Skipping n/a script '/home/admin/.duply/main/post'.
--- Finished state OK at 16:48:16.244 - Runtime 00:00:00.031 ---

Setting up periodic incremental backups with cron

We can use cron, the system’s periodic task scheduler, to set up periodic incremental backups. Edit root’s crontab:

sudo crontab -e

Insert the following:

0 2 * * 7 env HOME=/home/admin duply main backup

This line runs the duply main backup command every Sunday at 2:00 AM. (The profile here is named main; substitute the name of your own profile, for example test if you followed the steps above.) Note that we set the HOME environment variable here to /home/admin. Duply is run as root because the cron job belongs to root. However, the Duply profiles are stored in /home/admin/.duply, which is why we need to set the HOME environment variable here.

If you want to set up daily backups, replace “0 2 * * 7” with “0 2 * * *”.

Making cron jobs less noisy

Cron has a nice feature: it emails you with the output of every job it has run. If you find that this gets annoying after a while, then you can make it only email you if something went wrong. For this, we’ll need the silence-unless-failed tool, part of phusion-server-tools. This tool runs the given command and swallows its output, unless the command fails.

Install phusion-server-tools and edit root’s crontab again:

sudo git clone /tools
sudo crontab -e


Replace:

env HOME=/home/admin duply main backup

with:

/tools/silence-unless-failed env HOME=/home/admin duply main backup
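
To make the "swallow the output unless the command fails" idea concrete, here is a rough Python sketch of that behaviour. It is not the actual silence-unless-failed implementation, and the function name run_quietly is hypothetical; it only illustrates the logic described above.

```python
import subprocess
import sys


def run_quietly(command):
    """Run `command` (a list of arguments) and capture all of its output.

    Returns (exit_code, output_to_show). The output is discarded when the
    command succeeds and returned unchanged when it fails.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stdout
        text=True,
    )
    if result.returncode == 0:
        return 0, ""
    return result.returncode, result.stdout


# Two stand-in commands: one that succeeds and one that fails.
print(run_quietly([sys.executable, "-c", "print('all good')"]))
print(run_quietly([sys.executable, "-c", "import sys; print('disk full'); sys.exit(1)"]))
```

Because cron only sends mail when a job produces output, a wrapper along these lines stays silent on success and still reports everything on failure.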

Restoring a backup

Simple restores

You can restore the latest backup with the Duply restore command. It is important to use sudo because this allows Duplicity to restore the original filesystem metadata.

The following will restore the latest backup to a specific directory. The target directory does not need to exist; Duplicity will automatically create it. After restoration, you can move its contents to the root filesystem using mv.

sudo duply main restore /restored_files

You can’t just do sudo duply main restore / here because your system files (e.g. bash, libc, etc) are in use.

Moving the files from /restored_files to / using mv might still not work for you. In that case, consider booting your server from a rescue system and restoring from there.

Restoring a specific file or directory

Use the fetch command to restore a specific file. This restores the /etc/passwd file in the backup and saves it to /home/admin/passwd. Notice the lack of a leading slash in the etc/passwd argument.

sudo duply main fetch etc/passwd /home/admin/passwd

The fetch command also works on directories:

sudo duply main fetch etc /home/admin/etc

Restoring from a specific date

Every restoration command accepts a date, allowing you to restore from that specific date.

First, use the status command to get an overview of backup dates:

$ duply main status
Number of contained backup sets: 2
Total number of contained volumes: 2
 Type of backup set:                            Time:      Num volumes:
                Full         Sat Nov  8 07:38:30 2013                 1
         Incremental         Sat Nov  9 07:43:17 2013                 1

In this example, we restore the November 8 backup. Unfortunately we can’t just copy and paste the time string. Instead, we have to write the time in the W3 datetime format, such as 2013-11-08T07:38:30. See also the Time Formats section in the Duplicity man page.

sudo duply main restore /restored_files '2013-11-08T07:38:30'
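
If you would rather not retype the timestamp by hand, the conversion is easy to script. The following Python sketch is only an illustration: the function name is made up, it assumes the status listing uses English day and month names, and it keeps the time as printed without adding a timezone.

```python
from datetime import datetime


def status_time_to_w3(status_time):
    """Convert a time string from the status listing to YYYY-MM-DDTHH:MM:SS."""
    # The listing pads single-digit days with an extra space; strptime
    # treats any run of whitespace as a single separator.
    parsed = datetime.strptime(status_time, "%a %b %d %H:%M:%S %Y")
    return parsed.isoformat()


print(status_time_to_w3("Sat Nov  8 07:38:30 2013"))
```

The resulting string can be passed as the last argument of the restore or fetch command.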

Safely store your keys or passwords!

Whether you used asymmetric public-key encryption or symmetric password-only encryption, you must store them safely! If you ever lose them, you will lose your data. There is no way to recover encrypted data for which the key or password is lost.

My preferred way of storing secrets is to store them inside 1Password and to replicate the data to my phone and tablet so that I have redundant encrypted copies. Alternatives to 1Password include LastPass or KeePass although I have no experience with them.


With Duplicity, Duply and S3, you can set up cheap and secure automated backups in a matter of minutes. For many servers this combo is the silver bullet.

One thing that this tutorial hasn’t dealt with is database backups. While we’re backing up the database’s raw files, doing so isn’t a good idea. If the database files were being written to at the time the backup was made, then the backup will contain potentially irrecoverably corrupted database files. Even the database’s journaling file or write-ahead log won’t help, because these technologies are designed only to protect against power failures, not against concurrent file-level backup processes. Luckily Duply supports the concept of pre-scripts. In the next part of this article, we’ll cover pre-scripts and database backups.

I hope you’ve enjoyed this article. If you have any comments, please don’t hesitate to post them below. We regularly publish news and interesting articles. If you’re interested, please follow us on Twitter, or subscribe to our newsletter.


Tuning Phusion Passenger’s concurrency settings

By Hongli Lai on March 12th, 2013

Update February 2015: this blog post has been superseded by the Server Optimization Guide. Please read that guide instead.

Phusion Passenger turns Apache and Nginx into a full-featured application server for Ruby and Python web apps. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

We recently received a support inquiry from a Phusion Passenger Enterprise customer regarding excessive process creation activity. During peak times, Phusion Passenger would suddenly create a lot of processes, making the server slow or unresponsive for a period of time. This is because Phusion Passenger spawns and shuts down application processes according to traffic, but they apparently had irregular traffic patterns during peak times. Since their servers were each dedicated to a single application, the solution was to make the number of processes constant regardless of traffic. This could be done by setting PassengerMinInstances to a value equal to PassengerMaxPoolSize.

The customer then raised the question: what is the best value for PassengerMaxPoolSize? This is a non-trivial question, and the answer encompasses more than just PassengerMaxPoolSize. In this article we’re going to shed more light on this topic.

For simplicity reasons, we assume that your server only hosts 1 web application. Things become more complicated when more web applications are involved, but you can use the principles in this article to apply to multi-application server environments.

Aspects of concurrency tuning

The goal of tuning is usually to maximize throughput. Increasing the number of processes or threads increases the maximum throughput and concurrency, but there are several factors that should be kept in mind.

  • Memory. More processes mean higher memory usage. If too much memory is used then the machine will hit swap, which slows everything down. You should only have as many processes as memory limits comfortably allow. Threads use less memory, so prefer threads when possible. You can create tens of threads in place of one process.
  • Number of CPUs. True (hardware) concurrency cannot be higher than the number of CPUs. In theory, if all processes/threads on your system use the CPUs constantly, then:
    • You can increase throughput up to NUMBER_OF_CPUS processes/threads.
    • Increasing the number of processes/threads after that point will increase virtual (software) concurrency, but will not increase true (hardware) concurrency and will not increase maximum throughput.

    Having more processes than CPUs may decrease total throughput a little thanks to context switching overhead, but the difference is not big because OSes are good at context switching these days.

    On the other hand, if your CPUs are not used constantly, e.g. because they’re often blocked on I/O, then the above does not apply and increasing the number of processes/threads does increase concurrency and throughput, at least until the CPUs are saturated.

  • Blocking I/O. This covers all blocking I/O, including hard disk access latencies, database call latencies, web API calls, etc. Handling input from the client and output to the client does not count as blocking I/O, because Phusion Passenger has buffering layers that relieve the application from worrying about this.

    The more blocking I/O calls your application process/thread makes, the more time it spends waiting for external components. While it’s waiting it does not use the CPU, so that’s when another process/thread should get the chance to use the CPU. If no other process/thread needs CPU right now (e.g. all processes/threads are waiting for I/O) then CPU time is essentially wasted. Increasing the number of processes or threads decreases the chance of CPU time being wasted. It also increases concurrency, so that clients do not have to wait for a previous I/O call to be completed before being served.

With these in mind, we give the following tuning recommendations. These recommendations assume that your machine is dedicated to Phusion Passenger. If your machine also hosts other software (e.g. a database) then you should look at the amount of RAM that you’re willing to reserve for Phusion Passenger and Phusion Passenger-served applications.

Tuning the application process and thread count

In our experience, a typical single-threaded Rails application process uses 100 MB of RAM on a 64-bit machine, and by contrast, a thread would only consume 10% as much. We use this fact in determining a proper formula.

Step 1: determine the system’s limits

First, let’s define the maximum number of (single-threaded) processes, or the number of threads, that you can comfortably have given the amount of RAM you have. This is a reasonable upper limit that you can reach without degrading system performance. We use the following formulas.

In purely single-threaded multi-process deployments, the formula is as follows:

max_app_processes = (TOTAL_RAM * 0.75) / RAM_PER_PROCESS

This formula is derived as follows:

  • (TOTAL_RAM * 0.75): We can assume that there must be at least 25% of free RAM that the operating system can use for other things. The result of this calculation is the RAM that is freely available for applications.
  • / RAM_PER_PROCESS: Each process consumes a roughly constant amount of RAM, so the maximum number of processes is a single division of the result of the aforementioned calculation by this constant.

In multithreaded deployments, the formula is as follows:

max_app_threads_per_process =
  ((TOTAL_RAM * 0.75) - (NUMBER_OF_PROCESSES * RAM_PER_PROCESS)) /
  (RAM_PER_PROCESS / 10)

Here, NUMBER_OF_PROCESSES is the number of application processes you want to use. In case of Ruby or Python, this should be equal to NUMBER_OF_CPUS. This is because both Ruby and Python have a Global Interpreter Lock, so they cannot utilize multiple cores no matter how many threads they’re using. By using multiple processes, you can utilize multiple cores. If you’re using a language runtime that does not have a Global Interpreter Lock, e.g. JRuby or Rubinius, then NUMBER_OF_PROCESSES can be 1.

This formula is derived as follows:

  • (TOTAL_RAM * 0.75): The same as explained earlier.
  • - (NUMBER_OF_PROCESSES * RAM_PER_PROCESS): In multithreaded deployments, the application processes consume a constant amount of memory, so we deduct this from the RAM that is available to applications. The result is the amount of RAM available to application threads.
  • / (RAM_PER_PROCESS / 10): A thread consumes about 10% of the amount of memory a process would, so we divide the amount of RAM available to threads by this number. What we get is the number of threads that the system can handle.

On 32-bit systems, max_app_threads_per_process should not be higher than about 200. Assuming an 8 MB stack size per thread, you will run out of virtual address space if you go much further. On 64-bit systems you don’t have to worry about this problem.
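
The two limits from this step are easy to compute with a few lines of Python. This is just a sketch of the formulas as derived in this step; the function and parameter names are made up for illustration, and RAM figures are assumed to be in megabytes.

```python
def max_app_processes(total_ram_mb, ram_per_process_mb):
    """Upper limit for purely single-threaded multi-process deployments."""
    # Keep 25% of RAM free for the operating system.
    return (total_ram_mb * 0.75) / ram_per_process_mb


def max_app_threads_per_process(total_ram_mb, ram_per_process_mb, number_of_processes):
    """Upper limit for multithreaded deployments."""
    ram_for_apps = total_ram_mb * 0.75
    # The processes themselves take a constant amount of memory.
    ram_for_threads = ram_for_apps - number_of_processes * ram_per_process_mb
    # A thread is assumed to cost about 10% of a process.
    ram_per_thread = ram_per_process_mb / 10
    return ram_for_threads / ram_per_thread


# A 1 GB machine with 100 MB processes, and 4 processes in the threaded case.
print(max_app_processes(1024, 100))
print(max_app_threads_per_process(1024, 100, 4))
```

The results are fractional on purpose; round down when you turn them into configuration values.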

Step 2: derive the applications’ needs

The earlier two formulas were not for calculating the number of processes or threads that your application needs, but for calculating how much the system can handle without getting into trouble. Your application may not actually need that many processes or threads! If your application is CPU-bound, then you only need a small multiple of the number of CPUs you have. Only if your application performs a lot of blocking I/O (e.g. database calls that take tens of milliseconds to complete, or calls to the Twitter API) do you need a large number of processes or threads.

Armed with this knowledge, we derive the formulas for calculating how many processes or threads we actually need.

  • If your application performs a lot of blocking I/O then you should give it as many processes and threads as possible:
    # Use this formula for purely single-threaded multi-process deployments.
    desired_app_processes = max_app_processes
    # Use this formula for multithreaded deployments.
    desired_app_threads_per_process = max_app_threads_per_process
  • If your application doesn’t perform a lot of blocking I/O, then you should limit the number of processes or threads to a multiple of the number of CPUs to minimize context switching:
    # Use this formula for purely single-threaded multi-process deployments.
    desired_app_processes = min(max_app_processes, NUMBER_OF_CPUS)
    # Use this formula for multithreaded deployments.
    desired_app_threads_per_process = min(max_app_threads_per_process, 2 * NUMBER_OF_CPUS)
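
The decision in the list above can be written down as two small Python functions. Treat this as a sketch: the names are hypothetical, and whether an application counts as performing "a lot of blocking I/O" is a judgment call that you pass in as a boolean.

```python
def desired_app_processes(max_processes, number_of_cpus, lots_of_blocking_io):
    """For purely single-threaded multi-process deployments."""
    if lots_of_blocking_io:
        return max_processes
    return min(max_processes, number_of_cpus)


def desired_app_threads_per_process(max_threads, number_of_cpus, lots_of_blocking_io):
    """For multithreaded deployments."""
    if lots_of_blocking_io:
        return max_threads
    return min(max_threads, 2 * number_of_cpus)


# A CPU-bound application on a 4-CPU machine, using limits from step 1.
print(desired_app_processes(7.68, 4, lots_of_blocking_io=False))
print(desired_app_threads_per_process(36.8, 4, lots_of_blocking_io=False))
```

The inputs are the limits calculated in step 1, so the result never exceeds what the machine can handle without swapping.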

Step 3: configure Phusion Passenger

You should put the number for desired_app_processes into the PassengerMaxPoolSize option. Whether you want to make PassengerMinInstances equal to that number or not is up to you: doing so will make the number of processes static, regardless of traffic. If your application has very irregular traffic patterns, response times could suffer while Passenger spins up new processes to handle peak traffic. Setting PassengerMinInstances as high as possible prevents this problem.

If desired_app_processes is 1, then you should set PassengerSpawnMethod conservative (on Phusion Passenger 3 or earlier) or PassengerSpawnMethod direct (on Phusion Passenger 4 or later). By using direct/conservative spawning instead of smart spawning, Phusion Passenger will not keep an ApplicationSpawner/Preloader process around. This is because an ApplicationSpawner/Preloader process is useless when there’s only 1 application process.

In order to use multiple threads you must use Phusion Passenger Enterprise 4. The open source version of Phusion Passenger does not support multithreading, and neither does version 3 of Phusion Passenger Enterprise. At the time of writing, Phusion Passenger Enterprise 4.0 is on its 4th Release Candidate. You can download it from the Customer Area.

You should put the number for desired_app_threads_per_process into the PassengerThreadCount option. If you do this, you also need to set PassengerConcurrencyModel thread in order to turn on multithreading support.

Possible step 4: configure Rails

Only if you’re on a multithreaded deployment do you need to configure Rails.

Rails has been thread-safe since version 2.2, but you need to enable thread-safety by calling config.threadsafe! in config/environments/production.rb.

You should also increase the ActiveRecord pool size because it limits concurrency. You can configure it in config/database.yml. Set the pool size to the number of threads. But if you believe your database cannot handle that much concurrency, keep it at a low value.

Example 1: purely single-threaded multi-process deployment with lots of blocking I/O

Suppose you have 1 GB of RAM and lots of blocking I/O, and you’re on a purely single-threaded multi-process deployment.

# Use this formula for purely single-threaded multi-process deployments.
max_app_processes = (1024 * 0.75)  / 100 = 7.68
desired_app_processes = max_app_processes = 7.68

Conclusion: you should use 7 or 8 processes. Phusion Passenger should be configured as follows:

PassengerMaxPoolSize 7

However a concurrency of 7 or 8 is way too low if your application performs a lot of blocking I/O. You should use a multithreaded deployment instead, or you need to get more RAM so you can run more processes.

Example 2: multithreaded deployment with lots of blocking I/O

Consider the same machine and application (1 GB RAM, lots of blocking I/O), but this time you’re on a multithreaded deployment. How many threads do you need per process?

Let’s assume that we’re using Ruby and that we have 4 CPUs, so we run 4 application processes. Then:

# Use this formula for multithreaded deployments.
max_app_threads_per_process
= ((1024 * 0.75) - (4 * 100)) / (100 / 10)
= 368 / 10
= 36.8

Conclusion: you should use 4 processes, each with 36-37 threads. Phusion Passenger Enterprise should be configured as follows:

PassengerMaxPoolSize 4
PassengerConcurrencyModel thread
PassengerThreadCount 36

Configuring the web server

If you’re using Nginx then it does not need configuring. Nginx is evented and already supports a high concurrency out of the box.

If you’re using Apache, then prefer the worker MPM (which uses a combination of processes and threads) or the event MPM (which is similar to the worker MPM, but better) over the prefork MPM (which only uses processes) whenever possible. PHP requires prefork, but if you don’t use PHP then you can probably use one of the other MPMs. Make sure you set a low number of processes and a moderate to high number of threads.

Because Apache performs a lot of blocking I/O (namely HTTP handling), you should give it a lot of threads so that it has a lot of concurrency. The number of threads should be at least the number of concurrent clients that you’re willing to serve with Apache. A small website can get away with 1 process and 100 threads. A large website may want to have 8 processes and 200 threads per process (resulting in 1600 threads in total).

If you cannot use the event MPM, consider putting Apache behind an Nginx reverse proxy, with response buffering turned on on the Nginx side. This relieves Apache of a lot of concurrency problems. If you can use the event MPM then adding Nginx to the mix does not provide many advantages.


  • If your application performs a lot of blocking I/O, use lots of processes/threads. You should move away from single-threaded multiprocessing in this case, and start using multithreading.
  • If your application is CPU-bound, use a small multiple of the number of CPUs.
  • Do not exceed the number of processes/threads your system can handle without swapping.


The right way to deal with frozen processes on Unix

By Hongli Lai on September 21st, 2012

Those who administer production Unix systems have undoubtedly encountered the problem of frozen processes before. They just sit around, consuming CPU and/or memory indefinitely until you forcefully shut them down.

Phusion Passenger 3 – our high-performance and advanced web application server – was not completely immune to this problem either. That is until today, because we have just implemented a fix which will be part of Phusion Passenger 4.0 (4.0 hasn’t been released yet, but a first beta will appear soon). It’s going into the open source version, not just Phusion Passenger Enterprise, because we believe this is an important change that should benefit the entire community.

Behind our solution lies an array of knowledge about operating systems, process management and Unix. Today, I’d like to take this opportunity to share some of our knowledge with our readers. In this article we’re going to dive into the operating system-level details of frozen processes. Some of the questions answered by this article include:

  • What causes a process to become frozen and how can you debug it?
  • What facilities does Unix provide to control and to shut down processes?
  • How can a process manager (e.g. Phusion Passenger) automatically clean up frozen processes?
  • Why was Phusion Passenger not immune to the problem of frozen processes?
  • How is Phusion Passenger’s frozen process killing fix implemented, and under what constraints does it work?

This article attempts to be generic, but will also provide Ruby-specific tips.

Not all frozen processes are made equal

Let me first clear up some confusion. Frozen processes are sometimes also called zombie processes, but this is not formally correct. Formally, a zombie process as defined by Unix operating systems is a process that has already exited, but whose parent process has not yet waited for its exit. Zombie processes show up in ps and on Linux they have “<defunct>” appended to their names. Zombie processes do not consume memory or CPU. Killing them – even with SIGKILL – has no effect. The only way to get rid of them is either to make the parent process call waitpid() on them, or to terminate the parent process. The latter causes the zombie process to be “adopted” by the init process (PID 1), which immediately calls waitpid() on any adopted processes. In any case, zombie processes are harmless.
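
The life cycle of a zombie can be observed with a short Python experiment. This is a sketch under a few assumptions: it targets Linux, it relies on os.waitid() (which is not available on every platform or Python version), and the function name is made up for illustration.

```python
import os
import signal


def demonstrate_zombie():
    """Create a short-lived zombie and report what happens to it."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with a recognisable status

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so from here on it is a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    os.kill(pid, signal.SIGKILL)  # accepted, but has no effect on a zombie
    os.kill(pid, 0)               # signal 0 only checks that the PID still exists
    existed_as_zombie = True      # reached only if neither call raised

    _, status = os.waitpid(pid, 0)  # reap it, as a well-behaved parent would
    exit_code = os.WEXITSTATUS(status)

    try:
        os.kill(pid, 0)
        gone_after_wait = False
    except ProcessLookupError:
        gone_after_wait = True

    return {
        "exit_code": exit_code,
        "existed_as_zombie": existed_as_zombie,
        "gone_after_wait": gone_after_wait,
    }


print(demonstrate_zombie())
```

The SIGKILL in the middle changes nothing: the exit status that waitpid() reports is still the one the child chose, and the process table entry only disappears once the parent has waited for it.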

In this article we’re only covering frozen processes, not zombie processes. A process is often considered frozen if it has stopped responding normally. Such a process appears to be stuck inside something and does not throw errors. Some of the general causes are:

  1. The process is stuck in an infinite loop. In practice we rarely see this kind of freeze.
  2. The process is very slow, even during shutdown, causing it to appear frozen. Some of the causes include:
    2.1. It is using too much memory, causing it to hit the swap. In this case you should notice the entire system becoming slow.
    2.2. It is performing an unoptimized operation that takes a long time. For example you may have code that iterates through all database table rows to perform a calculation. In development you only had a handful of rows so you never noticed this, but you forgot that in production you have millions of rows.
  3. The process is stuck in an I/O operation. It’s waiting for an external process or a server, be it the database, an HTTP API server, or whatever.

Debugging frozen processes

Obviously fixing a frozen process involves more than figuring out how to automatically kill it. Killing it just fixes the symptoms, and should be considered a final defense in the war against frozen processes. It’s better to figure out the root cause. We have a number of tools in our arsenal to find out what a frozen process is doing.


crash-watch is a tool that we’ve written to easily obtain process backtraces. Crash-watch can be instructed to dump backtraces when a given process crashes, or to dump the process’s current backtraces.

Crash-watch is actually a convenience wrapper around gdb, so you must install gdb first. It dumps C-level backtraces, not language-specific backtraces. If you run crash-watch on a Ruby program it will dump the Ruby interpreter’s C backtraces and not the Ruby code’s backtraces. It also dumps the backtraces of all threads, not just the active thread.

Invoke crash-watch as follows:

crash-watch --dump <PID>

Here is a sample output. This output is the result of invoking crash-watch on a simple “hello world” Rack program that simulates being frozen.

Crash-watch is especially useful for analyzing C-level problems. As you can see in the sample output, the program is frozen in a freeze_process call.

Crash-watch can also assist in analyzing problems caused by Ruby C extensions. Ruby’s mysql gem is quite notorious in this regard because it blocks all Ruby threads while it’s doing its work. If the MySQL server doesn’t respond (e.g. because of network problems, because the server is dead, or because the query is too heavy) then the mysql gem will freeze the entire process, making it unable even to respond to signals. With crash-watch you are able to clearly see that a process is frozen in a mysql gem call.

Phusion Passenger’s SIGQUIT handler

Ruby processes managed by Phusion Passenger respond to SIGQUIT by printing their backtraces to stderr. On Phusion Passenger, stderr is always redirected to the global web server error log (e.g. /var/log/apache2/error.log or /var/log/nginx/error.log), not the vhost-specific error log. If your Ruby interpreter supports it, Phusion Passenger will even print the backtraces of all Ruby threads, not just the active one. This latter feature requires either Ruby Enterprise Edition or Ruby 1.9.

Note that for this to work, the Ruby interpreter must be responding to signals. If the Ruby interpreter is frozen inside a C extension call (such as is the case in the sample program) then nothing will happen. In that case you should use crash-watch or the rb_backtrace() trick below.
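
To give an idea of what such a handler involves, here is a minimal Ruby sketch. It is not Phusion Passenger's actual implementation: the helper name format_thread_backtraces is made up for this example, and it assumes Ruby 1.9 or later, where Thread#backtrace is available.

```ruby
# Collects the backtraces of all live threads into one string.
# Hypothetical helper for illustration, not part of Phusion Passenger.
def format_thread_backtraces
  Thread.list.map do |thread|
    header = "Thread #{thread.object_id} (#{thread.status}):"
    frames = (thread.backtrace || []).map { |frame| "    #{frame}" }
    ([header] + frames).join("\n")
  end.join("\n\n")
end

# On SIGQUIT, print every thread's backtrace to stderr. This only works
# while the interpreter is still able to run signal handlers.
Signal.trap("QUIT") do
  $stderr.write(format_thread_backtraces + "\n")
end
```

With this in place, sending SIGQUIT to the process (kill -QUIT <PID>) should write the backtraces to wherever stderr points.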


If you want to debug a Ruby program that’s not managed by Phusion Passenger then there’s another trick to obtain the Ruby backtrace. The Ruby interpreter has a nice function called rb_backtrace() which causes it to print its current Ruby-level backtrace to stdout. You can use gdb to force a process to call that function. This works even when the Ruby interpreter is stuck inside a C call. This method has two downsides:

  1. Its reliability depends on the state of the Ruby interpreter. You are forcing a call from arbitrary places in the code, so you risk corrupting the process state. Use with caution.
  2. It only prints the backtrace of the active Ruby thread. It’s not possible to print the backtraces of any other Ruby threads.

First, start gdb:

$ gdb

Then attach gdb to the process you want:

attach <PID>

This will probably print a whole bunch of messages. Ignore them; if gdb prints a lot of library names and then asks you whether to continue, answer Yes.

Now we get to the cream. Use the following command to force a call to rb_backtrace():

p (void) rb_backtrace()

You should now see a backtrace appearing in the process’s stdout:

from `freeze_process'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/rack/thread_handler_extension.rb:67:in `call'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/rack/thread_handler_extension.rb:67:in `process_request'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:126:in `accept_and_process_next_request'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:100:in `main_loop'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/utils/robust_interruption.rb:82:in `disable_interruptions'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:98:in `main_loop'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:432:in `start_threads'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:426:in `initialize'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:426
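
The interactive steps above can probably also be scripted. The following Ruby sketch assumes GNU gdb is installed and on the PATH, and relies on its -p, -batch and -ex options. The helper names are hypothetical, and the same caveats about forcing a call into a running interpreter apply.

```ruby
# Builds the gdb command line that attaches to a process, forces a call to
# rb_backtrace(), and then exits, which detaches from the process again.
# Hypothetical helper; assumes GNU gdb is on the PATH.
def rb_backtrace_command(pid)
  ["gdb", "-p", Integer(pid).to_s, "-batch", "-ex", "p (void) rb_backtrace()"]
end

# Runs the command. The backtrace is printed by the target process itself,
# so it shows up in that process's output, not in gdb's output.
def dump_ruby_backtrace(pid)
  system(*rb_backtrace_command(pid))
end
```

Passing the PID through Integer() rejects anything that is not a number before it reaches the command line.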

strace and dtruss

The strace tool (Linux) and the dtruss tool (OS X) can be used to see which system calls a process is making. This is especially useful for detecting problems belonging to categories (1) and (2.2).

Invoke strace and dtruss as follows:

sudo strace -p <PID>
sudo dtruss -p <PID>

Phusion Passenger’s role


Phusion Passenger was traditionally architected to trust application processes. That is, it assumed that if we tell an application process to start, it will start, and if we tell it to stop, it will stop. In practice this is not always true. Applications and libraries contain bugs that can cause freezes, or maybe interaction with a buggy external component causes freezes (network problems, database server problems, etc). Our point of view was that the developer and system administrator should be responsible for these kinds of problems. If the developer/administrator does not manually intervene, the system may remain unusable.

Starting from Phusion Passenger 3, we began turning away from this philosophy. Our core philosophy has always been that software should Just Work™ with the least amount of hassle, and that software should strive to be zero-maintenance. As a result, Phusion Passenger 3 introduced the Watchdog, Phusion Passenger Enterprise introduced request time limiting, and Phusion Passenger 4 will introduce application spawning time limiting. These things attack different aspects of the freezing process problem:

  1. Application spawning time limiting solves the problem of application processes not starting up quickly enough, or application processes freezing during startup. This feature will be included in the open source version of Phusion Passenger.
  2. Request time limiting solves the problem of application processes freezing during a web request. This feature is not available in the open source version of Phusion Passenger, and only in Phusion Passenger Enterprise.
  3. The Watchdog traditionally only solved the problem of Phusion Passenger helper processes (namely the HelperAgent and LoggingAgent) freezing during web server shutdown. Now, it also solves the problem of application processes freezing during web server shutdown. These features will also be included in the open source version of Phusion Passenger.

The shutdown procedure and the fix

When you shut down your web server, the Watchdog will be the one to notice this event. It will send a message to the LoggingAgent and the HelperAgent to tell them to gracefully shut down. In turn, the HelperAgent will tell all application processes to gracefully shut down. The HelperAgent does not wait until they’ve actually shut down. Instead it will just assume that they will eventually shut down. It is for this reason that even if you shut down your web server, application processes may stay behind.

The Watchdog was already designed to assume that agent processes could freeze during shutdown, so it gives agent processes a maximum amount of time during which they shut down gracefully. If they don’t, then the Watchdog will forcefully kill them with SIGKILL. It wouldn’t just kill the agent processes, but also all application processes.
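
The "ask nicely first, then force it" pattern described above can be sketched in a few lines of Ruby. This is only an illustration of the idea, not the Watchdog's actual code (the Watchdog is not written in Ruby); the function name and the polling interval are made up for this example.

```ruby
# Asks a process to shut down with SIGTERM, waits up to `grace_period`
# seconds, and falls back to SIGKILL if it is still running.
# Returns :graceful or :killed. `pid` must be a child of the caller.
def shutdown_with_deadline(pid, grace_period)
  Process.kill("TERM", pid)
  deadline = Time.now + grace_period
  while Time.now < deadline
    # WNOHANG makes waitpid return nil right away if the child is still alive.
    return :graceful if Process.waitpid(pid, Process::WNOHANG)
    sleep 0.05
  end
  Process.kill("KILL", pid)   # SIGKILL cannot be caught or ignored
  Process.waitpid(pid)
  :killed
end
```

A process that ignores SIGTERM, or that is frozen, never gets a say once the grace period is over.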

The fix was therefore pretty straightforward. Always have the Watchdog kill application processes, even if the HelperAgent terminates normally and in time. The final fix was effectively 3 lines of code.

Utilizing Unix process groups

The most straightforward method to shut down application processes would be to maintain a list of their PIDs and then kill them one by one. The Watchdog however uses a more powerful and little-used Unix mechanism, namely process groups. Let me first explain them.

A system can have multiple process groups. A process belongs to exactly one process group. Each process group has exactly 1 “process group leader”. The process group ID is equal to the PID of the group leader.

The process group that a process belongs to is inherited from its parent process upon forking. However a process can be a member of any process group, no matter what group the parent belongs to.

Same-colored processes denote processes belonging to the same process group. As you can see in the process tree on the right, process group membership is not constrained by parent-child relationships.

You can simulate the process tree on the right using the following Ruby code.

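top = $$
puts "Top process PID: #{$$}"
pid = fork do
  pid = $$
  pid2 = fork do
    pid2 = $$
    pid3 = fork do
      pid3 = $$
      # Give the parent time to change our process group before printing.
      sleep 0.1
      puts "#{top} belongs to process group: #{Process.getpgid(top)}"
      puts "#{pid} belongs to process group: #{Process.getpgid(pid)}"
      puts "#{pid2} belongs to process group: #{Process.getpgid(pid2)}"
      puts "#{pid3} belongs to process group: #{Process.getpgid(pid3)}"
    end

    # We change process group ID of pid3 after the fact!
    # This requires that a process group with ID `top` exists, which is the
    # case when the script is started from an interactive shell.
    Process.setpgid(pid3, top)
    Process.waitpid(pid3)
  end
  Process.waitpid(pid2)
end
Process.waitpid(pid)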


As you can see, you can change the process group membership of any process at any time, provided you have the permissions to do so.

The kill() system call provides a neat little feature: it allows you to send a signal to an entire process group! You can already guess where this is going.

Whenever the Watchdog spawns an agent process, it creates a new process group for it. The HelperAgent and the LoggingAgent are both process group leaders of their own process group. Since the HelperAgent is responsible for spawning application processes, all application processes automatically belong to the HelperAgent’s process group, unless the application processes themselves change this.

Upon shutdown, the Watchdog sends SIGKILL to HelperAgent’s and the LoggingAgent’s process groups. This ensures that everything is killed, including all application processes and even whatever subprocesses they spawn.
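
Here is a small Ruby sketch of the same idea: a child becomes leader of its own process group, spawns a grandchild, and the parent then takes out the whole group with one signal. It only illustrates the mechanism and is not the Watchdog's code; the function name and the pipe used for synchronization are choices made for this example.

```ruby
# Starts a child in its own process group, lets it spawn a grandchild, and
# then kills the whole group with a single signal. Returns the signal number
# that terminated the child and whether every group member has exited.
def kill_process_group_demo
  reader, writer = IO.pipe
  child = fork do
    reader.close
    Process.setpgrp           # become the leader of a new process group
    fork { sleep 60 }         # the grandchild inherits that group
    writer.write("r")         # tell the parent that the group is set up
    sleep 60
  end
  writer.close
  reader.read(1)              # wait until the child signals readiness

  # A signal name prefixed with "-" targets the process group with that ID.
  Process.kill("-KILL", child)
  _, status = Process.waitpid2(child)

  # The pipe only reaches end-of-file once the child and the grandchild,
  # which both hold its write end, are gone.
  all_gone = reader.read.empty?
  reader.close
  { :signal => status.termsig, :all_gone => all_gone }
end
```

Note that the parent never learns the grandchild's PID. It does not need to, and that is exactly what makes process groups attractive here.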


In this article we’ve considered several reasons why processes may freeze and how to debug them. We’ve seen which mechanisms Phusion Passenger provides to combat frozen processes. We’ve introduced the reader to Unix process groups and how they interact with Unix signal handling.

A lot of engineering has gone into Phusion Passenger to ensure that everything works correctly even in the face of failing software components. As you can see, Phusion Passenger uses proven and rock-solid Unix technologies and concepts to implement its advanced features.

Support us by buying Phusion Passenger Enterprise

It has been a little over a month now since we released Phusion Passenger Enterprise, which comes with a wide array of additional features not found in the open source Phusion Passenger. However as we’ve stated before, we remain committed to maintaining and developing the open source version. This is made possible by the support of our customers; your support. If you like Phusion Passenger or if you want to enjoy the premium Enterprise features, please consider buying one or more Phusion Passenger Enterprise licenses. Development of the open source Phusion Passenger is directly funded by Phusion Passenger Enterprise sales.

Let me know your thoughts about this article in the comments section and thank you for your support!

Mail in 2012 from an admin’s perspective

By Luuk Hendriks on September 10th, 2012

There has been a lot of discussion about the current state of e-mail and how we use it. Many blog posts popped up, describing what has changed in the past several decades and how we could adapt to the new way e-mail is used. Different approaches and possible solutions are presented, but they all have one thing in common: they try to solve problems on the side of the end-user, the one reading his or her e-mail. But over those same several decades, a lot has changed for the system administrators too. And while most of the e-mail users don’t know (or even want to know) what happens behind the scenes, admins have sleepless nights to make sure you get your e-mails.

This blog post comes from the fact that I had an unnecessarily hard time setting up SRS in Postfix. If you want to know how to do just that, please find the instructions in the latter part of this post. I’ll start with a short intro on what SPF and SRS are, why we need them, and why you most likely need them too. Lastly a short introduction to DKIM is given, which is another nice way of improving the e-mail system. Everything described here is a technique to improve your and others’ mailing experience by verifying the sender: this helps identify and reduce spam, and also prevents legitimate servers from being wrongfully accused of spamming or other unwanted behaviour. The techniques are not new; in fact they have proven themselves over the last couple of years. Many e-mail service providers check SPF and DKIM, and Google Gmail’s spam filtering mechanisms take the validity into account. They even enforce DKIM validity on eBay and Paypal messages, as these domains are obviously interesting for phishing and abuse alike.

Rocket Science

Is setting up an e-mail server that hard? At its core, no. Setting up an MTA to send a message to another MTA is not anywhere near rocket science. And since e-mail is nothing more than exchanging such messages, we are already halfway, right? But the challenges administrators have to deal with are spam (end-users know that one of course) and all the problems that come with it. Ever got an e-mail from Or got a complaint that sent an e-mail, but that someone doesn’t even exist? Forging of e-mail headers (especially the ‘From’ field) is easy, and as e-mail is a vital part of almost anyone’s daily routine, this has become problematic. So what can we do, from an administrator’s point of view, to keep e-mail usable in 2012?

Identify and verify the mailing servers

As the forging of headers happens outside the control of the admin, methods of fighting it are limited by nature. Imagine a scenario where you control server A, but server B sends a forged e-mail to server C making it look like it came from server A (by forging the ‘From’ header). The message does not even pass your server A, so what can you do?
Only one thing in the forged e-mail is under your control: the domain name. And that is what the Sender Policy Framework (SPF) relies on. By adding special SPF records pointing to your mail servers to the DNS system, a receiving party can check whether the sending server is allowed to send mail from that domain name. Continuing the previous example: hopefully, server C uses SPF to check the authenticity of the e-mail it got from B. As an admin, I published DNS records for server A’s domain containing the IP of server A (so A is allowed to send mail for that domain), but B’s IP is not in that record. Server C checks the SPF record of that domain and notices B is not allowed to use that domain name. A forged e-mail! Of course it is up to C to decide which way to handle the e-mail, be it dropping it, or bouncing it, or whatever they see fit.
Hopefully it has become clear that SPF is something that is not fully under your control. The only thing you can do is provide the correct records in your DNS system, and hope that other mail servers do their SPF checks. To reduce spam towards your servers, you should of course do the SPF checking as well. I won’t go into detail about setting the records, but for the content and format check out this clear overview.
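
To make the check that server C performs a bit more concrete, here is a deliberately simplified Ruby sketch. It only looks at the ip4 and ip6 mechanisms of an SPF record and ignores everything else (a, mx, include, qualifiers), so it is an illustration of the principle and not a complete SPF evaluator. The function name is hypothetical, and the addresses used with it below are reserved documentation addresses.

```ruby
require "ipaddr"

# Returns true if `sender_ip` matches an ip4/ip6 mechanism of the SPF record.
# Simplified on purpose: other mechanisms (a, mx, include, ...) and
# qualifiers are ignored, so this is not a complete SPF evaluator.
def spf_permits?(spf_record, sender_ip)
  ip = IPAddr.new(sender_ip)
  spf_record.split.any? do |term|
    mechanism, value = term.split(":", 2)
    next false unless %w[ip4 ip6].include?(mechanism) && value
    allowed = IPAddr.new(value)
    # Only compare addresses of the same family (IPv4 vs IPv6).
    allowed.family == ip.family && allowed.include?(ip)
  end
end
```

For a record such as "v=spf1 ip4:192.0.2.10 -all", server A at 192.0.2.10 is accepted, while server B at any other address is not.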

So far so good, right? Using SPF you can reduce the incoming spam, and you help other administrators to find out whether e-mails really came from your domain. But there is one problem: forwarding.
At Phusion, some of the guys like to have e-mail forwarded to their Gmail account. The forwarding itself is a breeze: Postfix has a map of virtual addresses pointing to the right Gmail address. But in an SPF scenario, the following happens: someone at an outside domain sends an e-mail to a Phusion address, which is forwarded to a Gmail address. Google checks the SPF records, but the check will fail! The mail originates from the outside domain, so that domain is queried for an SPF record. But the mail is received from our servers, which that record does not list. This is a shortcoming of the SPF paradigm, with a solution called the Sender Rewriting Scheme (SRS). As the name suggests, SRS rewrites the sender (on the envelope), so the next server/MTA in the chain can check the SPF records of the right domain, instead of the domain where the e-mail originated. So in our scenario, the original sender address is rewritten on the envelope to an address at our domain, so when the mail is received by Gmail, Google’s server can check the SPF records for our domain (as opposed to the records of the outside domain). Now, the check will pass.
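
As a rough illustration of what such a rewritten envelope sender looks like, here is a Ruby sketch that produces an address in the SRS0 form (hash, timestamp, original domain, original local part, then the forwarder's domain). The hash length and the exact hash input are assumptions made for this example and may differ from what pfix-srsd or libsrs2 produce; the function name, addresses and secret are made up.

```ruby
require "openssl"

SRS_BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

# Rewrites an envelope sender into SRS0 form:
#   SRS0=<hash>=<timestamp>=<original domain>=<original local part>@<forwarder>
# Simplified sketch: the hash length and exact hash input are assumptions
# and may differ from what pfix-srsd/libsrs2 produce.
def srs_forward(sender, forwarder_domain, secret, now = Time.now)
  local, domain = sender.split("@", 2)
  days = (now.to_i / 86_400) % 1024   # coarse timestamp, used for expiry
  timestamp = SRS_BASE32_ALPHABET[days >> 5] + SRS_BASE32_ALPHABET[days & 31]
  data = [timestamp, domain, local].join.downcase
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), secret, data)
  hash = [digest].pack("m0")[0, 4]    # first 4 characters of the Base64 HMAC
  "SRS0=#{hash}=#{timestamp}=#{domain}=#{local}@#{forwarder_domain}"
end
```

The hash is what the secrets file is for: it lets the forwarding server recognize bounces for addresses it rewrote itself, and reject forged ones.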

As only the ‘Envelope’ is rewritten, the receiving end-user will see the normal ‘From’ headers and therefore the normal name of the sender. This is almost completely invisible to end-users, but some clients can show whether the e-mail comes from a valid source. For example, Gmail adds ‘via’ to the sender’s address.
Ever seen that one? That is SPF, maybe complemented by SRS. Or maybe you’re familiar with the Gmail popup shown above, telling you the SPF check passed (Mailed-by) but also that the DKIM signature (more on that later) was valid (Signed-by).

What is the hassle then? Maybe you already knew all of this, and you might think it is not that big of a deal. Maybe your MTA does this out of the box (please tell me then), but at Phusion we use Postfix and it cost me some sweat to get SRS up and running. To me, it seemed like almost nobody in the world uses SRS. If I’m new to a technology or tool, the first thing I do is engage the Google-fu to find out what other people use. What is hot on Github, what do people on StackOverflow suggest, which tools and software are still actively supported, that kind of thing. The most recent stuff on SRS was from 2003 or 2004. Of course, mail is something well established, but was there really no one interested in SRS for the last 8 years? Or is Postfix the problem?
Then we found pfix-tools (Github). It seems people are interested in SRS after all, or at least the developers behind this tool suite. As their readme states: “pfixtools is a suite of tools aiming at complementing postfix, to make it even more customizable, while keeping really high performance levels.” It contains pfix-srsd, a daemon to do the SRS rewriting for Postfix. Sweet! For those of you using Postfix, I will try to describe briefly but clearly how to set up pfix-srsd.

Installation details: SRS for Postfix on Debian

FYI, this was done on a Debian 6 system.

Download and compile pfix-srsd

tar -xzvf pfixtools-0.8
aptitude install libev3 libev-dev libsrs2-0 libsrs2-dev libpcre3-dev
git clone
cd pfixtools
git submodule init
git submodule update

cd common
cd ../pfix-srsd

Finally, move the resulting binary pfix-srsd to e.g. /usr/local/bin/ (the following steps will use this path, so be careful if you change it)

Create config and secrets files


In /etc/postfix/pfix-srs.secrets, enter lots of random stuff, preferably from some kind of random generator

Fix the permissions:

chown postfix:postfix /etc/postfix/pfix-srs.secrets
chmod 400 /etc/postfix/pfix-srs.secrets

In /etc/postfix/ you can put addresses that should not be SRS’ed.
Always compile this file after you’ve changed it with postmap /etc/postfix/


NB: We use daemontools, but an example sysinit script is available here.



if [ ! -f "$PFIXSRSD_CONFIG" ]; then
    echo "Error reading config file, aborting.."
    exit 1
fi

exec setuidgid postfix $DAEMON -f $OPTIONS $DOMAIN $SECRETS >> /var/log/pfix-srsd.log 2>&1

Don’t forget to make it executable by chmod +x run

Modify Postfix config

add to /etc/postfix/

# SRS Remapping
recipient_canonical_maps = hash:/etc/postfix/, tcp:
recipient_canonical_classes = envelope_recipient
sender_canonical_maps = hash:/etc/postfix/, tcp:
sender_canonical_classes = envelope_sender

Reload and test

Afterwards, reload the postfix service (and maybe your firewall):

invoke-rc.d postfix reload

To test, you can send a test message to a Gmail account and check the headers. It should contain something like the following:

Received-SPF: pass ( domain of designates as permitted sender);

Or use some of the available online testing tools, like this one. That might come in handy when you also get your DKIM going, which you should!

Thanks and credits to Christoph Fischer who wrote a large part of the pfix-tools installation and configuration in German on his website.

What else can we do: an intro to DKIM

Another way to validate messages and their senders is DKIM, short for DomainKeys Identified Mail. It uses digital signatures based on public-key cryptography. The public key is again available via DNS; the private key is used by the sender to create a signature, which is added in the DKIM-Signature: header. This header furthermore contains a specification of which fields are used to generate the signature, the signing algorithm, and of course the domain name. If you have a grasp of how public key cryptography works, you probably see the point already. But let’s see how things are in the real world.
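
If public-key signatures are new to you, the following Ruby sketch shows the principle using the standard openssl library: sign with the private key, verify with the public key. It leaves out everything DKIM-specific (header selection, canonicalization, the body hash), so treat it as an illustration of the idea only. The function names are made up for this example.

```ruby
require "openssl"

# The sender signs the data with its private key...
def sign_message(private_key, signed_data)
  private_key.sign(OpenSSL::Digest.new("SHA256"), signed_data)
end

# ...and the receiver checks the signature with the matching public key,
# which in the case of DKIM is the one published in DNS.
def signature_valid?(public_key, signature, signed_data)
  public_key.verify(OpenSSL::Digest.new("SHA256"), signature, signed_data)
end
```

Any change to the signed data after signing makes the verification fail, which is what lets a receiver detect forged or altered mail.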

Google uses DKIM on their Gmail servers, not very surprisingly. Search for an e-mail from a Gmail sender in your mailbox, and look at the headers. You want something like this:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20120113;

What do we see? The (most commonly used) rsa-sha256 algorithm (a), the domain (d), the headers used to construct the signature, the body hash (bh), and the signature itself (b). But very important is the selector (s), which is used in the DNS query. The following command (make sure dig is available on your system) shows the DKIM public key that we need to verify whether the signature is valid:

luuk@polecat> dig +short TXT
"k=rsa\; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA1Kd87/UeJjenpabgbFwh+
ZB5DekAo5wMmk4wimDO+U8QzI3SD0" "7y2+07wlNWwIt8svnxgdxGkVbbhzY8i+RQ9DpSVpPbF7

Notice that this is a nice way to check if your DNS settings are as you expect them to be. To fetch the record, the selector (s) is used in conjunction with the domain (d): both were in the DKIM-Signature header from the received e-mail. The ._domainkey. in between is defined by the standard (RFC 6376) and thus should always be there.
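
The lookup described above can be sketched in Ruby as follows. The helper names are made up for this example, and example.com in the usage note stands in for a real signing domain. The last function performs an actual DNS query through the standard resolv library and therefore needs network access.

```ruby
require "resolv"

# Parses the "tag=value" pairs of a DKIM-Signature header value into a Hash.
def parse_dkim_tags(header_value)
  header_value.split(";").each_with_object({}) do |pair, tags|
    # Split on the first "=" only: Base64 values may contain "=" themselves.
    name, value = pair.strip.split("=", 2)
    tags[name] = value.to_s.strip unless name.nil? || name.empty?
  end
end

# Builds the DNS name that holds the public key: <s>._domainkey.<d>
def dkim_dns_name(tags)
  "#{tags.fetch('s')}._domainkey.#{tags.fetch('d')}"
end

# Fetches the TXT record(s) for that name; needs network access.
def fetch_dkim_record(dns_name)
  Resolv::DNS.open do |dns|
    dns.getresources(dns_name, Resolv::DNS::Resource::IN::TXT).map { |txt| txt.strings.join }
  end
end
```

For a header containing d=example.com and s=20120113, dkim_dns_name returns 20120113._domainkey.example.com, which is the name you would pass to dig.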

DKIM in practice

Not too hard in theory, is it? As an administrator you should take care of only two things: publish the TXT record in your DNS, and add the signatures to outgoing mails (and of course check the signatures on incoming mails, but that is most likely taken care of by the same daemon).

If you are interested in an exact and detailed way of setting up DKIM signing (again, for Debian/Postfix), please check out this guide. In a nutshell, you need the dkim-filter package that provides, amongst others, dkim-genkey and the dkim-filter daemon listening at a socket on your system. In the Postfix config a milter is specified, so all mail is proxied through the DKIM signing daemon. This way, signatures on incoming mails are verified, and signatures are added before outgoing messages leave your system. The rest is up to the receiving (or intermediate) parties, who check your signatures using your published DNS records. Easy as pie, effective like a horse.


Phusion Passenger & running multiple Ruby versions

By Hongli Lai on September 21st, 2010

UPDATE February 27 2013: this article is now obsolete. Phusion Passenger supports multiple Ruby interpreters as of version 4.0.0. The PassengerRuby config option has been made a per-virtual host option, so you can customize your Ruby interpreter on a per-application basis.

One of the questions we’ve been getting a lot lately is whether it’s possible to run multiple Ruby versions with Phusion Passenger, e.g. have app A and B run on Ruby 1.8.7 while having app C run on Ruby 1.9.2. In previous versions of Phusion Passenger there were ways to get around that, e.g. by mixing in Mongrels. As of Phusion Passenger 3 you can run all components as Phusion Passenger.

The setup that we currently recommend is to combine Phusion Passenger for Apache or Phusion Passenger for Nginx with Phusion Passenger Standalone. First identify the Ruby version that you use most. Then set up Phusion Passenger for Apache or Phusion Passenger for Nginx to use that Ruby version. All applications that are to use a different Ruby version can be served separately through Phusion Passenger Standalone and hooked into the main web server via a reverse proxy configuration.


Suppose that you have four websites:

  •, to run on Ruby 1.8.7.
  •, to run on Ruby 1.8.7.
  •, to run on Ruby 1.9.1.
  •, to run on Ruby 1.9.2.

And suppose that you’re using RVM to manage these Rubies.

Setting up and (Ruby 1.8.7)

The Ruby version that you use most is Ruby 1.8.7, so you set up Apache or Nginx to use Ruby 1.8.7 and to serve those two sites.

rvm use 1.8.7
gem install passenger --pre

# Then one of:
# Partial Apache configuration
PassengerRuby /home/someuser/.rvm/wrappers/ruby-1.8.7/ruby

<VirtualHost *:80>
    DocumentRoot /webapps/
</VirtualHost>

<VirtualHost *:80>
    DocumentRoot /webapps/
</VirtualHost>

# Partial Nginx configuration
passenger_ruby /home/someuser/.rvm/wrappers/ruby-1.8.7/ruby;

server {
    listen 80;
    root /webapps/;
    passenger_enabled on;
}

server {
    listen 80;
    root /webapps/;
    passenger_enabled on;
}

Both sites have now been deployed on Phusion Passenger for Apache or Phusion Passenger for Nginx, and are running on Ruby 1.8.7.

Setting up (Ruby 1.9.1)

The next step is to start the Ruby 1.9.1 site in Phusion Passenger Standalone using Ruby 1.9.1. Since port 80 is already used by Apache or Nginx, we start Phusion Passenger Standalone on a different port.

rvm use 1.9.1
gem install passenger --pre
cd /webapps/
passenger start -a -p 3000 -d

The site is now running on localhost port 3000 as a background daemon. Next, connect it to Apache or Nginx via a reverse proxy.

# Partial Apache configuration
<VirtualHost *:80>
    DocumentRoot /webapps/
    PassengerEnabled off
    ProxyPass /
    ProxyPassReverse /
</VirtualHost>

# Partial Nginx configuration
server {
    listen 80;
    root /webapps/;
    location / {
        proxy_set_header Host $host;
    }
}

Setting up (Ruby 1.9.2)

We do the same thing for the Ruby 1.9.2 site. Port 3000 is already in use, so we assign it port 3001.

rvm use 1.9.2
gem install passenger --pre
cd /webapps/
passenger start -a -p 3001 -d

Then we hook it up to the web server via reverse proxying.

# Partial Apache configuration
<VirtualHost *:80>
    DocumentRoot /webapps/
    PassengerEnabled off
    ProxyPass /
    ProxyPassReverse /
</VirtualHost>

# Partial Nginx configuration
server {
    listen 80;
    root /webapps/;
    location / {
        proxy_set_header Host $host;
    }
}

Performance tip

Phusion Passenger Standalone also supports listening on a Unix domain socket instead of a TCP socket. Unix domain sockets are significantly faster than TCP sockets.

Only Nginx supports reverse proxying to Unix domain sockets; Apache does not support this.

In order to make Phusion Passenger Standalone listen on a Unix domain socket, you need to run it with Nginx 0.8.21 or higher. In fact we contributed support for Unix domain sockets to Nginx specifically for this feature!

Start Phusion Passenger Standalone like this:

passenger start --socket /tmp/ -d --nginx-version 0.8.50

The --socket option tells Phusion Passenger to bind to the given Unix domain socket. The --nginx-version option tells Phusion Passenger Standalone to use Nginx 0.8; 0.7 is the default.

Next you must set up an Nginx upstream block with the Unix domain socket as the only entry. Then set up Nginx to reverse proxy to the created upstream block.

upstream fries_upstream {
    server unix:/tmp/;
}

server {
    listen 80;
    root /webapps/;
    location / {
        proxy_pass http://fries_upstream;
        proxy_set_header Host $host;
    }
}

It should be noted that Phusion Passenger for Apache and Phusion Passenger for Nginx already use Unix domain sockets internally for optimal performance. In fact we’ve done this since version 1.0. We plan on elaborating more about our internal technologies in a future blog post.
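
If you have never worked with Unix domain sockets directly, the following Ruby sketch shows how they differ from TCP sockets: the server binds to a filesystem path instead of an address and port, and the client connects to that same path. It is a toy example and unrelated to how Phusion Passenger uses them internally; the function name and socket filename are made up.

```ruby
require "socket"
require "tmpdir"

# Sends `message` to a tiny one-shot echo server over a Unix domain socket
# and returns the reply. The socket lives at a filesystem path, not a port.
def unix_socket_echo(message)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.sock")
    server = UNIXServer.new(path)          # bind and listen on the path
    responder = Thread.new do
      connection = server.accept
      connection.write(connection.gets)    # echo one line back
      connection.close
    end
    client = UNIXSocket.new(path)          # connect by path
    client.puts(message)
    reply = client.gets
    client.close
    responder.join
    server.close
    reply.to_s.chomp
  end
end
```

Because no TCP/IP stack is involved, the kernel can pass the data between the two processes with less overhead.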


Those of you who are familiar with Mongrel and Thin will see the similarity. Indeed, Phusion Passenger Standalone was designed to be used in a reverse proxy environment such as the one demonstrated in this article. Unlike Mongrel and Thin clusters, however, you only need a single Phusion Passenger Standalone instance per web application and thus only a single address to proxy to. Phusion Passenger Standalone will take care of starting and stopping application processes for you and will make sure processes are restarted when they crash.