Phusion white papers Phusion overview

Phusion Blog

About the recent oss-binaries.phusionpassenger.com outages

By Hongli Lai on January 14th, 2014

Update January 15 2013: This issue has been fixed in version 4.0.35.

As some of you might have noticed, there were some problems recently with the server oss-binaries.phusionpassenger.com, where we host our APT repository and precompiled binaries for Phusion Passenger. Although it was originally meant to be a simple file server meant for speeding up installation (by avoiding the need to compile Phusion Passenger), it has grown a lot in importance in the past few Phusion Passenger releases, so that any down time causes major problems for many users:

  • Our APT repository has grown more popular than we thought.
  • Many Heroku users are unable to start new dynos as long as the server is down. The Heroku dyno environment does not provide the necessary compiler toolchain, nor the hardware resources, to compile Phusion Passenger. Which is why when run on Heroku, Phusion Passenger downloads binaries from our server.

The server had first gone down on Sunday and was fixed later that day. Unfortunately it had gone down again on Tuesday morning, which we fixed soon after.

We sincerely apologize for this problem. But of course, apologies are not going to cut it. Since the first outage on Sunday we realized just how important this — originally minor — server became. Since Sunday we’ve begun work to solve this issue permanently. It’s clear that relying on a single server is a mistake, so we’re taking the following actions:

  • We’re adjusting the download timeouts in Phusion Passenger so that server problems don’t freeze it indefinitely. This allows Phusion Passenger to detect server problems quicker, and to fall back to compilation, without triggering any timeouts that may abort Phusion Passenger entirely. This work has been implemented yesterday.
  • Instead of trying to download the native_support binary, Phusion Passenger should try to compile it first, because compiling native_support takes less than 1 second. If the correct compiler toolchain is
    installed on the server then it will avoid using the network entirely, so that it’s unaffected by any server outages of ours. This has also been implemented yesterday. (The rest of Phusion Passenger takes longer to compile so we can’t apply the same strategy there.)
  • For Heroku users: having the binaries downloaded at Heroku deploy time, not at dyno boot time, so that Heroku users are less susceptible to download problems. This has been implemented yesterday.
  • Reverting any server changes that we’ve made recently to oss-binaries.phusionpassenger.com, in the hope that it would increase the server’s uptime. The true reason for the downtime is still under investigation, but we’re giving the other items in this list more priority because they have more potential to fix the problem permanently. This has been implemented today.
  • Setting up an Amazon S3 mirror for high availability. If the main server is down, Phusion Passenger should automatically download from the mirror instead. We’re currently working on this. This has been implemented.

The goal is to finish all these items this week and to release a new version that includes these fixes. We’re working around the clock on this.

Workarounds for now

Users can apply the following workaround for now in order to prevent Phusion Passenger from freezing during downloading of binaries:

Edit /etc/hosts and add “127.0.0.1 oss-binaries.phusionpassenger.com”

Phusion Passenger will automatically fall back to compiling if it can’t download binaries.

Unfortunately, this workaround will not be useful for users who rely on our APT repository, or Heroku users. We’re working on a true fix as quickly as we can.