Archive for the ‘Systems’ Category.

The S3 Cookbook

Did you know that you can enable access logging on S3? Did you know that you can add arbitrary metadata to objects in S3? Did you know that you can serve compressed content from S3?

The S3 Cookbook, an e-book written by Scott Patten, has easy-to-follow recipes to do those and about 60 other things (including one that I contributed for backing up a MySQL database to S3). In addition to the recipes, it also has chapters on S3’s architecture, authenticating S3 requests, and an overview of the S3 API.

You can checkout the full table of contents and download a sample chapter.

It’s published on Sopobo, a new platform for authors to self-publish technical books (that also happens to be created by Scott!). Sopobo includes tools for readers to interact with each other and with the author. If I were writing a book that I planned to shop around to a few publishers I’d seriously consider putting it on there first to get some feedback (and make a few bucks) before it got picked up.

Backing up your MySQL database to S3

Here’s a simple recipe, with complete code available on Github (written in Ruby), to automatically back up your MySQL database to Amazon’s S3 storage service, with regular incremental backup.

This article appears in The S3 Cookbook, an e-book written by Scott Patten. It is is based on the automatic MySQL backup in EC2 on Rails.

Solution

You might expect that you could simply upload the MySQL database files to S3. That could work if all your tables were MyISAM tables (assuming you did LOCK TABLES and FLUSH TABLES to make sure the database files were in a consistent state), but it won’t work for InnoDB tables. A more general approach is to use the mysqldump tool to back up the full contents of the database and then use MySQL’s binary log for incremental backups.

The binary log contains all changes to the database that are made since the last full backup, so to restore the database you first restore the full backup (the output from mysqldump) and then apply the changes from the binary log.

Here are Ruby scripts for doing the full backup, incremental backup, and restore.

The full backup script uses mysqldump to do the initial full backup and uploads it’s output to S3. It assumes the bucket is empty.

full_backup.rb:

#!/usr/bin/env ruby

require "common"

begin
  FileUtils.mkdir_p @temp_dir

  # assumes the bucket's empty
  dump_file = "#{@temp_dir}/dump.sql.gz"

  cmd = "mysqldump --quick --single-transaction --create-options -u#{@mysql_user}  --flush-logs --master-data=2 --delete-master-logs"
  cmd += " -p'#{@mysql_password}'" unless @mysql_password.nil?
  cmd += " #{@mysql_database} | gzip > #{dump_file}"
  run(cmd)

  AWS::S3::S3Object.store(File.basename(dump_file), open(dump_file), @s3_bucket)
ensure
  FileUtils.rm_rf(@temp_dir)
end

Once the full backup has been done, the following script can be run frequently (perhaps every 5 or 10 minutes) to rotate the binary log and upload it to S3. It must be run by a user that has read access to the MySQL binary log (see the Discussion section for details on configuring the MySQL binary log path).

incremental_backup.rb:

#!/usr/bin/env ruby

require "common"

begin
  FileUtils.mkdir_p @temp_dir
  execute_sql "flush logs"
  logs = Dir.glob("#{@mysql_bin_log_dir}/mysql-bin.[0-9]*").sort
  logs_to_archive = logs[0..-2] # all logs except the last
  logs_to_archive.each do |log|
    # The following executes once for each filename in logs_to_archive
    AWS::S3::S3Object.store(File.basename(log), open(log), @s3_bucket)
  end
  execute_sql "purge master logs to '#{File.basename(logs[-1])}'"
ensure
  FileUtils.rm_rf(@temp_dir)
end

The following script restores the full backup (mysqldump output) and the subsequent binary log files. It assumes the database exists and is empty.

restore.rb:

#!/usr/bin/env ruby

require "common"

# Retrieve a single file from S3
def retrieve_file(file)
  key = File.basename(file)
  AWS::S3::S3Object.find(key, @s3_bucket)

  open(file, 'w') do |f|
    AWS::S3::S3Object.stream(key, @s3_bucket) do |chunk|
      f.write chunk
    end
  end
end

# List the files matching filename_prefix in the S3 bucket
def list_keys(filename_prefix)
  AWS::S3::Bucket.objects(@s3_bucket, :prefix => filename_prefix).collect{|obj| obj.key}
end

# Retrieve the files matching filename_prefix in the S3 bucket
def retrieve_files(filename_prefix, local_dir)
  list_keys(filename_prefix).each do |k|
    file = "#{local_dir}/#{File.basename(k)}"
    retrieve_file(file)
  end
end

begin
  FileUtils.mkdir_p @temp_dir

  # download the dump file from S3
  file = "#{@temp_dir}/dump.sql.gz"
  retrieve_file(file)

  # restore the dump file
  cmd = "gunzip -c #{file} | mysql -u#{@mysql_user} "
  cmd += " -p'#{@mysql_password}' " unless @mysql_password.nil?
  cmd += " #{@mysql_database}"
  run cmd

  # download the binary log files
  retrieve_files("mysql-bin.", @temp_dir)
  logs = Dir.glob("#{@temp_dir}/mysql-bin.[0-9]*").sort

  # restore the binary log files
  logs.each do |log|
    # The following will be executed for each binary log file
    cmd = "mysqlbinlog --database=#{@mysql_database} #{log} | mysql -u#{@mysql_user} "
    cmd += " -p'#{@mysql_password}' " unless @mysql_password.nil?
    run cmd
  end
ensure
  FileUtils.rm_rf(@temp_dir)
end

The previous three scripts (full_backup.rb, incremental_backup.rb, and restore.rb) all include config.rb which contains all user-specific configuration and common.rb which defines some common functions:

config.rb:

@mysql_database = "your-mysql-database"
@mysql_user = "root"
@mysql_password = "password"
@s3_bucket = "your-s3-bucket-name"
@aws_access_key_id = "your-aws-access-key"
@aws_secret_access_key = "your-aws-secret-access-key-id"
@mysql_bin_log_dir = "/var/lib/mysql/binlog"
@temp_dir = "/tmp/mysql-backup"

common.rb:

require "config"
require "rubygems"
require "aws/s3"
require "fileutils"

def run(command)
  result = system(command)
  raise("error, process exited with status #{$?.exitstatus}") unless result
end

def execute_sql(sql)
  cmd = %{mysql -u#{@mysql_user} -e "#{sql}"}
  cmd += " -p'#{@mysql_password}' " unless @mysql_password.nil?
  run cmd
end

AWS::S3::Base.establish_connection!(:access_key_id => @aws_access_key_id, :secret_access_key => @aws_secret_access_key, :use_ssl => true)

# It doesn't hurt to try to create a bucket that already exists
AWS::S3::Bucket.create(@s3_bucket)

Discussion

To enable binary logging make sure that the MySQL config file (my.cnf) has the following line in it:

log_bin = /var/db/mysql/binlog/mysql-bin

The path (/var/db/mysql/binlog) can be any directory that MySQL can write to, but it needs to match the value of @mysql_bin_log_dir in config.rb.

Note for EC2 users: The root volume (”/”) has limited space, it’s a good idea to use /mnt for your MySQL data files and logs.

The MySQL user needs to have the “RELOAD” and the “SUPER” privileges, these can be granted with the following SQL commands (which need to be executed as the MySQL root user):

GRANT RELOAD ON *.* TO 'user_name'@'%' IDENTIFIED BY 'password';
GRANT SUPER ON *.* TO 'user_name'@'%' IDENTIFIED BY 'password';

(Replace user_name with the value of @mysql_user in config.rb).

You’ll probably want to perform the full backup on a regular schedule, and the incremental backup on a more frequent schedule, but the relative frequency of each will depend on how large your database is, how frequently it’s updated, and how important it is to be able to restore quickly.  This is because for a large database mysqldump can be slow and can increase the system load noticeably, while rotating the binary log is quick and inexpensive to perform. But if your database changes normally contain many updates (as opposed to just inserts) it can be slower to restore from the binary logs.

To have the backups run automatically you could add something like the following to your crontab file, adjusting the times as necessary:

crontab:

# Incremental backup every 10 minutes
*/10 * * * *  root  /usr/local/bin/incremental_backup.rb
# Full backup every day at 05:01
1 5 * * *  root  /usr/local/bin/full_backup.rb

Before this can work however, two small details must be taken care of, which have been left as an exercise for the reader:

  1. When the full backup runs it should delete any binary log files that might already exist in the bucket. Otherwise the restore will try to restore them even though they’re older than the full backup.
  2. The execution of the scripts should not overlap. If the full backup hasn’t finished before the incremental starts (or vice versa) the backup will be in an inconsistent state.

A rock-solid setup for sending SMTP mail from your EC2 web server

(None of this is EC2-centric, but it’s particularly needed on EC2.)

A frequent topic of discussion on the EC2 forums is how to send email reliably, efficiently, and especially without it being marked as spam. I found that even with a valid SPF record most mail sent from an EC2 instance was marked as spam or silently discarded.

This is probably partly because of the lack of matching reverse DNS records. But spam filters can be a bit arbitrary and the easiest way is to relay outgoing mail through a good smtp provider. (I don’t recommend relaying outbound mail through Google Apps, they supposedly have a 500 messages/day limit according to many people on their forums, although I couldn’t find that published anywhere. UPDATE: The info is here, thanks John Ward.)

I have tried a couple of SMTP providers, and I recommend AuthSMTP. They are reliable, have good service, and our mail that’s delivered through them almost never gets marked as spam. Also, they have monthly quotas rather than daily, so you have a chance to increase it before you hit the limit.

Rather than deliver directly to the AuthSMTP mail server from your web app it’s a good idea to deliver to a local queueing mail server, which will forward via the AuthSMTP gateway. Your web app will deliver mail to localhost (or perhaps a dedicated instance if you prefer), port 25.

This has several advantages:

  1. Your web server can finish the request more quickly.
  2. There’s less chance that the mail server will be unavailable. At least the mail will be queued locally until the remote server becomes available again. AuthSMTP has proven to be quite reliable, but it has been unavailable on a couple of occasions.
  3. AuthSMTP limits the number of concurrent connections that you can make. You can easily configure your local mail server to limit the number of outgoing connections to the gateway.

Configuration

I recommend using Postfix, it’s fast, reliable and most importantly, easy to configure. Your Linux distribution will definitely have a Postfix package available (it comes pre-installed on EC2 on Rails). On Debian or Ubuntu install with:

sudo aptitude install postfix

Here’s the config file, /etc/postfix/main.cf:

myhostname = www.YOURDOMAIN.com
mydomain = YOURDOMAIN.com
myorigin = $mydomain

smtpd_banner = $myhostname ESMTP $mail_name
biff = no
append_dot_mydomain = no

alias_maps = hash:/etc/aliases
alias_database = hash:/etc/aliases
mydestination = localdomain, localhost, localhost.localdomain, localhost
mynetworks = 127.0.0.0/8
mailbox_size_limit = 0
recipient_delimiter = +

# SECURITY NOTE: Listening on all interfaces. Make sure your firewall is
# configured correctly
inet_interfaces = all

relayhost = [mail.authsmtp.com]
smtp_connection_cache_destinations = mail.authsmtp.com
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = static:YOUR_AUTHSMPT_USER_ID:YOUR_AUTHSMTP_PW
smtp_sasl_security_options = noanonymous

default_destination_concurrency_limit = 4

soft_bounce = yes

How simple is that?! Have you ever seen a sendmail config file?

soft_bounce is important because it means that postfix will queue the messages if they’re bounced by the remote gateway for any reason (this is only if it’s bounced by the gateway, not if it’s bounced by the destination server). This would usually be caused by some configuration problem like an authentication failure. If the message is bounced by the eventual destination server (e.g. the mailbox doesn’t exist or is full), or if the destination server can’t be contacted, your local server won’t know about it because the message has already been accepted by the gateway. (It’s probably a good idea to keep track of bounced messages returned by the eventual destination server, see “Don’t spoof the From field” below.)

default_destination_concurrency_limit is so you stay within AuthSMTP’s concurrent connection limit. If you have Postfix running on multiple instances you’ll need to adjust this accordingly.

To see mail that’s stuck in the queue:

mailq

Postfix will automatically try to resend it, but you can force it to be sent immediately using:

sudo postqueue -f

Monitoring

Of course you need to know if anything goes wrong with the mail delivery and it won’t be in your web app’s log. I use scripts in /etc/cron.hourly to check logs and mail me the output if there are errors. But when it comes you mail delivery failure you might have a bit of a chicken-and-egg problem: you can’t use postfix to send the mail if postfix is having problems. Here’s a simple ruby script to send emergency mail via a different mail server. It’s configured to use Google Apps (you’ll need to create a new account to send the mail from), if you don’t use Google Apps you can easily change this to use a different mail server.

Save this as /usr/local/bin/emergency_mail_sender:

#!/usr/bin/env ruby

# This is a simple script to send mail via an alternate server when there are
# errors with the normal queueing mail sender
# The subject is the first command-line arg and the body is received on stdin

#################################

from_address             = "admin_mail_sender@YOURDOMAIN.com"
to_address               = "admin@YOURDOMAIN.com"
smtp_server              = "smtp.gmail.com"
smtp_port                = 587
smtp_mail_from_domain    = "YOURDOMAIN.com"
smtp_account_name        = "admin_mail_sender@YOURDOMAIN.com"
smtp_password            = "YOUR_PASSWORD"
smtp_authentication_type = :plain
debug                    = false

#################################

subject = ARGV[0]
body = $stdin.read

require 'rubygems'
require 'net/smtp'
require 'tlsmail'

exit if body.nil? || body == ""

msgstr = <<END_OF_MESSAGE
Subject: #{subject}

#{body}
END_OF_MESSAGE

Net::SMTP.enable_tls(OpenSSL::SSL::VERIFY_NONE)
  smtp = Net::SMTP.new(smtp_server, smtp_port)
  smtp.set_debug_output $stderr if debug
  smtp.start(smtp_mail_from_domain, smtp_account_name, smtp_password, smtp_authentication_type) do |s|
    s.send_message msgstr, from_address, to_address
end

Here’s a script that can be run by cron every hour to check for mail delivery problems, it uses the emergency_mail_sender script to notify you of the problem. It works on Ubuntu (but it needs the logtail package installed), it might not work on other systems. Save this as /etc/cron.hourly/check_mail_logs

#!/bin/sh
hostname=`hostname -s`
mailer=/usr/local/bin/emergency_mail_sender
/usr/sbin/logtail -f/var/log/mail.warn | $mailer "$hostname: mail warnings"
/usr/sbin/logtail -f/var/log/mail.err | $mailer "$hostname: mail errors"
/usr/sbin/logtail -f/var/log/syslog | grep 'status=' | egrep -v 'status=sent' | $mailer "$hostname: undelivered mail"

SPF

Here’s your SPF record:

v=spf1 include:authsmtp.com include:aspmx.googlemail.com ~all

If you’re not using Google Apps to send mail for your domain remove include:aspmx.googlemail.com. If you want to create your own SPF record there’s a good SPF record generator at spfwizard.com.

Don’t spoof the From field

You should only send mail from somebody@yourdomain.com. If you try to send mail from somebody@pauldowman.com, for example, the receiver will see that pauldowman.com has an SPF record, and that it doesn’t authorize your mail server. Then into the spam folder you go.

To get around this you can send from something like noreply@yourdomain.com, and set the Reply-To header to somebody@pauldowman.com. You can even set the name in the from field, for example: “Paul Dowman via yoursite” <noreply@pauldowman.com>. The Reply-To header will make sure that most people’s replies go to the correct address, but a few will inevitably end up at noreply@yourdomain.com so it’s probably a good idea to set up an autoresponder at that address, or at least make sure the message bounces so the user eventually realizes the mistake.

EC2 on Rails now with multiple instance support, Ubuntu 7.10, 64-bit version, Capistrano tasks

I’ve been working hard on EC2 on Rails, version 0.9.5 is now available. Since my last post here there have been some major changes:

Capistrano tasks

There is now a rubygem available that provides Capistrano tasks to manage the instance. There are tasks to set the server’s timezone, install packages and rubygems, backup, restore, create and delete the database, set the MySQL root password, and more. To use these in your Rails project type:

> sudo gem install ec2onrails

Put Capfile in the root of your rails folder, and put deploy.rb in the config folder.

Then, from the root of your project type:

> cap ec2onrails:setup

This automatically sets your server’s timezone, installs any custom rubygems and Ubuntu packages, and creates your database for you. You can now deploy your rails app as you normally would:

> cap deploy:migrations

Another useful task for testing is:

> cap ec2onrails:restore_db_and_deploy

This recreates the database, restores data from an S3 bucket (specified in your deploy.rb), and deploys the app. I use this to prepare a staging server with the current production data and current production version of the app. After running this task I have an exact copy of my production server. I then deploy the latest version to this server before deploying it to production. This is a good way to be really sure your production deployment won’t fail (especially your migrations).

To see a list of all available Capistrano tasks:

> cap -T

New Ubuntu version

It’s now built with Ubuntu 7.10 “Gutsy”.

Support for new instance types

There are both i386 and x86_64 versions available to support the new EC2 instance types. So you can now use large and extra-large instances.

Multiple instances

The earlier versions only worked if your rails app was running on a single server. That was lame! Now you can have multiple instances using any combination of these roles: web server, app server, primary database. I’m working on adding a MySQL slave role and eventually a Memcache role.

For full instructions and details see the project web site.

EC2 server image update

Just a couple of announcements about my Ruby on Rails server image for EC2:

New version

There is a new server image with some minor changes and bug fixes. For details see the change log. I’ll keep the old image around for a while.

Mailing lists

I have created two Google groups, one for announcements only (new versions, etc.), and one for general discussion.

Ongoing work

I’m working on a build script so the image can be built from source. Then I will provide access to a subversion repository with the source. Some people have expressed an interest in contributing, which I’m really happy about! If you want to get involved, please join the mailing list!