Migrating Rails with a Large Codebase


presented by

Greg Hurrell and Adam Derewecki

Causes stats

  • Facebook Platform launch partner
  • Platform for collective action with 180m+ users
  • 44k+ commits, first commit December 2006
  • 1.2k+ source code files (down from 1.4k+)
  • 93k+ LOC
  • 3.4k+ example spec suite

Upgrade Roadmap

  • Rails 2.1.0 (May 2008)
  • Rails 2.3.14
  • Rails 3.0.11
  • Rails 3.2.3
  • Rails 3.2.6, 3.2.7, etc.

Plan

  • Get spec suite passing
  • Manually QA in development environment
  • Test on staging
  • Gradual roll-out to production
  • Long-lived branches = pain, so be expedient

Anticipated Pain Points

  • View auditing (due to new escaping behavior)
  • Gem compatibility
  • Asset pipeline
  • Routes (new DSL, and problematic for a Facebook Canvas app with "asymmetric" routes)

ree-1.8.7 to 1.9.3

  • Range#include? => Range#cover? to test Time range
  • File.open needs :binmode => true for binary files (like images)
  • Gem version upgrades
  • Range#step will not iterate over ranges of Time
  • Strings are not Enumerables, must call String#each_line to enumerate
  • Enumerable#map with no block returns an Enumerator instead of an Array
  • Date.today no longer calls Time.now, which broke our Time.warp
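
A few of the changes above can be sketched in plain Ruby. This is a minimal illustration only; the variable names and sample values are made up for the example, and 'rb' mode is used here as one common way to request a binary read.

```ruby
require 'tempfile'

# Range#cover? only compares against the endpoints, so it works for Time.
start  = Time.utc(2012, 4, 18)
finish = Time.utc(2012, 4, 19)
in_window = (start..finish).cover?(Time.utc(2012, 4, 18, 12))

# Range#step will not walk a Time range, so step by hand (6 hour increments).
hours = []
t = start
while t <= finish
  hours << t
  t += 6 * 60 * 60
end

# Strings are not Enumerable; ask for lines explicitly.
lines = "first\nsecond\n".each_line.map { |line| line.chomp }

# map without a block hands back an Enumerator; call to_a for an Array.
enum = [1, 2, 3].map

# Binary files should be read in binary mode.
data = [0xFF, 0xD8, 0x00].pack('C*')
tmp = Tempfile.new('blob')
tmp.binmode
tmp.write(data)
tmp.close
bytes = File.open(tmp.path, 'rb') { |f| f.read }
tmp.unlink
```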

ree-1.8.7 to 1.9.3

  • Set operations require an enumerable operand rather than a bare element, e.g. Set.new([1,2]) - [1] instead of Set.new([1,2]) - 1
  • Date.parse no longer understands mm/dd/yyyy format (which Facebook uses)
  • Removed String#to_a
  • when 'condition': 'returnvalue' became when 'condition' then 'returnvalue'
  • Hash literal syntax {:k1, val, :k2, val2, ...} removed; use {:k1 => val, :k2 => val2, ...}
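
The replacements for the constructs above might look roughly like this. It is a sketch with made-up sample values; the method name `label` is hypothetical, and Date.strptime is just one way to handle the mm/dd/yyyy format.

```ruby
require 'set'
require 'date'

# Set difference takes an enumerable, not a bare element.
remaining = Set.new([1, 2]) - [1]

# Parse mm/dd/yyyy explicitly instead of relying on Date.parse.
fb_date = Date.strptime("04/18/2012", "%m/%d/%Y")

# String#to_a is gone; String#lines gives the same line-by-line split.
lines = "a\nb\n".lines.to_a

# "when ... :" is gone; use "then" (or a newline) instead.
def label(value)
  case value
  when 'condition' then 'returnvalue'
  else 'other'
  end
end

# A flat key/value list can still be turned into a Hash via Hash[].
h = Hash[:k1, 1, :k2, 2]
```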

ree-1.8.7 to 1.9.3

  • 1.9.3 went live at 12:35

mysql gem to mysql2 gem

  • Benchmarking showed that when the same query was repeated 20 times, about 1 in 20 runs took on the order of 100 ms instead of 0.1-0.4 ms
  • Performing a basic "SELECT * FROM" query on a table with 30k rows and fields of nearly every Ruby-representable data type, then iterating over every row using an #each-like method yielding a block*:
                        user     system      total        real
    Mysql2          0.750000   0.180000   0.930000 (  1.821655)
    do_mysql        1.650000   0.200000   1.850000 (  2.811357)
    Mysql           7.500000   0.210000   7.710000 (  8.065871)
  • (* https://github.com/brianmario/mysql2/)

When 'stable' isn't

  • Prepared Statement Cache for database queries
  • New feature in Rails 3.1
  • Allows `SELECT * FROM users WHERE id = 1' to be compiled as `SELECT * FROM users WHERE id = ?' and bind `1' as the param
  • Prepared statements execute faster
  • ... except on MySQL, there is no significant speed gain
  • We launched Rails 3.2.3 on the night of 2012/4/18

The next morning...

  • ActiveRecord::StatementInvalid: Mysql::Error: Can't create more than max_prepared_stmt_count statements (current value: 16382): SELECT `articles`.* FROM `articles` WHERE `articles`.`id` = ? LIMIT 1
  • Google fu => https://github.com/rails/rails/issues/5121
  • Statement cache breaks for :has_many relations: SELECT * FROM articles WHERE user_id = ? AND group_id = 1
  • Notice how only one id is parameterized, so every distinct group_id produces a new prepared statement
  • Eventually, you hit the MySQL default limit of 16382 prepared statements
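
To see why the count keeps growing, here is a toy model of a statement cache keyed by SQL text. It is not Rails or MySQL code; the class `ToyStatementCache` and its methods are hypothetical stand-ins, and only the limit of 16382 comes from the error above.

```ruby
# Toy model: one prepared statement is kept per distinct SQL string.
class ToyStatementCache
  class LimitReached < StandardError; end

  def initialize(limit)
    @limit = limit
    @statements = {}
  end

  # Returns the cached handle, or "prepares" a new one if the SQL is unseen.
  def prepare(sql)
    return @statements[sql] if @statements.key?(sql)
    raise LimitReached, "more than #{@limit} statements" if @statements.size >= @limit
    @statements[sql] = "stmt_#{@statements.size + 1}"
  end

  def size
    @statements.size
  end
end

cache = ToyStatementCache.new(16_382)

# Fully parameterized: 1000 executions share a single statement.
1000.times { cache.prepare("SELECT * FROM articles WHERE id = ?") }
after_full = cache.size

# Partially parameterized: each literal group_id is a brand new statement.
(1..500).each do |group_id|
  cache.prepare("SELECT * FROM articles WHERE user_id = ? AND group_id = #{group_id}")
end
after_partial = cache.size
```

With enough distinct literal values, the toy cache raises LimitReached, which mirrors the error seen in production.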

Moving to bleeding edge

  • fd3984 allows the Statement Cache to be disabled
  • We vendored Rails and put HEAD at fd3984
    commit fd398475afb64e362059a500e5cd54d08b9afdee
    Author: Aaron Patterson <aaron.patterson@gmail.com>
    Date:   Tue Feb 21 15:08:54 2012 -0800
    
    prepared statements can be disabled
  • We did this nearly 2 months after the commit landed, and it still hadn't made it into a stable release

Master-Slave

  • In Rails 2.1, `masochism' Gem directed writes to the master and reads to the slave
  • In Rails 2.3, evaluated several alternatives (Octopus, DbCharmer); switched to the master_slave Gem
  • With each subsequent update, we had to repatch master_slave and our own "ar_extensions" code
  • Because of sitewide optimizations, we decided that SELECTs off of the master were not taxing enough to worry about getting read-from-slaves set up
  • Slaves exist today for failover purposes

Over-siloed database topologies

  • Database topology was too "smart"
  • Some large tables (cause_memberships ~900m rows)
  • `in_silo' broke with every major point release we upgraded to
  • Only silo databases once you hit performance problems

Caching ARel Relations

  • Shame on us for storing ActiveRecord objects in memcache!
  • Model.where(:id => x) is an ActiveRecord::Relation
  • If you cache this, you're caching the un-evaluated query
  • Every time it's retrieved from the cache, it evaluates
  • ... probably not what you wanted memcache to do
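
The effect can be imitated without Rails. In this sketch `LazyQuery` is a hypothetical stand-in for ActiveRecord::Relation and a Hash of marshaled values stands in for memcache; the point is only to count how often the "query" runs.

```ruby
# Stand-in for a relation: it remembers its conditions and only "runs"
# the query when the results are actually requested.
class LazyQuery
  class << self
    attr_accessor :runs
  end
  self.runs = 0

  def initialize(ids)
    @ids = ids
  end

  def to_a
    self.class.runs += 1           # pretend this hits the database
    @ids.map { |id| { :id => id } }
  end
end

# Stand-in for memcache: values are serialized on write, rebuilt on read.
def cache_write(store, key, value)
  store[key] = Marshal.dump(value)
end

def cache_read(store, key)
  Marshal.load(store[key])
end

store = {}

# Caching the un-evaluated query: every read runs it again.
cache_write(store, 'lazy', LazyQuery.new([1, 2]))
3.times { cache_read(store, 'lazy').to_a }
runs_when_caching_query = LazyQuery.runs

# Caching the evaluated result: the query runs once, at write time.
LazyQuery.runs = 0
cache_write(store, 'rows', LazyQuery.new([1, 2]).to_a)
3.times { cache_read(store, 'rows') }
runs_when_caching_rows = LazyQuery.runs
```

In Rails terms the second variant corresponds to forcing evaluation (for example with to_a) before writing to the cache.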

Mailers

  • Mailer.deliver_themailer
  • Mailer.themailer().deliver
  • Custom behavior implemented via common superclass
  • Reimplemented in terms of interceptors
  • There was no easy way to convert this and we ended up verifying each mailer by hand
  • Premailer adds an additional layer of complexity
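
An interceptor is a class with a delivering_email class method that receives each message before it is sent. The sketch below is illustrative only: the class `SubjectTagInterceptor`, its tagging behavior, and the `FakeMessage` struct are hypothetical and are not the custom behavior from our superclass.

```ruby
# Hypothetical interceptor that prefixes every outgoing subject with a tag.
class SubjectTagInterceptor
  TAG = '[staging]'

  # Invoked with the message just before delivery.
  def self.delivering_email(message)
    unless message.subject.to_s.start_with?(TAG)
      message.subject = "#{TAG} #{message.subject}"
    end
    message
  end
end

# In a Rails 3 initializer the interceptor would be registered with:
#   ActionMailer::Base.register_interceptor(SubjectTagInterceptor)

# Stand-in message so that the sketch runs without Rails.
FakeMessage = Struct.new(:subject)
msg = FakeMessage.new('Welcome')
SubjectTagInterceptor.delivering_email(msg)
SubjectTagInterceptor.delivering_email(msg) # second pass leaves it unchanged
```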

The Static Asset Pipeline

A.K.A. Silver Bullet

  • Huge performance boost via concatenation and minification
  • Slow and painstaking process
  • Minimal effort up front to get rolling: symlink, then incrementally migrate and SASSify
  • Huge changes required to the deploy process

Rollout

  • Special care needed with async jobs
  • Need separate queues, each running a different Rails stack
  • Gradual roll out across application servers
  • Memcache may require invalidation
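
One way to keep two stacks apart during a gradual rollout is to tag queue names and cache keys with the stack that produced them. This is a sketch under that assumption, not a description of our actual setup; the helper names and the 'rails23' / 'rails32' tags are made up.

```ruby
# Hypothetical helpers: the stack tag would come from the running app.
def queue_for(stack, name)
  "#{name}_#{stack}"
end

def cache_key_for(stack, key)
  "#{stack}:#{key}"
end

# Jobs enqueued by the old stack are only picked up by old-stack workers.
old_queue = queue_for('rails23', 'mailers')
new_queue = queue_for('rails32', 'mailers')

# Objects cached by one stack are never read back by the other.
old_key = cache_key_for('rails23', 'user/42')
new_key = cache_key_for('rails32', 'user/42')
```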

Takeaways

  • Stay as close to vanilla Rails as possible
  • Be prepared to pull in commits ahead of Rails stable
  • Minimize your external dependencies (lean Gemfile)
  • Simplicity is a win for the product and for the code base ("Dumb is the new clever")
  • Beware ARel and lazily evaluated queries

Was it all worth it?

  • Huge performance gains (Ruby 1.9, Asset Pipeline, mysql2 adapter)
  • Able to use the latest shiny toys (Haml, Sass, Compass, Jasmine, RSpec 2, etc.)
  • Years of technical debt paid off, code deleted
  • Developer productivity and happiness higher than ever