Changing decimal type precision of a MySQL column on a big table

Ups! Maybe the requirements changed or maybe, just like me, you only later learned what precision means when defining decimal type for a mysql column. If you are like me, let me share something…

Imagine we have a rating column for a product. Before, the rating could be a decimal number between 1.0 and 5.0, now, we want to turn that into rating between 1.0 and 10.0. In the past, since we knew only one decimal point can be behind the number so we set precision to 1:

t.decimal :rating, precision: 1

Well – wrong assumption and since that was set and our table grew to million of records, now, we have a bit of an issue with the new requirement.

To clarify, precision for a decimal in mysql does not mean digits behind the decimal point – it includes all the digits and scale is what defines how many digits are after a decimal point. DECIMAL(M,N) says that M is the number of significant digits that are stored for values, and N represents the number of digits that can be stored following the decimal point.

So, to satisfy the new requirement, we need to modify `rating` column to have new decimal type – DECIMAL(3,1) (ex. 10.0 – 3 digits total, one after decimal point). Unfortunately, changing the type for the existing column will lock the table and we might experience downtime.

We could apply a different fix for this particular problem with ratings, but here we are learning how to change a decimal type of the column without the downtime so let’s focus on that.

So how do we go from what we have to the same named column with correct type? Well, here are the steps:

  1. Add new column with correct type 
  2. Update model to handle writing to both columns whenever possible 
  3. Migrate data for existing records to newly created column
  4. Rename old rating column to old_rating 
  5. Rename ‘rating_new` to rating
  6.  Remove custom getter and setter for rating
  7.  Celebrate!

Each of these steps will ensure the app is still running as expected while we switch from rating (incorrect type) to rating_new (correct_type) and then back to rating column but now with correct type.

Add new column with correct type

t.decimal :rating_new, precision: 3, decimal: 1

Handle writing to new column

Model should now understand that both columns are to be written to. And we make it safe around deploys by always making sure the column exist before we save.

This gives us necessary freedom for the deploy and for later changes on the name of the column.

In the model `/app/models/product.rb`

def rating=(value)	
  raise_rating_save_error = true	

  if Product.column_names.include?('rating_new')	
    write_attribute(:rating, value)	
    raise_rating_save_error = false	
  end	

  if Product.column_names.include?('rating')	
    if value.to_d < 10 || raise_rating_save_error	
      write_attribute(:rating, value)	
    end	
  end	
end

A simple test for the methods could look like:

# Minitest
test "Storing rating under 9, stores to both rating and rating new" do 
  rating = 9  
  product = create(:product, rating: rating) 

  if Product.column_names.include?('rating') 
    assert_equal rating, product[:rating]  
    assert_equal rating, product.rating  
  end 

  if Product.column_names.include?('rating_new') 
    assert_equal rating, product[:rating_new]  
    assert_equal rating, product.rating_new  
  end 
end 

And for reading we taking raiting_new first whenever possible:

def rating	
  self[:rating_new] || self[:rating]	
end

Migrate data for existing records to newly created column

We need to make sure all the data from rating is now present also in the rating_new, we can use whatever approach to data migrations we want – rake task is the simplest and it could look something like:

desc 'Copy product rating to rating_new'
task copy_rating_to_rating_new: :environment do
  Product.where(rating_new: nil).where.not(rating: nil).find_each do |product|
    # choose strategy for updating 
    # background job / direct update
  end
end

Rename old rating column to old_rating 

rename_column :products, :rating, old_rating

Rename rating_new to rating

rename_column :products, :rating_new, rating

 Remove custom getter and setter for rating

Now that we have rating with the correct decimal type, we can remove all the additional code in /app/models/product.rb and /test/models/product.rb

 Celebrate!

Yay! Things work. There is still the old_rating column that we might want to remove in the maintenance window, bu at least for now, we can accept more refined ratings. The old rating column has still the same old name with new precision!