Rails 6.1 adds *_previously_was attribute methods

This blog is part of our Rails 6.1 series.

Rails 6.1 adds *_previously_was attribute methods for dirty tracking. After a model is saved, *_previously_was returns the value the attribute had before that save.

Before Rails 6.1, to retrieve the previous attribute value, we used *_previous_change or previous_changes.

Here is how it can be used.

Rails 6.0.0

>> user = User.new
=> #<User id: nil, name: nil, email: nil, created_at: nil, updated_at: nil>

>> user.name = "Sam"

# *_was returns the original value. In this example, the name was initially nil.
>> user.name_was
=> nil
>> user.save!

# After save, the original value is set to "Sam". To retrieve the
# previous value, we had to use `previous_changes`.
>> user.previous_changes[:name]
=> [nil, "Sam"]

Rails 6.1.0

>> user = User.find_by(name: "Sam")
=> #<User id: 1, name: "Sam", email: nil, created_at: "2019-10-14 17:53:06", updated_at: "2019-10-14 17:53:06">

>> user.name = "Nick"
>> user.name_was
=> "Sam"

>> user.save!

>> user.previous_changes[:name]
=> ["Sam", "Nick"]

# *_previously_was returns the previous value.
>> user.name_previously_was
=> "Sam"

# After reload, all the dirty tracking
# attributes are reset.
>> user.reload
>> user.name_previously_was
=> nil
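To make the bookkeeping concrete, here is a minimal plain-Ruby sketch of the idea. It is not ActiveModel::Dirty; TinyUser and its instance variables are hypothetical names used only for illustration, and only a single name attribute is tracked.

```ruby
# A toy model with one tracked attribute. TinyUser is a hypothetical
# class for illustration, not the Rails implementation.
class TinyUser
  attr_accessor :name
  attr_reader :previous_changes

  def initialize(name = nil)
    @name = name
    @saved_name = name      # value as of the last save
    @previous_changes = {}  # changes applied by the last save
  end

  # Value before the current unsaved change.
  def name_was
    @saved_name
  end

  # Saving moves the pending change into previous_changes.
  def save!
    @previous_changes = @name == @saved_name ? {} : { name: [@saved_name, @name] }
    @saved_name = @name
    true
  end

  # Value before the last save, or nil if that save changed nothing.
  def name_previously_was
    @previous_changes.fetch(:name, []).first
  end

  # Reloading discards all dirty-tracking state.
  def reload
    @name = @saved_name
    @previous_changes = {}
    self
  end
end

user = TinyUser.new("Sam")
user.name = "Nick"
user.name_was                    # => "Sam"
user.save!
user.name_previously_was         # => "Sam"
user.reload.name_previously_was  # => nil
```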

Check out the pull request for more details on this.


Rails 6 adds guard against DNS Rebinding attacks

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

In a DNS Rebinding attack, a malicious webpage runs client-side script when it is loaded, to attack endpoints within a given network.

What is DNS Rebinding attack?

DNS Rebinding can be summarized as follows.

  • An unsuspecting victim is tricked into loading rebinding.network, which is resolved by a DNS server controlled by a malicious entity.
  • The victim’s web browser sends a DNS query and gets the real IP address of http://rebinding.network, say 24.56.78.99. The DNS server also sets a very short TTL (say, 1 second) on the response so that the client won’t cache it for long.
  • The script on this webpage cannot directly attack services running in the local network because of the same-origin policy enforced by the victim’s web browser. Instead, it starts sending POST requests to http://rebinding.network/setup/reboot with a JSON payload {params: factory-reset}.
  • The first few requests are indeed sent to 24.56.78.99 (the real IP address) using the cached DNS info. Once the browser observes that the cache entry has gone stale, it sends out a new DNS query for rebinding.network.
  • When the malicious DNS server gets the query a second time, instead of responding with 24.56.78.99 (the real IP address of rebinding.network), it responds with 192.168.1.90, the address of a poorly secured smart device on the victim’s local network.

Using this exploit, an attacker is able to factory-reset a device which relied on security provided by local network.
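The sequence above can be simulated in a few lines of plain Ruby. This is only a sketch of the timing involved: RebindingResolver and CachingClient are hypothetical classes, no real DNS traffic is involved, and the clock is passed in by hand so that nothing has to wait.

```ruby
# Hypothetical attacker-controlled resolver: the first lookup returns the
# real address, every later lookup returns a private address.
class RebindingResolver
  def initialize(real_ip:, internal_ip:)
    @real_ip = real_ip
    @internal_ip = internal_ip
    @lookups = 0
  end

  def resolve(_hostname)
    @lookups += 1
    @lookups == 1 ? @real_ip : @internal_ip
  end
end

# Hypothetical browser-side DNS cache that honours the record's TTL.
class CachingClient
  def initialize(resolver, ttl:)
    @resolver = resolver
    @ttl = ttl
    @cached_ip = nil
    @expires_at = nil
  end

  # `now` is passed in so the example needs no real waiting.
  def ip_for(hostname, now:)
    if @cached_ip.nil? || now >= @expires_at
      @cached_ip = @resolver.resolve(hostname)
      @expires_at = now + @ttl
    end
    @cached_ip
  end
end

resolver = RebindingResolver.new(real_ip: "24.56.78.99", internal_ip: "192.168.1.90")
client = CachingClient.new(resolver, ttl: 1)

client.ip_for("rebinding.network", now: 0.0) # => "24.56.78.99"
client.ip_for("rebinding.network", now: 0.5) # => "24.56.78.99" (cached)
client.ip_for("rebinding.network", now: 1.5) # => "192.168.1.90" (rebound)
```

The hostname never changes, which is why the browser keeps treating the requests as same-origin even though they now reach a different machine.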

This attack is explained in much more detail in this blog post.

How does it affect Rails?

Rails’s web console was particularly vulnerable to Remote Code Execution (RCE) via DNS Rebinding.

In this blog post, Ben Murphy goes into technical details of exploiting this vulnerability to open Calculator app (only works in OS X).

How does Rails 6 mitigate DNS Rebinding?

Rails mitigates DNS Rebinding attacks by maintaining a whitelist of hosts it accepts requests for. This is achieved with a new HostAuthorization middleware. The middleware leverages the fact that the Host request header is a forbidden header, so client-side scripts cannot set it.

# taken from Rails documentation

# Allow requests from subdomains like `www.product.com` and
# `beta1.product.com`.
Rails.application.config.hosts << /.*\.product\.com/

In the above example, Rails would render a blocked host template if it receives a request whose Host header is not in the above whitelist.

In the development environment, the default whitelist includes 0.0.0.0/0, ::/0 (CIDR notations for the IPv4 and IPv6 default routes) and localhost. For all other environments, config.hosts is empty and host header checks are not done.
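The check itself can be pictured with a small plain-Ruby sketch. This is not the HostAuthorization middleware; host_allowed? is a hypothetical helper, it ignores ports, and it only assumes that the list may hold strings, regular expressions and IPAddr ranges as in the examples above.

```ruby
require "ipaddr"

# Hypothetical helper, not the Rails middleware itself.
# Returns true when `host` matches any entry in `allowed`.
def host_allowed?(host, allowed)
  return true if allowed.empty? # empty list: no host check is done

  allowed.any? do |entry|
    case entry
    when Regexp
      # Anchor the pattern so it has to match the whole host.
      /\A#{entry}\z/.match?(host)
    when IPAddr
      begin
        entry.include?(IPAddr.new(host))
      rescue IPAddr::Error
        false # host is a name, not an IP address
      end
    else
      entry.to_s == host
    end
  end
end

allowed = [/.*\.product\.com/, IPAddr.new("127.0.0.0/8"), "localhost"]

host_allowed?("www.product.com", allowed)   # => true
host_allowed?("127.0.0.1", allowed)         # => true
host_allowed?("rebinding.network", allowed) # => false
```

Since the browser still sends rebinding.network in the Host header after the rebind, a request like the last one is rejected even though it reaches the right IP address.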


Rails 6 adds ActiveStorage::Blob#open

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Rails 6 adds ActiveStorage::Blob#open which downloads a blob to a tempfile on disk and yields the tempfile.

>> blob = ActiveStorage::Blob.first
=> <ActiveStorage::Blob id: 1, key: "6qXeoibkvohP4VJiU4ytaEkH", filename: "Screenshot 2019-08-26 at 10.24.40 AM.png", ..., created_at: "2019-08-26 09:57:30">

>> blob.open do |tempfile|
>>   puts tempfile.path  #do some processing
>> end
# Output: /var/folders/67/3n96myxs1rn5q_c47z7dthj80000gn/T/ActiveStorage-1-20190826-73742-mve41j.png

Processing a blob

Let’s take an example of a face detection application where the user images are uploaded. Let’s assume that the images are uploaded on S3.

Before Rails 6, we had to download the image into the system’s memory, process it with an image processing program and then send the processed image back to the S3 bucket.

The overhead

If the processing operation is successful, the original file can be deleted from the system. We need to take care of a lot of uncertain events from the download phase till the phase when the processed image is created.

ActiveStorage::Blob#open to the rescue

ActiveStorage::Blob#open abstracts away all these complications and gives us a tempfile which is closed and unlinked once the block is executed.

 1. open takes care of handling all the fanfare of getting a blob object to a tempfile.
 2. open takes care of the tempfile cleanup after the block.

>> blob = ActiveStorage::Blob.first
>> blob.open do |tempfile|
>>   tempfile  #do some processing
>> end
   # once the given block is executed
   # the tempfile is closed and unlinked

=> #<Tempfile: (closed)> 

By default, tempfiles are created in Dir.tmpdir directory, but ActiveStorage::Blob#open also takes an optional argument tmpdir to set a custom directory for storing the tempfiles.

>> Dir.tmpdir
=> "/var/folders/67/3n96myxs1rn5q_c47z7dthj80000gn/T"

>> blob = ActiveStorage::Blob.first
>> blob.open(tmpdir: "/desired/path/to/save") do |tempfile|
>>   puts tempfile.path  #do some processing
>> end
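The yield-then-clean-up pattern can be sketched in plain Ruby with Tempfile. FakeBlob is a hypothetical stand-in for a blob: the data is held in memory instead of being downloaded from a storage service, and the code is an illustration of the pattern rather than the Active Storage implementation.

```ruby
require "tempfile"
require "tmpdir"

# Hypothetical stand-in for a stored blob; `data` plays the role of the
# bytes that would be downloaded from the storage service.
class FakeBlob
  def initialize(filename, data)
    @filename = filename
    @data = data
  end

  # Writes the data to a tempfile, yields it, then always cleans up.
  def open(tmpdir: Dir.tmpdir)
    tempfile = Tempfile.new(["FakeBlob-", File.extname(@filename)], tmpdir)
    tempfile.binmode
    tempfile.write(@data)
    tempfile.flush
    tempfile.rewind
    yield tempfile
  ensure
    # Runs even if the block raises, so no stray files are left behind.
    tempfile&.close!
  end
end

blob = FakeBlob.new("screenshot.png", "fake image bytes")
path = nil
blob.open do |file|
  path = file.path
  file.read # => "fake image bytes"
end
File.exist?(path) # => false, the tempfile is already unlinked
```

Putting the cleanup in ensure is what spares the caller from handling the failure cases by hand.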

Here is the relevant commit.


Rails 6 adds ActionMailer#email_address_with_name

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

When using ActionMailer::Base#mail, if we want to display name and email address of the user in email, we can pass a string in format "John Smith" <john@example.com> in to, from or reply_to options.

Before Rails 6, we had to join name and email address using string interpolation as mentioned in Rails 5.2 Guides and shown below.

  email_with_name = %("John Smith" <john@example.com>)
  mail(
    to: email_with_name,
    subject: 'Hey Rails 5.2!'
  )

The problem with string interpolation is that it doesn’t escape unexpected special characters, such as quotes ("), in the name.

Here’s an example.

Rails 5.2

irb(main):001:0> %("John P Smith" <john@example.com>)
=> "\"John P Smith\" <john@example.com>"

irb(main):002:0> %('John "P" Smith' <john@example.com>)
=> "'John \"P\" Smith' <john@example.com>"

Rails 6 adds ActionMailer::Base#email_address_with_name to join name and email address in the format "John Smith" <john@example.com> and take care of escaping special characters.

Rails 6.1.0.alpha

irb(main):001:0> ActionMailer::Base.email_address_with_name("john@example.com", "John P Smith")
=> "John P Smith <john@example.com>"

irb(main):002:0> ActionMailer::Base.email_address_with_name("john@example.com", 'John "P" Smith')
=> "\"John \\\"P\\\" Smith\" <john@example.com>"

  mail(
    to: email_address_with_name("john@example.com", "John Smith"),
    subject: 'Hey Rails 6!'
  )
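To see what the escaping involves, here is a rough plain-Ruby sketch. format_address is a hypothetical helper written for this post and is not how Rails builds the string; it only quotes the display name when it contains characters that need quoting, and escapes backslashes and double quotes inside it.

```ruby
# Hypothetical helper, not the Rails implementation.
# Quotes the display name only when it needs quoting.
def format_address(address, name)
  return address if name.nil? || name.strip.empty?

  # Characters that are not allowed in an unquoted display name.
  if name.match?(/[()<>\[\]:;@\\,."]/)
    escaped = name.gsub(/[\\"]/) { |char| "\\#{char}" }
    %("#{escaped}" <#{address}>)
  else
    "#{name} <#{address}>"
  end
end

puts format_address("john@example.com", "John P Smith")
# prints: John P Smith <john@example.com>
puts format_address("john@example.com", 'John "P" Smith')
# prints: "John \"P\" Smith" <john@example.com>
```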

Here’s the relevant pull request for this change.


Rails 6 raises ArgumentError if param contains colon

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

The :param option in routes is used to override default resource identifier i.e. :id.

Suppose we want the product :name to be the resource identifier instead of :id while defining routes for products. In this case, the :param option comes in handy. We will see below how we can use this option.

Before Rails 6, if a resource’s custom param contained a colon, Rails treated the part after the colon as an extra param. This should not be the case because it sneaks in a param that was never declared.

An issue was raised in Aug, 2017 which was later fixed in February this year.

So, now Rails 6 raises ArgumentError if a resource custom param contains a colon(:).

Let’s check out how it works.

Rails 5.2

Let’s create routes for products with custom param as name/:pzn.

>> Rails.application.routes.draw do
>>   resources :products, param: 'name/:pzn'
>> end
$ rake routes | grep products
products     GET    /products(.:format)                    products#index
             POST   /products(.:format)                    products#create
new_product  GET    /products/new(.:format)                products#new
edit_product GET    /products/:name/:pzn/edit(.:format)    products#edit
product      GET    /products/:name/:pzn(.:format)         products#show
             PATCH  /products/:name/:pzn(.:format)         products#update
             PUT    /products/:name/:pzn(.:format)         products#update
             DELETE /products/:name/:pzn(.:format)         products#destroy

As we can see, Rails also considers :pzn as a parameter.

Now let’s see how it works in Rails 6.

Rails 6.0.0.rc1

>> Rails.application.routes.draw do
>>   resources :products, param: 'name/:pzn'
>> end
$ rake routes | grep products

rake aborted!
ArgumentError: :param option can't contain colons
/Users/amit/.rvm/gems/ruby-2.6.3/gems/actionpack-6.0.0.rc1/lib/action_dispatch/routing/mapper.rb:1149:in `initialize'
/Users/amit/.rvm/gems/ruby-2.6.3/gems/actionpack-6.0.0.rc1/lib/action_dispatch/routing/mapper.rb:1472:in `new'
/Users/amit/.rvm/gems/ruby-2.6.3/gems/actionpack-6.0.0.rc1/lib/action_dispatch/routing/mapper.rb:1472:in `block in resources'
...
...
...
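The guard amounts to validating the option before any route is built. A minimal sketch follows; TinyResource is a hypothetical class and not the mapper code from Action Pack, and only the error message is taken from the output above.

```ruby
# Hypothetical stand-in for the routing resource object.
class TinyResource
  attr_reader :name, :param

  def initialize(name, param: :id)
    if param.to_s.include?(":")
      raise ArgumentError, ":param option can't contain colons"
    end

    @name = name
    @param = param.to_sym
  end

  # Path of the member routes, e.g. /products/:name
  def member_path
    "/#{name}/:#{param}"
  end
end

TinyResource.new(:products, param: :name).member_path # => "/products/:name"

begin
  TinyResource.new(:products, param: "name/:pzn")
rescue ArgumentError => e
  e.message # => ":param option can't contain colons"
end
```

Failing at definition time surfaces the mistake when the routes are loaded rather than leaving an unexpected extra segment in the generated paths.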

Here is the relevant issue and the pull request.


Rails 6 introduces new code loader called Zeitwerk

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Zeitwerk is the new code loader that comes with Rails 6 by default. In addition to providing autoloading, eager loading, and reloading capabilities, it also improves the classical code loader by being efficient and thread safe. According to the author of Zeitwerk, Xavier Noria, one of the main motivations for writing Zeitwerk was to keep code DRY and to remove the brittle require calls.

Zeitwerk is available as a gem with no additional dependencies. It means any regular Ruby project can use Zeitwerk.

How to use Zeitwerk

Zeitwerk is baked in a Rails 6 project, thanks to the Zeitwerk-Rails integration. For a non-Rails project, adding the following into the project’s entry point sets up Zeitwerk.

loader = Zeitwerk::Loader.new
loader.push_dir(...)
loader.setup

For gem maintainers, Zeitwerk provides the handy .for_gem utility method.

The following example from the Zeitwerk documentation illustrates the usage of the Zeitwerk::Loader.for_gem method.

#lib/my_gem.rb (main file)

require "zeitwerk"
loader = Zeitwerk::Loader.for_gem
loader.setup

module MyGem
  # Since the setup has been performed, at this point we are already
  # able to reference project constants, in this case MyGem::MyLogger.
  include MyLogger
end

How does Zeitwerk work?

Before we look into Zeitwerk’s internals, the following section provides a quick refresher on constant-resolution in Ruby and how classical code loader of Rails works.

Ruby’s constant resolution looks for a constant in the following places.

  • In each entry of Module.nesting
  • In each entry of Module.nesting.first.ancestors

It triggers the const_missing callback when it can’t find the constant.

Ruby used to look for constants in Object.ancestors as well, but that no longer seems to be the case. An in-depth explanation of constant resolution can be found at Conrad Irwin’s blog.
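The lookup order can be observed directly in plain Ruby. In the sketch below, Garage, Car, Vehicle and Truck are hypothetical names, and the const_missing override is there only to show when the callback fires.

```ruby
module Garage
  WHEELS = 4

  class Car
    def self.nesting
      Module.nesting
    end

    def self.wheels
      WHEELS # found lexically through Module.nesting (Garage)
    end
  end
end

class Vehicle
  FUEL = "petrol"
end

class Truck < Vehicle
  def self.fuel
    FUEL # found through the ancestors of Truck
  end

  # Called only when the lookups above fail.
  def self.const_missing(name)
    "#{name} is not defined"
  end

  def self.missing
    Trailer
  end
end

Garage::Car.nesting # => [Garage::Car, Garage]
Garage::Car.wheels  # => 4
Truck.fuel          # => "petrol"
Truck.missing       # => "Trailer is not defined"
```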

Classical Code Loader in Rails

Classical code loader (code loader in Rails version < 6.0) achieves autoloading by overriding Module#const_missing and loads the missing constant without the need for an explicit require call as long as the code follows certain conventions.

  • The file should be within a directory in ActiveSupport::Dependencies.autoload_paths
  • A file should be named after the class, i.e Admin::RoutesController => admin/routes_controller.rb
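A heavily simplified sketch of this approach is shown below. LegacyApp and AUTOLOAD_PATHS are hypothetical names, the hook is scoped to one namespace instead of overriding Module#const_missing globally, and the real classical loader handles many more cases (nested namespaces, reloading, and so on).

```ruby
require "tmpdir"

# Hypothetical namespace with a const_missing hook that maps a constant
# name to a file name, in the spirit of the classical loader.
module LegacyApp
  AUTOLOAD_PATHS = []

  def self.const_missing(name)
    # RoutesController -> routes_controller.rb
    file_name = name.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase + ".rb"
    path = AUTOLOAD_PATHS.map { |dir| File.join(dir, file_name) }
                         .find { |candidate| File.file?(candidate) }
    return super unless path

    load path
    # Fall back to the usual NameError if the file did not define it.
    const_defined?(name, false) ? const_get(name, false) : super
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "routes_controller.rb"), <<~RUBY)
    module LegacyApp
      class RoutesController; end
    end
  RUBY

  LegacyApp::AUTOLOAD_PATHS << dir
  # Loaded on first reference, without an explicit require.
  controller = LegacyApp::RoutesController
  controller.name # => "LegacyApp::RoutesController"
end
```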

Zeitwerk Mode

Zeitwerk takes an entirely different approach in autoloading by registering constants to be autoloaded by Ruby.

Consider the following configuration in which Zeitwerk manages lib directory and lib has automobile.rb file.

loader.push_dir('./lib')

Zeitwerk then uses Module#autoload to tell Ruby that “Automobile” can be found in “lib/automobile.rb”.

autoload "Automobile", "lib/automobile.rb"
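Module#autoload can be tried out in isolation. The sketch below writes a throwaway automobile.rb into a temporary directory (an assumption made so the example is self-contained) and shows that the file is loaded only when the constant is first referenced.

```ruby
require "tmpdir"

Dir.mktmpdir do |dir|
  path = File.join(dir, "automobile.rb")
  File.write(path, <<~RUBY)
    class Automobile
      def self.wheels
        4
      end
    end
  RUBY

  # Register the constant; nothing is loaded yet.
  Object.autoload(:Automobile, path)
  Object.autoload?(:Automobile) # => path, the file is still pending

  # The first reference makes Ruby require the file.
  Automobile.wheels             # => 4
  Object.autoload?(:Automobile) # => nil, the constant is now loaded
end
```

Because Ruby itself performs the load, the usual constant resolution rules apply and no const_missing override is needed.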

Unlike the classical loader, Zeitwerk takes module nesting into account while loading constants. It leverages the new TracePoint API to look for constants defined in subdirectories when a new class or module is defined.

Let us look at an example to understand this better.

class Automobile
  # => Tracepoint hook triggers here.
  # include Engine
end

When the tracepoint hook triggers, Zeitwerk checks for an automobile directory in the same level as automobile.rb and sets up Module.autoload for that directory and all the files (in this case ./automobile/engine.rb) within that directory.
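The hook itself is plain Ruby. The following sketch only records which classes are being defined; what Zeitwerk does inside its own hook is more involved, and defined_names is a hypothetical variable used for illustration.

```ruby
defined_names = []

# The :class event fires whenever a `class` or `module` body is opened.
trace = TracePoint.new(:class) do |tp|
  defined_names << tp.self.name
end

# Trace only while the block runs.
trace.enable do
  class Automobile
    class Engine; end
  end
end

defined_names # => ["Automobile", "Automobile::Engine"]
```

A loader can use the moment Automobile is opened to register autoloads for everything under an automobile directory, before the class body references any of those constants.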

Conclusion

Previously in Rails, we had a code loader that was riddled with gotchas and struggled to be thread safe. Zeitwerk does a better job by leveraging the new Ruby standard API and matches Ruby’s semantics for constants.


Rails 6 adds ActiveSupport::ActionableError

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

When working in a team on a Rails application, we often bump into PendingMigrationError or other errors that need us to run a rails command, rake task etc.

Rails 6 introduces a way to resolve such frequent errors in development from the error page itself.

Rails 6 adds the ActiveSupport::ActionableError module to define actions we want to perform on errors, right from the error page.

For example, this is what the PendingMigrationError page looks like in Rails 6.

How Actionable error looks like in Rails 6

By default, a button that says Run pending migrations is added to the error screen. Clicking on this button dispatches the rails db:migrate action. The page reloads once the migrations run successfully.

We can also define custom actions to execute on errors.

How to define actions on error?

We need to include the ActiveSupport::ActionableError module in our error class. We can monkey patch an existing error class or define a custom error class.

The action API is provided to define actions on an error. The first argument to action is the name of the action. This string is displayed on the button on the error page. The second argument is a block where we can write commands or code to fix the error.

Let’s take an example of seeding posts data from the controller if no posts are present.

# app/controllers/posts_controller.rb

class PostsController < ApplicationController

  def index
    @posts = Post.all
    if @posts.empty?
      raise PostsMissingError
    end
  end

end

# app/errors/posts_missing_error.rb

class PostsMissingError < StandardError

  include ActiveSupport::ActionableError

  action "seed posts data" do
    Rails::Command.invoke 'posts:seed'
  end

end

# lib/tasks/posts.rake

namespace :posts do

  desc 'posts seed task'
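  # Depend on :environment so that the Post model is loaded
  # when the task is run on its own.
  task seed: :environment do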
    Post.create(title: 'First Post')
  end

end

# app/views/posts/index.html.erb

<% @posts.each do |post| %>
  <%= post.title %>
<% end %>

Let’s check /posts (posts#index action) when no posts are present. We would get an error page with an action button on it as shown below.

Actionable error - seed posts data

Clicking on seed posts data action button will run our rake task and create posts. Rails will automatically reload /posts after running rake task.

Posts index page

ActionDispatch::ActionableExceptions middleware takes care of invoking actions from error page. ActionableExceptions middleware dispatches action to ActionableError and redirects back when action block has successfully run. Action buttons are added on error page from this middleware template.
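The mechanics of registering and dispatching actions can be sketched in plain Ruby. TinyActionable, DemoPostsMissingError and SEEDED_POSTS are hypothetical names; this is an illustration of the pattern, not the ActiveSupport::ActionableError source, and the dispatch call stands in for what the middleware does when the button is clicked.

```ruby
# Hypothetical re-creation of the idea, not the Rails module itself.
module TinyActionable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Name => block registry for this error class.
    def actions
      @actions ||= {}
    end

    # DSL used inside the error class body.
    def action(name, &block)
      actions[name] = block
    end

    # What a middleware would call when the button is clicked.
    def dispatch(name)
      actions.fetch(name) { raise ArgumentError, "unknown action: #{name}" }.call
    end
  end
end

SEEDED_POSTS = []

class DemoPostsMissingError < StandardError
  include TinyActionable

  action "seed posts data" do
    SEEDED_POSTS << "First Post"
  end
end

DemoPostsMissingError.actions.keys # => ["seed posts data"]
DemoPostsMissingError.dispatch("seed posts data")
SEEDED_POSTS                       # => ["First Post"]
```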

Check out the pull request for more information on actionable error.


This is what our workspaces look like

BigBinary has been remote and flexible since the start, and it’s one of the best things a company can offer. You don’t need to spend hours commuting, and you can work when you feel productive. Working remotely also means that you have the flexibility of working from Starbucks, from a library or from your home. You can set up your own workspace at home and still have the office-like feeling.

We recently got a chance to see the workspaces of our colleagues when everyone shared photos of their work environments on Slack. It was fun seeing everyone’s desk and setup, so we thought it would be fun to share a peek at our home offices. Here we go.

Akhil Gautam

BigBinary Remote Workspace

Amit Choudhary

BigBinary Remote Workspace

Chimed Palden

BigBinary Remote Workspace

Chirag Shah

BigBinary Remote Workspace BigBinary Remote Workspace

Ershad Kunnakkadan

BigBinary Remote Workspace

Mohit Natoo

BigBinary Remote Workspace

BigBinary Remote Workspace BigBinary Remote Workspace

Neeraj Singh

BigBinary Remote Workspace BigBinary Remote Workspace

Nitin Kalasannavar

BigBinary Remote Workspace

Paras Bansal

BigBinary Remote Workspace

Pranav Raj

BigBinary Remote Workspace

Prathamesh Sonpatki

BigBinary Remote Workspace

Rahul Mahale

BigBinary Remote Workspace

Rishi Mohan

BigBinary Remote Workspace BigBinary Remote Workspace BigBinary Remote Workspace BigBinary Remote Workspace BigBinary Remote Workspace

Shibin Madassery

BigBinary Remote Workspace

Sony Mathew

BigBinary Remote Workspace

Sunil Kumar

BigBinary Remote Workspace

Tyler and Naiara

BigBinary Remote Workspace

Unnikrishnan KP

BigBinary Remote Workspace

Vishal Telangre

BigBinary Remote Workspace


Rails 6 add_foreign_key & remove_foreign_key SQLite3

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Rails provides add_foreign_key to add foreign key constraint for a column on a table.

It also provides remove_foreign_key to remove the foreign key constraint.

Before Rails 6, add_foreign_key and remove_foreign_key were not supported for SQLite3.

Rails 6 now adds this support. Now, we can create and remove foreign key constraints using add_foreign_key and remove_foreign_key in SQLite3.

Let’s check out how it works.

Rails 5.2

We have two tables named as orders and users. Now, let’s add foreign key constraint of users in orders table using add_foreign_key and then try removing it using remove_foreign_key.

>> class AddUserReferenceToOrders < ActiveRecord::Migration[5.2]
>>   def change
>>     add_column :orders, :user_id, :integer
>>     add_foreign_key :orders, :users
>>   end
>> end

=> :change

>> AddUserReferenceToOrders.new.change
-- add_column(:orders, :user_id, :integer)
   (1.2ms)  ALTER TABLE "orders" ADD "user_id" integer
   -> 0.0058s
-- add_foreign_key(:orders, :users)
   -> 0.0000s

=> nil

>> class RemoveUserForeignKeyFromOrders < ActiveRecord::Migration[5.2]
>>   def change
>>     remove_foreign_key :orders, :users
>>   end
>> end

=> :change

>> RemoveUserForeignKeyFromOrders.new.change
-- remove_foreign_key(:orders, :users)
   -> 0.0001s

=> nil

We can see that add_foreign_key and remove_foreign_key are ignored by Rails 5.2 with SQLite3.

Rails 6.0.0.rc1

We have two tables named as orders and users. Now, let’s add foreign key constraint of users in orders table using add_foreign_key.

>> class AddUserReferenceToOrders < ActiveRecord::Migration[6.0]
>>   def change
>>     add_column :orders, :user_id, :integer
>>     add_foreign_key :orders, :users
>>   end
>> end

=> :change

>> AddUserReferenceToOrders.new.change
-- add_column(:orders, :user_id, :integer)
   (1.0ms)  SELECT sqlite_version(*)
   (2.9ms)  ALTER TABLE "orders" ADD "user_id" integer
   -> 0.0091s
-- add_foreign_key(:orders, :users)
   (0.0ms)  begin transaction
   (0.1ms)  PRAGMA foreign_keys
   (0.1ms)  PRAGMA defer_foreign_keys
   (0.0ms)  PRAGMA defer_foreign_keys = ON
   (0.1ms)  PRAGMA foreign_keys = OFF
   (0.2ms)  CREATE TEMPORARY TABLE "aorders" ("id" integer NOT NULL PRIMARY KEY, "number" varchar DEFAULT NULL, "total" decimal DEFAULT NULL, "completed_at" datetime DEFAULT NULL, "created_at" datetime(6) NOT NULL, "updated_at" datetime(6) NOT NULL, "user_id" integer DEFAULT NULL)
   (0.1ms)  INSERT INTO "aorders" ("id","number","total","completed_at","created_at","updated_at","user_id")
                     SELECT "id","number","total","completed_at","created_at","updated_at","user_id" FROM "orders"
   (0.3ms)  DROP TABLE "orders"
   (0.1ms)  CREATE TABLE "orders" ("id" integer NOT NULL PRIMARY KEY, "number" varchar DEFAULT NULL, "total" decimal DEFAULT NULL, "completed_at" datetime DEFAULT NULL, "created_at" datetime(6) NOT NULL, "updated_at" datetime(6) NOT NULL, "user_id" integer DEFAULT NULL, CONSTRAINT "fk_rails_f868b47f6a"
FOREIGN KEY ("user_id")
  REFERENCES "users" ("id")
)
   (0.1ms)  INSERT INTO "orders" ("id","number","total","completed_at","created_at","updated_at","user_id")
                     SELECT "id","number","total","completed_at","created_at","updated_at","user_id" FROM "aorders"
   (0.1ms)  DROP TABLE "aorders"
   (0.0ms)  PRAGMA defer_foreign_keys = 0
   (0.0ms)  PRAGMA foreign_keys = 1
   (0.6ms)  commit transaction
   -> 0.0083s

=> []

Now, let’s remove foreign key constraint of users from orders table using remove_foreign_key.

>> class RemoveUserForeignKeyFromOrders < ActiveRecord::Migration[6.0]
>>   def change
>>     remove_foreign_key :orders, :users
>>   end
>> end

=> :change

>> RemoveUserForeignKeyFromOrders.new.change
-- remove_foreign_key(:orders, :users)
   (1.4ms)  SELECT sqlite_version(*)
   (0.0ms)  begin transaction
   (0.0ms)  PRAGMA foreign_keys
   (0.0ms)  PRAGMA defer_foreign_keys
   (0.0ms)  PRAGMA defer_foreign_keys = ON
   (0.0ms)  PRAGMA foreign_keys = OFF
   (0.2ms)  CREATE TEMPORARY TABLE "aorders" ("id" integer NOT NULL PRIMARY KEY, "number" varchar DEFAULT NULL, "total" decimal DEFAULT NULL, "completed_at" datetime DEFAULT NULL, "created_at" datetime(6) NOT NULL, "updated_at" datetime(6) NOT NULL, "user_id" integer DEFAULT NULL)
   (0.3ms)  INSERT INTO "aorders" ("id","number","total","completed_at","created_at","updated_at","user_id")
                     SELECT "id","number","total","completed_at","created_at","updated_at","user_id" FROM "orders"
   (0.4ms)  DROP TABLE "orders"
   (0.1ms)  CREATE TABLE "orders" ("id" integer NOT NULL PRIMARY KEY, "number" varchar DEFAULT NULL, "total" decimal DEFAULT NULL, "completed_at" datetime DEFAULT NULL, "created_at" datetime(6) NOT NULL, "updated_at" datetime(6) NOT NULL, "user_id" integer DEFAULT NULL)
   (0.1ms)  INSERT INTO "orders" ("id","number","total","completed_at","created_at","updated_at","user_id")
                     SELECT "id","number","total","completed_at","created_at","updated_at","user_id" FROM "aorders"
   (0.1ms)  DROP TABLE "aorders"
   (0.0ms)  PRAGMA defer_foreign_keys = 0
   (0.0ms)  PRAGMA foreign_keys = 1
   (0.7ms)  commit transaction
   -> 0.0179s

=> []

We can see here that with Rails 6, add_foreign_key and remove_foreign_key work on SQLite3 and add and remove the foreign key constraint respectively.

Here is the relevant pull request.


Rails 6 adds ActionDispatch::Request::Session#dig

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Rails 6 added ActionDispatch::Request::Session#dig.

This works the same way as Hash#dig.

It extracts the nested value specified by the sequence of keys.

Hash#dig was introduced in Ruby 2.3.

Before Rails 6, we could achieve the same thing by first converting session to a hash and then calling Hash#dig on it.

Let’s check out how it works.

Rails 5.2

Let’s add some user information in session and use dig after converting it to a hash.

>> session[:user] = { email: 'jon@bigbinary.com', name: { first: 'Jon', last: 'Snow' }  }

=> {:email=>"jon@bigbinary.com", :name=>{:first=>"Jon", :last=>"Snow"}}

>> session.to_hash

=> {"session_id"=>"5fe8cc73c822361e53e2b161dcd20e47", "_csrf_token"=>"gyFd5nEEkFvWTnl6XeVbJ7qehgL923hJt8PyHVCH/DA=", "return_to"=>"http://localhost:3000", "user"=>{:email=>"jon@bigbinary.com", :name=>{:first=>"Jon", :last=>"Snow"}}}


>> session.to_hash.dig("user", :name, :first)

=> "Jon"

Rails 6.0.0.rc1

Let’s add the same information to session and now use dig on session object without converting it to a hash.

>> session[:user] = { email: 'jon@bigbinary.com', name: { first: 'Jon', last: 'Snow' }  }

=> {:email=>"jon@bigbinary.com", :name=>{:first=>"Jon", :last=>"Snow"}}

>> session.dig(:user, :name, :first)

=> "Jon"
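Note that the session stores its top-level keys as strings ("user" in the to_hash output above), while the nested hash keeps its symbol keys. A dig method therefore only needs to normalize the first key. The sketch below shows that idea with a hypothetical TinySession class; it is an illustration and not the Rails session object.

```ruby
# Hypothetical session wrapper; top-level keys are stored as strings.
class TinySession
  def initialize
    @data = {}
  end

  def []=(key, value)
    @data[key.to_s] = value
  end

  def [](key)
    @data[key.to_s]
  end

  # Only the first key addresses the session itself, so only that key
  # is converted to a string; the rest are passed through unchanged.
  def dig(first_key, *rest)
    @data.dig(first_key.to_s, *rest)
  end
end

session = TinySession.new
session[:user] = { email: "jon@bigbinary.com", name: { first: "Jon", last: "Snow" } }

session.dig(:user, :name, :first) # => "Jon"
session.dig("user", :name, :last) # => "Snow"
session.dig(:cart, :items)        # => nil
```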

Here is the relevant pull request.


Marking arrays of translations safe using html suffix

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Before Rails 6

Before Rails 6, keys with the _html suffix in the language locale files are automatically marked as HTML safe. These HTML safe keys do not get escaped when used in the views.

# config/locales/en.yml

en:
  home:
    index:
      title_html: <h2>We build web & mobile applications</h2>
      description_html: We are a dynamic team of <em>developers</em> and <em>designers</em>.
      sections:
        blogs:
          title_html: <h3>Blogs & publications</h3>
          description_html: We regularly write our blog. Our blogs are covered by <strong>Ruby Inside</strong> and <strong>Ruby Weekly Newsletter</strong>.

<!-- app/views/home/index.html.erb -->

<%= t('.title_html') %>
<%= t('.description_html') %>

<%= t('.sections.blogs.title_html') %>
<%= t('.sections.blogs.description_html') %>

Once rendered, this page looks like this.

rails-6-supports-marking-arrays-of-translations-as-html-safe/before-rails-6-i18n-_html-suffix-without-array-key.png

This way of marking translations as HTML safe by adding _html suffix to the keys does not work as expected when the value is an array.

# config/locales/en.yml

en:
  home:
    index:
      title_html: <h2>We build web & mobile applications</h2>
      description_html: We are a dynamic team of <em>developers</em> and <em>designers</em>.
      sections:
        blogs:
          title_html: <h3>Blogs & publications</h3>
          description_html: We regularly write our blog. Our blogs are covered by <strong>Ruby Inside</strong> and <strong>Ruby Weekly Newsletter</strong>.
        services:
          title_html: <h3>Services we offer</h3>
          list_html:
            - <strong>Ruby on Rails</strong>
            - React.js &#9883;
            - React Native &#9883; &#128241;

<!-- app/views/home/index.html.erb -->

<%= t('.title_html') %>
<%= t('.description_html') %>

<%= t('.sections.blogs.title_html') %>
<%= t('.sections.blogs.description_html') %>

<%= t('.sections.services.title_html') %>
<ul>
  <% t('.sections.services.list_html').each do |service| %>
    <li><%= service %></li>
  <% end %>
</ul>

The rendered page escapes the unsafe HTML while rendering the array of translations for the key .sections.services.list_html even though that key has the _html suffix.

rails-6-supports-marking-arrays-of-translations-as-html-safe/before-rails-6-i18n-_html-suffix-with-array-key.png

A workaround is to manually mark all the translations in that array as HTML safe using the methods such as #raw or #html_safe.

<!-- app/views/home/index.html.erb -->

<%= t('.title_html') %>
<%= t('.description_html') %>

<%= t('.sections.blogs.title_html') %>
<%= t('.sections.blogs.description_html') %>

<%= t('.sections.services.title_html') %>
<ul>
  <% t('.sections.services.list_html').each do |service| %>
    <li><%= service.html_safe %></li>
  <% end %>
</ul>

rails-6-supports-marking-arrays-of-translations-as-html-safe/rails-6-i18n-array-key-with-_html-suffix.png

Arrays of translations are trusted as HTML safe by using the ‘_html’ suffix in Rails 6

Rails 6 fixes this unexpected behavior: an array of translations is now marked as HTML safe when its key has the _html suffix.

# config/locales/en.yml

en:
  home:
    index:
      title_html: <h2>We build web & mobile applications</h2>
      description_html: We are a dynamic team of <em>developers</em> and <em>designers</em>.
      sections:
        blogs:
          title_html: <h3>Blogs & publications</h3>
          description_html: We regularly write our blog. Our blogs are covered by <strong>Ruby Inside</strong> and <strong>Ruby Weekly Newsletter</strong>.
        services:
          title_html: <h3>Services we offer</h3>
          list_html:
            - <strong>Ruby on Rails</strong>
            - React.js &#9883;
            - React Native &#9883; &#128241;

<!-- app/views/home/index.html.erb -->

<%= t('.title_html') %>
<%= t('.description_html') %>

<%= t('.sections.blogs.title_html') %>
<%= t('.sections.blogs.description_html') %>

<%= t('.sections.services.title_html') %>
<ul>
  <% t('.sections.services.list_html').each do |service| %>
    <li><%= service %></li>
  <% end %>
</ul>

rails-6-supports-marking-arrays-of-translations-as-html-safe/rails-6-i18n-array-key-with-_html-suffix.png

We can see above that we no longer need to manually mark the translations as HTML safe for the key .sections.services.list_html using methods such as #raw or #html_safe since that key has the _html suffix.
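The behavior can be pictured with a small plain-Ruby sketch. SafeString, TRANSLATIONS, translate and render are hypothetical names standing in for the HTML-safe string class, the locale file, the t helper and ERB escaping respectively; none of this is the Rails implementation.

```ruby
# Hypothetical marker class standing in for an HTML-safe string.
class SafeString < String; end

HTML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;"
}.freeze

TRANSLATIONS = {
  "list_html" => ["<strong>Ruby on Rails</strong>", "React.js"],
  "list"      => ["<strong>Ruby on Rails</strong>"]
}.freeze

# Marks the value safe when the key ends in _html. Arrays are handled
# element by element, which is the behavior described for Rails 6.
def translate(key)
  value = TRANSLATIONS.fetch(key)
  return value unless key.end_with?("_html")

  value.is_a?(Array) ? value.map { |item| SafeString.new(item) } : SafeString.new(value)
end

# Escapes everything except strings that were marked safe.
def render(value)
  value.is_a?(SafeString) ? value.to_s : value.gsub(/[&<>"']/, HTML_ESCAPES)
end

translate("list_html").map { |item| render(item) }
# => ["<strong>Ruby on Rails</strong>", "React.js"]
translate("list").map { |item| render(item) }
# => ["&lt;strong&gt;Ruby on Rails&lt;/strong&gt;"]
```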

To learn more about this feature, please check out rails/rails#32361.


Rails 6 adds filter_attributes on ActiveRecord::Base

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

A lot of times, we ask users for sensitive data such as passwords, credit card numbers etc. We should not be able to see this information in logs. So, there must be a way in Rails to filter out these parameters from logs.

Rails provides a way of doing this. We can add parameters to Rails.application.config.filter_parameters.

There is one more way of doing this in Rails. We can also use ActionDispatch::Http::FilterParameters (https://api.rubyonrails.org/classes/ActionDispatch/Http/FilterParameters.html).

However, there is still a security issue when we call inspect on an ActiveRecord object for logging purposes. In this case, Rails before version 6 does not consider Rails.application.config.filter_parameters and displays the sensitive information.

Rails 6 fixes this. It considers Rails.application.config.filter_parameters while inspecting an object.

Rails 6 also provides an alternative way to filter columns on ActiveRecord level by adding filter_attributes on ActiveRecord::Base.

In Rails 6, filter_attributes on ActiveRecord::Base takes priority over Rails.application.config.filter_parameters.

Let’s check out how it works.

Rails 6.0.0.rc1

Let’s create a user record and call inspect on it.

>> class User < ApplicationRecord
>>  validates :email, :password, presence: true
>> end

=> {:presence=>true}

>> User.create(email: 'john@bigbinary.com', password: 'john_wick_bigbinary')
BEGIN
  User Create (0.6ms)  INSERT INTO "users" ("email", "password", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"  [["email", "john@bigbinary.com"], ["password", "john_wick_bigbinary"], ["created_at", "2019-05-17 21:34:34.504394"], ["updated_at", "2019-05-17 21:34:34.504394"]]
COMMIT

=> #<User id: 2, email: "john@bigbinary.com", password: [FILTERED], created_at: "2019-05-17 21:34:34", updated_at: "2019-05-17 21:34:34">

We can see that password is filtered as it is added to Rails.application.config.filter_parameters by default in config/initializers/filter_parameter_logging.rb.

Now let’s add just :email to User.filter_attributes.

>> User.filter_attributes = [:email]

=> [:email]

>> User.first.inspect
SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 1]]

=> "#<User id: 2, email: [FILTERED], password: \"john_wick_bigbinary\", created_at: \"2019-05-17 21:34:34\", updated_at: \"2019-05-17 21:34:34\">"

We can see here that User.filter_attributes took priority over Rails.application.config.filter_parameters and removed filtering from password and filtered just email.

Now, let’s add both :email and :password to User.filter_attributes.

>> User.filter_attributes = [:email, :password]

=> [:email, :password]

>> User.first.inspect

=> "#<User id: 2, email: [FILTERED], password: [FILTERED], created_at: \"2019-05-17 21:34:34\", updated_at: \"2019-05-17 21:34:34\">"

We can see that now both email and password are filtered out.
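The precedence shown above can be modeled in plain Ruby. This is only an illustrative sketch and not how Rails implements it: GLOBAL_FILTERS, FakeRecord and FakeUser are hypothetical names, and the sketch matches attribute names exactly, which is a simplification. It shows a model-level list fully replacing the application-wide list when an object is inspected.

```ruby
# Illustrative model only; the names below are hypothetical, not Rails APIs.
GLOBAL_FILTERS = [:password].freeze # stands in for config.filter_parameters

class FakeRecord
  class << self
    attr_writer :filter_attributes

    # A model-level list, when set, replaces the global list entirely.
    def filter_attributes
      @filter_attributes || GLOBAL_FILTERS
    end
  end

  def initialize(attributes)
    @attributes = attributes
  end

  # Mask every attribute whose name is in the effective filter list.
  def inspect
    pairs = @attributes.map do |name, value|
      shown = self.class.filter_attributes.include?(name) ? "[FILTERED]" : value.inspect
      "#{name}: #{shown}"
    end
    "#<#{self.class.name} #{pairs.join(', ')}>"
  end
end

class FakeUser < FakeRecord; end

user = FakeUser.new(email: "john@bigbinary.com", password: "john_wick_bigbinary")
default_view = user.inspect            # only the globally filtered password is masked

FakeUser.filter_attributes = [:email]
model_view = user.inspect              # only email is masked; password is visible again
```

As in the console session above, setting the model-level list to just :email removes the filtering from password.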

Here is the relevant pull request.


ArgumentError for invalid :limit & :precision in Rails 6

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Rails 6 raises ArgumentError when :limit and :precision are used with invalid datatypes.

Before Rails 6, it used to raise ActiveRecord::ActiveRecordError.

Let’s check out how it works.

Rails 5.2

Let’s create an orders table and try using an invalid :limit with an integer column named quantity.

>> class CreateOrders < ActiveRecord::Migration[5.2]
>>   def change
>>     create_table :orders do |t|
>>       t.string :item
>>       t.integer :quantity, limit: 10
>>
>>       t.timestamps
>>     end
>>   end
>> end

=> :change

>> CreateOrders.new.change
-- create_table(:orders)

=> Traceback (most recent call last):
        2: from (irb):11
        1: from (irb):3:in 'change'
ActiveRecord::ActiveRecordError (No integer type has byte size 10. Use a numeric with scale 0 instead.)

We can see that using an invalid :limit with an integer column raises ActiveRecord::ActiveRecordError in Rails 5.2.

Now let’s try using :precision of 10 with a datetime column.

>> class CreateOrders < ActiveRecord::Migration[5.2]
>>   def change
>>     create_table :orders do |t|
>>       t.string :item
>>       t.integer :quantity
>>       t.datetime :completed_at, precision: 10
>>
>>       t.timestamps
>>     end
>>   end
>> end

=> :change

>> CreateOrders.new.change
-- create_table(:orders)

=> Traceback (most recent call last):
        2: from (irb):12
        1: from (irb):3:in 'change'
ActiveRecord::ActiveRecordError (No timestamp type has precision of 10. The allowed range of precision is from 0 to 6)

We can see that an invalid value of :precision with a datetime column also raises ActiveRecord::ActiveRecordError in Rails 5.2.

Rails 6.0.0.rc1

Let’s create an orders table and try using an invalid :limit with an integer column named quantity in Rails 6.

>> class CreateOrders < ActiveRecord::Migration[6.0]
>>   def change
>>     create_table :orders do |t|
>>       t.string :item
>>       t.integer :quantity, limit: 10
>>
>>       t.timestamps
>>     end
>>   end
>> end

=> :change

>> CreateOrders.new.change
-- create_table(:orders)

=> Traceback (most recent call last):
        2: from (irb):11
        1: from (irb):3:in 'change'
ArgumentError (No integer type has byte size 10. Use a numeric with scale 0 instead.)

We can see that using an invalid :limit with an integer column raises ArgumentError in Rails 6.

Now let’s try using :precision of 10 with a datetime column.

>> class CreateOrders < ActiveRecord::Migration[6.0]
>>   def change
>>     create_table :orders do |t|
>>       t.string :item
>>       t.integer :quantity
>>       t.datetime :completed_at, precision: 10
>>
>>       t.timestamps
>>     end
>>   end
>> end

=> :change

>> CreateOrders.new.change
-- create_table(:orders)

=> Traceback (most recent call last):
        2: from (irb):12
        1: from (irb):3:in 'change'
ArgumentError (No timestamp type has precision of 10. The allowed range of precision is from 0 to 6)

We can see that an invalid value of :precision with a datetime column also raises ArgumentError in Rails 6.
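One practical consequence is that ArgumentError is not a subclass of ActiveRecord::ActiveRecordError, so code that rescued the old error class around such calls needs to rescue ArgumentError instead. The sketch below is plain Ruby and does not use Rails. integer_sql_type is a hypothetical helper, and its mapping of byte sizes to type names is an assumption made for illustration.

```ruby
# Hypothetical helper, not a Rails API: maps a byte-size limit to an integer type name.
def integer_sql_type(limit)
  case limit
  when nil, 1..4 then "integer"
  when 5..8      then "bigint"
  else
    raise ArgumentError, "No integer type has byte size #{limit}. Use a numeric with scale 0 instead."
  end
end

valid_type = integer_sql_type(8)

# Rescue ArgumentError explicitly to handle an invalid limit.
error_message =
  begin
    integer_sql_type(10)
  rescue ArgumentError => e
    e.message
  end
```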

Here is the relevant pull request.


Rails 6 allows passing custom config to ActionCable::Server::Base

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Before Rails 6, the Action Cable server used the default configuration on boot up unless custom configuration was provided explicitly.

Custom configuration can be specified in either config/cable.yml or config/application.rb as shown below.

# config/cable.yml

production:
  url: redis://redis.example.com:6379
  adapter: redis
  channel_prefix: custom_

Or

# config/application.rb

config.action_cable.cable = { adapter: "redis", channel_prefix: "custom_" }

In some cases, we need another Action Cable server running separately from the application with a different configuration.

The problem is that both approaches mentioned earlier set the Action Cable server configuration when the application boots. This configuration cannot be changed for the second server.

Rails 6 has added a provision to pass custom configuration. It allows us to pass an ActionCable::Server::Configuration object as an option when initializing a new Action Cable server.

config = ActionCable::Server::Configuration.new
config.cable = { adapter: "redis", channel_prefix: "custom_" }

ActionCable::Server::Base.new(config: config)

For more details on Action Cable configurations, head to Action Cable docs.

Here’s the relevant pull request for this change.


Rails 6 adds support of symbol keys

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Rails 6 added support of symbol keys with ActiveSupport::HashWithIndifferentAccess#assoc.

Please note that the Rails 5.2 documentation of ActiveSupport::HashWithIndifferentAccess#assoc shows it working with symbol keys, but it doesn’t.

ActiveSupport::HashWithIndifferentAccess implements a hash where string and symbol keys are considered to be the same.

Before Rails 6, HashWithIndifferentAccess#assoc used to work with just string keys.

Let’s check out how it works.

Rails 5.2

Let’s create an object of ActiveSupport::HashWithIndifferentAccess and call assoc on that object.

>> info = { name: 'Mark', email: 'mark@bigbinary.com' }.with_indifferent_access

=> {"name"=>"Mark", "email"=>"mark@bigbinary.com"}

>> info.assoc(:name)

=> nil

>> info.assoc('name')

=> ["name", "Mark"]

We can see that assoc does not work with symbol keys with ActiveSupport::HashWithIndifferentAccess in Rails 5.2.

Rails 6.0.0.beta2

Now, let’s call assoc on the same hash in Rails 6 with both string and symbol keys.

>> info = { name: 'Mark', email: 'mark@bigbinary.com' }.with_indifferent_access

=> {"name"=>"Mark", "email"=>"mark@bigbinary.com"}

>> info.assoc(:name)

=> ["name", "Mark"]

>> info.assoc('name')

=> ["name", "Mark"]

As we can see, assoc works perfectly fine with both string and symbol keys with ActiveSupport::HashWithIndifferentAccess in Rails 6.
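The idea behind the fix can be sketched in plain Ruby: normalize the key before delegating to Hash#assoc. TinyIndifferentHash below is a hypothetical stand-in written for this illustration. It is not the Rails class, and it only handles the assoc lookup.

```ruby
# Hypothetical stand-in for a hash with indifferent access; not the Rails class.
class TinyIndifferentHash < Hash
  # Build an instance whose keys are all stored as strings.
  def self.from(hash)
    hash.each_with_object(new) { |(key, value), result| result[key.to_s] = value }
  end

  # Convert symbol keys to strings before delegating to Hash#assoc.
  def assoc(key)
    super(key.is_a?(Symbol) ? key.to_s : key)
  end
end

info = TinyIndifferentHash.from(name: "Mark", email: "mark@bigbinary.com")

by_symbol = info.assoc(:name)
by_string = info.assoc("name")
```

Both lookups return the same pair because the symbol is converted to the stored string key first.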

Here is the relevant pull request.


Rails 6 preserves status of #html_safe?

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Before Rails 6

Before Rails 6, calling #html_safe? on a slice of an HTML safe string returned nil.

>> html_content = "<div>Hello, world!</div>".html_safe
# => "<div>Hello, world!</div>"
>> html_content.html_safe?
# => true
>> html_content[0..-1].html_safe?
# => nil

Also, before Rails 6, the ActiveSupport::SafeBuffer#* method did not preserve the HTML safe status either.

>> line_break = "<br />".html_safe
# => "<br />"
>> line_break.html_safe?
# => true
>> two_line_breaks = (line_break * 2)
# => "<br /><br />"
>> two_line_breaks.html_safe?
# => nil

Rails 6 returns expected status of #html_safe?

In Rails 6, both of the above cases have been fixed properly.

Therefore, we will now get the status of #html_safe? as expected.

>> html_content = "<div>Hello, world!</div>".html_safe
# => "<div>Hello, world!</div>"
>> html_content.html_safe?
# => true
>> html_content[0..-1].html_safe?
# => true

>> line_break = "<br />".html_safe
# => "<br />"
>> line_break.html_safe?
# => true
>> two_line_breaks = (line_break * 2)
# => "<br /><br />"
>> two_line_breaks.html_safe?
# => true
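To see why a slice or a repetition can lose its status, it helps to remember that such operations produce a new string object. The plain Ruby sketch below uses a hypothetical TinySafeString class, not ActiveSupport::SafeBuffer, and shows one way of re-wrapping the results so that the flag survives. It is a simplified model and does not reproduce the real buffer's escaping rules.

```ruby
# Hypothetical stand-in for an HTML-safe buffer; not ActiveSupport::SafeBuffer.
class TinySafeString < String
  def html_safe?
    true
  end

  # Re-wrap slices so that the result keeps the safe flag.
  def [](*args)
    slice = super
    slice.nil? ? nil : TinySafeString.new(slice)
  end

  # Re-wrap repeated strings as well.
  def *(count)
    TinySafeString.new(super)
  end
end

html_content = TinySafeString.new("<div>Hello, world!</div>")
whole_slice = html_content[0..-1]

line_break = TinySafeString.new("<br />")
two_line_breaks = line_break * 2
```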

Please check rails/rails#33808 and rails/rails#36012 for the relevant changes.


Recyclable cache keys in Rails

Recyclable cache keys, or cache versioning, were introduced in Rails 5.2. Large applications frequently need to invalidate their cache because the cache store has limited memory. We can optimize cache storage and minimize cache misses using recyclable cache keys.

Recyclable cache keys are supported by all cache stores that ship with Rails.

Before Rails 5.2, the format of cache_key was {model_name}/{id}-{updated_at}. Here model_name and id are always constant for an object, and updated_at changes on every update.

Rails 5.1

>> post = Post.last

>> post.cache_key
=> "posts/1-20190522104553296111"

# Update post
>> post.touch

>> post.cache_key
=> "posts/1-20190525102103422069" # cache_key changed

In Rails 5.2, with cache versioning enabled, #cache_key returns {model_name}/{id} and the new method #cache_version returns {updated_at}.

Rails 5.2

>> ActiveRecord::Base.cache_versioning = true

>> post = Post.last

>> post.cache_key
=> "posts/1"

>> post.cache_version
=> "20190522070715422750"

>> post.cache_key_with_version
=> "posts/1-20190522070715422750"

Let’s update the post instance and check how cache_key and cache_version behave.

>> post.touch

>> post.cache_key
=> "posts/1" # cache_key remains same

>> post.cache_version
=> "20190527062249879829" # cache_version changed

To use the cache versioning feature, we have to enable the ActiveRecord::Base.cache_versioning configuration. By default, cache_versioning is set to false for backward compatibility.

We can enable cache versioning configuration globally as shown below.

ActiveRecord::Base.cache_versioning = true
# or
config.active_record.cache_versioning = true

The cache versioning config can also be applied at the model level.

class Post < ActiveRecord::Base
  self.cache_versioning = true
end

# Or, when setting `#cache_versioning` outside the model -

Post.cache_versioning = true

Let’s understand the problem step by step with cache keys before Rails 5.2.

Rails 5.1 (without cache versioning)

1. Write post instance to cache using fetch api.

>> before_update_cache_key = post.cache_key
=> "posts/1-20190527062249879829"

>> Rails.cache.fetch(before_update_cache_key) { post }
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 06:22:49">

2. Update post instance using touch.

>> post.touch
   (0.1ms)  begin transaction
  Post Update (1.6ms)  UPDATE "posts" SET "updated_at" = ? WHERE "posts"."id" = ?  [["updated_at", "2019-05-27 08:01:52.975653"], ["id", 1]]
   (1.2ms)  commit transaction
=> true

3. Verify stale cache_key in cache store.

>> Rails.cache.fetch(before_update_cache_key)
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 06:22:49">

4. Write updated post instance to cache using new cache_key.

>> after_update_cache_key = post.cache_key
=> "posts/1-20190527080152975653"

>> Rails.cache.fetch(after_update_cache_key) { post }
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 08:01:52">

5. Cache store now has two copies of post instance.

>> Rails.cache.fetch(before_update_cache_key)
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 06:22:49">

>> Rails.cache.fetch(after_update_cache_key)
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 08:01:52">

A cache_key and its associated instance become irrelevant as soon as the instance is updated. But the stale entry stays in the cache store until it is manually invalidated.

This sometimes results in the cache store overflowing with stale keys and data. In applications that use the cache store extensively, a huge chunk of it frequently gets filled with stale data.

Now let’s take a look at the same example, this time with cache versioning, to understand how recyclable cache keys help optimize cache storage.

Rails 5.2 (cache versioning)

1. Write post instance to cache store with version option.

>> ActiveRecord::Base.cache_versioning = true

>> post = Post.last

>> cache_key = post.cache_key
=> "posts/1"

>> before_update_cache_version = post.cache_version
=> "20190527080152975653"

>> Rails.cache.fetch(cache_key, version: before_update_cache_version) { post }
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 08:01:52">

2. Update post instance.

>> post.touch
   (0.1ms)  begin transaction
  Post Update (0.4ms)  UPDATE "posts" SET "updated_at" = ? WHERE "posts"."id" = ?  [["updated_at", "2019-05-27 09:09:15.651029"], ["id", 1]]
   (0.7ms)  commit transaction
=> true

3. Verify stale cache_version in cache store.

>> Rails.cache.fetch(cache_key, version: before_update_cache_version)
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 08:01:52">

4. Write updated post instance to cache.

>> after_update_cache_version = post.cache_version
=> "20190527090915651029"

>> Rails.cache.fetch(cache_key, version: after_update_cache_version) { post }
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 09:09:15">

5. Cache store has replaced old copy of post with new version automatically.

>> Rails.cache.fetch(cache_key, version: before_update_cache_version)
=> nil

>> Rails.cache.fetch(cache_key, version: after_update_cache_version)
=> #<Post id: 1, title: "First Post", created_at: "2019-05-22 17:23:22", updated_at: "2019-05-27 09:09:15">

The above example shows how recyclable cache keys maintain a single, latest copy of an instance. The stale version is replaced automatically when a new version is written to the cache store under the same key.
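The five steps above can be modeled with a small in-memory store. This is a plain Ruby sketch under stated assumptions, not ActiveSupport::Cache: TinyVersionedCache and its Entry struct are hypothetical, and the store keeps exactly one slot per key with the version stored next to the value.

```ruby
# Hypothetical in-memory store; not ActiveSupport::Cache.
class TinyVersionedCache
  Entry = Struct.new(:version, :value)

  def initialize
    @entries = {}
  end

  # Returns the cached value only when the stored version matches.
  # With a block, a miss or a version mismatch overwrites the single slot for the key.
  def fetch(key, version:)
    entry = @entries[key]
    return entry.value if entry && entry.version == version
    return nil unless block_given?

    value = yield
    @entries[key] = Entry.new(version, value)
    value
  end

  def size
    @entries.size
  end
end

cache = TinyVersionedCache.new
cache.fetch("posts/1", version: "20190527080152975653") { "old post" }
cache.fetch("posts/1", version: "20190527090915651029") { "new post" }

stale = cache.fetch("posts/1", version: "20190527080152975653") # old version is gone
fresh = cache.fetch("posts/1", version: "20190527090915651029")
```

Because the key stays the same and only the version changes, the store never holds more than one copy per record in this model.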

Rails 6 added #cache_versioning for ActiveRecord::Relation.

ActiveRecord::Base.collection_cache_versioning configuration should be enabled to use cache versioning feature on collections. It is set to false by default.

We can enable this configuration as shown below.

ActiveRecord::Base.collection_cache_versioning = true
# or
config.active_record.collection_cache_versioning = true

Before Rails 6, ActiveRecord::Relation had cache_key in the format {table_name}/query-{query-hash}-{count}-{max(updated_at)}.

In Rails 6, cache_key is split into a stable part, cache_key ({table_name}/query-{query-hash}), and a volatile part, cache_version ({count}-{max(updated_at)}).

For more information, check out our blog on ActiveRecord::Relation#cache_key in Rails 5.

Rails 5.2

>> posts = Post.all

>> posts.cache_key
=> "posts/query-00644b6a00f2ed4b925407d06501c8fb-3-20190522172326885804"

Rails 6

>> ActiveRecord::Base.collection_cache_versioning = true

>> posts = Post.all

>> posts.cache_key
=> "posts/query-00644b6a00f2ed4b925407d06501c8fb"

>> posts.cache_version
=> "3-20190522172326885804"

Cache versioning works for ActiveRecord::Relation in the same way as it does for ActiveRecord::Base.

In the case of ActiveRecord::Relation, if the number of records changes and/or any record is updated, the same cache_key is written to the cache store with a new cache_version and the updated records.

Conclusion

Previously, cache invalidation had to be done manually, either by deleting the cache entry or by setting a cache expiry duration. Cache versioning invalidates stale data automatically and keeps only the latest copy of the data, which saves storage and improves performance.

Check out the pull request and commit for more details.


Rails 6 deprecates where.not as NOR & Rails 6.1 as NAND

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

A notable deprecation warning has been added in Rails 6 when using where.not with multiple attributes.

Before Rails 6, if we use where.not with multiple attributes, it applies logical NOR (NOT(A) AND NOT(B)) in the WHERE clause of the query. This does not always work as expected.

Let’s look at an example to understand this better.

We have a Post model with a polymorphic association.

Rails 5.2
>> Post.all
=> #<ActiveRecord::Relation [
#<Post id: 1, title: "First Post", source_type: "Feed", source_id: 100>,
#<Post id: 2, title: "Second Post", source_type: "Feed", source_id: 101>]>

>> Post.where(source_type: "Feed", source_id: 100)
=> #<ActiveRecord::Relation [#<Post id: 1, title: "First Post", source_type: "Feed", source_id: 100>]>

>> Post.where.not(source_type: "Feed", source_id: 100)
=> #<ActiveRecord::Relation []>

In the last query, we would expect ActiveRecord to fetch one record (the second post), but it returns none.

Let’s check the SQL generated for the above case.

>> Post.where.not(source_type: "Feed", source_id: 100).to_sql

=> SELECT "posts".* FROM "posts" WHERE "posts"."source_type" != 'Feed' AND "posts"."source_id" != 100

where.not applies AND to the negations of source_type and source_id, and so fails to fetch the expected record.

In such cases, the correct implementation of where.not would be logical NAND (NOT(A) OR NOT(B)).

Let us query the posts table using NAND this time.

>> Post.where("source_type != 'Feed' OR source_id != 100")

   SELECT "posts".* FROM "posts" WHERE (source_type != 'Feed' OR source_id != 100)

=> #<ActiveRecord::Relation [#<Post id: 2, title: "Second Post", source_type: "Feed", source_id: 101>]>

The above query works as expected and returns one record. Rails 6.1 will change where.not to behave as NAND, similar to the above query.
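The difference between the two interpretations can be checked in plain Ruby without a database. The sketch below filters an array of hashes that mirrors the two posts above. It only illustrates the boolean logic and is not how ActiveRecord builds the query.

```ruby
posts = [
  { id: 1, source_type: "Feed", source_id: 100 },
  { id: 2, source_type: "Feed", source_id: 101 }
]

# NOR: NOT(A) AND NOT(B), the behavior of where.not with two attributes before Rails 6.1.
nor_ids = posts
  .select { |post| post[:source_type] != "Feed" && post[:source_id] != 100 }
  .map { |post| post[:id] }

# NAND: NOT(A AND B), which equals NOT(A) OR NOT(B) by De Morgan's law.
nand_ids = posts
  .reject { |post| post[:source_type] == "Feed" && post[:source_id] == 100 }
  .map { |post| post[:id] }
```

NOR excludes every post whose source_type is "Feed", so nothing is left, while NAND excludes only the post that matches both conditions.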

Rails 6.0.0.rc1
>> Post.where.not(source_type: "Feed", source_id: 100)

DEPRECATION WARNING: NOT conditions will no longer behave as NOR in Rails 6.1. To continue using NOR conditions, NOT each conditions manually (`.where.not(:source_type => ...).where.not(:source_id => ...)`). (called from irb_binding at (irb):1)

=> #<ActiveRecord::Relation []>

The deprecation warning makes it clear that if we wish to keep using NOR conditions with multiple attributes, we can chain multiple where.not calls, each with a single predicate.

>> Post.where.not(source_type: "Feed").where.not(source_id: 100)

Here’s the relevant discussion and pull request for this change.


Rails 6 adds support for Optimizer Hints

Rails 6 has added support to provide optimizer hints.

What are Optimizer Hints?

Many relational database management systems (RDBMS) have a query optimizer. The job of the query optimizer is to determine the most efficient plan to execute a given SQL query. The query optimizer has to consider all possible query execution plans before it can determine which plan is optimal for the given SQL query, and then compile and execute that query.

The query optimizer chooses an optimal plan by calculating the cost of each possible plan. Typically, as the number of tables referenced in a join query increases, the time spent in query optimization grows exponentially, which often affects the system’s performance. The fewer execution plans the query optimizer needs to evaluate, the less time is spent in compiling and executing the query.

As application designers, we might have more context about the data stored in our database. With this contextual knowledge, we might be able to choose a more efficient execution plan than the query optimizer.

This is where the optimizer hints or optimizer guidelines come into picture.

Optimizer hints allow us to steer the query optimizer toward a certain query execution plan based on specific criteria. In other words, we can hint the optimizer to use or ignore certain optimization plans.

Usually, optimizer hints should be provided only when executing a complex query involving multiple table joins.

Note that optimizer hints only affect an individual SQL statement. Different databases support different mechanisms for altering optimization strategies at the global level. Optimizer hints provide finer control than those mechanisms.

Optimizer hints are supported by many databases such as MySQL, PostgreSQL with the help of pg_hint_plan extension, Oracle, MS SQL, IBM DB2, etc. with varying syntax and options.

Optimizer Hints in Rails 6

Before Rails 6, we had to execute a raw SQL query to use optimizer hints.

query = "SELECT
            /*+ JOIN_ORDER(articles, users) MAX_EXECUTION_TIME(60000) */
            articles.*
         FROM articles
         INNER JOIN users
         ON users.id = articles.user_id
         WHERE (published_at > '2019-02-17 13:15:44')
        ".squish

ActiveRecord::Base.connection.execute(query)

In the above query, we provided two optimizer hints to MySQL. MySQL expects hints in a comment of the following form.

/*+ HINT_HERE ANOTHER_HINT_HERE ... */

Another approach to use optimizer hints prior to Rails 6 is to use a monkey patch like this.

In Rails 6, using optimizer hints is easier.

The same example looks like this in Rails 6.

Article
  .joins(:user)
  .where("published_at > ?", 2.months.ago)
  .optimizer_hints(
    "JOIN_ORDER(articles, users)",
    "MAX_EXECUTION_TIME(60000)"
  )

This produces the same SQL query as above but the result is of type ActiveRecord::Relation.
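To make the placement of the hints concrete, here is a plain Ruby sketch that assembles the hint comment by hand. with_optimizer_hints is a hypothetical helper written for this illustration. It is not part of Rails and does not show how #optimizer_hints is implemented.

```ruby
# Hypothetical helper, not part of Rails: inserts a hint comment after SELECT.
def with_optimizer_hints(sql, *hints)
  return sql if hints.empty?

  # Hints are joined with spaces inside a single /*+ ... */ comment.
  sql.sub(/\ASELECT\b/i) { |select| "#{select} /*+ #{hints.join(' ')} */" }
end

hinted_sql = with_optimizer_hints(
  "SELECT articles.* FROM articles",
  "JOIN_ORDER(articles, users)",
  "MAX_EXECUTION_TIME(60000)"
)
```

The resulting string places both hints in one comment directly after SELECT, which matches the shape of the raw SQL query shown earlier.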

In PostgreSQL (using the pg_hint_plan extension), the optimizer hints have a different syntax.

Article
  .joins(:user)
  .where("published_at > ?", 2.months.ago)
  .optimizer_hints("Leading(articles users)", "SeqScan(articles)")

Please check out the documentation of each database separately to learn about its support for and syntax of optimizer hints.

To learn more, please check out this PR which introduced the #optimizer_hints method to Rails 6.

Bonus example: Using optimizer hints to speed up a slow SQL statement in MySQL

Consider that we have an articles table with some indexes.

class CreateArticles < ActiveRecord::Migration[6.0]
  def change
    create_table :articles do |t|
      t.string :title, null: false
      t.string :slug, null: false
      t.references :user
      t.datetime :published_at
      t.text :description

      t.timestamps

      t.index :slug, unique: true
      t.index [:published_at]
      t.index [:slug, :user_id]
      t.index [:published_at, :user_id]
      t.index [:title, :slug]
    end
  end
end

Let’s try to fetch all the articles which have been published in the last 2 months.

>> Article.joins(:user).where("published_at > ?", 2.months.ago)
# Article Load (10.5ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:38:18.647296')
=> #<ActiveRecord::Relation [#<Article id: 20, title: "Article 20", slug: "article-20", user_id: 1, ...]>

Let’s use EXPLAIN to investigate why it is taking 10.5ms to execute this query.

>> Article.joins(:user).where("published_at > ?", 2.months.ago).explain
# Article Load (13.9ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:39:05.380577')
=> # EXPLAIN for: SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:39:05.380577')
# +--------+----------+----------------+-----------+------+----------+-------+
# | select |   table  | possible_keys  | key       | rows | filtered | Extra |
# | _type  |          |                |           |      |          |       |
# +--------+----------+----------------+-----------+------+----------+-------+
# | SIMPLE |   users  | PRIMARY        | PRIMARY   | 2    | 100.0    | Using |
# |        |          |                |           |      |          | index |
# +--------+----------+----------------+-----------+------+----------+-------+
# | SIMPLE | articles | index          | index     | 9866 | 10.0     | Using |
# |        |          | _articles      | _articles |      |          | where |
# |        |          | _on_user_id,   | _on       |      |          |       |
# |        |          | index          | _user_id  |      |          |       |
# |        |          | _articles      |           |      |          |       |
# |        |          | _on            |           |      |          |       |
# |        |          | _published_at, |           |      |          |       |
# |        |          | index          |           |      |          |       |
# |        |          | _articles      |           |      |          |       |
# |        |          | _on            |           |      |          |       |
# |        |          | _published_at  |           |      |          |       |
# |        |          | _and_user_id   |           |      |          |       |
# +--------+----------+----------------+-----------+------+----------+-------+

According to the above table, it appears that the query optimizer is considering the users table first and then the articles table.

The rows column indicates the estimated number of rows the query optimizer must examine to execute the query.

The filtered column indicates an estimated percentage of table rows that will be filtered by the table condition.

The formula rows x filtered gives the number of rows that will be joined with the following table.

Also,

  • For users table, the number of rows to be joined with the following table is 2 x 100% = 2,
  • For articles table, the number of rows to be joined with the following table is 9866 x 10% = 986.6.

Since the articles table contains many more records, which reference very few records from the users table, it would be better to consider the articles table first and then the users table.
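The rows x filtered arithmetic can be written as a small Ruby helper. The rows_joined method is a hypothetical name used only for this sketch, and the inputs are the rows and filtered values shown in the EXPLAIN table above.

```ruby
# Estimated rows passed on to the next table: rows * (filtered / 100).
def rows_joined(rows, filtered_percent)
  rows * filtered_percent / 100.0
end

users_estimate    = rows_joined(2, 100.0)    # users row of the EXPLAIN output
articles_estimate = rows_joined(9866, 10.0)  # articles row of the EXPLAIN output
```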

We can hint MySQL to consider the articles table first as follows.

>> Article.joins(:user).where("published_at > ?", 2.months.ago).optimizer_hints("JOIN_ORDER(articles, users)")
# Article Load (2.2ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:54:06.230651')
=> #<ActiveRecord::Relation [#<Article id: 20, title: "Article 20", slug: "article-20", user_id: 1, ...]>

Note that it now took 2.2ms to fetch the same records after providing the JOIN_ORDER(articles, users) optimizer hint.

Let’s use EXPLAIN to see what changed with the JOIN_ORDER(articles, users) optimizer hint.

>> Article.joins(:user).where("published_at > ?", 2.months.ago).optimizer_hints("JOIN_ORDER(articles, users)").explain
# Article Load (4.1ms)  SELECT /*+ JOIN_ORDER(articles, users) */ `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:55:24.335152')
=> # EXPLAIN for: SELECT /*+ JOIN_ORDER(articles, users) */ `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:55:24.335152')
# +--------+----------+----------------+-----------+------+----------+--------+
# | select |   table  | possible_keys  | key       | rows | filtered | Extra  |
# | _type  |          |                |           |      |          |        |
# +--------+----------+----------------+-----------+------+----------+--------+
# | SIMPLE | articles | index          | index     | 769  | 100.0    | Using  |
# |        |          | _articles      | _articles |      |          | index  |
# |        |          | _on_user_id,   | _on       |      |          | condi  |
# |        |          | index          | _publish  |      |          | tion;  |
# |        |          | _articles      | ed_at,    |      |          | Using  |
# |        |          | _on            |           |      |          | where  |
# |        |          | _published_at, |           |      |          |        |
# |        |          | index          |           |      |          |        |
# |        |          | _articles      |           |      |          |        |
# |        |          | _on            |           |      |          |        |
# |        |          | _published_at  |           |      |          |        |
# |        |          | _and_user_id   |           |      |          |        |
# +--------+----------+----------------+-----------+------+----------+--------+
# | SIMPLE | users    | PRIMARY        | PRIMARY   | 2    | 100.0    | Using  |
# |        |          |                |           |      |          | index  |
# +--------+----------+----------------+-----------+------+----------+--------+

The result of the EXPLAIN query shows that the articles table was considered first and then the users table as expected. We can also see that the index_articles_on_published_at index key was considered from the possible keys to execute the given query. The filtered column for both tables shows that the number of filtered rows was 100% which means no filtering of rows occurred.

We hope this example helps in understanding how to use the #explain and #optimizer_hints methods to investigate, debug and fix performance issues.


Rails 6 reports object allocations while rendering

This blog is part of our Rails 6 series. Rails 6.0 was recently released.

Recently, Rails 6 added the allocations feature to ActiveSupport::Notifications::Event. Using this feature, an event subscriber can see how many objects were allocated between the event’s start time and end time. We have written in detail about this feature here.

Taking advantage of this feature, Rails 6 now reports the allocations made while rendering a view template, a partial and a collection.

Started GET "/articles" for ::1 at 2019-04-15 17:24:09 +0530
Processing by ArticlesController#index as HTML
  Rendering articles/index.html.erb within layouts/application
  Rendered shared/_ad_banner.html.erb (Duration: 0.1ms | Allocations: 6)
  Article Load (1.3ms)  SELECT "articles".* FROM "articles"
  ↳ app/views/articles/index.html.erb:5
  Rendered collection of articles/_article.html.erb [100 times] (Duration: 6.1ms | Allocations: 805)
  Rendered articles/index.html.erb within layouts/application (Duration: 17.6ms | Allocations: 3901)
Completed 200 OK in 86ms (Views: 83.6ms | ActiveRecord: 1.3ms | Allocations: 29347)

Notice the Allocations: information in the above logs.

We can see that

  • 6 objects were allocated while rendering shared/_ad_banner.html.erb view partial,
  • 805 objects were allocated while rendering a collection of 100 articles/_article.html.erb view partials,
  • and 3901 objects were allocated while rendering articles/index.html.erb view template.

We can use this information to understand how much time was spent rendering a view template and how many objects were allocated in the process’s memory between the time that view template started rendering and the time it finished.

To learn more about this feature, please check rails/rails#34136.

Note that we can also collect this information by subscribing to Action View hooks.

ActiveSupport::Notifications.subscribe(/^render_.+\.action_view$/) do |event|
  views_path = Rails.root.join("app/views/").to_s
  template_identifier = event.payload[:identifier]
  template_name = template_identifier.sub(views_path, "")
  message = "[#{event.name}] #{template_name} (Allocations: #{event.allocations})"

  ViewAllocationsLogger.log(message)
end

This should log something like this.

[render_partial.action_view] shared/_ad_banner.html.erb (Allocations: 43)

[render_collection.action_view] articles/_article.html.erb (Allocations: 842)

[render_template.action_view] articles/index.html.erb (Allocations: 4108)
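Allocation counts like the ones above can be approximated outside Rails by sampling the interpreter's allocation counter before and after a block. The sketch below is plain Ruby and assumes MRI (CRuby), where GC.stat exposes :total_allocated_objects. count_allocations is a hypothetical helper, and the exact numbers it reports vary between Ruby versions.

```ruby
# Assumes MRI (CRuby), where GC.stat exposes :total_allocated_objects.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Building 100 interpolated strings allocates at least 100 objects.
allocations = count_allocations do
  100.times.map { |i| "article-#{i}" }
end
```

The helper counts every object allocated in the process while the block runs, so the result includes arrays and other intermediate objects, not just the strings.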