Recyclable cache keys, or cache versioning, was introduced in Rails 5.2. Applications frequently need to invalidate their cache because the cache store has limited memory. We can optimize cache storage and minimize cache misses using recyclable cache keys. Cache versioning is supported by all cache stores that ship with Rails.
Before Rails 5.2, cache_key’s format was model_name/id-updated_at (for example, posts/1-20180503200058000000). Here model_name and id are always constant for an object, whereas updated_at changes on every update.
In Rails 5.2, cache_key returns a stable key without the timestamp, and a new method #cache_version returns the timestamp part separately. Let’s update a post instance and check cache_key and cache_version’s behaviour.
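A minimal sketch of the new behaviour, assuming a Post model with cache versioning enabled (the record id and timestamp values are illustrative):

```ruby
post = Post.find(1)

# With cache versioning, the key itself is stable across updates.
post.cache_key      # => "posts/1"
post.cache_version  # derived from updated_at, e.g. "20190407173523000000"

post.touch
post.cache_key      # => "posts/1" (unchanged, hence "recyclable")
post.cache_version  # changes, so the old cached value is treated as stale
```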
To use the cache versioning feature, we have to enable it explicitly, since ActiveRecord::Base.cache_versioning is set to false for backward compatibility. We can enable the cache versioning configuration globally as shown below.
The cache versioning config can also be applied at the model level.
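For reference, the two ways to enable it, a global setting in the app configuration and a per-model switch (the Post model is illustrative):

```ruby
# config/application.rb (or an initializer) - enable globally
config.active_record.cache_versioning = true

# Or enable it for a single model
class Post < ApplicationRecord
  self.cache_versioning = true
end
```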
Let’s understand the problem
step by step
with cache keys
before Rails 5.2.
Cache versioning works similarly for an ActiveRecord::Relation. If the number of records changes and/or record(s) are updated, then the same cache_key is written to the cache store with a new cache_version and the updated records.
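A sketch of how this plays out for a relation (the digest and timestamp values are illustrative):

```ruby
posts = Post.where(published: true)

posts.cache_key     # => "posts/query-<sql digest>" (stable across updates)
posts.cache_version # count plus max updated_at, e.g. "3-20190407173523000000"
```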
Previously, cache invalidation had to be done manually, either by deleting the cache or by setting a cache expiry duration. Cache versioning invalidates stale data automatically and keeps only the latest copy, which drastically saves on storage and improves performance.
Rails 6 has added support
to provide optimizer hints.
What are Optimizer Hints?
Many relational database management systems (RDBMS) have a query optimizer. The job of the query optimizer is to determine the most efficient plan to execute a given SQL query. The query optimizer has to consider all possible query execution plans before it can determine which plan is optimal for executing the given SQL query, and then compile and execute that query. An optimal plan is chosen by the query optimizer by calculating the cost of each possible plan.
When the number of tables referenced in a join query increases, the time spent in query optimization grows exponentially, which often affects the system’s performance. The fewer the execution plans the query optimizer needs to evaluate, the less time is spent in compiling and executing the query.
As an application designer,
we might have more context
about the data stored in our database.
With the contextual knowledge about our database,
we might be able to choose a more efficient execution plan
than the query optimizer.
This is where optimizer hints, or optimizer guidelines, come into the picture.
Optimizer hints allow us to direct the query optimizer to choose a certain query execution plan based on specific criteria. In other words, using optimizer hints, we can hint the optimizer to use or ignore certain optimization plans. Optimizer hints should be provided only when executing a complex query involving multiple table joins.
Note that the optimizer hints
only affect an individual SQL statement.
To alter the optimization strategies at the global level,
there are different mechanisms supported by different databases.
Optimizer hints provide finer, per-statement control than those mechanisms, which alter optimization plans by other means.
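In Rails 6, hints can be passed through the #optimizer_hints query method, which injects them as a comment into the generated SELECT. A sketch with MySQL syntax (the Article model and the hint value are illustrative):

```ruby
Article.optimizer_hints("MAX_EXECUTION_TIME(5000)").where(published: true)
# SELECT /*+ MAX_EXECUTION_TIME(5000) */ `articles`.* FROM `articles`
#   WHERE `articles`.`published` = TRUE
```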
This produces the same SQL query as above, and the result is of type ActiveRecord::Relation.
In PostgreSQL (using the pg_hint_plan extension),
the optimizer hints have a different syntax.
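With the pg_hint_plan extension enabled, the same Rails method can carry PostgreSQL-style hints (the Article model and hint values are illustrative):

```ruby
Article.optimizer_hints("SeqScan(articles)", "Parallel(articles 8 hard)").all
# SELECT /*+ SeqScan(articles) Parallel(articles 8 hard) */ "articles".*
#   FROM "articles"
```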
Please check out the documentation of each database separately to learn the support and syntax of optimizer hints.
To learn more, check out this PR, which introduced the #optimizer_hints method in Rails 6.
Bonus example: Using optimizer hints to speed up a slow SQL statement in MySQL
Consider that we have an articles table with some indexes.
Let’s try to fetch all the articles
which have been published in the last 2 months.
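A sketch of such a query, assuming an Article model that belongs to a User:

```ruby
Article.joins(:user).where("articles.published_at > ?", 2.months.ago)
```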
Let’s use EXPLAIN to understand why it is taking 10.5ms to execute this query.
According to the above table, it appears that the query optimizer is considering the users table first and then the articles table. The rows column indicates the estimated number of rows the query optimizer must examine to execute the query. The filtered column indicates an estimated percentage of table rows that will be filtered by the table condition.
The formula rows x filtered gives the number of rows that will be joined with the following table. For the users table, the number of rows to be joined with the following table is 2 x 100% = 2. For the articles table, it is 500 x 7.79% = 38.95. Since the articles table contains many more records but references very few records from the users table, it would be better to consider the articles table first and then the users table.
We can hint MySQL to consider the articles table first as follows.
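A sketch of the hinted query, assuming the same Article/User models as above:

```ruby
Article
  .joins(:user)
  .optimizer_hints("JOIN_ORDER(articles, users)")
  .where("articles.published_at > ?", 2.months.ago)
# SELECT /*+ JOIN_ORDER(articles, users) */ `articles`.* FROM `articles`
#   INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` ...
```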
Note that it took only 2.2ms now to fetch the same records after providing the JOIN_ORDER(articles, users) optimizer hint. Let’s run EXPLAIN again with this JOIN_ORDER(articles, users) optimizer hint.
The result of the EXPLAIN query shows that the articles table was considered first and then the users table, as expected.
We can also see that the index_articles_on_published_at index key was chosen from the possible keys to execute the given query.
The filtered column for both tables shows that
the number of filtered rows was 100%
which means no filtering of rows occurred.
We hope this example helps you understand how to use #explain to investigate and debug performance issues and then fix them.
Rails 6 has added an allocations feature to the Active Support Instrumentation API. Using this feature, an event subscriber can see how many objects were allocated between the event’s start time and end time. We have written about this feature in detail. Taking advantage of this feature, Rails 6 now reports the allocations made while rendering a view template.
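The rendering log lines look roughly like this; the durations shown are illustrative, while the allocation counts are the ones from this example:

```
Rendered shared/_ad_banner.html.erb (Duration: 0.4ms | Allocations: 6)
Rendered collection of articles/_article.html.erb [100 times] (Duration: 6.1ms | Allocations: 805)
Rendered articles/index.html.erb within layouts/application (Duration: 9.6ms | Allocations: 3901)
```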
Notice the Allocations: information in the above logs.
We can see that
6 objects were allocated while rendering
shared/_ad_banner.html.erb view partial,
805 objects were allocated while rendering
a collection of 100 articles/_article.html.erb view partials,
and 3901 objects were allocated
while rendering articles/index.html.erb view template.
We can use this information to understand how much time was spent rendering a view template and how many objects were allocated in the process’ memory between the time when that view template started rendering and the time when it finished.
Before Rails 6, we had to provide a custom block to perform custom logging around retries and discards of the jobs defined using the Active Job framework.
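Such job-level logging looked roughly like this; the job class, error classes, and log messages are illustrative:

```ruby
class Container::DeleteJob < ApplicationJob
  # The block runs once the retry attempts are exhausted.
  retry_on Timeout::Error, attempts: 3 do |job, error|
    Rails.logger.warn "Gave up retrying #{job.class} due to #{error.message}"
  end

  # The block runs when the job is discarded.
  discard_on ActiveJob::DeserializationError do |job, error|
    Rails.logger.warn "Discarded #{job.class} due to #{error.message}"
  end

  def perform(container_id)
    # delete the container...
  end
end
```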
Notice the custom blocks provided to the retry_on and discard_on methods of an individual job in the above example.
Extracting such custom logic to a base class or to a 3rd-party gem is possible, but it would be non-standard and a bit of a difficult task.
An alternative approach is to subscribe to the hooks instrumented using the Active Support Instrumentation API, which is a standard and recommended way.
Versions of Rails prior to 6 already instrument the enqueue_at.active_job, enqueue.active_job, perform_start.active_job, and perform.active_job hooks. Unfortunately, no hook was instrumented around retries and discards of an Active Job prior to Rails 6.
Rails 6 has introduced hooks in Active Job around retries and discards, to which one can easily subscribe using the Active Support Instrumentation API to perform custom logging and monitoring or to collect any custom information.
The newly introduced hooks are enqueue_retry.active_job, retry_stopped.active_job, and discard.active_job. Let’s discuss each of these hooks in detail.
Note that whenever we say a job,
it means a job of type ActiveJob.
The enqueue_retry.active_job hook is instrumented when a job is enqueued to be retried due to the occurrence of an exception which is configured using the retry_on method in the job’s definition.
This hook is triggered only when the above condition is satisfied and the number of executions of the job is less than the number of attempts defined using the retry_on method. The number of attempts is set to 5 by default if not defined explicitly.
This is how we would subscribe to this hook to perform custom logging in our Rails application.
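A sketch of such a subscription; BackgroundJob::Logger is our custom helper, and the initializer path and message format are illustrative:

```ruby
# config/initializers/active_job_instrumentation.rb
ActiveSupport::Notifications.subscribe("enqueue_retry.active_job") do |event|
  payload = event.payload
  # The payload carries the job instance, the error, and the wait interval.
  BackgroundJob::Logger.log(
    "Retrying #{payload[:job].class} (executions: #{payload[:job].executions}) " \
    "in #{payload[:wait]}s due to #{payload[:error].class}"
  )
end
```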
Note that the BackgroundJob::Logger above is our custom logger.
If we want, we can add any other logic instead.
We will change the definition of Container::DeleteJob job as below.
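A sketch of the updated job; the wait interval and the job internals are illustrative:

```ruby
class Container::DeleteJob < ApplicationJob
  # Retry on Timeout::Error, up to 3 attempts in total.
  retry_on Timeout::Error, wait: 5.seconds, attempts: 3

  def perform(container_id)
    # delete the container over the network...
  end
end
```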
Let’s enqueue this job.
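For example (the argument is illustrative):

```ruby
Container::DeleteJob.perform_later(42)
```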
Assume that this job keeps throwing Timeout::Error exception
due to a network issue.
The job will be retried twice, since it is configured to retry when a Timeout::Error exception occurs, up to a maximum of 3 attempts.
While retrying this job,
Active Job will instrument enqueue_retry.active_job hook
along with the necessary job payload.
Since we have already subscribed to this hook,
our subscriber would log something like this
with the help of BackgroundJob::Logger.log.
Rails allows us to use different databases
using the database.yml config file.
It uses sqlite3 as the default database when a new Rails app is created.
But it is also possible to use different databases such as MySQL or PostgreSQL.
The contents of database.yml change as per the database. Also each database
has a different adapter. We need to include the gems pg or mysql2 accordingly.
Before Rails 6, there was no built-in way to change the contents of database.yml. Rails 6 adds a command to do this automatically.
Let’s say our app has started with sqlite and now we have to switch to MySQL.
$ rails db:system:change --to=mysql
Overwrite /Users/prathamesh/Projects/reproductions/squish_app/config/database.yml? (enter "h" for help) [Ynaqdhm] Y
Our database.yml is now changed to contain the configuration for the MySQL database, and the Gemfile also gets updated automatically with the addition of the mysql2 gem in place of sqlite3.
This command also takes care of using proper gem versions in the Gemfile when the database
backend is changed.
We frequently think about how good it would be if we could run tests in parallel locally, so there would be less wait time for tests to complete. Wait times increase considerably when the count of tests is on the higher side, which is a common case for a lot of applications.
Though CI tools like CircleCI and Travis CI provide a feature to run tests in parallel, there still wasn’t a straightforward way to parallelize tests locally before Rails 6.
Before Rails 6, if we wanted to parallelize tests, we would use Parallel Tests.
Rails 6 adds parallelization of tests by default. Rails 6 added parallelize as a class method on ActiveSupport::TestCase, which takes a hash as a parameter with the keys workers and with. The workers key is responsible for setting the number of parallel workers. Its default value is :number_of_processors, which finds the number of processors on the machine and sets it as the number of parallel workers. with takes two values: :processes, which is the default, and :threads.
Rails 6 also added two hooks: parallelize_setup, which is called right after the processes are forked, and parallelize_teardown, which is called right before the forked processes are closed. Rails 6 also handles the creation of multiple databases and the namespacing of those databases for parallel tests out of the box.
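A sketch of these APIs in test_helper.rb (the hook bodies are illustrative):

```ruby
# test/test_helper.rb
class ActiveSupport::TestCase
  # Fork as many workers as there are processors (the default).
  parallelize(workers: :number_of_processors)

  # Runs in each worker right after it is forked.
  parallelize_setup do |worker|
    # e.g. prepare per-worker data or point services at per-worker resources
  end

  # Runs in each worker right before it is closed.
  parallelize_teardown do |worker|
    # e.g. clean up per-worker resources
  end
end
```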
If we want to disable parallel testing, we can set the value of workers as 1 or less.
Rails 6 also provides an environment variable PARALLEL_WORKERS to set the number of parallel workers at runtime.
$ PARALLEL_WORKERS=10 bin/rails test
Here is the relevant
pull request for adding parallelize
and pull request for setting number of
processors as default workers count.
Time.now or Process.clock_gettime(Process::CLOCK_REALTIME) can jump forwards and backwards as the system time-of-day clock is changed. Whereas the clock time fetched using CLOCK_MONOTONIC represents monotonically increasing time since some unspecified starting point in the past (for example, system start-up time, or the Epoch). CLOCK_MONOTONIC does not change with the system time-of-day clock; it just keeps advancing forwards at one tick per tick, though it resets if the system is rebooted.
In general, CLOCK_MONOTONIC is recommended
to compute the elapsed time between two events.
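A minimal sketch of measuring elapsed time with the monotonic clock in Ruby:

```ruby
# CLOCK_MONOTONIC is immune to system time-of-day adjustments
# (NTP syncs, manual clock changes), so it is the right clock
# for measuring elapsed time between two events.
starting = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep(0.2) # some work
ending = Process.clock_gettime(Process::CLOCK_MONOTONIC)

elapsed = ending - starting
puts "Elapsed: #{elapsed.round(2)}s"
```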
To read more about the differences
between CLOCK_REALTIME and CLOCK_MONOTONIC,
please check the discussion on
this Stackoverflow thread.
A blog post written by Luca Guidi on the same topic is a recommended read.
2. No need to create handmade event objects on our own
Since it is a common practice to initialize an event object in the event subscriber block, Rails 6 now makes this a bit easier.
If the block passed to the subscriber takes only one argument, the Active Support Notifications framework now yields an event object to the block.
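A sketch of the two styles side by side (the event name is illustrative):

```ruby
# Rails 6: a one-argument block receives an
# ActiveSupport::Notifications::Event directly.
ActiveSupport::Notifications.subscribe("process_action.action_controller") do |event|
  puts event.name
  puts event.duration
end

# Pre-Rails 6 style: build the event by hand from five block arguments.
ActiveSupport::Notifications.subscribe("process_action.action_controller") do |name, started, finished, unique_id, payload|
  event = ActiveSupport::Notifications::Event.new(name, started, finished, unique_id, payload)
  puts event.duration
end
```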
The system (kernel) keeps track of CPU time per process. The clock time fetched using CLOCK_PROCESS_CPUTIME_ID represents the CPU time that has passed since the process started. Since a process may not always get all CPU cycles between its start and finish, the process often has to (sleep and) share CPU time with other processes. If the system puts a process to sleep, then the time spent waiting is not counted in the process’ CPU time.
The CPU time of an event can be fetched using the #cpu_time method. Rails 6 now computes the idle time of an event, too. The idle time of an event represents the difference between the event’s #duration and #cpu_time. Note that #duration is computed as the difference between the event’s monotonic time at the start (#time) and the monotonic time at the end (#end).
Let’s see how to get these time values.
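A sketch of reading these values from a subscribed event; the instrumented name and the sleeping "work" are illustrative:

```ruby
ActiveSupport::Notifications.subscribe("my.heavy_task") do |event|
  puts "duration:  #{event.duration} ms"  # end - time (monotonic)
  puts "cpu_time:  #{event.cpu_time} ms"
  puts "idle_time: #{event.idle_time} ms" # duration - cpu_time
  p event # the inspected event includes the underlying counters
end

ActiveSupport::Notifications.instrument("my.heavy_task") do
  sleep(0.5) # mostly idle, so idle_time should dominate
end
```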
It prints this.
Notice the @cpu_time_start and @cpu_time_finish counters
in the inspected event object representation
which are used to calculate the CPU time.
We can now know how many objects were allocated between the start and end of an event using the event’s #allocations method.
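A sketch using the event’s #allocations method (the event name and the allocating work are illustrative):

```ruby
ActiveSupport::Notifications.subscribe("my.allocating_task") do |event|
  puts "allocations: #{event.allocations}"
  # The inspected event shows the @allocation_count_start and
  # @allocation_count_finish counters backing this value.
  p event
end

ActiveSupport::Notifications.instrument("my.allocating_task") do
  10.times.map { Object.new } # allocate some objects
end
```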
The above example should print something like this.
Notice the @allocation_count_finish and @allocation_count_start counters in the inspected event object representation, which are used to calculate the number of objects allocated during an event; here the difference is (834228 - 834227 = 1).
has_secure_password is used to encrypt and authenticate passwords using the bcrypt algorithm. It assumes the model has a column named password_digest.
Before Rails 6, has_secure_password did not accept any attribute as a parameter. So, if we needed encryption on a column other than password_digest, we would have to manually encrypt the value before storing it.
Rails 6 makes this easy and allows a custom attribute as a parameter to has_secure_password. has_secure_password still defaults to password, so it works with previous versions of Rails. has_secure_password still needs a column named column_name_digest defined on the model. has_secure_password also adds an authenticate_column_name method to authenticate the custom column.
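A sketch of the Rails 6 behaviour; the model, attribute name, and password values are illustrative, and the model is assumed to have password_digest and recovery_password_digest columns:

```ruby
class User < ApplicationRecord
  has_secure_password                     # uses the password_digest column
  has_secure_password :recovery_password  # uses the recovery_password_digest column
end

user = User.new(password: "secret", recovery_password: "42password")
user.authenticate("secret")                        # the default method
user.authenticate_recovery_password("42password")  # added for the custom column
```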