Rails 5.2 adds DSL for configuring Content Security Policy header

This blog is part of our Rails 5.2 series.

Content Security Policy (CSP) is an added layer of security that helps to detect and mitigate various types of attacks on our web applications, including Cross Site Scripting (XSS) and data injection attacks.

What is XSS?

In an XSS attack, the victim’s browser may execute malicious scripts because the browser trusts the source of the content, even when the content is not coming from a trusted source.

Here is our blog on XSS, written some time back.

How can CSP be used to mitigate and report this attack?

By using CSP, we can specify the domains that are valid sources of executable scripts. A CSP-compatible browser will then only execute scripts loaded from these whitelisted domains.

Please note that CSP makes XSS attacks a lot harder, but it does not make them impossible. CSP does not stop DOM-based XSS (also known as client-side XSS). To prevent DOM-based XSS, JavaScript code should be carefully written to avoid introducing such vulnerabilities.

In Rails 5.2, a DSL was added for configuring the Content Security Policy header.

Let’s check the configuration.

We can define a global policy for the project in an initializer.

# config/initializers/content_security_policy.rb

Rails.application.config.content_security_policy do |policy|
  policy.default_src :self, :https
  policy.font_src    :self, :https, :data
  policy.img_src     :self, :https, :data
  policy.object_src  :none
  policy.script_src  :self, :https
  policy.style_src   :self, :https, :unsafe_inline
  policy.report_uri  "/csp-violation-report-endpoint"
end

We can override the global policy within a controller as well.

# Override policy inline

class PostsController < ApplicationController
  content_security_policy do |policy|
    policy.upgrade_insecure_requests true
  end
end

# Using mixed static and dynamic values

class PostsController < ApplicationController
  content_security_policy do |policy|
    policy.base_uri :self, -> { "https://#{current_user.domain}.example.com" }
  end
end

Content Security Policy can be deployed in report-only mode as well.

Here is the global setting in an initializer.

# config/initializers/content_security_policy.rb

Rails.application.config.content_security_policy_report_only = true

Here is an override at the controller level.

class PostsController < ApplicationController
  content_security_policy_report_only only: :index
end

The policy specified in the Content-Security-Policy-Report-Only header will not be enforced, but any violations will be reported to a provided URI. We can provide this violation report URI via the report_uri option.

# config/initializers/content_security_policy.rb

Rails.application.config.content_security_policy do |policy|
  policy.report_uri  "/csp-violation-report-endpoint"
end

If both the Content-Security-Policy-Report-Only and Content-Security-Policy headers are present in the same response, then the policy specified in the Content-Security-Policy header will be enforced, while the Content-Security-Policy-Report-Only policy will only generate reports and will not be enforced.
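Violation reports are sent by the browser as JSON POST requests to the configured report_uri. A minimal Rails endpoint for collecting these reports could look like the sketch below; the route, controller name and logging strategy are assumptions and not part of the Rails DSL itself.

# config/routes.rb (hypothetical route for the report endpoint)
# post "/csp-violation-report-endpoint", to: "csp_reports#create"

class CspReportsController < ApplicationController
  # Browsers POST the report as JSON, so CSRF verification is skipped here.
  skip_before_action :verify_authenticity_token

  def create
    # The violation details are nested under the "csp-report" key.
    report = JSON.parse(request.body.read)
    Rails.logger.warn("CSP violation: #{report['csp-report'].inspect}")
    head :ok
  end
end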

Rails 5.2 disallows raw SQL in dangerous Active Record methods preventing SQL injections

This blog is part of our Rails 5.2 series.

We sometimes use raw SQL in Active Record methods. This can lead to SQL injection vulnerabilities when we unknowingly pass unsanitized user input to the Active Record method.

class UsersController < ApplicationController
  def index
    User.order("#{params[:order]} ASC")
  end
end

Although this code looks fine on the surface, we can see the issue by looking at the example from rails-sqli.

pry(main)> params[:order] = "(CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END)"

pry(main)> User.order("#{params[:order]} ASC")
User Load (1.0ms)  SELECT "users".* FROM "users" ORDER BY (CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END) ASC
=> [#<User:0x00007fdb7968b508
  id: 1,
  email: "piyush@example.com",
  authentication_token: "Vkn5jpV_zxhqkNesyKSG">]

There are many Active Record methods which are vulnerable to SQL injection, and some of these can be found here.

In Rails 5.2, these APIs are being changed to accept only attribute arguments and to disallow raw SQL. In Rails 5.2 the restriction is not yet mandatory, but the developer sees a deprecation warning as a reminder.

irb(main):004:0> params[:order] = "email"
=> "email"
irb(main):005:0> User.order(params[:order])
  User Load (1.0ms)  SELECT  "users".* FROM "users" ORDER BY email LIMIT $1  [["LIMIT", 11]]
=> #<ActiveRecord::Relation [#<User id: 1, email: "piyush@example.com", authentication_token: "Vkn5jpV_zxhqkNesyKSG">]>

irb(main):008:0> params[:order] = "(CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END)"
irb(main):008:0> User.order("#{params[:order]} ASC")
DEPRECATION WARNING: Dangerous query method (method whose arguments are used as raw SQL) called with non-attribute argument(s): "(CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END)". Non-attribute arguments will be disallowed in Rails 6.0. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql(). (called from irb_binding at (irb):8)
  User Load (1.2ms)  SELECT  "users".* FROM "users" ORDER BY (CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END) ASC
=> #<ActiveRecord::Relation [#<User id: 1, email: "piyush@example.com", authentication_token: "Vkn5jpV_zxhqkNesyKSG">]>

In Rails 6, this will result in an error.

In Rails 5.2, if we want to run raw SQL without getting the above warning, we have to wrap the raw SQL string literal in Arel.sql, which returns an Arel::Nodes::SqlLiteral object.

irb(main):003:0> Arel.sql('title')
=> "title"
irb(main):004:0> Arel.sql('title').class
=> Arel::Nodes::SqlLiteral

irb(main):006:0> User.order(Arel.sql("#{params[:order]} ASC"))
  User Load (1.2ms)  SELECT  "users".* FROM "users" ORDER BY (CASE SUBSTR(authentication_token, 1, 1) WHEN 'k' THEN 0 else 1 END) ASC
=> #<ActiveRecord::Relation [#<User id: 1, email: "piyush@example.com", authentication_token: "Vkn5jpV_zxhqkNesyKSG">]>

This should be done with care and should not be done with user input.
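For user-supplied values such as params[:order], a safer alternative is to whitelist the permitted columns instead of interpolating the parameter into SQL. Here is a minimal sketch; the column list is hypothetical.

class UsersController < ApplicationController
  SORTABLE_COLUMNS = %w(email created_at).freeze

  def index
    # Fall back to a known column when the parameter is not in the whitelist.
    column = SORTABLE_COLUMNS.include?(params[:order]) ? params[:order] : "created_at"
    @users = User.order(column.to_sym => :asc)
  end
end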

Here is the relevant commit and discussion.

Ruby 2.6 adds RubyVM::AST module

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Ruby 2.6 adds RubyVM::AST to generate an abstract syntax tree (AST) of Ruby code. Please note that this feature is experimental and under active development.

As of now, RubyVM::AST supports two methods named parse and parse_file.

The parse method takes a string as a parameter and returns the root node of the tree as an object of RubyVM::AST::Node.

The parse_file method takes a file name as a parameter and also returns the root node of the tree as an object of RubyVM::AST::Node.

Ruby 2.6.0-preview2

irb> RubyVM::AST.parse("(1..100).select { |num| num % 5 == 0 }")
=> #<RubyVM::AST::Node(NODE_SCOPE(0) 1:0, 1:38): >

irb> RubyVM::AST.parse_file("/Users/amit/app.rb")
=> #<RubyVM::AST::Node(NODE_SCOPE(0) 1:0, 1:38): >

RubyVM::AST::Node has seven public instance methods - children, first_column, first_lineno, inspect, last_column, last_lineno and type.

Ruby 2.6.0-preview2

irb> ast_node = RubyVM::AST.parse("(1..100).select { |num| num % 5 == 0 }")
=> #<RubyVM::AST::Node(NODE_SCOPE(0) 1:0, 1:38): >

irb> ast_node.children
=> [nil, #<RubyVM::AST::Node(NODE_ITER(9) 1:0, 1:38): >]

irb> ast_node.first_column
=> 0

irb> ast_node.first_lineno
=> 1

irb> ast_node.inspect
=> "#<RubyVM::AST::Node(NODE_SCOPE(0) 1:0, 1:38): >"

irb> ast_node.last_column
=> 38

irb> ast_node.last_lineno
=> 1

irb> ast_node.type
=> "NODE_SCOPE"

This module will mainly help in building static code analyzers and formatters.
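As a small illustration of how the tree can be traversed, here is a sketch that recursively collects the type of every node. It assumes the experimental Ruby 2.6.0-preview2 API shown above, which may change.

# Recursively collect the type of every node in the AST.
def node_types(node, acc = [])
  return acc unless node.is_a?(RubyVM::AST::Node)

  acc << node.type
  # children may contain nils and literal values along with nested nodes.
  node.children.each { |child| node_types(child, acc) }
  acc
end

root = RubyVM::AST.parse("(1..100).select { |num| num % 5 == 0 }")
p node_types(root)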

Inline Installation of Firefox Extension

Inline Installation

Firefox extensions, similar to Chrome extensions, help us modify and personalize our browsing experience by adding new features to the existing sites.

Once we’ve published our extension to Mozilla’s Add-ons store (AMO), users who browse the AMO can find the extension and install it with one click. But if a user is already on our site, where a link is provided to the extension’s AMO listing page, they would need to navigate away from our website to the AMO, complete the install process, and then return to our site. That is a bad user experience.

The inline installation enables us to initiate the extension installation from our site. The extension can still be hosted on the AMO but users would no longer have to leave our site to install it.

We had to try out a few suggested approaches before we got it working.

InstallTrigger

InstallTrigger is an interface included in Mozilla’s Apps API for installing extensions. Using JavaScript, the install method of InstallTrigger can be used to start the download and installation of an extension (or anything packaged in a .xpi file) from a web page.

An XPI (pronounced “zippy”) file is similar to a zip file; it contains the manifest file and the install script for the extension.

So, let’s try to install the Grammarly extension for Firefox. To use InstallTrigger, we first need the .xpi file’s location. Once we have published our extension on the AMO, we can navigate to its listing page and get the link for the .xpi.

For our present example, here’s the listing page for Grammarly Extension.

Here, we can get the .xpi file’s location by right clicking on the + Add to Firefox button and clicking on Copy Link Location. Note that the + Add to Firefox button would only be visible if we browse the link on a Firefox browser. Otherwise, it would be replaced by a Get Firefox Now button.

Once we have the URL, we can trigger the installation via JavaScript on our web page.

InstallTrigger.install({
  'Name of the Extension': {
    URL: "url pointing to the .xpi file's location on AMO",
  },
});

Pointing to the latest version of the Extension

When we used the URL in the above code, the .xpi file’s URL was specific to the extension’s current version. If the extension has an update, the installed extensions for existing users would be updated automatically. But the URL to the .xpi on our website would be pointing to the older version. Although the old link would still work, we would always want new users to download the latest version.

To get the latest link, we could fetch the listing page in the background and parse the HTML, but that approach can break if the HTML changes.

Instead, we can query the Addons Services API, which returns the information for the extension in XML format.

For the Grammarly extension, we first need its slug id. We can get it by looking at its listing page’s URL. From https://addons.mozilla.org/en-US/firefox/addon/grammarly-1/, we can note down the slug, which is grammarly-1.

Using this slug id, we can now get the extension details using https://services.addons.mozilla.org/en-US/firefox/api/1.5/addon/grammarly-1. It returns the info for the Grammarly extension. What we are particularly interested in is the value in the <install> node, which is the URL of the latest version of the .xpi.

Let’s see how we can implement the whole thing using React.

import React, { Component } from 'react';
import axios from 'axios';
import cheerio from 'cheerio';
// `Button` is assumed to be imported from the project's UI component library.

const FALLBACK_GRAMMARLY_EXTENSION_URL =
  'https://addons.mozilla.org/firefox/downloads/file/1027073/grammarly_for_firefox-8.828.1757-an+fx.xpi';
const URL_FOR_FETCHING_XPI = `https://services.addons.mozilla.org/en-US/firefox/api/1.5/addon/grammarly-1`;

export default class InstallExtension extends Component {
  state = {
    grammarlyExtensionUrl: FALLBACK_GRAMMARLY_EXTENSION_URL,
  };

  componentWillMount() {
    axios.get(URL_FOR_FETCHING_XPI).then(response => {
      const xml = response.data;
      const $ = cheerio.load(xml);
      const grammarlyExtensionUrl = $('addon install').text();
      this.setState({ grammarlyExtensionUrl });
    });
  }

  triggerInlineInstallation = event => {
    InstallTrigger.install({
      Grammarly: { URL: this.state.grammarlyExtensionUrl },
    });
  };

  render() {
    return (
      <Button onClick={this.triggerInlineInstallation}>
        Install Grammarly Extension
      </Button>
    );
  }
}

In the above code, we are using the npm packages axios for fetching the XML and cheerio for parsing it. Also, we have set a fallback URL as the initial value in case fetching the new URL from the XML response fails.

Using parametrized containers for deploying Rails micro services on Kubernetes

When using microservices with containers, one has to consider modularity and reusability while designing the system.

While using Kubernetes as a distributed system for container deployments, modularity and reusability can be achieved by parameterizing the containers that run our microservices.

Parameterized containers

If we think of a container as a function in a program, how many parameters does it have? Each parameter represents an input that can customize a generic container to a specific situation.

Let’s assume we have a Rails application split into services like puma, sidekiq/delayed-job and websocket. Each service runs as a separate deployment, in a separate container, for the same application. When deploying a change, we should build the same image for all three containers, even though they run different functions/processes. In our case, we will be running 3 pods with the same image. This can be achieved by building a generic container image; the generic container must accept parameters to run different services.

We need to expose parameters and consume them inside the container. There are two ways to pass parameters to our container.

  1. Using environment variables.
  2. Using command line arguments.

In this article, we will use environment variables to run parameterized containers like puma, sidekiq/delayed-job and websocket for Rails applications on Kubernetes.

We will deploy wheel on Kubernetes using the parameterized container approach.

Pre-requisite

Building a generic container image.

The Dockerfile in wheel uses the bash script setup_while_container_init.sh as the command to start a container. The script is self-explanatory and, as we can see, it consists of two functions, web and background. The web function starts the puma service and background starts the delayed_job service.

We create two different deployments on Kubernetes for the web and background services. The deployment templates are identical for both web and background; the value of the environment variable POD_TYPE tells the init script which service to run in a pod.

Once we have the Docker image built, let’s deploy the application.

Creating kubernetes deployment manifests for wheel application

Wheel uses a PostgreSQL database, so we need a postgres service to run the application. We will use the postgres image from Docker Hub and deploy it as a Deployment.

Note: For production deployments, the database should be deployed as a StatefulSet, or a managed database service should be used.

K8s manifest for deploying PostgreSQL.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: db
  name: db
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - image: postgres:9.4
        name: db
        env:
        - name: POSTGRES_USER
          value: postgres
        - name: POSTGRES_PASSWORD
          value: welcome

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: db
  name: db
spec:
  ports:
  - name: headless
    port: 5432
    targetPort: 5432
  selector:
    app: db

Create Postgres DB and the service.

$ kubectl create -f db-deployment.yml -f db-service.yml
deployment db created
service db created

Now that the DB is available, we need to access it from the application using database.yml.

We will create a ConfigMap to store the database credentials and mount it at config/database.yml in our application deployments.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: database-config
data:
  database.yml: |
    development:
      adapter: postgresql
      database: wheel_development
      host: db
      username: postgres
      password: welcome
      pool: 5

    test:
      adapter: postgresql
      database: wheel_test
      host: db
      username: postgres
      password: welcome
      pool: 5

    staging:
      adapter: postgresql
      database: postgres
      host: db
      username: postgres
      password: welcome
      pool: 5

Create configmap for database.yml.

$ kubectl create -f database-configmap.yml
configmap database-config created

We have the database ready for our application; now let’s proceed to deploy our Rails services.

Deploying Rails micro-services using the same docker image

In this blog, we will limit our services to web and background Kubernetes deployments.

Let’s create a deployment and service for our web application.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: wheel-web
  labels:
    app: wheel-web
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: wheel-web
    spec:
      containers:
      - image: bigbinary/wheel:generic
        name: web
        imagePullPolicy: Always
        env:
        - name: DEPLOY_TIME
          value: $date
        - name: RAILS_ENV # name assumed; the original manifest showed only "value: staging" here
          value: staging
        - name: POD_TYPE
          value: WEB
        ports:
        - containerPort: 80
        volumeMounts:
          - name: database-config
            mountPath: /wheel/config/database.yml
            subPath: database.yml
      volumes:
        - name: database-config
          configMap:
            name: database-config

---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: wheel-web
  name: web
spec:
  ports:
  - name: puma
    port: 80
    targetPort: 80
  selector:
    app: wheel-web
  type: LoadBalancer

Note that we used POD_TYPE as WEB, which will start the puma process from the container startup script.

Let’s create a web/puma deployment and service.

kubectl create -f web-deployment.yml -f web-service.yml
deployment wheel-web created
service web created

Similarly, here are the deployment and service manifests for the background service.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: wheel-background
  labels:
    app: wheel-background
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: wheel-background
    spec:
      containers:
      - image: bigbinary/wheel:generic
        name: background
        imagePullPolicy: Always
        env:
        - name: DEPLOY_TIME
          value: $date
        - name: POD_TYPE
          value: background
        ports:
        - containerPort: 80
        volumeMounts:
          - name: database-config
            mountPath: /wheel/config/database.yml
            subPath: database.yml
      volumes:
        - name: database-config
          configMap:
            name: database-config

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: wheel-background
  name: background
spec:
  ports:
  - name: background
    port: 80
    targetPort: 80
  selector:
    app: wheel-background

For background/delayed-job, we set POD_TYPE to background, which starts the delayed_job process.

Let’s create the background deployment and service.

kubectl create -f background-deployment.yml -f background-service.yml
deployment wheel-background created
service background created

Get the application endpoint.

$ kubectl get svc web -o wide | awk '{print $4}'
a55714dd1a22d11e88d4b0a87a399dcf-2144329260.us-east-1.elb.amazonaws.com

We can access the application using the endpoint.

Now let’s look at the pods.

$ kubectl get pods
NAME                                READY     STATUS    RESTARTS   AGE
db-5f7d5c96f7-x9fll                 1/1       Running   0          1h
wheel-background-6c7cbb4c75-sd9sd   1/1       Running   0          30m
wheel-web-f5cbf47bd-7hzp8           1/1       Running   0          10m

We see that the db pod is running postgres, the wheel-web pod is running puma and the wheel-background pod is running delayed_job.

If we check the logs, everything coming to puma is handled by the web pod, and all the background jobs are handled by the background pod.

Similarly, if we use websocket or separate API pods, traffic will be routed to the respective services.

This is how we can deploy Rails microservices using parametrized containers and a generic image.

Configuring memory allocation in ImageMagick

ImageMagick has a security policy file, policy.xml, using which we can control and limit the resources used by the program. For example, the default memory limit of ImageMagick-6 is 256 MiB.

Recently, we saw the following error while processing a GIF image.

convert-im6.q16: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244.
convert-im6.q16: cache resources exhausted `file.gif' @ error/cache.c/OpenPixelCache/3945.

This happens when ImageMagick cannot allocate enough memory to process the image. It can be fixed by tweaking the memory configuration in policy.xml.

The path of policy.xml can be located as follows.

$ identify -list policy

Path: /etc/ImageMagick-6/policy.xml
  Policy: Resource
    name: disk
    value: 1GiB

The memory limit can be configured in the following line of policy.xml.

<policy domain="resource" name="memory" value="256MiB"/>

Increasing this value would solve the error if you have a machine with larger memory.

Uploading files directly to S3 using Pre-signed POST request

It’s easy to create a form in Rails which can upload a file to the backend. The backend can then take the file and upload it to S3. We can do that by using gems like paperclip or carrierwave, or if we are using Rails 5.2, we can use Active Storage.

But for applications where Rails is used only as an API backend, uploading via a form is not an option. In this case, we can expose an endpoint which accepts files, and then Rails can handle uploading to S3.

In most cases, the above solution works. But recently, in one of our applications which is hosted on Heroku, we faced timeout-related problems while uploading large files. Here is what Heroku’s docs say about how long a request can take.

The router terminates the request if it takes longer than 30 seconds to complete.

Pre-signed POST request

An obvious solution is to upload the files directly to S3. However, in order to do that, the client needs AWS credentials, which is not ideal. If the client is a Single Page Application, the AWS credentials would be visible in the JavaScript files. Or if the client is a mobile app, someone might be able to reverse engineer the application and get hold of the AWS credentials.

Here’s where the pre-signed POST request comes to the rescue. Here are the official docs from AWS on this topic.

Uploading via Pre-signed POST is a two step process. The client first requests a permission to upload the file. The backend receives the request, generates the pre-signed URL and returns the response along with other fields. The client can then upload the file to the URL received in the response.

Implementation

Add the AWS gem to your Gemfile and run bundle install.

gem 'aws-sdk'

Initialize an S3 bucket object with the AWS credentials.

aws_credentials = Aws::Credentials.new(
  ENV['AWS_ACCESS_KEY_ID'],
  ENV['AWS_SECRET_ACCESS_KEY']
)

s3_bucket = Aws::S3::Resource.new(
  region: 'us-east-1',
  credentials: aws_credentials
).bucket(ENV['S3_BUCKET'])

The controller handling the request for getting the presigned URL should have the following code.

def request_for_presigned_url
  presigned_url = s3_bucket.presigned_post(
    key: "#{Rails.env}/#{SecureRandom.uuid}/${filename}",
    success_action_status: '201',
    signature_expiration: (Time.now.utc + 15.minutes)
  )

  data = { url: presigned_url.url, url_fields: presigned_url.fields }

  render json: data, status: :ok
end

In the above code, we are creating a presigned URL using the presigned_post method.

The key option specifies the path where the file would be stored. AWS supports a custom ${filename} directive for the key option. This ${filename} directive tells S3 that if a user uploads a file named image.jpg, then S3 should store the file with the same name. In S3, we cannot have duplicate keys, so we are using SecureRandom to generate a unique prefix so that two files with the same name can be stored.

The success_action_status option specifies the HTTP status code the client receives when a file is successfully uploaded. If the client sets its value to 200 or 204 in the request, Amazon S3 returns an empty document along with 200 or 204 as the HTTP status code. We set it to 201 here because we want the client to notify us of the S3 key where the file was uploaded. The S3 key is present in the XML document which is received as a response from AWS only when the status code is 201.

signature_expiration specifies when the signature on the POST will expire. It defaults to one hour from the creation of the presigned POST. This value should not exceed one week from the creation time. Here, we are setting it to 15 minutes.

Other configuration options can be found here.
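The controller above also needs access to the s3_bucket object built earlier. One way to wire this up, shown here only as a sketch with an assumed controller name, is a memoized private helper:

class DirectUploadsController < ApplicationController
  # The request_for_presigned_url action from above goes here.

  private

  def s3_bucket
    @s3_bucket ||= Aws::S3::Resource.new(
      region: 'us-east-1',
      credentials: Aws::Credentials.new(
        ENV['AWS_ACCESS_KEY_ID'],
        ENV['AWS_SECRET_ACCESS_KEY']
      )
    ).bucket(ENV['S3_BUCKET'])
  end
end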

In response to the above request, we send out a JSON which contains the URL and the fields required for making the upload.

Here’s a sample response.

{
  "url": "https://s3.amazonaws.com/<some-s3-url>",
  "url_fields": {
    "key": "development/8614bd40-691b-4668-9241-3b342c6cf429/${filename}",
    "success_action_status": "201",
    "policy": "<s3-policy>",
    "x-amz-credential": "********************/20180721/us-east-1/s3/aws4_request",
    "x-amz-algorithm": "AWS4-HMAC-SHA256",
    "x-amz-date": "201807021T144741Z",
    "x-amz-signature": "<hexadecimal-signature>"
  }
}

Once the client gets the above credentials, it can proceed with the actual file upload.

The client can be anything: an iOS app, an Android app, an SPA, or even a Rails app. For our example, let’s assume it’s a Node client.

var request = require("request");
function uploadFileToS3(response) {
  var options = {
    method: 'POST',
    url: response.url,
    formData: {
      ...response.url_fields,
      file: <file-object-for-upload>
    }
  }

  request(options, (error, response, body) => {
    if (error) throw new Error(error);
    console.log(body);
  });
}

Here, we are making a POST request to the URL received from the earlier presigned response. Note that we are using the spread operator to pass url_fields in formData.

When the POST request is successful, the client receives an XML response from S3, because we set success_action_status to 201. A sample response looks like the following.

<?xml version="1.0" encoding="UTF-8"?>
<PostResponse>
    <Location>https://s3.amazonaws.com/link-to-the-file</Location>
    <Bucket>s3-bucket</Bucket>
    <Key>development/8614bd40-691b-4668-9241-3b342c6cf429/image.jpg</Key>
    <ETag>"32-bit-tag"</ETag>
</PostResponse>

Using the above response, the client can then let the API know where the file was uploaded by sending the value from the Key node. This can be optional in some cases, depending on whether the API actually needs this info.
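If the API does need to know the final location, a small confirmation endpoint that records the returned Key could look like the following sketch; the model and parameter names are assumptions.

class UploadsController < ApplicationController
  # The client sends the value of the <Key> node from the S3 XML response.
  def confirm
    upload = Upload.create!(s3_key: params[:key])
    render json: { id: upload.id, s3_key: upload.s3_key }, status: :created
  end
end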

Advantages

Using AWS S3 presigned-urls has a few advantages.

  • The main advantage of uploading directly to S3 is that there is considerably less load on the application server, since the server no longer has to receive the files and transfer them to S3.

  • Since the file upload happens directly on S3, we can bypass the 30-second Heroku time limit.

  • AWS credentials are not shared with the client application. So no one would be able to get their hands on your AWS keys.

  • The generated presigned-url can be initialized with an expiration time. So the URLs and the signatures generated would be invalid after that time period.

  • The client does not need to install any of the AWS libraries. It just needs to upload the file via a simple POST request to the generated URL.

How we reduced infrastructure cost by 10% for an e-commerce project

Recently, we got an opportunity to reduce the infrastructure cost of a medium-sized e-commerce project. In this blog we discuss how we reduced the total infrastructure cost by 10%.

Changes to MongoDB instances

Depending on the requirements, modern web applications use different third-party services. For example, it’s easier and more cost effective to subscribe to a GeoIP lookup service than to build and maintain one. Some third-party services get very expensive as usage increases, but people don’t look for alternatives due to legacy reasons.

In our case, our client had been paying more than $5,000/month for a third-party MongoDB service. This service charges based on the storage used and we had years of data in it. This data is consumed by a machine learning system to fight fraudulent purchases and users. We had a look at both the ML system and the data in MongoDB and found we actually didn’t need all the data in the database. The system never read data older than 30-60 days in some of the biggest mongo collections.

Since we were already using nomad as our scheduler, we wrote a periodic nomad job that runs every week to delete unnecessary data. The nomad job syncs both primary and secondary MongoDB instances to release the free space back to the OS. This helped reduce the monthly bill to $630/month.
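The exact job depends on the schema, but the core of such a cleanup can be quite small. Here is a rough sketch using the mongo Ruby driver; the collection names, the 60-day cutoff and the connection URI are assumptions.

require 'mongo'

client = Mongo::Client.new(ENV['MONGODB_URI'])
cutoff = Time.now.utc - (60 * 24 * 60 * 60) # keep only the last 60 days

%w(events purchase_logs).each do |collection_name|
  # Delete documents older than the cutoff; the ML system never reads them.
  result = client[collection_name].delete_many(created_at: { '$lt' => cutoff })
  puts "#{collection_name}: removed #{result.deleted_count} documents"
end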

Changes to MongoDB service provider

Then we looked at the MongoDB service provider. It was configured years back when the application was built, and there are other vendors who provide the same service at a much cheaper price. We switched our MongoDB to mLab, and now the database runs in a $180/month dedicated cluster. With WiredTiger’s compression enabled, we don’t use as much storage as we used to.

Making use of Auto Scaling

Auto Scaling can be a powerful tool when it comes to reducing costs. We had been running around 15 large EC2 instances. This was inefficient for the following two reasons.

  1. It cannot cope when the traffic increases beyond its limit.
  2. Resources are underused when traffic is low.

Auto Scaling solves both the issues. For web servers, we switched to smaller instances and used Target Tracking Scaling Policy to keep the average aggregate CPU utilization at 70%.

For background job workers, we built a nomad job that periodically calculated the number of required instances based on the count of pending jobs and each queue’s priority. This number was pushed to CloudWatch as a metric, and the Auto Scaling group scaled based on it. This approach was effective in boosting performance and reducing cost.
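As an illustration, the heart of such a job can be expressed in a few lines of Ruby. The queue query, metric name and scaling ratio below are assumptions and not the exact job we run.

require 'aws-sdk'

# Assuming delayed_job: count jobs that are neither locked nor failed.
pending_jobs = Delayed::Job.where(failed_at: nil, locked_at: nil).count
desired_instances = (pending_jobs / 100.0).ceil # e.g. one worker per 100 pending jobs

cloudwatch = Aws::CloudWatch::Client.new(region: 'us-east-1')
cloudwatch.put_metric_data(
  namespace: 'BackgroundWorkers',
  metric_data: [{
    metric_name: 'DesiredWorkerInstances',
    value: desired_instances,
    unit: 'Count'
  }]
)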

Buying reserved instances

AWS has a feature to reserve instances for services like EC2, RDS, etc. It’s often cheaper to buy reserved instances than to run the application using on-demand instances. We evaluated reserved instance utilization using the reporting tool and bought the required reserved instances.

Looking for cost-effective solutions

Sometimes, different solutions to the same problem can have different costs. For example, we had been facing small DDoS attacks regularly and had to rate-limit requests based on IP and other parameters. Since we had been using Cloudflare, we could have used their rate-limiting feature. Performance-wise, it was the best solution, but they charge based on the number of good requests, which would be expensive for us since it’s a high-traffic application. We looked for other solutions and solved the problem using Rack::Attack. We wrote a blog about it some time back. The solution presented in the blog was effective in mitigating the DDoS attack we faced and didn’t cost us anything significant.

Requesting custom pricing

If you are a comparatively larger customer of a third-party service, it’s likely that you don’t have to pay the published price. Instead, we can request custom pricing. Many companies will be happy to give a 20% to 50% discount if we commit to a minimum spend for the year. We negotiated a new contract for an expensive third-party service and got a deal with a 40% discount compared to their published minimum price.

Running an infrastructure can be both technically and economically challenging. But if we look between the lines and are willing to update existing systems, we would be amazed at how much money we can save every month.

Ruby 2.6 adds Enumerable#filter as an alias of Enumerable#select

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Ruby 2.6 has added Enumerable#filter as an alias of Enumerable#select. The reason for adding Enumerable#filter as an alias is to make it easier for people coming from other languages to use Ruby. A lot of other languages, including Java, R, PHP, etc., have a filter method to filter/select records based on a condition.

Let’s take an example in which we have to select/filter all numbers which are divisible by 5 from a range.

Ruby 2.5

irb> (1..100).select { |num| num % 5 == 0 }
=> [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

irb> (1..100).filter { |num| num % 5 == 0 }
=> Traceback (most recent call last):
        2: from /Users/amit/.rvm/rubies/ruby-2.5.1/bin/irb:11:in `<main>'
        1: from (irb):2
NoMethodError (undefined method `filter' for 1..100:Range)

Ruby 2.6.0-preview2

irb> (1..100).select { |num| num % 5 == 0 }
=> [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

irb> (1..100).filter { |num| num % 5 == 0 }
=> [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

Also note that along with filter, filter! is added as an alias of select! on the classes that define select!, such as Array, Hash and Set.

Here is the relevant commit and discussion.

Ruby 2.6 adds support for non-ASCII capital letter as a first character in constant name

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, a constant had to have a capital ASCII letter as its first character. It means class and module names cannot start with a non-ASCII capital character.

The code below will raise a class/module name must be CONSTANT error.

  class Большойдвоичный
  end

We can use the above non-ASCII word as a method name or variable name, though.

The code below will run without any exception.

  class NonAsciiMethodAndVariable
    def Большойдвоичный
      Имя = "BigBinary"
    end
  end

“Имя” is treated as a variable name in the above example, even though its first letter (И) is a capital non-ASCII character.

Ruby 2.6

Ruby 2.6 relaxes the above-mentioned limitation. We can now define constants in languages other than English. Languages having capital letters, like Russian and Greek, can be used to define constant names.

The code below will run without exception in Ruby 2.6.

  class Большойдвоичный
  end

As identifiers starting with non-ASCII capital letters are now treated as constants, the code below will raise a warning in Ruby 2.6.

  irb(main):001:0> Имя = "BigBinary"
  => "BigBinary"
  irb(main):002:0> Имя = "BigBinary"
  (irb):2: warning: already initialized constant Имя
  (irb):1: warning: previous definition of Имя was here

The above code will run without any warnings on Ruby versions prior to 2.6.

Here is the relevant commit and discussion for this change.

Setting up a high performance Geocoder

One of our applications uses geocoding extensively. When we started the project, we included the excellent Geocoder gem, and set Google as the geocoding backend. As the application scaled, its geocoding requirements grew and soon we were looking at geocoding bills worth thousands of dollars.

An alternative Geocoder

Our search for an alternative geocoder landed us on Nominatim. Written in C, with a PHP web interface, Nominatim was performant enough for our requirements. Once set up, Nominatim required 8GB of RAM to run, and this included RAM for PostgreSQL (+ PostGIS) as well.

The rest of the blog discusses how to set up Nominatim, the tips and tricks that we learned along the way, and how it compares with the geocoding solution offered by Google.

Setting up Nominatim

We started off by looking for Amazon Machine Images with Nominatim set up, and could only find one, hosted by OpenStreetMap, whose magnet link was dead.

Next, we went through the official installation document. We decided to give Docker a shot and found that there are many Nominatim Docker builds. We used https://github.com/merlinnot/nominatim-docker since it seemed to follow all the steps mentioned in the official installation guide.

Issues faced during Setup

Out of Memory Errors

The official documentation recommends using 32GB of RAM for the initial import, but we needed to double the memory to 64GB to make it work.

Also, because a large amount of data is generated on each run and Docker caches layers across builds, we ran out of disk space on subsequent docker builds whenever a build failed.

Merging Multiple Regions

We wanted to geocode locations from the USA, Mexico, Canada and Sri Lanka. The USA, Mexico and Canada are included by default in the North America data extract, but we had to merge the data for Sri Lanka with North America to get it in the format required for the initial import.

The following snippet pre-processes the map data for North America and Sri Lanka into a single data.osm.pbf file that can be directly used by the Nominatim installer.

RUN curl -L 'http://download.geofabrik.de/north-america-latest.osm.pbf' \
    --create-dirs -o /srv/nominatim/src/north-america-latest.osm.pbf
RUN curl -L 'http://download.geofabrik.de/asia/sri-lanka-latest.osm.pbf' \
    --create-dirs -o /srv/nominatim/src/sri-lanka-latest.osm.pbf

RUN osmconvert /srv/nominatim/src/north-america-latest.osm.pbf \
    -o=/srv/nominatim/src/north-america-latest.o5m
RUN osmconvert /srv/nominatim/src/sri-lanka-latest.osm.pbf \
    -o=/srv/nominatim/src/sri-lanka-latest.o5m

RUN osmconvert /srv/nominatim/src/north-america-latest.o5m \
    /srv/nominatim/src/sri-lanka-latest.o5m \
    -o=/srv/nominatim/src/data.o5m

RUN osmconvert /srv/nominatim/src/data.o5m \
    -o=/srv/nominatim/src/data.osm.pbf

Slow Search times

Once the installation was done, we tried running simple location searches like this one, but the search timed out. Usually, Nominatim can provide a lot of debugging information from its web interface when &debug=true is appended to the search query.

# from
https://nominatim.openstreetmap.org/search.php?q=New+York&polygon_geojson=1&viewbox=
# to
https://nominatim.openstreetmap.org/search.php?q=New+York&polygon_geojson=1&viewbox=&debug=true

We created an issue in the Nominatim repository and got very prompt replies from the Nominatim maintainers, especially from Sarah Hoffman.

# runs analyze on the entire nominatim database
psql -d nominatim -c 'ANALYZE VERBOSE'

The PostgreSQL query planner depends on statistics collected by the postgres statistics collector while executing a query. In our case, the query planner took an enormous amount of time to plan queries, as no stats had been collected on our fresh installation. Running ANALYZE on the nominatim database, as shown above, fixed the slow searches.

Comparing Nominatim and Google Geocoder

We compared 2500 addresses and found that Google geocoded 99% of those addresses. In comparison, Nominatim could only geocode 47% of the addresses.

This means we still needed to geocode ~50% of the addresses using the Google geocoder. We found that we could increase geocoding efficiency by normalizing the addresses we had.

Address Normalization using libpostal

Libpostal is an address normalizer which uses statistical natural-language processing to normalize addresses. Libpostal also has Ruby bindings, which made it quite easy to use for our test purposes.

Once libpostal and its Ruby bindings were installed (installation is straightforward and the steps are available on ruby_postal’s GitHub page), we gave libpostal + Nominatim a go.

require 'geocoder'
require 'ruby_postal/expand'
require 'ruby_postal/parser'
# titleize and present? used below come from ActiveSupport.
require 'active_support/core_ext/string/inflections'
require 'active_support/core_ext/object/blank'

Geocoder.configure({lookup: :nominatim, nominatim: { host: "nominatim_host:port"}})

full_address = [... address for normalization ...]
expanded_addresses = Postal::Expand.expand_address(full_address)
parsed_addresses = expanded_addresses.map do |address|
  Postal::Parser.parse_address(address)
end

parsed_addresses.each do |components|
  parsed_address = [:house_number, :road, :city, :state, :postcode, :country].inject([]) do |acc, key|
    # components is of the format
    # [{label: 'postcode', value: '12345'}, {label: 'city', value: 'NY'} .. ]
    key_value = components.detect { |component| component[:label].to_s == key.to_s }
    if key_value
      acc << key_value[:value].to_s.titleize
    end
    acc
  end

  coordinates = Geocoder.coordinates(parsed_address.join(", "))
  if (coordinates.is_a? Array) && coordinates.present?
    puts "By Libpostal #{coordinates} => #{parsed_address.join(", ")}"
    break
  end
end

With this, we were able to improve our geocoding efficiency by 10%, as the Nominatim + libpostal combination could geocode ~59% of the addresses.

Debugging failing tests in puppeteer because of background tab

We have been using puppeteer in one of our projects to write end-to-end tests. We run our tests in headful mode to see the browser in action.

If we start the puppeteer tests and do nothing on our laptop (just watch the tests being executed), then all the tests pass.

However, if we are doing our regular work on our laptop while the tests are running, then tests fail randomly. This was quite puzzling.

Debugging such flaky tests is hard. We first suspected that the test cases themselves needed more implicit waits for elements/text to be present/visible in the DOM.

After some debugging using puppeteer protocol logs, it seemed like the browser was performing certain actions very slowly, or was waiting for the page to be active (in view) before performing those actions.

Chrome, starting with version 57, introduced throttling of background tabs to improve performance and battery life. We execute one test per browser, meaning we don’t make use of multiple tabs. Also, tests failed only when the user was performing some other activity while the tests were executing in background windows. Pages were hidden only when the user switched tabs or minimized the browser window containing the tab.

After observing closely, we noticed that the pages were still making requests to the server. The issue was that the page was not painting when it was not in view. We added the flag --disable-background-timer-throttling, but we did not notice any difference.

After doing some searching, we noticed the flag --disable-renderer-backgrounding being used in karma-launcher. The comment states that it is specifically required on macOS. Here is the code responsible for lowering the priority of the renderer when it is hidden.

But the new flag didn’t help either.

While looking at all the available command line switches for chromium, we stumbled upon --disable-backgrounding-occluded-windows. Chromium also backgrounds the renderer while the window is not visible to the user. It seems from the comment that the flag kDisableBackgroundingOccludedWindowsForTesting is specifically added to avoid non-deterministic behavior during tests.

We added the following flags to Chromium for running our integration suite, and this solved our problem.

const chromeArgs = [
  '--disable-background-timer-throttling',
  '--disable-backgrounding-occluded-windows',
  '--disable-renderer-backgrounding'
];


Using Kubernetes ingress controller for authenticating applications

Kubernetes Ingress has redefined routing in this era of containerization, and with all these freehand routing techniques, the thought of "my router, my rules" seems real.

We use nginx-ingress as a routing service for our applications. There is a lot more than routing that we can do with ingress. One of the important features is setting up authentication for our application using ingress. As all the traffic goes through the ingress to our service, it makes sense to set up authentication on the ingress.

As mentioned in the ingress repository, there are different techniques available for authentication, including:

  • Basic authentication
  • Client-certs authentication
  • External authentication
  • OAuth external authentication

In this blog, we will set up authentication for a sample application using the basic authentication technique of ingress.

Pre-requisites

First, let’s create the ingress controller resources from the upstream example by running the following command.

$ kubectl create -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/mandatory.yaml
namespace "ingress-nginx" created
deployment "default-http-backend" created
service "default-http-backend" created
configmap "nginx-configuration" created
configmap "tcp-services" created
configmap "udp-services" created
serviceaccount "nginx-ingress-serviceaccount" created
clusterrole "nginx-ingress-clusterrole" created
role "nginx-ingress-role" created
rolebinding "nginx-ingress-role-nisa-binding" created
clusterrolebinding "nginx-ingress-clusterrole-nisa-binding" created
deployment "nginx-ingress-controller" created

Now that the ingress controller resources are created, we need a service to access the ingress.

Use the following manifest to create a service for the ingress.

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
  labels:
    k8s-addon: ingress-nginx.addons.k8s.io
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: http
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app: ingress-nginx
  type: LoadBalancer

Now, let’s create the service, then get the ELB endpoint and bind it to a domain name.

$ kubectl create -f ingress-service.yml
service ingress-nginx created

$ kubectl -n ingress-nginx get svc  ingress-nginx -o wide
NAME            CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE       SELECTOR
ingress-nginx   100.71.250.56   abcghccf8540698e8bff782799ca8h04-1234567890.us-east-2.elb.amazonaws.com   80:30032/TCP,443:30108/TCP   10s       app=ingress-nginx

Let’s create a deployment and service for our sample application, Kibana. We need Elasticsearch to run Kibana.

Here is the manifest for the sample application.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: kibana
  name: kibana
  namespace: ingress-nginx
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
       - image: kibana:latest
         name: kibana
         ports:
           - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: kibana
  name: kibana
  namespace: ingress-nginx

spec:
  ports:
  - name: kibana
    port: 5601
    targetPort: 5601
  selector:
    app: kibana
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: elasticsearch
  name: elasticsearch
  namespace: ingress-nginx
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
       - image: elasticsearch:latest
         name: elasticsearch
         ports:
           - containerPort: 9200
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: elasticsearch
  name: elasticsearch
  namespace: ingress-nginx
spec:
  ports:
  - name: elasticsearch
    port: 9200
    targetPort: 9200
  selector:
    app: elasticsearch

Create the sample application.

kubectl apply -f kibana.yml
deployment "kibana" created
service "kibana" created
deployment "elasticsearch" created
service "elasticsearch" created

Now that we have created the application and ingress resources, it’s time to create an ingress and access the application.

Use the following manifest to create the ingress.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
  name: kibana-ingress
  namespace: ingress-nginx
spec:
  rules:
    - host: logstest.myapp-staging.com
      http:
        paths:
          - path: /
            backend:
              serviceName: kibana
              servicePort: 5601

$ kubectl -n ingress-nginx create -f ingress.yml
ingress "kibana-ingress" created

Now that our application is up, when we access the Kibana dashboard using the URL http://logstest.myapp-staging.com, we directly get access to our Kibana dashboard, and anyone with this URL can access the logs, as shown in the following image.

Kibana dashboard without authentication

Now, let’s set up a basic authentication using htpasswd.

Follow the commands below to generate the secret for the credentials.

Let’s create an auth file with username and password.

$ htpasswd -c auth kibanaadmin
New password: <kibanaadmin>
Re-type new password:
Adding password for user kibanaadmin

Create k8s secret.

$ kubectl -n ingress-nginx create secret generic basic-auth --from-file=auth
secret "basic-auth" created

Verify the secret.

kubectl -n ingress-nginx get secret basic-auth -o yaml
apiVersion: v1
data:
  auth: Zm9vOiRhcHIxJE9GRzNYeWJwJGNrTDBGSERBa29YWUlsSDkuY3lzVDAK
kind: Secret
metadata:
  name: basic-auth
  namespace: ingress-nginx
type: Opaque

Add the following annotations to our ingress by updating the ingress manifest.

kubectl -n ingress-nginx edit ingress kibana-ingress

Paste the following annotations.

nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Kibana Authentication Required - kibanaadmin"

Now that the ingress is updated, hit the URL again, and as shown in the image below, we are asked for authentication.

Kibana dashboard asking for authentication

Ruby 2.6 adds write_timeout to Net::HTTP

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, if we created a large request with Net::HTTP, it would hang forever until the request was interrupted. To fix this issue, a write_timeout attribute and a write_timeout= method have been added to Net::HTTP in Ruby 2.6. The default value of write_timeout is 60 seconds, and it can be set to an integer or a float value.

Similarly, a write_timeout attribute and a write_timeout= method have been added to the Net::BufferedIO class.

If any chunk of the request cannot be written within the number of seconds provided to write_timeout, a Net::WriteTimeout exception is raised. The Net::WriteTimeout exception is not raised on Windows systems.

Example

# server.rb

require 'socket'

server = TCPServer.new('localhost', 2345)
loop do
  socket = server.accept
end

Ruby 2.5.1

# client.rb

require 'net/http'

connection = Net::HTTP.new('localhost', 2345)
connection.open_timeout = 1
connection.read_timeout = 3
connection.start

post = Net::HTTP::Post.new('/')
body = (('a' * 1023) + "\n") * 5_000
post.body = body

puts "Sending #{body.bytesize} bytes"
connection.request(post)

Output

$ RBENV_VERSION=2.5.1 ruby client.rb

Sending 5120000 bytes

Ruby 2.5.1 processes the request endlessly unless the above program is interrupted.

Ruby 2.6.0-dev

Add the write_timeout attribute to the Net::HTTP instance in the client.rb program.

# client.rb

require 'net/http'

connection = Net::HTTP.new('localhost', 2345)
connection.open_timeout = 1
connection.read_timeout = 3

# set write_timeout to 10 seconds
connection.write_timeout = 10

connection.start

post = Net::HTTP::Post.new('/')
body = (('a' * 1023) + "\n") * 5_000
post.body = body

puts "Sending #{body.bytesize} bytes"
connection.request(post)

Output

$ RBENV_VERSION=2.6.0-dev ruby client.rb

Sending 5120000 bytes
Traceback (most recent call last):
    13: from client.rb:17:in `<main>'
    12: from /net/http.rb:1479:in `request'
    11: from /net/http.rb:1506:in `transport_request'
    10: from /net/http.rb:1506:in `catch'
     9: from /net/http.rb:1507:in `block in transport_request'
     8: from /net/http/generic_request.rb:123:in `exec'
     7: from /net/http/generic_request.rb:189:in `send_request_with_body'
     6: from /net/protocol.rb:221:in `write'
     5: from /net/protocol.rb:239:in `writing'
     4: from /net/protocol.rb:222:in `block in write'
     3: from /net/protocol.rb:249:in `write0'
     2: from /net/protocol.rb:249:in `each_with_index'
     1: from /net/protocol.rb:249:in `each'
/net/protocol.rb:270:in `block in write0': Net::WriteTimeout (Net::WriteTimeout)

In Ruby 2.6.0, the above program is terminated after 10 seconds (the value set for the write_timeout attribute) by raising a Net::WriteTimeout exception.

Here is the relevant commit and discussion for this change.

Ruby 2.6 Introduces Dir#each_child and Dir#children instance methods

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Ruby 2.5 had introduced the class-level methods Dir::each_child and Dir::children. We wrote a detailed blog about it.

In Ruby 2.6, the same methods are added as instance methods on the Dir class. Dir#children returns an array of all the filenames in the directory except . and .., while Dir#each_child yields each filename to the given block.

Let’s have a look at examples to understand it better.

Dir#children

directory = Dir.new('/Users/tejaswinichile/workspace')

directory.children
=> ["panda.png", "apple.png", "banana.png", "camera.jpg"]

Dir#each_child iterates over the directory and calls the block for each file entry, passing the filename as a parameter to the block.

Dir#each_child

directory = Dir.new('/Users/tejaswinichile/workspace')

directory.each_child { |filename| puts "Currently reading: #{filename}" }

Currently reading: panda.png
Currently reading: apple.png
Currently reading: banana.png
Currently reading: camera.jpg
=> #<Dir:/Users/tejaswinichile/workspace>

If we don’t pass any block to each_child, it returns an enumerator instead.

directory = Dir.new('/Users/tejaswinichile/workspace')

directory.each_child

=> #<Enumerator: #<Dir:/Users/tejaswinichile/workspace>:each_child>

Here is the relevant commit and discussion for this change.

Ruby 2.6 adds option to not raise exception for Integer, Float methods

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

We can use the Integer and Float methods to convert values to integers and floats respectively. Ruby also has the to_i and to_f methods for the same purpose. Let’s see how to_i differs from the Integer method.

>> "1one".to_i
=> 1

>> Integer("1one")
ArgumentError: invalid value for Integer(): "1one"
	from (irb):2:in `Integer'
	from (irb):2
	from /Users/prathamesh/.rbenv/versions/2.4.0/bin/irb:11:in `<main>'
>>

The to_i method tries to convert as much of the given input to an integer as possible, whereas the Integer method throws an ArgumentError if it can’t convert the input to an integer. The Integer and Float methods parse more strictly compared to to_i and to_f respectively.

Sometimes, we might need the strictness of Integer and Float but also the ability to not raise an exception every time the input can’t be parsed.

Before Ruby 2.6, it was possible to achieve this in the following way.

>> Integer("msg") rescue nil

In Ruby 2.6, the Integer and Float methods accept a keyword argument exception which can be either true or false. If it is false then no exception is raised if the input can’t be parsed and nil is returned.

>> Float("foo", exception: false)
=> nil
>> Integer("foo", exception: false)
=> nil

This is also faster than rescuing the exception and returning nil.

>> Benchmark.ips do |x|
?>       x.report("rescue") {
?>           Integer('foo') rescue nil
>>       }
>>     x.report("kwarg") {
?>           Integer('foo', exception: false)
>>       }
>>     x.compare!
>> end
Warming up --------------------------------------
              rescue    41.896k i/100ms
               kwarg    81.459k i/100ms
Calculating -------------------------------------
              rescue    488.006k (± 4.5%) i/s -      2.472M in   5.076848s
               kwarg      1.024M (±11.8%) i/s -      5.050M in   5.024937s

Comparison:
               kwarg:  1023555.3 i/s
              rescue:   488006.0 i/s - 2.10x  slower

As we can see, rescuing the exception is twice as slow as using the new keyword argument. We can still use the older technique if we want to return a value other than nil.

>> Integer('foo') rescue 42
=> 42

By default, the keyword argument exception is set to true for backward compatibility.

The Chinese version of this blog is available here.

Speeding up Docker image build process of a Rails application

tl;dr: We reduced the Docker image build time from 10 minutes to 5 minutes by re-using the bundler cache and by precompiling assets.

We deploy one of our Rails applications on a dedicated Kubernetes cluster. Kubernetes is a good fit for us since it automatically scales the containerized application horizontally as per the load and resource consumption. The prerequisite to deploying any kind of application on Kubernetes is that the application needs to be containerized. We use Docker to containerize our application.

We have been successfully containerizing and deploying our Rails application on Kubernetes for about a year now. Although containerization was working fine, we were not happy with the overall time spent containerizing the application whenever we changed the source code and deployed the app.

We use Jenkins for building on-demand Docker images of our application with the help of CloudBees Docker Build and Publish plugin.

We observed that the average build time of a Jenkins job to build a Docker image was about 9 to 10 minutes.

Investigating what takes most time

We wipe the workspace folder of the Jenkins job after finishing each Jenkins build to avoid any unintentional behavior caused by the residue left from a previous build. The application’s folder is about 500 MiB in size. Each Jenkins build spends about 20 seconds to perform a shallow Git clone of the latest commit of the specified git branch from our remote GitHub repository.
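
For reference, a shallow, single-branch clone can be performed with a command along the following lines; the repository URL and branch name here are placeholders, not our actual configuration.

git clone --depth 1 --branch feature-branch git@github.com:example/app.git .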

After cloning the latest source code, Jenkins executes docker build command to build a Docker image with a unique tag to containerize the cloned source code of the application.

Jenkins build spends another 10 seconds invoking docker build command and sending build context to Docker daemon.

01:05:43 [docker-builder] $ docker build --build-arg RAILS_ENV=production -t bigbinary/xyz:production-role-management-feature-1529436929 --pull=true --file=./Dockerfile /var/lib/jenkins/workspace/docker-builder
01:05:53 Sending build context to Docker daemon 489.4 MB

We use the same Docker image on a number of Kubernetes pods. Therefore, we do not want to execute bundle install and rake assets:precompile tasks while starting a container in each pod which would prevent that pod from accepting any requests until these tasks are finished.

The recommended approach is to run the bundle install and rake assets:precompile tasks while or before containerizing the Rails application.

Following is a trimmed down version of our actual Dockerfile which is used by docker build command to containerize our application.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

ARG RAILS_ENV

RUN bin/bundle install --without development test

RUN bin/rake assets:precompile

CMD ["bin/bundle", "exec", "puma"]

The RUN instructions in the above Dockerfile execute the bundle install and rake assets:precompile tasks while building a Docker image. Therefore, when a Kubernetes pod is created using such a Docker image, Kubernetes pulls the image, starts a Docker container using that image inside the pod and runs the puma server immediately.

The base Docker image which we use in the FROM instruction contains the necessary system packages. We rarely need to update any system package. Therefore, an intermediate layer which may have been built previously for that instruction is reused while executing the docker build command. If the layer for the FROM instruction is reused, Docker also reuses the cached layers for the next two instructions, ENV and WORKDIR, since neither of them changes.

01:05:53 Step 1/8 : FROM bigbinary/xyz-base:latest
01:05:53 latest: Pulling from bigbinary/xyz-base
01:05:53 Digest: sha256:193951cad605d23e38a6016e07c5d4461b742eb2a89a69b614310ebc898796f0
01:05:53 Status: Image is up to date for bigbinary/xyz-base:latest
01:05:53  ---> c2ab738db405
01:05:53 Step 2/8 : ENV APP_PATH /data/app/
01:05:53  ---> Using cache
01:05:53  ---> 5733bc978f19
01:05:53 Step 3/8 : WORKDIR $APP_PATH
01:05:53  ---> Using cache
01:05:53  ---> 0e5fbc868af8

For an ADD instruction, Docker calculates a checksum for each file being added and compares it against the checksums in the previously built layer. Since our source code changes often, the previously cached layer for the ADD instruction is invalidated due to the mismatching checksums. Therefore, the 4th instruction, ADD, in our Dockerfile has to add the local files from the provided build context to the filesystem of the image being built in a separate intermediate container instead of reusing the previously cached layer. On average, this instruction spends about 25 seconds.

01:05:53 Step 4/8 : ADD . $APP_PATH
01:06:12  ---> cbb9a6ac297e
01:06:17 Removing intermediate container 99ca98218d99

We need to build Docker images for our application using different Rails environments. To achieve that, we trigger a parameterized Jenkins build by specifying the needed Rails environment parameter. This parameter is then passed to the docker build command using the --build-arg RAILS_ENV=production option. The ARG instruction in the Dockerfile defines the RAILS_ENV variable, which is implicitly available as an environment variable to the instructions that follow it. Even if the previous ADD instruction didn’t invalidate the build cache, a different ARG value from a previous build causes a “cache miss” and invalidates the build cache for the subsequent instructions.

01:06:17 Step 5/8 : ARG RAILS_ENV
01:06:17  ---> Running in b793b8cc2fe7
01:06:22  ---> b8a70589e384
01:06:24 Removing intermediate container b793b8cc2fe7

The next two RUN instructions are used to install gems and precompile static assets using Sprockets. Since the earlier instructions have already invalidated the build cache, these RUN instructions are mostly executed instead of reusing a cached layer. The bundle install command takes about 2.5 minutes and the rake assets:precompile task takes about 4.35 minutes.

01:06:24 Step 6/8 : RUN bin/bundle install --without development test
01:06:24  ---> Running in a556c7ca842a
01:06:25 bin/bundle install --without development test
01:08:22  ---> 82ab04f1ff42
01:08:40 Removing intermediate container a556c7ca842a
01:08:58 Step 7/8 : RUN bin/rake assets:precompile
01:08:58  ---> Running in b345c73a22c
01:08:58 bin/bundle exec rake assets:precompile
01:09:07 ** Invoke assets:precompile (first_time)
01:09:07 ** Invoke assets:environment (first_time)
01:09:07 ** Execute assets:environment
01:09:07 ** Invoke environment (first_time)
01:09:07 ** Execute environment
01:09:12 ** Execute assets:precompile
01:13:20  ---> 57bf04f3c111
01:13:23 Removing intermediate container b345c73a22c

The above two RUN instructions were clearly the main culprits slowing down the whole docker build command and thus the Jenkins build.

The final instruction CMD which starts the puma server takes another 10 seconds. After building the Docker image, the docker push command spends another minute.

01:13:23 Step 8/8 : CMD ["bin/bundle", "exec", "puma"]
01:13:23  ---> Running in 104967ad1553
01:13:31  ---> 35d2259cdb1d
01:13:34 Removing intermediate container 104967ad1553
01:13:34 Successfully built 35d2259cdb1d
01:13:35 [docker-builder] $ docker inspect 35d2259cdb1d
01:13:35 [docker-builder] $ docker push bigbinary/xyz:production-role-management-feature-1529436929
01:13:35 The push refers to a repository [docker.io/bigbinary/xyz]
01:14:21 d67854546d53: Pushed
01:14:22 production-role-management-feature-1529436929: digest: sha256:07f86cfd58fac412a38908d7a7b7d0773c6a2980092df416502d7a5c051910b3 size: 4106
01:14:22 Finished: SUCCESS

So, we found the exact commands which were causing the docker build command to take so much time to build a Docker image.

Let’s summarize the steps involved in building our Docker image and the average time each needed to finish.

Command or Instruction Average Time Spent
Shallow clone of Git Repository by Jenkins 20 Seconds
Invocation of docker build by Jenkins and sending build context to Docker daemon 10 Seconds
FROM bigbinary/xyz-base:latest 0 Seconds
ENV APP_PATH /data/app/ 0 Seconds
WORKDIR $APP_PATH 0 Seconds
ADD . $APP_PATH 25 Seconds
ARG RAILS_ENV 7 Seconds
RUN bin/bundle install --without development test 2.5 Minutes
RUN bin/rake assets:precompile 4.35 Minutes
CMD ["bin/bundle", "exec", "puma"] 1.15 Minutes
Total 9 Minutes

Often, people build Docker images from a single Git branch, like master. Since changes in a single branch are incremental and the Gemfile.lock file rarely differs across commits, the bundler cache need not be managed explicitly. Instead, Docker automatically re-uses the previously built layer for the RUN bundle install instruction if the Gemfile.lock file remains unchanged.

In our case, this does not happen. For every new feature or bug fix, we create a separate Git branch. To verify the changes on a particular branch, we deploy a separate review app which serves the code from that branch. To support this workflow, every day we need to build a lot of Docker images containing source code from varying Git branches as well as varying environments. Most of the time, the Gemfile.lock and assets have different versions across these Git branches. Therefore, it is hard for Docker to cache layers for the bundle install and rake assets:precompile tasks and re-use those layers during every docker build command run with different application source code and a different environment. As a result, the RUN bin/bundle install and RUN bin/rake assets:precompile instructions were executed in almost every build without re-using the previously built layer cache.

Before discussing the approaches to speed up our Docker build flow, let’s get familiar with the bundle install and rake assets:precompile tasks and how to speed them up by reusing cache.

Speeding up “bundle install” by using cache

By default, Bundler installs gems at the location set by RubyGems and looks up installed gems at the same location.

This location can be explicitly changed by using the --path option.

If Gemfile.lock does not exist, or no gems are found at the explicitly provided location or at the default gem path, then the bundle install command fetches all remote sources, resolves dependencies if needed and installs the required gems as per the Gemfile.

The bundle install --path=vendor/cache command installs the gems at the vendor/cache location in the current directory. If the same command is run again without making any change in the Gemfile, it finishes almost instantly because the gems are already installed and cached in vendor/cache, so Bundler does not need to fetch any gems.
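
To see this caching behaviour in isolation, we can run the command twice in a row; this is just a local sketch, not part of our build setup.

# First run fetches gems from remote sources and installs them into vendor/cache.
bundle install --path=vendor/cache

# Second run with an unchanged Gemfile finds everything in vendor/cache and finishes almost instantly.
bundle install --path=vendor/cache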

The tree structure of vendor/cache directory looks like this.

vendor/cache
├── aasm-4.12.3.gem
├── actioncable-5.1.4.gem
├── activerecord-5.1.4.gem
├── [...]
├── ruby
│   └── 2.4.0
│       ├── bin
│       │   ├── aws.rb
│       │   ├── dotenv
│       │   ├── erubis
│       │   ├── [...]
│       ├── build_info
│       │   └── nokogiri-1.8.1.info
│       ├── bundler
│       │   └── gems
│       │       ├── activeadmin-043ba0c93408
│       │       [...]
│       ├── cache
│       │   ├── aasm-4.12.3.gem
│       │   ├── actioncable-5.1.4.gem
│       │   ├── [...]
│       │   ├── bundler
│       │   │   └── git
│       └── specifications
│           ├── aasm-4.12.3.gemspec
│           ├── actioncable-5.1.4.gemspec
│           ├── activerecord-5.1.4.gemspec
│           ├── [...]
│           [...]
[...]

It appears that Bundler keeps two separate copies of the .gem files at two different locations, vendor/cache and vendor/cache/ruby/VERSION_HERE/cache.

Therefore, even if we remove a gem from the Gemfile, that gem will be removed only from the vendor/cache directory. The vendor/cache/ruby/VERSION_HERE/cache directory will still have the cached .gem file for that removed gem.

Let’s see an example.

We have the 'aws-sdk', '2.11.88' gem in our Gemfile and it is installed.

$ ls vendor/cache/aws-sdk-*
vendor/cache/aws-sdk-2.11.88.gem
vendor/cache/aws-sdk-core-2.11.88.gem
vendor/cache/aws-sdk-resources-2.11.88.gem

$ ls vendor/cache/ruby/2.4.0/cache/aws-sdk-*
vendor/cache/ruby/2.4.0/cache/aws-sdk-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-core-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-resources-2.11.88.gem

Now, we will remove the aws-sdk gem from Gemfile and run bundle install.

$ bundle install --path=vendor/cache
Using rake 12.3.0
Using aasm 4.12.3
[...]
Updating files in vendor/cache
Removing outdated .gem files from vendor/cache
  * aws-sdk-2.11.88.gem
  * jmespath-1.3.1.gem
  * aws-sdk-resources-2.11.88.gem
  * aws-sdk-core-2.11.88.gem
  * aws-sigv4-1.0.2.gem
Bundled gems are installed into `./vendor/cache`

$ ls vendor/cache/aws-sdk-*
no matches found: vendor/cache/aws-sdk-*

$ ls vendor/cache/ruby/2.4.0/cache/aws-sdk-*
vendor/cache/ruby/2.4.0/cache/aws-sdk-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-core-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-resources-2.11.88.gem

We can see that the cached version of gem(s) remained unaffected.

If we add the same gem 'aws-sdk', '2.11.88' back to the Gemfile and perform bundle install, Bundler will install that gem from the cache instead of fetching it from the remote gem repository.

$ bundle install --path=vendor/cache
Resolving dependencies........
[...]
Using aws-sdk 2.11.88
[...]
Updating files in vendor/cache
  * aws-sigv4-1.0.3.gem
  * jmespath-1.4.0.gem
  * aws-sdk-core-2.11.88.gem
  * aws-sdk-resources-2.11.88.gem
  * aws-sdk-2.11.88.gem

$ ls vendor/cache/aws-sdk-*
vendor/cache/aws-sdk-2.11.88.gem
vendor/cache/aws-sdk-core-2.11.88.gem
vendor/cache/aws-sdk-resources-2.11.88.gem

What we understand from this is that if we can reuse the explicitly provided vendor/cache directory every time we need to execute the bundle install command, then the command will be much faster because Bundler will use gems from the local cache instead of fetching them from the Internet.

Speeding up “rake assets:precompile” task by using cache

Code written in TypeScript, Elm, JSX, etc. cannot be served directly to the browser. Almost all web browsers understand only JavaScript (ES5), CSS and image files. Therefore, we need to transpile, compile or convert the source assets into formats which browsers can understand. In Rails, Sprockets is the most widely used library for managing and compiling assets.

In the development environment, Sprockets compiles assets on the fly as and when needed using Sprockets::Server. In the production environment, the recommended approach is to precompile assets into a directory on disk and serve them using a web server like Nginx.

Precompilation is a multi-step process for converting a source asset file into a static and optimized form using components such as processors, transformers, compressors, directives, environments, a manifest and pipelines, with the help of various gems such as sass-rails, execjs, etc. Assets need to be precompiled in production so that Sprockets does not have to resolve inter-dependencies between required source assets every time a static asset is requested. To understand how Sprockets works in great detail, please read this guide.

When we compile source assets using rake assets:precompile task, we can find the compiled assets in public/assets directory inside our Rails application.

$ ls public/assets
manifest-15adda275d6505e4010b95819cf61eb3.json
icons-6250335393ad03df1c67eafe138ab488.eot
icons-6250335393ad03df1c67eafe138ab488.eot.gz
icons-b341bf083c32f9e244d0dea28a763a63.svg
icons-b341bf083c32f9e244d0dea28a763a63.svg.gz
application-8988c56131fcecaf914b22f54359bf20.js
application-8988c56131fcecaf914b22f54359bf20.js.gz
xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js
xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js.gz
application-adc697aed7731c864bafaa3319a075b1.css
application-adc697aed7731c864bafaa3319a075b1.css.gz
FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf
FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf.gz
[...]

We can see that each source asset has been compiled and minified along with its gzipped version.

Note that the assets have a unique digest or fingerprint in their file names. A digest is a hash calculated by Sprockets from the contents of an asset file. If the contents of an asset change, that asset’s digest also changes. The digest is mainly used for cache busting, so that a new version of the same asset can be generated when the source file is modified or the configured cache period expires.

The rake assets:precompile task also generates a manifest file along with the precompiled assets. This manifest is used by Sprockets to perform fast lookups without having to actually compile our assets code.

An example manifest file, in our case public/assets/manifest-15adda275d6505e4010b95819cf61eb3.json looks like this.

{
  "files": {
    "application-8988c56131fcecaf914b22f54359bf20.js": {
      "logical_path": "application.js",
      "mtime": "2018-07-06T07:32:27+00:00",
      "size": 3797752,
      "digest": "8988c56131fcecaf914b22f54359bf20"
    },
    "xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js": {
      "logical_path": "xlsx.full.min.js",
      "mtime": "2018-07-05T22:06:17+00:00",
      "size": 883635,
      "digest": "feaaf61b9d67aea9f122309f4e78d5a5"
    },
    "application-adc697aed7731c864bafaa3319a075b1.css": {
      "logical_path": "application.css",
      "mtime": "2018-07-06T07:33:12+00:00",
      "size": 242611,
      "digest": "adc697aed7731c864bafaa3319a075b1"
    },
    "FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf": {
      "logical_path": "FontAwesome.otf",
      "mtime": "2018-06-20T06:51:49+00:00",
      "size": 134808,
      "digest": "42b44fdc9088cae450b47f15fc34c801"
    },
    [...]
  },
  "assets": {
    "icons.eot": "icons-6250335393ad03df1c67eafe138ab488.eot",
    "icons.svg": "icons-b341bf083c32f9e244d0dea28a763a63.svg",
    "application.js": "application-8988c56131fcecaf914b22f54359bf20.js",
    "xlsx.full.min.js": "xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js",
    "application.css": "application-adc697aed7731c864bafaa3319a075b1.css",
    "FontAwesome.otf": "FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf",
    [...]
  }
}

Using this manifest file, Sprockets can quickly find a fingerprinted file name using that file’s logical file name and vice versa.
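
For example, assuming the jq utility is installed, we can look up the fingerprinted file name for a logical path directly from the manifest.

# Prints application-8988c56131fcecaf914b22f54359bf20.js
jq -r '.assets["application.js"]' public/assets/manifest-15adda275d6505e4010b95819cf61eb3.json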

Also, Sprockets generates a cache in binary format at tmp/cache/assets in the Rails application’s folder for the specified Rails environment. Following is an example tree structure of the tmp/cache/assets directory automatically generated after executing the RAILS_ENV=environment_here rake assets:precompile command for each Rails environment.

$ cd tmp/cache/assets && tree
.
├── demo
│   ├── sass
│   │   ├── 7de35a15a8ab2f7e131a9a9b42f922a69327805d
│   │   │   ├── application.css.sassc
│   │   │   └── bootstrap.css.sassc
│   │   ├── [...]
│   └── sprockets
│       ├── 002a592d665d92efe998c44adc041bd3
│       ├── 7dd8829031d3067dcf26ffc05abd2bd5
│       └── [...]
├── production
│   ├── sass
│   │   ├── 80d56752e13dda1267c19f4685546798718ad433
│   │   │   ├── application.css.sassc
│   │   │   └── bootstrap.css.sassc
│   │   ├── [...]
│   └── sprockets
│       ├── 143f5a036c623fa60d73a44d8e5b31e7
│       ├── 31ae46e77932002ed3879baa6e195507
│       └── [...]
└── staging
    ├── sass
    │   ├── 2101b41985597d41f1e52b280a62cd0786f2ee51
    │   │   ├── application.css.sassc
    │   │   └── bootstrap.css.sassc
    │   ├── [...]
    └── sprockets
        ├── 2c154d4604d873c6b7a95db6a7d5787a
        ├── 3ae685d6f922c0e3acea4bbfde7e7466
        └── [...]

Let’s inspect the contents of an example cached file. Since the cached file is in binary form, we can force the non-printable control characters as well as the binary content to be displayed in text form using the cat -v command.

$ cat -v tmp/cache/assets/staging/sprockets/2c154d4604d873c6b7a95db6a7d5787a

^D^H{^QI"
class^F:^FETI"^SProcessedAsset^F;^@FI"^Qlogical_path^F;^@TI"^]components/Comparator.js^F;^@TI"^Mpathname^F;^@TI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@FI"^Qcontent_type^F;^@TI"^[application/javascript^F;^@TI"
mtime^F;^@Tl+^GM-gM-z;[I"^Klength^F;^@Ti^BM-L^BI"^Kdigest^F;^@TI"%18138d01fe4c61bbbfeac6d856648ec9^F;^@FI"^Ksource^F;^@TI"^BM-L^Bvar Comparator = function (props) {
  var comparatorOptions = [React.createElement("option", { key: "?", value: "?" })];
  var allComparators = props.metaData.comparators;
  var fieldDataType = props.fieldDataType;
  var allowedComparators = allComparators[fieldDataType] || allComparators.integer;
  return React.createElement(
    "select",
    {
      id: "comparator-" + props.id,
      disabled: props.disabled,
      onChange: props.handleComparatorChange,
      value: props.comparatorValue },
    comparatorOptions.concat(allowedComparators.map(function (comparator, id) {
      return React.createElement(
        "option",
        { key: id, value: comparator },
        comparator
      );
    }))
  );
};^F;^@TI"^Vdependency_digest^F;^@TI"%d6c86298311aa7996dd6b5389f45949f^F;^@FI"^Srequired_paths^F;^@T[^FI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@FI"^Udependency_paths^F;^@T[^F{^HI"   path^F;^@TI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@F@^NI"^^2018-07-03T22:38:31+00:00^F;^@T@^QI"%51ab9ceec309501fc13051c173b0324f^F;^@FI"^M_version^F;^@TI"%30fd133466109a42c8cede9d119c3992^F;^@F

We can see some weird looking characters in the above file because it is not a regular file meant to be read by humans. It also seems to hold some important information such as the mime-type, the original source file’s path, the compiled source, the digest, and the paths and digests of required dependencies. The above cached file appears to correspond to the original source file located at app/assets/javascripts/components/Comparator.jsx, whose actual contents are in JSX and ES6 syntax as shown below.

const Comparator = (props) => {
  const comparatorOptions = [<option key="?" value="?" />];
  const allComparators = props.metaData.comparators;
  const fieldDataType = props.fieldDataType;
  const allowedComparators = allComparators[fieldDataType] || allComparators.integer;
  return (
    <select
      id={`comparator-${props.id}`}
      disabled={props.disabled}
      onChange={props.handleComparatorChange}
      value={props.comparatorValue}>
      {
        comparatorOptions.concat(allowedComparators.map((comparator, id) =>
          <option key={id} value={comparator}>{comparator}</option>
        ))
      }
    </select>
  );
};

If a similar cache exists for a Rails environment under tmp/cache/assets and no source asset file is modified, then re-running the rake assets:precompile task for the same environment will finish quickly. This is because Sprockets will reuse the cache and therefore will not need to resolve inter-asset dependencies, perform conversions, etc.

Even if certain source assets are modified, Sprockets will rebuild the cache and re-generate compiled and fingerprinted assets just for the modified source assets.

Therefore, we can now understand that if we can reuse the tmp/cache/assets and public/assets directories every time we need to execute the rake assets:precompile task, then Sprockets will perform precompilation much faster.
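
A quick way to observe this, assuming the tmp/cache/assets and public/assets directories are left intact between runs, is to time two consecutive precompilations.

# Cold cache: Sprockets compiles everything from scratch.
time RAILS_ENV=production bin/rake assets:precompile

# Warm cache: with no source changes, Sprockets reuses the cache and finishes much faster.
time RAILS_ENV=production bin/rake assets:precompile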

Speeding up “docker build” – first attempt

As discussed above, we were now familiar with how to speed up the bundle install and rake assets:precompile commands individually.

We decided to use this knowledge to speed up our slow docker build command. Our initial thought was to mount a directory from the host Jenkins machine into the filesystem of the image being built by the docker build command. This mounted directory could then be used as a cache directory to persist the cache files of both the bundle install and rake assets:precompile commands run as part of the docker build command in each Jenkins build. Every new build could then re-use the previous build’s cache and therefore finish faster.

Unfortunately, this isn’t possible because Docker does not support it yet. Unlike the docker run command, the docker build command cannot mount a host directory. A feature request for providing a shared host machine directory option to the docker build command is still open here.
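
To illustrate the limitation, mounting a host directory works with docker run, but docker build offers no equivalent option; the paths below are only illustrative.

# docker run can mount a host directory into a running container.
docker run --rm -v /var/lib/jenkins/cache:/cache bigbinary/xyz-base:latest ls /cache

# docker build has no such flag, so the image being built cannot access a host directory.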

To reuse cache and perform faster, we needed to carry the cache files of both the bundle install and rake assets:precompile commands between docker builds (and therefore Jenkins builds). We were looking for a place which could be treated as a shared cache location and accessed during each build.

We decided to use Amazon’s S3 service to solve this problem.

To upload and download files from S3, we needed to inject credentials for S3 into the build context provided to the docker build command.

Alternatively, these S3 credentials can be provided to the docker build command using the --build-arg option as discussed earlier.
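
For example, the credentials could be passed as build arguments roughly like this; the S3_ACCESS_KEY and S3_SECRET_KEY argument names and the image tag are hypothetical, and they would need matching ARG instructions in the Dockerfile.

docker build \
  --build-arg RAILS_ENV=production \
  --build-arg S3_ACCESS_KEY="$S3_ACCESS_KEY" \
  --build-arg S3_SECRET_KEY="$S3_SECRET_KEY" \
  -t bigbinary/xyz:some-tag .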

We used s3cmd command-line utility to interact with the S3 service.

The following shell script, named install_gems_and_precompile_assets.sh, was configured to be executed using a RUN instruction while running the docker build command.

set -ex

# Step 1.
if [ -e s3cfg ]; then mv s3cfg ~/.s3cfg; fi

bundler_cache_path="vendor/cache"
assets_cache_path="tmp/cache/assets"
precompiled_assets_path="public/assets"
cache_archive_name="cache.tar.gz"
s3_bucket_path="s3://docker-builder-bundler-and-assets-cache"
s3_cache_archive_path="$s3_bucket_path/$cache_archive_name"

# Step 2.
# Fetch tarball archive containing cache and extract it.
# The "tar" command extracts the archive into "vendor/cache",
# "tmp/assets/cache" and "public/assets".
if s3cmd get $s3_cache_archive_path; then
  tar -xzf $cache_archive_name && rm -f $cache_archive_name
fi

# Step 3.
# Install gems from "vendor/cache" and pack them up.
bin/bundle install --without development test --path $bundler_cache_path
bin/bundle pack --quiet

# Step 4.
# Precompile assets.
# Note that the "RAILS_ENV" is already defined in Dockerfile
# and will be used implicitly.
bin/rake assets:precompile

# Step 5.
# Compress "vendor/cache", "tmp/assets/cache"
# and "public/assets" directories into a tarball archive.
tar -zcf $cache_archive_name $bundler_cache_path \
                             $assets_cache_path  \
                             $precompiled_assets_path

# Step 6.
# Push the compressed archive containing updated cache to S3.
s3cmd put $cache_archive_name $s3_cache_archive_path || true

# Step 7.
rm -f $cache_archive_name ~/.s3cfg

Let’s discuss the various steps annotated in the above script.

  1. The S3 credentials file injected by Jenkins into the build context needs to be placed at ~/.s3cfg location, so we move that credentials file accordingly.
  2. Try to fetch the compressed tarball archive comprising the vendor/cache, tmp/cache/assets and public/assets directories. If it exists, extract the tarball archive at the respective paths and remove the tarball.
  3. Execute the bundle install command which would re-use the extracted cache from vendor/cache.
  4. Execute the rake assets:precompile command which would re-use the extracted cache from tmp/cache/assets and public/assets.
  5. Compress the vendor/cache, tmp/cache/assets and public/assets cache directories into a tarball archive.
  6. Upload the compressed tarball archive containing updated cache directories to S3.
  7. Remove the compressed tarball archive and the S3 credentials file.

Please note that in our actual setup we generated different tarball archives depending upon the provided RAILS_ENV environment. For demonstration, we use just a single archive here.
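
A minimal sketch of such per-environment naming, assuming the same variables as in the script above, could look like this.

# Keep one cache archive per Rails environment,
# e.g. cache-production.tar.gz, cache-staging.tar.gz.
cache_archive_name="cache-${RAILS_ENV}.tar.gz"
s3_cache_archive_path="$s3_bucket_path/$cache_archive_name"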

The Dockerfile needed to be updated to execute the install_gems_and_precompile_assets.sh script.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

ARG RAILS_ENV

RUN install_gems_and_precompile_assets.sh

CMD ["bin/bundle", "exec", "puma"]

With this setup, average time of the Jenkins builds was now reduced to about 5 minutes. This was a great achievement for us.

We reviewed this approach in great detail and found that although it was working fine, there was a major security flaw. It is not at all recommended to inject confidential information such as login credentials, private keys, etc. as part of the build context or via build arguments while building a Docker image using the docker build command. And we were actually injecting S3 credentials into the Docker image. Such confidential credentials provided while building a Docker image can be inspected using the docker history command by anyone who has access to that Docker image.
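
For instance, anyone who can pull the image can see the values passed through build arguments in the image’s layer history.

# Values supplied via --build-arg show up in the commands recorded for the layers that used them.
docker history --no-trunc bigbinary/xyz:production-role-management-feature-1529436929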

For this reason, we had to abandon this approach and look for another.

Speeding up “docker build” – second attempt

In our second attempt, we decided to execute the bundle install and rake assets:precompile commands outside the docker build command; that is, as part of the Jenkins build itself. So with the new approach, we first execute the bundle install and rake assets:precompile commands as part of the Jenkins build and then execute docker build as usual. With this approach, we could take advantage of the inter-build caching provided by Jenkins.

The prerequisite was to have all the system packages required by the gems listed in the application’s Gemfile installed on the Jenkins machine. We installed all the necessary system packages on our Jenkins server.

The following screenshot highlights the things that we needed to configure in our Jenkins job to make this approach work.

1. Running the Jenkins build in an RVM-managed environment with the specified Ruby version

Sometimes we need to use a different Ruby version, as specified in the .ruby-version file in the cloned source code of the application. By default, the bundle install command would install the gems for the system Ruby version available on the Jenkins machine. This was not acceptable for us. Therefore, we needed a way to execute the bundle install command in the Jenkins build in an isolated environment which could use the Ruby version specified in the .ruby-version file instead of the default system Ruby version. To address this, we used the RVM plugin for Jenkins. The RVM plugin enabled us to run the Jenkins build in an isolated environment by using or installing the Ruby version specified in the .ruby-version file. The section highlighted in green in the above screenshot shows the configuration required to enable this plugin.

2. Carrying cache files between Jenkins builds required to speed up “bundle install” and “rake assets:precompile” commands

We used the Job Cacher Jenkins plugin to persist and carry the vendor/cache, tmp/cache/assets and public/assets cache directories between builds. At the beginning of a Jenkins build, just after cloning the source code of the application, the Job Cacher plugin restores the previously cached version of these directories into the current build. Similarly, before finishing a Jenkins build, the Job Cacher plugin copies the current version of these directories to /var/lib/jenkins/jobs/docker-builder/cache on the Jenkins machine, which is outside the workspace directory of the Jenkins job. The section highlighted in red in the above screenshot shows the configuration required to enable this plugin.

3. Executing the “bundle install” and “rake assets:precompile” commands before “docker build” command

Using the “Execute shell” build step provided by Jenkins, we execute the bundle install and rake assets:precompile commands just before the docker build command invoked by the CloudBees Docker Build and Publish plugin. Since the Job Cacher plugin already restores the vendor/cache, tmp/cache/assets and public/assets directories from the previous build into the current build, the bundle install and rake assets:precompile commands re-use the cache and perform faster.
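
A simplified sketch of that “Execute shell” build step might look like the following; the actual job configuration contains more details.

# Gems and assets are built on the Jenkins host itself,
# re-using the directories restored by the Job Cacher plugin.
bin/bundle install --without development test --path vendor/cache

# RAILS_ENV comes from the parameterized Jenkins build.
bin/bundle exec rake assets:precompile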

The updated Dockerfile now has fewer instructions.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

CMD ["bin/bundle", "exec", "puma"]

With this approach, the average Jenkins build time is now between 3.5 and 4.5 minutes.

The following graph shows the build time trend of some of the recent builds on our Jenkins server.

Please note that the spikes in the above graph show that certain Jenkins builds sometimes took more than 5 minutes due to concurrently running builds at that time. Because our Jenkins server has a limited set of resources, concurrently running builds often run longer than estimated.

We are still looking to improve the containerization speed further while keeping the image size small. Please let us know if there’s anything else we can do to improve the containerization process.

Note that our Jenkins server runs on Ubuntu, which is based on Debian, and our base Docker image is also based on Debian. Some of the gems in our Gemfile have native extensions written in C. The gems pre-installed on the Jenkins machine have been working without any issues inside the Docker containers on Kubernetes. This may not work if the two platforms are different, since native extension gems installed on the Jenkins host may fail to work inside the Docker container.

Ruby 2.6 adds Binding#source_location

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, if we wanted to know the file name and line number of a piece of source code, we would need to use Binding#eval.

binding.eval('[__FILE__, __LINE__]')
=> ["/Users/taha/blog/app/controllers/application_controller", 2]

Ruby 2.6 adds the more readable method Binding#source_location to achieve a similar result.

binding.source_location
=> ["/Users/taha/blog/app/controllers/application_controller", 2]

Here is the relevant commit and discussion for this change.

The Chinese version of this blog is available here.

Ruby 2.6 adds String#split with block

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, String#split returned an array of split strings.

In Ruby 2.6, a block can be passed to String#split which yields each split string and operates on it. This avoids creating an array and thus is memory efficient.

We will add a method is_fruit? to understand how to use split with a block.

def is_fruit?(value)
  %w(apple mango banana watermelon grapes guava lychee).include?(value)
end

The input is a comma-separated string with vegetable and fruit names. The goal is to fetch the names of fruits from the input string and store them in an array.

String#split
input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

splitted_values = input_str.split(", ")
=> ["apple", "mango", "potato", "banana", "cabbage", "watermelon", "grapes"]

fruits = splitted_values.select { |value| is_fruit?(value) }
=> ["apple", "mango", "banana", "watermelon", "grapes"]

Using split, an intermediate array is created which contains both fruit and vegetable names.

String#split with a block
fruits = []

input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

input_str.split(", ") { |value| fruits << value if is_fruit?(value) }
=> "apple, mango, potato, banana, cabbage, watermelon, grapes"

fruits
=> ["apple", "mango", "banana", "watermelon", "grapes"]

When a block is passed to split, it returns the string on which split was called and does not create an array. String#split yields the block for each split string, which in our case pushes fruit names into a separate array.

Update

Benchmark

We created a large random string to benchmark the performance of split and split with a block.

require 'securerandom'

test_string = ''

100_000.times.each do
  test_string += SecureRandom.alphanumeric(10)
  test_string += ' '
end
require 'benchmark'

Benchmark.bmbm do |bench|

  bench.report('split') do
    arr = test_string.split(' ')
    str_starts_with_a = arr.select { |str| str.start_with?('a') }
  end

  bench.report('split with block') do
    str_starts_with_a = []
    test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
  end

end

Results

Rehearsal ----------------------------------------------------
split              0.023764   0.000911   0.024675 (  0.024686)
split with block   0.012892   0.000553   0.013445 (  0.013486)
------------------------------------------- total: 0.038120sec

                       user     system      total        real
split              0.024107   0.000487   0.024594 (  0.024622)
split with block   0.010613   0.000334   0.010947 (  0.010991)

We did another iteration of benchmarking using benchmark/ips.

require 'benchmark/ips'
Benchmark.ips do |bench|


  bench.report('split') do
    splitted_arr = test_string.split(' ')
    str_starts_with_a = splitted_arr.select { |str| str.start_with?('a') }
  end

  bench.report('split with block') do
    str_starts_with_a = []
    test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
  end

  bench.compare!
end

Results

Warming up --------------------------------------
               split     4.000  i/100ms
    split with block    10.000  i/100ms
Calculating -------------------------------------
               split     46.906  (± 2.1%) i/s -    236.000  in   5.033343s
    split with block    107.301  (± 1.9%) i/s -    540.000  in   5.033614s

Comparison:
    split with block:      107.3 i/s
               split:       46.9 i/s - 2.29x  slower

This benchmark shows that split with a block is about 2 times faster than split.

Here is the relevant commit and discussion for this change.

The Chinese version of this blog is available here.

How to upload source maps to Honeybadger

During the development of a Chrome extension, debugging was difficult because the line numbers of a minified JavaScript file are of no use without a source map. Previously, Honeybadger could only download source map files which were publicly accessible, and our source maps were inside the .crx package, which was inaccessible to Honeybadger.

Recently, Honeybadger released a new feature to upload source maps to Honeybadger. We have written a grunt plugin to do exactly that.

Here is how we can upload source maps to Honeybadger.

First, install the grunt plugin.

npm install --save-dev grunt-honeybadger-sourcemaps

Configure the gruntfile.

grunt.initConfig({
  honeybadger_sourcemaps: {
    default_options:{
      options: {
        appId: "xxxx",
        token: "xxxxxxxxxxxxxx",
        urlPrefix: "http://example.com/",
        revision: "<app version>",
        prepareUrlParam: function(fileSrc){
          // Here we can manipulate the file path
          return fileSrc.replace('built/', '');
        },
      },
      files: [{
        src: ['@path/to/**/*.map']
      }],
    }
  },
});
grunt.loadNpmTasks('grunt-honeybadger-sourcemaps');
grunt.registerTask('upload_sourcemaps', ['honeybadger_sourcemaps']);

We can get the appId and token from the Honeybadger project settings. Once the configuration is in place, run the upload task.

grunt upload_sourcemaps

Now we can upload the source maps to Honeybadger and get better error stack traces.

Testing

Clone the following repo.

git clone https://github.com/bigbinary/grunt-honeybadger-sourcemaps

Replace appId and token in Gruntfile.js and run grunt test. It should upload the sample source maps to your project.