Managing Rails tasks such as 'db:migrate' and 'db:seed' on Kubernetes while performing rolling deployments

This post assumes that you have basic understanding of Kubernetes terms like pods and deployments.

Problem

We want to deploy a Rails application on Kubernetes. We assume that the assets:precompile task would be run as part of the Docker image build process.

We want to run rake tasks such as db:migrate and db:seed on the initial deployment, and just db:migrate task on each later deployment.

We cannot run these tasks while building the Docker image as it would not be able to connect to the database at that moment.

So, how to run these tasks?

Solution

We assume that we have a Docker image named myorg/myapp:v0.0.1 which contains the source code for our Rails application.

We also assume that we have included database.yml manifest in this Docker image with the required configuration needed for connecting to the database.

We need to create a Kubernetes deployment template with the following content.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    spec:
      containers:
      - image: myorg/myapp:v0.0.1
        name: myapp
        imagePullPolicy: IfNotPresent
        env:
        - name: DB_NAME
          value: myapp
        - name: DB_USERNAME
          value: username
        - name: DB_PASSWORD
          value: password
        - name: DB_HOST
          value: 54.10.10.245
        ports:
        - containerPort: 80
      imagePullSecrets:
        - name: docker_pull_secret

Let’s save this template file as myapp-deployment.yml.

We can change the options and environment variables in above template as per our need. The environment variables specified here will be available to our Rails application.

To apply above template for the first time on Kubernetes, we will use the following command.

$ kubectl create -f myapp-deployment.yml

Later on, to apply the same template after modifications such as change in the Docker image name or change in the environment variables, we will use the following command.

$ kubectl apply -f myapp-deployment.yml

After we apply the deployment template, Kubernetes will create a pod for our application.

To see the pods, we use the following command.

$ kubectl get pods

Let’s say that our app is now running in the pod named myapp-4007005961-1st7s.

To execute a rake task, for example db:migrate, on this pod, we can run the following command.

$ kubectl exec myapp-4007005961-1st7s                              \
          -- bash -c                                               \
          'cd ~/myapp && RAILS_ENV=production bin/rake db:migrate'

Similarly, we can execute db:seed rake task as well.

If we already have an automated flow for deployments on Kubernetes, we can make use of this approach to programmatically or conditionally run any rake task as per the needs.
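
As a rough sketch of what such an automated flow could look like, the Ruby snippet below builds the same kubectl exec command for each task and decides which tasks to run. The helper names (rake_command, tasks_for, run_tasks) and the "seed only on the first deployment" rule are our assumptions for illustration; only the kubectl exec invocation itself comes from the command shown above.

```ruby
# Hypothetical helper: builds the kubectl exec command for a single rake task.
# The app directory and RAILS_ENV mirror the manual command shown earlier.
def rake_command(pod, task, app_dir: '~/myapp', rails_env: 'production')
  script = "cd #{app_dir} && RAILS_ENV=#{rails_env} bin/rake #{task}"
  ['kubectl', 'exec', pod, '--', 'bash', '-c', script]
end

# Assumed rule: run db:seed only on the very first deployment.
def tasks_for(initial_deployment)
  initial_deployment ? %w[db:migrate db:seed] : %w[db:migrate]
end

# Runs every task in order and stops at the first failure.
# The runner is injectable so the flow can be exercised without a cluster.
def run_tasks(pod, initial_deployment:, runner: method(:system))
  tasks_for(initial_deployment).each do |task|
    ok = runner.call(*rake_command(pod, task))
    raise "#{task} failed on #{pod}" unless ok
  end
end
```

Calling run_tasks('myapp-4007005961-1st7s', initial_deployment: false) would then run only db:migrate on that pod, and a non-zero exit code from the task raises an error that the deployment script can act on.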

Why not use Kubernetes Jobs to solve this?

We faced some issues while using Kubernetes Jobs to run migration and seed rake tasks.

  1. If the rake task returns a non-zero exit code, the Kubernetes job keeps spawning pods until the task command returns a zero exit code.

  2. To get around the issue mentioned above we needed to unnecessarily implement additional custom logic of checking job status and the status of all the spawned pods.

  3. Capturing the command’s STDOUT or STDERR was difficult using Kubernetes job.

  4. Some housekeeping was needed such as manually terminating the job if it wasn’t successful. If not done, it will fail to create a Kubernetes job with the same name, which is bound to occur when we perform later deployments.

Because of these issues, we chose not to rely on Kubernetes jobs to solve this problem.

Ruby 2.4 allows customizing the suffix of rotated log files

This blog is part of our Ruby 2.4 series.

In Ruby, the Logger class can be used for rotating log files daily, weekly or monthly.

daily_logger = Logger.new('foo.log', 'daily')

weekly_logger = Logger.new('foo.log', 'weekly')

monthly_logger = Logger.new('foo.log', 'monthly')

At the end of the specified period, Ruby renames the log file by appending a date suffix as follows:

foo.log.20170615

The format of the suffix for the rotated log file is %Y%m%d. In Ruby 2.3, there was no way to customize this suffix format.

Ruby 2.4 added the ability to customize the suffix format by passing an extra argument shift_period_suffix.

# Ruby 2.4

logger = Logger.new('foo.log', 'weekly', shift_period_suffix: '%d-%m-%Y')

Now, suffix of the rotated log file will use the custom date format which we passed.

foo.log.15-06-2017
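
The suffix is a strftime-style format string, so we can preview what a given format will produce before wiring it into the logger. The snippet below is a small sketch: the date and the temporary directory are our own choices for illustration.

```ruby
require 'logger'
require 'tmpdir'

# The day on which the log file would be rotated (chosen for illustration).
rotation_day = Time.new(2017, 6, 15)

# Preview the default suffix and our custom suffix with plain strftime.
default_suffix = rotation_day.strftime('%Y%m%d')
custom_suffix  = rotation_day.strftime('%d-%m-%Y')

# Create a weekly-rotating logger with the custom suffix in a throwaway directory.
Dir.mktmpdir do |dir|
  logger = Logger.new(File.join(dir, 'foo.log'), 'weekly',
                      shift_period_suffix: '%d-%m-%Y')
  logger.info('suffix demo')
  logger.close
end
```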

Ruby 2.4 added Hash#transform_values and its destructive version from Active Support

This blog is part of our Ruby 2.4 series.

It is a common use case to transform the values of a hash.

{ a: 1, b: 2, c: 3 } => { a: 2, b: 4, c: 6 }

{ a: "B", c: "D", e: "F" } => { a: "b", c: "d", e: "f" }

We can transform the values of a hash destructively (i.e. modify the original hash with new values) or non-destructively (i.e. return a new hash instead of modifying the original hash).

Prior to Ruby 2.4, we needed to use the following code to transform the values of a hash.

# Ruby 2.3 Non-destructive version

> hash = { a: 1, b: 2, c: 3 }
 #=> {:a=>1, :b=>2, :c=>3}

> hash.inject({}) { |h, (k, v)| h[k] = v * 2; h }
 #=> {:a=>2, :b=>4, :c=>6}

> hash
 #=> {:a=>1, :b=>2, :c=>3}

> hash = { a: "B", c: "D", e: "F" }
 #=> {:a=>"B", :c=>"D", :e=>"F"}

> hash.inject({}) { |h, (k, v)| h[k] = v.downcase; h }
 #=> {:a=>"b", :c=>"d", :e=>"f"}

> hash
 #=> {:a=>"B", :c=>"D", :e=>"F"}

# Ruby 2.3 Destructive version

> hash = { a: 1, b: 2, c: 3 }
 #=> {:a=>1, :b=>2, :c=>3}

> hash.each { |k, v| hash[k] = v * 2 }
 #=> {:a=>2, :b=>4, :c=>6}

> hash
 #=> {:a=>2, :b=>4, :c=>6}

> hash = { a: "B", c: "D", e: "F" }
 #=> {:a=>"B", :c=>"D", :e=>"F"}

> hash.each { |k, v| hash[k] = v.downcase }
 #=> {:a=>"b", :c=>"d", :e=>"f"}

> hash
 #=> {:a=>"b", :c=>"d", :e=>"f"}

transform_values and transform_values! from Active Support

Active Support has already implemented handy methods Hash#transform_values and Hash#transform_values! to transform hash values.

Ruby 2.4 has now implemented the same functionality. The methods were initially named Hash#map_v and Hash#map_v! and were later renamed to Hash#transform_values and Hash#transform_values!.

# Ruby 2.4 Non-destructive version

> hash = { a: 1, b: 2, c: 3 }
 #=> {:a=>1, :b=>2, :c=>3}

> hash.transform_values { |v| v * 2 }
 #=> {:a=>2, :b=>4, :c=>6}

> hash
 #=> {:a=>1, :b=>2, :c=>3}

> hash = { a: "B", c: "D", e: "F" }
 #=> {:a=>"B", :c=>"D", :e=>"F"}

> hash.transform_values(&:downcase)
 #=> {:a=>"b", :c=>"d", :e=>"f"}

> hash
 #=> {:a=>"B", :c=>"D", :e=>"F"}

# Ruby 2.4 Destructive version

> hash = { a: 1, b: 2, c: 3 }
 #=> {:a=>1, :b=>2, :c=>3}

> hash.transform_values! { |v| v * 2 }
 #=> {:a=>2, :b=>4, :c=>6}

> hash
 #=> {:a=>2, :b=>4, :c=>6}

> hash = { a: "B", c: "D", e: "F" }
 #=> {:a=>"B", :c=>"D", :e=>"F"}

> hash.transform_values!(&:downcase)
 #=> {:a=>"b", :c=>"d", :e=>"f"}

> hash
 #=> {:a=>"b", :c=>"d", :e=>"f"}
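
One practical consequence of the destructive/non-destructive split shows up with frozen hashes. The sketch below uses a made-up settings hash: the non-destructive version works because it builds a new hash, while the destructive version raises because it tries to modify the frozen one.

```ruby
# A frozen hash, as we might have for constants or shared configuration.
settings = { a: 1, b: 2 }.freeze

# Non-destructive: returns a new hash, so the frozen original is not touched.
doubled = settings.transform_values { |v| v * 2 }

# Destructive: tries to modify the frozen hash in place and raises.
# FrozenError is a RuntimeError subclass in newer Rubies, so we rescue the parent.
error = begin
  settings.transform_values! { |v| v * 2 }
  nil
rescue RuntimeError => e
  e
end
```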

Using prettier and rubocop in Ruby on Rails application to format JavaScript, CSS and Ruby files

Recently we started using prettier and rubocop to automatically format our code on git commit. Here is how we got started with setting up both prettier and rubocop in our Ruby on Rails applications.

Generate package.json

If you don’t already have a package.json file then execute the following command to create a package.json file with value {}.

echo "{}" > package.json

Install prettier

Now execute following command to install prettier.

npm install --save-dev lint-staged husky prettier

# Ignore `node_modules`
echo "/node_modules" >> .gitignore

Add scripts & ignore node_modules

Now open package.json and replace the whole file with following content.

{
  "scripts": {
    "precommit": "lint-staged"
  },
  "lint-staged": {
    "app/**/*.{js,es6,jsx}": [
      "./node_modules/prettier/bin/prettier.js --trailing-comma es5 --write",
      "git add"
    ]
  },
  "devDependencies": {
    "husky": "^0.13.4",
    "lint-staged": "^3.6.0",
    "prettier": "^1.4.2"
  }
}

Note that if you send a pull request with your changes and CircleCI or a similar tool fails while running npm install, then downgrading husky to ^0.13.4 should solve the problem.

In Ruby on Rails applications third party vendor files are stored in vendor folder and we do not want to format JavaScript code in those files. Hence we have applied the rule to run prettier only on files residing in app directory.

Here at BigBinary we store all JavaScript files using ES6 features with extension .es6. Hence we are running such files through prettier. Customize this to match with your application requirements.

Note that “precommit” hook is powered by husky. Read up “husky” documentation to learn about “prepush” hook and other features.

Commit the change

git add .
git commit -m "Added support for prettier for JavaScript files"

Execute prettier on current code

./node_modules/prettier/bin/prettier.js --single-quote --trailing-comma es5 --write "{app,__{tests,mocks}__}/**/*.{js,es6,jsx,scss,css}"

We want more

We were thrilled to see prettier format our JavaScript code. We wanted more of it at more places. We found that prettier can also format CSS files. We changed our code to also format CSS code. It was an easy change. All we had to do was change one line.

Before : "app/**/*.{js,es6,jsx}"

After : "app/**/*.{js,es6,jsx,scss,css}"

Inspired by prettier we welcomed rubocop

Now that JavaScript and CSS files are covered we started looking at other places where we can get this productivity gain.

Since we write a lot of Ruby code we turned our attention to rubocop.

It turned out that “rubocop” already had a feature to automatically format the code.

Open package.json and change lint-staged section to following

"app/**/*.{js,es6,jsx,scss,css}": [
  "./node_modules/prettier/bin/prettier.js --single-quote --trailing-comma es5 --write",
  "git add"
],
"{app,test}/**/*.rb": [
  "bundle exec rubocop -a",
  "git add"
]

Open Gemfile and add following line.

group :development do
  gem "rubocop"
end

The behavior of rubocop can be controlled by .rubocop.yml file. If you want to get started with the rubocop file that Rails uses then just execute following command at the root of your Rails application.

wget https://raw.githubusercontent.com/rails/rails/master/.rubocop.yml

Open the downloaded file and change the TargetRubyVersion value to match the Ruby version the project is using.

Execute rubocop on all Ruby files.

bundle install
bundle exec rubocop -a "{app}/**/*.rb"

Code is changed on git commit and not on git add

We noticed that some people were a bit confused when git add did not format the code.

Code is formatted when git commit is done.

npm install is important

It’s important to note that users need to do npm install for all this to work. Otherwise prettier or rubocop won’t be activated and they will silently fail.

Full package.json file

After all the changes are done, package.json should look like the one shown below.

{
  "scripts": {
    "precommit": "lint-staged"
  },
  "lint-staged": {
    "app/**/*.{js,es6,jsx,scss,css}": [
      "./node_modules/prettier/bin/prettier.js --trailing-comma es5 --write",
      "git add"
    ],
    "{app,test}/**/*.rb": [
      "bundle exec rubocop -a",
      "git add"
    ]
  },
  "devDependencies": {
    "husky": "^0.13.4",
    "lint-staged": "^3.6.0",
    "prettier": "^1.4.2"
  }
}

Rails 5.1 adds delegate_missing_to

This blog is part of our Rails 5.1 series.

When we use method_missing, we should also define respond_to_missing?. This makes the code verbose, since method_missing and respond_to_missing? need to move in tandem.

DHH in the issue itself provided a good example of this verbosity.

class Partition
  def initialize(first_event)
    @events = [ first_event ]
  end

  def people
    if @events.first.detail.people.any?
      @events.collect { |e| Array(e.detail.people) }.flatten.uniq
    else
      @events.collect(&:creator).uniq
    end
  end

  private
    def respond_to_missing?(name, include_private = false)
      @events.respond_to?(name, include_private)
    end

    def method_missing(method, *args, &block)
      @events.public_send(method, *args, &block)
    end
end

He proposed to use a new method delegate_missing_to. Here is how it can be used.

class Partition
  delegate_missing_to :@events

  def initialize(first_event)
    @events = [ first_event ]
  end

  def people
    if @events.first.detail.people.any?
      @events.collect { |e| Array(e.detail.people) }.flatten.uniq
    else
      @events.collect(&:creator).uniq
    end
  end
end

Why not SimpleDelegator

We at BigBinary have used SimpleDelegator. However one issue with this is that statically we do not know to what object the calls are getting delegated to since at run time the delegator could be anything.

DHH had following to say about this pattern.

I prefer not having to hijack the inheritance tree for such a simple feature.

Why not delegate method

Delegate method works. However here we need to white list all the methods and in some cases the list can get really long. Following is a real example from a real project.

delegate :browser_status, :browser_stats_present?,
         :browser_failed_count, :browser_passed_count,
         :sequential_id, :project, :initiation_info,
         :test_run, :success?,
         to: :test_run_browser_stats

Delegate everything

Sometimes we just want to delegate all missing methods. In such cases method delegate_missing_to does the job neatly. Note that the delegation happens to only public methods of the object being delegated to.
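
To make the mechanics concrete, here is a simplified plain-Ruby sketch of the idea. It is not the Active Support implementation: the module and macro names (SimpleDelegateMissing, simple_delegate_missing_to) are hypothetical, and it only supports instance-variable targets such as :@events.

```ruby
# Hypothetical, simplified stand-in for delegate_missing_to.
module SimpleDelegateMissing
  # target is an instance variable name such as :@events.
  def simple_delegate_missing_to(target)
    # Report delegated methods through respond_to?, public ones only.
    define_method(:respond_to_missing?) do |name, include_private = false|
      instance_variable_get(target).respond_to?(name) || super(name, include_private)
    end

    # Forward unknown calls to the target if it exposes them publicly.
    define_method(:method_missing) do |name, *args, &block|
      receiver = instance_variable_get(target)
      if receiver.respond_to?(name)
        receiver.public_send(name, *args, &block)
      else
        super(name, *args, &block)
      end
    end
  end
end

class Partition
  extend SimpleDelegateMissing
  simple_delegate_missing_to :@events

  def initialize(first_event)
    @events = [first_event]
  end
end

partition = Partition.new(:launch)
event_count = partition.size   # forwarded to the @events array
first_event = partition.first  # forwarded to the @events array
```

Because the sketch checks respond_to? on the target without include_private, private methods of the target are never reached, which mirrors the public-only behavior described above.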

Check out the pull request for more details on this.

Using Kubernetes Configmap with configuration files for deploying Rails applications

This post assumes that you have basic understanding of Kubernetes terms like pods and deployments.

We deploy our Rails applications on Kubernetes and frequently do rolling deployments.

While performing application deployments on a Kubernetes cluster, we sometimes need to change an application configuration file. Changing this file means we need to change the source code, commit the change and then go through the complete deployment process.

This gets cumbersome for simple changes.

Let’s take the case of wanting to add queue in sidekiq configuration.

We should be able to change configuration and restart the pod instead of modifying the source-code, creating a new image and then performing a new deployment.

This is where Kubernetes’s ConfigMap comes in handy. It allows us to handle configuration files much more efficiently.

Now we will walk you through the process of managing sidekiq configuration file using configmap.

Starting with configmap

First we need to create a configmap. We can either create it using kubectl create configmap command or we can use a yaml template.

We will be using yaml template test-configmap.yml which already has sidekiq configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: test-staging-sidekiq
  labels:
    name: test-staging-sidekiq
  namespace: test
data:
  config: |-
    ---
    :verbose: true
    :environment: staging
    :pidfile: tmp/pids/sidekiq_1.pid
    :logfile: log/sidekiq_1.log
    :concurrency: 20
    :queues:
      - [default, 1]
    :dynamic: true
    :timeout: 300

The above template creates configmap in the test namespace and is only accessible in that namespace.

Let’s launch this configmap using following command.

$ kubectl create -f  test-configmap.yml
configmap "test-staging-sidekiq" created

After that let’s use this configmap to create our sidekiq.yml configuration file in deployment template named test-deployment.yml.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test-staging
  labels:
    app: test-staging
  namespace: test
spec:
  template:
    metadata:
      labels:
        app: test-staging
    spec:
      containers:
      - image: <your-repo>/<your-image-name>:latest
        name: test-staging
        imagePullPolicy: Always
        env:
        - name: REDIS_HOST
          value: test-staging-redis
        - name: APP_ENV
          value: staging
        - name: CLIENT
          value: test
        volumeMounts:
            - mountPath: /etc/sidekiq/config
              name: test-staging-sidekiq
        ports:
        - containerPort: 80
      volumes:
        - name: test-staging-sidekiq
          configMap:
             name: test-staging-sidekiq
             items:
              - key: config
                path:  sidekiq.yml
      imagePullSecrets:
        - name: registrykey

Now let’s create a deployment using above template.

$ kubectl create -f  test-deployment.yml
deployment "test-staging" created

Once the deployment is created, pod running from that deployment will start sidekiq using the sidekiq.yml mounted at /etc/sidekiq/config/sidekiq.yml.

Let’s check this on the pod.

deployer@test-staging-2766611832-jst35:~$ cat /etc/sidekiq/config/sidekiq.yml
---
:verbose: true
:environment: staging
:pidfile: tmp/pids/sidekiq_1.pid
:logfile: log/sidekiq_1.log
:concurrency: 20
:timeout: 300
:dynamic: true
:queues:
  - [default, 1]

Our sidekiq process uses this configuration to start sidekiq. Looks like configmap did its job.

Further if we want to add one new queue to sidekiq, we can simply modify the configmap template and restart the pod.

For example if we want to add mailer queue we will modify template as shown below.

apiVersion: v1
kind: ConfigMap
metadata:
  name: test-staging-sidekiq
  labels:
    name: test-staging-sidekiq
  namespace: test
data:
  config: |-
    ---
    :verbose: true
    :environment: staging
    :pidfile: tmp/pids/sidekiq_1.pid
    :logfile: log/sidekiq_1.log
    :concurrency: 20
    :queues:
      - [default, 1]
      - [mailer, 1]
    :dynamic: true
    :timeout: 300

Let’s launch this configmap using following command.

$ kubectl apply -f  test-configmap.yml
configmap "test-staging-sidekiq" configured

Once the pod is restarted, it will use the new sidekiq configuration fetched from the configmap.

In this way, we keep our Rails application configuration files out of the source-code and tweak them as needed.
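
If we want to double check from Ruby which queues a mounted configuration file declares, a small script like the one below can help. This is only a sketch: the queue_names helper is hypothetical, and a temporary file stands in for the path mounted from the configmap.

```ruby
require 'yaml'
require 'tempfile'

# Sample content in the same shape as the configmap data above.
SIDEKIQ_CONFIG = <<~CONFIG
  ---
  :concurrency: 20
  :queues:
    - [default, 1]
    - [mailer, 1]
CONFIG

# Hypothetical helper: returns the queue names declared in a sidekiq config file.
def queue_names(path)
  config = YAML.load(File.read(path))
  config.fetch(:queues).map { |name, _weight| name }
end

# The temporary file stands in for /etc/sidekiq/config/sidekiq.yml on the pod.
names = Tempfile.create('sidekiq') do |file|
  file.write(SIDEKIQ_CONFIG)
  file.flush
  queue_names(file.path)
end
```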

Rails 5.1 adds support for limit in batch processing

This blog is part of our Rails 5.1 series.

Before Rails 5.1, we were not able to limit the number of records fetched in batch processing.

Let’s take an example. Assume our system has 20 users.

 User.find_each{ |user| puts user.id }

The above code will print ids of all the 20 users.

There was no way to limit the number of records. Active Record’s limit method didn’t work for batches.

 User.limit(10).find_each{ |user| puts user.id }

The above code still prints ids of all 20 users, even though the intention was to limit the records fetched to 10.

Rails 5.1 has added support to limit the records in batch processing.

 User.limit(10).find_each{ |user| puts user.id }

The above code will print only 10 ids in Rails 5.1.

We can make use of limit in find_in_batches and in_batches as well.

 total_count = 0

 User.limit(10).find_in_batches(batch_size: 4) do |batch|
   total_count += batch.count
 end

 total_count
#=> 10
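
To see how the limit and the batch size interact, here is a plain Ruby analogy that does not touch Active Record. The array of ids stands in for the 20 users; the slicing only mimics the shape of the batches.

```ruby
# Stand-in for the ids of our 20 users.
user_ids = (1..20).to_a

# limit(10) caps the total number of records...
limited = user_ids.first(10)

# ...and batch_size: 4 splits those records into batches.
batch_sizes = limited.each_slice(4).map(&:size)
total_count = batch_sizes.sum
```

With a limit of 10 and a batch size of 4, we get two full batches and a final batch of 2 records.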

Rails 5.1 does not share thread_mattr_accessor variable with subclass

This blog is part of our Rails 5.1 series.

Rails 5.0 provides thread_mattr_accessor to define class level variables on a per thread basis.

However, the variable was getting shared with child classes as well. That meant when a child class changed value of the variable, then its effect was seen in the parent class.

class Player
  thread_mattr_accessor :alias
end

class PowerPlayer < Player
end

Player.alias = 'Gunner'
PowerPlayer.alias = 'Bomber'

> PowerPlayer.alias
#=> "Bomber"

> Player.alias
#=> "Bomber"

This isn’t the intended behavior as per OOP norms.

In Rails 5.1 this problem was resolved. Now a change in value of thread_mattr_accessor in child class will not affect value in its parent class.

class Player
  thread_mattr_accessor :alias
end

class PowerPlayer < Player
end

Player.alias = 'Gunner'
PowerPlayer.alias = 'Bomber'

> PowerPlayer.alias
#=> "Bomber"

> Player.alias
#=> "Gunner"
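
One way to picture the new behavior is a thread-local store whose key includes the name of the class that receives the call. The sketch below is plain Ruby and is not the Active Support implementation; the module and macro names (ThreadLocalClassAttr, thread_local_attr) are hypothetical.

```ruby
# Hypothetical, simplified take on a per-thread, per-class accessor.
module ThreadLocalClassAttr
  def thread_local_attr(sym)
    # `name` is evaluated on the receiving class, so a subclass gets its own key.
    define_singleton_method(sym) do
      Thread.current["attr_#{name}_#{sym}"]
    end

    define_singleton_method("#{sym}=") do |value|
      Thread.current["attr_#{name}_#{sym}"] = value
    end
  end
end

class Player
  extend ThreadLocalClassAttr
  thread_local_attr :alias
end

class PowerPlayer < Player
end

Player.alias = 'Gunner'
PowerPlayer.alias = 'Bomber'

# A different thread starts with no value at all.
other_thread_value = Thread.new { Player.alias }.value
```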

Rails 5.1 introduced assert_changes and assert_no_changes

This blog is part of our Rails 5.1 series.

Rails 5.1 has introduced assert_changes and assert_no_changes. They can be seen as more generic versions of assert_difference and assert_no_difference.

assert_changes

assert_changes asserts that the value of an expression differs before and after invoking the block. The expression can be given as a string, as with assert_difference.

@user = users(:john)
assert_changes 'users(:john).status' do
  post :update, params: {id: @user.id, user: {status: 'online'}}
end

We can also pass a lambda as an expression.

@user = users(:john)
assert_changes -> {users(:john).status} do
  post :update, params: {id: @user.id, user: {status: 'online'}}
end

assert_changes also allows options :from and :to to specify initial and final state of expression.

@light = Light.new
assert_changes -> { @light.status }, from: 'off', to: 'on' do
  @light.turn_on
end

We can also specify test failure message.

@invoice = invoices(:bb_client)
assert_changes -> { @invoice.status }, 'Expected the invoice to be marked paid', to: 'paid' do
  @invoice.make_payment
end

assert_no_changes

assert_no_changes has same options and asserts that the expression doesn’t change before and after invoking the block.
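
Conceptually, both assertions evaluate the expression before and after the block and compare the two results. The plain Ruby sketch below shows that idea; value_changed? and the Light struct are hypothetical and are not part of Rails.

```ruby
# Hypothetical helper: true when the expression's value differs after the block.
def value_changed?(expression)
  before = expression.call
  yield
  before != expression.call
end

# A tiny stand-in for the Light model used above.
Light = Struct.new(:status) do
  def turn_on
    self.status = 'on'
  end
end

light = Light.new('off')

# What assert_changes would check: the status goes from 'off' to 'on'.
changed = value_changed?(-> { light.status }) { light.turn_on }

# What assert_no_changes would check: only reading the status changes nothing.
changed_again = value_changed?(-> { light.status }) { light.status.length }
```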

Forward ActiveRecord::Relation#count to Enumerable#count if block given

This blog is part of our Rails 5.1 series.

Let’s say that we want to know all the deliveries in progress for an order.

The following code would do the job.

class Order
  has_many :deliveries

  def num_deliveries_in_progress
    deliveries.select { |delivery| delivery.in_progress? }.size
  end

end

But usage of count should make more sense over a select, right?

class Order
  has_many :deliveries

  def num_deliveries_in_progress
    deliveries.count { |delivery| delivery.in_progress? }
  end

end

However the changed code would return count for all the order deliveries, rather than returning only the ones in progress.

That’s because, before Rails 5.1, ActiveRecord::Relation#count silently discarded the block argument.

Rails 5.1 fixed this issue.

module ActiveRecord
  module Calculations

    def count(column_name = nil)
      if block_given?
        to_a.count { |*block_args| yield(*block_args) }
      else
        calculate(:count, column_name)
      end
    end

  end
end

So now, we can pass a block to count method.
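
Since the block form calls to_a, the counting happens in Ruby on the loaded records, just like Enumerable#count on an array. The sketch below uses a hypothetical Delivery struct in place of the Active Record model to show the difference between the two forms.

```ruby
# Stand-in for the Delivery model; status values are made up for illustration.
Delivery = Struct.new(:status) do
  def in_progress?
    status == :in_progress
  end
end

deliveries = [
  Delivery.new(:in_progress),
  Delivery.new(:delivered),
  Delivery.new(:in_progress)
]

# With a block: counts only the elements for which the block is truthy.
in_progress_count = deliveries.count { |delivery| delivery.in_progress? }

# Without a block: counts everything.
total_count = deliveries.count
```

Because the records are loaded into memory first, for large associations it may still be preferable to express the condition in a where clause when that is possible.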

Rails 5.1 has introduced Date#all_day helper

Sometimes, we want to query records over the whole day for a given date.

>> User.where(created_at: Date.today.beginning_of_day..Date.today.end_of_day)

=> SELECT "users".* FROM "users" WHERE ("users"."created_at" BETWEEN $1 AND $2) [["created_at", 2017-04-09 00:00:00 UTC], ["created_at", 2017-04-09 23:59:59 UTC]]

Rails 5.1 has introduced a helper method for creating this range object for a given date in the form of Date#all_day.

>> User.where(created_at: Date.today.all_day)

=> SELECT "users".* FROM "users" WHERE ("users"."created_at" BETWEEN $1 AND $2) [["created_at", 2017-04-09 00:00:00 UTC], ["created_at", 2017-04-09 23:59:59 UTC]]

We can confirm that the Date#all_day method returns the range object for a given date.

>> Date.today.all_day

=> Sun, 09 Apr 2017 00:00:00 UTC +00:00..Sun, 09 Apr 2017 23:59:59 UTC +00:00
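
For intuition, here is a rough plain Ruby equivalent of the range. The all_day_utc helper is hypothetical: it is fixed to UTC and works at one-second precision, so it ignores the time zone handling that Rails provides.

```ruby
require 'date'

# Hypothetical plain-Ruby stand-in for Date#all_day, fixed to UTC.
def all_day_utc(date)
  start_of_day = Time.utc(date.year, date.month, date.day)
  end_of_day   = Time.utc(date.year, date.month, date.day, 23, 59, 59)
  start_of_day..end_of_day
end

range = all_day_utc(Date.new(2017, 4, 9))

# A timestamp during the day falls inside the range, the next midnight does not.
noon_included     = range.cover?(Time.utc(2017, 4, 9, 12, 30))
next_day_included = range.cover?(Time.utc(2017, 4, 10))
```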

Binding irb - Runtime Invocation for IRB

This blog is part of our Ruby 2.4 series.

It’s very common to see a ruby programmer write a few puts or p statements, either for debugging or for knowing the value of variables.

pry did make our lives easier with the usage of binding.pry. However, it was still a bit of an inconvenience to have it installed at runtime, while working with the irb.

Ruby 2.4 has now introduced binding.irb. By simply adding binding.irb to our code we can open an IRB session.

class ConvolutedProcess
  def do_something
    @variable = 10

    binding.irb
    # opens a REPL here
  end
end

irb(main):029:0* ConvolutedProcess.new.do_something
irb(#<ConvolutedProcess:0x007fc55c827f48>):001:0> @variable
=> 10

Using Kubernetes Persistent volume to store persistent data

In one of our projects we are running a Rails application on a Kubernetes cluster. Kubernetes is a proven tool for managing and deploying Docker containers in production.

In Kubernetes, containers run inside pods, and pods are managed using deployments. A deployment holds the specification of its pods and is responsible for running them with the specified resources. When a pod is restarted or a deployment is deleted, the data on the pod is lost. We need to retain data beyond the pod’s lifecycle, so that it survives when the pod or deployment is destroyed.

We use docker-compose in development. In docker-compose, linking a host directory to a container directory works out of the box. We wanted a similar mechanism for linking volumes with Kubernetes. Kubernetes offers various types of volumes. We chose a persistent volume backed by AWS EBS storage, and we used a persistent volume claim as per the needs of the application.

As per the definition of a Persistent Volume (PV), cluster administrators must first create the storage so that Kubernetes can mount it.

Our Kubernetes cluster is hosted on AWS. We created AWS EBS volumes which can be used to create persistent volume.

Let’s create a sample volume using aws cli and try to use it in the deployment.

aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2

This will create a volume in the us-east-1a availability zone. We need to note the VolumeId once the volume is created.

$ aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2
{
    "AvailabilityZone": "us-east-1a",
    "Encrypted": false,
    "VolumeType": "gp2",
    "VolumeId": "vol-123456we7890ilk12",
    "State": "creating",
    "Iops": 100,
    "SnapshotId": "",
    "CreateTime": "2017-01-04T03:53:00.298Z",
    "Size": 20
}

Now let’s create a persistent volume template test-pv.yml to create a volume using this EBS storage.

kind: PersistentVolume
apiVersion: v1
metadata:
  name: test-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    volumeID: <your-volume-id>
    fsType: ext4

Once we had the template to create the persistent volume, we used kubectl to launch it. kubectl is a command line tool for interacting with a Kubernetes cluster.

$ kubectl create -f  test-pv.yml
persistentvolume "test-pv" created

Once persistent volume is created you can check using following command.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM               REASON    AGE
test-pv     10Gi        RWX           Retain          Available                                7s

Now that our persistent volume is in available state, we can claim it by creating persistent volume claim policy.

We can define persistent volume claim using following template test-pvc.yml.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Let’s create the persistent volume claim using the above template.

$ kubectl create -f  test-pvc.yml

persistentvolumeclaim "test-pvc" created

After creating the persistent volume claim, our persistent volume will change from available state to bound state.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS     CLAIM               REASON    AGE
test-pv    10Gi        RWX           Retain          Bound      default/test-pvc              2m

$ kubectl get pvc
NAME        STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
test-pvc    Bound     test-pv   10Gi        RWX           1m

Now that we have a persistent volume claim available on our Kubernetes cluster, let’s use it in a deployment.

Deploying Kubernetes application

We will use following deployment template as test-pv-deployment.yml.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test-pv
  labels:
    app: test-pv
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: test-pv
        tier: frontend
    spec:
      containers:
      - image: <your-repo>/<your-image-name>:latest
        name: test-pv
        imagePullPolicy: Always
        env:
        - name: APP_ENV
          value: staging
        - name: UNICORN_WORKER_PROCESSES
          value: "2"
        volumeMounts:
        - name: test-volume
          mountPath: "/<path-to-my-app>/shared/data"
        ports:
        - containerPort: 80
      imagePullSecrets:
        - name: registrypullsecret
      volumes:
      - name: test-volume
        persistentVolumeClaim:
          claimName: test-pvc

Now launch the deployment using following command.

$ kubectl create -f  test-pv-deployment.yml
deployment "test-pv" created

Once the deployment is up and running, all the contents of the shared directory will be stored on the persistent volume. If the pod or deployment crashes for any reason, our data will be retained on the persistent volume, and we can use it when we launch the application deployment again.

This met our goal of retaining data across deployments and pod restarts.

Ruby 2.4 has added additional parameters for Logger#new

This blog is part of our Ruby 2.4 series.

The Logger class in Ruby provides a simple but sophisticated logging utility.

After creating the logger object we need to set its level.

Ruby 2.3

require 'logger'
logger = Logger.new(STDOUT)
logger.level = Logger::INFO

If we are working with ActiveRecord::Base.logger, then same code would look something like this.

require 'logger'
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Base.logger.level = Logger::INFO

As we can see, in both cases we need to set the level separately after instantiating the object.

Ruby 2.4

In Ruby 2.4, level can now be specified in the constructor.

#ruby 2.4
require 'logger'
logger = Logger.new(STDOUT, level: Logger::INFO)

# let's verify it
logger.level      #=> 1

Similarly, other options such as progname, formatter and datetime_format, which prior to Ruby 2.4 had to be explicitly set, can now be set during the instantiation.

#ruby 2.3
require 'logger'
logger = Logger.new(STDOUT)
logger.level = Logger::INFO
logger.progname = 'bigbinary'
logger.datetime_format = '%Y-%m-%d %H:%M:%S'
logger.formatter = proc do |severity, datetime, progname, msg|
  "#{severity} #{datetime} ==> App: #{progname}, Message: #{msg}\n"
end

logger.info("Program started...")
#=> INFO 2017-03-16 18:43:58 +0530 ==> App: bigbinary, Message: Program started...

Here is same stuff in Ruby 2.4.

#ruby 2.4
require 'logger'
logger = Logger.new(STDOUT,
  level: Logger::INFO,
  progname: 'bigbinary',
  datetime_format: '%Y-%m-%d %H:%M:%S',
  formatter: proc do |severity, datetime, progname, msg|
    "#{severity} #{datetime} ==> App: #{progname}, Message: #{msg}\n"
  end
)

logger.info("Program started...")
#=> INFO 2017-03-16 18:47:39 +0530 ==> App: bigbinary, Message: Program started...
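
To confirm that options passed to the constructor take effect right away, we can log to an in-memory StringIO instead of STDOUT. This is a small sketch; the messages and the WARN level are arbitrary choices for illustration.

```ruby
require 'logger'
require 'stringio'

# Capture log output in memory so that we can inspect it.
output = StringIO.new
logger = Logger.new(output, level: Logger::WARN, progname: 'bigbinary')

logger.info('not written')       # below WARN, so it is filtered out
logger.warn('disk almost full')  # at WARN, so it is written

lines = output.string.lines
```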

Ruby 2.4 has default basename for Tempfile#create

This blog is part of our Ruby 2.4 series.

Tempfile class

Tempfile is used for managing temporary files in Ruby. A Tempfile object creates a temporary file with a unique filename. It behaves just like a File object, and therefore we can perform all the usual file operations on it.

Why Tempfile when we can use File

These days it is common to store file on services like S3. Let’s say that we have a users.csv file on S3. Working with this file remotely is problematic. In such cases it is desirable to download the file on local machine for manipulation. After the work is done then file should be deleted. Tempfile is ideal for such cases.

Basename for tempfile

Prior to Ruby 2.3, creating a temporary file required passing a basename argument.

require 'tempfile'
file = Tempfile.new('bigbinary')
#=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/bigbinary-20170304-10828-1w02mqi>

As we can see above, the generated file name begins with the word “bigbinary”.

Since Tempfile ensures that the generated filename is always unique, passing this argument serves little purpose. The Ruby documentation calls this argument the “basename”.

So in Ruby 2.3.0 it was decided that the basename parameter should be optional for Tempfile.new, with an empty string as the default value.

require 'tempfile'
file = Tempfile.new
#=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/20170304-10828-1v855bf>
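As a small sketch of how the basename affects the generated name, the snippet below compares the file names produced with a basename, without one, and with the array form that also sets a suffix. The 'users' and '.csv' values are illustrative, and the no-argument call assumes Ruby 2.3 or later.

```ruby
require 'tempfile'

named    = Tempfile.new('bigbinary')
unnamed  = Tempfile.new                      # Ruby 2.3+
with_ext = Tempfile.new(['users', '.csv'])   # [prefix, suffix] form

named_base    = File.basename(named.path)
unnamed_base  = File.basename(unnamed.path)
with_ext_base = File.basename(with_ext.path)

puts named_base, unnamed_base, with_ext_base

# Clean up all three temporary files.
[named, unnamed, with_ext].each do |file|
  file.close
  file.unlink
end
```

The array form is handy when a downstream tool decides how to treat a file based on its extension.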

However, the same default was not implemented for Tempfile.create.

# Ruby 2.3.0
require 'tempfile'
Tempfile.create do |f|
  f.write "hello"
end

ArgumentError: wrong number of arguments (given 0, expected 1..2)

This was fixed in Ruby 2.4. So now the basename parameter for Tempfile.create defaults to an empty string, to keep it consistent with Tempfile.new.

# Ruby 2.4
require 'tempfile'
Tempfile.create do |f|
  f.write "hello"
end
#=> 5
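One practical reason to prefer the block form of Tempfile.create is that the file is removed when the block finishes. A minimal sketch, assuming Ruby 2.4 or later so that the basename can be omitted:

```ruby
require 'tempfile'

path = nil

# The block's return value becomes the return value of Tempfile.create.
bytes_written = Tempfile.create do |f|
  path = f.path
  f.write('hello')
end

puts bytes_written
puts File.exist?(path)   # the file is gone once the block has returned
```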

New arguments supported for float and integer modifiers in Ruby 2.4

This blog is part of our Ruby 2.4 series.

In Ruby, there are many methods available which help us to modify a float or integer value.

Ruby 2.3.x

In previous versions of Ruby, we could use methods such as floor, ceil and truncate in the following ways.

5.54.floor          #=> 5
5.54.ceil           #=> 6
5.54.truncate       #=> 5

Providing an argument to these methods would result in an ArgumentError exception.

Ruby 2.4

The Ruby community decided to add an optional precision argument to these methods.

The precision argument, which can be negative, helps us to get result to the required precision to either side of the decimal point.

The default value for the precision argument is 0.

876.543.floor(-2)       #=> 800
876.543.floor(-1)       #=> 870
876.543.floor           #=> 876
876.543.floor(1)        #=> 876.5
876.543.floor(2)        #=> 876.54

876.543.ceil(-2)        #=> 900
876.543.ceil(-1)        #=> 880
876.543.ceil            #=> 877
876.543.ceil(1)         #=> 876.6
876.543.ceil(2)         #=> 876.55

876.543.truncate(-2)    #=> 800
876.543.truncate(-1)    #=> 870
876.543.truncate        #=> 876
876.543.truncate(1)     #=> 876.5
876.543.truncate(2)     #=> 876.54

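These methods accept a precision argument on Integer as well. In Ruby 2.4, a positive precision returns a Float; from Ruby 2.5 onwards these methods always return an Integer.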

5.floor(2)              #=> 5.0
5.ceil(2)               #=> 5.0
5.truncate(2)           #=> 5.0
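For positive numbers floor and truncate give identical results, so the difference between them only shows up with negative values. The following sketch, with values chosen purely for illustration, also shows a negative precision applied to an integer.

```ruby
value = -876.543

floored   = value.floor(1)      # rounds towards negative infinity
ceiled    = value.ceil(1)       # rounds towards positive infinity
truncated = value.truncate(1)   # rounds towards zero

int_floor = 1234.floor(-2)      # nearest lower multiple of 100
int_ceil  = 1234.ceil(-2)       # nearest higher multiple of 100

p floored     #=> -876.6
p ceiled      #=> -876.5
p truncated   #=> -876.5
p int_floor   #=> 1200
p int_ceil    #=> 1300
```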

Ruby 2.4 introduces Enumerable#uniq and Enumerator::Lazy#uniq

This blog is part of our Ruby 2.4 series.

In Ruby, we commonly use the uniq method on an array to fetch the collection of all unique elements. But there are cases where we need the entries of a hash that are unique by value.

Let’s consider an example of countries that have hosted the Olympics. We only want to know the first time each country hosted it.

# given object
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome' }

# expected outcome
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1908 => 'Rome' }

One way to achieve this is to have a collection of unique country names and then check if that value is already taken while building the result.

olympics =
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome' }

unique_nations = olympics.values.uniq

olympics.select{ |year, country| !unique_nations.delete(country).nil? }
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}

As we can see, the above code requires constructing an additional array unique_nations.

When processing large data sets, loading a big array into memory and then carrying out further processing on it may result in performance and memory issues.

In Ruby 2.4, the Enumerable module introduces a uniq method that collects unique elements while iterating over the enumerable object.

The usage is similar to that of Array#uniq. Uniqueness can be determined by the elements themselves or by a value yielded by the block passed to the uniq method.

olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}

olympics.uniq { |year, country| country }.to_h
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
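Because uniq now lives in Enumerable, any class that includes Enumerable and defines each gets it as well. Here is a minimal sketch with a hypothetical HostList class, assuming Ruby 2.4 or later:

```ruby
# HostList is a made-up class used only to illustrate Enumerable#uniq.
class HostList
  include Enumerable

  def initialize(pairs)
    @pairs = pairs
  end

  # Enumerable only requires each; uniq is provided on top of it.
  def each(&block)
    @pairs.each(&block)
  end
end

hosts = HostList.new([[1896, 'Athens'], [1900, 'Paris'], [1906, 'Athens']])

first_hosts = hosts.uniq { |_year, place| place }
p first_hosts   #=> [[1896, "Athens"], [1900, "Paris"]]
```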

A similar method is also implemented in the Enumerator::Lazy class. Hence we can now call uniq on lazy enumerators.

(1..Float::INFINITY).lazy.uniq { |x| (x**2) % 10 }.first(6)
#=> [1, 2, 3, 4, 5, 10]
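To see that the lazy version only evaluates as many elements as it needs, the sketch below records every element that passes through the chain. The evaluated array is just an illustrative probe.

```ruby
evaluated = []

remainders = (1..Float::INFINITY).lazy
  .map { |n| evaluated << n; n % 3 }   # record each element that is pulled
  .uniq
  .first(3)

p remainders   #=> [1, 2, 0]
p evaluated    #=> [1, 2, 3]
```

Only three elements of the infinite range are consumed, because first(3) stops pulling values as soon as three unique results are available.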

Ruby 2.4 has optimized lstrip and strip methods for ASCII strings

This blog is part of our Ruby 2.4 series.

Ruby has lstrip and rstrip methods which can be used to remove leading and trailing whitespace respectively from a string.

Ruby also has a strip method, which combines lstrip and rstrip and removes both leading and trailing whitespace from a string.

"    Hello World    ".lstrip    #=> "Hello World    "
"    Hello World    ".rstrip    #=> "    Hello World"
"    Hello World    ".strip     #=> "Hello World"

Prior to Ruby 2.4, the rstrip method was optimized for performance, but lstrip and strip were missed. In Ruby 2.4, String#lstrip and String#strip have also been optimized to get the same performance benefit as String#rstrip.

Let’s run the following snippet in Ruby 2.3 and Ruby 2.4 to benchmark and compare the performance improvement.

require 'benchmark/ips'

Benchmark.ips do |bench|
  str1 = " " * 10_000_000 + "hello world" + " " * 10_000_000
  str2 = str1.dup
  str3 = str1.dup

  bench.report('String#lstrip') do
    str1.lstrip
  end

  bench.report('String#rstrip') do
    str2.rstrip
  end

  bench.report('String#strip') do
    str3.strip
  end
end

Result for Ruby 2.3

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     8.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     10.989  (± 0.0%) i/s -     55.000  in   5.010903s
       String#rstrip     92.514  (± 5.4%) i/s -    464.000  in   5.032208s
        String#strip     10.170  (± 0.0%) i/s -     51.000  in   5.022118s

Result for Ruby 2.4

Warming up --------------------------------------
       String#lstrip    14.000  i/100ms
       String#rstrip     8.000  i/100ms
        String#strip     6.000  i/100ms
Calculating -------------------------------------
       String#lstrip    143.424  (± 4.2%) i/s -    728.000  in   5.085311s
       String#rstrip     89.150  (± 5.6%) i/s -    448.000  in   5.041301s
        String#strip     67.834  (± 4.4%) i/s -    342.000  in   5.051584s

From the above results, we can see that in Ruby 2.4, String#lstrip is around 13x faster (143.4 vs 11.0 i/s) while String#strip is around 6.7x faster (67.8 vs 10.2 i/s). String#rstrip, as expected, has nearly the same performance, as it was already optimized in previous versions.

Performance remains same for multi-byte strings

Strings can have single byte or multi-byte characters.

For example, Lé hello world is a multi-byte string because of the presence of é, which is a multi-byte character.

'e'.bytesize        #=> 1
'é'.bytesize        #=> 2

Let’s do performance benchmarking with string Lé hello world instead of hello world.

Result for Ruby 2.3

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     1.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     11.147  (± 9.0%) i/s -     56.000  in   5.034363s
       String#rstrip      8.693  (± 0.0%) i/s -     44.000  in   5.075011s
        String#strip      5.020  (± 0.0%) i/s -     26.000  in   5.183517s

Result for Ruby 2.4

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     1.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     10.691  (± 0.0%) i/s -     54.000  in   5.055101s
       String#rstrip      9.524  (± 0.0%) i/s -     48.000  in   5.052678s
        String#strip      4.860  (± 0.0%) i/s -     25.000  in   5.152804s

As we can see, the performance for multi-byte strings is almost the same across Ruby 2.3 and Ruby 2.4.

Explanation

The optimization is related to how strings are scanned to detect whitespace. Checking for whitespace in a multi-byte string requires additional overhead. So the patch adds an initial check for whether the string is a single-byte string, and if so, processes it separately.

In most cases strings are single-byte, so the performance improvement would be visible and helpful.
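The actual check happens inside Ruby's C implementation, but a rough Ruby-level way to tell which kind of string we are dealing with is String#ascii_only?. The sketch below uses it as an approximation only, not as the interpreter's internal test.

```ruby
ascii = 'hello world'
multi = "L\u00e9 hello world"   # "Lé hello world"

ascii_flag = ascii.ascii_only?
multi_flag = multi.ascii_only?

# é takes two bytes in UTF-8, so bytesize exceeds length by one.
extra_bytes = multi.bytesize - multi.length

# The result of strip is the same either way; only the speed differs.
stripped = ("   " + multi + "   ").strip

p ascii_flag    #=> true
p multi_flag    #=> false
p extra_bytes   #=> 1
```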

IO#readlines now accepts chomp flag as an argument

This blog is part of our Ruby 2.4 series.

Consider the following file which needs to be read in Ruby. We can use the IO#readlines method to get the lines in an array.

# lotr.txt

Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.

Ruby 2.3

IO.readlines('lotr.txt')
#=> ["Three Rings for the Elven-kings under the sky,\n", "Seven for the Dwarf-lords in their halls of stone,\n", "Nine for Mortal Men doomed to die,\n", "One for the Dark Lord on his dark throne\n", "In the Land of Mordor where the Shadows lie."]

As we can see, each line in the array except the last ends with \n, the newline character, which is not stripped while reading the lines. In most cases the newline character needs to be removed. Prior to Ruby 2.4, it could be done in the following way.

IO.readlines('lotr.txt').map(&:chomp)
#=> ["Three Rings for the Elven-kings under the sky,", "Seven for the Dwarf-lords in their halls of stone,", "Nine for Mortal Men doomed to die,", "One for the Dark Lord on his dark throne", "In the Land of Mordor where the Shadows lie."]

Ruby 2.4

Since it was a common requirement, the Ruby team decided to add an optional parameter to the readlines method. So the same can now be achieved in Ruby 2.4 in the following way.

IO.readlines('lotr.txt', chomp: true)
#=> ["Three Rings for the Elven-kings under the sky,", "Seven for the Dwarf-lords in their halls of stone,", "Nine for Mortal Men doomed to die,", "One for the Dark Lord on his dark throne", "In the Land of Mordor where the Shadows lie."]

Additionally, the IO#gets, IO#readline, IO#each_line and IO.foreach methods have also been modified to accept an optional chomp flag.
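Here is a short sketch of the chomp flag on some of these other methods. It writes a few illustrative lines to a temporary file rather than relying on lotr.txt being present, and assumes Ruby 2.4 or later.

```ruby
require 'tempfile'

via_foreach   = nil
first_line    = nil
via_each_line = nil

Tempfile.create('lotr') do |file|
  file.write("Three Rings\nSeven\nNine")
  file.flush   # make sure the content is on disk before reading by path

  via_foreach = File.foreach(file.path, chomp: true).to_a

  file.rewind
  first_line = file.gets(chomp: true)

  file.rewind
  via_each_line = file.each_line(chomp: true).to_a
end

p via_foreach     #=> ["Three Rings", "Seven", "Nine"]
p first_line      #=> "Three Rings"
p via_each_line   #=> ["Three Rings", "Seven", "Nine"]
```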

open-uri in Ruby 2.4 allows http to https redirection

In Ruby 2.3, if the URL passed to open-uri uses http and the host redirects to https, open-uri raises an error.

> require 'open-uri'
> open('http://www.google.com/gmail')

RuntimeError: redirection forbidden: http://www.google.com/gmail -> https://www.google.com/gmail/

To get around this issue, we could use the open_uri_redirections gem.

> require 'open-uri'
> require 'open_uri_redirections'
> open('http://www.google.com/gmail/', :allow_redirections => :safe)

=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/open-uri20170228-41042-2fffoa>

Ruby 2.4

In Ruby 2.4, this issue is fixed. So now http to https redirection is possible using open-uri.

> require 'open-uri'
> open('http://www.google.com/gmail')
=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/open-uri20170228-41077-1bkm1dv>

Note that redirection from https to http will raise an error, like it did in previous versions, since that has possible security concerns.
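The redirection rule described above can be illustrated with a small helper. This is a simplified, hypothetical redirect_allowed? method written for this post, not open-uri's actual implementation, which may apply further rules for other schemes.

```ruby
require 'uri'

# Hypothetical helper: returns true when following a redirect from one URL
# to another keeps the same scheme or upgrades from http to https.
def redirect_allowed?(from, to)
  from_scheme = URI(from).scheme.to_s.downcase
  to_scheme   = URI(to).scheme.to_s.downcase

  return true if from_scheme == to_scheme

  from_scheme == 'http' && to_scheme == 'https'
end

upgrade   = redirect_allowed?('http://www.google.com/gmail', 'https://www.google.com/gmail/')
downgrade = redirect_allowed?('https://www.google.com/gmail/', 'http://www.google.com/gmail')

p upgrade     #=> true
p downgrade   #=> false
```

A downgrade is rejected because anything sent over the secure connection could otherwise be exposed over plain http.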