
Debugging failing tests in puppeteer because of background tab

We have been using puppeteer in one of our projects to write end-to-end tests. We run our tests in headful mode to see the browser in action.

If we start the puppeteer tests and do nothing on our laptop (just watch the tests being executed), then all the tests pass.

However, if we do our regular work on our laptop while the tests are running, tests fail randomly. This was quite puzzling.

Debugging such flaky tests is hard. We first suspected that the test cases themselves needed more implicit waits for elements/text to be present/visible in the DOM.

After some debugging using puppeteer protocol logs, it seemed like the browser was performing certain actions very slowly or was waiting for the browser to be active (in view) before performing those actions.

Starting with version 57, Chrome introduced throttling of background tabs to improve performance and battery life. We execute one test per browser, meaning we didn't make use of multiple tabs. Also, tests failed only when the user was performing some other activity while the tests were executing in background windows. Pages were hidden only when the user switched tabs or minimized the browser window containing the tab.

After observing closely, we noticed that the pages were making requests to the server. The issue was that the page was not painting when it was not in view. We added the flag --disable-background-timer-throttling but did not notice any difference.

After doing some searching, we noticed that the flag --disable-renderer-backgrounding was being used in karma-launcher. The comment states that it is specifically required on macOS. Here is the code responsible for lowering the priority of the renderer when it is hidden.

But the new flag didn’t help either.

While looking at all the available command line switches for Chromium, we stumbled upon --disable-backgrounding-occluded-windows. Chromium also backgrounds the renderer while the window is not visible to the user. It seems from the comment that the flag kDisableBackgroundingOccludedWindowsForTesting was specifically added to avoid non-deterministic behavior during tests.

We added the following flags to Chromium for running our integration suite, and this solved our problem.

const chromeArgs = [
  '--disable-background-timer-throttling',
  '--disable-backgrounding-occluded-windows',
  '--disable-renderer-backgrounding'
];
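
For reference, here is a minimal sketch of how these flags can be passed while launching the browser with puppeteer.launch, re-using the chromeArgs array defined above; the headless option and the surrounding test harness are illustrative and not part of our actual suite.

const puppeteer = require('puppeteer');

(async () => {
  // Launch in headful mode with the throttling-related switches disabled.
  const browser = await puppeteer.launch({ headless: false, args: chromeArgs });
  const page = await browser.newPage();
  // ... run the end-to-end test steps against `page` here ...
  await browser.close();
})();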


Using Kubernetes ingress controller for authenticating applications

Kubernetes Ingress has redefined routing in this era of containerization, and with all these freehand routing techniques the thought of "my router, my rules" feels real.

We use nginx-ingress as a routing service for our applications. There is a lot more than routing we can do with ingress. One of the important features is setting up authentication for our application using ingress. As all the traffic goes through ingress to reach our service, it makes sense to set up authentication on the ingress.

As mentioned in the ingress repository, there are different techniques available for authentication, including:

  • Basic authentication
  • Client-certs authentication
  • External authentication
  • OAuth external authentication

In this blog, we will set up authentication for a sample application using the basic authentication technique in ingress.

Pre-requisites

First, let's create the ingress controller resources from the upstream example by running the following command.

$ kubectl create -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/mandatory.yaml
namespace "ingress-nginx" created
deployment "default-http-backend" created
service "default-http-backend" created
configmap "nginx-configuration" created
configmap "tcp-services" created
configmap "udp-services" created
serviceaccount "nginx-ingress-serviceaccount" created
clusterrole "nginx-ingress-clusterrole" created
role "nginx-ingress-role" created
rolebinding "nginx-ingress-role-nisa-binding" created
clusterrolebinding "nginx-ingress-clusterrole-nisa-binding" created
deployment "nginx-ingress-controller" created

Now that the ingress controller resources are created, we need a service to access the ingress.

Use the following manifest to create a service for the ingress.

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
  labels:
    k8s-addon: ingress-nginx.addons.k8s.io
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: http
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app: ingress-nginx
  type: LoadBalancer

Now, get the ELB endpoint and bind it with some domain name.

$ kubectl create -f ingress-service.yml
service "ingress-nginx" created

$ kubectl -n ingress-nginx get svc  ingress-nginx -o wide
NAME            CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE       SELECTOR
ingress-nginx   100.71.250.56   abcghccf8540698e8bff782799ca8h04-1234567890.us-east-2.elb.amazonaws.com   80:30032/TCP,443:30108/TCP   10s       app=ingress-nginx

Let's create a deployment and a service for our sample application, Kibana. We need Elasticsearch to run Kibana.

Here is the manifest for the sample application.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: kibana
  name: kibana
  namespace: ingress-nginx
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
       - image: kibana:latest
         name: kibana
         ports:
           - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: kibana
  name: kibana
  namespace: ingress-nginx

spec:
  ports:
  - name: kibana
    port: 5601
    targetPort: 5601
  selector:
    app: kibana
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: elasticsearch
  name: elasticsearch
  namespace: ingress-nginx
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
       - image: elasticsearch:latest
         name: elasticsearch
         ports:
           - containerPort: 9200
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: elasticsearch
  name: elasticsearch
  namespace: ingress-nginx
spec:
  ports:
  - name: elasticsearch
    port: 9200
    targetPort: 9200
  selector:
    app: elasticsearch

Create the sample application.

kubectl apply -f kibana.yml
deployment "kibana" created
service "kibana" created
deployment "elasticsearch" created
service "elasticsearch" created

Now that we have created the application and ingress controller resources, it's time to create an ingress and access the application.

Use the following manifest to create the ingress.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
  name: kibana-ingress
  namespace: ingress-nginx
spec:
  rules:
    - host: logstest.myapp-staging.com
      http:
        paths:
          - path: /
            backend:
              serviceName: kibana
              servicePort: 5601

$ kubectl -n ingress-nginx create -f ingress.yml
ingress "kibana-ingress" created

Now that our application is up, we can access the Kibana dashboard using the URL http://logstest.myapp-staging.com. We get direct access to the dashboard, and anyone with this URL can access the logs, as shown in the following image.

Kibana dashboard without authentication

Now, let's set up basic authentication using htpasswd.

Run the following commands to generate a secret for the credentials.

First, let's create an auth file with a username and password.

$ htpasswd -c auth kibanaadmin
New password: <kibanaadmin>
Re-type new password: <kibanaadmin>
Adding password for user kibanaadmin

Create k8s secret.

$ kubectl -n ingress-nginx create secret generic basic-auth --from-file=auth
secret "basic-auth" created

Verify the secret.

$ kubectl -n ingress-nginx get secret basic-auth -o yaml
apiVersion: v1
data:
  auth: Zm9vOiRhcHIxJE9GRzNYeWJwJGNrTDBGSERBa29YWUlsSDkuY3lzVDAK
kind: Secret
metadata:
  name: basic-auth
  namespace: ingress-nginx
type: Opaque

Now, add the following annotations to the ingress by editing the ingress manifest.

$ kubectl -n ingress-nginx edit ingress kibana-ingress

Paste the following annotations under the metadata section.

nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Kibana Authentication Required - kibanaadmin"
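
For reference, here is how the updated ingress manifest might look with these annotations in place; this is just the earlier manifest combined with the annotations above.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Kibana Authentication Required - kibanaadmin"
  name: kibana-ingress
  namespace: ingress-nginx
spec:
  rules:
    - host: logstest.myapp-staging.com
      http:
        paths:
          - path: /
            backend:
              serviceName: kibana
              servicePort: 5601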

Now that the ingress is updated, hit the URL again, and as shown in the image below, we are asked for authentication.

Kibana dashboard asking for authentication


Ruby 2.6 adds write_timeout to Net::HTTP

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, if we created a large request with Net::HTTP, it would hang forever until the request was interrupted. To fix this issue, the write_timeout attribute and the write_timeout= method have been added to Net::HTTP in Ruby 2.6. The default value for write_timeout is 60 seconds, and it can be set to an integer or a float value.

Similarly, the write_timeout attribute and the write_timeout= method have been added to the Net::BufferedIO class.

If any chunk of the request is not written within the number of seconds provided to write_timeout, a Net::WriteTimeout exception is raised. The Net::WriteTimeout exception is not raised on Windows systems.

Example
# server.rb

require 'socket'

server = TCPServer.new('localhost', 2345)
loop do
  socket = server.accept
end
Ruby 2.5.1
# client.rb

require 'net/http'

connection = Net::HTTP.new('localhost', 2345)
connection.open_timeout = 1
connection.read_timeout = 3
connection.start

post = Net::HTTP::Post.new('/')
body = (('a' * 1023) + "\n") * 5_000
post.body = body

puts "Sending #{body.bytesize} bytes"
connection.request(post)
Output
$ RBENV_VERSION=2.5.1 ruby client.rb

Sending 5120000 bytes

Ruby 2.5.1 processes the request endlessly unless the above program is interrupted.

Ruby 2.6.0-dev

Add the write_timeout attribute to the Net::HTTP instance in the client.rb program.

# client.rb

require 'net/http'

connection = Net::HTTP.new('localhost', 2345)
connection.open_timeout = 1
connection.read_timeout = 3

# set write_timeout to 10 seconds
connection.write_timeout = 10

connection.start

post = Net::HTTP::Post.new('/')
body = (('a' * 1023) + "\n") * 5_000
post.body = body

puts "Sending #{body.bytesize} bytes"
connection.request(post)
Output
$ RBENV_VERSION=2.6.0-dev ruby client.rb

Sending 5120000 bytes
Traceback (most recent call last):
    13: from client.rb:17:in `<main>'
    12: from /net/http.rb:1479:in `request'
    11: from /net/http.rb:1506:in `transport_request'
    10: from /net/http.rb:1506:in `catch'
     9: from /net/http.rb:1507:in `block in transport_request'
     8: from /net/http/generic_request.rb:123:in `exec'
     7: from /net/http/generic_request.rb:189:in `send_request_with_body'
     6: from /net/protocol.rb:221:in `write'
     5: from /net/protocol.rb:239:in `writing'
     4: from /net/protocol.rb:222:in `block in write'
     3: from /net/protocol.rb:249:in `write0'
     2: from /net/protocol.rb:249:in `each_with_index'
     1: from /net/protocol.rb:249:in `each'
/net/protocol.rb:270:in `block in write0': Net::WriteTimeout (Net::WriteTimeout)

In Ruby 2.6.0, the above program is terminated by raising a Net::WriteTimeout exception after 10 seconds (the value set for the write_timeout attribute).

Here is the relevant commit and discussion for this change.


Ruby 2.6 Introduces Dir#each_child and Dir#children instance methods

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Ruby 2.5 introduced the class-level methods Dir::each_child and Dir::children. We wrote a detailed blog about it.

In Ruby 2.6, the same methods are added as instance methods on the Dir class. Dir#children returns an array of all the filenames in the directory except . and .. Dir#each_child yields each filename and operates on it.

Let’s have a look at examples to understand it better.

Dir#children
directory = Dir.new('/Users/tejaswinichile/workspace')

directory.children
=> ["panda.png", "apple.png", "banana.png", "camera.jpg"]

Dir#each_child iterates and calls the block for each file entry in the given directory. It passes the filename as a parameter to the block.

Dir#each_child
directory = Dir.new('/Users/tejaswinichile/workspace')

directory.each_child { |filename| puts "Currently reading: #{filename}" }

Currently reading: panda.png
Currently reading: apple.png
Currently reading: banana.png
Currently reading: camera.jpg
=> #<Dir:/Users/tejaswinichile/Desktop>

If we don't pass any block to each_child, it returns an enumerator instead.

directory = Dir.new('/Users/tejaswinichile/workspace')

directory.each_child

=> #<Enumerator: #<Dir:/Users/tejaswinichile/Desktop>:each_child>

Here is the relevant commit and discussion for this change.


Ruby 2.6 adds option to not raise exception for Integer, Float methods

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

We can use the Integer and Float methods to convert values to integers and floats respectively. Ruby also has the to_i and to_f methods for the same purpose. Let's see how to_i differs from the Integer method.

>> "1one".to_i
=> 1

>> Integer("1one")
ArgumentError: invalid value for Integer(): "1one"
	from (irb):2:in `Integer'
	from (irb):2
	from /Users/prathamesh/.rbenv/versions/2.4.0/bin/irb:11:in `<main>'
>>

The to_i method tries to convert the given input to an integer as much as possible, whereas the Integer method throws an ArgumentError if it can't convert the input to an integer. The Integer and Float methods parse more strictly compared to to_i and to_f respectively.

Sometimes, we might need the strictness of Integer and Float but also the ability to not raise an exception every time the input can't be parsed.

Before Ruby 2.6, it was possible to achieve this in the following way.

>> Integer("msg") rescue nil

In Ruby 2.6, the Integer and Float methods accept a keyword argument exception which can be either true or false. If it is false then no exception is raised if the input can’t be parsed and nil is returned.

>> Float("foo", exception: false)
=> nil
>> Integer("foo", exception: false)
=> nil

This is also faster than rescuing the exception and returning nil.

>> Benchmark.ips do |x|
?>       x.report("rescue") {
?>           Integer('foo') rescue nil
>>       }
>>     x.report("kwarg") {
?>           Integer('foo', exception: false)
>>       }
>>     x.compare!
>> end
Warming up --------------------------------------
              rescue    41.896k i/100ms
               kwarg    81.459k i/100ms
Calculating -------------------------------------
              rescue    488.006k (± 4.5%) i/s -      2.472M in   5.076848s
               kwarg      1.024M (±11.8%) i/s -      5.050M in   5.024937s

Comparison:
               kwarg:  1023555.3 i/s
              rescue:   488006.0 i/s - 2.10x  slower

As we can see, rescuing the exception is about twice as slow as using the new keyword argument. We can still use the older technique if we want to return a value other than nil.

>> Integer('foo') rescue 42
=> 42

By default, the keyword argument exception is set to true for backward compatibility.

The Chinese version of this blog is available here.


Speeding up Docker image build process of a Rails application

tl;dr : We reduced the Docker image building time from 10 minutes to 5 minutes by re-using bundler cache and by precompiling assets.

We deploy one of our Rails applications on a dedicated Kubernetes cluster. Kubernetes is a good fit for us because it automatically scales the containerized application horizontally based on load and resource consumption. The prerequisite for deploying any kind of application on Kubernetes is that the application needs to be containerized. We use Docker to containerize our application.

We have been successfully containerizing and deploying our Rails application on Kubernetes for about a year now. Although containerization was working fine, we were not happy with the overall time spent to containerize the application whenever we changed the source code and deployed the app.

We use Jenkins for building on-demand Docker images of our application with the help of CloudBees Docker Build and Publish plugin.

We observed that the average build time of a Jenkins job to build a Docker image was about 9 to 10 minutes.

Investigating what takes most time

We wipe the workspace folder of the Jenkins job after finishing each Jenkins build to avoid any unintentional behavior caused by the residue left from a previous build. The application’s folder is about 500 MiB in size. Each Jenkins build spends about 20 seconds to perform a shallow Git clone of the latest commit of the specified git branch from our remote GitHub repository.

After cloning the latest source code, Jenkins executes the docker build command to build a Docker image with a unique tag to containerize the cloned source code of the application.

The Jenkins build spends another 10 seconds invoking the docker build command and sending the build context to the Docker daemon.

01:05:43 [docker-builder] $ docker build --build-arg RAILS_ENV=production -t bigbinary/xyz:production-role-management-feature-1529436929 --pull=true --file=./Dockerfile /var/lib/jenkins/workspace/docker-builder
01:05:53 Sending build context to Docker daemon 489.4 MB

We use the same Docker image on a number of Kubernetes pods. Therefore, we do not want to execute the bundle install and rake assets:precompile tasks while starting a container in each pod, which would prevent that pod from accepting any requests until these tasks are finished.

The recommended approach is to run the bundle install and rake assets:precompile tasks while or before containerizing the Rails application.

Following is a trimmed-down version of our actual Dockerfile, which is used by the docker build command to containerize our application.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

ARG RAILS_ENV

RUN bin/bundle install --without development test

RUN bin/rake assets:precompile

CMD ["bin/bundle", "exec", "puma"]

The RUN instructions in the above Dockerfile execute the bundle install and rake assets:precompile tasks while building a Docker image. Therefore, when a Kubernetes pod is created using such a Docker image, Kubernetes pulls the image, starts a Docker container using that image inside the pod and runs the puma server immediately.

The base Docker image which we use in the FROM instruction contains the necessary system packages. We rarely need to update any system package. Therefore, an intermediate layer which may have been built previously for that instruction is reused while executing the docker build command. If the layer for the FROM instruction is reused, Docker also reuses the cached layers for the next two instructions, ENV and WORKDIR, since neither of them changes.

01:05:53 Step 1/8 : FROM bigbinary/xyz-base:latest
01:05:53 latest: Pulling from bigbinary/xyz-base
01:05:53 Digest: sha256:193951cad605d23e38a6016e07c5d4461b742eb2a89a69b614310ebc898796f0
01:05:53 Status: Image is up to date for bigbinary/xyz-base:latest
01:05:53  ---> c2ab738db405
01:05:53 Step 2/8 : ENV APP_PATH /data/app/
01:05:53  ---> Using cache
01:05:53  ---> 5733bc978f19
01:05:53 Step 3/8 : WORKDIR $APP_PATH
01:05:53  ---> Using cache
01:05:53  ---> 0e5fbc868af8

For an ADD instruction, Docker checks the contents of the files being added and calculates a checksum for each file. Since the source code changes often, the previously cached layer for the ADD instruction is invalidated due to the mismatching checksums. Therefore, the fourth instruction, ADD, in our Dockerfile has to add the local files in the provided build context to the filesystem of the image being built in a separate intermediate container instead of reusing the previously cached instruction layer. On average, this instruction spends about 25 seconds.

01:05:53 Step 4/8 : ADD . $APP_PATH
01:06:12  ---> cbb9a6ac297e
01:06:17 Removing intermediate container 99ca98218d99

We need to build Docker images for our application using different Rails environments. To achieve that, we trigger a parameterized Jenkins build by specifying the needed Rails environment parameter. This parameter is then passed to the docker build command using the --build-arg RAILS_ENV=production option. The ARG instruction in the Dockerfile defines the RAILS_ENV variable, which is implicitly used as an environment variable by the instructions defined after that ARG instruction. Even if the previous ADD instruction didn't invalidate the build cache, if the ARG variable differs from the previous build, a "cache miss" occurs and the build cache is invalidated for the subsequent instructions.

01:06:17 Step 5/8 : ARG RAILS_ENV
01:06:17  ---> Running in b793b8cc2fe7
01:06:22  ---> b8a70589e384
01:06:24 Removing intermediate container b793b8cc2fe7

The next two RUN instructions are used to install gems and precompile static assets using Sprockets. As the earlier instruction(s) already invalidated the build cache, these RUN instructions are mostly executed instead of reusing a cached layer. The bundle install command takes about 2.5 minutes and the rake assets:precompile task takes about 4.35 minutes.

01:06:24 Step 6/8 : RUN bin/bundle install --without development test
01:06:24  ---> Running in a556c7ca842a
01:06:25 bin/bundle install --without development test
01:08:22  ---> 82ab04f1ff42
01:08:40 Removing intermediate container a556c7ca842a
01:08:58 Step 7/8 : RUN bin/rake assets:precompile
01:08:58  ---> Running in b345c73a22c
01:08:58 bin/bundle exec rake assets:precompile
01:09:07 ** Invoke assets:precompile (first_time)
01:09:07 ** Invoke assets:environment (first_time)
01:09:07 ** Execute assets:environment
01:09:07 ** Invoke environment (first_time)
01:09:07 ** Execute environment
01:09:12 ** Execute assets:precompile
01:13:20  ---> 57bf04f3c111
01:13:23 Removing intermediate container b345c73a22c

Both of the above RUN instructions were clearly the main culprits slowing down the whole docker build command and thus the Jenkins build.

The final instruction CMD which starts the puma server takes another 10 seconds. After building the Docker image, the docker push command spends another minute.

01:13:23 Step 8/8 : CMD ["bin/bundle", "exec", "puma"]
01:13:23  ---> Running in 104967ad1553
01:13:31  ---> 35d2259cdb1d
01:13:34 Removing intermediate container 104967ad1553
01:13:34 Successfully built 35d2259cdb1d
01:13:35 [docker-builder] $ docker inspect 35d2259cdb1d
01:13:35 [docker-builder] $ docker push bigbinary/xyz:production-role-management-feature-1529436929
01:13:35 The push refers to a repository [docker.io/bigbinary/xyz]
01:14:21 d67854546d53: Pushed
01:14:22 production-role-management-feature-1529436929: digest: sha256:07f86cfd58fac412a38908d7a7b7d0773c6a2980092df416502d7a5c051910b3 size: 4106
01:14:22 Finished: SUCCESS

So, we found the exact commands which were causing the docker build command to take so much time to build a Docker image.

Let’s summarize the steps involved in building our Docker image and the average time each needed to finish.

Command or Instruction | Average Time Spent
Shallow clone of Git repository by Jenkins | 20 seconds
Invocation of docker build by Jenkins and sending build context to Docker daemon | 10 seconds
FROM bigbinary/xyz-base:latest | 0 seconds
ENV APP_PATH /data/app/ | 0 seconds
WORKDIR $APP_PATH | 0 seconds
ADD . $APP_PATH | 25 seconds
ARG RAILS_ENV | 7 seconds
RUN bin/bundle install --without development test | 2.5 minutes
RUN bin/rake assets:precompile | 4.35 minutes
CMD ["bin/bundle", "exec", "puma"] | 1.15 minutes
Total | 9 minutes

Often, people build Docker images from a single Git branch, like master. Since changes in a single branch are incremental and the Gemfile.lock file hardly differs across commits, the bundler cache need not be managed explicitly. Instead, Docker automatically re-uses the previously built layer for the RUN bundle install instruction if the Gemfile.lock file remains unchanged.

In our case, this does not happen. For every new feature or bug fix, we create a separate Git branch. To verify the changes on a particular branch, we deploy a separate review app which serves the code from that branch. To achieve this workflow, every day we need to build a lot of Docker images containing source code from varying Git branches as well as with varying environments. Most of the time, the Gemfile.lock and assets have different versions across these Git branches. Therefore, it is hard for Docker to cache layers for the bundle install and rake assets:precompile tasks and re-use those layers during every docker build command run with different application source code and a different environment. This is why the previously built Docker layers for the RUN bin/bundle install and RUN bin/rake assets:precompile instructions were often not re-used in our case, and these RUN instructions were executed from scratch in almost every Docker build.

Before discussing the approaches to speed up our Docker build flow, let's familiarize ourselves with the bundle install and rake assets:precompile tasks and how to speed them up by reusing a cache.

Speeding up “bundle install” by using cache

By default, Bundler installs gems at the location set by RubyGems. Bundler also looks up installed gems at the same location.

This location can be explicitly changed by using the --path option.

If Gemfile.lock does not exist, or no gem is found at the explicitly provided location or at the default gem path, then the bundle install command fetches all remote sources, resolves dependencies if needed and installs the required gems as per the Gemfile.

The bundle install --path=vendor/cache command would install the gems at the vendor/cache location in the current directory. If the same command is run without making any change to the Gemfile, then since the gems were already installed and cached in vendor/cache, the command finishes almost instantly because Bundler does not need to fetch any new gems.

The tree structure of the vendor/cache directory looks like this.

vendor/cache
├── aasm-4.12.3.gem
├── actioncable-5.1.4.gem
├── activerecord-5.1.4.gem
├── [...]
├── ruby
│   └── 2.4.0
│       ├── bin
│       │   ├── aws.rb
│       │   ├── dotenv
│       │   ├── erubis
│       │   ├── [...]
│       ├── build_info
│       │   └── nokogiri-1.8.1.info
│       ├── bundler
│       │   └── gems
│       │       ├── activeadmin-043ba0c93408
│       │       [...]
│       ├── cache
│       │   ├── aasm-4.12.3.gem
│       │   ├── actioncable-5.1.4.gem
│       │   ├── [...]
│       │   ├── bundler
│       │   │   └── git
│       └── specifications
│           ├── aasm-4.12.3.gemspec
│           ├── actioncable-5.1.4.gemspec
│           ├── activerecord-5.1.4.gemspec
│           ├── [...]
│           [...]
[...]

It appears that Bundler keeps two separate copies of the .gem files at two different locations, vendor/cache and vendor/cache/ruby/VERSION_HERE/cache.

Therefore, even if we remove a gem from the Gemfile, that gem will be removed only from the vendor/cache directory. The vendor/cache/ruby/VERSION_HERE/cache directory will still have the cached .gem file for that removed gem.

Let’s see an example.

We have the 'aws-sdk', '2.11.88' gem in our Gemfile and that gem is installed.

$ ls vendor/cache/aws-sdk-*
vendor/cache/aws-sdk-2.11.88.gem
vendor/cache/aws-sdk-core-2.11.88.gem
vendor/cache/aws-sdk-resources-2.11.88.gem

$ ls vendor/cache/ruby/2.4.0/cache/aws-sdk-*
vendor/cache/ruby/2.4.0/cache/aws-sdk-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-core-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-resources-2.11.88.gem

Now, we will remove the aws-sdk gem from the Gemfile and run bundle install.

$ bundle install --path=vendor/cache
Using rake 12.3.0
Using aasm 4.12.3
[...]
Updating files in vendor/cache
Removing outdated .gem files from vendor/cache
  * aws-sdk-2.11.88.gem
  * jmespath-1.3.1.gem
  * aws-sdk-resources-2.11.88.gem
  * aws-sdk-core-2.11.88.gem
  * aws-sigv4-1.0.2.gem
Bundled gems are installed into `./vendor/cache`

$ ls vendor/cache/aws-sdk-*
no matches found: vendor/cache/aws-sdk-*

$ ls vendor/cache/ruby/2.4.0/cache/aws-sdk-*
vendor/cache/ruby/2.4.0/cache/aws-sdk-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-core-2.11.88.gem
vendor/cache/ruby/2.4.0/cache/aws-sdk-resources-2.11.88.gem

We can see that the cached versions of the gem(s) remained unaffected.

If we add the same gem 'aws-sdk', '2.11.88' back to the Gemfile and perform bundle install, then instead of fetching that gem from the remote gem repository, Bundler will install it from the cache.

$ bundle install --path=vendor/cache
Resolving dependencies........
[...]
Using aws-sdk 2.11.88
[...]
Updating files in vendor/cache
  * aws-sigv4-1.0.3.gem
  * jmespath-1.4.0.gem
  * aws-sdk-core-2.11.88.gem
  * aws-sdk-resources-2.11.88.gem
  * aws-sdk-2.11.88.gem

$ ls vendor/cache/aws-sdk-*
vendor/cache/aws-sdk-2.11.88.gem
vendor/cache/aws-sdk-core-2.11.88.gem
vendor/cache/aws-sdk-resources-2.11.88.gem

What we understand from this is that if we can reuse the explicitly provided vendor/cache directory every time we need to execute the bundle install command, then the command will be much faster because Bundler will use gems from the local cache instead of fetching them from the Internet.

Speeding up “rake assets:precompile” task by using cache

Code written in TypeScript, Elm, JSX, etc. cannot be served directly to the browser. Almost all web browsers understand plain JavaScript, CSS and image files. Therefore, we need to transpile, compile or convert the source assets into formats which browsers can understand. In Rails, Sprockets is the most widely used library for managing and compiling assets.

In the development environment, Sprockets compiles assets on the fly as and when needed using Sprockets::Server. In the production environment, the recommended approach is to pre-compile assets into a directory on disk and serve them using a web server like Nginx.

Precompilation is a multi-step process for converting a source asset file into a static and optimized form using components such as processors, transformers, compressors, directives, environments, a manifest and pipelines, with the help of various gems such as sass-rails, execjs, etc. The assets need to be precompiled in production so that Sprockets does not need to resolve inter-dependencies between required source dependencies every time a static asset is requested. To understand how Sprockets works in great detail, please read this guide.

When we compile source assets using the rake assets:precompile task, we can find the compiled assets in the public/assets directory inside our Rails application.

$ ls public/assets
manifest-15adda275d6505e4010b95819cf61eb3.json
icons-6250335393ad03df1c67eafe138ab488.eot
icons-6250335393ad03df1c67eafe138ab488.eot.gz
icons-b341bf083c32f9e244d0dea28a763a63.svg
icons-b341bf083c32f9e244d0dea28a763a63.svg.gz
application-8988c56131fcecaf914b22f54359bf20.js
application-8988c56131fcecaf914b22f54359bf20.js.gz
xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js
xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js.gz
application-adc697aed7731c864bafaa3319a075b1.css
application-adc697aed7731c864bafaa3319a075b1.css.gz
FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf
FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf.gz
[...]

We can see that each source asset has been compiled and minified along with its gzipped version.

Note that the assets have a unique digest or fingerprint in their file names. A digest is a hash calculated by Sprockets from the contents of an asset file. If the contents of an asset change, then that asset's digest also changes. The digest is mainly used for cache busting, so a new version of the same asset can be served if the source file is modified or the configured cache period has expired.

The rake assets:precompile task also generates a manifest file along with the precompiled assets. This manifest is used by Sprockets to perform fast lookups without having to actually compile our asset code.

An example manifest file, in our case public/assets/manifest-15adda275d6505e4010b95819cf61eb3.json, looks like this.

{
  "files": {
    "application-8988c56131fcecaf914b22f54359bf20.js": {
      "logical_path": "application.js",
      "mtime": "2018-07-06T07:32:27+00:00",
      "size": 3797752,
      "digest": "8988c56131fcecaf914b22f54359bf20"
    },
    "xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js": {
      "logical_path": "xlsx.full.min.js",
      "mtime": "2018-07-05T22:06:17+00:00",
      "size": 883635,
      "digest": "feaaf61b9d67aea9f122309f4e78d5a5"
    },
    "application-adc697aed7731c864bafaa3319a075b1.css": {
      "logical_path": "application.css",
      "mtime": "2018-07-06T07:33:12+00:00",
      "size": 242611,
      "digest": "adc697aed7731c864bafaa3319a075b1"
    },
    "FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf": {
      "logical_path": "FontAwesome.otf",
      "mtime": "2018-06-20T06:51:49+00:00",
      "size": 134808,
      "digest": "42b44fdc9088cae450b47f15fc34c801"
    },
    [...]
  },
  "assets": {
    "icons.eot": "icons-6250335393ad03df1c67eafe138ab488.eot",
    "icons.svg": "icons-b341bf083c32f9e244d0dea28a763a63.svg",
    "application.js": "application-8988c56131fcecaf914b22f54359bf20.js",
    "xlsx.full.min.js": "xlsx.full.min-feaaf61b9d67aea9f122309f4e78d5a5.js",
    "application.css": "application-adc697aed7731c864bafaa3319a075b1.css",
    "FontAwesome.otf": "FontAwesome-42b44fdc9088cae450b47f15fc34c801.otf",
    [...]
  }
}

Using this manifest file, Sprockets can quickly find a fingerprinted file name using that file’s logical file name and vice versa.
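
As a quick illustration, here is a minimal sketch of looking up a fingerprinted file name from the Rails console; it assumes a Rails application using sprockets-rails in which the assets have already been precompiled.

# Look up the fingerprinted file name for a logical asset name
# using the manifest generated by "rake assets:precompile".
manifest = Rails.application.assets_manifest
puts manifest.assets["application.js"]
# => "application-8988c56131fcecaf914b22f54359bf20.js"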

Also, Sprockets generates a cache in binary format at tmp/cache/assets in the Rails application's folder for the specified Rails environment. Following is an example tree structure of the tmp/cache/assets directory, automatically generated after executing the RAILS_ENV=environment_here rake assets:precompile command for each Rails environment.

$ cd tmp/cache/assets && tree
.
├── demo
│   ├── sass
│   │   ├── 7de35a15a8ab2f7e131a9a9b42f922a69327805d
│   │   │   ├── application.css.sassc
│   │   │   └── bootstrap.css.sassc
│   │   ├── [...]
│   └── sprockets
│       ├── 002a592d665d92efe998c44adc041bd3
│       ├── 7dd8829031d3067dcf26ffc05abd2bd5
│       └── [...]
├── production
│   ├── sass
│   │   ├── 80d56752e13dda1267c19f4685546798718ad433
│   │   │   ├── application.css.sassc
│   │   │   └── bootstrap.css.sassc
│   │   ├── [...]
│   └── sprockets
│       ├── 143f5a036c623fa60d73a44d8e5b31e7
│       ├── 31ae46e77932002ed3879baa6e195507
│       └── [...]
└── staging
    ├── sass
    │   ├── 2101b41985597d41f1e52b280a62cd0786f2ee51
    │   │   ├── application.css.sassc
    │   │   └── bootstrap.css.sassc
    │   ├── [...]
    └── sprockets
        ├── 2c154d4604d873c6b7a95db6a7d5787a
        ├── 3ae685d6f922c0e3acea4bbfde7e7466
        └── [...]

Let's inspect the contents of an example cached file. Since the cached file is in binary form, we can force the non-visible control characters as well as the binary content to be displayed in text form using the cat -v command.

$ cat -v tmp/cache/assets/staging/sprockets/2c154d4604d873c6b7a95db6a7d5787a

^D^H{^QI"
class^F:^FETI"^SProcessedAsset^F;^@FI"^Qlogical_path^F;^@TI"^]components/Comparator.js^F;^@TI"^Mpathname^F;^@TI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@FI"^Qcontent_type^F;^@TI"^[application/javascript^F;^@TI"
mtime^F;^@Tl+^GM-gM-z;[I"^Klength^F;^@Ti^BM-L^BI"^Kdigest^F;^@TI"%18138d01fe4c61bbbfeac6d856648ec9^F;^@FI"^Ksource^F;^@TI"^BM-L^Bvar Comparator = function (props) {
  var comparatorOptions = [React.createElement("option", { key: "?", value: "?" })];
  var allComparators = props.metaData.comparators;
  var fieldDataType = props.fieldDataType;
  var allowedComparators = allComparators[fieldDataType] || allComparators.integer;
  return React.createElement(
    "select",
    {
      id: "comparator-" + props.id,
      disabled: props.disabled,
      onChange: props.handleComparatorChange,
      value: props.comparatorValue },
    comparatorOptions.concat(allowedComparators.map(function (comparator, id) {
      return React.createElement(
        "option",
        { key: id, value: comparator },
        comparator
      );
    }))
  );
};^F;^@TI"^Vdependency_digest^F;^@TI"%d6c86298311aa7996dd6b5389f45949f^F;^@FI"^Srequired_paths^F;^@T[^FI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@FI"^Udependency_paths^F;^@T[^F{^HI"   path^F;^@TI"T$root/app/assets/javascripts/components/Comparator.jsx^F;^@F@^NI"^^2018-07-03T22:38:31+00:00^F;^@T@^QI"%51ab9ceec309501fc13051c173b0324f^F;^@FI"^M_version^F;^@TI"%30fd133466109a42c8cede9d119c3992^F;^@F

We can see that there are some weird-looking characters in the above file because it is not a regular file meant to be read by humans. It also seems to hold some important information such as the mime type, the original source code's path, the compiled source, the digest, and the paths and digests of required dependencies. The above compiled cache appears to be for the original source file located at app/assets/javascripts/components/Comparator.jsx, whose actual contents in JSX and ES6 syntax are shown below.

const Comparator = (props) => {
  const comparatorOptions = [<option key="?" value="?" />];
  const allComparators = props.metaData.comparators;
  const fieldDataType = props.fieldDataType;
  const allowedComparators = allComparators[fieldDataType] || allComparators.integer;
  return (
    <select
      id={`comparator-${props.id}`}
      disabled={props.disabled}
      onChange={props.handleComparatorChange}
      value={props.comparatorValue}>
      {
        comparatorOptions.concat(allowedComparators.map((comparator, id) =>
          <option key={id} value={comparator}>{comparator}</option>
        ))
      }
    </select>
  );
};

If a similar cache exists for a Rails environment under tmp/cache/assets and no source asset file has been modified, then re-running the rake assets:precompile task for the same environment will finish quickly. This is because Sprockets will reuse the cache and therefore will not need to resolve inter-asset dependencies, perform conversions, etc.

Even if certain source assets are modified, Sprockets will rebuild the cache and re-generate compiled and fingerprinted assets just for the modified source assets.

Therefore, we can now understand that if we can reuse the tmp/cache/assets and public/assets directories every time we need to execute the rake assets:precompile task, then Sprockets will perform the precompilation much faster.

Speeding up “docker build” – first attempt

As discussed above, we were now familiar with how to speed up the bundle install and rake assets:precompile commands individually.

We decided to use this knowledge to speed up our slow docker build command. Our initial thought was to mount a directory from the host Jenkins machine into the filesystem of the image being built by the docker build command. This mounted directory could then be used as a cache directory to persist the cache files of both the bundle install and rake assets:precompile commands run as part of the docker build command in each Jenkins build. Then every new build could re-use the previous build's cache and therefore finish faster.

Unfortunately, this wasn't possible because Docker does not support it yet. Unlike with the docker run command, we cannot mount a host directory with the docker build command. A feature request for providing a shared host machine directory path option for the docker build command is still open here.

To reuse the cache and perform faster, we needed to carry the cache files of both the bundle install and rake assets:precompile commands between docker builds (and therefore Jenkins builds). We were looking for some place which could be treated as a shared cache location and could be accessed during each build.

We decided to use Amazon’s S3 service to solve this problem.

To upload and download files from S3, we needed to inject the S3 credentials into the build context provided to the docker build command.

Alternatively, these S3 credentials can be provided to the docker build command using the --build-arg option as discussed earlier.
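
For illustration only, such an invocation might look like the sketch below; the argument names are hypothetical, and, as explained later, we abandoned this approach because build arguments are not a safe place for secrets.

docker build \
  --build-arg RAILS_ENV=production \
  --build-arg S3_ACCESS_KEY=<access-key> \
  --build-arg S3_SECRET_KEY=<secret-key> \
  -t bigbinary/xyz:production-some-feature \
  --file=./Dockerfile .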

We used the s3cmd command-line utility to interact with the S3 service.

The following shell script, named install_gems_and_precompile_assets.sh, was configured to be executed using a RUN instruction while running the docker build command.

set -ex

# Step 1.
if [ -e s3cfg ]; then mv s3cfg ~/.s3cfg; fi

bundler_cache_path="vendor/cache"
assets_cache_path="tmp/assets/cache"
precompiled_assets_path="public/assets"
cache_archive_name="cache.tar.gz"
s3_bucket_path="s3://docker-builder-bundler-and-assets-cache"
s3_cache_archive_path="$s3_bucket_path/$cache_archive_name"

# Step 2.
# Fetch tarball archive containing cache and extract it.
# The "tar" command extracts the archive into "vendor/cache",
# "tmp/assets/cache" and "public/assets".
if s3cmd get $s3_cache_archive_path; then
  tar -xzf $cache_archive_name && rm -f $cache_archive_name
fi

# Step 3.
# Install gems from "vendor/cache" and pack up them.
bin/bundle install --without development test --path $bundler_cache_path
bin/bundle pack --quiet

# Step 4.
# Precompile assets.
# Note that the "RAILS_ENV" is already defined in Dockerfile
# and will be used implicitly.
bin/rake assets:precompile

# Step 5.
# Compress "vendor/cache", "tmp/assets/cache"
# and "public/assets" directories into a tarball archive.
tar -zcf $cache_archive_name $bundler_cache_path \
                             $assets_cache_path  \
                             $precompiled_assets_path

# Step 6.
# Push the compressed archive containing updated cache to S3.
s3cmd put $cache_archive_name $s3_cache_archive_path || true

# Step 7.
rm -f $cache_archive_name ~/.s3cfg

Let’s discuss the various steps annotated in the above script.

  1. The S3 credentials file injected by Jenkins into the build context needs to be placed at the ~/.s3cfg location, so we move that credentials file accordingly.
  2. Try to fetch the compressed tarball archive comprising the vendor/cache, tmp/assets/cache and public/assets directories. If it exists, extract the tarball archive at the respective paths and remove the tarball.
  3. Execute the bundle install command, which re-uses the extracted cache from vendor/cache.
  4. Execute the rake assets:precompile command, which re-uses the extracted cache from tmp/assets/cache and public/assets.
  5. Compress the cache directories vendor/cache, tmp/assets/cache and public/assets into a tarball archive.
  6. Upload the compressed tarball archive containing the updated cache directories to S3.
  7. Remove the compressed tarball archive and the S3 credentials file.

Please note that in our actual setup we generated different tarball archives depending upon the provided RAILS_ENV environment. For demonstration, here we use just a single archive instead.

The Dockerfile needed to be updated to execute the install_gems_and_precompile_assets.sh script.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

ARG RAILS_ENV

RUN install_gems_and_precompile_assets.sh

CMD ["bin/bundle", "exec", "puma"]

With this setup, the average time of the Jenkins builds was reduced to about 5 minutes. This was a great achievement for us.

We reviewed this approach in great detail. We found that although the approach was working fine, there was a major security flaw. It is not at all recommended to inject confidential information such as login credentials, private keys, etc. as part of the build context or via build arguments while building a Docker image with the docker build command. And we were actually injecting S3 credentials into the Docker image. Such confidential credentials provided while building a Docker image can be inspected using the docker history command by anyone who has access to that Docker image.

For this reason, we needed to abandon this approach and look for another.

Speeding up “docker build” – second attempt

In our second attempt, we decided to execute the bundle install and rake assets:precompile commands outside the docker build command; that is, in the Jenkins build itself. So with the new approach, we first execute the bundle install and rake assets:precompile commands as part of the Jenkins build and then execute docker build as usual. With this approach, we could take advantage of the inter-build caching provided by Jenkins.

The prerequisite was to have all the system packages required by the gems listed in the application's Gemfile installed on the Jenkins machine. We installed all the necessary system packages on our Jenkins server.

The following screenshot highlights the things that we needed to configure in our Jenkins job to make this approach work.

1. Running the Jenkins build in RVM managed environment with the specified Ruby version

Sometimes, we need to use a different Ruby version, as specified in the .ruby-version file in the cloned source code of the application. By default, the bundle install command would install the gems for the system Ruby version available on the Jenkins machine. This was not acceptable for us. Therefore, we needed a way to execute the bundle install command in the Jenkins build in an isolated environment which could use the Ruby version specified in the .ruby-version file instead of the default system Ruby version. To address this, we used the RVM plugin for Jenkins. The RVM plugin enabled us to run the Jenkins build in an isolated environment by using or installing the Ruby version specified in the .ruby-version file. The section highlighted in green in the above screenshot shows the configuration required to enable this plugin.

2. Carrying cache files between Jenkins builds required to speed up “bundle install” and “rake assets:precompile” commands

We used the Job Cacher Jenkins plugin to persist and carry the cache directories such as vendor/cache, tmp/cache/assets and public/assets between builds. At the beginning of a Jenkins build, just after cloning the source code of the application, the Job Cacher plugin restores the previously cached version of these directories into the current build. Similarly, before finishing a Jenkins build, the Job Cacher plugin copies the current version of these directories to /var/lib/jenkins/jobs/docker-builder/cache on the Jenkins machine, which is outside the workspace directory of the Jenkins job. The section highlighted in red in the above screenshot shows the configuration required to enable this plugin.

3. Executing the “bundle install” and “rake assets:precompile” commands before “docker build” command

Using the "Execute shell" build step provided by Jenkins, we execute the bundle install and rake assets:precompile commands just before the docker build command invoked by the CloudBees Docker Build and Publish plugin. Since the Job Cacher plugin already restores the version of the vendor/cache, tmp/cache/assets and public/assets directories from the previous build into the current build, the bundle install and rake assets:precompile commands re-use the cache and perform faster.
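
For illustration, that shell build step looks roughly like the following sketch; the exact RAILS_ENV value and bundler options depend on the job parameters.

# "Execute shell" build step run by Jenkins before docker build.
# Gems are installed into vendor/cache and assets are precompiled,
# re-using the directories restored by the Job Cacher plugin.
bin/bundle install --without development test --path vendor/cache
RAILS_ENV=production bin/rake assets:precompile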

The updated Dockerfile has fewer instructions now.

FROM bigbinary/xyz-base:latest

ENV APP_PATH /data/app/

WORKDIR $APP_PATH

ADD . $APP_PATH

CMD ["bin/bundle", "exec", "puma"]

With this approach, the average Jenkins build time is now between 3.5 and 4.5 minutes.

The following graph shows the build time trend of some of the recent builds on our Jenkins server.

Please note that the spikes in the above graph show that certain Jenkins builds sometimes took more than 5 minutes due to concurrently running builds at that time. Because our Jenkins server has a limited set of resources, concurrently running builds often run longer than estimated.

We are still looking to improve the containerization speed even more while keeping the image size small. Please let us know if there's anything else we can do to improve the containerization process.

Note that our Jenkins server runs on Ubuntu, which is based on Debian. Our base Docker image is also based on Debian. Some of the gems in our Gemfile have native extensions written in C. The gems pre-installed on the Jenkins machine have been working without any issues while running inside the Docker containers on Kubernetes. This may not work if the two platforms are different, since native-extension gems installed on the Jenkins host may fail to work inside the Docker container.


Ruby 2.6 adds Binding#source_location

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, if we wanted to know the file name and line number of the source code location, we would need to use Binding#eval.

binding.eval('[__FILE__, __LINE__]')
=> ["/Users/taha/blog/app/controllers/application_controller", 2]

Ruby 2.6 adds a more readable method, Binding#source_location, to achieve a similar result.

binding.source_location
=> ["/Users/taha/blog/app/controllers/application_controller", 2]

Here is the relevant commit and discussion for this change.

The Chinese version of this blog is available here.


Ruby 2.6 adds String#split with block

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, String#split returned an array of split strings.

In Ruby 2.6, a block can be passed to String#split, which yields each split string and operates on it. This avoids creating an array and thus is memory efficient.

We will add a method is_fruit? to understand how to use split with a block.

def is_fruit?(value)
  %w(apple mango banana watermelon grapes guava lychee).include?(value)
end

The input is a comma-separated string of vegetable and fruit names. The goal is to fetch the names of the fruits from the input string and store them in an array.

String#split
input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

splitted_values = input_str.split(", ")
=> ["apple", "mango", "potato", "banana", "cabbage", "watermelon", "grapes"]

fruits = splitted_values.select { |value| is_fruit?(value) }
=> ["apple", "mango", "banana", "watermelon", "grapes"]

Using split, an intermediate array is created which contains both fruit and vegetable names.

String#split with a block
fruits = []

input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

input_str.split(", ") { |value| fruits << value if is_fruit?(value) }
=> "apple, mango, potato, banana, cabbage, watermelon, grapes"

fruits
=> ["apple", "mango", "banana", "watermelon", "grapes"]

When a block is passed to split, it returns the string on which split was called and does not create an array. String#split yields each split string to the block, which in our case pushes the fruit names into a separate array.

Update

Benchmark

We created a large random string to benchmark the performance of split and split with a block.

require 'securerandom'

test_string = ''

100_000.times.each do
  test_string += SecureRandom.alphanumeric(10)
  test_string += ' '
end
require 'benchmark'

Benchmark.bmbm do |bench|

  bench.report('split') do
    arr = test_string.split(' ')
    str_starts_with_a = arr.select { |str| str.start_with?('a') }
  end

  bench.report('split with block') do
    str_starts_with_a = []
    test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
  end

end

Results

Rehearsal ----------------------------------------------------
split              0.023764   0.000911   0.024675 (  0.024686)
split with block   0.012892   0.000553   0.013445 (  0.013486)
------------------------------------------- total: 0.038120sec

                       user     system      total        real
split              0.024107   0.000487   0.024594 (  0.024622)
split with block   0.010613   0.000334   0.010947 (  0.010991)

We did another iteration of benchmarking using benchmark/ips.

require 'benchmark/ips'
Benchmark.ips do |bench|


  bench.report('split') do
    splitted_arr = test_string.split(' ')
    str_starts_with_a = splitted_arr.select { |str| str.start_with?('a') }
  end

  bench.report('split with block') do
    str_starts_with_a = []
    test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
  end

  bench.compare!
end

Results

Warming up --------------------------------------
               split     4.000  i/100ms
    split with block    10.000  i/100ms
Calculating -------------------------------------
               split     46.906  (± 2.1%) i/s -    236.000  in   5.033343s
    split with block    107.301  (± 1.9%) i/s -    540.000  in   5.033614s

Comparison:
    split with block:      107.3 i/s
               split:       46.9 i/s - 2.29x  slower

This benchmark shows that split with a block is about 2 times faster than split.

Here is the relevant commit and discussion for this change.

The Chinese version of this blog is available here.


How to upload source maps to Honeybadger

During the development of a Chrome extension, debugging was difficult because the line numbers of a minified JavaScript file are of no use without a source map. Previously, Honeybadger could only download source map files which were public, and our source maps were inside the .crx package, which was inaccessible to Honeybadger.

Recently, Honeybadger released a new feature to upload source maps to Honeybadger. We have written a grunt plugin to do this.

Here is how we can upload source maps to Honeybadger.

First, install the grunt plugin.

npm install --save-dev grunt-honeybadger-sourcemaps

Configure the gruntfile.

grunt.initConfig({
  honeybadger_sourcemaps: {
    default_options:{
      options: {
        appId: "xxxx",
        token: "xxxxxxxxxxxxxx",
        urlPrefix: "http://example.com/",
        revision: "<app version>",
        prepareUrlParam: function(fileSrc){
          // Here we can manipulate the file path
          return fileSrc.replace('built/', '');
        },
      },
      files: [{
        src: ['@path/to/**/*.map']
      }],
    }
  },
});
grunt.loadNpmTasks('grunt-honeybadger-sourcemaps');
grunt.registerTask('upload_sourcemaps', ['honeybadger_sourcemaps']);

We can get the appId and token from the Honeybadger project settings. Then run the registered task.

grunt upload_sourcemaps

Now, we can upload the source maps to Honeybadger and get better error stack traces.

Testing

Clone the following repo.

git clone https://github.com/bigbinary/grunt-honeybadger-sourcemaps

Replace appId and token in Gruntfile.js and run grunt test. It should upload the sample source maps to your project.


Ruby 2.6 raises exception when 'else' is used inside 'begin..end' block without 'rescue'

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Ruby 2.5

If we use else without rescue inside a begin..end block in Ruby 2.5, it gives a warning.

  irb(main):001:0> begin
  irb(main):002:1>    puts "Inside begin block"
  irb(main):003:1>  else
  irb(main):004:1>    puts "Inside else block"
  irb(main):005:1> end
  (irb):5: warning: else without rescue is useless

This warning is present because the code inside the else block will never get executed.

Ruby 2.6

In Ruby 2.6, an exception is raised if we use else without rescue in a begin..end block. This commit changed the warning into an exception in Ruby 2.6. The changes made in the commit are experimental.

  irb(main):001:0>  begin
  irb(main):002:1>    puts "Inside begin block"
  irb(main):003:1>  else
  irb(main):004:1>    puts "Inside else block"
  irb(main):005:1>  end
  Traceback (most recent call last):
        1: from /usr/local/bin/irb:11:in `<main>'
  SyntaxError ((irb):3: else without rescue is useless)

The Chinese version of this blog is available here.


Automatically Format your Elm code with elm-format before committing

In one of our earlier posts we talked about how we set up prettier and rubocop to automatically format our JavaScript and Ruby code on git commit.

Recently we started working with Elm in a couple of our projects - APISnapshot and AceHelp.

Tools like prettier and rubocop have really helped us take a load off our minds with regard to formatting code. One of the very first things we wanted to sort out when we started doing Elm was pretty-printing our Elm code.

elm-format created by Aaron VonderHaar formats Elm source code according to a standard set of rules based on the official Elm Style Guide.

Automatic code formatting

Let’s set up a git hook to automatically take care of code formatting. We can achieve this much like how we did it in our previous post, using Husky and Lint-staged.

Let’s add Husky and lint-staged as dev dependencies to our project. And for completeness also include elm-format as a dev dependency.

npm install --save-dev husky lint-staged elm-format

Husky makes it real easy to create git hooks. Git hooks are scripts that are executed by git before or after an event. We will be using the pre-commit hook which is run after you do a git commit command but before you type in a commit message.

This way we can format the files that are about to be committed by running elm-format through Husky.

But there is one problem here. The changed files do not get added back to our commit.

This is where Lint-staged comes in. Lint-staged is built to run linters on staged files. So instead of running elm-format on a pre-commit hook we would run lint-staged. And we can configure lint-staged such that elm-format is run on all staged elm files.

We can include Prettier to take care of all staged JavaScript files too.

Let’s do this by editing our package.json file.

{
  "scripts": {
   "precommit": "lint-staged"
  },
  "lint-staged": {
    "*.elm": [
      "elm-format --yes",
      "git add"
    ],
    "*.js": [
      "prettier --write",
      "git add"
    ]
  }
}

All set and done!

Now whenever we do a git commit command, all our staged elm and JavaScript files will get properly formatted before the commit goes in.


Ruby 2.6 adds endless range

This blog is part of our Ruby 2.6 series. Ruby 2.6.0-preview2 was recently released.

Before Ruby 2.6, if we wanted an endless loop with an index, we would need to use Float::INFINITY with upto or Range, or use Numeric#step.

Ruby 2.5.0
irb> (1..Float::INFINITY).each do |n|
irb*   # logic goes here
irb> end

OR

irb> 1.step.each do |n|
irb*   # logic goes here
irb> end

Ruby 2.6.0

Ruby 2.6 makes an infinite loop more readable by making the second argument of a range optional. Internally, Ruby sets the second argument to nil when it is not provided. So both (0..) and (0..nil) are the same in Ruby 2.6.

Using endless loop in Ruby 2.6
irb> (0..).each do |n|
irb*   # logic goes here
irb> end
irb> (0..nil).size
=> Infinity
irb> (0..).size
=> Infinity

In Ruby 2.5, nil is not an acceptable argument and (0..nil) raises an ArgumentError.

irb> (0..nil)
ArgumentError (bad value for range)

Here is the relevant commit and discussion for this change.

The Chinese version of this blog is available here.


Rails 5.2 added method write_multi to cache store

This blog is part of our Rails 5.2 series.

Before Rails 5.2, it was not possible to write multiple entries to the cache store in one shot, even though cache stores like Redis have the MSET command to set multiple keys in a single atomic operation. However, we were not able to use this feature of Redis because of the way Rails had implemented caching.

Rails implements caching using an abstract class ActiveSupport::Cache::Store, which defines the interface that all cache store classes should implement. Rails also provides some common functionality that all cache store classes need.

Prior to Rails 5.2, ActiveSupport::Cache::Store didn’t have any method to set multiple entries at once.

In Rails 5.2, write_multi was added. Each cache store can implement this method and provide the functionality to add multiple entries at once. If a cache store does not implement this method, the default implementation loops over each key-value pair and sets each one individually using the write_entry method.

Multiple entries can be set as shown here.

Rails.cache.write_multi name: 'Alan Turning', country: 'England'
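
To read these entries back we can use the existing read_multi method. A quick sketch of what that returns:

Rails.cache.read_multi(:name, :country)
#=> {:name=>"Alan Turning", :country=>"England"}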

The redis-rails gem provides Redis as a cache store. However, it does not implement the write_multi method.

If we are using Rails 5.2, there is no point in using the redis-rails gem, as Rails 5.2 comes with built-in support for a Redis cache store, which implements the write_multi method. It was added by this PR.

We need to make the following change.

# before
config.cache_store = :redis_store

# after
config.cache_store = :redis_cache_store

The redis-rails repo has a pull request notifying users that development of the gem has ceased. So it’s better to use the Redis cache store that comes with Rails 5.2.


Continuously release chrome extension using CircleCI

In recent times we have worked on many chrome extensions. Releasing chrome extensions manually gets tiring after a while.

So we thought of automating it with CircleCI, similar to continuous deployment.

We are using the following configuration in circle.yml to continuously release the chrome extension from the master branch.

workflows:
  version: 2
  main:
    jobs:
      - test:
          filters:
            branches:
              ignore: []
      - build:
          requires:
            - test
          filters:
            branches:
              only: master
      - publish:
          requires:
            - build
          filters:
            branches:
              only: master

version: 2
jobs:
  test:
    docker:
      - image: cibuilds/base:latest
    steps:
      - checkout
      - run:
          name: "Install Dependencies"
          command: |
            apk add --no-cache yarn
            yarn
      - run:
          name: "Run Tests"
          command: |
            yarn run test
  build:
    docker:
      - image: cibuilds/chrome-extension:latest
    steps:
      - checkout
      - run:
          name: "Install Dependencies"
          command: |
            apk add --no-cache yarn
            apk add --no-cache zip
            yarn
      - run:
          name: "Package Extension"
          command: |
            yarn run build
            zip -r build.zip build
      - persist_to_workspace:
          root: /root/project
          paths:
            - build.zip

  publish:
    docker:
      - image: cibuilds/chrome-extension:latest
    environment:
      - APP_ID: <APP_ID>
    steps:
      - attach_workspace:
          at: /root/workspace
      - run:
          name: "Publish to the Google Chrome Store"
          command: |
            ACCESS_TOKEN=$(curl "https://accounts.google.com/o/oauth2/token" -d "client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}&refresh_token=${REFRESH_TOKEN}&grant_type=refresh_token&redirect_uri=urn:ietf:wg:oauth:2.0:oob" | jq -r .access_token)
            curl -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "x-goog-api-version: 2" -X PUT -T /root/workspace/build.zip -v "https://www.googleapis.com/upload/chromewebstore/v1.1/items/${APP_ID}"
            curl -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "x-goog-api-version: 2" -H "Content-Length: 0" -X POST -v "https://www.googleapis.com/chromewebstore/v1.1/items/${APP_ID}/publish"

We have created three jobs named test, build and publish, and used them in our workflow to run tests, build the extension and publish it to the chrome store respectively. Every job requires the previous one to run successfully.

Let’s check each job one by one.

test:
  docker:
    - image: cibuilds/base:latest
  steps:
    - checkout
    - run:
        name: "Install Dependencies"
        command: |
          apk add --no-cache yarn
          yarn
    - run:
        name: "Run Tests"
        command: |
          yarn run test

We are using the cibuilds docker image for this job. First, we check out the branch and then use yarn to install dependencies. Alternatively, we can use npm to install dependencies. As the last step, we use yarn run test to run the tests. We can skip this job if running tests is not needed.

build:
  docker:
    - image: cibuilds/chrome-extension:latest
  steps:
    - checkout
    - run:
        name: "Install Dependencies"
        command: |
          apk add --no-cache yarn
          apk add --no-cache zip
          yarn
    - run:
        name: "Package Extension"
        command: |
          yarn run build
          zip -r build.zip build
    - persist_to_workspace:
        root: /root/project
        paths:
          - build.zip

For building the chrome extension we are using the chrome-extension image. Here again, we first check out the branch and then install dependencies using yarn. Note that we install the zip utility along with yarn because we need to zip our chrome extension before publishing it in the next step. Also, we do not generate version numbers on our own; the version number is picked up from the manifest file. This job assumes that we have a task named build in package.json to build our app.

The Chrome store rejects multiple uploads with the same version number. So we have to make sure the version number in the manifest file is bumped to a unique value before this step.

In the last step, we use persist_to_workspace to make build.zip available to the publish job.

publish:
  docker:
    - image: cibuilds/chrome-extension:latest
  environment:
    - APP_ID: <APP_ID>
  steps:
    - attach_workspace:
        at: /root/workspace
    - run:
        name: "Publish to the Google Chrome Store"
        command: |
          ACCESS_TOKEN=$(curl "https://accounts.google.com/o/oauth2/token" -d "client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}&refresh_token=${REFRESH_TOKEN}&grant_type=refresh_token&redirect_uri=urn:ietf:wg:oauth:2.0:oob" | jq -r .access_token)
          curl -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "x-goog-api-version: 2" -X PUT -T /root/workspace/build.zip -v "https://www.googleapis.com/upload/chromewebstore/v1.1/items/${APP_ID}"
          curl -H "Authorization: Bearer ${ACCESS_TOKEN}" -H "x-goog-api-version: 2" -H "Content-Length: 0" -X POST -v "https://www.googleapis.com/chromewebstore/v1.1/items/${APP_ID}/publish"

For publishing the chrome extension, we are using the chrome-extension image.

We need APP_ID, CLIENT_ID, CLIENT_SECRET and REFRESH_TOKEN/ACCESS_TOKEN to publish our app to the chrome store.

APP_ID needs to be fetched from the Google Webstore Developer Dashboard. APP_ID is unique for each app, whereas CLIENT_ID, CLIENT_SECRET and REFRESH_TOKEN/ACCESS_TOKEN can be used for multiple apps. Since APP_ID is generally public, we specify it in the yml file. CLIENT_ID, CLIENT_SECRET and REFRESH_TOKEN/ACCESS_TOKEN are stored as private environment variables using the CircleCI UI. If our app is unlisted on the chrome store, we should store APP_ID as a private environment variable as well.

CLIENT_ID and CLIENT_SECRET need to be fetched from the Google API console. There we need to select a project and then click on the Credentials tab. If there is no project, we need to create one first and then access the Credentials tab.

REFRESH_TOKEN needs to be obtained via Google's OAuth2 flow, and it also defines the scope of access for the Google APIs. We need to refer to the Google OAuth2 documentation for obtaining the refresh token; any language or library can be used for this.
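
As a rough sketch, using the OAuth2 endpoints Google documents for the Chrome Web Store API (the CODE value below is whatever the consent page displays after approval):

# 1. Open this URL in a browser, approve access and copy the code shown
https://accounts.google.com/o/oauth2/auth?response_type=code&scope=https://www.googleapis.com/auth/chromewebstore&client_id=${CLIENT_ID}&redirect_uri=urn:ietf:wg:oauth:2.0:oob

# 2. Exchange the code for a refresh token
curl "https://accounts.google.com/o/oauth2/token" -d "client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}&code=${CODE}&grant_type=authorization_code&redirect_uri=urn:ietf:wg:oauth:2.0:oob" | jq -r .refresh_token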

In the first step of the publish job, we attach the workspace to access the build.zip created previously. Using the credentials obtained above, we fetch a fresh access token from the Google OAuth API, which is then used to push the app to the chrome store. We make a PUT request to the Chrome Web Store API to upload the app, and then a POST request to the same API to publish it.

Uploading via the API has one more advantage over manual upload. A manual upload generally takes up to an hour to reflect on the chrome store, whereas an upload using the Google API is generally reflected within 5-10 minutes, provided the app does not go for a review by Google.


Rails 5.2 uses AES-256-GCM authenticated encryption as default cipher for encrypting messages

This blog is part of our Rails 5.2 series.

Before Rails 5.2, AES-256-CBC (paired with an HMAC for verification) was the default cipher for encrypting messages.

It was proposed to use AES-256-GCM authenticated encryption as the default cipher for encrypting messages because of the following reasons:

  • It produces shorter ciphertexts and performs quick encryption and decryption.
  • It is less error prone and more secure.

So AES-256-GCM became the default cipher for encrypting messages in Rails 5.2.

If we do not want AES-256-GCM as the default cipher for encrypting messages in our Rails application, we can disable it.

Rails.application.config.active_support.use_authenticated_message_encryption = false

The default encryption for cookies and sessions was also updated to use AES-256-GCM in this pull request.

If we do not want AES-256-GCM as the default encryption for cookies and sessions, we can disable it too.

Rails.application.config.active_support.use_authenticated_cookie_encryption = false

Our Thoughts on iOS 12

iOS 12 on iPhone 8 Red

Apple announced iOS 12 at WWDC 2018 a few days back. To be honest, it was a bit disappointing to see some of the most requested features missing from iOS 12; users have been asking for Dark Mode since before iOS 11, along with the ability to set default apps. It’s more of an update focussed on performance improvements, usability, and compatibility. The fact that iOS 12 is also available for iPhones Apple released 5 years back is a great effort from Apple to keep users happy. And unlike the last couple of years, this time we decided to calm our curiosity and installed iOS 12 beta 1 on our devices right away after Apple released it to developers. This blog is based on our experience with iOS 12 on an iPhone 8 Plus.

Installing iOS 12 on your iPhone

First things first, make sure you have an iPhone 5s or newer. And before getting started, plug your phone into iTunes and take a full backup in case your phone gets bricked while installing iOS 12, which is very unlikely.

Once done, download and install the beta profile for iOS 12, and then download and install the update from the “Software Update” section just like a regular iOS update. It’s a straightforward OTA update process which you’re already familiar with.

Note: This beta profile is from a third-party developer and is not officially from Apple. Apple will officially release a public beta in around a month.

We’ve been running iOS 12 for a week now, and here are our thoughts on the additions and changes introduced in iOS 12.

iOS 12 is fast

The performance improvements are significant and definitely noticeable. Previously on iOS 11, accessing Spotlight by swiping down from the home screen used to lag. Not just that, sometimes the keyboard didn’t even show up and we had to repeat the action to make it work. Things are faster and better in iOS 12: the keyboard comes up as soon as Spotlight shows up.

Another thing we’ve noticed: the multitasking shortcut for switching to the last app by 3D touching the left edge wasn’t that reliable in iOS 11, so much so that it was easier to ignore the feature altogether than to use it, but in iOS 12 the same shortcut is very smooth. There are still times when it doesn’t work well, but that’s rare.

Spotlight widgets load faster in iOS 12 than they used to in iOS 11. Apart from this, 3D Touch feels pretty smooth too. It’s good to see Apple squeezing out more performance on top of the already good performance of iOS.

Notifications

Notifications in iOS 12

Notifications in iOS 11 are a mess: there’s no grouping, no way to quickly control notifications, and no way to set priorities. Apple has added notification grouping and better notification management in iOS 12, so now notifications from the same app are grouped together and you can control how often you want to get notifications from an app right from the notification itself.

We think the implementation can be a whole lot better. For notifications that are grouped, you get to see only the last notification from that app; a better way would’ve been to show two or three notifications and cut the rest. There’s no notification pinning or snoozing, which could’ve been very useful features.

Screen time and App limits

There’s a new feature in iOS 12 called Screen Time which is more like a bird’s-eye view of your phone usage. There’s a saying that you can’t improve something you can’t measure. Screen Time is a feature that’s going to be very useful for everyone who wants to cut down time on social apps or overall phone usage. It shows you in detail which apps you use, for how long, and at what times. Not only that, it also keeps track of how many times you picked up your phone and how many notifications you receive from the apps on your phone.

Screen Time in iOS 12

Another useful sub-feature of Screen Time is App Limits, which allows you to set a limit on app usage by app or category. So let’s say you don’t want to use WhatsApp for more than 30 minutes a day, you can do that through App Limits. It works for app categories including Games, Social Networking, Entertainment, Creativity, Productivity, Education, Reading & Reference, and Health & Fitness, so you can set a limit per category which works across apps. Plus, it syncs across your other iOS devices, so you can’t cheat that way.

Siri and Shortcuts app

Siri Shortcuts in iOS 12

In iOS 12, you can assign custom shortcuts to Siri to trigger specific actions, which works not only with system apps but also with third-party apps. So now if you want to send a specific message to someone on WhatsApp, you can assign a command for that to Siri and trigger the action just by saying that command.

Apple has also introduced a new app in iOS 12 called Shortcuts. The Shortcuts app lets you group actions and run them quickly. Although the Shortcuts app isn’t included in iOS 12 beta 1, we think it’s one of the best additions in iOS 12.

Updated Photos app

The Photos app now has a new section called “For You”, where it shows new albums, your best moments, sharing suggestions, and photo and effect suggestions. This is much like the Assistant tab of the Google Photos app. You can also share selected photos or albums with your friends right from the Photos app.

The Albums tab in the Photos app is redesigned for easier navigation. There’s also a new Search tab which has been improved, so you can now search photos using terms like “surfing” and “vacation”.

It’s good to see Apple paying attention to improving the Photos app, but we still think Google Photos is a better option for the average person, considering it lets you store photos in the cloud for free. Photo organization in Google Photos is also much better than in the new Photos app in iOS 12.

Enhanced Do Not Disturb Mode

Do Not Disturb in iOS 12 is enhanced to be more flexible. You can now enable Do Not Disturb mode to end automatically in an hour, or at night, or according to your Calendar events, or even based on your location.

Not just that, Do Not Disturb has a new Bedtime mode which, when enabled, silences all your notifications during bedtime and dims your display. And when you wake up, it shows a welcome-back message along with weather details on the lock screen.

Conclusion

There are other updates and under-the-hood improvements as well, like the new Measure app, the redesigned iBooks app, tracking prevention, group FaceTime, etc. Overall, we think it’s an okay update, with fewer bugs than one would expect from a first beta. The force touch on the keyboard to drag the cursor doesn’t work, and Skype and some other apps crash, but for the most part it’s good enough to be installed on your primary device.


Using Concurrent Ruby in a Ruby on Rails Application

Concurrent Ruby is a concurrency toolkit that builds on a lot of interesting ideas from many functional languages and classic concurrency patterns. When it comes to writing threaded code in Rails applications, look no further since concurrent ruby is already included in Rails via Active Support.

Using Concurrent::Future

In one of our applications, to improve performance we added threaded code using Concurrent::Future. It worked really well for us until one day it stopped working.

“Why threads?” one might ask. The code in question was a textbook threading use case. It had a few API calls, some DB requests and finally an action that was performed on all the data that was aggregated.

Let us look at what this code looks like.

Non threaded code

selected_shipping_companies.each do | carrier |
  # api calls
  distance_in_miles = find_distance_from_origin_to_destination
  historical_average_rate = historical_average_for_this_particular_carrier

  # action performed
  build_price_details_for_this_carrier(distance_in_miles,
                                       historical_average_rate)
end

Converting the above code to use Concurrent::Future is trivial.

futures = selected_shipping_companies.map do |carrier|
  Concurrent::Future.execute do
    # api calls
    distance_in_miles = find_distance_from_origin_to_destination
    historical_average_rate = historical_average_for_this_particular_carrier

    # action performed
    build_price_details_for_this_carrier(distance_in_miles,
                                         historical_average_rate)
  end
end

futures.map(&:value)

A bit more about Concurrent::Future

It is often intimidating to work with threads. They can bring in complexity and can have unpredictable behaviors due to lack of thread-safety. Since Ruby is a language of mutable references, we often find it difficult to write 100% thread-safe code.

Inspired by Clojure’s future function, Concurrent::Future is a primitive that guarantees thread safety. It takes a block of work and performs the work asynchronously using Concurrent Ruby’s global thread pool. Once a block of work is scheduled, Concurrent Ruby gives us a handle to this future work, on which calling #value (or #deref) returns the block’s value.
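
For instance, a minimal usage sketch:

require "concurrent"

future = Concurrent::Future.execute { 1 + 1 }

future.value       #=> 2 (blocks until the block has finished executing)
future.fulfilled?  #=> true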

The Bug

Usually, when an exception occurs in the main thread, the interpreter stops and gathers the exception data. In the case of Ruby Threads, any unhandled exceptions are reported only when Thread#join is called. Setting Thread#abort_on_exception to true is a better alternative, which will cause all threads to exit when an exception is raised in any running thread. We published a blog recently which talks about this in great detail.
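
To illustrate the default behaviour described above, here is a small sketch with plain Ruby threads:

# The exception raised inside the thread is not raised in the calling
# thread until we join it (Ruby 2.5+ may still print a warning,
# depending on Thread.report_on_exception)
t = Thread.new { raise "boom" }
sleep 0.1
t.join      # RuntimeError: boom is raised here, in the calling thread

# With abort_on_exception set, the exception propagates immediately
Thread.abort_on_exception = true
Thread.new { raise "boom" }   # aborts the program as soon as the thread raises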

Exception handling in Concurrent Ruby

future = Concurrent::Future.execute {
            raise StandardError.new("Boom!")
          }

sleep(0.1) # giving arbitrary time for future to execute

future.value     #=> nil

Where did the exception go? This code fails silently and swallows the exceptions. How can we find out if the code executed successfully?

future = Concurrent::Future.execute {
              raise StandardError.new("Boom!")
          }

sleep(0.1) # giving arbitrary time for future to execute

future.value     #=> nil

future.rejected? #=> true
future.reason    #=> "#<StandardError: Boom!>"

How we fixed our issue

We found places in our application where Concurrent::Future was used in a way that would swallow exceptions. It is also possible that people might overlook the need to explicitly check for exceptions. We addressed these concerns with the following wrapper class.

module ConcurrentExecutor
  class Error < StandardError
    def initialize(exceptions)
      @exceptions = exceptions
      super
    end

    def message
      @exceptions.map { | e | e.message }.join "\n"
    end

    def backtrace
      traces = @exceptions.map { |e| e.backtrace }
      ["ConcurrentExecutor::Error START", traces, "END"].flatten
    end
  end

  class Future
    def initialize(pool: nil)
      @pool = pool || Concurrent::FixedThreadPool.new(20)
      @exceptions = Concurrent::Array.new
    end

    # Sample Usage
    # executor = ConcurrentExecutor::Future.new(pool: pool)
    # executor.execute(carriers) do | carrier |
    #   ...
    # end
    #
    # values = executor.resolve

    def execute array, &block
      @futures = array.map do | element |
        Concurrent::Future.execute({ executor: @pool }) do
          yield(element)
        end.rescue do | exception |
          @exceptions << exception
        end
      end

      self
    end

    def resolve
      values = @futures.map(&:value)

      if @exceptions.length > 0
        raise ConcurrentExecutor::Error.new(@exceptions)
      end

      values
    end
  end
end

Please note that using Concurrent Ruby futures caused a segmentation fault while running specs on CircleCI. As of this writing, we are using normal looping instead of futures on CircleCI until the reason for the segfault is isolated and fixed.

Update

Concurrent::Future also gives us another API which not only returns the value of the block but also raises, in the calling thread, any exception that occurred inside the block.

thread_pool = Concurrent::FixedThreadPool.new(20)
executors = [1, 2, 3, 4].map do |random_number|
  Concurrent::Future.execute({ executor: thread_pool }) do
    random_number / (random_number.even? ? 0 : 1)
  end
end

executors.map(&:value)
=> [1, nil, 3, nil]

executors.map(&:value!)

> ZeroDivisionError: divided by 0
> from (pry):4:in `/'

We thank Jonathan Rochkind for pointing us to this undocumented api in his reddit post.


Modelling state in Elm to reflect business logic

We recently made ApiSnapshot open source. As mentioned in that blog we ported code from React.js to Elm.

One of the features of ApiSnapshot is support for Basic Authentication.

ApiSnapshot with basic authentication

While we were rebuilding the whole application in Elm, we had to port the “Add Basic Authentication” feature. This feature can be accessed from the “More” drop-down on the right-hand side of the app, and it lets the user add a username and password to the request.

Let’s see how the Model of our Elm app looks.

type alias Model =
    { request : Request.MainRequest.Model
    , response : Response.MainResponse.Model
    , route : Route
    }

Here is the Model in Request.MainRequest module.

type alias APISnapshotRequest =
    { url : String
    , httpMethod : HttpMethod
    , requestParameters : RequestParameters
    , requestHeaders : RequestHeaders
    , username : Maybe String
    , password : Maybe String
    , requestBody : Maybe RequestBody
    }

type alias Model =
    { request : APISnapshotRequest
    , showErrors : Bool
    }

The username and password fields are optional for the user, so we kept them as Maybe types.

Note that the API always responds with username and password fields whether or not the user chose to add Basic Authentication. The API responds with null for both username and password when a user retrieves a snapshot for which no username and password were filled in.

Here is a sample API response.

{
"url": "http://dog.ceo/api/breed/affenpinscher/images/random",
"httpMethod": "GET",
"requestParams": {},
"requestHeaders": {},
"requestBody": null,
"username": "alanturning",
"password": "welcome",
"assertions": [],
"response": {
    "response_headers": {
        "age": "0",
        "via": "1.1 varnish (Varnish/6.0), 1.1 varnish (Varnish/6.0)",
        "date": "Thu, 03 May 2018 09:43:11 GMT",
        "vary": "",
        "cf_ray": "4151c826ac834704-EWR",
        "server": "cloudflare"
    },
    "response_body": "{\"status\":\"success\",\"message\":\"https:\\/\\/images.dog.ceo\\/breeds\\/affenpinscher\\/n02110627_13221.jpg\"}",
    "response_code": "200"
  }
}

Let’s look at the view code which renders the data received from the API.

view : (Maybe String, Maybe String) -> Html Msg
view usernameAndPassword =
    case usernameAndPassword of
        (Nothing, Nothing) -> text ""
        (Just username, Nothing) -> basicAuthenticationView username ""
        (Nothing, Just password) -> basicAuthenticationView "" password
        (Just username, Just password) -> basicAuthenticationView username password


basicAuthenticationView : String -> String -> Html Msg
basicAuthenticationView username password =
    [ div [ class "form-row" ]
        [ input
            [ type_ "text"
            , placeholder "Username"
            , value username
            , onInput (UpdateUsername)
            ]
            []
        , input
            [ type_ "password"
            , placeholder "Password"
            , value password
            , onInput (UpdatePassword)
            ]
            []
        , a
            [ href "javascript:void(0)"
            , onClick (RemoveBasicAuthentication)
            ]
            [ text "×" ]
        ]
    ]

To get the desired view we apply the following rules.

  1. If both values are present, render the view with both.
  2. If only one of the values is present, render the view with an empty string for the missing one.
  3. If both values are null, render nothing.

This works but we can do a better job of modelling it.

What’s happening here is that we were trying to translate our API responses directly into the Model. Let’s try to club username and password together into a new type called BasicAuthentication.

In the Model, add a field called basicAuthentication of type Maybe BasicAuthentication. This way, if the user has opted to use the basic authentication fields then it is a Just BasicAuthentication and we can show the input boxes. Otherwise it is Nothing and we show nothing!

Here is what the updated Model for Request.MainRequest would look like.

type alias BasicAuthentication =
    { username : String
    , password : String
    }


type alias APISnapshotRequest =
    { url : String
    , httpMethod : HttpMethod
    , requestParameters : RequestParameters
    , requestHeaders : RequestHeaders
    , basicAuthentication : Maybe BasicAuthentication
    , requestBody : Maybe RequestBody
    }


type alias Model =
    { request : APISnapshotRequest
    , showErrors : Bool
    }

The Elm compiler now complains that we need to change the JSON decoding for the APISnapshotRequest type because of this change.

Before we fix that let’s take a look at how JSON decoding is currently being done.

import Json.Decode as JD
import Json.Decode.Pipeline as JP

decodeAPISnapshotRequest : Response -> APISnapshotRequest
decodeAPISnapshotRequest hitResponse =
    let
        result =
            JD.decodeString requestDecoder hitResponse.body
    in
        case result of
            Ok decodedValue ->
                decodedValue

            Err err ->
                emptyRequest


requestDecoder : JD.Decoder APISnapshotRequest
requestDecoder =
    JP.decode Request
        |> JP.optional "username" (JD.map Just JD.string) Nothing
        |> JP.optional "password" (JD.map Just JD.string) Nothing

Now we need to derive the state of the application from our API response.

Let’s introduce a type called ReceivedAPISnapshotRequest, which has the shape of our old APISnapshotRequest with no basicAuthentication field. And let’s update our requestDecoder function to return a Decoder of type ReceivedAPISnapshotRequest instead of APISnapshotRequest.

type alias ReceivedAPISnapshotRequest =
    { url : String
    , httpMethod : HttpMethod
    , requestParameters : RequestParameters
    , requestHeaders : RequestHeaders
    , username : Maybe String
    , password : Maybe String
    , requestBody : Maybe RequestBody
    }


requestDecoder : JD.Decoder ReceivedAPISnapshotRequest

We now need to move our earlier logic, which checks whether a user has opted to use the basic authentication fields, from the view function to the decodeAPISnapshotRequest function.

decodeAPISnapshotRequest : Response -> APISnapshotRequest
decodeAPISnapshotRequest hitResponse =
    let
        result =
            JD.decodeString requestDecoder hitResponse.body
    in
        case result of
            Ok value ->
                let
                    extractedCreds =
                        ( value.username, value.password )

                    derivedBasicAuthentication =
                        case extractedCreds of
                            ( Nothing, Nothing ) ->
                                Nothing

                            ( Just receivedUsername, Nothing ) ->
                                Just { username = receivedUsername, password = "" }

                            ( Nothing, Just receivedPassword ) ->
                                Just { username = "", password = receivedPassword }

                            ( Just receivedUsername, Just receivedPassword ) ->
                                Just { username = receivedUsername, password = receivedPassword }
                in
                    { url = value.url
                    , httpMethod = value.httpMethod
                    , requestParameters = value.requestParameters
                    , requestHeaders = value.requestHeaders
                    , basicAuthentication = derivedBasicAuthentication
                    , requestBody = value.requestBody
                    }

            Err err ->
                emptyRequest

After decoding, we extract the username and password from ReceivedAPISnapshotRequest into extractedCreds as a pair, and construct our APISnapshotRequest from it.

And now we have a clean view function which takes a BasicAuthentication value and returns an Html Msg.

view : BasicAuthentication -> Html Msg
view b =
    [ div [ class "form-row" ]
        [ input
            [ type_ "text"
            , placeholder "Username"
            , value b.username
            , onInput (UpdateUsername)
            ]
            []
        , input
            [ type_ "password"
            , placeholder "Password"
            , value b.password
            , onInput (UpdatePassword)
            ]
            []
        , a
            [ href "javascript:void(0)"
            , onClick (RemoveBasicAuthentication)
            ]
            [ text "×" ]
        ]
    ]

We now have a Model that better captures the business logic. And should we change the logic of basic authentication parameter selection in the future, we do not have to worry about updating the logic in the view.


Using Logtrail to tail log with Elasticsearch and Kibana on Kubernetes

Monitoring and logging are important aspects of deployments. Centralized logging is always useful in helping us identify problems.

EFK (Elasticsearch, Fluentd, Kibana) is a beautiful combination of tools to store logs centrally and visualize them in one place. There are many other open-source logging tools available in the market, but EFK (ELK if Logstash is used) is one of the most widely used centralized logging stacks.

This blog post shows how to integrate Logtrail, which has a Papertrail-like UI, to tail the logs. Using Logtrail we can also apply filters while tailing the logs centrally.

As EFK ships as an addon with Kubernetes, all we have to do is deploy the EFK addon on our k8s cluster.

Pre-requisite:

Installing the EFK addon from the kubernetes upstream is simple. Deploy EFK using the following command.

$ kubectl create -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/logging-elasticsearch/v1.6.0.yaml
serviceaccount "elasticsearch-logging" created
clusterrole "elasticsearch-logging" created
clusterrolebinding "elasticsearch-logging" created
serviceaccount "fluentd-es" created
clusterrole "fluentd-es" created
clusterrolebinding "fluentd-es" created
daemonset "fluentd-es" created
service "elasticsearch-logging" created
statefulset "elasticsearch-logging" created
deployment "kibana-logging" created
service "kibana-logging" created

Once the k8s resources are created, access the Kibana dashboard. To get the dashboard URL, use kubectl cluster-info.

$ kubectl cluster-info | grep Kibana
Kibana is running at https://api.k8s-test.com/api/v1/proxy/namespaces/kube-system/services/kibana-logging

Now go to the Kibana dashboard and we should be able to see the logs.

Kibana dashboard

The above screenshot shows the Kibana UI. We can create metrics and graphs as per our requirements.

We also want to view logs in tail style, and we will use Logtrail for that. For this, we need a docker image with the logtrail plugin pre-installed.

Note: If the upstream Kibana version of the k8s EFK addon is 4.x, use a Kibana 4.x image for installing the logtrail plugin in your custom image. If the addon ships with Kibana 5.x, make sure you pre-install logtrail on a Kibana 5 image.

Check the Kibana version for the addon here.

We will replace the default kibana image with the kubernetes-logtrail image.

To replace the docker image, update the kibana deployment using the command below.

$ kubectl -n kube-system set image deployment/kibana-logging kibana-logging=rahulmahale/kubernetes-logtrail:latest
deployment "kibana-logging" image updated

Once the image is deployed, go to the Kibana dashboard and click on Logtrail as shown below.

Switch to logtrail

After switching to logtrail we will start seeing all the logs in real time as shown below.

Logs in Logtrail

This centralized logging dashboard with logtrail allows us to filter on several parameters.

For example, let’s say we want to check all the logs for the namespace myapp. We can use the filter kubernetes.namespace_name:"myapp". We can use the filter kubernetes.container_name:"mycontainer" to monitor logs for a specific container.


RubyKaigi 2018 Day two

RubyKaigi is happening at Sendai, Japan from 31st May to 2nd June. It is perhaps the only conference where one can find almost all the core Ruby team members in attendance.

This is Prathamesh. I bring you live details about what is happening at the Kaigi over the next two days. If you are at the conference please come and say “Hi” to me.

Check out what happened on day 1.

Faster Apps, No Memory Thrash: Get Your Memory Config Right by Noah Gibbs

Noah gave an awesome talk on techniques to manage the memory used by Ruby applications. One of the main points while dealing with GC is to make it run less often, which means not creating too many objects. He also mentioned that, if the application permits, destructive operations such as gsub! or concat should be used since they save CPU cycles and memory. Ruby allows setting environment variables for managing heap memory, but it is really hard to choose values for these environment variables blindly.

Noah has built a tool which uses GC.stat results from applications to estimate the values of the memory related environment variables. Check out the EnvMem gem.

In the end, he discussed some advanced debugging methods like checking the fragmentation percentage. The formula was prepared by Nate Berkopec.

s = GC.stat
used_ratio = s[:heap_live_slots].to_f / (s[:heap_eden_pages] * 408)
fragmentation = 1 - used_ratio

We can also use GC::Profiler to profile the code in real time to see how GC is behaving.
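
For example, a minimal GC::Profiler session could look like this:

GC::Profiler.enable

# run the code we want to observe
100_000.times { "foo" + "bar" }

GC.start
GC::Profiler.report   # prints a table of GC runs with timings
GC::Profiler.disable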

The benchmark used for this talk can be found here. The slides for this talk can be found here.

Guild prototype

Next, I attended a talk by Koichi Sasada on the Guild prototype. He discussed the proposed design spec for Guilds, with a demo of a Fibonacci number program on a 40-core CPU with 40 guilds. One of the interesting observations is that performance drops as the number of guilds increases because of global locking.

Guild performance

He discussed the concept of sharable and non-sharable objects. Sharable objects can be shared across multiple Guilds, whereas non-sharable objects can only be used in one Guild. This also means you can’t write a thread-unsafe program with Guilds, “by design”. He discussed the challenges in specifying sharable objects for Guilds.

Overall, there is still a lot of work left to be done for Guilds to become a part of Ruby. It includes defining protocols for sharable and non-sharable objects, making sure GC runs properly in the presence of Guilds, and synchronization between different Guilds.

The slides for this talk can be found here.

Ruby programming with type checking

Soutaro from SideCI gave a talk on Steep, a gradual type checker for Ruby.

In the past, Matz has said that he doesn’t like type definitions to be present in the Ruby code itself. Steep requires type definitions to be present in separate files with the extension .rbi. The Ruby source code needs only a small amount of annotations. Steep also has a scaffold generator to generate basic type definitions for existing code.

Steep v/s Sorbet

As of now, Steep runs slower than Sorbet, which was discussed yesterday by the Stripe team. Soutaro also discussed issues in type definitions due to metaprogramming in libraries such as Active Record. That looks like a challenge for Steep as of now.

Web console

After the tea break, I attended a talk by Genadi on how web-console works.

He discussed the implementation of web-console in detail, with references to Ruby internals related to bindings. He compared the web-console interface with IRB and pry and explained the differences. As of now, web-console has to monkey patch some of the Rails internals. Genadi has added support for registering interceptors, which will prevent this monkey patching in Rails 6. He is also mentoring a Google Summer of Code student to work on the Actionable Errors project, where the user can take actions like running pending migrations via the webpage itself when the error is shown.

Ruby committers v/s the World

Ruby Committers v/s the World

RubyKaigi offers this unique event where all the Ruby committers come on stage and face questions from the audience. This year the format was slightly different and it was run in the style of the Ruby core developer meeting. The agenda was decided beforehand, with some questions from attendees and some tickets to discuss. The session started with a discussion of features coming up in Ruby 2.6.

After that, the questions and the tickets in the agenda were discussed. It was good to see how the Ruby team takes decisions about features and suggestions.

Apart from this, there were talks on SciRuby, mRuby, linting, gem upgrades, C extensions and more which I could not attend.

That’s all for day two. Looking forward to day three already!

Oh and here is the world map of all the attendees from RubyKaigi.

World map of all attendees