Rails 5.1 has introduced Date#all_day helper

Sometimes, we want to query records over the whole day for a given date.

>> User.where(created_at: Date.today.beginning_of_day..Date.today.end_of_day)

=> SELECT "users".* FROM "users" WHERE ("users"."created_at" BETWEEN $1 AND $2) [["created_at", 2017-04-09 00:00:00 UTC], ["created_at", 2017-04-09 23:59:59 UTC]]

Rails 5.1 has introduced a helper method for creating this range object for a given date in the form of Date#all_day.

>> User.where(created_at: Date.today.all_day)

=> SELECT "users".* FROM "users" WHERE ("users"."created_at" BETWEEN $1 AND $2) [["created_at", 2017-04-09 00:00:00 UTC], ["created_at", 2017-04-09 23:59:59 UTC]]

We can confirm that the Date#all_day method returns the range object for a given date.

>> Date.today.all_day

=> Sun, 09 Apr 2017 00:00:00 UTC +00:00..Sun, 09 Apr 2017 23:59:59 UTC +00:00
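
To see what such a range looks like without Rails, here is a minimal sketch in plain Ruby. The all_day_for helper is hypothetical and assumes UTC, whereas the real Date#all_day works with the time zone configured in Rails.

```ruby
require 'date'

# Hypothetical stand-in for Date#all_day that uses only the standard library.
# Assumption: the day is interpreted in UTC.
def all_day_for(date)
  beginning_of_day = Time.utc(date.year, date.month, date.day)
  # The seventh argument is microseconds (fractions allowed),
  # so this is 23:59:59.999999999.
  end_of_day = Time.utc(date.year, date.month, date.day, 23, 59, 59,
                        Rational(999_999_999, 1000))
  beginning_of_day..end_of_day
end

range = all_day_for(Date.new(2017, 4, 9))
puts range.cover?(Time.utc(2017, 4, 9, 12, 0, 0))  # noon falls inside the day
puts range.cover?(Time.utc(2017, 4, 10, 0, 0, 0))  # the next midnight does not
```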

Binding irb - Runtime Invocation for IRB

This blog is part of our Ruby 2.4 series.

It’s very common to see a Ruby programmer write a few puts or p statements, either for debugging or for checking the value of variables.

pry made our lives easier with binding.pry. However, it was still a bit of an inconvenience to have pry installed just to get a session at runtime when we were already working with irb.

Ruby 2.4 has now introduced binding.irb. By simply adding binding.irb to our code we can open an IRB session.

class ConvolutedProcess
  def do_something
    @variable = 10

    binding.irb
    # opens a REPL here
  end
end

irb(main):029:0* ConvolutedProcess.new.do_something
irb(#<ConvolutedProcess:0x007fc55c827f48>):001:0> @variable
=> 10

Using Kubernetes Persistent volume to store persistent data

In one of our projects we are running a Rails application on a Kubernetes cluster. Kubernetes is a proven tool for managing and deploying Docker containers in production.

In Kubernetes, containers run inside pods, and pods are managed using deployments. A deployment holds the specification of its pods and is responsible for running them with the specified resources. When a pod is restarted or a deployment is deleted, the data on the pod is lost. We need to retain data beyond the pod’s lifecycle, so that it survives when the pod or deployment is destroyed.

We use docker-compose during development. In docker-compose, linking a host directory to a container directory works out of the box. We wanted a similar mechanism in Kubernetes to link volumes. Kubernetes offers various types of volumes. We chose a persistent volume backed by AWS EBS storage, and used a persistent volume claim sized to the needs of the application.

As per the Persistent Volume (PV) definition, cluster administrators must first create storage in order for Kubernetes to mount it.

Our Kubernetes cluster is hosted on AWS. We created AWS EBS volumes which can be used to create persistent volumes.

Let’s create a sample volume using the AWS CLI and try to use it in the deployment.

aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2

This will create a volume in the us-east-1a availability zone. We need to note the VolumeId once the volume is created.

$ aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2
{
    "AvailabilityZone": "us-east-1a",
    "Encrypted": false,
    "VolumeType": "gp2",
    "VolumeId": "vol-123456we7890ilk12",
    "State": "creating",
    "Iops": 100,
    "SnapshotId": "",
    "CreateTime": "2017-01-04T03:53:00.298Z",
    "Size": 20
}

Now let’s create a persistent volume template, test-pv.yml, to create a volume using this EBS storage.

kind: PersistentVolume
apiVersion: v1
metadata:
  name: test-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    volumeID: <your-volume-id>
    fsType: ext4

Once we had the template for the persistent volume, we used kubectl to create it. kubectl is a command line tool for interacting with a Kubernetes cluster.

$ kubectl create -f  test-pv.yml
persistentvolume "test-pv" created

Once the persistent volume is created, we can check it using the following command.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM               REASON    AGE
test-pv     10Gi        RWX           Retain          Available                                7s

Now that our persistent volume is in the available state, we can claim it by creating a persistent volume claim.

We can define the persistent volume claim using the following template, test-pvc.yml.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Let’s create the persistent volume claim using the above template.

$ kubectl create -f  test-pvc.yml

persistentvolumeclaim "test-pvc" created

After creating the persistent volume claim, our persistent volume will change from the available state to the bound state.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS     CLAIM               REASON    AGE
test-pv    10Gi        RWX           Retain          Bound      default/test-pvc              2m

$ kubectl get pvc
NAME        STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
test-pvc    Bound     test-pv   10Gi        RWX           1m

Now that the persistent volume claim is available on our Kubernetes cluster, let’s use it in a deployment.

Deploying Kubernetes application

We will use the following deployment template, test-pv-deployment.yml.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test-pv
  labels:
    app: test-pv
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: test-pv
        tier: frontend
    spec:
      containers:
      - image: <your-repo>/<your-image-name>:latest
        name: test-pv
        imagePullPolicy: Always
        env:
        - name: APP_ENV
          value: staging
        - name: UNICORN_WORKER_PROCESSES
          value: "2"
        volumeMounts:
        - name: test-volume
          mountPath: "/<path-to-my-app>/shared/data"
        ports:
        - containerPort: 80
      imagePullSecrets:
        - name: registrypullsecret
      volumes:
      - name: test-volume
        persistentVolumeClaim:
          claimName: test-pvc

Now launch the deployment using the following command.

$ kubectl create -f  test-pv-deployment.yml
deployment "test-pv" created

Once the deployment is up and running, all the contents of the shared directory will be stored on the persistent volume. Even if the pod or deployment crashes for any reason, our data will be retained on the persistent volume, and we can use it again when launching the application deployment.

This achieved our goal of retaining data across deployments and pod restarts.

Ruby 2.4 has added additional parameters for Logger#new

This blog is part of our Ruby 2.4 series.

The Logger class in Ruby provides a simple but sophisticated logging utility.

After creating the logger object we need to set its level.

Ruby 2.3

require 'logger'
logger = Logger.new(STDOUT)
logger.level = Logger::INFO

If we are working with ActiveRecord::Base.logger, then the same code would look something like this.

require 'logger'
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Base.logger.level = Logger::INFO

As we can see, in both cases we need to set the level separately after instantiating the object.

Ruby 2.4

In Ruby 2.4, level can now be specified in the constructor.

#ruby 2.4
require 'logger'
logger = Logger.new(STDOUT, level: Logger::INFO)

# let's verify it
logger.level      #=> 1
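
To see the constructor option take effect, here is a small sketch that logs into an in-memory buffer instead of STDOUT. The build_logger helper and the messages are made up for illustration.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: a logger whose level is set in the constructor.
def build_logger(io)
  Logger.new(io, level: Logger::WARN)
end

buffer = StringIO.new
logger = build_logger(buffer)

logger.info('cache warmed')      # below WARN, so it is dropped
logger.warn('disk almost full')  # at WARN, so it is written

puts buffer.string
```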

Similarly, other options such as progname, formatter and datetime_format, which prior to Ruby 2.4 had to be set explicitly after instantiation, can now be passed to the constructor.

#ruby 2.3
require 'logger'
logger = Logger.new(STDOUT)
logger.level = Logger::INFO
logger.progname = 'bigbinary'
logger.datetime_format = '%Y-%m-%d %H:%M:%S'
logger.formatter = proc do |severity, datetime, progname, msg|
  "#{severity} #{datetime} ==> App: #{progname}, Message: #{msg}\n"
end

logger.info("Program started...")
#=> INFO 2017-03-16 18:43:58 +0530 ==> App: bigbinary, Message: Program started...

Here is the same code in Ruby 2.4.

#ruby 2.4
require 'logger'
logger = Logger.new(STDOUT,
  level: Logger::INFO,
  progname: 'bigbinary',
  datetime_format: '%Y-%m-%d %H:%M:%S',
  formatter: proc do |severity, datetime, progname, msg|
    "#{severity} #{datetime} ==> App: #{progname}, Message: #{msg}\n"
  end
)

logger.info("Program started...")
#=> INFO 2017-03-16 18:47:39 +0530 ==> App: bigbinary, Message: Program started...

Ruby 2.4 has default basename for Tempfile#create

This blog is part of our Ruby 2.4 series.

Tempfile class

Tempfile is used for managing temporary files in Ruby. A Tempfile object creates a temporary file with a unique filename. It behaves just like a File object, and therefore we can perform all the usual file operations on it.

Why Tempfile when we can use File

These days it is common to store files on services like S3. Let’s say that we have a users.csv file on S3. Working with this file remotely is problematic. In such cases it is desirable to download the file to the local machine for manipulation. After the work is done, the file should be deleted. Tempfile is ideal for such cases.

Basename for tempfile

Prior to Ruby 2.3, if we wanted to create a temporary file, we had to pass a parameter to Tempfile.new.

require 'tempfile'
file = Tempfile.new('bigbinary')
#=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/bigbinary-20170304-10828-1w02mqi>

As we can see above, the generated file name begins with the word “bigbinary”.

Since Tempfile ensures that the generated filename will always be unique, there is little point in passing the argument. The Ruby documentation calls this argument “basename”.

So in Ruby 2.3.0 it was decided that the basename parameter was meaningless for Tempfile#new and an empty string will be the default value.

require 'tempfile'
file = Tempfile.new
#=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/20170304-10828-1v855bf>

But the same was not implemented for Tempfile#create.

# Ruby 2.3.0
require 'tempfile'
Tempfile.create do |f|
  f.write "hello"
end

ArgumentError: wrong number of arguments (given 0, expected 1..2)

This was fixed in Ruby 2.4. So now the basename parameter for Tempfile.create is set to empty string by default, to keep it consistent with the Tempfile#new method.

# Ruby 2.4
require 'tempfile'
Tempfile.create do |f|
  f.write "hello"
end
=> 5
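
As a small illustration of the block form, the sketch below writes to a scratch file, reads it back, and then checks that the file is gone once the block has finished. The round_trip helper is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: writes text to a scratch file and reads it back.
# Returns the text that was read along with the path that was used.
def round_trip(text)
  Tempfile.create do |file|
    file.write(text)
    file.flush                          # make sure the bytes reach the file
    [File.read(file.path), file.path]   # the block's value is returned by create
  end
end

contents, path = round_trip('hello')
puts contents            # the text we wrote
puts File.exist?(path)   # the file is removed when the block terminates
```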

New arguments supported for float and integer modifiers in Ruby 2.4

This blog is part of our Ruby 2.4 series.

In Ruby, there are many methods available which help us to modify a float or integer value.

Ruby 2.3.x

In previous versions of Ruby, we could use methods such as floor, ceil and truncate in the following ways.

5.54.floor          #=> 5
5.54.ceil           #=> 6
5.54.truncate       #=> 5

Providing an argument to these methods would result in an ArgumentError exception.

Ruby 2.4

The Ruby community decided to add an optional precision argument.

The precision argument, which can be negative, lets us get the result to the required precision on either side of the decimal point.

The default value for the precision argument is 0.

876.543.floor(-2)       #=> 800
876.543.floor(-1)       #=> 870
876.543.floor           #=> 876
876.543.floor(1)        #=> 876.5
876.543.floor(2)        #=> 876.54

876.543.ceil(-2)        #=> 900
876.543.ceil(-1)        #=> 880
876.543.ceil            #=> 877
876.543.ceil(1)         #=> 876.6
876.543.ceil(2)         #=> 876.55

876.543.truncate(-2)    #=> 800
876.543.truncate(-1)    #=> 870
876.543.truncate        #=> 876
876.543.truncate(1)     #=> 876.5
876.543.truncate(2)     #=> 876.54

These methods all work the same on Integer as well.

5.floor(2)              #=> 5.0
5.ceil(2)               #=> 5.0
5.truncate(2)           #=> 5.0
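
One place where a negative precision is handy is bucketing values into round ranges. Here is a minimal sketch; bucket_bounds is a hypothetical helper, not something Ruby provides.

```ruby
# Hypothetical helper: lower and upper bounds of the bucket a value falls in,
# where digits is the number of digits left of the decimal point to zero out.
def bucket_bounds(value, digits)
  [value.floor(-digits), value.ceil(-digits)]
end

p bucket_bounds(1234, 2)     #=> [1200, 1300]
p bucket_bounds(876.543, 1)  #=> [870, 880]
```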

Ruby 2.4 introduces Enumerable#uniq and Enumerator::Lazy#uniq

This blog is part of our Ruby 2.4 series.

In Ruby, we commonly use the uniq method on an array to fetch the collection of all unique elements. But there may be cases where we need the entries of a hash to be unique by their values.

Let’s consider an example of countries that have hosted the Olympics. We only want to know when a country hosted it for the first time.

# given object
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome' }

# expected outcome
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1908 => 'Rome' }

One way to achieve this is to have a collection of unique country names and then check if that value is already taken while building the result.

olympics =
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome' }

unique_nations = olympics.values.uniq

olympics.select{ |year, country| !unique_nations.delete(country).nil? }
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}

As we can see, the above code requires constructing an additional array unique_nations.

When processing larger data sets, loading a considerably big array in memory and then carrying out further processing on it may result in performance and memory issues.

In Ruby 2.4, the Enumerable module introduces a uniq method that collects unique elements while iterating over the enumerable object.

The usage is similar to that of Array#uniq. Uniqueness can be determined by the elements themselves or by a value yielded by the block passed to the uniq method.

olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}

olympics.uniq { |year, country| country }.to_h
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
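
Since uniq now lives in Enumerable, any class that includes Enumerable and defines each gets it as well. The Playlist class below is a made-up example to illustrate this.

```ruby
# Hypothetical collection class; it only defines each.
class Playlist
  include Enumerable

  def initialize(*tracks)
    @tracks = tracks
  end

  # Enumerable builds uniq (and everything else) on top of each.
  def each(&block)
    @tracks.each(&block)
  end
end

playlist = Playlist.new('Intro', 'Anthem', 'intro', 'Outro')

p playlist.uniq              #=> ["Intro", "Anthem", "intro", "Outro"]
p playlist.uniq(&:downcase)  #=> ["Intro", "Anthem", "Outro"]
```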

A similar method is also implemented in the Enumerator::Lazy class. Hence we can now call uniq on lazy enumerators.

(1..Float::INFINITY).lazy.uniq { |x| (x**2) % 10 }.first(6)
#=> [1, 2, 3, 4, 5, 10]

Ruby 2.4 has optimized lstrip and strip methods for ASCII strings

This blog is part of our Ruby 2.4 series.

Ruby has lstrip and rstrip methods which can be used to remove leading and trailing whitespace respectively from a string.

Ruby also has the strip method, which is a combination of lstrip and rstrip and can be used to remove both leading and trailing whitespace from a string.

"    Hello World    ".lstrip    #=> "Hello World    "
"    Hello World    ".rstrip    #=> "    Hello World"
"    Hello World    ".strip     #=> "Hello World"

Prior to Ruby 2.4, the rstrip method was optimized for performance, but lstrip and strip were not. In Ruby 2.4, String#lstrip and String#strip have also been optimized to get the same performance benefit as String#rstrip.

Let’s run the following snippet in Ruby 2.3 and Ruby 2.4 to benchmark and compare the performance improvement.

require 'benchmark/ips'

Benchmark.ips do |bench|
  str1 = " " * 10_000_000 + "hello world" + " " * 10_000_000
  str2 = str1.dup
  str3 = str1.dup

  bench.report('String#lstrip') do
    str1.lstrip
  end

  bench.report('String#rstrip') do
    str2.rstrip
  end

  bench.report('String#strip') do
    str3.strip
  end
end

Result for Ruby 2.3

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     8.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     10.989  (± 0.0%) i/s -     55.000  in   5.010903s
       String#rstrip     92.514  (± 5.4%) i/s -    464.000  in   5.032208s
        String#strip     10.170  (± 0.0%) i/s -     51.000  in   5.022118s

Result for Ruby 2.4

Warming up --------------------------------------
       String#lstrip    14.000  i/100ms
       String#rstrip     8.000  i/100ms
        String#strip     6.000  i/100ms
Calculating -------------------------------------
       String#lstrip    143.424  (± 4.2%) i/s -    728.000  in   5.085311s
       String#rstrip     89.150  (± 5.6%) i/s -    448.000  in   5.041301s
        String#strip     67.834  (± 4.4%) i/s -    342.000  in   5.051584s

From the above results, we can see that in Ruby 2.4, String#lstrip is around 13x faster while String#strip is around 6.7x faster. String#rstrip, as expected, has nearly the same performance, as it was already optimized in previous versions.

Performance remains same for multi-byte strings

Strings can have single byte or multi-byte characters.

For example, Lé Hello World is a multi-byte string because of the presence of é, which is a multi-byte character.

'e'.bytesize        #=> 1
'é'.bytesize        #=> 2

Let’s do performance benchmarking with string Lé hello world instead of hello world.

Result for Ruby 2.3

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     1.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     11.147  (± 9.0%) i/s -     56.000  in   5.034363s
       String#rstrip      8.693  (± 0.0%) i/s -     44.000  in   5.075011s
        String#strip      5.020  (± 0.0%) i/s -     26.000  in   5.183517s

Result for Ruby 2.4

Warming up --------------------------------------
       String#lstrip     1.000  i/100ms
       String#rstrip     1.000  i/100ms
        String#strip     1.000  i/100ms
Calculating -------------------------------------
       String#lstrip     10.691  (± 0.0%) i/s -     54.000  in   5.055101s
       String#rstrip      9.524  (± 0.0%) i/s -     48.000  in   5.052678s
        String#strip      4.860  (± 0.0%) i/s -     25.000  in   5.152804s

As we can see, the performance for multi-byte strings is almost the same across Ruby 2.3 and Ruby 2.4.

Explanation

The optimization is related to how strings are scanned to detect whitespace. Checking for whitespace in a multi-byte string requires additional overhead. So the patch adds an initial condition to check if the string is a single byte string and, if so, processes it separately.

In most cases, strings are single byte, so the performance improvement is visible and helpful.
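
The idea of a separate single byte path can be sketched in Ruby as below. This is only a rough analogy: the real implementation is in C, and here String#ascii_only? stands in for the internal single byte check. The lstrip_sketch method and WHITESPACE_BYTES constant are hypothetical names.

```ruby
# Bytes treated as whitespace in this sketch: tab, LF, VT, FF, CR and space.
WHITESPACE_BYTES = [9, 10, 11, 12, 13, 32].freeze

# Hypothetical illustration only, not Ruby's actual algorithm.
def lstrip_sketch(str)
  if str.ascii_only?
    # Fast path: every character is exactly one byte, so scan raw bytes.
    offset = 0
    offset += 1 while offset < str.bytesize &&
                      WHITESPACE_BYTES.include?(str.getbyte(offset))
    str.byteslice(offset, str.bytesize - offset)
  else
    # Slow path: characters may span several bytes, so walk them one by one.
    chars = str.chars
    chars.shift while chars.first&.match?(/\A\s\z/)
    chars.join
  end
end

p lstrip_sketch("   hello world  ")
p lstrip_sketch("   Lé hello world  ")
```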

IO#readlines now accepts chomp flag as an argument

This blog is part of our Ruby 2.4 series.

Consider the following file which needs to be read in Ruby. We can use the IO#readlines method to get the lines in an array.

# lotr.txt

Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.

Ruby 2.3

IO.readlines('lotr.txt')
#=> ["Three Rings for the Elven-kings under the sky,\n", "Seven for the Dwarf-lords in their halls of stone,\n", "Nine for Mortal Men doomed to die,\n", "One for the Dark Lord on his dark throne\n", "In the Land of Mordor where the Shadows lie."]

As we can see, the lines in the array end with \n, the newline character, which is not stripped while reading the lines. In most cases the newline character needs to be removed. Prior to Ruby 2.4, it could be done in the following way.

IO.readlines('lotr.txt').map(&:chomp)
#=> ["Three Rings for the Elven-kings under the sky,", "Seven for the Dwarf-lords in their halls of stone,", "Nine for Mortal Men doomed to die,", "One for the Dark Lord on his dark throne", "In the Land of Mordor where the Shadows lie."]

Ruby 2.4

Since it was a common requirement, the Ruby team decided to add an optional parameter to the readlines method. So the same can now be achieved in Ruby 2.4 in the following way.

IO.readlines('lotr.txt', chomp: true)
#=> ["Three Rings for the Elven-kings under the sky,", "Seven for the Dwarf-lords in their halls of stone,", "Nine for Mortal Men doomed to die,", "One for the Dark Lord on his dark throne", "In the Land of Mordor where the Shadows lie."]

Additionally, the IO#gets, IO#readline, IO#each_line and IO.foreach methods have also been modified to accept an optional chomp flag.
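
Here is a short sketch of the chomp flag with File.foreach, using a temporary file so that it can be run anywhere. The lines_without_newlines helper and the file contents are assumptions made for the example.

```ruby
require 'tempfile'

# Hypothetical helper: reads every line of a file without trailing newlines.
def lines_without_newlines(path)
  File.foreach(path, chomp: true).to_a
end

Tempfile.create('lotr') do |file|
  file.write("Three Rings\nSeven\nNine")
  file.flush

  p lines_without_newlines(file.path)
  p File.readlines(file.path, chomp: true)
end
```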

open-uri in Ruby 2.4 allows http to https redirection

In Ruby 2.3, if the URL passed to open-uri uses http and the host redirects to https, then open-uri would raise an error.

> require 'open-uri'
> open('http://www.google.com/gmail')

RuntimeError: redirection forbidden: http://www.google.com/gmail -> https://www.google.com/gmail/

To get around this issue, we could use the open_uri_redirections gem.

> require 'open-uri'
> require 'open_uri_redirections'
> open('http://www.google.com/gmail/', :allow_redirections => :safe)

=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/open-uri20170228-41042-2fffoa>

Ruby 2.4

In Ruby 2.4, this issue is fixed. So now http to https redirection is possible using open-uri.

> require 'open-uri'
> open('http://www.google.com/gmail')
=> #<Tempfile:/var/folders/jv/fxkfk9_10nb_964rvrszs2540000gn/T/open-uri20170228-41077-1bkm1dv>

Note that redirection from https to http will still raise an error, as it did in previous versions, since allowing it could raise security concerns.

Ruby 2.4 now has Dir.empty? and File.empty? methods

This blog is part of our Ruby 2.4 series.

Prior to Ruby 2.4, to check whether a given directory is empty, we would write something like this.

Dir.entries("/usr/lib").size == 2       #=> false
Dir.entries("/home").size == 2          #=> true

Every directory in a Unix filesystem contains at least two entries. These are . (current directory) and .. (parent directory).

Hence, the code above checks whether there are only two entries and, if so, considers the directory empty.

Again, this code only works for UNIX filesystems and fails on Windows machines, as Windows directories don’t have . or ...

Dir.empty?

Considering all this, Ruby has finally included a new method Dir.empty? that takes a directory path as its argument and returns a boolean.

Here is an example.

Dir.empty?('/Users/rtdp/Documents/posts')   #=> true

Most importantly, this method works correctly on all platforms.

File.empty?

To check if a file is empty, Ruby has the File.zero? method. This checks if the file exists and has zero size.

File.zero?('/Users/rtdp/Documents/todo.txt')    #=> true

After introducing Dir.empty?, it made sense to add File.empty? as an alias for File.zero?.

File.empty?('/Users/rtdp/Documents/todo.txt')    #=> true
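
Both predicates can be tried out safely inside a throwaway directory. The describe_path helper below is hypothetical and only combines the two new methods with File.exist? and File.directory?.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper that classifies a path using the new predicates.
def describe_path(path)
  return :missing unless File.exist?(path)

  if File.directory?(path)
    Dir.empty?(path) ? :empty_directory : :directory
  else
    File.empty?(path) ? :empty_file : :file
  end
end

Dir.mktmpdir do |dir|
  p describe_path(dir)      # nothing inside yet

  notes = File.join(dir, 'notes.txt')
  FileUtils.touch(notes)

  p describe_path(notes)    # exists, but has zero size
  p describe_path(dir)      # now contains one entry
end
```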

Ruby 2.4 implements Integer#digits for extracting digits in place-value notation

This blog is part of our Ruby 2.4 series.

If we want to extract all the digits of an integer from right to left, the newly added Integer#digits method will come in handy.

567321.digits
#=> [1, 2, 3, 7, 6, 5]

567321.digits[3]
#=> 7

We can also supply a different base as an argument.

0123.digits(8)
#=> [3, 2, 1]

0xabcdef.digits(16)
#=> [15, 14, 13, 12, 11, 10]

Use case of digits

We can use Integer#digits to sum all the digits in an integer.

123.to_s.chars.map(&:to_i).sum
#=> 6

123.digits.sum
#=> 6

Also while calculating checksums like Luhn and Verhoeff, Integer#digits will help in reducing string allocation.
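
As an illustration, here is a minimal sketch of a Luhn check built on Integer#digits. Because digits returns the digits from right to left, the check digit sits at index 0. The luhn_valid? method is our own example, not part of Ruby.

```ruby
# Hypothetical helper: validates a number using the Luhn checksum.
def luhn_valid?(number)
  total = number.digits.each_with_index.map do |digit, index|
    if index.even?
      digit                                 # check digit and every other digit stay as-is
    else
      doubled = digit * 2                   # every second digit from the right is doubled
      doubled > 9 ? doubled - 9 : doubled   # same as adding the two digits of the product
    end
  end.sum

  (total % 10).zero?
end

p luhn_valid?(79927398713)  #=> true
p luhn_valid?(79927398710)  #=> false
```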

Ruby 2.4 adds Set#compare_by_identity and Set#compare_by_identity? methods

This blog is part of our Ruby 2.4 series.

In Ruby, the Object#equal? method is used to compare two objects by their identity, that is, whether the two are exactly the same object. Ruby also has the Object#eql? method, which returns true if two objects have the same value.

For example:

str1 = "Sample string"
str2 = str1.dup

str1.eql?(str2)     #=> true

str1.equal?(str2)   #=> false

We can see that the object ids of the two objects are not the same.

str1.object_id      #=> 70334175057920

str2.object_id      #=> 70334195702480

In Ruby, Set does not allow duplicate items in its collection. To determine whether two items in a Set are equal, Ruby uses Object#eql? and not Object#equal?.

So adding two different objects with the same value to a set was not possible prior to Ruby 2.4.

Ruby 2.3

require 'set'

set = Set.new           #=> #<Set: {}>

str1 = "Sample string"  #=> "Sample string"
str2 = str1.dup         #=> "Sample string"

set.add(str1)           #=> #<Set: {"Sample string"}>
set.add(str2)           #=> #<Set: {"Sample string"}>

But with the new Set#compare_by_identity method introduced in Ruby 2.4, a set can now compare its elements using Object#equal? and check for the exact same objects.

Ruby 2.4

require 'set'

set = Set.new.compare_by_identity           #=> #<Set: {}>

str1 = "Sample string"                      #=> "Sample string"
str2 = str1.dup                             #=> "Sample string"

set.add(str1)                               #=> #<Set: {"Sample string"}>
set.add(str2)                               #=> #<Set: {"Sample string", "Sample string"}>

Set#compare_by_identity?

Ruby 2.4 also provides the compare_by_identity? method to know if the set will compare its elements by their identity.

require 'set'

set1 = Set.new                         #=> #<Set: {}>
set2 = Set.new.compare_by_identity     #=> #<Set: {}>

set1.compare_by_identity?              #=> false

set2.compare_by_identity?              #=> true
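
One possible use is counting distinct object instances in a list even when their values are equal. The distinct_instances helper below is a hypothetical example.

```ruby
require 'set'

# Hypothetical helper: counts distinct object instances, ignoring value equality.
def distinct_instances(objects)
  Set.new.compare_by_identity.merge(objects).size
end

str1 = 'Sample string'.dup
str2 = str1.dup

p distinct_instances([str1, str2, str1])  #=> 2
p Set.new([str1, str2, str1]).size        #=> 1
```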

Ruby 2.4 adds support for extracting named capture groups using MatchData#values_at

This blog is part of our Ruby 2.4 series.

Ruby 2.3

We can use MatchData#[] to extract named capture and positional capture groups.

pattern=/(?<number>\d+) (?<word>\w+)/
pattern.match('100 thousand')[:number]
#=> "100"

pattern=/(\d+) (\w+)/
pattern.match('100 thousand')[2]
#=> "thousand"

Positional capture groups could also be extracted using MatchData#values_at.

pattern=/(\d+) (\w+)/
pattern.match('100 thousand').values_at(2)
#=> ["thousand"]

Changes in Ruby 2.4

In Ruby 2.4, we can pass a string or symbol to MatchData#values_at to extract named capture groups.

pattern=/(?<number>\d+) (?<word>\w+)/
pattern.match('100 thousand').values_at(:number)
#=> ["100"]
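
This is convenient when we want several named groups at once. Below is a small sketch; the LOG_LINE pattern and the parse_log_line helper are made up for illustration.

```ruby
# Hypothetical pattern for lines such as "ERROR 500 upstream timed out".
LOG_LINE = /(?<level>[A-Z]+) (?<code>\d+) (?<message>.+)/

# Hypothetical helper: returns [level, code, message] or nil when nothing matches.
def parse_log_line(line)
  match = LOG_LINE.match(line)
  return nil unless match

  match.values_at(:level, :code, :message)
end

level, code, message = parse_log_line('ERROR 500 upstream timed out')
puts level
puts code
puts message
```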

Ruby 2.4 adds infinite? and finite? methods to Numeric

This blog is part of our Ruby 2.4 series.

Prior to Ruby 2.4

Prior to Ruby 2.4, Float and BigDecimal responded to methods infinite? and finite?, whereas Fixnum and Bignum did not.

Ruby 2.3

#infinite?

5.0.infinite?
=> nil

Float::INFINITY.infinite?
=> 1

5.infinite?
NoMethodError: undefined method `infinite?' for 5:Fixnum

#finite?

5.0.finite?
=> true

5.finite?
NoMethodError: undefined method `finite?' for 5:Fixnum

Ruby 2.4

To make the behavior consistent across all numeric values, infinite? and finite? were added to Fixnum and Bignum, even though for them infinite? would always return nil and finite? would always return true.

This gives us the ability to call these methods irrespective of whether the values are integers or floating point numbers.

#infinite?

5.0.infinite?
=> nil

Float::INFINITY.infinite?
=> 1

5.infinite?
=> nil

#finite?

5.0.finite?
=> true

5.finite?
=> true
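
In practice this means a mixed list of integers and floats can be filtered without checking the class of each value first. The finite_only helper is a hypothetical example.

```ruby
# Hypothetical helper: keeps only the finite values from a mixed list.
def finite_only(values)
  values.select(&:finite?)
end

p finite_only([1, 2.5, Float::INFINITY, -Float::INFINITY, 10])
#=> [1, 2.5, 10]
```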

Ruby 2.4 adds Comparable#clamp method

This blog is part of our Ruby 2.4 series.

In Ruby 2.4, the clamp method is added to the Comparable module. This method can be used to clamp an object within a specific range of values.

The clamp method takes min and max as two arguments to define the range of values within which the receiver should be clamped.

Clamping numbers

clamp can be used to keep a number within the range from min to max.

10.clamp(5, 20)
=> 10

10.clamp(15, 20)
=> 15

10.clamp(0, 5)
=> 5

Clamping strings

Similarly, strings can also be clamped within a range.

"e".clamp("a", "s")
=> "e"

"e".clamp("f", "s")
=> "f"

"e".clamp("a", "c")
=> "c"

"this".clamp("thief", "thin")
=> "thin"

Internally, this method relies on applying the spaceship <=> operator between the object and the min and max arguments.

return min if (x <=> min) < 0
return max if (x <=> max) > 0
x
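
Because clamp lives in Comparable, any class that includes Comparable and defines <=> can use it too. The Volume class below is a hypothetical example.

```ruby
# Hypothetical value object that is compared by its level.
class Volume
  include Comparable

  attr_reader :level

  def initialize(level)
    @level = level
  end

  # Comparable (and therefore clamp) is built on top of <=>.
  def <=>(other)
    level <=> other.level
  end
end

quietest = Volume.new(0)
loudest  = Volume.new(10)

p Volume.new(14).clamp(quietest, loudest).level  #=> 10
p Volume.new(-3).clamp(quietest, loudest).level  #=> 0
p Volume.new(5).clamp(quietest, loudest).level   #=> 5
```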

Ruby 2.4 introduces liberal_parsing option for parsing bad CSV data

This blog is part of our Ruby 2.4 series.

Comma-Separated Values (CSV) is a widely used data format and almost every language has a module to parse it. In Ruby, we have the CSV class to do that.

According to RFC 4180, we cannot have unescaped double quotes in CSV input since such data can’t be parsed.

We get a CSV::MalformedCSVError when the CSV data does not conform to RFC 4180.

Ruby 2.4 has added a liberal_parsing option to parse such bad data. When it is set to true, Ruby will try to parse the data even when the data does not conform to RFC 4180.

# Before Ruby 2.4

> require 'csv'
> CSV.parse_line('one,two",three,four')

CSV::MalformedCSVError: Illegal quoting in line 1.


# With Ruby 2.4

> CSV.parse_line('one,two",three,four', liberal_parsing: true)

=> ["one", "two\"", "three", "four"]

Passing block with Enumerable#chunk is not mandatory in Ruby 2.4

This blog is part of our Ruby 2.4 series.

The Enumerable#chunk method can be used on an enumerable object to group consecutive items based on the value returned from the block passed to it.

[1, 4, 7, 10, 2, 6, 15].chunk { |item| item > 5 }.each { |values| p values }

=> [false, [1, 4]]
[true, [7, 10]]
[false, [2]]
[true, [6, 15]]

Prior to Ruby 2.4, passing a block to the chunk method was mandatory.

array = [1,2,3,4,5,6]
array.chunk

=> ArgumentError: no block given

Enumerable#chunk without block in Ruby 2.4

In Ruby 2.4, we can use chunk without passing a block. It just returns an enumerator object which we can use to chain further operations.

array = [1,2,3,4,5,6]
array.chunk

=> #<Enumerator: [1, 2, 3, 4, 5, 6]:chunk>

Reasons for this change

Let’s take the case of listing consecutive integers in an array of ranges.

# Before Ruby 2.4

integers = [1,2,4,5,6,7,9,13]

integers.enum_for(:chunk).with_index { |x, idx| x - idx }.map do |diff, group|
  [group.first, group.last]
end

=> [[1,2],[4,7],[9,9],[13,13]]

We had to use enum_for here as chunk could not be called without a block.

enum_for creates a new enumerator object which will enumerate by calling the method passed to it. In this case the method passed was chunk.

With Ruby 2.4, we can call the chunk method directly without enum_for, as it no longer requires a block to be passed.

# Ruby 2.4

integers = [1,2,4,5,6,7,9,13]

integers.chunk.with_index { |x, idx| x - idx }.map do |diff, group|
  [group.first, group.last]
end

=> [[1,2],[4,7],[9,9],[13,13]]

Ruby 2.4 unifies Fixnum and Bignum into Integer

This blog is part of our Ruby 2.4 series.

Prior to Ruby 2.4, Ruby used the Fixnum class for representing small numbers and the Bignum class for big numbers.

# Before Ruby 2.4

1.class         #=> Fixnum
(2 ** 62).class #=> Bignum

In routine work we don’t have to worry about whether the number we are dealing with is a Bignum or a Fixnum. It’s just an implementation detail.

Interestingly, Ruby also has the Integer class, which is the superclass of Fixnum and Bignum.

Starting with Ruby 2.4, Fixnum and Bignum are unified into Integer.

# Ruby 2.4

1.class         #=> Integer
(2 ** 62).class #=> Integer

Starting with Ruby 2.4, usage of the Fixnum and Bignum constants is deprecated.

# Ruby 2.4

>> Fixnum
(irb):6: warning: constant ::Fixnum is deprecated
=> Integer

>> Bignum
(irb):7: warning: constant ::Bignum is deprecated
=> Integer

How to know if a number is Fixnum, Bignum or Integer?

Most of the time we don’t have to worry about this change in our application code. But libraries like Rails use the class of numbers for making certain decisions. These libraries need to support both Ruby 2.4 and previous versions of Ruby.

The easiest way to know whether the Ruby version in use has integer unification is to check the class of 1.

# Ruby 2.4

1.class #=> Integer

# Before Ruby 2.4
1.class #=> Fixnum
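
A library that has to run on both sides of this change could wrap the check in small helpers like the ones below. Both method names are hypothetical and are not taken from Rails.

```ruby
# Hypothetical helper: true on Ruby 2.4 and later, false on earlier versions.
def integer_unified?
  1.class == Integer
end

# Hypothetical helper: works on every version, because Fixnum and Bignum
# were subclasses of Integer before the unification.
def whole_number?(value)
  value.is_a?(Integer)
end

p integer_unified?
p whole_number?(1)
p whole_number?(2**62)
p whole_number?(1.5)
```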

Look at PR #25056 to see how Rails is handling this case.

Similarly, Arel also supports both Ruby 2.4 and previous versions of Ruby.

Ruby 2.4 implements Array#min and Array#max

This blog is part of our Ruby 2.4 series.

Ruby has Enumerable#min and Enumerable#max which can be used to find the minimum and the maximum value in an Array.

(1..10).to_a.max
#=> 10
(1..10).to_a.method(:max)
#=> #<Method: Array(Enumerable)#max>

Ruby 2.4 adds Array#min and Array#max, which are much faster than Enumerable#max and Enumerable#min.

The following benchmark is based on https://blog.blockscore.com/new-features-in-ruby-2-4 .

require 'benchmark/ips'

Benchmark.ips do |bench|
  NUM1 = 1_000_000.times.map { rand }
  NUM2 = NUM1.dup

  ENUM_MAX = Enumerable.instance_method(:max).bind(NUM1)
  ARRAY_MAX = Array.instance_method(:max).bind(NUM2)

  bench.report('Enumerable#max') do
    ENUM_MAX.call
  end

  bench.report('Array#max') do
    ARRAY_MAX.call
  end

  bench.compare!
end

Warming up --------------------------------------
      Enumerable#max     1.000  i/100ms
           Array#max     2.000  i/100ms
Calculating -------------------------------------
      Enumerable#max     17.569  (± 5.7%) i/s -     88.000  in   5.026996s
           Array#max     26.703  (± 3.7%) i/s -    134.000  in   5.032562s

Comparison:
           Array#max:       26.7 i/s
      Enumerable#max:       17.6 i/s - 1.52x  slower

Benchmark.ips do |bench|
  NUM1 = 1_000_000.times.map { rand }
  NUM2 = NUM1.dup

  ENUM_MIN = Enumerable.instance_method(:min).bind(NUM1)
  ARRAY_MIN = Array.instance_method(:min).bind(NUM2)

  bench.report('Enumerable#min') do
    ENUM_MIN.call
  end

  bench.report('Array#min') do
    ARRAY_MIN.call
  end

  bench.compare!
end

Warming up --------------------------------------
      Enumerable#min     1.000  i/100ms
           Array#min     2.000  i/100ms
Calculating -------------------------------------
      Enumerable#min     18.621  (± 5.4%) i/s -     93.000  in   5.007244s
           Array#min     26.902  (± 3.7%) i/s -    136.000  in   5.064815s

Comparison:
           Array#min:       26.9 i/s
      Enumerable#min:       18.6 i/s - 1.44x  slower

This benchmark shows that the new methods Array#max and Array#min are about 1.5 times faster than Enumerable#max and Enumerable#min.

Like Enumerable#max and Enumerable#min, Array#max and Array#min assume that the elements define the spaceship <=> operator (as classes using the Comparable mixin do) so that they can be compared with each other.
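
Here is a short sketch showing where the method is defined and how it behaves with our own objects. The Release struct is a hypothetical example; it only needs <=> for min and max to work.

```ruby
# On Ruby 2.4 and later the method is defined on Array itself.
p([3, 1, 2].method(:max).owner)

# Hypothetical value object that is compared by major and then minor version.
Release = Struct.new(:major, :minor) do
  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end
end

releases = [Release.new(2, 3), Release.new(2, 4), Release.new(1, 9)]

newest = releases.max
oldest = releases.min

puts "#{newest.major}.#{newest.minor}"
puts "#{oldest.major}.#{oldest.minor}"
```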