The C programming language allows developers to directly access the memory where variables are stored. Ruby does not allow that. Still, there are times when working in Ruby when you need to access the underlying bits and bytes. Ruby provides two methods for that: `pack` and `unpack`.

Here is an example.
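A minimal sketch of the kind of call I mean, assuming a one-character ASCII string (the variable name `bits` is only for illustration):

```ruby
# Read the bits of the single byte that stores 'A'.
bits = 'A'.unpack('b*')
p bits # => ["10000010"]
```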

In the above case, ‘A’ is the string being stored, and I am using `unpack` to read its bits. The ASCII table says that the ASCII value of ‘A’ is 65, and the binary representation of 65 is `01000001`. With the `b*` directive the bits are listed least significant bit first, so the result reads `10000010`.

Here is another example.
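A sketch of the same call with the other directive, again assuming the string `'A'` (the variable name is illustrative):

```ruby
# Same byte, read with the uppercase directive.
bits_upper = 'A'.unpack('B*')
p bits_upper # => ["01000001"]
```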

Notice the difference in the result from the first case. What’s the difference between `b*` and `B*`? To understand the difference, let’s first discuss MSB and LSB.

## Most significant bit vs Least significant bit

Not all bits are created equal. `C` has an ASCII value of 67. The binary value of 67 is `1000011`.

First, let’s discuss MSB (most significant bit) style. In MSB style, going from left to right (and you always go from left to right), the most significant bit comes first. Because the most significant bit comes first, we can pad an additional zero on the left to bring the number of bits to eight without changing the value. After adding the zero, the binary value looks like `01000011`.
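The padding step can be sketched in Ruby like this, assuming `Integer#to_s(2)` for the binary digits and `String#rjust` for the zero on the left (the variable names are illustrative):

```ruby
value  = 'C'.ord               # ASCII value of 'C'
binary = value.to_s(2)         # seven digits, no leading zero
padded = binary.rjust(8, '0')  # pad on the left to eight bits

p value  # => 67
p binary # => "1000011"
p padded # => "01000011"
```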

If we want to convert this value to LSB (least significant bit) style, then we need to store the least significant bit first, going from left to right. Given below is how the bits move when converting from MSB to LSB. Note that position 1 refers to the leftmost bit.
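The move can be sketched as a loop, assuming the eight-bit MSB string `01000011` for `C`. The bit at position 1 ends up at position 8, the bit at position 2 at position 7, and so on, which is the same as reversing the string:

```ruby
msb = '01000011'

# Print where each bit goes; positions are counted from 1 on the left.
msb.each_char.with_index(1) do |bit, from|
  to = 9 - from
  puts "bit #{bit}: position #{from} -> position #{to}"
end

lsb = msb.reverse
p lsb # => "11000010"
```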

After the exercise is over, the value will look like `11000010`.

We did this exercise manually to understand the difference between `most significant bit` and `least significant bit`. However, the `unpack` method can give the result directly in both MSB and LSB style. It accepts both `b*` and `B*` as input. As per the Ruby documentation, `B` is a bit string with the most significant bit first, and `b` is a bit string with the least significant bit first.

Now let’s take a look at two examples.
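A sketch of the two calls side by side, assuming the string `'C'` from the manual exercise (variable names are illustrative):

```ruby
lsb_first = 'C'.unpack('b*')
msb_first = 'C'.unpack('B*')

p lsb_first # => ["11000010"]
p msb_first # => ["01000011"]

# One string is the other read backwards.
p lsb_first.first.reverse == msb_first.first # => true
```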

Both `b*` and `B*` are looking at the same underlying data. It’s just that they represent the data differently.

## Different ways of getting the same data

Let’s say that I want the binary value for the string `hello`. Based on the discussion in the last section, that should be easy now.
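A minimal sketch, assuming MSB style is what is wanted:

```ruby
bits = 'hello'.unpack('B*')
p bits # => ["0110100001100101011011000110110001101111"]
```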

The same information can also be derived one byte at a time.
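One way to sketch it, assuming `Integer#to_s(2)` is acceptable for the conversion:

```ruby
per_byte = 'hello'.unpack('C*').map { |byte| byte.to_s(2) }
p per_byte # => ["1101000", "1100101", "1101100", "1101100", "1101111"]
```

Note that `to_s(2)` does not print leading zeros, so each entry here has seven digits rather than eight.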

Let’s break down the previous statement in small steps.

The directive `C*` gives the `8-bit unsigned integer` value of each character. Note that the ASCII value of `h` is `104`, the ASCII value of `e` is `101`, and so on.
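The small steps can be sketched like this. The `rjust` padding is my own addition, used only to show that the per-byte values line up with what `B*` returns:

```ruby
# Step 1: one integer per byte.
bytes = 'hello'.unpack('C*')
p bytes # => [104, 101, 108, 108, 111]

# Step 2: convert a single integer to binary.
p bytes.first.to_s(2) # => "1101000"

# Step 3: do it for every byte, padded to eight bits.
binary = bytes.map { |byte| byte.to_s(2).rjust(8, '0') }
p binary.join == 'hello'.unpack('B*').first # => true
```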

Using the technique discussed above, I can find the hex value of the string.
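A sketch of that approach, assuming `Integer#to_s(16)` for the hex conversion:

```ruby
hex = 'hello'.unpack('C*').map { |byte| byte.to_s(16) }
p hex # => ["68", "65", "6c", "6c", "6f"]
```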

The hex value can also be obtained directly.
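For instance, a minimal sketch using the `H*` directive:

```ruby
hex_string = 'hello'.unpack('H*')
p hex_string # => ["68656c6c6f"]
```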

## High nibble first vs Low nibble first

Notice the difference in the two cases below.
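A sketch of two such cases, assuming the string `hello` again:

```ruby
high_first = 'hello'.unpack('H*')
low_first  = 'hello'.unpack('h*')

p high_first # => ["68656c6c6f"]
p low_first  # => ["8656c6c6f6"]
```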

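As per the Ruby documentation for `unpack`, `H` is a hex string with the high nibble first, and `h` is a hex string with the low nibble first.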

A byte consists of 8 bits. A nibble consists of 4 bits. So a byte has two nibbles. The ASCII value of ‘h’ is `104`. The hex value of 104 is `68`. This `68` is stored in two nibbles. The first nibble, meaning 4 bits, contains the value `6` and the second nibble contains the value `8`. In general we deal with the high nibble first, so going from left to right we pick the value `6` and then `8`.

However, if you are dealing with low nibble first, then the low nibble value `8` takes the first slot and `6` comes after it. Hence the result in “low nibble first” mode is `86`.

This pattern is repeated for each byte. Because of that, a hex value of `68 65 6c 6c 6f` looks like `86 56 c6 c6 f6` in low nibble first format.
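The per-byte swap can be checked with a short sketch. The `scan`/`reverse` approach is just my way of illustrating it, not something `unpack` does internally:

```ruby
high_first = 'hello'.unpack('H*').first # "68656c6c6f"

# Split into two-character bytes and swap the nibbles in each one.
swapped = high_first.scan(/../).map(&:reverse).join

p swapped                               # => "8656c6c6f6"
p swapped == 'hello'.unpack('h*').first # => true
```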

## Mix and match directives

In all the previous examples I used `*`. A `*` means to keep applying the directive until the data runs out. Let’s see a few examples.

A single `C` will get a single byte.
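A minimal sketch, assuming the string `hello world`:

```ruby
one_byte = 'hello world'.unpack('C')
p one_byte # => [104]
```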

You can add more `C`s if you like.
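For instance, assuming the same `hello world` string:

```ruby
two_bytes   = 'hello world'.unpack('CC')
three_bytes = 'hello world'.unpack('CCC')

p two_bytes   # => [104, 101]
p three_bytes # => [104, 101, 108]
```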

Rather than repeating all those directives, I can put a number after a directive to denote how many times it should be repeated.
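A sketch with a count of five, again assuming `hello world` as the input:

```ruby
five_bytes = 'hello world'.unpack('C5')
p five_bytes # => [104, 101, 108, 108, 111]
```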

I can use `*` to capture all the remaining bytes.
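With the same assumed input, a sketch of `*` looks like this:

```ruby
all_bytes = 'hello world'.unpack('C*')
p all_bytes # => [104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]
```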

Below is an example where `MSB` and `LSB` are being mixed.
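A sketch of such a mix, assuming a two-byte string `aa` so that both directives read the same character:

```ruby
# First byte LSB first (b8), second byte MSB first (B8).
mixed = 'aa'.unpack('b8B8')
p mixed # => ["10000110", "01100001"]
```

The ASCII value of `a` is 97, which is `01100001` in MSB style and `10000110` in LSB style.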

## pack is the reverse of unpack

The `pack` method goes the other way: it takes an array of values and builds the string of stored bytes that `unpack` reads. Let’s discuss a few examples.
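A minimal sketch of a first example, assuming the value is written as a Ruby binary literal (`0b01000001` is 65):

```ruby
packed = [0b01000001].pack('C')
p packed # => "A"
```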

In the above case, the binary value is interpreted as an `8-bit unsigned integer`, and the result is ‘A’.
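Now a sketch of a second case, assuming a single-character string as input and the `H` directive:

```ruby
packed_hex = ['A'].pack('H')
p packed_hex       # => "\xA0"
p packed_hex.bytes # => [160]
```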

In the above case, the input ‘A’ is not the ASCII ‘A’ but the hex digit ‘A’. Why is it hex ‘A’? Because the directive ‘H’ tells `pack` to treat the input as a hex value. Since ‘H’ is high nibble first and the input has only one nibble, the second nibble is taken to be zero. So the input effectively changes from `['A']` to `['A0']`.

Since the hex value `A0` (decimal 160) lies outside the ASCII table, the byte has no printable form and is shown as the escape `\xA0`. The leading `\x` indicates that the value is a hex value.

Notice that in hex notation `A` is the same as `a`. So in the above example I can replace `A` with `a` and the result should not change. Let’s try that.
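A sketch of that check, with illustrative variable names:

```ruby
upper = ['A'].pack('H')
lower = ['a'].pack('H')

p lower          # => "\xA0"
p upper == lower # => true
```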

Let’s discuss another example.
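A minimal sketch, assuming the same one-character input but the lowercase directive:

```ruby
packed_low = ['a'].pack('h')
p packed_low       # => "\n"
p packed_low.bytes # => [10]
```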

In the above example, notice the change. I changed the directive from `H` to `h`. Since `h` means low nibble first and the input has only one nibble, that nibble is treated as the low nibble and the missing high nibble becomes zero. Written high nibble first, the value changes from `['a']` to `['0a']`, and the output is `\x0A`. If you look at the ASCII table, hex value `0A` is ASCII value 10, which is `NL line feed, new line`. Hence we see `\n` as the output, because it represents a line feed.

## Usage of unpack in Rails source code

I did a quick grep in the Rails source code and found the following usages of `unpack`.

We have already seen the directives `C*` and `H` used with `unpack`. The directive `m` decodes a base64-encoded string, and the directive `U*` gives the Unicode code point of each UTF-8 character. Here is an example.
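A small sketch of both directives, assuming the string `hello` and a base64 string produced with `pack('m')`. None of this is taken from the Rails code itself:

```ruby
# pack('m') base64-encodes; unpack('m') decodes it again.
encoded = ['hello'].pack('m')
decoded = encoded.unpack('m')
p encoded # => "aGVsbG8=\n"
p decoded # => ["hello"]

# U* returns one Unicode code point per character.
ascii_points  = 'hello'.unpack('U*')
accent_points = "\u00e9".unpack('U*') # "é"
p ascii_points  # => [104, 101, 108, 108, 111]
p accent_points # => [233]
```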

## Testing environment

The above code was tested with Ruby 1.9.2.

A French version of this article is available here.