BigBinary Blog

Archives

Jquery-ujs and Jquery Trigger

ajax using jquery

jQuery’s ajax method’s success callback function takes three parameters. Here is the api

success(data, textStatus, jqXHR)

So if you are making ajax call using jQuery the code might look like

$.ajax({
  url: 'ajax/test.html',
  success: function(data, textStatus, jqXHR) {
    console.log(data);
  }
});

ajax using jquery-ujs

If you are using Rails and jquery-ujs then you might have code like this

<a href="/users/1" data-remote="true" data-type="json">Show</a>

$('a').bind('ajax:success', function(data, status, xhr) {
  alert(data.name);
});

Above code will not work. In order to make it work the very first element passed to the callback must be an event object. Here is the code that will work.

$('a').bind('ajax:success', function(event, data, status, xhr) {
  alert(data.name);
});

Remember that the jQuery API says the first parameter should be “data”. So why do we need to pass an event object to make it work?

Why event object is needed

Here is snippet from jquery-ujs code

success: function(data, status, xhr) {
  element.trigger('ajax:success', [data, status, xhr]);
}

The trigger method always passes the event object as the first parameter to the event handler, and the extra values given in the array follow it. This is why, when you are using jquery-ujs, the first parameter of the callback function has to be an event object.

XSS and Rails

XSS stands for Cross-site scripting. However it has nothing to do with cross-site. It has everything to do with your site.

XSS is consistently a top web application security risk as per The Open Web Application Security Project (OWASP).

An XSS vulnerability allows a hacker to execute JavaScript code on your site.

Your site has a form. I enter <script>alert(document.cookie)</script> and I hit submit. If I see an alert then it means I can execute JavaScript code on your site and your site has an XSS vulnerability.

If a hacker can execute JavaScript code then they can see your cookie.

If you are logged into your application then your application sets a cookie. That is how your application knows that you are logged in.

If a hacker can see your cookie then the hacker can log in.

By the way having SSL does not protect the site from XSS vulnerability.

Prevention

An easy way to prevent XSS is to not allow users to execute JavaScript code. This is the reason why, when you go to post a comment, many sites show a message similar to this one.

[Image: limited HTML example]

Some sites allow some HTML code and others do not allow any HTML code at all.

It is possible for the hacker to insert JavaScript code in your system. For example, if the database stores <script>alert(document.cookie)</script> then JavaScript code has come inside the application. However, as long as you deny the opportunity for that code to be executed in the browser you are safe.

You can also take precaution and sanitize all user input so that JavaScript code does not come into the system at all.

The right approach depends on your need and the application framework you are working with.

What tools Rails provides

In the earlier version of Rails we were encouraged to use <%= h post.comment %> in views. Here h is a short name for html_escape. In Rails 3.x the content is automatically escaped. It means if the hacker enters <script>alert(document.cookie)</script> then after html_escape has performed its operation the browser sees "&lt;script&gt;alert(document.cookie)&lt;/script&gt;". In this way the user does not get to see the alert message which is a good thing. It means users cannot execute JavaScript code on your site and your site is XSS safe.
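
The escaping step can be tried outside Rails. The following is a minimal sketch using ERB::Util.html_escape from Ruby's standard library; the comment string is a made-up input, not taken from a real application.

```ruby
require "erb"

# A comment as it might arrive from a form field (hypothetical input).
comment = "<script>alert(document.cookie)</script>"

# html_escape turns the characters that are special in HTML into entities,
# so the browser shows the text instead of running it.
escaped = ERB::Util.html_escape(comment)

puts escaped
# prints: &lt;script&gt;alert(document.cookie)&lt;/script&gt;
```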

If you do want to format the text a little bit then you can use simple_format. If a user enters a bunch of text in a text area then simple_format can help make the text look pretty without compromising security. It will strip away <script> and other security-sensitive tags. simple_format internally uses the sanitize method. Check out that method to see what options you can pass. Think before you act because you might be opening a security vulnerability.

Also be careful with the raw method. It outputs the string without escaping it.

In case of Json you need to handle escaping yourself

Note that when user entered <script>alert(document.cookie)</script> in the textarea then database stored the value as <script>alert(document.cookie)</script> . No escaping is done before storing the value in the database. It is the ERB that does the escaping and ensures that site is protected.

All is well, and after a few months the boss comes and asks to make that page ajaxy. Now data is sent to the browser in JSON format.

Now the controller looks like format.json { render json: @user }.

This will produce a JSON structure like this "{\"about\":\"<script>alert(document.cookie)</script>\"}".

On the client side you have code to display the content: $('body').append(data.about). When the about content is added to the DOM the script will be executed, and now your site is vulnerable to XSS.

You would think that using json_escape should solve the problem. However json_escape produces invalid JSON. Yes that is right. The output of json_escape is invalid JSON. There is an open pull request to take care of that issue.

The point is that if you are passing JSON data to be displayed in the browser then, by default, you do not have the escape protection that ERB provides.
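
One way to get that protection back is to HTML-escape the values yourself before they are serialized. The sketch below is only an illustration under assumptions: the user hash stands in for whatever the controller would render, and it uses only Ruby's standard library rather than any Rails helper.

```ruby
require "erb"
require "json"

# Hypothetical record as it would come out of the database (unescaped).
user = { "about" => "<script>alert(document.cookie)</script>" }

# Escape every string value before building the JSON response.
safe_user = user.transform_values do |value|
  value.is_a?(String) ? ERB::Util.html_escape(value) : value
end

json = JSON.generate(safe_user)
puts json
# prints: {"about":"&lt;script&gt;alert(document.cookie)&lt;/script&gt;"}
```

With this approach the client receives entities instead of a live script tag. The alternative is to leave the JSON alone and insert the value as text on the client, for example with jQuery's text() instead of append().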


CSRF and Rails

CSRF stands for Cross-site request forgery.

Unlike XSS, CSRF does not try to steal your information to log into the system. CSRF assumes that you are already logged in at your site, and when you visit, say, the comments section of some other site, an attack is made on your site without you knowing it.

Here is how it might work

  • User logs in at www.mysite.com .
  • User visits www.gardening.com site since he is interested in gardening.
  • He is browsing the comments posted on the gardening.com forum and one of the comments posted is <img src="http://www.mysite.com/grant_access?user_id=1&project_id=123" />
  • If the user is admin of the project “123” then unknowingly he might grant access to user_id 1 .

I know. You are thinking that loading an image will make a GET request and granting access is hidden behind a POST request. So you are safe. Well, the hacker can easily change the code to make a POST request. The code might look like this

<script>
 document.write('<form name=hack method=post action="http://mysite.com/grant_access?user_id=1&project_id=123"></form>')
</script>
<img src='' onLoad="document.hack.submit()" />

Now when the image is loaded then a POST request is sent to the server and the application might grant access to this new user. Not good.

Prevention

In order to prevent such things from happening Rails uses authenticity_token.

If you look at source code of any form generated through Rails scaffolding you will see that form markup contains following code

<input name="authenticity_token" type="hidden" value="LhT7dqqRByvOhJJ56BsPb7jJ2p24hxNu6ZuJA+8l+YA=" />.

The exact value of the authenticity_token will be different. When form is submitted then Rails checks the authenticity_token and only when it is verified the request is sent for further processing.

In a brand new rails application the application_controller.rb has only one line.

class ApplicationController < ActionController::Base
  protect_from_forgery
end

That line, protect_from_forgery, checks the authenticity of the incoming request.

Here is code that is responsible for generating csrf_token.

# Sets the token value for the current session.
def form_authenticity_token
  session[:_csrf_token] ||= SecureRandom.base64(32)
end

Since this “csrf_token” is a random value there is no way for hacker to know what the “csrf_token” is for my session. And he will not be able to pass the correct “authenticity_token”.
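
The idea can be sketched outside Rails. In the sketch below a plain Hash stands in for the session and verified_request? is a hypothetical helper, not the Rails implementation; it only illustrates why a forged request without the token is rejected.

```ruby
require "securerandom"

# A plain Hash stands in for the Rails session (an assumption for this sketch).
session = {}

# Same idea as form_authenticity_token: create the token once per session.
def authenticity_token(session)
  session[:_csrf_token] ||= SecureRandom.base64(32)
end

# Hypothetical check: accept a request only if it echoes the session's token.
def verified_request?(session, submitted_token)
  !submitted_token.nil? && submitted_token == authenticity_token(session)
end

token = authenticity_token(session)

puts verified_request?(session, token)    # prints: true  (form rendered by our own site)
puts verified_request?(session, "guess")  # prints: false (forged form on another site)
puts verified_request?(session, nil)      # prints: false (no token submitted at all)
```

A plain == comparison keeps the sketch short; production code typically compares tokens in constant time.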

Note that if the site is vulnerable to XSS then the hacker can submit requests as if he is the logged-in user, and in that case the CSRF attack will go through.


Tsort in Ruby

You have been assigned the task of figuring out in what order following tasks should be executed given their dependencies on other tasks.

Task11 takes input from task5 and task7.
Task10 takes input from task11 and task3.
Task9 takes input from task8 and task11.
Task8 takes input from task3 and task7.
Task2 takes input from task11.

If you look at these tasks and draw a graph then it might look like this.

[Figure: Directed acyclic graph]

The graph shown above is a “Directed acyclic graph” . In Directed acyclic graphs if you start following the arrow then you should never be able to get to the node from where you started.

Directed acyclic graphs are great at describing problems where a task is dependent on another set of tasks.

We started off with a set of tasks that are dependent on another set of tasks. To get the solution we need to sort the tasks in such a way that first task is not dependent on any task and the next task is only dependent on task previously done. So basically we need to sort the directed acyclic graph such that the prerequisites are done before getting to the next task.

Sorting of directed acyclic graph in the manner described above is called topological sorting .

TSort

Ruby provides TSort which allows us to implement “topological sorting”.

Let's write code to find a solution to the original problem.

require "tsort"

class Project
  include TSort

  def initialize
    @requirements = Hash.new{|h,k| h[k] = []}
  end

  def add_requirement(name, *requirement_dependencies)
    @requirements[name] = requirement_dependencies
  end

  def tsort_each_node(&block)
    @requirements.each_key(&block)
  end

  def tsort_each_child(name, &block)
    @requirements[name].each(&block) if @requirements.has_key?(name)
  end

end

p = Project.new
p.add_requirement(:r11, :r5, :r2)
p.add_requirement(:r10, :r11, :r3)
p.add_requirement(:r9, :r8, :r11)
p.add_requirement(:r8, :r3, :r7)

puts p.tsort

If I execute above code in ruby 1.9.2 I get following result.

r5
r2
r11
r3
r10
r7
r8
r9

So that is the order in which tasks should be executed .
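
Note that the listing above registers :r11 with :r5 and :r2 and leaves out Task2, so its output reflects that graph rather than the task list at the top of the post. The following is a sketch that encodes the list exactly as stated, using the module-level TSort.tsort with two lambdas; the symbol names (:t11 and so on) are my own shorthand for the tasks.

```ruby
require "tsort"

# Dependencies as listed at the top of the post: task => the tasks it takes input from.
dependencies = {
  t11: [:t5, :t7],
  t10: [:t11, :t3],
  t9:  [:t8, :t11],
  t8:  [:t3, :t7],
  t2:  [:t11]
}

each_node  = ->(&block) { dependencies.each_key(&block) }
# Tasks that never appear as a key (t3, t5, t7) simply have no prerequisites.
each_child = ->(task, &block) { dependencies.fetch(task, []).each(&block) }

order = TSort.tsort(each_node, each_child)
puts order.inspect

# A cycle makes a topological order impossible; TSort reports it with TSort::Cyclic.
cyclic = { a: [:b], b: [:a] }
begin
  TSort.tsort(->(&b) { cyclic.each_key(&b) }, ->(n, &b) { cyclic[n].each(&b) })
rescue TSort::Cyclic => error
  puts "cycle detected: #{error.message}"
end
```

Every task in the printed order appears after all of the tasks it takes input from.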

Where it is used

When Rails boots it invokes a lot of initializers. Rails uses tsort to get the order in which initializers should be invoked. Here is the list of unsorted initializers. After sorting the initializers list is this .

Here is the code from Rails.

alias :tsort_each_node :each
def tsort_each_child(initializer, &block)
  select { |i| i.before == initializer.name || i.name == initializer.after }.each(&block)
end

............
............

initializers.tsort.each do |initializer|
  initializer.run(*args) if initializer.belongs_to?(group)
end

Bundler uses tsort to find the order in which gems should be installed.

Tsort can also be used to statically analyze programming code by looking at method dependency graph.

Image source: http://en.wikipedia.org/wiki/Directed_acyclic_graph

Alias vs Alias_method

It comes up very often: should I use alias or alias_method? Let's take a look at them in a bit of detail.

Usage of alias

class User

  def full_name
    puts "Johnnie Walker"
  end

  alias name full_name
end

User.new.name #=>Johnnie Walker

Usage of alias_method

class User

  def full_name
    puts "Johnnie Walker"
  end

  alias_method :name, :full_name
end

User.new.name #=>Johnnie Walker

First difference you will notice is that in case of alias_method we need to use a comma between the “new method name” and “old method name”.

alias_method takes both symbols and strings as input. Following code would also work.

alias_method 'name', 'full_name'

That was easy. Now let's take a look at how scoping impacts usage of alias and alias_method.

Scoping with alias

class User

  def full_name
    puts "Johnnie Walker"
  end

  def self.add_rename
    alias_method :name, :full_name
  end
end

class Developer < User
  def full_name
    puts "Geeky geek"
  end
  add_rename
end

Developer.new.name #=> 'Geeky geek'

In the above case method “name” picks the method “full_name” defined in the “Developer” class. Now let's try with alias.

class User

  def full_name
    puts "Johnnie Walker"
  end

  def self.add_rename
    alias :name :full_name
  end
end

class Developer < User
  def full_name
    puts "Geeky geek"
  end
  add_rename
end

Developer.new.name #=> 'Johnnie Walker'

With the usage of alias the method “name” is not able to pick the method “full_name” defined in Developer.

This is because alias is a keyword and it is lexically scoped. It means it treats self as the value of self at the time the source code was read. In contrast alias_method treats self as the value determined at run time.

Overall my recommendation would be to use alias_method. Since alias_method is a method defined in class Module it can be overridden later and it offers more flexibility. Also, because of the lexical scoping of alias, it can do some weird things unless all the developers know what is going on.
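
To illustrate the point about overriding, here is a small sketch. TrackedUser and recorded_aliases are hypothetical names made up for this example; the class overrides alias_method to remember each alias and then hands over to the original implementation with super.

```ruby
class TrackedUser
  # Hypothetical hook: remember every alias created on this class,
  # then let Module#alias_method do the real work.
  def self.alias_method(new_name, old_name)
    (@recorded_aliases ||= []) << new_name.to_sym
    super
  end

  def self.recorded_aliases
    @recorded_aliases || []
  end

  def full_name
    "Johnnie Walker"
  end

  alias_method :name, :full_name
end

puts TrackedUser.new.name                  # prints: Johnnie Walker
puts TrackedUser.recorded_aliases.inspect  # prints: [:name]
```

The alias keyword does not go through this hook, which is one reason it offers less flexibility.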

Understanding Bind and bindAll in Backbone.js

Backbone.js users use the bind and bindAll methods provided by underscore.js a lot. In this blog I am going to discuss why these methods are needed and how it all works.

It all starts with apply

Function bindAll internally uses bind . And bind internally uses apply. So it is important to understand what apply does.

var func = function beautiful(){
  alert(this + ' is beautiful');
};
func();

If I execute above code then I get [object Window] is beautiful. I am getting that message because when the function is invoked this is window, the default global object.

In order to change the value of this we can make use of method apply as given below.

var func = function beautiful(){
  alert(this + ' is beautiful');
};
func.apply('Internet');

In the above case the alert message will be Internet is beautiful . Similarly following code will produce Beach is beautiful .

var func = function beautiful(){
  alert(this + ' is beautiful');
};
func.apply('Beach'); //Beach is beautiful

In short, apply lets us control the value of this when the function is invoked.

Why bind is needed

In order to understand why bind method is needed first let’s look at following example.

function Developer(skill) {
  this.skill = skill;
  this.says = function(){
    alert(this.skill + ' rocks!');
  }
}
var john = new Developer('Ruby');
john.says(); //Ruby rocks!

Above example is pretty straight forward. john is an instance of Developer and when says function is invoked then we get the right alert message.

Notice that when we invoked says we invoked it like this: john.says(). If we just want to get hold of the function that says refers to then we need to do john.says. So the above code could be broken down to the following code.

function Developer(skill) {
  this.skill = skill;
  this.says = function(){
    alert(this.skill + ' rocks!');
  }
}
var john = new Developer('Ruby');
var func = john.says;
func();// undefined rocks!

Above code is similar to the code above it. All we have done is to store the function in a variable called func. If we invoke this function then we should get the alert message we expected. However if we run this code then the alert message will be undefined rocks!.

We are getting undefined rocks! because in this case func is being invoked in the global context. this is pointing to global object called window when the function is executed. And window does not have any attribute called skill . Hence the output of this.skill is undefined.

Earlier we saw that using apply we can control the value of this. So let's try to use apply to fix the problem.

function Developer(skill) {
  this.skill = skill;
  this.says = function(){
    alert(this.skill + ' rocks!');
  }
}
var john = new Developer('Ruby');
var func = john.says;
func.apply(john);

Above code fixes our problem. This time the alert message we got was Ruby rocks!. However there is an issue and it is a big one.

In JavaScript world functions are first class citizens. The reason why we create function is so that we can easily pass it around. In the above case we created a function called func. However along with the function func now we need to keep passing john around. That is not a good thing. Secondly the responsibility of rightly invoking this function has been shifted from the function creator to the function consumer. That’s not a good API.

We should try to create functions which can easily be called by the consumers of the function. This is where bind comes in

How bind solves the problem

First lets see how using bind solves the problem.

function Developer(skill) {
  this.skill = skill;
  this.says = function(){
    alert(this.skill + ' rocks!');
  }
}
var john = new Developer('Ruby');
var func = _.bind(john.says, john);
func();// Ruby rocks!

To solve the problem with this we need a function that is already bound to john so that we do not need to keep carrying john around. That’s precisely what bind does. It returns a new function and this new function has this bound to the value that we provide.

Here is a snippet of code from bind method

return function() {
  return func.apply(obj, args.concat(slice.call(arguments)));
};

As you can see bind internally uses apply to set this to the second parameter we passed while invoking bind.

Notice that bind does not change existing function. It returns a new function and that new function should be used.

How bindAll solves the problem

Instead of bind we can also use bindAll . Here is solution with bindAll.

function Developer(skill) {
  this.skill = skill;
  this.says = function(){
    alert(this.skill + ' rocks!');
  }
}
var john = new Developer('Ruby');
_.bindAll(john, 'says');
var func = john.says;
func(); //Ruby rocks!

Above code is similar to bind solution but there are some big differences.

The first big difference is that we do not have to worry about the returned value of bindAll . In case of bind we must use the returned function. In bindAll we do not have to worry about the returned value but it comes with a price. bindAll actually mutates the function. What does that mean.

See john object has an attribute called says which returns a function . bindAll goes and changes the attribute says so that when it returns a function, that function is already bound to john.

Here is a snippet of code from bindAll method.

function(f) { obj[f] = _.bind(obj[f], obj); }

Notice that bindAll internally calls bind and it overrides the existing attribute with the function returned by bind.

The other difference between bind and bindAll is that in bind the first parameter is a function (john.says) and the second parameter is the value of this (john). In bindAll the first parameter is the value of this (john) and the second parameter is not a function but the attribute name.

Things to watch out for

While developing a Backbone.js application someone had code like this

window.ProductView = Backbone.View.extend({
  initialize: function() {
    _.bind(this.render, this);
    this.model.bind('change', this.render);
  }
});

Above code will not work because the returned value of bind is not being used. The correct usage will be

window.ProductView = Backbone.View.extend({
  initialize: function() {
    this.model.bind('change', _.bind(this.render, this));
  }
});

Or you can use bindAll as given below.

window.ProductView = Backbone.View.extend({
  initialize: function() {
    _.bindAll(this, 'render');
    this.model.bind('change', this.render);
  }
});

Ruby Pack Unpack

C programming language allows developers to directly access the memory where variables are stored. Ruby does not allow that. There are times while working in Ruby when you need to access the underlying bits and bytes. Ruby provides two methods pack and unpack for that.

Here is an example.

$ 'A'.unpack('b*')
=> ["10000010"]

In the above case ‘A’ is a string which is being stored and using unpack I am trying to read the bit value. The ASCII table says that the ASCII value of ‘A’ is 65. The binary representation of 65 is 01000001, which unpack('b*') reports in reverse bit order as 10000010.

Here is another example.

$ 'A'.unpack('B*')
=> ["01000001"]

Notice the difference in result from the first case. What’s the difference between b and B. In order to understand the difference first lets discuss MSB and LSB.

Most significant bit vs Least significant bit

All bits are not created equal. C has ascii value of 67. The binary value of 67 is 1000011.

First let’s discuss MSB (most significant bit) style . If you are following MSB style then going from left to right (and you always go from left to right) then the most significant bit will come first. Because the most significant bit comes first we can pad an additional zero to the left to make the number of bits eight. After adding an additional zero to the left the binary value looks like 01000011.

If we want to convert this value to the LSB (Least Significant Bit) style then we need to store the least significant bit first going from left to right. Given below is how the bits will be moved if we are converting from MSB to LSB. Note that in the below case position 1 refers to the leftmost bit.

move value 1 from position 8 of MSB to position 1 of LSB
move value 1 from position 7 of MSB to position 2 of LSB
move value 0 from position 6 of MSB to position 3 of LSB
and so on and so forth

After the exercise is over the value will look like 11000010.

We did this exercise manually to understand the difference between most significant bit and least significant bit. However the unpack method can directly give the result in both MSB and LSB. The unpack method can take both b and B as the input. As per the Ruby documentation here is the difference.

B | bit string (MSB first)
b | bit string (LSB first)

Now let’s take a look at two examples.

$ 'C'.unpack('b*')
=> ["11000010"]

$ 'C'.unpack('B*')
=> ["01000011"]

Both b and B are looking at the same underlying data. It’s just that they represent the data differently.
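
A quick way to convince yourself of this is to reverse the bits of each byte and compare. The snippet below is a small sketch of that check; the string "hi" is just an arbitrary two-byte example.

```ruby
lsb_first = "C".unpack("b*").first  # "11000010"
msb_first = "C".unpack("B*").first  # "01000011"

# For a single byte, one representation is the other one reversed.
puts lsb_first.reverse == msb_first  # prints: true

# For longer strings the reversal happens within each byte, not across the whole string.
bytes_lsb = "hi".unpack("b*").first.scan(/.{8}/)
bytes_msb = "hi".unpack("B*").first.scan(/.{8}/)
puts bytes_lsb.map(&:reverse) == bytes_msb  # prints: true

# Either representation can be packed back into the original string.
puts [msb_first].pack("B*")  # prints: C
puts [lsb_first].pack("b*")  # prints: C
```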

Different ways of getting the same data

Let’s say that I want binary value for string hello . Based on the discussion in the last section that should be easy now.

$ "hello".unpack('B*')
=> ["0110100001100101011011000110110001101111"]

The same information can also be derived as

$ "hello".unpack('C*').map {|e| e.to_s 2}
=> ["1101000", "1100101", "1101100", "1101100", "1101111"]

Let’s break down the previous statement in small steps.

$ "hello".unpack('C*')
=> [104, 101, 108, 108, 111]

Directive C* gives the 8-bit unsigned integer value of the character. Note that ascii value of h is 104, ascii value of e is 101 and so on.

Using the technique discussed above I can find hex value of the string.

$ "hello".unpack('C*').map {|e| e.to_s 16}
=> ["68", "65", "6c", "6c", "6f"]

Hex value can also be achieved directly.

$ "hello".unpack('H*')
=> ["68656c6c6f"]

High nibble first vs Low nibble first

Notice the difference in the below two cases.

$ "hello".unpack('H*')
=> ["68656c6c6f"]

$ "hello".unpack('h*')
=> ["8656c6c6f6"]

As per ruby documentation for unpack

H | hex string (high nibble first)
h | hex string (low nibble first)

A byte consists of 8 bits. A nibble consists of 4 bits. So a byte has two nibbles. The ascii value of ‘h’ is 104. Hex value of 104 is 68. This 68 is stored in two nibbles. First nibble, meaning 4 bits, contain the value 6 and the second nibble contains the value 8. In general we deal with high nibble first and going from left to right we pick the value 6 and then 8.

However if you are dealing with low nibble first then low nibble value 8 will take the first slot and then 6 will come. Hence the result in “low nibble first” mode will be 86.

This pattern is repeated for each byte. And because of that a hex value of 68 65 6c 6c 6f looks like 86 56 c6 c6 f6 in low nibble first format.
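
The per-byte swap can be checked with a few lines of Ruby. This is only a sketch of the relationship described above: it splits the high-nibble-first string into bytes (two hex digits each) and swaps the digits.

```ruby
high_first = "hello".unpack("H*").first  # "68656c6c6f"
low_first  = "hello".unpack("h*").first  # "8656c6c6f6"

# Swap the two nibbles inside every byte of the high-nibble-first string.
swapped = high_first.scan(/../).map(&:reverse).join

puts swapped               # prints: 8656c6c6f6
puts swapped == low_first  # prints: true
```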

Mix and match directives

In all the previous examples I used *. And a * means to keep going as long as it has to keep going. Let's see a few examples.

A single C will get a single byte.

$ "hello".unpack('C')
=> [104]

You can add more Cs if you like.

$ "hello".unpack('CC')
=> [104, 101]

$ "hello".unpack('CCC')
=> [104, 101, 108]

$ "hello".unpack('CCCCC')
=> [104, 101, 108, 108, 111]

Rather than repeating all those directives, I can put a number to denote how many times I want the previous directive to be repeated.

$ "hello".unpack('C5')
=> [104, 101, 108, 108, 111]

I can use * to capture all the remaining bytes.

$ "hello".unpack('C*')
=> [104, 101, 108, 108, 111]

Below is an example where MSB and LSB are being mixed.

$ "aa".unpack('b8B8')
=> ["10000110", "01100001"]

pack is reverse of unpack

Method pack does the reverse of unpack: it takes an array of values and writes them out as a binary string. Let’s discuss a few examples.

$ [0b1000001].pack('C')
=> "A"

In the above case the binary value 1000001 (decimal 65) is being interpreted as an 8-bit unsigned integer and the result is ‘A’.

$ ['A'].pack('H')
=> "\xA0"

In the above case the input ‘A’ is not ASCII ‘A’ but the hex ‘A’. Why is it hex ‘A’. It is hex ‘A’ because the directive ‘H’ is telling pack to treat input value as hex value. Since ‘H’ is high nibble first and since the input has only one nibble then that means the second nibble is zero. So the input changes from [‘A’] to [‘A0’] .

Since hex value A0 does not translate into anything in the ASCII table the final output is left as is and hence the result is \xA0. The leading \x indicates that the value is a hex value.

Notice that in hex notation A is the same as a. So in the above example I can replace A with a and the result should not change. Let’s try that.

$ ['a'].pack('H')
=> "\xA0"

Let’s discuss another example.

$ ['a'].pack('h')
=> "\n"

In the above example notice the change. I changed the directive from H to h. Since h means low nibble first and since the input has only one nibble, that nibble is treated as the low nibble and the high nibble becomes zero. Written high nibble first, the value changes from [‘a’] to [‘0a’], and the output will be \x0A. If you look at the ASCII table then hex value A is ASCII value 10 which is NL line feed, new line. Hence we see \n as the output because it represents “new line feed”.

Usage of unpack in Rails source code

I did a quick grep in Rails source code and found following usage of unpack.

email_address_obfuscated.unpack('C*')
'mailto:'.unpack('C*')
email_address.unpack('C*')
char.unpack('H2')
column.class.string_to_binary(value).unpack("H*")
data.unpack("m")
s.unpack("U*")

We have already seen the usage of directives C and H for unpack. With unpack, the directive m decodes a base64 encoded value and the directive U gives the Unicode code point of each UTF-8 character. Here is an example.

$ "Hello".unpack('U*')
=> [72, 101, 108, 108, 111]
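
The m and U directives work in both directions. The following sketch shows a round trip for each; the inputs are arbitrary examples and are not taken from the Rails source.

```ruby
# m: pack encodes to base64, unpack decodes it again.
encoded = ["hello"].pack("m")
decoded = encoded.unpack("m").first

puts encoded.inspect  # prints: "aGVsbG8=\n"
puts decoded          # prints: hello

# U: unpack gives Unicode code points, pack builds a UTF-8 string from them.
code_points = "\u00e9".unpack("U*")  # "\u00e9" is the letter e with an acute accent
rebuilt     = code_points.pack("U*")

puts code_points.inspect    # prints: [233]
puts rebuilt == "\u00e9"    # prints: true
```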

Tested with

Above code was tested with ruby 1.9.2 .

Infinite Hash and Default_proc

If you already know how this infinite hash works then you are all set. If not, read along.

Default value of Hash

If I want a hash to have a default value then that’s easy.

h = Hash.new(0)
puts h['usa'] #=> 0

Above code will give me a fixed value if key is not found. If I want dynamic value then I can use block form.

h = Hash.new{|h,k| h[k] = k.upcase}
puts h['usa'] #=> USA
puts h['india'] #=> INDIA

Default value is hash

If I want the default value to be a hash then it seems easy but it falls apart soon.

h = Hash.new{|h,k| h[k] = {} }
puts h['usa'].inspect #=> {}
puts h['usa']['ny'].inspect #=> nil
puts h['usa']['ny']['nyc'].inspect #=> NoMethodError: undefined method `[]' for nil:NilClass

In the above if a key is missing for h then it returns a hash. However that returned hash is an ordinary hash which does not have a capability of returning another hash if a key is missing.

This is where default_proc comes into picture. hash.default_proc returns the block which was passed to Hash.new .

h = Hash.new{|h,k| h[k] = Hash.new(&h.default_proc)}
puts h['usa']['ny']['nyc'].inspect #=> {}
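
One detail worth spelling out is whether the block stores the nested hash it creates. The sketch below compares two variants, with variable names chosen only for this example: one assigns the new hash to the missing key, the other merely returns it.

```ruby
# Variant 1: the block stores the nested hash under the missing key.
storing = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
storing["usa"]["ny"]["nyc"] = "visited"

puts storing["usa"]["ny"]["nyc"]  # prints: visited
puts storing.keys.inspect         # prints: ["usa"]

# Variant 2: the block only returns a nested hash without storing it,
# so every lookup builds a fresh hash and assignments are lost.
returning = Hash.new { |hash, _key| Hash.new(&hash.default_proc) }
returning["usa"]["ny"]["nyc"] = "visited"

puts returning["usa"]["ny"]["nyc"].inspect  # prints: {}
puts returning.empty?                       # prints: true
```

If the nested values need to survive, the storing variant is the one to use.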

Mime Type Resolution in Rails

This is a long blog. If you want a summary then José Valim has provided a summary in less than 140 characters.

It is common to see following code in Rails

respond_to do |format|
  format.html
  format.xml  { render :xml => @users }
end

If you want output in xml format then request with .xml extension at the end like this localhost:3000/users.xml and you will get the output in xml format.

What we saw is only one part of the puzzle. The other side of the equation is HTTP header field Accept defined in HTTP RFC.

HTTP Header Field Accept

When browser sends a request then it also sends the information about what kind of resources the browser is capable of handling. Here are some of the examples of the Accept header a browser can send.

text/plain

image/gif, image/x-xbitmap, image/jpeg, application/vnd.ms-excel, application/msword,
application/vnd.ms-powerpoint, */*

text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

application/vnd.wap.wmlscriptc, text/vnd.wap.wml, application/vnd.wap.xhtml+xml,
application/xhtml+xml, text/html, multipart/mixed, */*

If you are reading this blog on a browser then you can find out what kind of Accept header your browser is sending by visiting this link. Here is list of Accept header sent by different browsers on my machine.

Chrome: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Firefox: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/json
Safari: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
IE: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif,
image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, */*

Let’s take a look at the Accept header sent by Safari.

Safari: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

Safari is saying that I can handle documents which are xml (application/xml), html (text/html) or plain text (text/plain) documents. And I can handle images such as image/png. If all else fails then send me whatever you can and I will try to render that document to the best of my ability.

Notice that there are also q values. That signifies the priority order. This is what HTTP spec has to say about q.

Each media-range MAY be followed by one or more accept-params, beginning with the “q” parameter for indicating a relative quality factor. The first “q” parameter (if any) separates the media-range parameter(s) from the accept-params. Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.

The spec is saying that each document type has a default q value of 1. When a q value is specified then take that value into account. For all documents that have the same q value give higher priority to the one that came first in the list. Based on that this should be the order in which documents should be sent to the Safari browser.

application/xml (q is 1)
application/xhtml+xml (q is 1)
image/png (q is 1)
text/html (q is 0.9)
text/plain (q is 0.8)
*/* (q is 0.5)
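
The ordering rule can be sketched in a few lines of Ruby. media_types_by_preference is a hypothetical helper written for this post, not the parser Rails uses; it sorts by q value (highest first) and keeps the original position as the tie-breaker.

```ruby
# Hypothetical helper: order the media types in an Accept header by preference.
def media_types_by_preference(accept_header)
  entries = accept_header.split(",").each_with_index.map do |entry, position|
    media_type, *params = entry.strip.split(/\s*;\s*/)
    q_param = params.find { |param| param.start_with?("q=") }
    quality = q_param ? q_param.sub("q=", "").to_f : 1.0  # q defaults to 1
    [media_type, quality, position]
  end

  # Highest q first; entries with the same q keep the order they were sent in.
  entries.sort_by { |_type, quality, position| [-quality, position] }.map(&:first)
end

safari = "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"
puts media_types_by_preference(safari)
```

For the Safari header this prints the media types in the same order as the list above.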

Notice that Safari is nice enough to put a lower priority for */*. Chrome and Firefox also puts */* at a lower priority which is a good thing. Not so with IE which does not declare any q value for */* .

Look at the order again and you can see that application/xml has higher priority over text/html. What it means is that safari is telling Rails that I would prefer application/xml over text/html. Send me text/html only if you cannot send application/xml.

And let’s say that you have developed a RESTful app which is capable of sending output in both html and xml formats.

Rails, being a good HTTP citizen, should honor HTTP_ACCEPT and send an xml document in this case. Again, all you did was visit a website, and Safari is telling Rails to send an xml document rather than an html document. Clearly the HTTP_ACCEPT value being sent by Safari is broken.

HTTP_ACCEPT is broken

The concept behind the HTTP_ACCEPT attribute is neat. It defines the order and the priority. However, the implementation is broken by all the browser vendors. Given that browsers do not send a proper HTTP_ACCEPT, what can Rails do? One solution is to ignore it completely. If you want xml output then request http://localhost:3000/users.xml . Relying solely on formats makes life easier and less buggy. This is what Rails did for a long time.

Starting with this commit, Rails ignored the HTTP_ACCEPT attribute by default. The same is true for the Twitter API, where the HTTP_ACCEPT attribute is ignored and Twitter relies solely on the format to find out what kind of document should be returned.

Unfortunately this solution has its own set of problems. The web has been around for a long time and there are a lot of applications that expect the response to be an RSS feed if they send application/rss+xml in their HTTP_ACCEPT attribute. It is not nice to take a hard stand and ask all of them to request with the extension .rss .

Parsing HTTP_ACCEPT attribute

Parsing and obeying HTTP_ACCEPT attribute is filled with many edge cases. First let’s look at the code that decides what to parse and how to handle the data.

BROWSER_LIKE_ACCEPTS = /,\s*\*\/\*|\*\/\*\s*,/

def formats
  accept = @env['HTTP_ACCEPT']

  @env["action_dispatch.request.formats"] ||=
    if parameters[:format]
      Array(Mime[parameters[:format]])
    elsif xhr? || (accept && accept !~ BROWSER_LIKE_ACCEPTS)
      accepts
    else
      [Mime::HTML]
    end
end

Notice that if a format is passed, like http://localhost:3000/users.xml or http://localhost:3000/users.js, then Rails does not even parse the HTTP_ACCEPT values. Also note that if the browser sends */* along with other values then Rails bails out entirely and just returns Mime::HTML, unless the request is an ajax request.
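The three branches can be tried out in isolation. Below is a rough sketch that reuses the same regular expression; the function formatSource and the shape of its request argument are hypothetical and only mirror the decision, not the actual parsing.

```javascript
// Same pattern as BROWSER_LIKE_ACCEPTS in the Ruby code above: it matches
// only when */* appears together with at least one other entry.
var BROWSER_LIKE_ACCEPTS = /,\s*\*\/\*|\*\/\*\s*,/;

// Hypothetical stand-in for the decision made by `formats`. `request` is a
// plain object with optional `format`, `xhr` and `accept` properties.
function formatSource(request) {
  if (request.format) {
    return "format parameter"; // explicit format wins, Accept is not parsed
  }
  if (request.xhr ||
      (request.accept && !BROWSER_LIKE_ACCEPTS.test(request.accept))) {
    return "Accept header"; // header is trusted and parsed
  }
  return "HTML fallback"; // browser-like or missing header
}

console.log(formatSource({ format: "xml", accept: "text/html" }));
// format parameter
console.log(formatSource({ accept: "application/rss+xml" }));
// Accept header
console.log(formatSource({ accept: "text/html;q=0.9,*/*;q=0.5" }));
// HTML fallback
console.log(formatSource({ xhr: true, accept: "text/html, */*" }));
// Accept header
```

Note that a header consisting of */* alone does not match the pattern, so it is still parsed.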

Next I am going to discuss some of the cases in greater detail which should bring more clarity around this issue.

Case 1: HTTP_ACCEPT is */*

I have the following code.

respond_to do |format|
  format.html { render :text => 'this is html' }
  format.js  { render :text => 'this is js' }
end

I am assuming that the HTTP_ACCEPT value is */* . In this case the browser is saying: send me whatever you've got. Since the browser is not dictating the order in which documents should be sent, Rails will look at the order in which Mime types are declared in the respond_to block and will pick the first one. Here is the corresponding code.

def negotiate_mime(order)
  formats.each do |priority|
    if priority == Mime::ALL
      return order.first
    elsif order.include?(priority)
      return priority
    end
  end

  order.include?(Mime::ALL) ? formats.first : nil
end

What it’s saying is that if Mime::ALL is sent then pick the first one declared in the respond_to block. So be careful with order in which formats are declared inside the respond_to block.
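To see how the declaration order changes the outcome, here is a small sketch of the same logic with plain strings standing in for Mime types. The function negotiateMime is a hypothetical port written for illustration, not part of Rails.

```javascript
var ALL = "*/*";

// Hypothetical port of negotiate_mime. `accepted` is what the client asked
// for, already sorted by preference; `order` is what respond_to declares.
function negotiateMime(accepted, order) {
  for (var i = 0; i < accepted.length; i++) {
    var priority = accepted[i];
    if (priority === ALL) {
      return order[0]; // client takes anything: first declared format wins
    }
    if (order.indexOf(priority) !== -1) {
      return priority; // client preference that the action supports
    }
  }
  return order.indexOf(ALL) !== -1 ? accepted[0] : null;
}

console.log(negotiateMime(["*/*"], ["text/html", "text/javascript"]));
// text/html
console.log(negotiateMime(["*/*"], ["text/javascript", "text/html"]));
// text/javascript
console.log(negotiateMime(["application/pdf"], ["text/html", "text/javascript"]));
// null
```

Swapping the two entries in the second argument is the equivalent of swapping format.html and format.js in the respond_to block.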

The order in which formats are declared can be a real issue. Check out these two cases where the authors ran into issues because of the order in which formats are declared.

So far so good. However, what if there is no respond_to block? If I don’t have a respond_to block and I have index.html.erb, index.js.erb and index.xml.builder files in my view directory, which one will be picked up? In this case Rails will go over all the registered formats in the order in which they are declared and will try to find a match. So in this case it matters in what order Mime types are registered. Here is the code that registers Mime types.

Mime::Type.register "text/html", :html, %w( application/xhtml+xml ), %w( xhtml )
Mime::Type.register "text/plain", :text, [], %w(txt)
Mime::Type.register "text/javascript", :js, %w( application/javascript application/x-javascript )
Mime::Type.register "text/css", :css
Mime::Type.register "text/calendar", :ics
Mime::Type.register "text/csv", :csv
Mime::Type.register "application/xml", :xml, %w( text/xml application/x-xml )
Mime::Type.register "application/rss+xml", :rss
Mime::Type.register "application/atom+xml", :atom
Mime::Type.register "application/x-yaml", :yaml, %w( text/yaml )

Mime::Type.register "multipart/form-data", :multipart_form
Mime::Type.register "application/x-www-form-urlencoded", :url_encoded_form

# http://www.ietf.org/rfc/rfc4627.txt
# http://www.json.org/JSONRequest.html
Mime::Type.register "application/json", :json, %w( text/x-json application/jsonrequest )

# Create Mime::ALL but do not add it to the SET.
Mime::ALL = Mime::Type.new("*/*", :all, [])

As you can see, among the three formats in question text/html comes first in the list, then text/javascript and then application/xml. So Rails will look for a view file in the following order: index.html.erb , index.js.erb and index.xml.builder .

Case 2: HTTP_ACCEPT with no */*

I am going to assume that in this case the HTTP_ACCEPT sent by the browser looks really simple, like this

text/javascript, text/html, text/plain

I am also assuming that my respond_to block looks like this

respond_to do |format|
  format.html { render :text => 'this is html' }
  format.js  { render :text => 'this is js' }
end

So the browser is saying that it prefers documents in the following order

js
html
plain

The order in which formats are declared is

html (format.html)
js (format.js)

In this case Rails will go through each Mime type that the browser supports, from top to bottom, one by one. If a match is found then the response is sent, otherwise Rails tries to find a match for the next Mime type. First in the list of Mime types supported by the browser is js, and Rails does find that my respond_to block supports .js . Rails executes the format.js block and the response is sent to the browser.

Case 3: Ajax requests

When an AJAX request is made, Safari, Firefox and Chrome send only one item in HTTP_ACCEPT and that is */*. So if you are making an AJAX request then HTTP_ACCEPT for these three browsers will look like

Chrome: */*
Firefox: */*
Safari: */*

and if your respond_to block looks like this

respond_to do |format|
  format.html { render :text => 'this is html' }
  format.js  { render :text => 'this is js' }
end

then the first one will be served based on the formats order. In this case an html response would be sent for an AJAX request. This is not what you want.

This is the reason why, if you are using jQuery and sending AJAX requests, you should add something like this to your application.js file

$(function() {
  $.ajaxSetup({
    'beforeSend': function(xhr) {
      xhr.setRequestHeader("Accept", "text/javascript");
    }
  });
});

If you are using a newer version of rails.js then you don’t need to add the above code since it is already taken care of for you through this commit .

Trying it out

If you want to play with HTTP_ACCEPT header then put the following line in your controller to inspect the HTTP_ACCEPT attribute.

puts request.headers['HTTP_ACCEPT']

I used the following rake task to set a custom HTTP_ACCEPT attribute.

require "net/http"
require "uri"

task :custom_accept do
  uri = URI.parse("http://localhost:3000/users")
  http = Net::HTTP.new(uri.host, uri.port)

  request = Net::HTTP::Get.new(uri.request_uri)
  request["Accept"] = "text/html, application/xml, */*"

  response = http.request(request)
  puts response.body
end

Thanks

I got familiar with intricacies of mime parsing while working on ticket #6022 . A big thanks to José Valim for patiently dealing with me while working on this ticket.

Variable Declaration at the Top Is Not Just Pretty Thing

I was discussing JavaScript code with a friend and he noticed that I had declared all the variables at the top.

He likes to declare variables where they are used, to be sure that the variable being used is declared with var; otherwise that variable will become a global variable. This fear of accidentally creating a global variable makes him want to see the variable declaration next to where it is being used.

Use the right tool

var payment;
payment = soldPrice + shippingCost;

In the above case the user has declared the payment variable in the middle so that he is sure that payment is declared. However, if there is a typo as given below then he has accidentally created a global variable “payment”.

var paymnet; //there is a typo
payment = soldPrice + shippingCost;

Having the variable declaration next to where the variable is being used is not a safe way of guaranteeing that the variable is declared. Use the right tool, and that would be JSLint validation. I use MacVim and I use JavaScript Lint. So every time I save a JavaScript file, validation is done and I get a warning if I am accidentally creating a global variable.

You can configure things so that JSLint validation runs when you check your code into git or when you push to GitHub. Or you can have a custom rake task. Many solutions are available; choose the one that fits you. But do not rely on manual inspection.
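If you want to watch the typo leak a global, here is a small sketch. It builds the function with the Function constructor only so that the body runs as non-strict code no matter what kind of file it is pasted into; the wrapper name computePayment and the numbers are made up for the demonstration.

```javascript
// The body below runs as non-strict code, so assigning to an undeclared
// name silently creates a property on the global object.
var computePayment = new Function(
  "soldPrice",
  "shippingCost",
  "var paymnet; /* there is a typo */ payment = soldPrice + shippingCost;"
);

computePayment(100, 15);

console.log(typeof paymnet);     // "undefined": the misspelled local stayed local
console.log(globalThis.payment); // 115: the assignment created a global
```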

Variable declarations are moved to the top by the browser

Take a look at the following code. One might expect that the first console.log will print “Neeraj”, but the output will be “undefined”. That is because even though you have declared variables next to where they are being used, browsers lift those declarations to the very top.

name = 'Neeraj';
function lab(){
 console.log(name);
 var name = 'John';
 console.log(name);
};
lab();

The browser effectively converts the above code into the one shown below.

name = 'Neeraj';
function lab(){
 var name = undefined;
 console.log(name);
 name = 'John';
 console.log(name);
};
lab();

In order to avoid this kind of mistake it is preferable to declare variables at the top, like this.

name = 'Neeraj';
function lab(){
 var name = 'John';
 console.log(name);
 console.log(name);
};
lab();

Looking at the first set of code a person might think that the first console.log will print “Neeraj”.

Also remember that the scope of a variable declared with var in JavaScript is the function, not the block.
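A quick way to convince yourself of the function-level scope is the sketch below. The function and variable names are made up for the illustration.

```javascript
function scopeDemo() {
  if (true) {
    var insideBlock = "declared inside the if block";
  }
  // var is scoped to the whole function, so the variable is visible here
  // even though it was declared inside the braces of the if statement.
  return insideBlock;
}

function hoistDemo() {
  // The declaration of `later` is hoisted, so this is "undefined" rather
  // than a ReferenceError.
  var before = typeof later;
  var later = 1;
  return before;
}

console.log(scopeDemo()); // declared inside the if block
console.log(hoistDemo()); // undefined
```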

Implications on how functions are declared

There are two ways of declaring a function.

var myfunc = function(){};
function myfunc2(){};

In the first case only the variable declaration myfunc is hoisted. The definition of myfunc is NOT hoisted. In the second case both the variable declaration and the function definition are hoisted. For more information on this refer to my previous blog post on the same topic.
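The difference shows up as soon as you call both functions before the lines that define them. Here is a minimal sketch, with hypothetical names expressed and declared standing in for myfunc and myfunc2.

```javascript
function hoistingDemo() {
  var results = [];

  // Function declaration: name and body are hoisted, so this call works.
  results.push(declared());

  // Function expression: only the variable is hoisted. Its value is still
  // undefined at this point, so calling it throws a TypeError.
  try {
    expressed();
  } catch (e) {
    results.push(e instanceof TypeError);
  }

  var expressed = function () { return "expression"; };
  function declared() { return "declaration"; }

  // After the assignment has run, the function expression is callable.
  results.push(expressed());
  return results;
}

console.log(hoistingDemo()); // ["declaration", true, "expression"]
```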