Solr is an open source search platform from Apache. Among other things, it has a very powerful full-text search capability.
Solr is written in Java and runs as a standalone search server within a servlet container like Tomcat. When you are working on a Ruby on Rails application you do not want to maintain a Tomcat server. This is where websolr comes into the picture. Websolr manages the index, and the Rails application interacts with the index using a gem called sunspot_rails.
Here I am interested in searching products.
Using sunspot gem
The above command creates the config/sunspot.yml file. By default this file looks like the following.
The way sunspot works is that after every single web request it updates Solr with the changes that took place in the request. This is not desirable. To turn that off, set the auto_commit_after_request option to false in the config/sunspot.yml file.
I would also change the log_level for development to DEBUG. The revised config/sunspot.yml file would look like
Taking care of callbacks
In the above case, any time I create, update or destroy a product, Solr commit commands are issued as part of the after_save callback. Since after_save callbacks are part of the ActiveRecord transaction, this slows down the create, update and destroy operations. I would rather have the Solr work happen in the background.
Here is how I handled it
In the above case I used Delayed Job but you can use any background job processing tool.
In the case of Delayed Job, the higher the priority value, the lower the priority. By bumping the priority value to 50, I’m making sure that emails and other background jobs are processed before the Solr work is taken up.
Problem with remove_from_index
In the above case the call to remove_from_index has been deferred to Delayed Job. However the record has already been destroyed. So when Delayed Job takes up the work it first tries to retrieve the record. However the record is missing and the background job fails.
Here is how we solved this problem.
Add another worker named remove_index.rb .
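The idea behind such a worker can be sketched in plain Ruby: instead of handing the record to the background job, hand it only the class name and the id, which are still known after the row is gone. The names below (RemoveIndexJob, FakeIndex) are hypothetical stand-ins and not the actual worker from this project. In a real application the body of perform would talk to Solr rather than to a hash.

```ruby
# Hypothetical stand-in for the Solr index, so the sketch runs on its own.
class FakeIndex
  def initialize
    @docs = {}
  end

  def add(class_name, id)
    @docs[[class_name, id]] = true
  end

  def remove(class_name, id)
    @docs.delete([class_name, id])
  end

  def include?(class_name, id)
    @docs.key?([class_name, id])
  end
end

# The job stores plain values (class name and id), not the record itself,
# so it never has to load a row that has already been destroyed.
RemoveIndexJob = Struct.new(:class_name, :record_id, :index) do
  def perform
    index.remove(class_name, record_id)
  end
end

index = FakeIndex.new
index.add("Product", 42)

# Imagine product 42 has just been deleted; only its class name and id survive.
job = RemoveIndexJob.new("Product", 42, index)
job.perform

puts index.include?("Product", 42) # false
```

Because the job carries no reference to the record, it does not matter that the record no longer exists by the time the job runs.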
Connecting to websolr
From the websolr documentation it was not clear that the sunspot gem first looks for an environment variable called WEBSOLR_URL. If that environment variable has a value then sunspot assumes that the Solr index is at that URL. If no value is found then it assumes that it is dealing with a local Solr instance.
So if you are using websolr then make sure that your application has environment variable WEBSOLR_URL properly configured in staging and in production environment.
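The lookup order described above can be sketched as a small helper. This is only an illustration of the behaviour, not sunspot's own code: the method name solr_url and the local default URL are assumptions made for the example.

```ruby
# Assumed address of a local Solr instance, used only for this sketch.
LOCAL_SOLR_URL = "http://localhost:8982/solr"

# Hypothetical helper: prefer WEBSOLR_URL when it is set and non-empty,
# otherwise fall back to the local instance.
def solr_url(env = ENV)
  url = env["WEBSOLR_URL"]
  url.nil? || url.empty? ? LOCAL_SOLR_URL : url
end

puts solr_url({})
puts solr_url({ "WEBSOLR_URL" => "http://index.example.com/solr/abc" })
```

Running a check like this in a staging or production console is a quick way to confirm which index the application will actually talk to.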
thoughtbot team outlined how they test their factories first. I like this approach. Since we prefer using minitest here is how we implemented it. It is similar to how the thoughtbot blog has described. However I still want to blog about it so that in our other projects we can use similar approach.
First under spec directory create a file called factories_spec.rb . Here is how our file looks.
Next I need to tell rake to always run this test file first.
When the rake command is executed it goes through all the .rake files and loads them. So all we need to do is create a rake file called factory.rake and put this file under lib/tasks .
Here a dependency is being added to test . And if factory test fails then dependency is not met and the main test suite will not run.
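How adding a dependency to an existing task behaves can be tried outside of a Rails project. The sketch below assumes the rake gem is available and uses two stand-in tasks: :test for the main suite and a hypothetical :factory task for the factory spec. It is not the factory.rake file from this project.

```ruby
require "rake"

order = []

# Stand-ins for the real tasks: :test is the main suite,
# :factory is the hypothetical task that runs factories_spec.rb.
Rake::Task.define_task(:test)    { order << :test }
Rake::Task.define_task(:factory) { order << :factory }

# Defining :test again with a prerequisite adds the dependency
# to the existing task instead of replacing it.
Rake::Task.define_task(test: :factory)

Rake::Task[:test].invoke
p order
```

Since rake runs prerequisites before the task itself, a failure raised inside the :factory task would stop the run before :test starts.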
That’s it. Now each unit test does not need to test factory first. All factories are getting tested here.
The problem with the above code is that the class names in the html markup were meant for web design.
By using CSS classes for functional work,
I have made both the design team and the front end development team perpetually terrified of making any change.
Class is meant for CSS
If designer wants to change markup from
There has to be a better way which clearly separates the design elements from the functional elements.
data-behavior usage can be best understood by an example.
The html markup will change from
Above code would produce html looking something like this
More usage of data-behavior
Based on this data-behavior approach I changed some part of nimbleShop to use data-behavior. Here is the commit.
Code snippet for reference
Over the period of time we have used this technique in many projects successfully. However sometimes I need to spend a while to find the right way to add data-behavior.
I’m adding some code snippet so that I can find them here when I need them.
emberjs has a mixin feature which allows code reuse and keeps code modular. It also supports the _super() method.
mixin using apply
mixin in create and extend
Now let's see the usage of mixin in create and extend. Since create and extend work similarly I am only going to discuss the create scenario.
Notice that in the first case the mixin code was executed first. In the second case the mixin code was executed later.
Here is how it works
Here is mergeMixins code which accepts the mixins and the base class. In the first case the mixins list is just the mixin and the base class is the main class.
At run time all the mixin properties are looped through. In the first case the mixin m has a property called skill .
The runtime detects that both the mixin and the base class have a property called skill . Since the base class has the first claim to the property, a call is made to link the _super of the second function to the first function.
So at the end of the execution the mixin code points to base code as _super.
It reverses itself in case of create
Next comes the main function and since the key is already taken the wrap function is used to map _super of main to point to the mixin .
Remember in Create and Extend it is the last one that executes first
Here is an example with two mixins.
Emberjs makes good use of mixin
emberjs has features like comparable, freezable, enumerable, sortable, observable. Take a look at this to checkout their code.
Here we have a Util class. But notice that all the methods on this class are class methods. This class does not have any instance variables. Usually a class is used to carry both data and behavior and, in this case, the Util class has only behavior and no data.
Similar utility tools in ruby
Now to get some perspective on this discussion let's look at some ruby methods that do a similar thing. Here are a few.
In all the above cases the class method is invoked without creating an instance first. So this is similar to the way I used Util.double .
However let's see what the class of each of these objects is.
So these are not classes but modules. That raises the question of why the smart guys at ruby-core implemented them as modules instead of creating a class the way I did for Util.
The reason is that a Class is too heavy for holding only methods like double. As we discussed earlier, a class is supposed to have both data and behavior. If the only thing you care about is behavior then ruby suggests implementing it as a module.
extend self is the answer
Before I go on to discuss extend self here is how my Util class will look after moving from Class to Module.
So how does extend self work
First let's see what extend does.
In the above case Calculator is extending module M and hence all the instance methods of module M are directly available to Calculator.
In this case Calculator is a class that extended the module M. However Calculator does not have to be a class to extend a module.
Now let's try a variation where Calculator is a module.
Here Calculator is a module that is extending another module.
Now that we understand that a module can extend another module, look at the above code and ask why module M is even needed. Why can’t we move the method double to module Calculator directly? Let’s try that.
I got rid of module M and moved the method double inside module Calculator. Since module M is gone I changed from extend M to extend Calculator.
One last fix.
Inside the module Calculator, what is self? self is the module Calculator itself. So there is no need to repeat Calculator twice. Here is the final version
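The whole progression can be put side by side in one runnable sketch. The names ClassCalculator and ModuleCalculator are made up here only because a single file cannot define Calculator as both a class and a module. The body of double is assumed to simply multiply by two.

```ruby
# Step 1: a class extends a module and gains its methods as class-level methods.
module M
  def double(input)
    input * 2
  end
end

class ClassCalculator
  extend M
end

# Step 2: a module can extend another module in exactly the same way.
module ModuleCalculator
  extend M
end

# Step 3: drop M and let the module extend itself.
module Calculator
  extend self

  def double(input)
    input * 2
  end
end

puts ClassCalculator.double(4)
puts ModuleCalculator.double(4)
puts Calculator.double(4)
```

All three calls print 8. The last form needs no helper module and no instance.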
Converting A Class into a Module
Every time I encountered code like extend self my brain would pause for a moment. Then I would google for it and read about it. Three months later I would repeat the whole process.
The best way to learn it is to use it. So I started looking for a case to use extend self. It is not a good practice to go hunting for code to apply an idea you have in your mind but here I was trying to learn.
Here is a before snapshot of methods from Util class I used in a project.
After using extend self code became
Much better. It makes the intent clear and, I believe, it is in line with the way ruby expects us to use modules.
Another usage inline with how Rails uses extend self
Here I am building an ecommerce application and each new order needs to get a new order number from a third party sales application. The code might look like this. I have omitted the implementation of the methods because they are not relevant to this discussion.
Here the method next_order_number might be making a complicated call to another sales system. Ideally the class Order should not expose method next_order_number . So we can make this method private but that does not solve the root problem. The problem is that model Order should not know how the new order number is generated. Well we can move the method next_order_number to another Util class but that would create too much distance.
Here is a solution using extend self.
Much better. The class Order is not exposing method next_order_number and this method is right there in the same file. No need to open the Util class.
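A minimal sketch of this arrangement is given below. The nested module name Util, the number method and the counter-based body of next_order_number are assumptions for illustration. The real method would call the third party sales application.

```ruby
class Order
  # Helper module nested inside Order. `extend self` makes its
  # instance methods callable directly on the module.
  module Util
    extend self

    # Stand-in for the call to the third party sales application.
    def next_order_number
      @counter = (@counter || 1000) + 1
      "ORD-#{@counter}"
    end
  end

  # The order asks the helper for a number once and remembers it.
  def number
    @number ||= Util.next_order_number
  end
end

order = Order.new
puts order.number
puts order.respond_to?(:next_order_number)
```

The order gets its number, yet next_order_number is not part of the Order instance interface. The helper still lives in the same file as the class.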
To see practical examples of extend self please look at Rails source code and search for extend self. You will find some interesting usage.
This is my first serious attempt to learn usage of extend self so that next time when I come across such code my brain does not freeze. If you think I have missed out something then do let me know.
The to_s method is defined in the Object class and hence all ruby objects have a to_s method.
Certain methods always call the to_s method. For example when we do string interpolation then the to_s method is called. puts also invokes the to_s method.
to_s is simply the string representation of the object.
Before we look at to_str let’s see a case where ruby raises error.
Here is the result
In the first two cases the to_s method of object e was printed.
However in case ‘3’ ruby raised an error.
Let’s read the error message again.
In this case on the left hand side we have a string object.
To this string object we are trying to add object e.
Ruby could have called to_s method on e and could have produced the result.
But ruby refused to do so.
Ruby refused to do so because it found that the object we are trying to add to string is not of type String.
When we call to_s we get the string representation of the object. But the object might or might not behave like a string.
Here we are not looking for the string representation of e.
What we want is for e to behave like a string.
And that is where to_str comes into the picture. I have a few more examples to clear this up, so hang in there.
What is to_str
If an object implements to_str method then it is telling the world that my class might not be String but for all practical purposes treat me like a string.
So if we want to make exception object behave like a string then we can add to_str method to it like this.
Now when we run the code we do not get any exception.
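The difference can be reproduced with two small classes. These are made-up classes for illustration (the original example used an exception object): one defines only to_s, the other also defines to_str.

```ruby
# Only to_s: fine for interpolation, but not accepted by String#+.
class PlainError
  def to_s
    "boom"
  end
end

# to_s and to_str: the object declares that it can stand in for a String.
class StringLikeError
  def to_s
    "boom"
  end

  def to_str
    to_s
  end
end

puts "interpolated: #{PlainError.new}"

begin
  "concatenated: " + PlainError.new
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

puts "concatenated: " + StringLikeError.new
```

Interpolation works for both classes. Concatenation only works for the one that implements to_str.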
What would happen if Fixnum has to_str method
Here is an example where ruby raises exception.
Here Ruby is saying that Fixnum is not like a string and it should not be added to String.
We can make Fixnum to behave like a string by adding a to_str method.
The practical usage of this example can be seen here.
In the above case ruby is refusing to invoke to_s on “1” because it
knows that adding “1” to a string does not feel right.
However we can add method to_str to Fixnum as shown in the last
section and then we will not get any error. In this case the result
will be as shown below.
Before the refactoring was done Path was a subclass of String. So it was a String and it had all the methods of a string.
As part of refactoring Path is no longer extending from String. However for all practical purposes it acts like a string. This line is important and I am going to repeat it. For all practical purposes Path here is like a String.
Here we are not talking about the string representation of Path. Here Path is so close to String that practically it can be replaced for a string.
So in order to be like a String class Path should have to_str method and that’s exactly what was done as part of refactoring.
During a discussion with my friends someone suggested that instead of defining to_str tenderlove could have just defined to_s and the result would have been the same.
Yes, the result would be the same whether you have defined to_s or to_str
if you are doing puts.
However in the following case just defining to_s will cause error.
Only by having to_str following case will work.
So the difference between defining to_s and to_str is not just what
you see in the output.
If a class defines to_str then that class is telling the world that although my class is not String you can treat me like a String.
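Here is a rough sketch of a Path-like wrapper. It is a hypothetical class written for this post, not the code from the Rails refactoring. It shows that the object is no longer a String, yet core String methods that expect a String argument still accept it.

```ruby
# Hypothetical Path class: it wraps a string instead of inheriting from String.
class Path
  def initialize(path)
    @path = path
  end

  def to_s
    @path
  end

  # Implicit conversion: lets core String methods accept a Path argument.
  alias to_str to_s
end

path = Path.new("app/views")

puts "#{path}/index.html.erb"             # interpolation only needs to_s
puts "root/" + path                       # String#+ needs to_str
puts "app/views/users".start_with?(path)  # so does String#start_with?
puts path.is_a?(String)                   # false: Path is not a String
```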
jQuery’s ajax method’s success callback function takes three parameters. Here is the
So if you are making ajax call using jQuery the code might look like
ajax using jquery-ujs
If you are using Rails and jquery-ujs then you might have code like this
Above code will not work. In order to make it work the very first element passed to the callback must be an event object. Here is the code that will work.
Remember that the jQuery api says that the first parameter should be “data”. So why do we need to pass an event object to make it work?
Why event object is needed
Here is snippet from jquery-ujs code
The thing about trigger method is that the event object is always passed as the first parameter to the event handler. This is why when you are using jquery-ujs you have to have the first parameter in the callback function an event object.
hacker has put in.
Most web applications have a form.
in address field and hits submit.
It means site has XSS vulnerability.
is able to run that code. No one should be allowed to
other persons’ cookie. Later we will see how hacker can do that.
If you are logged into an application then that application sets a cookie.
That is how the application knows that you are logged in.
If a hacker can see someone else’s cookie then the hacker can log in
as that person by
Having SSL does not protect site from XSS vulnerability.
XSS stands for Cross-site scripting.
It is a very misleading name because
XSS has absolutely nothing to do with
It has everything to do with a site, any site.
A practical example
It is very common to display address in a formatted way. Usually the code is something like this.
When developer looks at the html page developer will see something like
<br /> tag is literally shown on the screen.
Developer looks at the html markup rendered by Rails and it looks like
So the developer comes back to code and marks the string html_safe as shown below.
Now the browser renders the address with proper <br /> tag and the address looks nicely formatted
as shown below.
The developer is happy and the developer moves on.
However notice that developer has marked user input data like address1 as html_safe
and that’s dangerous.
Hacker in action
The application has a number of users and everything is running
All the users are seeing properly formatted address.
And then one day a hacker tried to hack the site.
The hacker puts in address1 as
If we look at the html markup then the html might look like this.
Hacker had put in <script> and the application sent that code to
in the process hacker is able to see the cookie.
How would hacker steal someone else’s information.
Let’s say that an application has a comment form.
In the comment form hacker puts in comment as following.
Next day another user,Mary, comes to the site and logs in. She is reading
the same post and that post has a lot of comments and one of the
comments is comment posted by the hacker.
The application loads all the comments including the comment
posted by the hacker.
And now Mary’s cookie information has been sent to hacker-site
and Mary is not even aware of it.
This is a classic case of XSS attack and this is how hacker
can next time login as Mary just by using her cookie information.
our application question is how do we prevent it.
Well there is only one way to prevent it. And that is do not send <script>
tag to the browser. If we send <script> tag to the browser then
So what can we do so that <script> tag is not sent to the browser.
Rails default behavior is to keep things secure
Before we start looking at solutions let's revisit what happened
earlier when we did not mark content as html_safe. So let’s remove
html_safe and let's try to see the content posted by the hacker.
So the code without html_safe would look like this.
And if we execute this code then the hacker's address would look like this.
see the address hacker had posted. Why is that. To answer that let’s
look at the html markup.
As we can see Rails did not render the address exactly as it was posted
by the hacker. Rails did something because of which
<script> turned into &lt;script&gt;.
Rails html escaped the content by using method
By default Rails assumes that all content is not safe and thus Rails
subjects all content to html_escape method.
Problem is that here we are trying to format the content using <br />
and Rails is escaping that also. We need to escape only the user content
and not escape <br />. Here is how we can do that.
In the above case we are marking the content as html_safe because
we subjected the content through html_escape and now we are sure
that no unescaped user content can go through.
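The same idea can be tried outside of Rails with ERB::Util.html_escape from the standard library. The helper name formatted_address and the sample input are made up for this sketch. In a Rails view the joined result is what would then be marked html_safe.

```ruby
require "erb"

# Escape each piece of user input, then join the pieces with a literal <br />.
# Only the separator is trusted markup; everything the user typed is escaped.
def formatted_address(*lines)
  lines.map { |line| ERB::Util.html_escape(line) }.join("<br />")
end

html = formatted_address("<script>alert(document.cookie)</script>", "Miami, FL")
puts html
```

The script tag posted by the hacker comes out as harmless text, while the <br /> added by the application is left intact.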
This will show address in the browser like this.
Above solution worked. <br /> is not escaped and user input was
Another solution using content_tag
In the above case we used html_escape and it worked. However if we
need to add say <strong> tag then adding the opening tag and then
closing tag could be quite cumbersome. For such cases we can use
By default content_tag escapes the input text.
simple_format for simple formatting
If you want to format the text a little bit then you can use
If user enters a bunch of text in text area then simple_format can help make the text look pretty
without compromising security.
It will strip away <script> and security sensitive tags.
html_escape internally uses
method. Note that simple_format will remove script tag while
solutions like html_escape will preserve script tag in escaped
$('body').append(data.about) does the job.
Well when that content is added to
now we are back to the same problem.
There are two ways we can handle this problem.
We can send the data as it is in JSON format. Then
data in such a way that html tags like script are not executed.
jQuery provides text(input) method which
escapes input value. Here is an example.
In this case the entire responsibility of escaping the content rests on
aware of which content is user input and must be escaped and which
content is not user input.
That is why we favor the solution where JSON content is escaped to begin
with. For escaping the content we can use h or html_escape helper
As you can see the user content is escaped. Now this data can be sent to
client side and we do not need to worry about script tag being
CSRF stands for Cross-site request forgery.
It is a technique hackers use to hack into a web application.
CSRF does not try to steal your information to log into the system.
CSRF assumes that you are already logged in at your site and when you visit say comments section of some other site then an attack is done on your site without you knowing it.
Here is how it might work.
You log in at www.mysite.com .
Now you open a new tab and you are visiting www.gardening.com since you are interested in gardening.
You are browsing the comments posted on the gardening.com forum. One of the comments posted has url which has source like this <img src="http://www.mysite.com/grant_access?user_id=1&project_id=123" />
Now if you are the admin of the project “123” in www.mysite.com then unknowingly you have granted admin access to user 1. And you did not even know that you did that.
I know you are thinking that loading an image will make a GET request and granting access is hidden behind POST request. So you are safe. Well the hacker can easily change code to make a POST request. In that case the code might look like this
Now when the image is loaded then a POST request is sent to the server and the application might grant access to this new user. Not good.
In order to prevent such things from happening Rails uses authenticity_token.
If you look at source code of any form generated by Rails you will see that form contains following code
The exact value of the authenticity_token will be different for you.
When the form is submitted the authenticity_token is submitted along with it. Rails verifies
the authenticity_token and only when it is verified is the request passed along for further processing.
In a brand new rails application the application_controller.rb has only one line.
That line protect_from_forgery checks for the authenticity of the incoming request.
Here is code that is responsible for generating csrf_token.
Since this csrf_token is a random value there is no way for hacker to know what the “csrf_token” is for my session. And hacker will not be able to pass the correct “authenticity_token”.
Do keep in mind that this protection is applied only to POST, PUT and DELETE requests by Rails. Rails states that GET should not be changing database in the first place so no need for check for authenticity of the token.
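The mechanism can be imitated in a few lines of plain Ruby. This is a simplified sketch and not the Rails source: the session is a bare hash, and the helper names and the :_csrf_token key are assumptions for the example.

```ruby
require "securerandom"

# Hypothetical session store; in Rails this would be the user's session.
session = {}

# Generate the token once per session and reuse it afterwards.
def csrf_token(session)
  session[:_csrf_token] ||= SecureRandom.base64(32)
end

# Only non-GET requests are checked, mirroring the rule described above.
# A production check should use a constant-time comparison instead of ==.
def verified_request?(session, http_method, submitted_token)
  return true if http_method == "GET"
  !submitted_token.nil? && submitted_token == csrf_token(session)
end

token = csrf_token(session)
puts verified_request?(session, "POST", token)     # genuine form submission
puts verified_request?(session, "POST", "guessed") # forged request
puts verified_request?(session, "GET", nil)        # GET is not checked
```

A page on another site can make the browser send the request, but it has no way to read the token stored in the victim's session, so the forged POST fails the check.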
Update for Rails 4
If you generate a brand new Rails application using Rails 4 then the application_controller.rb would look like this
Now the default value is to raise an exception if the token is not matched. The API calls will not have the token. If the application is expecting api calls then the strategy should be changed from :exception to :null_session.
Note that if the site is vulnerable to XSS then the hacker submits request as if he is logged in and in that case the CSRF attack will go through.
You have been assigned the task of figuring out in what order following tasks should be executed given their dependencies on other tasks.
If you look at these tasks and draw a graph then it might look like this.
Directed acyclic graph
The graph shown above is a “Directed acyclic graph” . In Directed acyclic graphs if you start following the arrow then you should never be able to get to the node from where you started.
Directed acyclic graphs are great at describing problems where a task is dependent on another set of tasks.
We started off with a set of tasks that are dependent on another set of tasks. To get the solution we need to sort the tasks in such a way that first task is not dependent on any task and the next task is only dependent on task previously done. So basically we need to sort the directed acyclic graph such that the prerequisites are done before getting to the next task.
Sorting of directed acyclic graph in the manner described above is called topological sorting .
Let's write code to find the solution to the original problem.
If I execute above code in ruby 1.9.2 I get following result.
So that is the order in which tasks should be executed .
How Tsort works
tsort requires that following two methods must be implemented.
#tsort_each_node - as the name suggests it is used to iterate over all the nodes in the graph. In the above example all the requirements are stored as a hash key . So to iterate over all the nodes we need to go through all the hash keys. And that can be done using #each_key method of hash.
#tsort_each_child - this method is used to iterate over all the child nodes for the given node. Since this is directed acyclic graph all the child nodes are the dependencies. We stored all the dependencies of a project as an array. So to get the list of all the dependencies for a node all we need to do is @requirements[name].each.
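To make the two methods concrete, here is a small self-contained sketch. The class name TaskGraph and the task names are invented for this illustration. They are not the tasks from the original problem.

```ruby
require "tsort"

class TaskGraph
  include TSort

  # requirements maps each task name to the array of tasks it depends on.
  def initialize(requirements)
    @requirements = requirements
  end

  # Every hash key is a node in the graph.
  def tsort_each_node(&block)
    @requirements.each_key(&block)
  end

  # The children of a node are its dependencies.
  def tsort_each_child(name, &block)
    @requirements.fetch(name).each(&block)
  end
end

graph = TaskGraph.new(
  "deploy"  => ["test", "build"],
  "test"    => ["build"],
  "build"   => ["install"],
  "install" => []
)

p graph.tsort # => ["install", "build", "test", "deploy"]
```

The result lists every task after all of its dependencies. If the requirements contained a cycle, tsort would raise TSort::Cyclic instead of returning an order.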
To make things clearer let's try to solve the same problem in a different way.
When I execute the above code this is the result I get
If you look at the code here I am doing exactly the same thing as in the
Using before and after option
Let’s try to solve the same problem one last time using before and after option. Here is the code.
Here is the result.
Sorting of rails initializer
If you have written a rails plugin then you can use code like this
The way rails figures out the exact order in which initializer should be executed is exactly same as I illustrated above. Here is the code from rails.
When Rails boots it invokes a lot of initializers. Rails uses tsort to get the order in which initializers should be invoked. Here is the list of unsorted initializers. After sorting the initializers list is this .
Where else it is used
Bundler uses tsort to find the order in which gems should be installed.
Tsort can also be used to statically analyze programming code by looking at method dependency graph.
It comes up very often: should I use alias or alias_method? Let’s take a look at them in a bit of detail.
Usage of alias
Usage of alias_method
First difference you will notice is that in case of alias_method we need to use a comma between the “new method name” and “old method name”.
alias_method takes both symbols and strings as input. Following code would also work.
That was easy. Now let’s take a look at how scoping impacts usage of alias and alias_method .
Scoping with alias
In the above case method “name” picks the method “full_name” defined in “Developer” class. Now let’s try with alias.
With the usage of alias the method “name” is not able to pick the method “full_name” defined in Developer.
This is because alias is a keyword and it is lexically scoped. It means it treats self as the value of self at the time the source code was read . In contrast alias_method treats self as the value determined at the run time.
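The scoping difference can be seen by running both variants side by side. Since one file cannot define the same class in two ways, the second pair of classes is renamed to Person and Programmer. Those names and the add_rename helper are assumptions for this sketch.

```ruby
# alias_method: self is resolved at run time, so Developer#full_name is aliased.
class User
  def full_name
    "Johnnie Walker"
  end

  def self.add_rename
    alias_method :name, :full_name
  end
end

class Developer < User
  def full_name
    "Geeky geek"
  end
  add_rename
end

# alias: lexically scoped, so the full_name of the enclosing class is aliased.
class Person
  def full_name
    "Johnnie Walker"
  end

  def self.add_rename
    alias name full_name
  end
end

class Programmer < Person
  def full_name
    "Geeky geek"
  end
  add_rename
end

puts Developer.new.name  # Geeky geek
puts Programmer.new.name # Johnnie Walker
```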
Overall my recommendation would be to use alias_method. Since alias_method is a method defined in class Module it can be overridden later and it offers more flexibility.
Function bindAll internally uses bind . And bind internally uses apply. So it is important to understand what apply does.
If I execute above code then I get [object window] is beautiful. I am getting that message because when function is invoked then this is window, the default global object.
In order to change the value of this we can make use of method apply as given below.
In the above case the alert message will be Internet is beautiful . Similarly following code will produce Beach is beautiful .
In short, apply lets us control the value of this when the function is invoked.
Why bind is needed
In order to understand why bind method is needed first let’s look at following example.
Above example is pretty straight forward. john is an instance of Developer and when says function is invoked then we get the right alert message.
Notice that when we invoked says we invoked like this john.says(). If we just want to get hold of the function that is returned by says then we need to do john.says. So the above code could be broken down to following code.
Above code is similar to the code above it. All we have done is to store the function in a variable called func. If we invoke this function then we should get the alert message we expected. However if we run this code then the alert message will be undefined rocks!.
We are getting undefined rocks! because in this case func is being invoked in the global context. this is pointing to global object called window when the function is executed. And window does not have any attribute called skill . Hence the output of this.skill is undefined.
Earlier we saw that using apply we can fix the problem arising out of this. So let's try to use apply to fix it.
Above code fixes our problem. This time the alert message we got was Ruby rocks!. However there is an issue and it is a big one.
We should try to create functions which can easily be called by the consumers of the function. This is where bind comes in.
How bind solves the problem
First let's see how using bind solves the problem.
To solve the problem regarding this issue we need a function that is already mapped to john so that we do not need to keep carrying john around. That’s precisely what bind does. It returns a new function and this new function has this bound to the value that we provide.
Here is a snippet of code from bind method
As you can see bind internally uses apply to set this to the second parameter we passed while invoking bind.
Notice that bind does not change existing function. It returns a new function and that new function should be used.
How bindAll solves the problem
Instead of bind we can also use bindAll . Here is solution with bindAll.
Above code is similar to bind solution but there are some big differences.
The first big difference is that we do not have to worry about the returned value of bindAll . In case of bind we must use the returned function. In bindAll we do not have to worry about the returned value but it comes with a price. bindAll actually mutates the function. What does that mean.
See john object has an attribute called says which returns a function . bindAll goes and changes the attribute says so that when it returns a function, that function is already bound to john.
Here is a snippet of code from bindAll method.
Notice that bindAll internally calls bind and it overrides the existing attribute with the function returned by bind.
The other difference between bind and bindAll is that in bind first parameter is a function john.says and the second parameter is the value of this john. In bindAll first parameter is value of this john and the second parameter is not a function but the attribute name.
Things to watch out for
While developing a Backbone.js application someone had code like this
Above code will not work because the returned value of bind is not being used. The correct usage will be
Or you can use bindAll as given below.
C programming language allows developers to directly access the memory where variables are stored. Ruby does not allow that. There are times while working in Ruby when you need to access the underlying bits and bytes. Ruby provides two methods pack and unpack for that.
Here is an example.
In the above case ‘A’ is a string which is being stored and using unpack I am trying to read the bit value. The ASCII table says that the ASCII value of ‘A’ is 65. The binary representation of 65 is 1000001, which, padded to eight bits and written with the least significant bit first, is 10000010 .
Here is another example.
Notice the difference in result from the first case. What’s the difference between b* and B*? In order to understand the difference first let's discuss MSB and LSB.
Most significant bit vs Least significant bit
All bits are not created equal. C has ascii value of 67. The binary value of 67 is 1000011.
First let’s discuss MSB (most significant bit) style . If you are following MSB style then going from left to right (and you always go from left to right) then the most significant bit will come first. Because the most significant bit comes first we can pad an additional zero to the left to make the number of bits eight. After adding an additional zero to the left the binary value looks like 01000011.
If we want to convert this value to the LSB (Least Significant Bit) style then we need to store the least significant bit first going from left to right. Given below is how the bits will be moved if we are converting from MSB to LSB. Note that in the below case position 1 refers to the leftmost bit.
After the exercise is over the value will look like 11000010.
We did this exercise manually to understand the difference between most significant bit and least significant bit. However unpack method can directly give the result in both MSB and LSB. The unpack method can take both b* and B* as the input. As per the ruby documentation here is the difference.
Now let’s take a look at two examples.
Both b* and B* are looking at the same underlying data. It’s just that they represent the data differently.
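As a quick check of the two directives, here is a minimal sketch that unpacks the same byte both ways. The characters are the ones used above; reversing the MSB string by hand is only my way of mirroring the manual exercise, not something unpack does internally.

```ruby
# 'C' is byte 67, i.e. 01000011 when written most significant bit first.
msb = 'C'.unpack('B*').first   # => "01000011"
lsb = 'C'.unpack('b*').first   # => "11000010"

# The LSB-first string is the MSB-first string read backwards.
puts msb
puts lsb
puts msb.reverse == lsb        # prints true

# The same holds for 'A' (byte 65).
a_msb = 'A'.unpack('B*').first # => "01000001"
a_lsb = 'A'.unpack('b*').first # => "10000010"
puts a_msb
puts a_lsb
```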
Different ways of getting the same data
Let’s say that I want binary value for string hello . Based on the discussion in the last section that should be easy now.
The same information can also be derived as
Let’s break down the previous statement in small steps.
Directive C* gives the 8-bit unsigned integer value of the character. Note that ascii value of h is 104, ascii value of e is 101 and so on.
Using the technique discussed above I can find hex value of the string.
Hex value can also be achieved directly.
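The two routes to the same data can be compared in a short sketch. The padding with rjust is my addition: Integer#to_s drops leading zeros, so each byte has to be padded back to full width before the result matches what unpack returns directly.

```ruby
bytes = 'hello'.unpack('C*')            # => [104, 101, 108, 108, 111]

# Route 1: go through the integer value of every byte.
binary_via_bytes = bytes.map { |b| b.to_s(2).rjust(8, '0') }.join
hex_via_bytes    = bytes.map { |b| b.to_s(16).rjust(2, '0') }.join

# Route 2: ask unpack for the representation directly.
binary_direct = 'hello'.unpack('B*').first
hex_direct    = 'hello'.unpack('H*').first   # => "68656c6c6f"

puts binary_via_bytes == binary_direct  # prints true
puts hex_via_bytes == hex_direct        # prints true
```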
High nibble first vs Low nibble first
Notice the difference in the below two cases.
As per ruby documentation for unpack
A byte consists of 8 bits. A nibble consists of 4 bits. So a byte has two nibbles. The ascii value of ‘h’ is 104. Hex value of 104 is 68. This 68 is stored in two nibbles. First nibble, meaning 4 bits, contain the value 6 and the second nibble contains the value 8. In general we deal with high nibble first and going from left to right we pick the value 6 and then 8.
However if you are dealing with low nibble first then low nibble value 8 will take the first slot and then 6 will come. Hence the result in “low nibble first” mode will be 86.
This pattern is repeated for each byte. And because of that a hex value of 68 65 6c 6c 6f looks like 86 56 c6 c6 f6 in low nibble first format.
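A small sketch of the nibble swap described above. The scan-and-reverse step is only an illustration of mine to show that the two outputs differ by swapping the two hex digits of each byte.

```ruby
high_first = 'hello'.unpack('H*').first   # => "68656c6c6f"
low_first  = 'hello'.unpack('h*').first   # => "8656c6c6f6"

# Swap the two hex digits of every byte to convert one form into the other.
swapped = high_first.scan(/../).map(&:reverse).join

puts high_first
puts low_first
puts swapped == low_first                 # prints true
```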
Mix and match directives
In all the previous examples I used *. A * means keep applying the directive until the data runs out. Let’s see a few examples.
A single C will get a single byte.
You can add more Cs if you like.
Rather than repeating all those directives, I can put a number after a directive to say how many times it should be repeated.
I can use * to capture all the remaining bytes.
Below is an example where MSB and LSB are being mixed.
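The combinations discussed in this section can be tried in one go. This is a sketch using the string hello from the earlier sections; the particular directive strings are my own picks for illustration.

```ruby
s = 'hello'

one   = s.unpack('C')     # => [104]
two   = s.unpack('CC')    # => [104, 101]
three = s.unpack('C3')    # => [104, 101, 108]
rest  = s.unpack('CC*')   # => [104, 101, 108, 108, 111]

# B8 reads the first byte MSB first, b8 reads the second byte LSB first.
mixed = s.unpack('B8b8')  # => ["01101000", "10100110"]

p one, two, three, rest, mixed
```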
pack is reverse of unpack
Method pack goes in the opposite direction: it takes an array of values and writes them into a binary string according to the directives. Let’s discuss a few examples.
In the above case the binary value is being interpreted as 8 bit unsigned integer and the result is ‘A’.
In the above case the input ‘A’ is not ASCII ‘A’ but the hex ‘A’. Why is it hex ‘A’? Because the directive ‘H’ tells pack to treat the input value as a hex value. Since ‘H’ is high nibble first and the input has only one nibble, the second nibble is zero. So the input changes from ['A'] to ['A0'] .
Since hex value A0 does not translate into anything in the ASCII table the final output is left as is, and hence the result is \xA0. The leading \x indicates that the value is a hex value.
Notice that in hex notation A is the same as a. So in the above example I can replace A with a and the result should not change. Let’s try that.
Let’s discuss another example.
In the above example notice the change. I changed the directive from H to h. Since h means low nibble first and the input has only one nibble, the input value is treated as the low nibble and the missing high nibble becomes zero. That means the value changes from ['a'] to ['0a'], and the output will be \x0A. If you look at the ASCII table, hex value A is ASCII value 10, which is NL (line feed, new line). Hence we see \n as the output because it represents a new line.
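The pack cases discussed in this section can be checked together. This is a minimal sketch; I compare bytes rather than printed strings so that the result does not depend on how the terminal renders \xA0.

```ruby
a  = [65].pack('C')     # 8-bit unsigned integer 65 => "A"
hi = ['A'].pack('H')    # one hex digit, high nibble first => byte 0xA0
lo = ['a'].pack('h')    # one hex digit, low nibble first  => byte 0x0A

puts a                    # prints A
p hi.bytes                # prints [160]
p lo.bytes                # prints [10]
p lo == "\n"              # prints true
p ['a'].pack('H') == hi   # prints true, hex digits are case-insensitive
```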
Usage of unpack in Rails source code
I did a quick grep in Rails source code and found following usage of unpack.
We have already seen the directives C* and H used with unpack. With unpack, the directive m decodes a base64 encoded value and the directive U* returns the Unicode code point of each UTF-8 character. Here is an example.
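A short sketch of both directives follows. The sample strings are mine and are not taken from the Rails source.

```ruby
# m decodes base64: "aGVsbG8=" is the base64 form of "hello".
decoded = 'aGVsbG8='.unpack('m').first   # => "hello"

# U* returns one Unicode code point per UTF-8 character.
points = "h\u00e9llo".unpack('U*')       # => [104, 233, 108, 108, 111]

# pack('U*') turns the code points back into a UTF-8 string.
round_trip = points.pack('U*')

puts decoded
p points
puts round_trip == "h\u00e9llo"          # prints true
```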
Above code was tested with ruby 1.9.2 .
French version of this article is available here .
If you want the output in xml format then add the .xml extension to the request, like localhost:3000/users.xml , and you will get the output in xml format.
What we saw is only one part of the puzzle. The other part is the HTTP header field Accept, defined in the HTTP RFC.
HTTP Header Field Accept
When a browser sends a request it also sends information about what kinds of resources it is capable of handling. Here are some examples of the Accept header a browser can send.
If you are reading this blog on a browser then you can find out what kind of Accept header your browser is sending by visiting this link. Here is list of Accept header sent by different browsers on my machine.
Let’s take a look at the Accept header sent by Safari.
Safari is saying that I can handle documents which are xml (application/xml), html (text/html) or plain text (text/plain) documents. And I can handle images such as image/png. If all else fails then send me whatever you can and I will try to render that document to the best of my ability.
Notice that there are also q values. That signifies the priority order. This is what HTTP spec has to say about q.
Each media-range MAY be followed by one or more accept-params, beginning with the “q” parameter for indicating a relative quality factor. The first “q” parameter (if any) separates the media-range parameter(s) from the accept-params. Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.
What the spec is saying is that each document type has a default q value of 1. When a q value is specified, take that value into account. For all documents that have the same q value, give higher priority to the one that came first in the list. Based on that, this should be the order in which documents are sent to the Safari browser.
Notice that Safari is nice enough to put a lower priority on */*. Chrome and Firefox also put */* at a lower priority, which is a good thing. Not so with IE, which does not declare any q value for */* .
Look at the order again and you can see that application/xml has higher priority than text/html. What it means is that Safari is telling Rails that it would prefer application/xml over text/html: send text/html only if you cannot send application/xml.
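The ordering rule can be sketched in a few lines of plain Ruby. This is not the parser Rails uses; preferred_order is a hypothetical helper, and the header string is an assumed example in the style Safari sent at the time, so substitute whatever your browser actually sends.

```ruby
# Assumed example header, for illustration only.
accept = 'application/xml,application/xhtml+xml,text/html;q=0.9,' \
         'text/plain;q=0.8,image/png,*/*;q=0.5'

# Hypothetical helper: returns media types ordered by descending q,
# with position in the header breaking ties.
def preferred_order(header)
  entries = header.split(',').each_with_index.map do |item, index|
    type, *params = item.strip.split(';').map(&:strip)
    q_param = params.find { |param| param.start_with?('q=') }
    q = q_param ? q_param.split('=', 2).last.to_f : 1.0  # default q is 1
    [type, q, index]
  end
  entries.sort_by { |_type, q, index| [-q, index] }.map(&:first)
end

order = preferred_order(accept)
puts order
```

For the assumed header this prints application/xml first and */* last, with text/html and text/plain after every type that has no explicit q value.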
And let’s say that you have developed a RESTful app which is capable of sending output in both html and xml formats.
Rails, being a good HTTP citizen, should follow the HTTP_ACCEPT protocol and send an xml document in this case. Again, all you did was visit a website, and Safari is telling Rails to send an xml document in preference to an html document. Clearly the HTTP_ACCEPT values being sent by Safari are broken.
HTTP_ACCEPT is broken
The HTTP_ACCEPT concept is neat. It defines the order and the priority. However the implementation is broken by all the browser vendors. Given that browsers do not send a proper HTTP_ACCEPT, what can Rails do? One solution is to ignore it completely. If you want xml output then request http://localhost:3000/users.xml . Relying solely on formats makes life easier and less buggy. This is what Rails did for a long time.
Starting with this commit, Rails ignored the HTTP_ACCEPT attribute by default. The same is true of the Twitter API, where the HTTP_ACCEPT attribute is ignored and Twitter relies solely on the format to find out what kind of document should be returned.
Unfortunately this solution has its own set of problems. The web has been around for a long time and there are a lot of applications that expect the response to be an RSS feed if they send application/rss+xml in their HTTP_ACCEPT attribute. It is not nice to take a hard stand and ask all of them to request with the extension .rss .
Parsing HTTP_ACCEPT attribute
Parsing and obeying HTTP_ACCEPT attribute is filled with many edge cases. First let’s look at the code that decides what to parse and how to handle the data.
Notice that if a format is passed, as in http://localhost:3000/users.xml or http://localhost:3000/users.js , then Rails does not even parse the HTTP_ACCEPT values. Also note that if the browser is sending */* along with other values then Rails bails out entirely and just returns Mime::HTML, unless the request is an ajax request.
Next I am going to discuss some of the cases in greater detail which should bring more clarity around this issue.
Case 1: HTTP_ACCEPT is */*
I have following code.
I am assuming that the HTTP_ACCEPT value is */* . In this case the browser is saying: send me whatever you have got. Since the browser is not dictating the order in which documents should be sent, Rails will look at the order in which Mime types are declared in the respond_to block and pick the first one. Here is the corresponding code.
What it’s saying is that if Mime::ALL is sent then pick the first one declared in the respond_to block. So be careful with order in which formats are declared inside the respond_to block.
The order in which formats are declared can be a real issue. Check out these two cases where the author ran into trouble because of the order in which formats are declared.
So far so good. However, what if there is no respond_to block? If I don’t have a respond_to block and I have index.html.erb, index.js.erb and index.xml.builder files in my view directory, which one will be picked up? In this case Rails will go over all the registered formats in the order in which they are declared and try to find a match. So in this case it matters in what order Mime types are registered. Here is the code that registers Mime types.
Case 2: HTTP_ACCEPT with no */*
I am going to assume that in this case HTTP_ACCEPT sent by browser looks really simple like this
I am also assuming that my respond_to block looks like this
So browser is saying that I prefer documents in following order
The order in which formats are declared is
In this case Rails will go through each Mime type that the browser supports, from top to bottom, one by one. If a match is found then the response is sent; otherwise Rails tries to find a match for the next Mime type. First in the list of Mime types supported by the browser is js, and Rails does find that my respond_to block supports .js . Rails executes the format.js block and the response is sent to the browser.
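The selection logic described in Case 1 and Case 2 can be sketched in plain Ruby. This is a simplified model of the behaviour described in the text, not the code Rails runs; pick_format and the sample lists are hypothetical.

```ruby
# accepted: media types in the browser's order of preference
# declared: media types in the order they appear in the respond_to block
def pick_format(accepted, declared)
  # Case 1: the browser accepts anything, so the first declared format wins.
  return declared.first if accepted == ['*/*']

  # Case 2: walk the browser's list and take the first type the block supports.
  accepted.find { |type| declared.include?(type) }
end

declared = ['text/html', 'application/xml', 'text/javascript']

case_one = pick_format(['*/*'], declared)
case_two = pick_format(['text/javascript', 'application/xml'], declared)
no_match = pick_format(['image/png'], declared)

p case_one   # prints "text/html"
p case_two   # prints "text/javascript"
p no_match   # prints nil
```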
Case 3: Ajax requests
When an AJAX request is made, Safari, Firefox and Chrome send only one item in HTTP_ACCEPT, and that is */*. So if you are making an AJAX request then HTTP_ACCEPT for these three browsers will look like
and if your respond_to block looks like this
then the first one will be served based on the formats order. And in this case html response would be sent for an AJAX request. This is not what you want.
This is the reason why if you are using jQuery and if you are sending AJAX request then you should add something like this in your application.js file
If you are using a newer version of rails.js then you don’t need to add the above code, since it is already taken care of for you through this commit .
Trying it out
If you want to play with HTTP_ACCEPT header then put the following line in your controller to inspect the HTTP_ACCEPT attribute.
I used following rake task to set custom HTTP_ACCEPT attribute.
I got familiar with the intricacies of mime parsing while working on ticket #6022 .
A big thanks to
for patiently dealing with me while working on this ticket.
He likes to declare variables where they are used, to be sure that each variable being used is declared with var; otherwise that variable will become a global variable. This fear of accidentally creating a global variable makes him want to see the variable declaration next to where it is being used.
Use the right tool
In the above case user has declared payment variable in the middle so that he is sure that payment is declared. However if there is a typo as given below then he has accidentally created a global variable “payment”.
You can configure things so that JSLint validation runs when you check your code into git or when you push to github. Or you can have a custom rake task. Many solutions are available; choose the one that fits you. But do not rely on manual inspection.
Variable declarations are moved to the top by the browser
Take a look at the following code. One might expect that console.log will print “Neeraj” but the output will be “undefined” . That is because even though you have declared variables next to where they are being used, browsers lift those declarations to the very top.
Browser converts above code into one shown below.
In order to avoid this kind of mistake it is preferable to declare variables at the top like this.
Looking at the first set of code a person might think that
Implications on how functions are declared
There are two ways of declaring a function.
In the first case only the variable declaration myfunc is hoisted. The definition of myfunc is NOT hoisted. In the second case both the variable declaration and the function definition are hoisted. For more information on this refer to my previous blog on the same topic.
jQuery 1.4.3 was recently released. If you upgrade to jQuery 1.4.3 you will notice that the behavior of return false has changed in this version. First let’s see what return false does.
First ensure that above code is executed on domready. Now if I click on any link then two things will happen.
As the name suggests, calling e.preventDefault() will make sure that the default behavior is not executed.
If above link is clicked then the default behavior of the browser is to take you to www.google.com. However by invoking e.preventDefault() browser will not go ahead with default behavior and I will not be taken to www.google.com.
When a link is clicked, a “click event” is created, and this event bubbles all the way up to the top. By invoking e.stopPropagation I am asking the browser not to propagate the event. In other words the event will stop bubbling.
If I click on “click me” then “click event” will start bubbling. Now let’s say that I catch this event at .two and if I call e.stopPropagation() then this event will never reach to .first .
First note that you can bind more than one event to an element. Take a look at following case.
I am going to bind three events to the above element.
In this case there are three events bound to the same element. Notice that second event binding invokes e.stopImmediatePropagation() . Calling e.stopImmediatePropagation does two things.
Just like stopPropagation it will stop the bubbling of the event. So any parent of this element will not get this event.
However stopImmediatePropagation also stops the event from reaching the other handlers bound to the same element. It kills the event right then and there. That’s it. End of the event.
Once again calling stopPropagation means stop this event going to parent. And calling stopImmediatePropagation means stop passing this event to other event handlers bound to itself.
Now that I have described what preventDefault, stopPropagation and stopImmediatePropagation do, let’s see what changed in jQuery 1.4.3.
In jQuery 1.4.2 when I execute “return false” then that action was same as executing:
Now e.stopImmediatePropagation internally calls e.stopPropagation, but I have added it here for visual clarity.
The fact that return false was calling e.stopImmediatePropagation was a bug. Get that. It was a bug which got fixed in jQuery 1.4.3.
So in jquery 1.4.3 e.stopImmediatePropagation is not called. Checkout this piece of code from events.js of jquery code base.
As you can see when return false is invoked then e.stopImmediatePropagation is not called.
I tried to find which commit made this change but I could not go far because of this issue.
It gets complicated with live and a bug in jQuery 1.4.3
To complicate matters, jQuery 1.4.3 has a bug in which e.stopImmediatePropagation does not work. Here is a link to this bug I reported.
To understand the bug take a look at following code:
Since I am invoking e.stopImmediatePropagation I should never see alert world. However you will see that alert if you are using jQuery 1.4.3. You can play with it here .
This bug has been fixed as per this commit . Note that the commit mentioned was done after the release of jQuery 1.4.3. To get the fix you will have to wait for jQuery 1.4.4 release or use jQuery edge.
I am using rails.js (jquery-ujs). What do I do?
As I have shown, “return false” does not work the way it used to in jQuery 1.4.3 . However I would like to have as much backward compatibility in jquery-ujs as possible, so that the same code base works with jQuery 1.4 through 1.4.3, since not everyone upgrades immediately.
should make jquery-ujs jquery 1.4.3 compatible.
have been logged at jquery-ujs and I will take a look at all of them one by one. Please do provide your feedback.
Nothing great there. However try passing a parameter to instance_eval .
You will get following error.
So instance_eval does not allow you to pass parameters to a block.
How to get around the restriction that instance_eval does not accept parameters
instance_exec was added to ruby 1.9 and it allows you to pass parameters to a proc. This feature has been backported to ruby 1.8.7 so we don’t really need ruby 1.9 to test this feature. Try this.
Above code works. So now we can pass parameters to block. Good.
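For readers who want to run both calls side by side, here is a minimal sketch. The class Klass and the instance variable @secret are placeholder names of mine, not something this post depends on.

```ruby
class Klass
  def initialize
    @secret = 99
  end
end

k = Klass.new

# instance_eval runs the block with self set to k, but accepts no arguments
# when a block is given.
plain = k.instance_eval { @secret }            # => 99

error = begin
  k.instance_eval(5) { |x| @secret + x }
rescue ArgumentError => e
  e.class                                      # the call is rejected
end

# instance_exec forwards its arguments to the block.
sum = k.instance_exec(5) { |x| @secret + x }   # => 104

p plain, error, sum
```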
Changing value of self
Another feature of instance_exec is that it changes the value of self. To illustrate that I need to give a longer example.
Notice that in the above case Developer.lab says “Human”. That is the right answer from Ruby’s perspective, but it is not what I intended. Ruby stores the binding of the proc in the context in which it was created, and hence it rightly reports that self is “Human” even though the proc is being called by Developer.
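The difference between calling the proc and running it through instance_exec can be seen in a small sketch. The method names lab_with_call and lab_with_instance_exec and the constant WHO_AM_I are hypothetical, chosen only to put the two behaviours next to each other.

```ruby
class Human
  # The proc is created inside the class body, so it captures self == Human.
  WHO_AM_I = proc { self }

  def self.lab_with_call
    WHO_AM_I.call                # self stays whatever the proc captured
  end

  def self.lab_with_instance_exec
    instance_exec(&WHO_AM_I)     # self is replaced by the receiver
  end
end

class Developer < Human
end

via_call = Developer.lab_with_call
via_exec = Developer.lab_with_instance_exec

p via_call   # prints Human
p via_exec   # prints Developer
```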
Rails developers know that in development mode classes are loaded on demand. In production mode all the classes are loaded as part of bootstrapping the system. Also, in development mode classes are reloaded every single time the page is refreshed.
In order to reload a class, Rails first has to unload it. That unloading is done something like this.
However a class might have other constants and they need to be unloaded too. Before you unload those constants you need to know all the constants that are defined in the class that is being loaded. Long story short, Rails keeps track of every single constant that is loaded when it loads User or UsersController.
Dependency mechanism is not perfect
Sometimes the dependency mechanism in Rails lets a few things fall through the cracks. Try the following case.
Start the server in development mode and visit http://localhost:3000/users . First time every thing will come up fine. Now refresh the page. This time you should get an exception uninitialized constant OpenURI .
So what’s going on?
After the page is served the very first time, at the end of the response Rails unloads all the constants that were autoloaded, including UsersController. However, while unloading UsersController Rails also unloads OpenURI.
When the page is refreshed, UsersController will be loaded and require 'open-uri' will be called. However that require will return false.
Why require returns false
Try the following test case in irb.
step 3 : ensure that OpenStruct is truly removed
Notice that in the above case in step 4 require returns false. ‘require’ checks against $LOADED_FEATURES. When OpenStruct was removed then it was not removed from $LOADED_FEATURES and hence ruby thought ostruct is already loaded.
How to get around this issue
require loads only once. However load loads every single time. Instead of ‘require’, ‘load’ could be used in this case.
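The same behaviour can be reproduced without touching OpenStruct. This sketch writes a throwaway file into a temporary directory; demo_feature.rb and DemoFeature are made-up names used only for the demonstration.

```ruby
require 'tmpdir'

results = Dir.mktmpdir do |dir|
  path = File.join(dir, 'demo_feature.rb')
  File.write(path, "class DemoFeature; end\n")

  first = require(path)                      # file is loaded for the first time
  Object.send(:remove_const, :DemoFeature)   # unload the constant

  second = require(path)                     # path is still in $LOADED_FEATURES
  after_require = Object.const_defined?(:DemoFeature)

  load(path)                                 # load re-reads the file every time
  after_load = Object.const_defined?(:DemoFeature)

  { first: first, second: second,
    after_require: after_require, after_load: after_load }
end

p results
```

The second require returns false and leaves the constant undefined, while load brings it back.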
Back to the original problem
In our Rails application a refresh of the page is failing. To get around that issue use require_dependency instead of require. require_dependency is a Rails thing. Under the hood Rails does the same trick we did in the previous step: it calls Kernel.load to load the constants that would fail if require were used.