jQuery’s motto is to select something and do something with it. As jQuery users, we provide the selection criteria and then get busy doing something with the result. This is a good thing. jQuery provides an extremely simple API for selecting elements. If you are selecting by id, just prefix the name with ‘#’. If you are selecting by class, prefix it with ‘.’.
However, it is important to understand what goes on behind the scenes, for many reasons. One of the most important is the performance of rich client applications. As more and more web pages use more and more jQuery code, understanding how jQuery selects elements helps you write selectors that keep pages fast.
What is a selector engine
HTML documents are full of markup, arranged in a tree-like structure. Ideally, every HTML document would be as well formed as a valid XML document. However, if you forget to close a div, browsers forgive you (unless you have asked for strict parsing). Ultimately the browser engine ends up with a well-formed tree of elements, which it then renders as a web page.
Once the page is rendered, the elements in that tree are referred to as DOM elements.
Browsers help you to get to certain elements
Browsers do provide some helper functions to get to certain types of elements. For example, if you want the DOM element with id header, you can call document.getElementById('header').
Similarly, if you want to collect all the p elements in a document, you can call document.getElementsByTagName('p').
However, if you wanted something more complex than that, browsers were not much help. It was possible to walk up and down the tree, but traversing it was tricky for two reasons: a) the DOM spec is not very intuitive, and b) not all browsers implemented the DOM spec in the same way.
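To give a feel for what hand-rolled traversal looks like, here is a minimal sketch that collects every a element under the element with id header by walking childNodes recursively. The helper names findById, collectByTag and el are hypothetical, and the sample tree is built from plain objects shaped like DOM nodes so the snippet also runs outside a browser.

```javascript
// Hypothetical helpers: depth-first walk over objects shaped like DOM nodes
// (nodeType, nodeName, id, childNodes). nodeType 1 marks an element node.
function findById(node, id) {
  if (node.nodeType === 1 && node.id === id) return node;
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var hit = findById(children[i], id);
    if (hit) return hit;
  }
  return null;
}

function collectByTag(node, tagName, out) {
  out = out || [];
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === 1 && child.nodeName.toLowerCase() === tagName) {
      out.push(child);
    }
    collectByTag(child, tagName, out); // keep descending into this child
  }
  return out;
}

// Stand-in tree: <body><div id="header"><a/><p><a/></p></div><a/></body>
function el(name, id, kids) {
  return { nodeType: 1, nodeName: name.toUpperCase(), id: id || '', childNodes: kids || [] };
}
var tree = el('body', '', [
  el('div', 'header', [el('a'), el('p', '', [el('a')])]),
  el('a')
]);

var headerNode = findById(tree, 'header');
var links = headerNode ? collectByTag(headerNode, 'a') : [];
console.log(links.length); // counts only the links inside #header
```

Even this small case needs two recursive helpers, which is the kind of boilerplate a selector engine hides.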
Later, the Selectors API came out.
The latest versions of all the major browsers support this specification, including IE8. However, IE7 and IE6 do not support it. This API provides the querySelectorAll method, which allows one to write a complex selector query like document.querySelectorAll("#score>tbody>tr>td:nth-of-type(2)").
It means that if you are using IE8 or a current version of any other modern browser, the jQuery code jQuery('#header a') will not even hit Sizzle, the selector engine that jQuery uses. That query will be served by a call to querySelectorAll.
However, if you are using IE6 or IE7, Sizzle will be invoked for jQuery('#header a'). This is one of the reasons why some apps perform much slower on IE6/7 compared to IE8: a native browser function is much faster than element retrieval by Sizzle.
jQuery has a lot of optimizations baked in to make things run faster. In this section I will go through some queries and try to trace the route jQuery follows for each.
When jQuery sees that the input string is just one word and is looking for an id, as in $('#header'), it invokes document.getElementById. Straight and simple. Sizzle is not invoked.
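The shortcut can be pictured roughly like this. The regular expression and the function name quickId are simplifications invented for illustration, not jQuery's actual source, and the doc parameter stands in for document so the sketch can be exercised without a browser.

```javascript
// Assumed pattern: '#' followed by a single word and nothing else.
var ID_ONLY = /^#([\w-]+)$/;

// Returns the element for a bare id selector, or undefined to signal
// "not a bare id, take the general path instead".
function quickId(selector, doc) {
  var match = ID_ONLY.exec(selector);
  if (!match) return undefined;
  return doc.getElementById(match[1]); // native lookup, no selector engine
}

// Stand-in for document that knows a single element.
var fakeDoc = {
  getElementById: function (id) {
    return id === 'header' ? { id: 'header' } : null;
  }
};

var direct = quickId('#header', fakeDoc);     // resolved by getElementById
var notDirect = quickId('#header a', fakeDoc); // undefined: more than one word
```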
$('#header a') on a modern browser
If the browser supports querySelectorAll then querySelectorAll will satisfy this request. Sizzle is not invoked.
$('.header a[href!="hello"]') on a modern browser
In this case jQuery will try to use querySelectorAll, but the result will be an exception (at least on Firefox). The browser throws because querySelectorAll does not support certain selection criteria, such as the [href!="hello"] attribute test, which is a jQuery extension rather than part of CSS. When the browser throws an exception, jQuery passes the request on to Sizzle. Sizzle not only supports CSS 3 selectors but goes above and beyond them.
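The try-then-fall-back behaviour described above can be sketched as follows. The function select and the fallbackEngine parameter are hypothetical names: fallbackEngine stands in for Sizzle, and context is anything that may or may not expose querySelectorAll.

```javascript
// Try the native Selectors API first; if it is missing, or throws on a
// selector it does not understand, hand the query to the fallback engine.
function select(selector, context, fallbackEngine) {
  if (typeof context.querySelectorAll === 'function') {
    try {
      return Array.prototype.slice.call(context.querySelectorAll(selector));
    } catch (e) {
      // For example a syntax error for a non-CSS selector; fall through.
    }
  }
  return fallbackEngine(selector, context);
}

// Stand-ins that exercise both paths without a browser.
var fallbackCalls = [];
var fakeEngine = function (selector) {
  fallbackCalls.push(selector);
  return ['from-fallback'];
};
var modernContext = {
  querySelectorAll: function (selector) {
    if (selector.indexOf('!=') !== -1) throw new Error('SYNTAX_ERR');
    return ['from-native'];
  }
};
var oldContext = {}; // no querySelectorAll, like IE6/7

var nativeResult = select('#header a', modernContext, fakeEngine);
var thrownResult = select('.header a[href!="hello"]', modernContext, fakeEngine);
var oldResult = select('.header a', oldContext, fakeEngine);
```

Only the second and third calls reach the fallback engine; the first is answered natively.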
$('.header a') on IE6/7
On IE6/7 querySelectorAll is not available, so jQuery passes this request on to Sizzle. Let’s look in a little more detail at how Sizzle handles this case.
Sizzle gets the selector string '.header a'. It splits the string into two parts and stores them in a variable called parts.
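As a rough illustration of that split, the sketch below breaks a selector on whitespace. It is a simplification under a hypothetical name, splitSelector: a real tokenizer has to avoid splitting on spaces inside attribute values or pseudo-selector arguments, which this version does not attempt.

```javascript
// Naive split: trim, then cut on runs of whitespace.
function splitSelector(selector) {
  return selector.replace(/^\s+|\s+$/g, '').split(/\s+/);
}

var parts = splitSelector('.header a');
console.log(parts);                   // [ '.header', 'a' ]
console.log(parts[parts.length - 1]); // a
```

The last entry is the rightmost part of the selector.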
The next step is the one that sets Sizzle apart from other selector engines. Instead of first looking for elements with class header and then going down, Sizzle starts with the rightmost part of the selector string. As per this presentation from Paul Irish, YUI3 and NWMatcher also go right to left.
So in this case Sizzle starts by looking for all a elements in the document. Sizzle invokes the method find. Inside find, Sizzle tries to work out what kind of pattern the string matches. In this case Sizzle is dealing with the string a.
Here is a snippet of code from Sizzle.find.
One by one, Sizzle goes through all the match definitions. In this case, since a is a valid tag, a match is found for TAG. Next, the following function is called.
Now result consists of all a elements.
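The matching step above can be sketched as follows. The patterns are deliberately simplified and the names ORDER, MATCH and classify are illustrative only; Sizzle's own expressions are more permissive and there are more of them.

```javascript
// Assumed, simplified match definitions, tried in a fixed order.
var ORDER = ['ID', 'CLASS', 'TAG'];
var MATCH = {
  ID: /^#([\w-]+)$/,
  CLASS: /^\.([\w-]+)$/,
  TAG: /^([\w-]+|\*)$/
};

// Returns the first definition that matches the part, or null.
function classify(part) {
  for (var i = 0; i < ORDER.length; i++) {
    var type = ORDER[i];
    var m = MATCH[type].exec(part);
    if (m) return { type: type, value: m[1] };
  }
  return null;
}

console.log(classify('a')); // { type: 'TAG', value: 'a' }
```

In a browser, a TAG match like this is the kind of lookup that getElementsByTagName can answer directly.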
The next task is to find out whether each of these elements has an ancestor matching .header. To test that, a call is made to the method dirCheck. In short, this is what the call looks like.
The dirCheck method returns whether each element of checkSet passed the test. After that, a call is made to the method preFilter. The key code in this method is below.
For our example this is what is being checked
This operation is repeated for all the elements in the checkSet. Elements not matching the criteria are rejected.
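Putting the last few steps together, the filtering can be sketched as below. This is not Sizzle's dirCheck or preFilter code: hasClass and hasAncestorWithClass are hypothetical helpers, and the nodes are plain objects with only className and parentNode. The class test pads both strings with spaces so that header does not match a class such as subheader.

```javascript
// Whole-word class test: " site header " contains " header ".
function hasClass(elem, name) {
  return (' ' + (elem.className || '') + ' ')
    .replace(/[\t\n\r]/g, ' ')
    .indexOf(' ' + name + ' ') !== -1;
}

// Walk up the parentNode chain looking for an ancestor with the class.
function hasAncestorWithClass(elem, name) {
  for (var node = elem.parentNode; node; node = node.parentNode) {
    if (hasClass(node, name)) return true;
  }
  return false;
}

// Stand-in nodes: two links under .header, one under .subheader.
var root = { className: '', parentNode: null };
var headerDiv = { className: 'site header', parentNode: root };
var subDiv = { className: 'subheader', parentNode: root };
var wrapper = { className: '', parentNode: headerDiv };
var linkInHeader = { className: '', parentNode: headerDiv };
var deepLink = { className: '', parentNode: wrapper };
var linkInSub = { className: '', parentNode: subDiv };

// checkSet plays the role of "all a elements found so far".
var checkSet = [linkInHeader, deepLink, linkInSub];
var kept = checkSet.filter(function (elem) {
  return hasAncestorWithClass(elem, 'header');
});
console.log(kept.length); // 2
```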
More methods in Sizzle
If you dig more into the Sizzle code, you will see functions defined as +, > and ~. You will also see the methods that implement pseudo-selectors such as :first, :even and :checked.
I use all these methods almost daily, and it was good to see how they are actually implemented.
Now that I have a little more understanding of how Sizzle works, I can better optimize my selector queries. Here are two selectors doing the same thing.
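To illustrate how a combinator can literally be a function named + or >, here is a small sketch of a lookup table keyed by the combinator character. The table name combinators, the matches callback and the node shape are assumptions made for this example, and the sketch assumes every sibling is an element (no text nodes in between).

```javascript
// Each function answers: starting from elem, does a node in the given
// direction satisfy matches(node)?
var combinators = {
  '>': function (elem, matches) { // direct parent
    return !!elem.parentNode && matches(elem.parentNode);
  },
  '+': function (elem, matches) { // immediately preceding sibling
    return !!elem.previousSibling && matches(elem.previousSibling);
  },
  '~': function (elem, matches) { // any preceding sibling
    for (var n = elem.previousSibling; n; n = n.previousSibling) {
      if (matches(n)) return true;
    }
    return false;
  }
};

// Stand-in for <ul><li class="first"/><li/><li/></ul>
var ul = { tag: 'ul', cls: '', parentNode: null, previousSibling: null };
var li1 = { tag: 'li', cls: 'first', parentNode: ul, previousSibling: null };
var li2 = { tag: 'li', cls: '', parentNode: ul, previousSibling: li1 };
var li3 = { tag: 'li', cls: '', parentNode: ul, previousSibling: li2 };

var isUl = function (n) { return n.tag === 'ul'; };
var isFirst = function (n) { return n.cls === 'first'; };

var childOfUl = combinators['>'](li3, isUl);       // true:  ul > li
var rightAfter = combinators['+'](li3, isFirst);   // false: li2 sits in between
var somewhereAfter = combinators['~'](li3, isFirst); // true: li.first ~ li
```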
Since Sizzle goes from right to left, in the first case Sizzle will pick up all the elements with the class employment and then try to filter that list. In the second case Sizzle will pick up only the p elements with the class employment and then filter the list. In the second case the rightmost selection criterion is more specific, so it will perform better.
So the rule with Sizzle is to be more specific on the right-hand side and less specific on the left-hand side. Here is another example.
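A quick way to see the difference is to count how many candidates survive the first, rightmost step. The sketch below uses a made-up list of elements and two hypothetical rightmost parts, .employment and p.employment. The function seed mimics only the initial right-hand lookup, not the filtering that follows.

```javascript
// Made-up page content: four elements share the class, only one is a p.
var elements = [
  { tag: 'p', className: 'employment' },
  { tag: 'span', className: 'employment' },
  { tag: 'li', className: 'employment' },
  { tag: 'div', className: 'employment' },
  { tag: 'p', className: 'education' }
];

// In this simplified sketch the rightmost part is ".cls" or "tag.cls".
function seed(rightmost, all) {
  var bits = rightmost.split('.');
  var tag = bits[0]; // '' when the part starts with '.'
  var cls = bits[1];
  return all.filter(function (e) {
    return (tag === '' || e.tag === tag) && e.className === cls;
  });
}

var broad = seed('.employment', elements);
var narrow = seed('p.employment', elements);
console.log(broad.length, narrow.length); // 4 1
```

Every seed element then has its ancestors checked against the left-hand part of the selector, so fewer seeds means less work.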
The second query will perform better because the right side query is more specific.