Over the years I've used all sorts of techniques for extracting content out of an HTML DOM. Regexes, XPath, and other things you probably don't want to know about...
Recently I've come to realise that CSS selectors are the canonical way of navigating the DOM in the specific case of HTML. What's nice is that a lot of tools support them, so using CSS selectors everywhere reduces the impedance mismatch between the different selection techniques you would otherwise have to juggle. You may also be able to share your selectors between designers, devs and testers.
Reducing the impedance mismatch
Some situations where I use CSS selectors
- CSS - OK well this one's pretty obvious
- Selenium tests - Traditionally most people either use the default element id selector or (for anything more complex) XPath selectors. XPath is great for XML but CSS selectors have a syntactic edge for the standard HTML attributes. Just compare the readability of div#top.selected a with //div[@id='top' and @class='selected']//a (and even that XPath only matches when the class attribute is exactly 'selected', with no other classes on the element).
- jQuery - In my opinion the best JavaScript framework out there. Element selection in jQuery is driven by CSS selectors.
- HAML - If you're lucky enough to be using HAML you'll notice how the structure of your template mimics the CSS selectors. This makes it much easier to see how
- Simple
- Terse
- Readable
- Universal
- Geared specifically for HTML
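To make the Selenium comparison above concrete, here is a minimal sketch of what it takes to express a simple CSS selector in XPath. The cssToXPath helper is hypothetical (it is not part of Selenium or jQuery) and only handles tag names, #id, .class and the descendant combinator. It is an illustration of the verbosity gap, not a complete translator.

```javascript
// Hypothetical helper: translate a small subset of CSS selectors
// (tag, #id, .class, descendant combinator) into equivalent XPath.
function cssToXPath(selector) {
  const steps = selector.trim().split(/\s+/).map((compound) => {
    // A leading tag name is optional; fall back to "any element"
    const tag = (compound.match(/^[a-zA-Z][\w-]*/) || ['*'])[0];
    const predicates = [];
    // #id becomes an exact attribute match
    for (const [, id] of compound.matchAll(/#([\w-]+)/g)) {
      predicates.push(`@id='${id}'`);
    }
    // .class must match one whitespace-separated token of @class
    for (const [, cls] of compound.matchAll(/\.([\w-]+)/g)) {
      predicates.push(
        `contains(concat(' ', normalize-space(@class), ' '), ' ${cls} ')`
      );
    }
    return tag + predicates.map((p) => `[${p}]`).join('');
  });
  // Whitespace in CSS means "any descendant", which is // in XPath
  return '//' + steps.join('//');
}

console.log(cssToXPath('div#top.selected a'));
```

The class test is the part that hurts: CSS treats class as a list of tokens, so a faithful XPath has to pad and search the attribute value rather than compare it directly.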
Tools for building and testing expressions
SelectorGadget looks like a great tool for figuring out more complex selectors if you're not as familiar with the syntax as you'd like to be. Why not try it out?
One other way I like to debug CSS selector expressions is in the Firebug console. Assuming that jQuery is loaded in the page, you can use the console to interact directly with the page:
Try this. Load the jQuery site, open your Firebug console and type:
$('a').css('border','1px solid red')
This should highlight all the links in the page. Replace 'a' with any CSS selector and you'll be able to see what that selector matches. (Don't forget to refresh the page to clear the borders before trying the next one, though!)
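If the page you are inspecting doesn't load jQuery, the same trick can be sketched with the browser's standard querySelectorAll. The highlight function and its undo helper are names made up for this example. The sketch assumes you paste it into the console of a reasonably modern browser.

```javascript
// Outline every element matching a CSS selector, and hand back a way
// to restore the original borders without refreshing the page.
// `highlight` and `undo` are illustrative names, not a library API.
function highlight(selector, root = document) {
  const matches = Array.from(root.querySelectorAll(selector));
  // Remember each element's inline border so it can be restored later
  const previous = matches.map((el) => el.style.border);
  matches.forEach((el) => {
    el.style.border = '1px solid red';
  });
  return {
    count: matches.length,
    undo: () =>
      matches.forEach((el, i) => {
        el.style.border = previous[i];
      }),
  };
}
```

In the console, something like var h = highlight('a') outlines the links, and h.undo() puts the borders back.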