Cycloctopus helps you visually explore the DOM of a page. It is mainly inspired by Kodos and Aardvark.
The name of the project comes from the titular character of the short story "Aw $#!*, It's Cycloctopus!" by Jeffrey Brown.
Kodos is a regular expression debugger. Aardvark is a Firefox extension that helps you inspect and manipulate web pages.
Project: Superior is the comics anthology that contains the Cycloctopus story. Many of the other stories in this collection are also quite refreshing.
Imagine a brilliant demo:
Unfortunately, the actual demo was given by:
Two types of queries are supported: CSS and XPath.
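To illustrate how the two query types relate, here is a minimal Python sketch that runs an XPath query with the standard library's ElementTree and translates a tiny subset of CSS (`tag`, `tag#id`, `tag.class`) into an equivalent XPath expression. The `css_to_xpath` helper and the sample page are hypothetical and are not taken from Cycloctopus. The helper also assumes that a `class` attribute holds exactly one class name.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page, well-formed so ElementTree can parse it.
PAGE = """<html><body>
<div id="main">
  <p class="intro">Hello</p>
  <p class="body">World</p>
</div>
</body></html>"""


def css_to_xpath(selector):
    """Translate 'tag', 'tag#id', or 'tag.class' into ElementTree XPath.

    Hypothetical helper: it handles one simple selector only and assumes
    the class attribute contains exactly one class name.
    """
    match = re.fullmatch(r"(\w+)(?:([#.])([\w-]+))?", selector)
    if match is None:
        raise ValueError("unsupported selector: %r" % selector)
    tag, kind, name = match.groups()
    if kind == "#":
        return ".//%s[@id='%s']" % (tag, name)
    if kind == ".":
        return ".//%s[@class='%s']" % (tag, name)
    return ".//%s" % tag


root = ET.fromstring(PAGE)
by_xpath = root.findall(".//p[@class='intro']")   # XPath query
by_css = root.findall(css_to_xpath("p.intro"))    # same query written as CSS
```

Both queries select the same single `p` element. Real CSS selectors are far richer than this subset, so treat the translation as a demonstration only.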
When you hover over a DOM element, you can see:
Text nodes cannot be selected, at least not visually. Instead, Cycloctopus opens a separate dialog that lists the contents of every text node the query returned.
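The idea of listing text contents rather than highlighting them can be sketched in Python. This is an assumption-laden stand-in, not the Cycloctopus implementation: ElementTree has no text-node objects, so the hypothetical `text_node_contents` helper uses `itertext()` to collect the text pieces inside each matched element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment used only for this illustration.
FRAGMENT = "<div><p>First <b>bold</b> tail</p><p>Second</p></div>"


def text_node_contents(elements):
    """Return the non-blank text pieces found inside the given elements."""
    contents = []
    for element in elements:
        for piece in element.itertext():
            piece = piece.strip()
            if piece:  # skip whitespace-only text
                contents.append(piece)
    return contents


root = ET.fromstring(FRAGMENT)
listing = text_node_contents(root.findall(".//p"))
```

The resulting list is the kind of content a dialog could display, one entry per piece of text.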
| Key | Action |
| --- | --- |
| w | Widen selection |
| n | Narrow selection |
| F5 | Execute query selection |
| Ctrl+L | Set focus to URL bar |
| Ctrl+K | Set focus to query textbox |
| Esc | Clear query selection |
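The table does not spell out what widening and narrowing do. Assuming they behave as in Aardvark, where widening moves to the parent element and narrowing steps back to the element you widened from, the logic could look like this Python sketch. The `Selection` class is hypothetical and uses ElementTree only as a stand-in for the browser DOM.

```python
import xml.etree.ElementTree as ET


class Selection:
    """Tracks the selected element and the path taken while widening."""

    def __init__(self, root, element):
        # ElementTree has no parent pointers, so build a child -> parent map.
        self._parents = {child: parent
                         for parent in root.iter()
                         for child in parent}
        self.current = element
        self._narrower = []  # elements we widened away from, innermost last

    def widen(self):
        """Select the parent element, if there is one."""
        parent = self._parents.get(self.current)
        if parent is not None:
            self._narrower.append(self.current)
            self.current = parent
        return self.current

    def narrow(self):
        """Step back to the element selected before the last widen."""
        if self._narrower:
            self.current = self._narrower.pop()
        return self.current


root = ET.fromstring("<div><p><b>x</b></p></div>")
selection = Selection(root, root.find(".//b"))
```

Calling `selection.widen()` twice moves from `b` to `p` to `div`, and `selection.narrow()` retraces those steps in reverse.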
XULRunner is an application framework based on Mozilla's Gecko layout engine. It comes with a lot of libraries for creating desktop applications. See the XULRunner Hall of Fame for a list of prominent projects that are implemented using XULRunner.
XUL (XML User interface Language) is an XML dialect for specifying GUI layout.
jQuery is a very lightweight JavaScript library for doing Ajax-y stuff. In Cycloctopus, I use jQuery for these things:
jQuery works great in XULRunner! This may be no coincidence, as the author of jQuery, John Resig, works at Mozilla.
XULRunner runs on any platform that Firefox can run on. In fact, Firefox 3 is essentially a XULRunner app (you can also run other XULRunner apps through Firefox).
XUL offers a lot of standard widgets (check out the reference). You can create your own widgets too.
One of the cool APIs you get is mozStorage, an interface to SQLite! There is also support for cryptography, networking, SVG, XSLT, HTML editing, extension management, ZIP file creation, etc.
You can connect to Java code using JavaXPCOM.
I found quite a bit of useful documentation on the XULPlanet website. However, XULPlanet's references are not always up to date, and this sort of reference documentation really should live on Mozilla Developer Center.
To be fair, JavaScript 1.8 (the version that ships with Gecko 1.9) is more enlightened than the JavaScript implementations in other browsers. You get array comprehensions (the equivalent of Python's list comprehensions), generators, and other ideas stolen directly from Python.
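For readers who have not met these features, here is what the Python originals look like. This is plain Python, shown only to illustrate the two ideas. The Mozilla JavaScript syntax differs and is not reproduced here.

```python
from itertools import islice


def squares(limit):
    """Generator: lazily yield the squares of 0 .. limit - 1."""
    for n in range(limit):
        yield n * n


# List comprehension: build a list from a loop and a filter in one expression.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# Generators produce values on demand, so we can stop after three of them.
first_three = list(islice(squares(10), 3))
```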
A debug build of Cycloctopus enables the JavaScript Console. There is also a way to embed the Venkman debugger, but I don't understand how I'm supposed to use it once it is embedded.
I don't much care for the directory structure of XULRunner. Instead of worrying about it, I keep all my code and configuration in a single directory and use a custom build script (written in Python) to create the proper directory structure.
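A build script of this kind might look roughly like the sketch below. It is not the actual Cycloctopus script. The file names in `LAYOUT` are hypothetical, and the target paths assume the usual XULRunner application layout (`application.ini` at the top level, plus `chrome/` and `defaults/preferences/`).

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical mapping: flat source file name -> location in the app tree.
LAYOUT = {
    "application.ini": "application.ini",
    "chrome.manifest": "chrome/chrome.manifest",
    "main.xul": "chrome/content/main.xul",
    "main.js": "chrome/content/main.js",
    "prefs.js": "defaults/preferences/prefs.js",
}


def build(source_dir, target_dir, layout=LAYOUT):
    """Copy each file named in `layout` from the flat source directory to
    its nested location under target_dir. Returns the created paths."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    created = []
    for name, relative in layout.items():
        destination = target_dir / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source_dir / name, destination)
        created.append(destination)
    return created


# Demonstration in a scratch directory with placeholder files.
with tempfile.TemporaryDirectory() as scratch:
    flat = Path(scratch, "src")
    flat.mkdir()
    for name in LAYOUT:
        (flat / name).write_text("placeholder for %s\n" % name)
    app = Path(scratch, "app")
    created = build(flat, app)
    relative_paths = sorted(p.relative_to(app).as_posix() for p in created)
    all_exist = all(p.is_file() for p in created)
```

Keeping the mapping in one dictionary means the flat source directory can be reorganized without touching the copy logic.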
The deployment story on Windows and Mac is quite different. See http://developer.mozilla.org/en/docs/XULRunner:Deploying_XULRunner_1.8.
Query selection is rather resource-intensive because every selection box consists of four absolute-positioned DIV elements. So if I select 100 elements, I have to create 400 additional DOM elements.
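The cost is easy to see in a small Python sketch. The `Rect` type, the `border_rects` helper, and the 2-pixel thickness are assumptions made for illustration. The sketch computes the four thin rectangles that outline one bounding box, then counts the overlays needed for 100 selected elements.

```python
from collections import namedtuple

Rect = namedtuple("Rect", "left top width height")


def border_rects(box, thickness=2):
    """Return the four thin rectangles (top, right, bottom, left) that
    together outline `box`. The thickness is an assumed value."""
    return [
        Rect(box.left, box.top, box.width, thickness),                           # top
        Rect(box.left + box.width - thickness, box.top, thickness, box.height),  # right
        Rect(box.left, box.top + box.height - thickness, box.width, thickness),  # bottom
        Rect(box.left, box.top, thickness, box.height),                          # left
    ]


def overlay_count(selected_boxes):
    """Total number of overlay rectangles needed for a selection."""
    return sum(len(border_rects(box)) for box in selected_boxes)


# 100 hypothetical elements stacked vertically, each 100 x 20 pixels.
boxes = [Rect(0, 20 * i, 100, 20) for i in range(100)]
total = overlay_count(boxes)
```

Here `total` comes to 400, matching the figure above: the number of extra elements grows linearly with the size of the selection.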
DOM nodes in the browser are often created or manipulated through JavaScript. Those nodes do not exist in the raw HTML you get when you download the page with urllib, because no scripts are run. So the XPath expressions you use in Cycloctopus won't always work when you switch to lxml.
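A minimal Python sketch of the mismatch, using a hypothetical page and the standard library's ElementTree as a stand-in for urllib plus lxml (no network access is involved): the static markup contains an empty list that a script would populate in a browser, so a query for the list items finds nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for what a plain download would return.
# In a browser, the script would add items to the list after loading.
STATIC_PAGE = """<html><body>
<ul id="results"></ul>
<script>/* in a browser, this would append li items to #results */</script>
</body></html>"""

root = ET.fromstring(STATIC_PAGE)

# The container exists in the static markup...
container = root.find(".//ul[@id='results']")

# ...but the script-generated items do not, so this query matches nothing.
items = root.findall(".//ul[@id='results']/li")
```

An XPath expression that depends on script-generated nodes has to be rewritten against the static markup before it can be reused outside the browser.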
One of the weirder bugs is that Cycloctopus crashes when you try to visit MSN.com. Fortunately, I do not have a strong desire to scrape from MSN, but it's still annoying. I think it's a bug in XULRunner, but I'm not sure.
In addition to displaying errors for failed queries, I should perhaps also display a "Details" link that you can click to get a detailed explanation of why a query failed.
PyXPCOM doesn't seem to be a good choice, because I can't find any binaries anywhere, and also because Mozilla is supposedly moving away from XPCOM.
One possibility for the lxml integration would be to use PyQt 4.4 instead of XULRunner (version 4.4 will include a browser widget based on WebKit). I don't think I will go this route, but if XULRunner becomes too cumbersome to integrate with, I'll consider it.
Some features I might like to add: