A website appears before you! Adventures of a clicky thing

I’d had this idea in my head for a while to create an automated script that would take any web application and click on every link, looking for big obvious errors. So when I found myself with a bit of time on my hands, and a task to test every page in the application for mixed content warnings, I seized my chance. Thus, Clicky Thing was born.

The primary purpose of this utility is to find generic website errors quickly. I set it up to recognize and log the following types of potential errors:
– Modal dialogs of any kind.
– ASP.Net server errors.
– HTTP response codes other than 200.

In addition, the website I was testing was intended to be served entirely over SSL, so I added some detection to ensure that it was doing this too.
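
To make those checks concrete, here is a minimal sketch in Python (not the C#/WebAii code Clicky Thing actually uses). The function name, its parameters and the marker string are assumptions for illustration: it assumes the caller already has the page URL, the HTTP status code, the page text and the text of any modal dialog that popped up.

```python
from urllib.parse import urlparse

# Assumed marker: the heading that ASP.NET's default error page usually starts with.
ASPNET_ERROR_MARKER = "Server Error in"


def check_page(url, status_code, body, dialog_text=None):
    """Return a list of human-readable findings for one loaded page."""
    findings = []
    if dialog_text is not None:
        # Any modal dialog is treated as a potential error.
        findings.append(f"modal dialog: {dialog_text}")
    if ASPNET_ERROR_MARKER in body:
        findings.append("ASP.NET server error")
    if status_code != 200:
        findings.append(f"HTTP {status_code}")
    if urlparse(url).scheme != "https":
        # The site under test was meant to be served entirely over SSL.
        findings.append("not served over SSL")
    return findings
```

An empty list means the page passed every check; anything else would go into the error log.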

Design
I decided to use WebAii, a GUI automation tool made by Telerik, together with C#. I chose them because I’m familiar with both and we already have a good framework and environment set up for them here at Campaign Monitor.

Basically it starts with an empty stack. When it gets to the first page in the application, it harvests all the links from that page and pushes them onto the stack. Then it pops the next link off the stack and clicks it. When the subsequent page loads, it checks it for errors. Then it harvests all the links from that page, pushes them onto the stack, and so on.

If it pops a link that was on a page that it had already visited, it navigates back to that page in order to click the link.
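
The loop described above can be sketched roughly as follows. This is a simplified Python model rather than the real WebAii code: the "site" is a hypothetical in-memory dictionary mapping each page URL to its (link text, target URL) pairs, clicking is simulated by moving to the target, and the order in which links come off the stack is an assumption. It also skips pages that were already harvested, as described under Constraints, so that the loop terminates.

```python
def crawl(site, start):
    """Walk an in-memory site: {page_url: [(link_text, target_url), ...]}."""
    stack = []       # links waiting to be clicked: (source_page, text, target)
    visited = set()  # pages whose links have already been harvested
    log = []         # ordered record of what the crawler did
    current = start
    clicks = 0

    def harvest(page):
        if page in visited:
            return
        visited.add(page)
        for text, target in site.get(page, []):
            stack.append((page, text, target))

    harvest(current)
    while stack:
        source, text, target = stack.pop()
        if current != source:
            # The link lives on an earlier page, so navigate back there first.
            log.append(f"go to {source}")
            current = source
        log.append(f"click {text}")
        clicks += 1
        current = target  # simulated click: the browser lands on the target page
        harvest(current)
    return clicks, log
```

For a three-page site where the home page links to pages A and B, and A links back home, `crawl` clicks B first, goes back to the home page to click A, then clicks A's link home and stops because nothing new is left to harvest.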

Part of the appeal of this design is that it doesn’t depend on any components specific to the product-under-test. It dynamically harvests links and clicks them. So it’s robust in the sense that if the product changes, Clicky Thing shouldn’t break. And it should be reusable for most other web applications.

Clicky Thing was designed so that I could examine the error log when it was done, and I could also watch its progress in case I noticed any errors that Clicky Thing had missed. As it clicks quite fast, using a video screen capture tool was a useful part of this process. In addition, I made Clicky Thing output its progress in real-time to a console so that I could see what it was trying to do. WebAii has a feature that highlights the control in use at any given time, so I was able to make good use of that to see which links were being clicked.

Example
Here’s an example of Clicky Thing in action. I pointed it towards Wikipedia and let it run for about a minute. This is best viewed in fullscreen in 720p quality.

Constraints
I don’t want Clicky Thing clicking every page on the internet, so I restricted it to a specified domain. Every time it visits a new page, it checks the URL. If the specified domain is not a part of that URL, it won’t harvest the links from that page.

I also don’t want Clicky Thing to harvest links from the same page twice. So it keeps a list of all of the pages it has already visited and checks against this list before harvesting new links.
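
Both constraints boil down to one small guard before harvesting. Here is a hedged Python sketch of it; the function name is made up for illustration, and it uses a set where the description above says "list", which behaves the same for this purpose.

```python
def should_harvest(url, domain, visited):
    """Decide whether to harvest links from `url`, recording the visit if so."""
    if domain not in url:
        # Off-site page: leave its links alone.
        return False
    if url in visited:
        # Links from this page were already harvested.
        return False
    visited.add(url)
    return True
```

Note that a plain "domain is part of the URL" check also matches URLs that merely mention the domain, for example in a query string, so comparing against the parsed hostname would be a stricter variant.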

Sometimes Clicky Thing clicks on a link that opens a new browser window. Having too many of these browser windows open can hurt performance (it was running in Internet Explorer 8), and sometimes these browser windows were set to a specific size. So I made it shut down all browser windows and open a new one if it detected that more than one window was open.

Planned improvements
Sometimes Clicky Thing will click something which will reveal more links on the page without changing the URL. The problem arises when Clicky Thing navigates back to that page looking for the link, but Clicky Thing is too naive to know that it has to click something else to get the page back into the state where the link was visible. So it can’t click the link. I couldn’t think of a quick solution to this problem, so I dealt with it by discarding these links for this version of Clicky Thing. It’s an improvement to think about for the next version.

If Clicky Thing fails for any reason, it has to start all over again from the initial page. I experimented with a “save game” feature, which would automatically save the contents of the stack and the visited page list to files so that it could pick up where it left off. However, this was not finished in time, so it’s another item on my feature list.
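
One way a "save game" like this could look, sketched in Python under a few assumptions: the stack holds (source page, link text, target) tuples, the visited pages are plain URL strings, and everything goes into a single JSON file rather than separate files. The function names and file layout are hypothetical, not what the unfinished feature actually did.

```python
import json


def save_state(path, stack, visited):
    """Write the pending links and the visited pages to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"stack": stack, "visited": sorted(visited)}, f)


def load_state(path):
    """Read back (stack, visited); start fresh if there is no save file."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except FileNotFoundError:
        return [], set()
    # JSON has no tuples or sets, so rebuild them on the way in.
    return [tuple(link) for link in data["stack"]], set(data["visited"])
```

Calling `save_state` after every click would keep the file current, so a crashed run could call `load_state` and carry on from the last saved point.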

As Clicky Thing creates new data, it can create new pages in the application. Sometimes Clicky Thing will spend a really long time surfing around very similar pages, simply because of all of the data that has been created. I have yet to ponder this enough to think of a good solution.

Sometimes Clicky Thing seems to finish without having clicked much at all. I’m still looking into this issue, but in the meantime I have added some scoring as a little motivator. It gets one point for every link it clicks. If it finishes, by running out of links to click, a score is displayed in the output.

I plan to write the next version using Test-Driven Development practices under the guidance of my colleague James. I have never done TDD before and I don’t know too much about it, so this will be a TDD learning exercise for me. It will be interesting to see how the design changes as a result of this.

Testing
I wrote an automated test that used a simple series of static web pages as the input for Clicky Thing, and compared Clicky Thing’s output file to a baseline. This simple regression test proved extremely useful to me when I was making improvements to Clicky Thing, as it caught a few issues that I may otherwise have taken a longer time to notice and isolate.
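
The comparison step of a baseline test like this can be very small. Below is a sketch in Python using the standard `difflib` module; the function name is an assumption, and the real test was presumably written in C# within the existing framework.

```python
import difflib


def diff_against_baseline(actual_text, baseline_text):
    """Return unified-diff lines; an empty list means the output matches."""
    return list(
        difflib.unified_diff(
            baseline_text.splitlines(),
            actual_text.splitlines(),
            fromfile="baseline",
            tofile="actual",
            lineterm="",
        )
    )
```

The test passes when the returned list is empty. When it fails, the diff lines show exactly where the new output departs from the baseline, which helps isolate a regression quickly.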

This is the output generated from my automated test that tests Clicky Thing. At this point you may notice that I had decided to make all of Clicky Thing’s progress output in the style of an old-school text-based role-playing game. Clicky Thing is on an adventure.

You stand at a browser window. The remains of previous tests are littered beneath your feet. The browser window looks cold and empty.
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page1.html
> Harvest Links
5 links harvested. You have 5 links in your inventory.
> Use click with Dialog
A wild dialog appears! It says “Message from webpage”
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page1.html#
A wild server error appears!
This page is fraught with danger! You cast a lvl 10 Secure Layer of Sockets. The spell failed.
> Harvest Links
These links have already been harvested. You have 4 links in your inventory.
> Use click with Server error
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/servererror.html
A wild server error appears!
This page is fraught with danger! You cast a lvl 10 Secure Layer of Sockets. The spell failed.
> Harvest Links
> There are no links to be harvested. You have 3 links in your inventory.
0 links harvested. You have 3 links in your inventory.
> Go to file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page1.html
> Use click with 404 error
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/404.html
A wild HTTP error appears!
This page is fraught with danger! You cast a lvl 10 Secure Layer of Sockets. The spell failed.
> Harvest Links
> There are no links to be harvested. You have 2 links in your inventory.
0 links harvested. You have 2 links in your inventory.
> Go to file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page1.html
> Use click with Page 3
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page3.html
This page is fraught with danger! You cast a lvl 10 Secure Layer of Sockets. The spell failed.
> Harvest Links
> There are no links to be harvested. You have 1 links in your inventory.
0 links harvested. You have 1 links in your inventory.
> Go to file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page1.html
> Use click with Page 2
A web page appears before you! The web page is file:///C:/Projects/CampaignMonitor/app.tests/Tests.AdHoc/ClickyThing/Tests/page2.html
This page is fraught with danger! You cast a lvl 10 Secure Layer of Sockets. The spell failed.
> Harvest Links
> There are no links to be harvested. You have 0 links in your inventory.
0 links harvested. You have 0 links in your inventory.
You beat Campaign Monitor with a score of 5 points!

Outcomes
Clicky Thing served its purpose – it didn’t take very long to write, it was quick to run and it found some application bugs that may have otherwise taken tedious hours of clicking to find. I’ll be able to reuse it for future projects without much modification. So I believe it was a worthwhile exercise.

It wasn’t always reliable, and I couldn’t guarantee that it made its way through every link in the product, so there is definitely room for improvement. I look forward to improving it while learning TDD.

11 thoughts on “A website appears before you! Adventures of a clicky thing”

  1. The Director

    A couple of potential limitations:

    * Handling Content Management Systems that use parameters and build their URLs on the fly. For example, Joomla or other custom systems I’ve seen on sites where the URL contains a “virtual” path based on how the user clicked to the current page.
    * Links that use some sort of JavaScript to pop new windows.

    A couple of different types of application/Web site where it might not work.

  2. Trish Khoo Post author

    Hey Director,

    * Handling Content Management Systems that use parameters and build their URLs on the fly. For example, Joomla or other custom systems I’ve seen on sites where the URL contains a “virtual” path based on how the user clicked to the current page.

    Yeah, this can be an issue if it navigates back to the same URL but the content’s no longer there. Any case where it needs to do more than just paste the URL into the navigation bar to get back to the page doesn’t hold up very well with this model.

    * Links that use some sort of JavaScript to pop new windows.

    Our current automation framework detects new browser windows and treats the latest browser window as the current window. So new window pop ups haven’t been much of a problem. I just put in some code to make sure extra windows are killed so they don’t multiply out of control.

    Really good points, thanks for the feedback!

  3. Tobias Geyer

    That’s a nice tool you built there!

    I’ve been wanting to do the same ever since I started with test automation.
    Unfortunately the pages I was working with were highly dynamic and there were different paths through the application depending on the users input in input fields or select boxes.
    I couldn’t find it in your blog post – does ClickyThing handle these kinds of situations as well?
    Simple Example: imagine a page with a search. From what I can tell Clicky Thing will not be able to check the search result page since it doesn’t fill in search terms, right?

    Trish:
    Thanks! Yeah, Clicky Thing doesn’t handle that kind of thing. It’s a big limitation but I couldn’t think of a simple solution for it. For instance, even if I got it to fill in forms with random data and click submit buttons, it still wouldn’t work when the forms need specifically formatted data like email addresses and numbers. It would have to be tailored to the product under test and I didn’t want to add that level of complexity to the tool.

  4. Ivor

    Hi Trish, Like the idea, would love to try out the tool. Can I download it from anywhere? Thanks, Ivor

    Trish:
    Sorry Ivor, I don’t have any plans to make it available for download. But you could try building your own, using the same design.

  5. The Director

    I realize my comments were out of the scope and the framework of the application you designed, but I wanted to show my requirements-testing fu.

    To clarify, as far as the CMS systems, the problem with the dynamic URLs is not the going back; it’s that the same content appears with different URLs, which means that the same sets of links will appear with different source URLs. That is, when your application harvests links, it could essentially harvest the same sets of links from the same content pages, but the CMS has given each appearance of that same content a new URL. I’ve run even basic link checkers on sites that go on for days because the same content appears with different URLs.

    The second with the JavaScript is ferreting out URLs in links that use JavaScript to launch them, such as javascript:launchCorporate(‘http://www.kohlscorporation.com/customer_service/ProductInfo/ProductRecalls.html’) on Kohls.com. I’ve seen a lot of links like that, especially in footers, that rely on a custom JavaScript function to open a page. It might be tricky to parse those, especially if they don’t include a full URL in the JavaScript function call.

    Another thing to add would be a single bad-on-purpose URL to trigger the 404 error page and see how its links work. You might be surprised to find how often these static HTML pages don’t keep up with the rest of the site.

    Trish:
    Cool, I see what you mean now. With regards to the dynamic URLs, initially I did run into a bit of a problem with this. It wasn’t a major drawback at the time, but it is something I’d like to address in future versions. I guess I would have to build some smarts into the way it records pages that it’s already visited. However, that would mean tailoring it to the application under test somewhat, which I have been trying to avoid as much as possible. If I can make it so that it can refer to a configuration file of some kind, that would be preferable. Anyway, excellent foresight – this is something I ran into.

    With regards to the Javascript-launched popups, Clicky Thing is getting the URLs from the address bar once it’s visited a page already – not from the page controls. I’ve tested it on pages that have Javascript like your example and it worked out fine. I guess parsing issues with the page controls have also been avoided by using WebAii’s Element objects, rather than finding elements by ID or similar. It just parses the DOM to find anything with an “A” tag and stores it as an Element object, which I think just finds the page element by its element type and index (based on how many objects of that type are on the page).

    Good point with the 404, that would be an easy way to increase its coverage. At the moment it has some pretty significant coverage limitations, so it’s definitely not at a stage where I would rely on it for full application test coverage.

    Your testing-fu is strong.

  6. Justin Hunter

    Very cool, Trish. I like it.

    I was at dinner a couple weeks ago with Adam Goucher and Dawn Canaan. Adam mentioned that Harry Robinson experimented with some similar strategies a while back. Googling to learn a bit more about Harry’s exploits (and/or emailing Harry or Adam) might turn up some interesting lessons learned from Harry’s earlier experiences with this sort of thing.

    The 1 minute demo was effective at showing what happens when you put ClickyThing through its paces.

    User Experience thought: I didn’t notice color-coded error messages. If you haven’t done so yet, you might want to think about something to make the error messages more noticeable. Perhaps putting them in red text? For that matter, if someone ran ClickyThing for an extended period over their lunch hour, came back to their desk and wanted a way to see just the errors, a “Just show me the errors” feature would be nice.

    I like the name “ClickyThing” a lot.

    Lastly, we’ve open sourced the code we use to describe recent updates here: https://github.com/belucid/Recent-Updates#readme

    An example of it in use is:
    https://app.hexawise.com/recent_updates

    If you anticipate making improvements, bug fixes, and adding new features for the foreseeable future and want an easy way to keep people up to date with changes, feel free to use “Recent Updates.”

    – Justin

    Trish:
    Thanks Justin. I read about Harry Robinson’s model-based testing and some automated exploratory testing approaches a few months ago. It sounded really interesting, but much more complicated than what I am doing here.

    Good point about the colour-coded error messages, that would be nice to see. I should point out that it produces a separate error log that’s a lot easier to read than the adventure-style output. The output was actually just an afterthought, which started out as a debugging aid and evolved into a helpful narrative.

  7. Perze

    can i play with the clicky thing? i somewhat need it to click around my qa environment to generate needed traffic … and to take over the world with my friend pinky.

    Trish:
    Sorry Perze, I don’t have plans to make this tool available right now. But I’d highly recommend making your own anyway, then you can tailor it to do whatever you want. It only took me a couple of days to put this one together, so it’s not that much of a time investment.

  8. Pingback: QA Hates You » Blog Archive » Things I Learned From Forbes (I)

  9. Paul Berry

    If it’s just link following/exploring you want, Xenu does all this already, and very rapidly. The interesting exercise for the user is to write something that harnesses its power. Put one url into Xenu, stir, and you get an output file which can then be mined for all sorts of interesting metadata about the site under test.

    http://home.snafu.de/tilman/xenulink.html

  10. Pingback: A Smattering of Selenium #45 « Official Selenium Blog

  11. Pingback: Show #5 – Pass cakes and fail cakes | Testcast is a software testing podcast with Bruce Mcleod and Trish Khoo
