Website testing - creating a Restfile from sitemap.xml

I have been toying with the idea of self-hosting my website for a while now. Probably the biggest deterrent is the risk of losing the little search traffic I’m getting, because of changed URLs or broken links between pages.

Of course, you can set up redirects if your page naming scheme changes, but it’s finicky to find all the links and then ensure they’re still working.

What I’d really like, to ease my mind before embarking on this operation, is a list of all the links on my website and a script to run through and test them.

As the author of Rester, a tool to test web APIs and URLs, I’ve actually got a pretty good idea how I would deal with the second part of the problem, the testing. Rester can run a YAML snippet like the following to test a link:

requests:
  path blog:
    url: https://finestructure.co/blog
    validation:
      status: 200

Under requests: you can list any number of requests to make, and the validation: section within each asserts the expected status code, 200 in this case. (Rester allows much deeper content validation than that, but checking the status code is all we need here.)

Running this YAML snippet with Rester will give you the following output:

$ rester blog.yml
🚀  Resting blog.yml ...

🎬  path blog started ...

✅  path blog PASSED (1.298s)

Executed 1 test, with 0 failures

As mentioned above, there can be a whole list of requests under the requests key and if I simply listed all my site’s URLs I’d have a simple, automated way to ensure everything is working as expected after a transition.
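As a rough illustration of what "simply listing all URLs" amounts to, here is a minimal Python sketch that turns a list of URLs into Restfile text with one status check per URL. The function name restfile_from_urls is hypothetical, and naming each request after its URL path is an assumption modelled on the "path blog" example above. This is not part of Rester itself.

```python
from urllib.parse import urlparse


def restfile_from_urls(urls):
    """Return Restfile text with one status-200 check per URL.

    Hypothetical helper for illustration; not part of Rester.
    """
    lines = ["requests:"]
    for url in urls:
        # Name each request after its URL path, e.g. "path /blog".
        # An empty path (bare domain) is shown as "/".
        path = urlparse(url).path or "/"
        lines += [
            f"  path {path}:",
            f"    url: {url}",
            "    validation:",
            "      status: 200",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(restfile_from_urls([
        "https://finestructure.co/about",
        "https://finestructure.co/blog",
    ]), end="")
```

Note that two URLs with the same path would produce duplicate request names, so a real generator would need to handle that case.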

This is when I realised that my website actually exports a list of all URLs – via the site map at https://finestructure.co/sitemap.xml, and that’s probably true for many if not all hosted providers.

Now, given that the sitemap XML is a documented format that is straightforward to parse, it is easy to create a Restfile listing all URLs of a website for testing.
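To give a sense of how little parsing is involved, here is a minimal Python sketch that extracts the URLs from a sitemap document using only the standard library. The function name sitemap_urls and the two-entry SAMPLE document are made up for illustration (SAMPLE is not the actual content of the sitemap mentioned above), and the sketch assumes a plain urlset sitemap rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap document.

    Hypothetical helper for illustration; assumes a <urlset> sitemap.
    """
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


# Made-up sample document, not the real sitemap of any site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://finestructure.co/about</loc></url>
  <url><loc>https://finestructure.co/blog</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```

Each extracted URL then becomes one entry under the requests key of the Restfile.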

You can find a Swift script doing just that on GitHub: rester-sitemap.swift. If you run this script with the URL of a sitemap.xml file, it will print a Restfile to standard output (which I redirect to a file fs.yml here):

curl -O https://raw.githubusercontent.com/finestructure/Rester-sitemap/master/rester-sitemap.swift
swift rester-sitemap.swift https://finestructure.co/sitemap.xml > fs.yml

This file can then be run with rester to ensure all your endpoints are working as expected:

$ rester fs.yml
🚀  Resting fs.yml ...

🎬  path /about started ...

✅  path /about PASSED (0.995s)

🎬  path /all-posts started ...

✅  path /all-posts PASSED (0.854s)

🎬  path /blog started ...

✅  path /blog PASSED (0.920s)

🎬  path /blog/2008/11/07/apple-digital-media-library started ...

✅  path /blog/2008/11/07/apple-digital-media-library PASSED (0.851s)

🎬  path /blog/2008/8/31/iphone-sms-to-mail-synchronization started ...

✅  path /blog/2008/8/31/iphone-sms-to-mail-synchronization PASSED (0.939s)
...

And there you have it: an automated way to check that all your links are working as expected.

Did you find this post useful? Do you have any questions or comments? Please get in touch and let me know via Twitter or email.