SEOmoz launched a new labs tool last night called Custom Crawl. While I'm waiting for the data from my first crawl to arrive (and, after that, working out what to do with it and how), I thought I'd share the roundup here.
http://www.seomoz.org/labs/cc
This tool sends out a crawler (identified as rogerBot) to the given domain and crawls each link it finds, starting at the given URL. The results are then sent to the e-mail address associated with your SEOmoz account. These results include:
URL
- The URL of the crawled page.
Crawl Time
- The time (GMT) our crawler crawled the given page.
HTTP Status Code
- The HTTP status code returned for the given URL
(http://www.seomoz.org/knowledge/url)
Title
- The title element of the given page
(http://www.seomoz.org/knowledge/title-tag)
Meta Description Tag
- The Meta Description
(http://www.seomoz.org/knowledge/meta-description) of the given page.
Outgoing link count
- The total number of links on the page
URLs with duplicate titles (up to 5)
- The URLs of pages that have identical title elements
(http://www.seomoz.org/knowledge/title-tag)
URLs with possible duplicate content (up to 5)
- The URLs of pages that have similar content
(http://www.seomoz.org/knowledge/duplicate-content)
X-Robots-Tag Header
- Whether an X-Robots-Tag is present in the HTTP header
(http://seogadget.co.uk/using-the-x-robots-tag-in-server-headers-on-wordpress/)
Meta Robots Tag
- The value of the meta robots tag
(http://www.seomoz.org/knowledge/robotstxt)
Content-Type Header
- The type of content as returned by the given HTTP header
301/302 Target
- The target page (if applicable) of a 301 or 302 redirect
Meta Refresh Target
- The target page (if applicable) of a meta refresh
(http://www.seomoz.org/knowledge/redirection)
Rel Canonical Target
- The value of the rel-canonical element.
(http://www.seomoz.org/knowledge/canonicalization)
As soon as I've played with the tool (I'm off to Norway in about 20 minutes!) I'll post about it on SEOgadget. At first glance, though, this looks like an important step in the right direction, and it will be very interesting to see how the tool develops as it graduates from labs.
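For anyone wondering where on-page fields like the ones listed above come from, here is a rough Python sketch that pulls a handful of them (title, meta description, meta robots, rel canonical, meta refresh target and outgoing link count) out of a page's HTML, and groups URLs that share a title. It is only an illustration using the standard library, not how rogerBot actually works; the names PageFields, extract_fields and duplicate_titles are made up for this example, and counting every anchor with an href as an "outgoing link" is my own assumption.

```python
from collections import defaultdict
from html.parser import HTMLParser


class PageFields(HTMLParser):
    """Hypothetical helper: collects a few on-page fields while parsing."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.meta_robots = None
        self.rel_canonical = None
        self.meta_refresh_target = None
        self.outgoing_links = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            # Assumption: every <a> with an href counts as one link.
            self.outgoing_links += 1
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.rel_canonical = attrs.get("href")
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.meta_description = content
            elif name == "robots":
                self.meta_robots = content
            elif (attrs.get("http-equiv") or "").lower() == "refresh":
                # content usually looks like "5; url=http://example.com/new"
                _, _, target = content.partition("=")
                self.meta_refresh_target = target.strip() or None

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_fields(html):
    """Return a dict of the fields found in one page's HTML."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "meta_robots": parser.meta_robots,
        "rel_canonical": parser.rel_canonical,
        "meta_refresh_target": parser.meta_refresh_target,
        "outgoing_link_count": parser.outgoing_links,
    }


def duplicate_titles(titles_by_url):
    """Group URLs whose (whitespace-trimmed) titles are identical."""
    urls_by_title = defaultdict(list)
    for url, title in titles_by_url.items():
        urls_by_title[title.strip()].append(url)
    return {t: urls for t, urls in urls_by_title.items() if len(urls) > 1}
```

Feeding each crawled page's HTML to extract_fields and then passing a URL-to-title mapping to duplicate_titles gives something loosely comparable to the "URLs with duplicate titles" column, though the real report caps that list at 5 and this sketch does not.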