Loopthing is now a big site: over 150,000 pages are waiting to be indexed. To get them indexed as fast as possible, a sitemap of sitemaps (a sitemap index) is now referenced from the service’s robots.txt file.
I’m going to gather data on how Google and Yahoo! manage to process all these feeds, and maybe compile a technical report early next year.
The whole process is also monitored thanks to Google’s Webmaster Tools service and Yahoo’s Site Explorer.
Indexing a whole site by jumping from link to link is very difficult, even for today’s high-performance spiders. The sitemaps.org website was put together with help from the major search engine players (Google, Microsoft and Yahoo!) to address this issue.
The end result is that webmasters can tell search engines where their pages are located, so indexing becomes a much faster process.
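To make the "sitemap of sitemaps" idea concrete, here is a minimal Python sketch of how such a setup could be generated. It is not the code Loopthing actually runs: the example.com URLs, file names and function names are all made up for illustration. The only things taken from the sitemaps.org protocol are the XML namespace, the limit of 50,000 URLs per sitemap file, and the `Sitemap:` line in robots.txt.

```python
import xml.etree.ElementTree as ET

# Namespace and per-file URL limit defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split the URL list into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Return the XML text of a single <urlset> sitemap."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return the XML text of a <sitemapindex> listing the sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def robots_txt_line(index_url):
    """Return the robots.txt directive that points crawlers at the index."""
    return f"Sitemap: {index_url}"


if __name__ == "__main__":
    # Hypothetical site: 150,000 page URLs on example.com.
    pages = [f"http://example.com/page/{i}" for i in range(150_000)]
    parts = chunk(pages)
    sitemaps = [build_sitemap(part) for part in parts]  # one XML file each
    index = build_sitemap_index(
        [f"http://example.com/sitemap-{n}.xml" for n in range(len(parts))]
    )
    print(len(sitemaps), "sitemap files")
    print(robots_txt_line("http://example.com/sitemap-index.xml"))
```

With 150,000 pages and the 50,000-URL limit, the pages end up spread over three sitemap files, and the index is the single entry point the search engines need to know about.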
Some reading on sitemaps:
- sitemaps.org – the sitemaps standards home page (ironically, it doesn’t have a sitemap)
- sitemaps on Wikipedia (information gathered from everywhere, condensed into a single page)


