UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
04 July 2008, 16:45:41 EDT  
Google sitemaps

By Tim Finin on Friday, June 3rd, 2005 at 7:07 pm.

Google has published a sitemap protocol that lets site owners tell crawlers which URLs on their site are available for crawling. Since the URLs can include parameters, a site can use it to expose all or part of its “hidden web”, the pages that can be reached only through a form or query rather than by following links.

“A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.

Sitemaps are particularly beneficial when users can not reach all areas of a Web site through a browseable interface — i.e. users are unable to reach certain pages or regions of a site by following links. For example, any site where certain pages are only accessible via a search form would benefit from creating a Sitemap and submitting it to search engines.

Please note that the Sitemap Protocol supplements, but does not replace, the crawl-based mechanisms that search engines already use to discover URLs. By submitting a Sitemap (or Sitemaps) to a search engine, you will help that engine’s crawlers to do a better job of crawling your site.”

For each URL you can also give optional attributes: how often the URL changes, when it was last modified, and its priority relative to other URLs on the same site.
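To make the format concrete, here is a minimal sketch in Python that writes a small sitemap with those attributes. The example.org URLs, the dates, and the `build_sitemap` helper are all made up for illustration, and the namespace is the one used by the later sitemaps.org version of the protocol, so treat it as an assumption rather than what Google's original schema required.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace from the later sitemaps.org 0.9 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Optional per-URL elements, in the order they are written out.
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(entries):
    """Return a sitemap XML string for a list of dicts describing URLs.

    Each dict needs a "loc" key; the OPTIONAL_TAGS keys are written only
    when present.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in OPTIONAL_TAGS:
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entries; the second is a parameterized "hidden web" URL.
entries = [
    {
        "loc": "http://example.org/",
        "lastmod": "2005-06-03",
        "changefreq": "daily",
        "priority": 0.8,
    },
    {
        "loc": "http://example.org/search?q=rdf&page=2",
        "changefreq": "weekly",
    },
]

sitemap_xml = build_sitemap(entries)
print(sitemap_xml)
```

Note that the `&` in the parameterized URL is escaped as `&amp;` when the XML is serialized, which is what a well-formed sitemap needs.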

Li Ding defined a similar scheme for RDF documents some months ago as part of his work on Swoogle.


One Response to “Google sitemaps”

  1. tim finin Says:

    Filip added a google sitemap page for the EBBlog.
