A sitemap is a special file you create to describe the structure of your website. It helps Google know more about your pages and how each page should be treated during indexing. Essentially, a sitemap is an XML file that lists all the pages you would like Google to index, along with optional parameters specifying how important each page is and how often it changes.
Advantages of using sitemaps
- Website structure
This is probably the biggest advantage of using sitemaps. A sitemap file lets you provide a full list of the pages on your website, and the Google crawler will go through that list to index them. For you, this is a good chance to make up for a less than perfect website design. For instance, if you have orphaned pages with valuable content (which you shouldn't, it's a really bad practice), they are unlikely to be indexed because no other pages link to them. Listing such orphaned pages in your sitemap allows Google to index them anyway.
- Specify how important each page is
Instead of letting Google decide automatically which pages are most important, you can suggest your own opinion; it will not make the decision, but Google's indexing mechanism will take it into consideration.
You can give any page of your website a rating from 0.0 to 1.0, where a bigger number means a more important page.
- Speed up the discovery of your pages
This is another very important advantage of using a sitemap for your website. Google scans sitemaps on a regular basis, so adding your new pages to your sitemap helps Google discover them more quickly. You don't have to wait for one of Google's crawlers to come back for a full scan of your website, which doesn't happen that often. Instead, you point the crawlers to exactly the pages you've added, suggesting that they should be scanned as soon as possible.
Structure of a sitemap file
The sitemap file consists of descriptions of every page on your website, written in XML. This means each URL will be described with something like this:
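A minimal sketch of one such entry, with a hypothetical address and placeholder values chosen purely for illustration (none of them come from a real site):

```xml
<!-- Hypothetical page; the URL, date and values are placeholders -->
<url>
  <loc>https://www.example.com/about.html</loc>
  <lastmod>2024-01-15</lastmod>
  <changefreq>monthly</changefreq>
  <priority>0.8</priority>
</url>
```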
It may look cryptic at first, but this format is actually pretty easy to read. Each <url> block describes the parameters of one URL on your website that you want Google to know about. The following parameters are found in this example (for simplicity, you can treat each tag name as a parameter or an option):
- loc – the URL of some page on your website
- lastmod – the last modification date; this helps the Google crawler see whether a newer version of the page exists on your website and whether it should be downloaded and indexed again.
- changefreq – the change frequency parameter, specifying how often you update this particular page. For instance, some of your pages could be updated hourly, while others will remain unchanged for months. This parameter lets you tell Google how often you'd like the Google crawler to come back and check this particular page for updates. Setting this parameter WILL NOT guarantee that your page is checked that regularly, but it makes a suggestion which the Google crawler will take into consideration. This parameter may be one of the following: always (most frequently updated), hourly, daily, weekly, monthly, yearly or never.
- priority – specifies your rating of how important this page is. As mentioned above, it can be anything between 0.0 and 1.0, usually in steps of 0.1.
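Put together, a complete file wraps these <url> blocks in a single <urlset> root element. The sketch below assumes a hypothetical two-page site at example.com. The namespace is the one defined by the sitemaps.org protocol, while the URLs, dates and values are made up for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Hypothetical home page: changes often, rated most important -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <!-- Hypothetical archive page: rarely changes, rated lower -->
  <url>
    <loc>https://www.example.com/archive/2019.html</loc>
    <lastmod>2020-01-02</lastmod>
    <changefreq>yearly</changefreq>
    <priority>0.3</priority>
  </url>
</urlset>
```

Only loc is required in each entry; lastmod, changefreq and priority are optional and can be left out for pages where you have nothing specific to suggest.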
Don't worry about the sitemap file looking so complex – there are many automatic sitemap generators around, both online tools and plugins. For WordPress users, I suggest you have a look at the Google Sitemap Generator plugin.
More information on sitemaps can be found at Perfect Blogger.