2.20.5.1. Sitemap sitemap.xml

The sitemap file sitemap.xml is a very useful file that shows search engines, in a standardized form, the list of pages to be indexed. The file must follow a specific syntax and describe all the necessary and important pages of the site.

A description of the protocol is also available on the official website, sitemaps.org.

Important points:

  • The file must be named exactly sitemap.xml, and its encoding must be UTF-8.
  • A single sitemap.xml file must not exceed 50 MB uncompressed. The file may be compressed with gzip (for example, sitemap.xml.gz) to save bandwidth, but the limit still applies to the uncompressed size. If the file is larger than 50 MB, split it into a group of several sitemaps.
  • A single sitemap.xml file must contain no more than 50,000 links.
  • The sitemap.xml file should be in the root directory of the site. That is, it must be accessible through the browser at an address of the form http://www.example.com/sitemap.xml.
  • All links in the sitemap must be absolute, that is, they must include the protocol and the domain, for example: http://www.example.com/.
  • The sitemap must meet the requirements of the target search robot, since some robots impose their own conditions on this file.
  • Search engine crawlers treat the sitemap only as a recommendation. Robots may ignore it if the map itself contains errors or for other reasons of their own.
  • Some special characters must be escaped.
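
Several of the points above can be checked automatically. The following is a minimal sketch in Python, not an official validator: the function name check_sitemap and the constants are hypothetical, and it only covers the size limit, the link limit, and the absolute-URL rule.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MAX_BYTES = 50 * 1024 * 1024   # 50 MB limit for one file
MAX_URLS = 50_000              # maximum number of links in one file
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def check_sitemap(xml_bytes: bytes) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 50 MB")

    # Collect the text of every <loc> inside <url> entries.
    root = ET.fromstring(xml_bytes)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]
    if len(locs) > MAX_URLS:
        problems.append("more than 50,000 links")

    # An absolute URL has both a scheme (http, https) and a host name.
    for loc in locs:
        parts = urlparse(loc)
        if not (parts.scheme and parts.netloc):
            problems.append(f"not an absolute URL: {loc}")
    return problems
```

Calling check_sitemap() with the bytes of a sitemap file returns an empty list when none of these three rules is violated. Requirements of individual search robots are not covered by this sketch.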

When creating a sitemap, you need to adhere to a certain syntax. A minimal sitemap with the correct syntax looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://example.com/</loc>
   </url>
</urlset>
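
A file of this shape does not have to be written by hand. As a rough sketch, assuming Python 3.8 or newer and its standard xml.etree.ElementTree module, a minimal sitemap could be generated as follows. The function name build_sitemap is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> bytes:
    """Build a minimal sitemap: one <url><loc> entry per address."""
    # The protocol namespace is written as the xmlns attribute of <urlset>.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = address
    # xml_declaration=True puts the XML prologue on the first line.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_sitemap(["http://example.com/"]).decode("utf-8"))
```

The result is returned as UTF-8 bytes, so it can be written to sitemap.xml directly in binary mode.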

The following tags are used in the sitemap.xml file:

  • <?xml version="1.0" encoding="UTF-8"?> is the prologue of the XML file. This line specifies the XML version and encoding. It must always be the first line of the file. Required.
  • <urlset>...</urlset>: the parent tag that contains all entries for the site's pages, each placed in its own <url> tag. Required tag.
    The opening tag must specify the namespace of the current protocol, like this:
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">...</urlset>
  • <url>...</url>: a tag that contains the URL itself and information about it. Required tag.
  • <loc></loc>: a tag that specifies one particular URL. Required tag.
  • <lastmod></lastmod>: the date of the last change, in W3C Datetime format (for example, YYYY-MM-DD). Optional tag.
  • <changefreq></changefreq>: the likely frequency of changes to this page. This tag is for guidance only. Optional tag.
    Valid values:
    • always: the page changes every time it is accessed, so changes should be checked at each indexing.
    • hourly / daily / weekly / monthly / yearly: changes should be checked at the given interval, that is, every hour / day / week / month / year.
    • never: the page is not expected to change, so there is no need to check it for changes.
  • <priority></priority>: the priority of the URL relative to other URLs specified in the sitemap. The value ranges from 0.0 to 1.0, and the default for all URLs is 0.5. Optional tag.

    Attention!

    The priority tag does not affect a page's position in search results. Its value only affects the order in which the pages of the site are queued for indexing.
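
The values of the optional tags can be checked before they are written to the file. The sketch below is only an illustration: the function name check_optional_tags is hypothetical, and it assumes that lastmod is written as a plain YYYY-MM-DD date (the W3C Datetime format also allows a time part, which this sketch does not handle).

```python
from datetime import datetime

VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_optional_tags(lastmod=None, changefreq=None, priority=None) -> list[str]:
    """Return a list of problems with the optional tag values (given as strings)."""
    problems = []

    if lastmod is not None:
        try:
            # Assumption: only the date-only form is checked here.
            datetime.strptime(lastmod, "%Y-%m-%d")
        except ValueError:
            problems.append(f"lastmod is not a YYYY-MM-DD date: {lastmod}")

    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"unknown changefreq value: {changefreq}")

    if priority is not None:
        try:
            value = float(priority)
        except ValueError:
            problems.append(f"priority is not a number: {priority}")
        else:
            if not 0.0 <= value <= 1.0:
                problems.append(f"priority is outside 0.0 to 1.0: {priority}")

    return problems
```

For instance, check_optional_tags("2024-01-15", "weekly", "0.8") reports no problems, while a changefreq value such as "sometimes" is reported as unknown.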

In XML files, all data (including URLs) must use the character escapes listed in the table below.

Character       Symbol   Escape code
Ampersand       &        &amp;
Single quote    '        &apos;
Double quote    "        &quot;
Greater than    >        &gt;
Less than       <        &lt;
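
When a sitemap is assembled as plain text, these replacements can be applied programmatically. A minimal sketch using the escape() function from Python's standard xml.sax.saxutils module is shown below. The helper name escape_for_sitemap and the example URL are hypothetical.

```python
from xml.sax.saxutils import escape

# escape() replaces &, < and > by default; the two quote characters
# are passed as additional replacements.
QUOTES = {"'": "&apos;", '"': "&quot;"}


def escape_for_sitemap(text: str) -> str:
    """Apply the five character escapes from the table to a string."""
    return escape(text, QUOTES)


if __name__ == "__main__":
    print(escape_for_sitemap("http://www.example.com/catalog?item=1&sort=asc"))
    # prints: http://www.example.com/catalog?item=1&amp;sort=asc
```

The function should be applied once to raw text. Applying it to a string that is already escaped would escape the ampersands a second time.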

If the sitemap.xml file is larger than 50 MB or contains more than 50,000 links, it should be split into several files. In that case, sitemap.xml becomes a sitemap index file that points to the other sitemap files.

Example sitemap index file:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml</loc>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml</loc>
   </sitemap>
</sitemapindex>

The sitemap index file has the following syntax:

  • <?xml version="1.0" encoding="UTF-8"?> is the prologue of the XML file. This line specifies the XML version and encoding. It must always be the first line of the file. Required.
  • <sitemapindex>...</sitemapindex>: the parent tag that contains all references to sitemap files. Required tag.
  • <sitemap>...</sitemap>: a tag that contains the URL of a sitemap file and information about it. Required tag.
  • <loc></loc>: a tag that specifies the URL of one particular sitemap file. Required tag.
  • <lastmod></lastmod>: the date of the last change to the sitemap file. Optional tag.
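
Splitting a long list of links and building the index file can be sketched as follows. This is an illustration under stated assumptions: the function names split_urls and build_index are hypothetical, the files are assumed to be named sitemap1.xml, sitemap2.xml and so on, and only the 50,000-link limit is taken into account, not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # maximum number of links in one sitemap file


def split_urls(urls: list[str], chunk_size: int = MAX_URLS) -> list[list[str]]:
    """Split the full list of links into chunks that fit into one file each."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_index(base_url: str, file_count: int) -> bytes:
    """Build a sitemap index that points to sitemap1.xml ... sitemapN.xml."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for number in range(1, file_count + 1):
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url}/sitemap{number}.xml"
    # xml_declaration=True puts the XML prologue on the first line.
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)
```

With 120,000 links, split_urls() produces three chunks, and build_index("http://www.example.com", 3) produces an index with three <sitemap> entries. Each chunk still has to be written to its own sitemap file.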

Examples of services used to generate and validate sitemaps.

Attention!

Hosting.XYZ LTD is not affiliated with these services and cannot recommend a specific tool for any particular task.