How to Manage and Split Large XML Sitemap Files?

10/10/2020 8:00 PM by Admin in Seo


Big sites such as eCommerce stores, news portals, and other large websites contain a very large number of pages. Submitting a single XML sitemap with 50k or 1 million pages to Google Search Console is not an easy task, so the best approach is to split the sitemap into several files of 500, 1,000, 5,000, or 10,000 pages each.

How many pages go into each XML sitemap file is up to you, but keep the number of pages roughly the same in every file. That way Google's bots can crawl each individual XML sitemap file and start indexing the URLs.

Google and Yahoo/Bing place similar restrictions on the maximum size of an XML sitemap file that their webmaster tools can handle. Many XML sitemap generator tools output at most 1,000 to 5,000 pages and cannot split the result into multiple sitemap files. So let us discuss in detail how to manage and split large XML sitemap files.

How to manage and split large XML sitemap files?

Before splitting the XML files, visualize how the sitemap should look and which sublevels it needs. So let us walk through the process of managing and splitting a large sitemap file.

1) Keep the large sitemap file ready at the beginning

The first step is to prepare the full XML sitemap file by crawling the website and making sure no pages are broken. If the crawl shows that your site has 1 million pages, you need to know how to manage the XML file and have a general view of what you are working with.
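To get that general view, it can help to count the URLs in the big sitemap before splitting it. The following is only a minimal sketch: the function names iter_sitemap_urls and count_urls are made up for this example, and it assumes the file follows the standard sitemap namespace. It streams the file so that a sitemap with a million entries does not have to be loaded into memory at once.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def iter_sitemap_urls(source):
    """Yield every <loc> value found inside a <url> entry.

    `source` is a file path or a binary file object (hypothetical helper).
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == f"{{{SITEMAP_NS}}}url":
            loc = elem.find(f"{{{SITEMAP_NS}}}loc")
            if loc is not None and loc.text:
                yield loc.text.strip()
            # Drop the processed entry's children to keep memory use low.
            elem.clear()


def count_urls(source):
    """Return how many URLs the sitemap lists."""
    return sum(1 for _ in iter_sitemap_urls(source))
```

Calling count_urls("sitemap.xml") on the crawled file (the file name is an assumption) tells you how many pages you are about to split.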

According to the sitemaps protocol, a single sitemap file can contain at most 50,000 URLs (and must be no larger than 50 MB uncompressed), but for good SEO practice we want to keep the files small and easy for bots to crawl.
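Splitting a long list of URLs into equal-sized groups can be sketched in a few lines. This is an illustration only: chunk_urls is a hypothetical helper, and the default of 10,000 URLs per file is just one of the sizes mentioned in this article, not a recommendation from any search engine.

```python
MAX_URLS_PER_SITEMAP = 50_000  # upper limit for a single sitemap file


def chunk_urls(urls, size=10_000):
    """Split a list of URLs into consecutive groups of at most `size` items."""
    if not 1 <= size <= MAX_URLS_PER_SITEMAP:
        raise ValueError("size must be between 1 and 50,000")
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```

Every group except possibly the last one has the same number of URLs, which matches the advice to keep the files the same size.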

2) Categorize the XML files that you want to split

Now decide how many categories to split into, for example 5 or 10, based on the site's hierarchy. Suppose the site has 5 categories: articles, news, products, events, and services. Each sitemap should follow that hierarchy.

Let's take these 5 categories as an example. Each category gets its own sitemap file, which you can name articles.xml, news.xml, products.xml, events.xml, and services.xml. If a category is large, say 50,000 article pages, split it further into several sitemap files of 10,000 pages each. Then split the XML files for the other categories in the same way.

https://example.com/articles.xml or https://example.com/articles-sitemap.xml

https://example.com/news.xml or https://example.com/news-sitemap.xml

https://example.com/products.xml or https://example.com/products-sitemap.xml

https://example.com/events.xml or https://example.com/events-sitemap.xml

https://example.com/services.xml or https://example.com/services-sitemap.xml
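The grouping step can be sketched as below. It assumes that the category is the first segment of the URL path (for example /articles/my-post belongs to articles), which may not be true for every site. The function name group_by_category and the fallback name pages.xml for URLs without a path are made up for this example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_category(urls):
    """Map a sitemap file name such as 'articles.xml' to the URLs it should list.

    Assumption: the first path segment of each URL is its category.
    """
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        # URLs with no path (such as the home page) go into a catch-all file.
        category = path.split("/")[0] if path else "pages"
        groups[f"{category}.xml"].append(url)
    return dict(groups)
```

Each key of the returned dictionary is one category sitemap file, and its value is the list of URLs to write into that file.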

3) Now create a Sitemap Index file

Here we create a sitemap index file. The format looks very similar to a normal sitemap file but uses the tags <sitemapindex>, <sitemap>, and <loc>. There is no restriction on the file name, but it should end with the .xml extension.

Here is the sitemap index file code:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/articles.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/news.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/products.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/events.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/services.xml</loc>
  </sitemap>
</sitemapindex>
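If there are many category files, writing the index by hand gets tedious. A sitemap index like this one could also be generated with a short script. This is a sketch only: build_sitemap_index is a hypothetical helper, and the xml_declaration argument it relies on needs Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return the bytes of a sitemap index that lists the given sitemap URLs."""
    # Declaring xmlns on the root puts all child elements in the sitemap namespace.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

The returned bytes can be written to a file opened in binary mode, for example with open("sitemap.xml", "wb").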

4) Naming the Standard XML sitemap file

Once the sitemap index file is created, it is time to name the files. Make the index sitemap the primary sitemap and name it sitemap.xml. For the category sitemaps, keep the names above as a standard or change them as you wish.

5) Upload the sitemap files and submit them to Google Search Console

First, double-check that all the sitemap files were created without errors by validating them. Next, make sure all the files, the index sitemap as well as the individual sitemap files, are uploaded to the root directory of the site.
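A few basic checks can be automated before uploading. The sketch below is not a full validator and does not replace the report in Google Search Console. check_sitemap is a hypothetical helper that only tests whether the file is well-formed XML, has the expected root tag, stays within 50,000 URLs, and lists absolute http or https URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def check_sitemap(xml_bytes):
    """Return a list of problems found; an empty list means these checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    locs = root.findall(f"{{{SITEMAP_NS}}}url/{{{SITEMAP_NS}}}loc")
    if len(locs) > MAX_URLS:
        problems.append(f"too many URLs: {len(locs)}")

    for loc in locs:
        text = (loc.text or "").strip()
        if not text.startswith(("http://", "https://")):
            problems.append(f"not an absolute URL: {text!r}")
    return problems
```

Run it on the bytes of every category sitemap file and upload only the files for which the returned list is empty.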

Now submit each individual sitemap to Google Search Console by clicking the Sitemaps button. If no errors appear, the XML sitemap file was submitted successfully; otherwise the XML sitemap file has an issue that needs to be taken care of.