XML sitemaps optimisation

XML sitemaps are an important part of the Technical SEO skill set you’ll need if you want your SEO to go to the next level. In this article, we’ll talk about what an XML sitemap is and how it helps your SEO, how you can build your XML sitemap, and how to submit it to Google.

XML sitemaps for SEO

What is an XML Sitemap?

First of all, let’s talk about what an XML sitemap is. Simply put, an XML sitemap is a file containing a list of URLs on your website that you want to have indexed in Google.

Below is a very basic XML sitemap.

basic xml sitemap example
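
If you’d like to see what goes into a file like this, here is a minimal sketch in Python that writes a basic sitemap using only the standard library. The example.com URLs and the build_sitemap helper are made up for illustration, so swap in your own pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical example URLs; replace with pages from your own site
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/about/",
])
print(sitemap_xml)
```

Each page gets its own url entry with a loc tag holding the full address. Everything else in a sitemap is optional.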

Why is a sitemap important for SEO?

With an XML sitemap, you can help Google better crawl, understand and give relevance to the different pages on your website. It works kind of like a map, telling Google where everything is on your website.

You’re making it easier for Google to do its job by giving it a map to crawl. Google will crawl a sitemap to find all the relevant pages, URLs, and resources on your website. This makes it easier for these pages to be indexed in Google’s search engine.

Do you need an XML Sitemap?

You may not actually need to have a sitemap for your website. Google doesn’t specifically require a sitemap for your website to be crawled and indexed. If your site is smaller with only a few URLs all linked together properly, you may not need to bother with a sitemap. However, it’s a good idea to have one anyway, for SEO reasons.

Keep in mind, this won’t guarantee all your pages get crawled, indexed, or have better rankings. Google has told us: “Using a sitemap doesn’t guarantee that all the items in your sitemap will be crawled and indexed, as Google processes rely on complex algorithms to schedule your crawling. However, in most cases, your site will benefit from having a sitemap, and you’ll never be penalised for having one.”

If it’s not much work for you, it’s really worth having an XML sitemap, no matter how small your website is. Personally, I’ve seen search results improve after a properly implemented and well-optimised sitemap was put in place. So, it’s definitely worth having an XML sitemap if you want to see SEO results.

Building an XML sitemap

Check if you already have a sitemap.

Firstly, check to see if your website already has one. Just add /sitemap.xml to the end of your domain in the browser address bar and see if the XML sitemap comes up. In our case, we’re having a look at the Semrush.com site, so we’ll type https://www.semrush.com/sitemap.xml into the browser address bar.

semrush xml sitemap example
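
If you have a few sites to check, the same idea can be scripted. This is only a rough sketch: the two paths are common defaults rather than a complete list, and both helper names are my own. Note that sitemap_exists makes a real network request, and some sites block scripted requests, so a False result doesn’t prove there is no sitemap.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Common default locations; your CMS may use something different
COMMON_SITEMAP_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]


def candidate_sitemap_urls(site_url):
    """Build the URLs where a sitemap is most commonly found."""
    return [urljoin(site_url, path) for path in COMMON_SITEMAP_PATHS]


def sitemap_exists(url, timeout=10):
    """Return True if the URL answers with HTTP 200 (makes a real request)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False


candidates = candidate_sitemap_urls("https://www.semrush.com")
print(candidates)
```

You could then loop over candidates and call sitemap_exists on each one to see which location responds.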

Use Yoast to build your sitemap

Using Yoast to build and maintain your XML sitemap is an excellent option for your WordPress website because it updates on the fly as content is changed, added, and removed. This automatically keeps your sitemap up to date, so you are always giving Google the most relevant information. Later in this article, I will explain how to use Yoast to generate your sitemap.

Manually create a sitemap

You may need to manually create a sitemap if your website does not have a way to generate one automatically. There are lots of tools available for this. Further on in this article, I will explain how to generate an XML sitemap with Screaming Frog.

Automatically generated sitemaps

If you’re using another CMS like Shopify or Squarespace, then you’ve already got a sitemap, because these platforms generate one for you automatically. It will usually live in the root directory of your site, at /sitemap.xml. You don’t usually have much control over these automatically generated sitemaps.

Submitting your sitemap to Google

Submitting via Google Search Console

Use Google Search Console to submit your XML sitemap. Just go to the sitemaps report. You’ll see there’s a little section where you can type in the URL for your sitemap. Enter your sitemap URL, and then submit it.

submit xml sitemap to google search console

Processing your sitemap and checking for errors

That’s it, all done! Oh, wait… You’ll need to give it some time for Google to validate and process the data, of course.

Once processed, the report will show you either a little success message or details of any errors. You’ll want to keep an eye on the XML sitemap report and check it for errors over time. Fix errors ASAP to make sure you get the best search results from your sitemap optimisation efforts.

Optimising for better Google results

Clean up sitemaps for a better crawl budget

Cleaning out bloated sitemaps and removing unnecessary URLs can help you get better SEO results by improving your crawl budget. We know crawl budget is a big deal, and if your XML sitemaps are bloated with unnecessary URLs, it can affect the way that Google crawls your website. It’s definitely worth cleaning up your XML sitemap to improve your crawl budget.

Include your sitemap in your robots.txt file

Adding your sitemap’s URL to your robots.txt file helps Google discover the sitemap, and it also shows trust and ownership.

sitemap in robots.txt
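
The robots.txt entry is a single line starting with Sitemap: followed by the full sitemap URL. As a quick sketch, here is a made-up robots.txt (the example.com URL and the /admin/ rule are assumptions) parsed with Python’s standard-library robots.txt parser to confirm the sitemap line is picked up. The site_maps method needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; only the Sitemap line matters here
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none
declared_sitemaps = parser.site_maps()
print(declared_sitemaps)
```

If you use a sitemap index file, you would list the index file’s URL on the Sitemap line instead.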

Submit using the Google ping service

You can also submit your sitemap directly using the Google ping service. Enter the ping URL shown below into your browser, replace the value with your own sitemap URL, and hit enter. Note that Google announced in 2023 that it was retiring this ping endpoint, so check that it still works before relying on it. Search Console and robots.txt remain the dependable options.

submit xml sitemaps using ping service

Advanced XML Sitemaps Tips

I’ve put together a list of advanced tips to help you quickly make sure you are optimising as best you can and addressing issues as quickly as possible.

XML sitemap priority and change frequency

You can set the priority and the change frequency of each URL in an XML sitemap. Something to note here is that Google has said it may ignore both values. However, we don’t always just do everything because Google said something.
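
For reference, here is a small sketch of how those two values sit inside a single URL entry. The tag names and allowed values come from the sitemaps.org protocol, while the url_entry helper and the example.com address are just for illustration.

```python
import xml.etree.ElementTree as ET

# Values allowed for <changefreq> by the sitemaps.org protocol
VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def url_entry(loc, changefreq, priority):
    """Return one <url> entry with changefreq and priority as a string."""
    if changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"invalid changefreq: {changefreq}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    entry = ET.Element("url")
    ET.SubElement(entry, "loc").text = loc
    ET.SubElement(entry, "changefreq").text = changefreq
    ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(entry, encoding="unicode")


# Hypothetical page; replace with your own URL and values
entry_xml = url_entry("https://www.example.com/blog/", "weekly", 0.8)
print(entry_xml)
```

Priority is relative to the other pages on your own site, so setting everything to 1.0 tells crawlers nothing.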

XML sitemaps guidelines

  • Please list only your canonical URLs, and make sure they return a status code of 200 OK.
  • Only use fully qualified URLs. Don’t use shorthand and don’t put any other messy URLs in there. Publish your sitemap in your root directory and do not include session IDs in your sitemaps.
  • Make sure to exclude utility pages, admin URLs, or any other URLs that you don’t want to have found in Google search results.
  • Including hreflang annotations in sitemaps is also an excellent thing to do if you have a multi-language website.
  • XML sitemaps must be UTF-8 encoded.
  • The maximum size for an XML sitemap is 50,000 URLs or 50 megabytes, whichever is reached first. You can use gzip to save bandwidth, which helps Google fetch your XML sitemap faster. You can also use an index file for better sitemap management. Reference your index file in your robots.txt and also submit it in Google Search Console.
  • Getting a good understanding of Google Search Console sitemap errors will greatly help you optimise your XML sitemaps for better SEO results.
  • It’s really worth considering the crawl budget, so keep your XML sitemap lean.
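
A few of the guidelines above are easy to check automatically before you publish. The sketch below covers only three of them: the 50,000 URL limit, fully qualified URLs, and session IDs. The list of session parameter names is my own guess at common ones, so adjust it to match your site.

```python
from urllib.parse import parse_qs, urlparse

MAX_URLS = 50_000
# Assumed names of session parameters; adjust for your own platform
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}


def check_sitemap_urls(urls):
    """Return a list of human-readable problems found in the URL list."""
    problems = []
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(urls)} > {MAX_URLS}")
    for url in urls:
        parts = urlparse(url)
        # A fully qualified URL has both a scheme and a host name
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not fully qualified: {url}")
            continue
        query_keys = {key.lower() for key in parse_qs(parts.query)}
        if query_keys & SESSION_PARAMS:
            problems.append(f"contains a session ID: {url}")
    return problems


# Hypothetical URLs, two of which break the guidelines
problems = check_sitemap_urls([
    "https://www.example.com/",
    "/about/",
    "https://www.example.com/cart?sid=abc123",
])
for problem in problems:
    print(problem)
```

This doesn’t check status codes or canonical tags, which need a crawl of the live pages.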

Discover your new URLs faster

Perhaps one of the biggest optimisation tips is that XML sitemaps are a great way to help Google discover and index new pages on your site faster. So make sure your XML sitemap is in good order and well optimised.

Large sitemaps and index files

What do you do with large sitemaps and index files? If a sitemap would go over 50,000 URLs or over 50 megabytes, split it into several smaller sitemaps and list them in an index file. That way, you can submit many XML sitemaps together. Shown below is a picture of a very simple XML sitemap index file.

xml sitemap index file basic example
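
To make the splitting step concrete, here is a rough sketch that chunks a long URL list into groups of 50,000 and builds an index file pointing at one sitemap per group. The example.com URLs and the sitemap-1.xml style file names are invented for the example, and the sketch only handles the URL count, not the 50 megabyte limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into chunks that each fit in one sitemap."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index file (as a string) listing child sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 pages
all_urls = [f"https://www.example.com/page-{n}/" for n in range(120_000)]
chunks = split_urls(all_urls)

# One child sitemap file per chunk; the file names are an assumption
child_sitemaps = [
    f"https://www.example.com/sitemap-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(child_sitemaps)
print(index_xml)
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be submitted.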

Multi-website sitemaps

In this case, we can also use an index file, which lists the sitemaps of your verified sites. You can place the index file on one verified website and include the sitemaps of your other websites in that index file. Here, in the diagram, you can see we have five websites.

google multi site sitemaps example

The index file is stored in the root of the website you have chosen to host it, and it lists the sitemaps of the other websites. Because it points to other sites, you’ll need to prove ownership of those websites. You can do this by referencing your index file in the robots.txt file of each of those websites.

You can also go to Google Search Console and verify those other websites and submit them that way. Once you’ve done this, you submit your XML sitemap index file, and this will get Google to crawl and process all of your other sitemaps. You can include URLs from various sites in one file as long as ownership of each site is verified. Google has more information on multi-site XML sitemaps.

cross site xml sitemaps need-verification in search console

Video sitemaps

You can create video sitemaps as a separate file, or you can include them in an existing sitemap. Each entry is for a video on a page. This supplies detailed video information. Google can’t guarantee when or if your videos will be indexed because Google relies on complex indexing algorithms. Here you see an example of a video sitemap.

video sitemap basic example
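
Here is a minimal sketch of one video entry built with Python’s standard library, assuming the video sitemap extension namespace documented by Google. The page, thumbnail and video file URLs are all made up, and only a handful of the available video tags are shown.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Make the serialiser emit xmlns="..." and xmlns:video="..." prefixes
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)


def build_video_sitemap(page_url, video):
    """Return a sitemap with one page that carries one video entry."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    video_el = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
    for tag in ("thumbnail_loc", "title", "description", "content_loc"):
        ET.SubElement(video_el, f"{{{VIDEO_NS}}}{tag}").text = video[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page and video details
video_xml = build_video_sitemap(
    "https://www.example.com/videos/bread/",
    {
        "thumbnail_loc": "https://www.example.com/thumbs/bread.jpg",
        "title": "How to bake bread",
        "description": "A step-by-step guide to baking a simple loaf.",
        "content_loc": "https://www.example.com/media/bread.mp4",
    },
)
print(video_xml)
```

The video tags sit inside the normal url entry for the page the video appears on, which is why they can live in an existing sitemap.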

Image sitemaps

You can create an image sitemap as a separate file or include it in existing sitemaps. Each entry lists the images on a page, and you can include up to 1,000 images per URL. Below is a basic example of a sitemap with one URL and two images.

basic image sitemap example
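
The same one-URL, two-image case can be sketched in Python like this, assuming the image sitemap extension namespace documented by Google. The page and image addresses are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_URL = 1000

# Make the serialiser emit xmlns="..." and xmlns:image="..." prefixes
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(page_url, image_urls):
    """Return a sitemap with one page and the images that appear on it."""
    if len(image_urls) > MAX_IMAGES_PER_URL:
        raise ValueError("a page can list at most 1000 images")
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for image_url in image_urls:
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page with two images
image_xml = build_image_sitemap(
    "https://www.example.com/gallery/",
    [
        "https://www.example.com/images/photo-1.jpg",
        "https://www.example.com/images/photo-2.jpg",
    ],
)
print(image_xml)
```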

Google News sitemaps

It’s really helpful and important to have a Google News sitemap if you’re a publisher. Include only articles published in the last two days, and remove articles once they are older than that, which keeps your XML sitemap fresh as articles rotate through. News articles will remain indexed for 30 days. There can be up to 1,000 URLs per News sitemap. You can use a sitemap index file for larger sets, and you can include up to 50,000 sitemaps in a single sitemap index file. Here is an example of a basic Google News sitemap.

basic news sitemap example
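
To show how the two-day rotation might work in practice, here is a rough sketch that filters a list of articles by publication date and then builds the News sitemap from what is left. It assumes the News extension namespace documented by Google. The publication name, article data and helper names are all invented.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"

ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)


def recent_articles(articles, now, max_age=timedelta(days=2)):
    """Keep only the articles published within the last two days."""
    return [a for a in articles if now - a["published"] <= max_age]


def build_news_sitemap(articles, publication_name, language):
    """Return a Google News sitemap (as a string) for the given articles."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["url"]
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(publication, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(publication, f"{{{NEWS_NS}}}language").text = language
        date = ET.SubElement(news, f"{{{NEWS_NS}}}publication_date")
        date.text = article["published"].isoformat()
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical articles and a fixed "now" so the example is repeatable
now = datetime(2024, 5, 3, 12, 0, tzinfo=timezone.utc)
articles = [
    {"url": "https://www.example.com/news/fresh/", "title": "Fresh story",
     "published": datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)},
    {"url": "https://www.example.com/news/old/", "title": "Old story",
     "published": datetime(2024, 4, 20, 9, 0, tzinfo=timezone.utc)},
]
fresh = recent_articles(articles, now)
news_xml = build_news_sitemap(fresh, "Example Times", "en")
print(news_xml)
```

On a live site you would pass the current time instead of a fixed date and regenerate the file whenever articles are published.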

WordPress sitemaps with Yoast

A speedy and easy way of getting an XML sitemap set up for WordPress is to simply use the Yoast plugin. Yoast uses a sitemap index file, which takes the form of sitemap_index.xml. Yoast also updates your sitemap automatically as content changes. Post types that are marked as noindex will not appear in your sitemap.

This is a good way of controlling which URLs appear in your sitemap. The sitemap feature can be toggled on or off, because you may prefer to use the built-in XML sitemaps that have been available since WordPress version 5.5. You can find your XML sitemap by navigating to your Yoast settings, as shown in the picture below.

yoast xml sitemaps

Screaming Frog XML sitemap generator

How do we use the Screaming Frog XML sitemap generator to create sitemaps from a website crawl?

screaming frog xml sitemap-generator

When creating sitemaps from a website crawl:

  • You can include canonical pages with a status code of 200 OK.
  • You can also include images hosted on a CDN or a subdomain.

To find out more about using the Screaming Frog XML Sitemap Generator, see Screaming Frog’s own documentation.

Other Sitemap Generators & APIs

There are many other sitemap generators and APIs we can take advantage of, far too many to mention them all. But if you want to get started, you can look at Google’s list of sitemap generators.

We can also look at Semrush’s list of top 10 XML sitemap generators.

If you’re interested in the Google API for submitting sitemaps, see Google’s documentation.

XML sitemap errors

Many things can go wrong in XML sitemaps, and a lot of errors can develop. It’s important to address these errors as soon as possible.

You’ll need to take into particular consideration any errors caused by URL issues and canonicalization issues, as these will prevent crawlers from completing the site crawl effectively.

You can use the Semrush Audit tool to check for and fix any XML sitemap errors. In the graph below, we can see data provided by Semrush, taken from 200,000 websites, which shows the most common XML sitemap issues.

semrush audit tool xml sitemap common errors

Here we can see a number of issues:

  • 30.35% of the websites tested had incorrect pages found in the sitemap.xml,
  • 17.41% had a sitemap.xml file that was not found,
  • 84.88% had orphaned pages in the sitemap,
  • 48.77% had a sitemap.xml not specified in the robots.txt,
  • 2.97% had an invalid sitemap.xml format,
  • 4.45% had HTTP URLs in the sitemap.xml for an HTTPS site, and
  • 0.06% had a sitemap that was too large.

Common XML sitemap issues

Issue                                          Type      Sites with errors
Incorrect pages found in sitemap.xml           error     30.35%
Sitemap.xml not found                          warning   17.41%
Orphaned sitemap pages                         notice    84.88%
Sitemap.xml not specified in robots.txt        warning   48.77%
Invalid sitemap.xml format                     error     2.97%
HTTP URLs in sitemap.xml for HTTPS site        warning   4.45%
Too large sitemap.xml                          error     0.06%

Putting it into perspective, this data comes from over 200k websites. We’re actually seeing quite a lot of optimisation problems, which for us as SEOs are really opportunities. If you can do the work of optimising your XML sitemaps, then you could get ahead of your competition.

Fewer errors with automated sitemaps

Automated sitemaps tend to have fewer errors, so it’s worth spending the time to get your XML sitemap automated rather than manually creating it.

It’s worth diagnosing these errors with tools such as Google Search Console, Screaming Frog, and Semrush.

Get started with XML Sitemaps today

XML Sitemaps are an integral and fundamental aspect of Technical SEO. You will want to get a good handle on how to build, submit and troubleshoot them if you want to really kick your SEO into gear.

As I have demonstrated above, there are many ways to approach this, which is typical of many things in SEO. Whatever way you choose to approach this task, it’s worth remembering that you need to do the work if you want to get the results. So dive in and get started, and reach out to us if you have trouble. We are always happy to help.