A website that is not indexed might as well not exist: it will not appear in search results and will not receive any organic traffic. That’s why you can’t afford to let Google leave your website out of its index. Unfortunately, this is a fairly common problem.
Don’t panic! In this post, we will explain the main techniques for getting your website indexed by Google.
How does Google indexing work?
Google is built on some fairly complex algorithms, but the process it follows to index a page is simple. The search engine relies on a set of programs known as web spiders, crawlers, or bots.
These spiders inspect web pages to find new and updated content. This can be a new page on an existing site or a completely new Website. The bots start by crawling a few web pages and then follow the links of those pages to find new URLs.
A while back, Google SERPs were easy to ‘manipulate,’ and you could get Google to index and rank a website based almost entirely on its keywords and links.
Today the situation is entirely different. Keywords are still important, but Google also gives great weight to the user’s experience and the intent behind the search. You could say the spiders are smarter now.
Google indexing refers to how the spiders process the data found on a page as they crawl it.
How to quickly get your Website indexed by Google
As we said, it is quite common for Google not to index a page, but the causes are usually the same.
You can solve them by applying the following solutions:
Verify that you have the proper robots.txt directives
One reason Google may not be indexing your site is the set of directives in your robots.txt file.
To check it, go to yourdomain.com/robots.txt and look for either of these two blocks:
User-agent: Googlebot
Disallow: /
User-agent: *
Disallow: /
Both tell Googlebot not to crawl any page of the site. Remove them (or adjust the Disallow rules) to solve the problem.
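For reference, this is a minimal robots.txt that lets Googlebot and every other crawler access the whole site; the empty Disallow line means “block nothing”:
User-agent: *
Disallow: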
Remove the noindex Tags
Google will not index your page if you tell it not to do so. This could be useful to maintain the privacy of certain web pages.
There are two ways to tell the search engine not to index a site:
Meta tags
Google does not index pages with any of these meta tags in the <head> of the HTML:
<meta name="robots" content="noindex">
<meta name="googlebot" content="noindex">
Remove the noindex tag from the pages you want Google to index. To do so, you can use an SEO crawler such as FandangoSEO to quickly identify all the pages that carry the noindex tag.
X-Robots-Tag
Googlebot also respects the X-Robots-Tag directive, which is sent as an HTTP response header. You can check whether a page is blocked by this directive using the URL Inspection tool in Google Search Console.
Ask your developer to make sure the pages you want indexed do not return this header.
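As a quick check outside of Search Console, you can also inspect the response headers yourself, for example with curl (the URL below is just a placeholder). A page blocked from indexing will return something like this:
curl -I https://yourdomain.com/example-page/
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
X-Robots-Tag: noindex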
Use Google’s Indexing API or GSC
Sites with many short-lived pages (job postings, classified ads, etc.) can use Google’s Indexing API to request that new and updated content be crawled and indexed automatically.
The API lets you submit individual URLs, which helps Google keep its index of your site up to date. With it you can (see the example request after this list):
- Update a URL: Notify Google of a new or updated URL to crawl.
- Delete a URL: Inform the search engine that an outdated page has been removed from the site.
- Check the status of a request: See when Google last received each type of notification for the URL.
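For illustration, an update notification is a single authenticated POST to the Indexing API; the job URL below is just a placeholder, and the request must be authorized with an OAuth 2.0 token for the https://www.googleapis.com/auth/indexing scope:
POST https://indexing.googleapis.com/v3/urlNotifications:publish
Content-Type: application/json

{
  "url": "https://yourdomain.com/jobs/software-engineer-123/",
  "type": "URL_UPDATED"
}
Send "type": "URL_DELETED" instead to notify Google that the page has been removed.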
This is something you can also do with FandangoSEO. The tool allows you to ask Google to index up to 200 URLs at once.
Another way to speed up the indexing of your pages is through Google Search Console. Use the GSC URL Inspection tool to ask Google to recrawl your pages. This is useful for requesting the crawl of individual URLs, but it allows a maximum of 12 URL submissions per day, so if you need to submit more, it is better to use an XML sitemap.
To start, inspect the URL with the URL Inspection tool and then select Request Indexing. The tool will first check the URL for indexing issues; if none are found, the URL will be queued for crawling.
Eliminate incorrect canonical tags
The canonical tag tells Google which version of a page is the preferred one. Most pages do not contain it, so search engines assume they should index them. However, if your page has an incorrect canonical tag, it may be pointing Google to a preferred version that does not exist, which keeps the page out of the index.
To review the canonical tags on a website, use the Google URL Inspection tool or an SEO crawler. If you detect a page carrying a canonical tag it should not have, remove the tag from it.
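For reference, a canonical tag is a single link element placed in the <head> of the page, and it should point to a URL that actually exists and resolves with a 200 status (the URL below is a placeholder):
<link rel="canonical" href="https://yourdomain.com/preferred-page/" />
If the page itself is the preferred version, the tag should point to the page’s own URL (a self-referencing canonical) or simply be removed.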
Include the relevant pages in your sitemap
Your sitemap tells Google which pages of your site are important and which are not. Hence the importance of providing a sitemap to Google.
The truth is that the search engine should be able to find the pages on your website whether they are in the sitemap or not, but it is still a good idea to include them, since it ‘makes things easier’ for Google.
You can use the URL Inspection tool from GSC to check if a page is included in the sitemap.
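As a reminder of the format, a minimal XML sitemap only needs a urlset element with one url/loc entry per page you want indexed (the URLs and date below are placeholders). You can then submit it in Google Search Console’s Sitemaps report:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
    <lastmod>2021-06-07</lastmod>
  </url>
  <url>
    <loc>https://yourdomain.com/important-page/</loc>
  </url>
</urlset>
You can also reference it from robots.txt so crawlers find it on their own:
Sitemap: https://yourdomain.com/sitemap.xml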
Detect orphan pages
An orphan page is a page with no internal links pointing to it. Google discovers new content by following links as it crawls the web, so it logically can’t find orphan pages if nothing links to them. And neither can your site visitors.
You can detect if there are orphan pages on a website by using an SEO Crawler. Learn more in our Orphan Pages Guide.
Fix internal links containing the nofollow attribute
Nofollow links are those that carry a rel="nofollow" attribute, which is used to prevent the transfer of PageRank to the destination URL. Google did not follow this type of link until the March 1, 2020 nofollow update, when it announced that the attribute would be treated only as a hint.
Review your internal links to identify those that contain a nofollow attribute. If you want the target page to be indexed, you will need to remove the nofollow attribute from the links pointing to it.
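For example, the first internal link below tells Google not to follow it; removing the nofollow value restores a normal, followable link (the URL and anchor text are placeholders):
Before: <a href="/category/important-page/" rel="nofollow">Important page</a>
After: <a href="/category/important-page/">Important page</a>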
Add powerful internal links
If you want Google to index a page quickly, you can signal how valuable it is by passing a good amount of link juice (or authority) to it. You can do this by linking to the page from pages as close to your home page as possible. The number of internal links a page receives also reveals its weight within your site.
Learn more about how to build strong internal linking in our guide.
Avoid Duplicate content
Google’s bots get confused by duplicate content. The search engine initially indexes only one URL for each unique set of content, so similar content makes it difficult for it to decide which version to index.
Because pages with similar content “compete” with each other, duplication hurts the performance of all of them. That’s why you must avoid duplicate content.
Make sure your page has value
Google is unlikely to index low-quality pages, as they do not provide value to the user. So if no technical issue explains the indexing failure, the cause may be a lack of content value.
Ask yourself if the page is valuable and if it’s worth clicking on. If not, it would be necessary to improve its content. Always keep in mind the user intent.
As you can see, the point is to check that no technical issues hinder the indexing of the page. And once this has been ruled out, you need to look at whether it provides value to the user.
Last Updated on June 7, 2021 by Hannah Dango