<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Indexing &#8211; SEO Blog Punto Rojo</title>
	<atom:link href="https://puntorojo.com/blog/en/indexing/feed/" rel="self" type="application/rss+xml" />
	<link>https://puntorojo.com/blog/en</link>
	<description></description>
	<lastBuildDate>Wed, 03 Aug 2022 21:53:08 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.8.2</generator>
	<item>
		<title>What is indexing for and how to optimize it?</title>
		<link>https://puntorojo.com/blog/en/what-is-indexing-for-and-how-to-optimize-it/</link>
		
		<dc:creator><![CDATA[Ramon Ruiz]]></dc:creator>
		<pubDate>Fri, 20 May 2022 15:00:46 +0000</pubDate>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Indexing]]></category>
		<category><![CDATA[optimization]]></category>
		<guid isPermaLink="false">https://puntorojo.com/blog/en/?p=6669</guid>

					<description><![CDATA[<p>One concept you have almost certainly heard when talking about SEO is “indexing”. Indexing is one of the most important things to take care of on a site, because a website that is not indexed is not visible in the search results of Google or any other search engine.&#8230;</p>
<p>The post <a rel="nofollow" href="https://puntorojo.com/blog/en/what-is-indexing-for-and-how-to-optimize-it/">What is indexing for and how to optimize it?</a> appeared first on <a rel="nofollow" href="https://puntorojo.com/blog/en">SEO Blog Punto Rojo</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400;">One concept you have almost certainly heard when talking about SEO is “indexing”.</span></p>
<p><span style="font-weight: 400;">Indexing is one of the most important things to take care of on a site, because a website that is not indexed is not visible in the search results of Google or any other search engine.</span></p>
<p><span style="font-weight: 400;">Likewise, if content that shouldn&#8217;t be shown in Google is indexed, it may hurt your ranking.</span></p>
<p><span style="font-weight: 400;">Let&#8217;s look at a more specific definition of indexing, followed by some advice on how to optimize it. Will you join me?</span></p>
<h2><span style="font-weight: 400;"> </span><span style="color: #0000ff;"><b>What is Indexing? </b></span></h2>
<p><span style="font-weight: 400;">To define indexing, we first need to understand the process that Googlebot uses to make a URL visible in search results (we focus on Google because it is the most widely used search engine):</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Crawling: Google uses “spiders” to navigate sites. A crawl can be forced or not. It is forced when we take some action to trigger it (we will see this later); otherwise Googlebot visits sites on its own, and some of them regularly.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Indexing: once this navigation is done, Google decides whether to show the content in its search results. If it does, your URL or website is “indexed”. The speed at which content is indexed can be a determining factor, especially on news websites.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Publication: Google classifies this content and assigns it a position in its ranking for the different queries.</span></li>
</ol>
<p><span style="font-weight: 400;">With all this in mind, we define indexing as the process in which search engines find and analyze your content, store it in their database, and then give it visibility in search results.</span></p>
<h3><span style="font-weight: 400; color: #0000ff;">Difference between indexing and crawling</span></h3>
<p><span style="font-weight: 400;">Google indexes what it considers relevant, which is why a crawled page is not necessarily an indexed page.</span></p>
<p><span style="font-weight: 400;">In the same way, we can let Google crawl some of our URLs and indicate that they are not indexable.</span></p>
<p><span style="font-weight: 400;">We will see this later, but meanwhile, think about which pages of your site should appear in the search engine and which shouldn&#8217;t.</span></p>
<p><span style="font-weight: 400;">A practical case is the legal notice and privacy policy pages. Should they be indexed? No, not at all. If you copied a privacy policy from a third party onto your website and that URL is indexed, you are creating duplicate content.</span></p>
<h2><span style="color: #0000ff;"><b>How can you check your indexing status?</b></span></h2>
<h3><span style="font-weight: 400; color: #0000ff;">1. The “site:” command in Google</span></h3>
<p><span style="font-weight: 400;">If you want to know how many of your site&#8217;s URLs appear in Google, you can use the command site:yourdomain.com in the search bar.</span></p>
<p><span style="font-weight: 400;">As a result, you will get an approximate count of indexed URLs, as you can see in the following example:</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-11.57.21-1024x671.png" width="519" height="340" /></p>
<h3><span style="font-weight: 400; color: #0000ff;">2. Google Search Console Tools</span></h3>
<p><span style="font-weight: 400;">Checking the status of indexed pages in Google Search Console is very simple. To see it, go to the left sidebar </span><span style="font-weight: 400;">&gt; Coverage &gt; &#8220;Valid&#8221;.</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.12.44-1024x373.png" width="699" height="255" /></p>
<p><span style="font-weight: 400;">Search Console also shows which URLs are not displayed because of an error, and which are valid but have warnings.</span></p>
<p><span style="font-weight: 400;">Usually, this happens when your URL was crawled and Google tried to index it, but some problem prevented it. Some of the most common problems are 4xx and 5xx errors, or a URL that is included in the sitemap but is not indexable, as we can see below:</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.13.30-1024x771.png" width="628" height="473" /></p>
<p><span style="font-weight: 400;">You will see the total number of URLs with issues; click on an issue to list the affected URLs. You can export this list and then work on it in Excel or another spreadsheet tool.</span></p>
<p><span style="font-weight: 400;">You can also use the “URL Inspection” tool to verify a specific page:</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.15.49-1024x336.png" width="752" height="247" /></p>
<h3><span style="font-weight: 400; color: #0000ff;">3. Seerobots Extension</span></h3>
<p><span style="font-weight: 400;">Important: rather than telling you whether a page is indexed, this tool shows you whether it can be indexed or whether a directive prevents it. </span></p>
<p><span style="font-weight: 400;">An easy way to know whether a URL is indexable, including your competitors&#8217; websites or those you don&#8217;t have access to in Search Console, is to use the Seerobots extension.</span></p>
<p><span style="font-weight: 400;">The disadvantage is that it is not available for Chrome, but you can still use it in Mozilla Firefox: </span><a href="https://addons.mozilla.org/en-US/firefox/addon/seerobots/"><span style="font-weight: 400;">https://addons.mozilla.org/en-US/firefox/addon/seerobots/</span></a></p>
<p><span style="font-weight: 400;">To use it, open the URL you want to analyze and click on the extension. It will show you whether the page is crawlable and indexable: </span></p>
<p><img loading="lazy" class="" src="https://res.cloudinary.com/scuba-dive-argentina/image/upload/c_scale,w_830,h_654/f_webp,q_auto/v1639501061/Screen-Shot-2020-07-15-at-15.16.28.png?_i=AA" width="206" height="162" /></p>
<h3><span style="font-weight: 400; color: #0000ff;">4. Screaming Frog</span></h3>
<p><span style="font-weight: 400;">Much like the previous tool, Screaming Frog tells us which URLs are indexable, but not whether they are actually indexed. </span></p>
<p><span style="font-weight: 400;">To do this, open </span><span style="font-weight: 400;">Screaming Frog &gt; Add the URL of your website &gt; Start crawling. </span></p>
<p><span style="font-weight: 400;">Once the crawl finishes, you will see the columns &#8220;Indexability&#8221; and &#8220;Indexability Status&#8221;:</span></p>
<p><img loading="lazy" class="" src="https://res.cloudinary.com/scuba-dive-argentina/image/upload/c_fill,g_auto,w_848,h_381/f_webp,q_auto/v1639501058/Screen-Shot-2020-07-15-at-15.17.41.png?_i=AA" width="503" height="226" /></p>
<p><span style="font-weight: 400;">Remember that you can also export this list and then work on it in Excel or another spreadsheet tool. </span></p>
<h2><span style="color: #0000ff;"><b>How to improve your indexing?</b></span></h2>
<h3><span style="font-weight: 400;"><span style="color: #0000ff;">Use of Google Search Console for manual indexing</span> </span></h3>
<p><span style="font-weight: 400;">If you have just published a page and you want it to appear in the search engine as soon as possible, you can use Google Search Console to request it.</span></p>
<p><span style="font-weight: 400;">To do so, go to URL Inspection &gt; Enter your URL </span></p>
<p><img src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.19.09-1024x237.png" /></p>
<p><span style="font-weight: 400;">You will get the message &#8220;The URL is not in Google&#8221;. In the bottom right corner, you will have the option to request indexing manually. Click there and wait for the next message:</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.20.29-1024x361.png" width="422" height="149" /></p>
<p><span style="font-weight: 400;">With this, you will have completed the manual indexing request. After that, just wait a while (indexing is not guaranteed to be immediate) and check with the &#8220;site:&#8221; command whether the URL appears in Google. </span></p>
<h3><span style="font-weight: 400; color: #0000ff;">Sitemaps.xml </span></h3>
<p><span style="font-weight: 400;">A sitemap is a file in .xml format that makes it easier for Googlebot to read and crawl the contents of your site. Before implementing it, make sure that all the URLs included in your sitemap return a 200 status code. </span></p>
<p><span style="font-weight: 400;">There are different methods to create and implement one. The most common are the following: </span></p>
<h3><span style="color: #0000ff;"><span style="font-weight: 400;">Sitemap with Screaming Frog</span></span></h3>
<p><span style="font-weight: 400;">This powerful tool can also help you generate your sitemap easily. To do this, crawl your</span><span style="font-weight: 400;"> site, then go to Sitemaps &gt; XML Sitemaps &gt; Next. It will</span><span style="font-weight: 400;"> automatically download a .xml file that you must then upload to your site&#8217;s server via FTP. </span></p>
<p><span style="font-weight: 400;">If you want to know more details, check </span><span style="font-weight: 400;"><a href="https://www.screamingfrog.co.uk/xml-sitemap-generator/">this article</a>.</span></p>
<h3><span style="font-weight: 400; color: #0000ff;">Sitemap for WordPress</span></h3>
<p><span style="font-weight: 400;">If you use WordPress, you can install a plugin that automates the creation of sitemaps on your site. Some are general-purpose SEO plugins, such as Rank Math or Yoast, but you can also use a dedicated one like &#8220;Google Sitemaps XML&#8221;, which is the one I usually use. </span></p>
<p><span style="font-weight: 400;">To download it </span><span style="font-weight: 400;"><a href="https://wordpress.org/plugins/google-sitemap-generator/">click here</a>.</span><span style="font-weight: 400;"> </span></p>
<h3><span style="font-weight: 400; color: #0000ff;">Manual sitemap generation</span></h3>
<p><span style="font-weight: 400;">You can also build your sitemap manually, although this is the longest and most tedious way. </span></p>
<p><span style="font-weight: 400;">To complete it, follow the instructions provided by Google for its generation and implementation: </span><a href="https://support.google.com/webmasters/answer/183668?hl=es"><span style="font-weight: 400;">https://support.google.com/webmasters/answer/183668?hl=es</span></a></p>
<h3><span style="font-weight: 400; color: #0000ff;">Dynamic Sitemaps</span></h3>
<p><span style="font-weight: 400;">Dynamic sitemaps are often used on sites where the published content changes constantly and Google must keep its index up to date. This is especially true for news portals. </span></p>
<p><span style="font-weight: 400;">You can build one using PHP, JavaScript, or Python. If you are not familiar with these languages, we recommend asking a web developer. </span></p>
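<p><span style="font-weight: 400;">In essence, a dynamic sitemap is a script that rebuilds the XML from your current list of URLs each time it runs. The following is only a minimal Python sketch of that idea, not a production implementation: the function name build_sitemap, the example.com URLs and the dates are assumptions made for illustration, and in a real site the list of pages would come from your CMS or database.</span></p>

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:  # lastmod is optional, so skip it when unknown
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with the URLs your site actually publishes.
pages = [
    ("https://example.com/", "2022-05-20"),
    ("https://example.com/blog/", None),
]
print(build_sitemap(pages))
```

<p><span style="font-weight: 400;">Serving this output at a URL such as /sitemap.xml, and regenerating it whenever content is published, is one way to keep the sitemap in sync with the site.</span></p>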
<h3><span style="font-weight: 400; color: #0000ff;">Interlinking</span></h3>
<p><span style="font-weight: 400;">Google discovers content through links (internal or external). If you publish a new page and it does not receive any links, your URL will be &#8220;orphaned&#8221;, and orphaned URLs are harder to get indexed. </span></p>
<p><span style="font-weight: 400;">Therefore, a good recommendation is that once you publish new content, it should receive inbound links from other pages on the same site. </span></p>
<p><span style="font-weight: 400;">A plus would be to link to that URL from your most visited pages or from those with the most backlinks. </span></p>
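<p><span style="font-weight: 400;">To make the idea of an orphaned URL concrete, here is a small Python sketch that looks for pages receiving no internal links. It assumes you already have a map of each page to the pages it links to (for example, from a crawler export); the function name find_orphan_pages and the sample paths are hypothetical.</span></p>

```python
def find_orphan_pages(internal_links, homepage):
    """Return pages that receive no internal links, sorted alphabetically.

    internal_links maps each page URL to the set of URLs it links to.
    """
    linked = set()
    for source, targets in internal_links.items():
        # A page linking to itself does not count as an inbound link.
        linked.update(target for target in targets if target != source)
    # The homepage is the entry point, so it is never reported as orphaned.
    return sorted(
        page for page in internal_links
        if page not in linked and page != homepage
    )


# Hypothetical site structure used only for illustration.
site = {
    "/": {"/blog/", "/services/"},
    "/blog/": {"/", "/blog/indexing/"},
    "/services/": {"/"},
    "/blog/indexing/": {"/blog/"},
    "/blog/new-post/": {"/blog/"},  # links out, but nothing links to it
}
print(find_orphan_pages(site, homepage="/"))  # ['/blog/new-post/']
```

<p><span style="font-weight: 400;">Any URL in the result is a candidate for a new internal link from a related, well-visited page.</span></p>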
<p><span style="font-weight: 400;">Although these are the most important methods, you can also boost indexing through tools that use the Indexing API (limited at the moment) or through backlinks.</span></p>
<h2><span style="color: #0000ff;"><b>How to block and control the indexing of my content?</b></span></h2>
<p><span style="font-weight: 400;">Controlling how your site is indexed is an excellent way to improve your site&#8217;s health and, in turn, support your Google ranking. </span></p>
<p><span style="font-weight: 400;">There are different tools and directives that we can use to make a URL, category, file type or even an entire website disappear from search results: </span></p>
<h3><span style="color: #0000ff;"><span style="font-weight: 400;">Robots.txt</span></span></h3>
<p><span style="font-weight: 400;">The robots.txt file tells crawlers (not only Googlebot) which sections of the site they can and cannot visit. </span></p>
<p><span style="font-weight: 400;">If you want to know more about how this file is put together, see Google&#8217;s official instructions: </span><a href="https://support.google.com/webmasters/answer/6062596?hl=es"><span style="font-weight: 400;">https://support.google.com/webmasters/answer/6062596?hl=es</span></a></p>
<p><span style="font-weight: 400;">To prevent Googlebot from crawling a section of your website, just apply the &#8220;Disallow&#8221; directive. </span></p>
<p><span style="font-weight: 400;">Let&#8217;s imagine that you want to prevent it from crawling the privacy policy. In that case, the directives to include would be the following: </span></p>
<p><span style="font-weight: 400;">User-agent: *</span></p>
<p><span style="font-weight: 400;">Allow: /</span></p>
<p><span style="font-weight: 400;">Disallow: /politica-de-privacidad/</span></p>
<p><span style="font-weight: 400;">Googlebot generally follows these directives, but keep in mind that robots.txt controls crawling, not indexing, so blocking a URL there does not guarantee that it stays out of the search results. </span></p>
<p><span style="font-weight: 400;">A typical case is a blocked URL that receives links from other crawlable and indexable pages: Google can still index that URL, even though it has not crawled its content. </span></p>
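<p><span style="font-weight: 400;">As a rough illustration, a Disallow rule like the one above can be tested locally with Python&#8217;s standard urllib.robotparser module. This is a minimal sketch: the function name is_crawl_allowed and the example.com URLs are assumptions made for the example. Python&#8217;s parser may also resolve overlapping Allow and Disallow rules differently from Googlebot, so the sample keeps only the Disallow line.</span></p>

```python
from urllib.robotparser import RobotFileParser

# Sample rules; the Allow line from the article is omitted on purpose
# because parsers can disagree on how overlapping rules are resolved.
ROBOTS_TXT = """\
User-agent: *
Disallow: /politica-de-privacidad/
"""


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
for url in (
    "https://example.com/politica-de-privacidad/",
    "https://example.com/blog/",
):
    print(url, is_crawl_allowed(ROBOTS_TXT, url))
```

<p><span style="font-weight: 400;">A check like this only tells you what the file says about crawling. It says nothing about whether the URL is indexed.</span></p>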
<h3><span style="font-weight: 400; color: #0000ff;">Noindex directive in the meta robots tag </span></h3>
<p><span style="font-weight: 400;">Just as there are semantic tags such as &lt;h1&gt; or &lt;h2&gt; to contextualize your content, there is also a tag called &#8220;meta robots&#8221; that tells crawlers whether or not to index a page and follow its links. </span></p>
<p><img loading="lazy" class="" src="https://res.cloudinary.com/scuba-dive-argentina/image/upload/c_fill,g_auto,w_848,h_183/f_webp,q_auto/v1639501051/Screen-Shot-2020-07-15-at-15.25.57.png?_i=AA" width="384" height="83" /></p>
<p><span style="font-weight: 400;">The following values can be used in this tag: </span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Index:</b><span style="font-weight: 400;"> the URL can be indexed in the search engine.  </span></li>
<li style="font-weight: 400;" aria-level="1"><b>Noindex:</b><span style="font-weight: 400;"> the URL cannot be indexed in the search engine. </span></li>
<li style="font-weight: 400;" aria-level="1"><b>Follow:</b><span style="font-weight: 400;"> crawlers can follow the links on the page. </span></li>
<li style="font-weight: 400;" aria-level="1"><b>Nofollow:</b><span style="font-weight: 400;"> crawlers should not follow the links on the page. </span></li>
</ul>
<p><span style="font-weight: 400;">If you want your content to be deindexed, your meta robots tag will look like this: </span></p>
<p><span style="font-weight: 400;">&lt;meta name="robots" content="noindex"&gt;</span></p>
<p><span style="font-weight: 400;">This code snippet must be placed inside the &lt;head&gt; section of the page you want to modify. </span></p>
<p><span style="font-weight: 400;">The only thing to keep in mind: be careful not to apply it to the whole site!</span></p>
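<p><span style="font-weight: 400;">One way to guard against applying noindex by accident is to check a page&#8217;s HTML for the directive. The Python sketch below uses only the standard html.parser module. The names MetaRobotsParser and is_indexable are hypothetical, and the check is deliberately simplified: it reads only the generic &#8220;robots&#8221; meta tag and ignores crawler-specific tags and HTTP headers.</span></p>

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives found in meta tags whose name is "robots"."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_indexable(html):
    """Return False when the page carries a noindex directive."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" not in parser.directives


# Hypothetical page source used only for illustration.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_indexable(page))  # False
```

<p><span style="font-weight: 400;">Running a check like this over your key pages after a deployment can catch a noindex that was applied more widely than intended.</span></p>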
<h3><span style="font-weight: 400; color: #0000ff;">URL removal tool</span></h3>
<p><span style="font-weight: 400;">Search Console once again allows us to take control and deindex content in a very practical and fast way. It also allows us to do it in bulk mode to remove hundreds of pages at the same time (in case you have a lot of URLs on your website).</span></p>
<p><span style="font-weight: 400;">To do this go to the left </span><span style="font-weight: 400;">sidebar &gt; Remove URLs &gt; New request &gt; Enter the URL you want to remove and click &#8220;Next&#8221;. </span></p>
<p><span style="font-weight: 400;">In case you want to remove an entire directory, you can set the option &#8220;Remove all URLs containing this prefix&#8221;, so you can remove a whole segment of your website in one click.</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2020/07/Screen-Shot-2020-07-15-at-15.27.22-1024x855.png" width="379" height="316" /></p>
<h2><span style="color: #0000ff;"><b>Final Recommendations</b></span></h2>
<p><span style="font-weight: 400;">Keeping track of your coverage in Google Search Console, looking at how many URLs are indexed in Google, and determining which pages you want to appear or not will be fundamental to controlling your indexing. </span></p>
<p><span style="font-weight: 400;">For media sites (news portals), optimizing sitemaps to improve content indexing is vital. In that case, you should implement a sitemap optimized for Google News, and I also recommend removing sitemaps from previous years to improve the frequency of crawling and indexing.</span></p>
<p><span style="font-weight: 400;">And you, how do you control your indexing? Do you have any new methods for it?</span></p>
<p><span style="font-weight: 400;">If so, leave them in the comments!</span></p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
