<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data &#8211; SEO Blog Punto Rojo</title>
	<atom:link href="https://puntorojo.com/blog/en/data/feed/" rel="self" type="application/rss+xml" />
	<link>https://puntorojo.com/blog/en</link>
	<description></description>
	<lastBuildDate>Fri, 18 Nov 2022 13:57:48 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.8.2</generator>
	<item>
		<title>How to categorize your URLs, and your competitors’</title>
		<link>https://puntorojo.com/blog/en/how-to-categorize-your-urls-and-your-competitors/</link>
		
		<dc:creator><![CDATA[Dario Manoukian]]></dc:creator>
		<pubDate>Thu, 14 Jul 2022 15:39:50 +0000</pubDate>
				<category><![CDATA[digital marketing]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[insights]]></category>
		<guid isPermaLink="false">https://puntorojo.com/blog/en/?p=6840</guid>

					<description><![CDATA[<p>One of the biggest challenges for an SEO always revolves around value, whether it&#8217;s how to show a client the value of the work done or how to extract valuable insights from our data. In today&#8217;s post, we&#8217;re going to look at how to get a little better at the second point. Google Search Console&#8230;</p>
<p>The post <a rel="nofollow" href="https://puntorojo.com/blog/en/how-to-categorize-your-urls-and-your-competitors/">How to categorize your URLs, and your competitors’</a> first appeared on <a rel="nofollow" href="https://puntorojo.com/blog/en">SEO Blog Punto Rojo</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><span style="font-weight: 400;">One of the biggest challenges for an SEO always revolves around value, whether it&#8217;s how to show a client the value of the work done or how to extract valuable insights from our data. In today&#8217;s post, we&#8217;re going to look at how to get a little better at the second point.</span></p>
<p><span style="font-weight: 400;">Google Search Console allows us to easily download the list of URLs that bring the most clicks to our page. Now, what do we do with this data? As we have been seeing for a few months now, at puntorojo we like to extract as many insights as possible from our data. An insight is any information that allows us to make a decision.</span></p>
<p><span style="font-weight: 400;">Let’s start!</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><b>Map the site to be analyzed: </b><span style="font-weight: 400;">This time we are going to use a mix of tools to find out which topics are central to the blogs of Ahrefs, Semrush, and Moz. To do this, we start by crawling the blogs with a crawler (in this example we use Screaming Frog).</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Filter by URL type: </b><span style="font-weight: 400;">Once we have the list of URLs, we filter it down to HTML pages so that no other type of content (e.g. image URLs) is picked up.</span><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/09/Screen-Shot-2019-09-04-at-11.30.08-1024x547.png" width="594" height="317" /></li>
<li style="font-weight: 400;" aria-level="1"><b>Filter by subdirectory: </b><span style="font-weight: 400;">The next step is to filter the URLs to show only those that belong to the blog. A simple way to do this is with the filter at the top right.<img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/09/Screen-Shot-2019-09-04-at-11.31.25-1024x320.png" width="599" height="187" /> </span><span style="font-weight: 400;"><b>Advanced tip: </b>you can filter by regular expressions, not only by text strings.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Extract the URLs that return 200: </b><span style="font-weight: 400;">Now we are going to sort the results by response code. Since we are going to work with URL slugs, we don&#8217;t want to pollute our analysis with old URLs.<img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/09/Screen-Shot-2019-09-04-at-11.32.19-1024x321.png" width="603" height="189" /></span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">For example, there is a slug that changed from &#8220;how-to-build-links-with-guest-articles&#8221; to &#8220;guest-blogging&#8221;. This points to a deliberate optimization (or a change of strategy): the focus moved from the keyword &#8220;guest articles&#8221; to the keyword &#8220;guest blogging&#8221;. As we are interested in the current strategy, we look at the final URL after the redirect rather than the discarded one.<img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/09/Screen-Shot-2019-09-04-at-11.33.49-1024x400.png" width="601" height="235" /></span></li>
<li style="font-weight: 400;" aria-level="1"><b>Save the URLs:</b><span style="font-weight: 400;"> In Google Sheets we are going to dump all URLs that return 200.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Filter by URLs that redirect: </b><span style="font-weight: 400;">Since we also want to know the URLs that were modified, inside the Response Codes tab, we filter by the blog pages that redirect.<img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/09/Screen-Shot-2019-09-04-at-11.34.55-1024x447.png" width="600" height="262" /></span></li>
<li style="font-weight: 400;" aria-level="1"><b>Save the redirection URLs: </b><span style="font-weight: 400;">As in step 5, we are going to save the URLs that interest us in the same Google Sheets, but now we are going to look at the Redirect URL column since this is the one that indicates the page that is displayed after the redirection is executed.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Remove repeated URLs: </b><span style="font-weight: 400;">Once in Google Sheets, we remove the repeated URLs with the Remove duplicates option in the Data menu.</span></li>
</ol>
<p>9.<b> Remove irrelevant URLs:</b><span style="font-weight: 400;"> There are likely to be URLs that we don&#8217;t want in the listing such as blog home, paginations, author URLs, posts in other languages, and URLs that redirect and now point to external sites, etc. Clean these URLs manually or via a Sheets function if you find a repeating pattern. </span></p>
<p><b>Note</b>: In this case, we are going to remove category URLs, since we are interested in building our own categorization and not necessarily sticking to the one chosen by the blog creators. For example, the category &#8220;technical SEO&#8221; is too broad, and we will want to segment it into narrower topics (page speed, machine learning, IT, etc.).</p>
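<p>If the list is large, the cleanup from steps 8 and 9 can also be done outside Sheets. What follows is only a minimal Python sketch: the URL patterns, the <code>clean_urls</code> helper and the example.com URLs are assumptions made for illustration, so adjust them to the structure of the site you are analyzing.</p>

```python
import re

# Hypothetical patterns for URLs to drop: pagination, author and category pages.
# Adjust them to the URL structure of the site being analyzed.
IRRELEVANT_PATTERNS = [
    r"/page/\d+/?$",
    r"/author/",
    r"/category/",
]


def clean_urls(urls, blog_prefix):
    """Return unique blog URLs, in their original order, minus irrelevant ones."""
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        # Drop external URLs and the blog home itself
        if not url.startswith(blog_prefix) or url.rstrip("/") == blog_prefix.rstrip("/"):
            continue
        # Drop URLs that match any irrelevant pattern
        if any(re.search(pattern, url) for pattern in IRRELEVANT_PATTERNS):
            continue
        # Drop duplicates
        if url in seen:
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Made-up sample list
urls = [
    "https://example.com/blog/",
    "https://example.com/blog/guest-blogging/",
    "https://example.com/blog/guest-blogging/",
    "https://example.com/blog/page/2/",
    "https://example.com/blog/author/jane/",
    "https://other-site.com/some-post/",
    "https://example.com/blog/link-building-guide/",
]
print(clean_urls(urls, "https://example.com/blog/"))
```

<p>Only the two post URLs survive: the blog home, the pagination and author pages, the external URL and the duplicate are all dropped.</p>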
<p>10. <b>Repeat with other benchmarks:</b><span style="font-weight: 400;"> Once our list is ready, we are going to do the same with all our competitors. In the Sheets spreadsheet, we are going to add a column that shows the name of the site we are benchmarking so we can then plot it in different series.</span></p>
<p>11. <b>Write categorization UDF</b><span style="font-weight: 400;">: To automatically categorize the information we extracted, we are going to use the Google Sheets script editor to create a UDF (user-defined function).</span></p>
<p>12. <b>Select categorization keywords: </b><span style="font-weight: 400;">To get the most out of each slug, we have to put together different lists (technically, arrays) of keywords that identify the content of a slug unambiguously. Once the arrays are complete, we loop through each of them and check whether each keyword appears in the slug being analyzed. If it does, the name of the category is written in the cell. </span></p>
<p><b>Example script here:</b></p>
<pre>function categorize(slug) {
  // One keyword list per category: a slug belongs to a category
  // when it contains at least one of that category's keywords
  var categories = {
    tutorials: ["tutorial", "how-to", "guide", "step-by-step"],
    links: ["link", "anchor"],
    content: ["content", "text", "copywriting", "post", "blog"],
    technical: [
      "core", "rankbrain", "penguin", "panda", "hummingbird", "diversity", "unnamed", "medic",
      "maccabees", "fred", "possum", "mobilegeddon", "pigeon", "everflux", "pagerank", "page-rank",
      "sandbox", "penalty", "penalties", "google-bot", "googlebot", "mobile", "phone", "301", "200",
      "404", "403", "503", "302", "tag", "robots", "html", "javascript", "php", "keyword", "key-phrase",
      "domain", "directory", "htaccess", "redirect", "tool"
    ]
  };

  var normalized = String(slug).toLowerCase();
  var output = [];

  // Loop through each category and each of its keywords
  for (var name in categories) {
    var keywords = categories[name];
    for (var i = 0; i &lt; keywords.length; i++) {
      if (normalized.indexOf(keywords[i]) &gt;= 0) {
        output.push(name);
        break; // one match is enough, which also avoids duplicates
      }
    }
  }

  // If it does not find a category, it categorizes the slug as 'other'
  if (output.length === 0) {
    return "other";
  }

  // Sort the categories alphabetically and return them separated by commas
  return output.sort().join(",");
}</pre>
<p>13. <b>Call the function: </b>Now that the script is ready, we call the function from the spreadsheet and check the output. For example, if the slug is in cell B2, enter =categorize(B2) in the cell next to it.</p>
<p><b>Pro Tips:</b></p>
<p>&#8211; Don&#8217;t paste the formula across your entire list of slugs at once: if you run too many calls, the Google server that runs the script may block your requests for a couple of minutes, since running this function has a cost for Google.</p>
<p><span style="font-weight: 400;">&#8211; You should copy and paste about 500 rows, wait a few seconds and run 500 more rows&#8230; until you finish all your URLs.</span></p>
<p><span style="font-weight: 400;">&#8211; Be sure to copy and paste the result of the script as a value so that the function does not run again each time you load the spreadsheet.</span></p>
<p><span style="font-weight: 400;">&#8211;</span><span style="font-weight: 400;"> Do not copy and paste as a value until you have finished running the function on all the rows.</span></p>
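<p>If the list is too long to process comfortably in batches of 500 rows, the same matching logic can be run locally. This is a minimal Python sketch, not the script we run in Sheets: the keyword lists are shortened for illustration (extend them with the full lists from the script above), and the sample slugs are made up.</p>

```python
# Shortened keyword lists for illustration; extend them with the full lists
# from the Sheets script.
CATEGORIES = {
    "tutorials": ["tutorial", "how-to", "guide", "step-by-step"],
    "links": ["link", "anchor"],
    "content": ["content", "text", "copywriting", "post", "blog"],
    "technical": ["redirect", "robots", "javascript", "pagerank", "mobile"],
}


def categorize(slug):
    """Return the sorted, comma-separated categories whose keywords appear in the slug."""
    slug = slug.lower()
    matches = sorted(
        name
        for name, keywords in CATEGORIES.items()
        if any(keyword in slug for keyword in keywords)
    )
    return ",".join(matches) if matches else "other"


# Made-up slugs
for slug in ["how-to-build-links", "guest-blogging", "keyword-cannibalization"]:
    print(slug, "->", categorize(slug))
```

<p>With these shortened lists, the three slugs come out as &#8220;links,tutorials&#8221;, &#8220;content&#8221; and &#8220;other&#8221;, respectively.</p>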
<p>14.<b> Split outputs: </b><span style="font-weight: 400;">Next, we separate the data into columns, since some entries are likely to have more than one category.</span></p>
<p><b>Note:</b><span style="font-weight: 400;"> In this example, we hide the column with the slugs to make it easier to work with the data.</span></p>
<p>15. <b>List the benchmarked sites: </b><span style="font-weight: 400;">Using the UNIQUE function, we list all the competing sites we are considering for the report.</span></p>
<p>16. <b>Count total values per site: </b><span style="font-weight: 400;">To automate the spreadsheet a bit, let&#8217;s count how many entries were collected for each site using the following formula:</span></p>
<p><span style="font-weight: 400;">=COUNTIF(A:A,G2)</span></p>
<p>17. <b>Count unique values: </b><span style="font-weight: 400;">Making use of the total number (and considering that the first row is the header), we put together the following formula to count how many times each term appears for each competing site:</span></p>
<p><span style="font-weight: 400;">=COUNTIF($C2:$F211,"="&amp;I$1)</span></p>
<p><b>Note: </b><span style="font-weight: 400;">Don&#8217;t forget to adjust the range used in the count to each site&#8217;s rows, since rows 2 to 211 correspond only to Ahrefs.</span></p>
<p>18. <b>Calculate percentages: </b><span style="font-weight: 400;">We are almost there! But Ahrefs has 210 blog posts, so we can&#8217;t really compare it with SEMrush, which has 7 times more (and Moz even more). So let&#8217;s update our formula to show each category as a percentage of the site&#8217;s total instead of an absolute count. The formula now looks like this:</span></p>
<p><span style="font-weight: 400;">=COUNTIF($C2:$F211,"="&amp;I$1)/$H2</span></p>
<p><b>Note: </b><span style="font-weight: 400;">Be sure to update only column I and then drag I2, I3 and I4 to M2, M3 and M4, respectively.</span></p>
<p>19.<b> Format as a percentage: </b><span style="font-weight: 400;">To do things neatly, let&#8217;s convert all the values in I2:M4 to percentages by selecting the range of cells and then clicking the percentage format button.</span></p>
<p><b>Note: </b><span style="font-weight: 400;">Keep in mind that the percentages can add up to more than 100%, since a post can have more than one category (for example, content and tutorials).</span></p>
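<p>To see why the percentages can exceed 100%, here is a small Python sketch of the calculation from steps 16 to 18. The rows, the site names and the <code>category_shares</code> helper are made up for illustration; the point is that each post counts once toward the site total but once per category it belongs to.</p>

```python
from collections import Counter, defaultdict

# Made-up rows for illustration: (site, categories returned by the script)
rows = [
    ("site-a", "links,tutorials"),
    ("site-a", "links"),
    ("site-a", "other"),
    ("site-a", "content"),
    ("site-b", "content,tutorials"),
    ("site-b", "technical"),
]


def category_shares(rows):
    """Return {site: {category: share of that site's posts}}."""
    totals = Counter()
    counts = defaultdict(Counter)
    for site, categories in rows:
        totals[site] += 1  # each post counts once toward the site total
        for category in categories.split(","):
            counts[site][category] += 1  # but once per category it belongs to
    return {
        site: {cat: n / totals[site] for cat, n in cats.items()}
        for site, cats in counts.items()
    }


shares = category_shares(rows)
for site, cats in shares.items():
    print(site, {cat: f"{share:.0%}" for cat, share in cats.items()})
```

<p>In this sample, site-a has four posts but five category assignments, so its shares add up to 125%.</p>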
<p>20. <b>Set the data: </b><span style="font-weight: 400;">To simplify things, we copy all the percentages and paste them as a value. Next, we delete the totals column (column H, in our case).</span></p>
<p>21. <b>Graphing: </b><span style="font-weight: 400;">The moment of truth has arrived! We select our entire dashboard and then generate the graph.</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;"><br />
</span><b>Final result</b></p>
<p><span style="font-weight: 400;">In conclusion, we draw the following insights&#8230;</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">SEMrush has the most tutorials</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Ahrefs is the one that talks the most about links (obviously)</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">SEMrush is the blog most oriented to content generation</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">SEMrush is the blog with the most technical content</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Moz is the blog with the most varied topics among the three.</span></li>
</ul>
<p><b>Note</b><span style="font-weight: 400;">: If we want more insights, we should add more categories and filters to our script to reduce the number of blog posts that fall into &#8220;other&#8221;.</span></p>
<p><span style="font-weight: 400;">Well, this tutorial ran a bit long, but we didn&#8217;t want to miss the chance to share it with you. We hope you can find SEO value in categorizing URLs (either your own or those of your competitors). As always, if you have any questions, leave us a comment below.</span></p>
<p><span style="font-weight: 400;">Until next time and good rankings!</span></p>
<p>The post <a rel="nofollow" href="https://puntorojo.com/blog/en/how-to-categorize-your-urls-and-your-competitors/">How to categorize your URLs, and your competitors’</a> first appeared on <a rel="nofollow" href="https://puntorojo.com/blog/en">SEO Blog Punto Rojo</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Data Science: How to extract and process data for decision making in your SEO strategy</title>
		<link>https://puntorojo.com/blog/en/data-science-how-to-extract-and-process-data-for-decision-making-in-your-seo-strategy/</link>
		
		<dc:creator><![CDATA[Dario Manoukian]]></dc:creator>
		<pubDate>Fri, 20 May 2022 15:01:49 +0000</pubDate>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Data Analytics]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://puntorojo.com/blog/en/?p=6677</guid>

					<description><![CDATA[<p>Data Analytics is the science of analyzing raw data and extracting from it patterns and insights that add value to decision making. By raw data, we mean unparsed, unprocessed, and unformatted information. Many algorithms sift through the information for you and return a complete report with insights into what was analyzed. To put it bluntly,&#8230;</p>
<p>The post <a rel="nofollow" href="https://puntorojo.com/blog/en/data-science-how-to-extract-and-process-data-for-decision-making-in-your-seo-strategy/">Data Science: How to extract and process data for decision making in your SEO strategy</a> first appeared on <a rel="nofollow" href="https://puntorojo.com/blog/en">SEO Blog Punto Rojo</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><b>Data Analytics</b><span style="font-weight: 400;"> is the science of analyzing raw data and extracting from it patterns and insights that add value to decision making. By raw data, we mean unparsed, unprocessed, and unformatted information.</span></p>
<p><span style="font-weight: 400;">Many algorithms sift through the information for you and return a complete report with insights into what was analyzed. To put it bluntly, Data Science is the use of technology to analyze data. In this post, we are going to see how you can start extracting data from your keywords using Python.</span></p>
<p><span style="font-weight: 400;">Before we start, let’s define some terms:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Data:</b><span style="font-weight: 400;"> The information collected</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Analytics:</b><span style="font-weight: 400;"> The data analysis</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Insights:</b><span style="font-weight: 400;"> What you learn after analyzing the data</span></li>
</ul>
<p><span style="font-weight: 400;">In order to bring some added value to the organic SEO industry, I&#8217;m going to stay away from Google Analytics and similar tools in this post as there are thousands of posts written about the topic by people more qualified than me.</span></p>
<p><span style="font-weight: 400;">We are going to use the following tools:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Google Analytics (I know I said I won’t use it, but we are going to extract the data from here in this example)</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Google Chrome Console</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Notepad (or equivalent)</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Python</span></li>
</ul>
<h2><b>How to extract search term data from Google Analytics</b></h2>
<ol>
<li><span style="font-weight: 400;">Open Google Analytics in Google Chrome</span></li>
<li><span style="font-weight: 400;">In Google Analytics, go to</span><span style="font-weight: 400;"> Acquisition &gt; Search Console &gt; Queries</span></li>
<li><span style="font-weight: 400;">Filter view to show 5000 rows</span></li>
<li><span style="font-weight: 400;">Open </span><span style="font-weight: 400;">Google Chrome inspector &gt; &#8220;Elements&#8221; tab</span></li>
<li><span style="font-weight: 400;">Navigate to one of the rows that display Google Search Console information to make sure it is in the inspector cache. If this step is not done correctly, keywords may not appear when running the script.</span></li>
<li><span style="font-weight: 400;">Go to the &#8220;console&#8221; tab</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Press Ctrl + L to clear the console</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Paste the following script and press ENTER.<br />
</span><span style="font-weight: 400;">for(var kwds=document.getElementsByClassName("_GApu"),clicks=document.getElementsByClassName("_GAtjb"),total=kwds.length,content='&lt;table align="center"&gt;&lt;tr&gt;&lt;th&gt;Keywords&lt;/th&gt;&lt;th&gt;Clicks&lt;/th&gt;&lt;th&gt;Impressions&lt;/th&gt;&lt;th&gt;CTR&lt;/th&gt;&lt;th&gt;Ranking&lt;/th&gt;&lt;/tr&gt;',i=0;i&lt;total;i++){var current_kwd=kwds[i].innerHTML,current_clicks=clicks[i].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_impresiones=clicks[i+1].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_ctr=clicks[i+2].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_posicion_media=clicks[i+3].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim();i+=3;var row="&lt;td&gt;"+current_kwd+'&lt;/td&gt;&lt;td align="center"&gt;'+current_clicks+'&lt;/td&gt;&lt;td align="center"&gt;'+current_impresiones+'&lt;/td&gt;&lt;td align="center"&gt;'+current_ctr+'&lt;/td&gt;&lt;td align="center"&gt;'+current_posicion_media+"&lt;/td&gt;";content+="&lt;tr style=\"font-family: 'Open Sans'\"&gt;"+row+"&lt;/tr&gt;"}with(output='&lt;html&gt;&lt;head&gt;&lt;title&gt;Punto Rojo Tools &#8211; Google Analytics Keyword Extractor&lt;/title&gt;&lt;link rel="stylesheet" type="text/css" href="https://fonts.googleapis.com/css?family=Open+Sans" /&gt;&lt;/head&gt;&lt;body&gt;',output+='&lt;table align="center" width="50%"&gt;&lt;tr&gt;&lt;td align="left"&gt;&lt;img src="https://puntorojo.com/blog/wp-content/themes/punto_rojo/img/logo_mini.png" width="200px"&gt;&lt;/td&gt;&lt;td align="right"&gt;&lt;h1 style="font-family: \'Open Sans\'"&gt;Google Analytics Keyword Extractor&lt;/h1&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br&gt;&lt;br&gt;',output+=content,output+="&lt;/table&gt;&lt;/body&gt;&lt;/html&gt;",window.open())document.write(output),document.close();</span></li>
<li><span style="font-weight: 400;">If you prefer to see the raw data without formatting, ready to export to a CSV, paste the following code instead:<br />
</span><span style="font-weight: 400;">for(var kwds=document.getElementsByClassName("_GApu"),clicks=document.getElementsByClassName("_GAtjb"),total=kwds.length,content="Keywords;Clicks;Impressions;CTR;Ranking&lt;br&gt;",i=0;i&lt;total;i++){var current_kwd=kwds[i].innerHTML,current_clicks=clicks[i].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_impresiones=clicks[i+1].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_ctr=clicks[i+2].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim(),current_posicion_media=clicks[i+3].innerText.split("(")[0].replace("%","").replace(".","").replace(",",".").trim();i+=3;var row=current_kwd+";"+current_clicks+";"+current_impresiones+";"+current_ctr+";"+current_posicion_media+"&lt;br&gt;";content+=row}var output="&lt;html&gt;";with(output+=content,output+="&lt;/html&gt;",window.open())document.write(output),document.close();</span></li>
<li><span style="font-weight: 400;">Copy keywords, paste them into Notepad</span></li>
<li><span style="font-weight: 400;">Add &#8216;Keywords&#8217; to the first row of the txt (without quotation marks and with capital K)</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Save as &#8216;keywords.csv&#8217;</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">If you want to save the script for future reference, follow the steps below:</span>
<ol style="list-style-type: lower-alpha;">
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">In the Google Chrome inspector, go to th</span><span style="font-weight: 400;">e &#8216;Sources&#8217; tab</span><span style="font-weight: 400;">.</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">In the menu on the left, click on the</span><span style="font-weight: 400;"> &#8216;Snippets&#8217; tab (if it does not appear, click on the little arrow on the right)</span><span style="font-weight: 400;">.</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">Click on &#8216;+ New Snippet&#8217;.</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">Name the script &#8216;Google Analytics &#8211; Keyword Extractor&#8217; or similar.</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">Paste the code in the window next to it</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">Press Ctrl + S to save (or its Mac equivalent)</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">That&#8217;s it! Each time you want to run it, double click on the snippet name in the left window of </span><span style="font-weight: 400;">Sources &gt; Snippets</span></li>
</ol>
</li>
</ol>
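<p>The chain of <code>split</code> and <code>replace</code> calls in the scripts above cleans each number before it is written out. Here is the same idea as a short Python sketch, assuming the report shows numbers with a dot as the thousands separator and a comma as the decimal separator (which is what the replacements suggest). The <code>normalize_number</code> helper and the sample values are made up for illustration.</p>

```python
def normalize_number(text):
    """Convert a value such as '1.234 (5,2%)' or '3,5%' to a float."""
    value = text.split("(")[0]       # drop the share shown in parentheses, if any
    value = value.replace("%", "")   # drop the percent sign
    value = value.replace(".", "")   # drop thousands separators
    value = value.replace(",", ".")  # decimal comma becomes a decimal point
    return float(value.strip())


# Made-up sample values
for raw in ["1.234 (5,2%)", "3,5%", "12,7"]:
    print(raw, "->", normalize_number(raw))
```

<p>If your report uses the opposite convention (comma for thousands, dot for decimals), the two separator replacements have to be swapped accordingly.</p>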
<h2><b>How to manipulate data with Python</b></h2>
<ol>
<li style="font-weight: 400;" aria-level="1"><a href="https://lmgtfy.es/?q=c%C3%B3mo+instalar+python+3"><span style="font-weight: 400;">Install Python3</span></a></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Install modules:</span>
<ol>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">pandas</span></li>
<li style="font-weight: 400;" aria-level="2"><span style="font-weight: 400;">matplotlib</span></li>
</ol>
</li>
</ol>
<p><span style="font-weight: 400;">Now we are going to see how to load data from a CSV in Python and how to manipulate it. For practicality, the code to read the CSV will be shown only once, but it must always be in the script.</span></p>
<p><span style="font-weight: 400;">Load the data with the following snippet:</span></p>
<p><span style="font-weight: 400;">import pandas as pd</span></p>
<p><span style="font-weight: 400;">import matplotlib.pyplot as plt</span></p>
<p><span style="font-weight: 400;">data = pd.read_csv('keywords.csv', delimiter=';', encoding='ISO-8859-1')</span></p>
<p><span style="color: #ff0000;"><b>Show the first 10 rows of the CSV:</b></span></p>
<p><span style="font-weight: 400;">print(data.head(10))</span></p>
<p><span style="color: #ff0000;"><b>Count the number of rows:</b></span></p>
<p><span style="font-weight: 400;">print(len(data))</span></p>
<p><span style="color: #ff0000;"><b>Count the number of rows 2:</b></span></p>
<p><span style="font-weight: 400;">print("number of rows: %i" % len(data))</span></p>
<p><span style="font-weight: 400;">#The %i indicates that an integer variable goes there.</span></p>
<p><span style="font-weight: 400;">#After the closing double quote, the % operator supplies the value that replaces %i</span></p>
<p><span style="color: #ff0000;"><b>How to slice the data:</b></span></p>
<p><span style="font-weight: 400;">print(data[:10]) #displays the first 10 rows</span></p>
<p><span style="font-weight: 400;">print(data[5:]) #displays all but the first 5 rows</span></p>
<p><span style="font-weight: 400;">print(data[-3:]) #displays the last 3 rows</span></p>
<p><span style="font-weight: 400;">print(data[:-2]) #displays all but the last 2 rows</span></p>
<p><span style="font-weight: 400;">print(data[-5:-2]) #displays from the 5th-to-last row up to, but not including, the 2nd-to-last row</span></p>
<p><span style="color: #ff0000;"><b>Change all keywords to lowercase:</b></span></p>
<p><span style="font-weight: 400;">data['Keywords'] = data['Keywords'].str.lower()</span></p>
<p><span style="font-weight: 400;">#In plain words: the Keywords column is now the same column, but in lower case.</span></p>
<p><span style="color: #ff0000;"><b>How to get the average CTR per ranking position</b></span></p>
<p><span style="font-weight: 400;">As a demonstration, let&#8217;s take a look at some of what can be done to get SEO insights with Python.</span></p>
<p><span style="font-weight: 400;">Using the code above we are going to:</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Load pandas and matplotlib.pyplot modules</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Load the dataset with the search terms and dump them into a variable called &#8220;data&#8221;.</span></li>
</ol>
<p><span style="font-weight: 400;">The data is now in what is called a Pandas DataFrame. Pandas is a Python module that allows you to manipulate data as if it were a spreadsheet, except that it does not render the content until you ask it to. This saves system resources: imagine loading millions of rows, clicking on a cell, rendering millions of rows again, and so on&#8230; it is not practical and that is why we use Pandas.</span></p>
<p><span style="font-weight: 400;">Pandas works with columns and rows, just like Excel or Google Sheets. In these examples, our DataFrame is going to be called &#8220;data&#8221;.</span></p>
<p><span style="color: #ff0000;"><b>How to see the columns of a DataFrame in Pandas:</b></span></p>
<p><span style="font-weight: 400;">print(data.columns)</span></p>
<p><span style="color: #ff0000;"><b>How to see the content of a column in Pandas:</b></span></p>
<p><span style="font-weight: 400;">print(data['CTR']) #the column is called CTR</span></p>
<p><span style="color: #ff0000;"><b>How to average the values of a column in Pandas:</b></span></p>
<p><span style="font-weight: 400;">print(data['CTR'].mean())</span></p>
<p><span style="color: #ff0000;"><b>How to obtain the average of a column in Pandas grouped according to another column:</b></span></p>
<p><span style="font-weight: 400;">average = data.groupby(['Ranking'])['CTR'].mean()</span></p>
<p><span style="font-weight: 400;">print(average)</span></p>
<p><b>The output should be similar to this (varies depending on the data)</b></p>
<p><img loading="lazy" class="" src="https://res.cloudinary.com/scuba-dive-argentina/image/upload/c_scale,w_760,h_1094/f_webp,q_auto/v1639602909/imagen-1_4631e2528.jpg?_i=AA" width="308" height="443" /></p>
<p><span style="color: #ff0000;"><b>How to plot the average of a column in Pandas with Matplotlib:</b></span></p>
<p>#continued from previous script</p>
<p>average.plot(kind='bar')</p>
<p>plt.title('Average CTR per ranking')</p>
<p>plt.xlabel('Ranking')</p>
<p>plt.ylabel('CTR')</p>
<p>plt.show()</p>
<p><b>The output should look like this (varies according to data)</b></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/05/imagen-2-1024x575.jpg" width="658" height="369" /></p>
<p><span style="color: #ff0000;"><b>To plot only the positions from 1 to 10, it is necessary to reduce the DataFrame before drawing the averages.</b></span></p>
<p>df = data[data['Ranking'] &lt;= 10]</p>
<p>average = df.groupby(['Ranking'])['CTR'].mean()</p>
<p><span style="color: #ff0000;"><b>If we want to round the rankings and change the type of data displayed in the column so that they do not show decimals (rounding would treat, for example, a rank 1.9 as a rank 2):</b></span></p>
<p>df = data[data['Ranking'] &lt;= 10].copy()</p>
<p>df['Ranking'] = df['Ranking'].round().astype(int)</p>
<p>average = df.groupby(['Ranking'])['CTR'].mean()</p>
<p><img loading="lazy" class="" src="https://res.cloudinary.com/scuba-dive-argentina/image/upload/c_fill,g_auto,w_778,h_666/f_webp,q_auto/v1639602903/imagen-3_4633e258e.jpg?_i=AA" width="221" height="189" /></p>
<p><span style="color: #ff0000;"><b>And if we plot that with the plotting lines shown above, plus this line to show the X-axis numbers horizontally:</b></span></p>
<p><span style="color: #000000;">plt.xticks(rotation='horizontal')</span></p>
<p><img loading="lazy" class="" src="https://puntorojo.com/blog/wp-content/uploads/2019/05/imagen-4-1024x860.jpg" width="360" height="303" /></p>
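<p>Putting the steps above together, the whole calculation fits in a few lines. The sketch below is self-contained so it can be run without the CSV: the sample rows are made up and only mimic the semicolon-separated layout of keywords.csv.</p>

```python
import io

import pandas as pd

# Made-up sample in the same semicolon-separated layout as keywords.csv
CSV_TEXT = """Keywords;Clicks;Impressions;CTR;Ranking
SEO Agency;120;1500;8.0;1.4
seo audit;45;900;5.0;2.6
Link Building;30;1000;3.0;3.2
keyword research;10;1000;1.0;9.8
seo blog;2;400;0.5;14.3
"""

data = pd.read_csv(io.StringIO(CSV_TEXT), delimiter=";")
data["Keywords"] = data["Keywords"].str.lower()

# Keep positions 1 to 10, then round the ranking to whole positions
top = data[data["Ranking"].le(10)].copy()
top["Ranking"] = top["Ranking"].round().astype(int)

# Average CTR per rounded position
average = top.groupby("Ranking")["CTR"].mean()
print(average)
```

<p>Here <code>.le(10)</code> is just the method form of the &#8220;less than or equal to 10&#8221; comparison used earlier, and <code>.copy()</code> makes sure the rounding is applied to an independent DataFrame rather than to a view of the original.</p>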
<p>And that&#8217;s it! This is a first step into <strong>Data Science for SEO</strong>. Many more graphs can be drawn from this data, and even more once you start deriving new data from it with feature engineering. One example would be to look at rankings according to the number of words in a search term, comparing short-, medium-, and long-tail keywords.</p>
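<p>As a taste of that kind of feature engineering, here is a minimal sketch that derives the number of words in each search term and averages the CTR per word count. The keywords, the CTR values and the <code>Words</code> column name are made up for illustration.</p>

```python
import pandas as pd

# Made-up sample data
data = pd.DataFrame({
    "Keywords": ["seo", "seo agency", "seo audit", "how to do keyword research", "best seo agency"],
    "CTR": [2.0, 6.0, 4.0, 9.0, 7.0],
})

# Feature engineering: derive the number of words in each search term
data["Words"] = data["Keywords"].str.split().str.len()

# Average CTR per number of words
average_by_words = data.groupby("Words")["CTR"].mean()
print(average_by_words)
```

<p>The resulting series can be plotted with the same plotting lines used for the ranking chart.</p>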
<p>What programming language do you use to extract graphs and insights from your data? Let us know below so we can shape future posts. Until next time and good rankings!</p>
<p>The post <a rel="nofollow" href="https://puntorojo.com/blog/en/data-science-how-to-extract-and-process-data-for-decision-making-in-your-seo-strategy/">Data Science: How to extract and process data for decision making in your SEO strategy</a> first appeared on <a rel="nofollow" href="https://puntorojo.com/blog/en">SEO Blog Punto Rojo</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
