Blog Layout

Scraping Websites with Google Sheets: Power SEO & Competitive Research Tool

Roy Bielewicz

You can use Google Sheets for SEO audits, competitive research, and more!

Have you ever wanted a quick, convenient, low cost way to pull content from a website? Or how about doing a quick audit for SEO, where you can look at H1 or title tags? Well, you can do all of that and more with Google Sheets!


Google Sheets has a number of functions that will allow you to import structured content from a webpage. That structured data can be anything from a title tag, meta description, H1 or H2 tag, prices, reviews, or even tables of data. We'll take a look at some of those that you'll find most useful.


IMPORTXML Function


The IMPORTXML function can pull structured data from a number of sources, including HTML, XML, CSV, TSV, and RSS feeds.


Just like any spreadsheet formula, the function itself is pretty straightforward. You'll just need:


  • A reference URL (or multiple URLs) from which you want to pull data or content.
  • And the "XPath" of the element in which the data is contained or accessible. (We'll show you what this means below.)


IMPORT function in Google Sheets

For instance, to scrape a page's H1 tag (or to see if it has one), you would enter the following formula:


=IMPORTXML("[a website URL]","//h1")


In the image above, you can see that instead of manually inserting a URL, we added it to the sheet and then referenced its cell in the formula. This way you can add multiple URLS and pull their respective data in as needed.


Other common XPath types that would be typical for SEO are:


  • Page title: //title
  • Page meta description: //meta[@name=’description’]/@content
  • Page H1: //h1
  • Page links: //@href


Note that when using @href, you'll only want one URL per sheet, since this will pull in all the links on the page you're referencing.

Scraping price from a website


Pulling Content From a Website


We sometimes find it necessary to pull content from a client's website, that may not be easily accessible in their CMS. To avoid having to cut and paste blog articles, or product descriptions, you can use Google Sheets to pull that for you.


In this case, it will take some knowledge of HTML, but Google Chrome's Inspect tool makes this relatively painless.


In the image above, you can see that we wanted to pull the products that are listed on a page, along with their prices.


First, we need to identify how this data appears on the website.


Inspect tool in Google Chrome

In this case, we know that the name of the products are in a span class called "title", while the brand is, yep you guessed it, "brand". We want the product name, since there may be multiple products on this page by the same company. So we'll create the formula:


=IMPORTXML(A1,"//span[@class='title']")


This will then populate our sheet with the products on this page that have the class "title" as you can see in the image below.

Sample Google Sheet formula

Now that we have the products listed on the page, let's pull in the price.


Again, we're going to look at the page code, and we can see that there is a class for "current_price".

HTML for current price on the page

So we'll add this formula to our Google Sheet:


=IMPORTXML(A1,"//span[@class='current_price']")


This will pull in the prices for the products that we pulled previously. Using this method, we could potentially also list individual product pages, and pull in not only price, but SKU, product description, and product reviews by replacing the "class" variable with the appropriate data from the page.

Code for price

Importing Table Data



Data scraping in Google Sheets isn't limited to just individual elements on a webpage. You can also import entire tables worth of data.


In this case, say we want to import a list of the hottest peppers from Wikipedia. We know that this is the second table on this page (through trial and error).

Table of hottest peppers on wikipedia

We would just need to create a sheet, and reference the URL, and the table number (for instance the first or second table on the page in this case).


The formula looks like this:


=IMPORTHTML(A1,"table",2)


Which gives us the data below.

Google sheets pulling table data

Conclusion


As you can see from these examples, Google Sheet's IMPORT function is a great tool for quick audits, competitive analysis, or content imports. It can save you a ton of time and hassle, and best of all it's free!

ADA Compliance Guide

Google Analytics

Services

Ensure that you have the right data, and the right reporting.

GET STARTED

Contact Us

Thumbnail for video about Duda website builder
By Roy Bielewicz 10 May, 2024
Duda is an easy to use website builder and CMS that doesn't require coding knowledge, or tons of plugins like WordPress.
A man with glasses is pointing at a shopify logo
By Roy Bielewicz 07 May, 2024
Launching your own ecommerce site has never been easier. We take a look at Shopify to get your started.
Thumbnail image of video for using Google Ads as an SEO tool
By Roy Bielewicz 02 May, 2024
Did you know that you can use Google Ads' free keyword planning tool for your SEO? We take a look at how.
Show More
Share by: