Guide to Extracting Website Data by Using Excel VBA

Flipnode on Jun 14 2023

blog-image

Microsoft Excel is an incredibly popular software application utilized across various industries and disciplines. Its versatile functionality enables users to store, organize, and manipulate data using a wide array of functions and formulas. Additionally, Excel offers the capability to access web pages and extract data from them.

In this tutorial, our primary focus will be on this web scraping feature within Excel, specifically using VBA (Visual Basic for Applications). We will guide you through the installation and preparation steps to set up the necessary environment. Subsequently, we will demonstrate how to write a VBA macro that serves as a scraper, effectively extracting data from a web page and importing it into Excel. Let's dive in and get started with this exciting endeavor.

What is VBA web scraping?

VBA web scraping is a unique technique that enables automated data collection from websites directly into Excel. This scraping approach leverages external applications, such as the Microsoft Edge browser, to facilitate the data extraction process. By utilizing VBA, you can harness the power of web scraping and seamlessly import website data into your Excel spreadsheets.

What is VBA?

VBA, short for Visual Basic for Applications, is a programming language developed by Microsoft Corporation. It enhances the functionalities of Microsoft Office tools, enabling users to create advanced functions and automate complex tasks. With VBA, you can write macros that extract data from websites and import it directly into Excel. This powerful language provides a versatile solution for data retrieval and automation within the Microsoft Office suite.

Pros and cons of using VBA for scraping

Before we proceed with the tutorial, it's important to discuss the advantages and disadvantages of using VBA for web scraping data into Excel. Understanding these aspects will help you make informed decisions regarding your data extraction needs.

Pros

  • Ready to use - VBA comes pre-installed with Microsoft Office, eliminating the need for separate installations. If you already have MS Office installed, you can immediately utilize VBA in all Microsoft Office tools.
  • Reliable - Microsoft Excel and VBA are developed and maintained by Microsoft, ensuring a high level of reliability. These tools can be seamlessly upgraded together to the latest versions without significant complications.
  • Built-in browser support - VBA web scrapers can leverage Microsoft's integrated browser, Microsoft Edge, which simplifies the process of scraping dynamic websites.
  • Complete automation - With VBA scripts, you can achieve complete automation. The script handles tasks such as logging in, scrolling, and button clicks without requiring any additional user intervention or interaction with the browser.

Cons

  • Limited platform compatibility - VBA scrapers are designed specifically for Windows and do not work on other platforms. While MS Office has support for Mac, creating a functional VBA scraper on it can be more challenging. Additionally, certain features like Microsoft Edge may not be available on non-Windows systems.
  • Strong reliance on MS Office - VBA scrapers heavily rely on MS Office tools and integrating third-party scraping solutions can be difficult.
  • Steep learning curve - VBA programming language is relatively complex compared to more modern languages like Python or Javascript, making it less beginner-friendly and requiring a steeper learning curve.

In summary, if you need to develop a web scraper for Windows that automatically extracts data from a website, VBA-based web scraping can be a suitable choice. However, it's important to consider the platform limitations, integration challenges, and the learning curve associated with VBA programming.

Tutorial

Before we get started, let's ensure that we have all the necessary prerequisites installed and our environment properly set up. This will make it easier for you to follow along with the tutorial.

Prerequisites

In this tutorial, we will be using Windows 10 and Microsoft Office 10. However, the steps will be similar for other versions of Windows. You will need a computer running a Windows operating system. Additionally, it is necessary to have Microsoft Office installed. If you don't have it already, you can find detailed installation instructions in the official Microsoft documentation.

Preparing the environment

Follow these steps to set up the development environment:

  • Step 1: Open Microsoft Excel
    From the Start menu or Cortana search, find Microsoft Excel and open the application. You will see a similar interface as shown below:
    Click on "File".
  • Step 2: Go to Options to enable the developer menu
    By default, Excel doesn't show the developer button in the top ribbon. To enable it, go to "Options" from the File menu.
  • Step 3: Select "Customize Ribbon"
    In the Options dialog box, select "Customize Ribbon" from the side menu. Check the box next to "Developer" and click "OK".
  • Step 4: Open the Visual Basic editor
    You will now see a new developer button on the top ribbon. Click on it to expand the developer menu. From the menu, select "Visual Basic".
  • Step 5: Insert a new module
    In the Visual Basic editor window, click on "Insert" and select "Module" to insert a new module. This will open the module editor.
  • Step 6: Add new references
    From the top menu, select "Tools" > "References..." to open the References window. Scroll through the available list of references and check the boxes next to "Microsoft HTML Object Library" and "Microsoft Internet Controls". Click "OK".
    That's it! Your development environment is now set up. Let's write our first Excel VBA scraper.
  • Step 7: Automate Microsoft Edge to open a website
    In the module editor, insert the following code:
Sub scrape_quotes()
Dim browser As InternetExplorer
Dim page As HTMLDocument
Set browser = New InternetExplorer
browser.Visible = True
browser.Navigate ("https://quotes.toscrape.com")
End Sub

This code defines a subroutine named "scrape_quotes()" that will be executed when we run the script. Inside the subroutine, two objects are defined: "browser" and "page".

The "browser" object allows us to interact with Microsoft Edge. We also set the browser as visible so that we can see it in action. The "browser.Navigate()" function instructs the VBA browser object to open the specified URL.

Step 8: Scrape data using the VBA script and save it to Excel

The next step is to scrape the quotes and authors from the website. We will store the data in the first sheet of the Excel spreadsheet and grab the top 5 quotes for now.

Define two new objects for quotes and authors:

Dim quotes As Object
Dim authors As Object

After navigating to the website, add a pause using a loop to ensure the website loads properly:

Do While browser.Busy: Loop

Next, grab the quotes and authors from the HTML document:

Set page = browser.document
Set quotes = page.getElementsByClassName("quote")
Set authors = page.getElementsByClassName("author")

Use a for loop to populate the Excel rows with the extracted data by calling the Cells function and passing the row and column positions:

For num = 1 To 5
Cells(num, 1).Value = quotes.Item(num).innerText
Cells(num, 2).Value = authors.Item(num).innerText
Next num

Finally, close the browser by calling the Quit function:

browser.Quit

That's it! You have set up the development environment and written your first Excel VBA scraper.

Output

Now, when you run the script again, it will perform the following actions:

  • Open Microsoft Edge.
  • Browse to the website quotes.toscrape.com.
  • Extract the top 5 quotes from the list.
  • Save the extracted quotes to the first sheet of the current Excel file.

By running the script repeatedly, you can fetch the latest quotes and update your Excel sheet with the most recent data.

Source Code

Here is the full source code example:

Sub scrape_quotes()
Dim browser As InternetExplorer
Dim page As HTMLDocument
Dim quotes As Object
Dim authors As Object

Set browser = New InternetExplorer
browser.Visible = True
browser.navigate ("https://quotes.toscrape.com")
Do While browser.Busy: Loop

Set page = browser.document
Set quotes = page.getElementsByClassName("quote")
Set authors = page.getElementsByClassName("author")

For num = 1 To 5
Cells(num, 1).Value = quotes.Item(num).innerText
Cells(num, 2).Value = authors.Item(num).innerText
Next num

browser.Quit
End Sub

You can use this code to create a new module in Excel's VBA editor and run the scrape_quotes() subroutine to scrape the quotes from the website and save them to the Excel sheet. Make sure to have the necessary references enabled as mentioned earlier in the setup steps.

Conclusion

Excel web scraping with VBA provides a reliable and convenient solution for Windows automation and web extraction tasks. It offers a ready-to-use environment, eliminating the need for additional steps or concerns about unexpected issues.

While the lack of cross-platform support is a significant drawback of VBA web scraping, there are alternative options available. If you require web scrapers that can be used on multiple operating systems like Linux or Mac, you can consider utilizing Excel Web Query as an alternative solution. This allows for broader compatibility and flexibility in developing web scrapers across different platforms.

News and updates

Stay up-to-date with the latest web scraping guides and news by subscribing to our newsletter.

Subscribe

Related articles

thumbnail
ProxiesHow to Use Chrome Browser Proxy Settings

Learn how to configure and utilize Chrome browser proxy settings effectively for enhanced privacy and optimized web browsing.

Flipnode
author avatar
Flipnode
8 min read
thumbnail
How to Use DataXPath vs CSS Selectors

Read this article to learn what XPath and CSS selectors are and how to create them. Find out the differences between XPath vs CSS, and know which option to choose.

Flipnode
author avatar
Flipnode
12 min read
thumbnail
ScrapersScraping Amazon Product Data: A Complete Guide

Master the art of building an Amazon scraper from scratch with this practical, step-by-step tutorial.

Flipnode
author avatar
Flipnode
11 min read