THE LAB #28: Deep dive on inventory levels tracking
A real-world example of scraping inventory levels from a heavily Akamai-protected website
In the previous post of The Lab, we saw some examples of websites that exposed their inventory levels via API and made some hypotheses on how to use this information.
In this episode, instead, we’ll see another website of a well-known publicly traded company and create a scraper to extract this information, with all the code available in our GitHub Repository available for paying subscribers.
If you’re one of them but don’t have access to it, please write me at email@example.com with your GitHub username so that I can give you access.
Lowe’s home improvement: a brief history
Lowe’s (NYSE: LOW) is the second-largest home improvement retail chain in the US, with more than 2,100 stores operating in North America.
Its product assortment ranges from furniture to animal care and this makes the website an interesting target to monitor.
Given its popularity and the wide variety of products and brands it carries, an analysis of its inventory can be used not only to evaluate Lowe’s performance but also as a proxy for evaluating the distributed brands and, why not, a portion of the consumer price index.
Today, Lowe’s is the 57th most valuable company in the S&P 500, so it’s quite an interesting target website to monitor.
Website preliminary analysis
Now that we’ve assessed the importance of Lowe’s for the home improvement industry and its operators, let’s see if we can extract some valuable insights from its website.
One common feature of the websites of companies with both physical stores and an online presence is the “pick up in store” option.
This service implies that the website must know whether an item is available in each store, so an API call is probably made to retrieve this information.
Each website treats this information differently: some check only whether an item is available or not, while others know the exact quantity available in each store and even show it to the customer. That’s the case of Leroy Merlin in Italy, a popular home improvement retailer.
In other cases, similarly to what we’ve already seen in our IKEA post, the exact number of items available is hidden in the internal API calls a website makes to check the inventory levels.
After playing around on the website for some time, I found that this is also the case for Lowe’s. When scrolling the product list page of a category, we can intercept from the browser’s network console an API call that returns a JSON with some exciting data.
Given that the quantity changes when we select another store and refers to the pickup option, we can reasonably assume that the highlighted TotalQty field refers to the inventory level of the selected product in a particular store or warehouse.
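Once the relevant response is intercepted, pulling the quantity out of it is straightforward. Here's a minimal sketch: the `TotalQty` field name comes from the observation above, but the surrounding JSON structure (the `fulfillment` object, the product and store identifiers) is an assumption made up for illustration, not Lowe's actual schema.

```python
import json

# Hypothetical payload shaped like the intercepted inventory-check response.
# Only the TotalQty field name is taken from the real site; the rest of the
# structure is invented for this example.
sample_response = """
{
  "productId": "1000378579",
  "storeNumber": "0595",
  "fulfillment": {"TotalQty": 42, "pickupEligible": true}
}
"""

def extract_total_qty(raw_json: str) -> int:
    """Return the TotalQty value from an inventory-check response."""
    data = json.loads(raw_json)
    return data["fulfillment"]["TotalQty"]

print(extract_total_qty(sample_response))  # 42
```

In a real scraper you would apply the same parsing to every product/store pair you request, storing the quantities with a timestamp so inventory changes can be tracked over time.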
Now that we’ve discovered there’s valuable information hidden in the API calls, we need to understand how difficult it is to extract at the desired scale.
A quick look at Wappalyzer shows that the only anti-bot solution protecting the website is Akamai, so we could probably use Scrapy with good proxy rotation.
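Proxy rotation in Scrapy is typically done in a downloader middleware that sets `request.meta['proxy']` on each outgoing request. The sketch below shows the idea with a simple round-robin pool; the proxy URLs are placeholders for whatever provider you use, and a production setup would also handle retries and ban detection.

```python
import itertools

# Hypothetical proxy pool; replace with your provider's endpoints.
PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
    "http://user:pass@proxy3.example.com:8000",
]

class RotatingProxyMiddleware:
    """Scrapy-style downloader middleware that assigns a different
    proxy from the pool to each outgoing request."""

    def __init__(self, proxies):
        self._pool = itertools.cycle(proxies)

    def process_request(self, request, spider):
        # Scrapy's built-in HttpProxyMiddleware honors meta['proxy'].
        request.meta["proxy"] = next(self._pool)
        return None  # let the request continue through the chain
```

To use it, you would register the class in the spider's `DOWNLOADER_MIDDLEWARES` setting. Round-robin is the simplest policy; random selection or per-domain stickiness are common variants when dealing with session-aware protections like Akamai.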
Implementing the scraper