Today we’re launching a new micro-application dedicated to scraping Yelp.com. Yelp is an online directory of organizations that displays its listings conveniently on a map. Most of the listings are restaurants and cafes.
In the first two months of selling our main application, DataExcavator, we talked to many users. For various reasons, they often need to extract data from Yelp's directory of organizations. Since our main application has an annual subscription model and is designed more for professional scraping, we decided to build a separate micro-scraper for the Yelp site.
And that’s what we did: a small application that consists of a data table and a few buttons. It performs a single function: extracting data from the Yelp website and saving it in .xlsx format.
Our micro-scraper for Yelp has a low price, a license with no time limit, and source code that is available to customers. Its interface is minimal, with two input fields and three buttons. What else do we need to quickly extract all the data from Yelp.com?
The main program window consists of two input fields and three buttons. To scrape data, enter the search key and the geolocation, then click “Scrape data!”. The application then downloads data from Yelp.com page by page for the specified key and saves it in a temporary table.
The application analyzes the search results on the Yelp site and moves sequentially from one page to the next. On each page, it follows the organization links and extracts the relevant data. You can also retrieve photos and user feedback from the organization pages.
By default, the “Download Pictures” and “Download Feedback” options are disabled in the application. Many customers only need to retrieve phone numbers and organization addresses. Keeping these options disabled increases the speed of the application and reduces the load on the Yelp.com site.
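The crawl described above can be sketched roughly as follows. This is a Python illustration only and is not taken from the application's source. The names `ScrapeOptions`, `build_search_url`, `crawl`, `list_links` and `parse_org` are hypothetical, and the `find_desc`, `find_loc` and `start` query parameters, along with the page size of 10, are assumptions about how Yelp search URLs look, not something the application documents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional
from urllib.parse import urlencode

# Assumption: search URLs accept find_desc / find_loc / start and list
# 10 results per page. Check this against the live site before relying on it.
SEARCH_URL = "https://www.yelp.com/search"
RESULTS_PER_PAGE = 10


@dataclass
class ScrapeOptions:
    # Both options are off by default, mirroring the application's defaults.
    download_pictures: bool = False
    download_feedback: bool = False


def build_search_url(search_key: str, location: str, page: int) -> str:
    """Build the URL of one search results page (page is zero-based)."""
    query = {
        "find_desc": search_key,
        "find_loc": location,
        "start": page * RESULTS_PER_PAGE,
    }
    return f"{SEARCH_URL}?{urlencode(query)}"


def crawl(
    search_key: str,
    location: str,
    list_links: Callable[[str], List[str]],
    parse_org: Callable[[str, ScrapeOptions], Dict],
    options: Optional[ScrapeOptions] = None,
    max_pages: int = 50,
) -> Iterator[Dict]:
    """Walk the result pages in order and yield one record per organization.

    list_links and parse_org are hypothetical callables supplied by the
    caller: the first returns the organization links found on a results
    page, the second extracts a record from one organization page.
    """
    options = options or ScrapeOptions()
    for page in range(max_pages):
        links = list_links(build_search_url(search_key, location, page))
        if not links:
            break  # an empty page means there are no more results
        for link in links:
            yield parse_org(link, options)
```

Because fetching and parsing are passed in as functions, the loop can be tried out with stub functions before any real HTTP code is written. The `max_pages` cap is a safety limit added for the sketch.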
The main work in the application is done by a background thread. As it retrieves data on new organizations, it continuously adds the results to the table located at the center of the screen. The table shows a preview of the extracted columns. After the thread finishes its work, press the “Export” button to save the results to an .xlsx file.
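The background-thread pattern can be illustrated with a minimal Python sketch. It is not the application's own code: `scrape_worker` and `run_scrape` are hypothetical names, and a plain list stands in for the on-screen table.

```python
import queue
import threading
from typing import Dict, Iterable, List

_DONE = object()  # sentinel the worker sends when it has nothing more to add


def scrape_worker(records: Iterable[Dict], out: queue.Queue) -> None:
    """Runs on the background thread and hands over each record as it arrives."""
    try:
        for record in records:
            out.put(record)
    finally:
        # Always signal completion, even if producing a record failed,
        # so the consumer is never left waiting.
        out.put(_DONE)


def run_scrape(records: Iterable[Dict]) -> List[Dict]:
    """Start the worker and append its results to the table in arrival order."""
    out: queue.Queue = queue.Queue()
    worker = threading.Thread(target=scrape_worker, args=(records, out), daemon=True)
    worker.start()

    table: List[Dict] = []  # stands in for the preview table
    while True:
        item = out.get()
        if item is _DONE:
            break
        table.append(item)
    worker.join()
    return table
```

In a desktop application the consuming loop would normally be driven by the UI framework rather than a blocking `while` loop. The queue is used here so that only one thread ever touches the table.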
The export produces an .xlsx file and a folder of pictures. It is our policy that all our scrapers should be able to extract images. This product is no exception: when the “Download Pictures” option is enabled, it extracts the images and saves them to your hard drive.
For data export we use, as in our other products, the EPPlus library, which writes standards-compliant .xlsx files.
And this is the result the application produces:
Application pricing policy
For this micro-application, we decided to set the lowest price we could, given its quality and our development experience. The price is $16 per copy for one individual user. It includes an unlimited number of pages for scraping and an unlimited lifetime of the application. The price also includes quality assurance.
Our market research shows that the average price of a Yelp data scraper is $45. At $16, our price is about 64% below that average ((45 − 16) / 45 ≈ 0.64), and it comes with an amazingly simple interface and great performance.
How to buy?
You can buy the application on the CodeCanyon and Codester portals. The links will be available in the near future.