DataExcavator update from 1.2.4 to 1.2.5
Whoa! It’s been a tough month. After collecting a fair number of complaints about the small text in the interface and the constant confusion caused by competing modal windows, we decided to make some changes.
Let’s start with the interface enhancements. First, we have increased the font size in all windows. Second, we have greatly simplified many parameter names and their short descriptions. Third, we have completely redesigned the way information is displayed. Previously, we were convinced that letting several windows compete for attention was a good idea. Alas, it is not. It only confuses people, and we ourselves got confused several times (LOL). Accordingly, almost all blocks with separate logic are now displayed in true modal windows that do not let you jump from project to project. This prevents confusion: at every moment you know exactly which project you are working on. Fourth, we have greatly improved the project cards on the main screen. Each card now has a progress bar showing the percentage of the website’s pages scraped so far. The project’s main menu has also been changed and is now a little more convenient and easier to understand.
Another large block of work is the so-called “data patterns”. We have made it very easy to configure the list of nodes you want to extract from the site. It is now a list of CSS selectors (or XPath expressions) that is available directly in the settings window. You no longer need 3-4 extra clicks to add a single node for scraping. All nodes are in front of you at once.
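To make the idea of a data pattern more concrete, here is a rough sketch of a pattern as a plain list of named XPath expressions applied to a page. This is not the application's code: the names `PATTERN`, `PAGE` and `extract` are made up for illustration, and the sketch uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical data pattern: one (name, XPath expression) pair per node to extract.
PATTERN = [
    ("title", ".//h1[@class='product-title']"),
    ("price", ".//span[@class='price']"),
]

# Small well-formed sample page (assumption: real pages need a tolerant HTML parser).
PAGE = """<html><body>
<h1 class="product-title">Blue Kettle</h1>
<span class="price">19.99</span>
</body></html>"""


def extract(page_source, pattern):
    """Return a dict with the text of the first node matched by each expression."""
    root = ET.fromstring(page_source)
    record = {}
    for name, xpath in pattern:
        node = root.find(xpath)
        # A node that is missing or has no text is stored as None.
        record[name] = node.text.strip() if node is not None and node.text else None
    return record


record = extract(PAGE, PATTERN)
```

Adding one more node to scrape is then a matter of adding one more line to the list, which is the same idea the new settings window follows.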
In general, we decided to take the path of interface simplification. Even though we position our application as a professional scraping utility, simplicity and ergonomics still matter.
We have also made some corrections to the kernel, but decided not to publish a separate kernel release. In particular, we have fixed the problem with scraping .pdf files. We have also fixed the incorrect handling of the Clean-param directive from the robots.txt file. On the whole, as always, the application has become more stable.
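For readers unfamiliar with Clean-param: it is a robots.txt extension that lists query parameters which do not change page content, so a crawler can drop them and avoid downloading duplicates. The sketch below shows the general idea only and is not how our kernel does it. The function names are invented for illustration, the optional path is treated as a plain prefix (wildcards are not handled), and User-agent groups are ignored.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parse_clean_param(robots_txt):
    """Collect (parameter set, path prefix) pairs from Clean-param lines."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.lower().startswith("clean-param:"):
            continue
        value = line.split(":", 1)[1].strip()
        if not value:
            continue
        parts = value.split(None, 1)
        params = {p for p in parts[0].split("&") if p}
        # No path given means the rule applies to the whole site.
        prefix = parts[1].strip() if len(parts) > 1 else "/"
        rules.append((params, prefix))
    return rules


def clean_url(url, rules):
    """Remove the query parameters named by every rule whose prefix matches."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    drop = set()
    for params, prefix in rules:
        if path.startswith(prefix):
            drop |= params
    kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
            if k not in drop]
    return urlunsplit((scheme, netloc, path, urlencode(kept), fragment))


robots = "User-agent: *\nClean-param: ref&sid /catalog/\n"
rules = parse_clean_param(robots)
cleaned = clean_url("https://example.com/catalog/item?id=7&ref=mail&sid=abc", rules)
untouched = clean_url("https://example.com/blog/post?ref=mail", rules)
```

Two URLs that differ only in the listed parameters collapse into one after cleaning, so the page is queued once.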
DataExcavator update from 1.2.3 to 1.2.4
In this version, we have added a modal window that lets you choose how to analyze the site. This window asks for details of the crawling algorithm: should the app download all pages from the site, or only the pages from a user-supplied list? Although this setting was already available in the project properties window, we believe the new window will greatly simplify the application for new users. It is much easier to click one extra button right away than to dig through the settings looking for the option “Method of links analysis”.
In addition, we found several small bugs in project testing and in downloading links from individual pages. These bugs have been fixed as well.
As always, the application has become more stable and convenient.
ExcavatorSharp and DataExcavator updates from 1.2.2 to 1.2.3
In this update, as usual, we have increased the stability of the application by fixing some bugs and running a number of load tests.
Among the significant improvements, we have added project templates. Templates contain ready-made settings for well-known sites. The first 4 templates are Amazon, Aliexpress, Craigslist and Walmart. From now on, we will try to add new templates in each release. This saves time on setup: just use a ready-made template instead of configuring a project from scratch.
In the kernel library we have fixed several significant problems. Under certain circumstances, the “test settings” and “get links from the site page” functions hung. We fixed this problem. We have also eliminated the problem of log files staying locked indefinitely, which also occurred under certain circumstances.
ExcavatorSharp and DataExcavator updates from 1.2.1 to 1.2.2
In this release we focused on fixing interface errors. We have added more thorough exception handling for various situations. Control over importing settings and copying projects has been significantly improved. Exception handling when trying to run several instances of the program has been improved (remember: only one instance of the program can run at a time). We have also added a nice feature for sorting patterns and copying elements inside a pattern. In general, as usual, the application has become more stable and a bit more convenient.
ExcavatorSharp and DataExcavator updates from 1.2 to 1.2.1
Some bugs have been fixed hot on the heels of the previous release. In particular, we have made work with the file system more stable and corrected errors that occurred when installing under an account with limited rights. File system handling in some modules has been improved. We have also fixed the behavior of the CSS selector auto-detection window.
Interface improvements: logs are now displayed in the waiting windows, so you can see live what the program is doing at any moment. No endless loader 😉
ExcavatorSharp and DataExcavator updates from 1.1 to 1.2
Well, it’s been a tough month. We listed the application on Codester and CodeCanyon and got some sales. We found several problems with the application 🤷‍♂️ related to license key validation and to the application not running under administrative accounts. Some of them have now been resolved, and we are ready to present version 1.2 of both the client application and our library. Some cases of incorrect license key validation are still unclear to us, and we are working on finding and fixing the cause. In the meantime, if you use our application and cannot activate your demo key, please create a license.key file in the folder “C:/ProgramData/DataExcavator” and copy your key into it. In this case, the application will not try to activate the key remotely. Thank you for your understanding. A list of the most significant improvements is provided below.
- Fixed current errors and improved overall stability of the application.
- Fixed the problem of requesting SSL / TLS protocol versions that are not available in some versions of the OS. The problem occurred when trying to set the ServicePointManager.SecurityProtocol property to certain values on certain operating systems.
- Added basic website login functionality as a separate behavior for CEF. Now, if you want to extract data from a site that requires authentication by login and password, you can do so not through CEFBehaviors (which requires some skill) but through CEFWebsiteAuthBehavior. Inside you will find a simple set of fields, including a template script. In general, this greatly simplifies working with sites that require authentication.
- Fixed the Excel export algorithm – the library was downgraded to a stable build without additional license fees (EPPlus starting from version 5 is no longer free).
- Fixed the algorithm for exporting to Excel and CSV in complex cases. Now, if one of the records fails to be written, the overall export process does NOT stop.
- Added a callback to the export mechanism that is called after each exported record. This allows you to keep track of the export progress rather than waiting a long time for the program to finish.
- The behavior of the program when it is not run as an administrator on server versions of the operating system remains an open question. For now, if you run into any problems, we recommend running the application as an administrator.
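
The two export changes from the list above (continuing past a record that fails to be written, and reporting progress through a callback) can be illustrated with a small sketch. It is written in Python for brevity and is not the library's API: `export_records`, `on_record` and the record fields are hypothetical names, and calling the callback for failed records as well is a choice made for this example.

```python
import csv
import io


def export_records(records, writer, on_record=None):
    """Write records as CSV rows; a failing record is skipped, not fatal."""
    exported = 0
    failed = 0
    for index, record in enumerate(records):
        error = None
        try:
            writer.writerow([record["url"], record["title"]])
            exported += 1
        except (KeyError, TypeError) as exc:
            # A malformed record must not stop the whole export.
            error = exc
            failed += 1
        if on_record is not None:
            # Progress hook, invoked once per processed record.
            on_record(index, error)
    return exported, failed


records = [
    {"url": "a", "title": "A"},
    {"url": "b"},  # missing "title": this record fails
    {"url": "c", "title": "C"},
]
progress = []
buffer = io.StringIO()
result = export_records(
    records,
    csv.writer(buffer),
    on_record=lambda index, error: progress.append((index, error is None)),
)
```

Here the second record is skipped, the third one is still exported, and the `progress` list receives one entry per record, which is what a progress indicator in the interface would consume.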