Web Scraper by webscraper.io

4.1 / 5 stars

Web Scraper version history - 10 versions

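Be careful with old versions! These versions are displayed for testing and reference purposes. You should always use the latest version of an add-on.
Latest version
Version 1.96.18
Released Apr 28, 2025 - 2.08 MB
Works with Firefox 113.0 and later
[Feature] Added scroll option to element selector. Compared to element scroll, the new scroll feature in element selector handles additional edge cases: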
* Automatically finds element that needs to be scrolled. Previously only worked with body element
* Extracts all data from sites where elements disappear once they are moved outside of visible screen area.
* Scrolls faster
[Refactor] Removed element scroll selector. Existing sitemaps with element scroll selector will continue to work
[Refactor] Improved css selector generation performance
Source code released under All Rights Reserved
Older versions
Version 1.95.8
Released Jan 7, 2025 - 2 MB
Works with Firefox 113.0 and later
[Feature] We launched a new portal for gathering and voting on feature ideas: https://feedback.webscraper.io/
[Improvement] Reduced the extension's footprint in the browser by loading the page content script only when the extension is used.
[Fix] Element selection when page is zoomed
[Fix] Multiple fixes for sitemap XML selector where the selector incorrectly parsed sitemap.xml file
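The sitemap XML selector mentioned in the fix above reads page URLs out of a sitemap.xml file. As a rough illustration of what that kind of parsing involves (this is not the extension's own code, which is not published here), the Python sketch below collects the `<loc>` entries of a sitemap using the namespace from the sitemaps.org protocol. The function name `sitemap_urls` and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list:
    """Return the <loc> values of a sitemap (or sitemap index) in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


# Hypothetical sample input; note the stray whitespace inside the second <loc>.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/products </loc></url>
</urlset>"""

if __name__ == "__main__":
    print(sitemap_urls(SAMPLE))
```

Because the lookup uses `.//sm:loc`, the same function also works on a sitemap index, where the `<loc>` elements sit inside `<sitemap>` entries instead of `<url>` entries.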
Source code released under All Rights Reserved
Version 1.89.8
Released Aug 15, 2024 - 1.97 MB
Works with Firefox 113.0 and later
[Feature] Added Website state setup configuration to prepare the target site before scraping, for example to sign in, change currency, or change location. For example, if an element indicating that you are logged in to the website is not found, perform these actions:
* Navigate to a URL (login page)
* Type text into an input field
* Click a button
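
The check-then-act idea behind Website state setup can be modeled in a few lines of Python. This is only a sketch of the logic described above, not the extension's configuration format: the class names `Action` and `StateSetup`, the field names, and the selectors are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str       # hypothetical action names: "navigate", "type", "click"
    target: str     # a URL for "navigate", otherwise a CSS selector
    text: str = ""  # only used by "type"


@dataclass
class StateSetup:
    logged_in_selector: str  # element whose presence means "already logged in"
    actions: list = field(default_factory=list)


def actions_to_run(setup: StateSetup, selectors_on_page: set) -> list:
    """Return the setup actions only when the logged-in indicator is absent."""
    if setup.logged_in_selector in selectors_on_page:
        return []
    return list(setup.actions)


# Hypothetical setup mirroring the three steps listed above.
EXAMPLE = StateSetup(
    logged_in_selector="#account-menu",
    actions=[
        Action("navigate", "https://example.com/login"),
        Action("type", "#username", "alice"),
        Action("click", "#submit"),
    ],
)
```

With this model, a page that already shows `#account-menu` needs no setup, while any other page gets the navigate, type, and click steps in order.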

[Breaking] regex extraction has been deprecated. If already configured, it will continue to work, but there won't be an option to add a new regex filter.
[Fix] Improved performance in the start URL editing view when a lot of URLs are added
[Fix] Excessive memory usage in Pagination selector when a large amount of content is accessed.
Source code released under All Rights Reserved
Version 1.79.3
Released Apr 18, 2024 - 1.94 MB
Works with Firefox 113.0 and later
[Feature] improved link from any script data extraction to handle some edge cases
[Feature] added sitemap sync button within the sitemap editing view
[Fix] Fixed issues in page load detection where the scraper could think that the page had loaded while it was still loading ajax data.
[Fix] Improved scraped data export. Should fix issues regarding slow data export.
[Refactor] We completed input validation refactoring. Congrats to our team!
Source code released under All Rights Reserved
Version 1.75.8
Released Feb 5, 2024 - 2.1 MB
Works with Firefox 113.0 and later
[Feature] A misconfiguration modal will show up when multiple selectors with the multiple option enabled have been created. The modal will offer to group the selectors under one element selector.
[Feature] New link type and pagination type: Link from any script. This type will extract links in scenarios when a link can only be determined after clicking on the target element. The extractor will perform the click and monitor network traffic to see where the page is navigating. Handles window.open() and window.location=.
[Refactor] Improved scrolling. Scrolling will no longer skip frames when the target page is doing heavy rendering.
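To make the two navigation patterns named for "Link from any script" concrete, here is a small Python sketch that recognizes them in a click handler's source text. It is deliberately static and is not how the extension works: as described above, the extension clicks the element and watches network traffic, which also covers URLs built at runtime. The function name `navigation_target` and the regular expressions are assumptions for illustration.

```python
import re
from typing import Optional

# Only literal, quoted URLs are recognized; computed URLs are out of scope here.
_PATTERNS = [
    re.compile(r"""window\.open\(\s*['"]([^'"]+)['"]"""),
    re.compile(r"""window\.location(?:\.href)?\s*=\s*['"]([^'"]+)['"]"""),
]


def navigation_target(handler: str) -> Optional[str]:
    """Return the URL a handler navigates to, or None if no known pattern matches."""
    for pattern in _PATTERNS:
        match = pattern.search(handler)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print(navigation_target("window.open('/item/42')"))
    print(navigation_target('window.location = "/next-page"'))
```

A handler such as `goTo(buildUrl(id))` returns `None` here, which is exactly the kind of case where observing the actual navigation is needed.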
Source code released under All Rights Reserved
Version 1.72.9
Released Nov 30, 2023 - 2.03 MB
Works with Firefox 69.0 and later
[Feature] confirmation popup will show up when deleting a selector.
[Feature] Link selector now allows selecting elements that are not HTML links but contain links.
[Feature] When selecting, the HTML code of the hovered element will be shown.
[Feature] Link selector can now extract links from other attributes, attributes with scripts and text
[Feature] When deleting a sitemap a confirmation popup will be shown
[Feature] Selectors now can be sorted.
[Feature] During the data extraction process some CSS selectors will be optimized for better performance.
[Breaking] Popup link selector has been removed due to the Chrome MV3 JavaScript execution limitation. Use link selector with custom link type instead.
[Breaking] Removed popup link type from element click selector due to the Chrome MV3 JavaScript execution limitation.
[Change] Improved mouse click simulation.
[Change] We completely removed integration with external CouchDB database.
[Change] Internal PouchDB database will be replaced with a simpler local storage database engine. All sitemaps will be migrated to new engine on first extension start.
[Change] This release includes an updated validation engine. Some validation rules will be stricter to prevent unexpected issues.
[Change] Previously, when a data element wasn't found during data extraction, a "null" value was stored. Now an empty value "" is stored.
[Fix] scraper could get stuck on a bad page load.
[Refactor] We are refactoring UI code. Currently these changes should be invisible.
[Refactor] A lot of code was refactored to work with the new Chrome manifest V3 API
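The change in how missing data elements are stored ("null" before, an empty value "" now) matters if you post-process data scraped with both older and newer versions. The following Python sketch shows one way to treat the two conventions the same. It assumes the records have already been loaded as dictionaries, with the older marker appearing as `None`; the function names are hypothetical.

```python
def normalize_record(record: dict) -> dict:
    """Map None (the older "null" marker) to "" so both conventions compare equal."""
    return {key: "" if value is None else value for key, value in record.items()}


def missing_fields(record: dict) -> list:
    """Names of fields whose data element was not found, under either convention."""
    return [key for key, value in normalize_record(record).items() if value == ""]


# Hypothetical records: the same scrape result as stored before and after the change.
OLD_STYLE = {"title": "Widget", "price": None}
NEW_STYLE = {"title": "Widget", "price": ""}

if __name__ == "__main__":
    print(missing_fields(OLD_STYLE), missing_fields(NEW_STYLE))
```

Note that after the change an element that was not found and an element that was found but empty can no longer be told apart from the stored value alone.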
Source code released under All Rights Reserved
Version 0.6.5
Released Sep 8, 2022 - 1.77 MB
Works with Firefox 69.0 and later
[Fix] Various edge case issues with page load detection
[Feature] Sitemap sync for Firefox
[Refactor] Completely reimplemented validation.
[Feature] Exported data will be sorted in the order it was scraped
[Feature] Added limit option for scroll down selector
[Fix] Element click selector data extraction within shadow root
[Refactor] Chrome is migrating to a new manifest version, which changes internal APIs. We put a lot of work into this.
[Fix] Some special chars were incorrectly exported in XLSX exporter
[Fix] Data extractors will strip invalid utf-8 characters from text
[Change] Removed delay options from most of the selectors
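As an illustration of the fix above about stripping invalid UTF-8 characters from extracted text, the same idea can be expressed in Python with the standard `errors="ignore"` decoding mode. This is a sketch of the general technique, not the extension's implementation (which runs in the browser); the function name `strip_invalid_utf8` is an assumption.

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, silently dropping any byte sequences that are not valid."""
    return raw.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # b"\xff" can never appear in valid UTF-8, so it is removed from the result.
    print(strip_invalid_utf8(b"caf\xc3\xa9 \xff ok"))
```

If you would rather see where the damage was, `errors="replace"` inserts the U+FFFD replacement character instead of dropping the bytes.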
Source code released under All Rights Reserved
Version 0.6.4
Released Oct 28, 2021 - 1.57 MB
Works with Firefox 69.0 and later
[Feature] XLSX export
[Change] Data extraction will be limited to 120 min from a single URL.
[Fix] Ignore elements created by google translate
[Feature] image selector will extract image URL from srcset attribute
[Feature] pagination selector
[Fix] element selection when page has zoomed elements
[Feature] element attribute selector now has attribute suggestion drop-down
[Feature] element preview shows found element count
[Feature] sitemap search in sitemap list
[Fix] lots of improvements in page load detection for edge cases
[Fix] overall lots of small fixes and updates
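For the image selector feature above that reads image URLs from the srcset attribute, the attribute holds a comma-separated list of candidates, each a URL followed by an optional width or density descriptor. The Python sketch below extracts the URLs from such a value. It is a simplified reading of the format and not the extension's code: it assumes URLs contain no commas, which the full parsing algorithm in the HTML standard does handle. The function name `srcset_urls` is hypothetical.

```python
def srcset_urls(srcset: str) -> list:
    """Return the URL of every image candidate in a srcset attribute value."""
    urls = []
    for candidate in srcset.split(","):
        parts = candidate.split()  # e.g. ["large.jpg", "1080w"]
        if parts:                  # skip empty candidates such as a trailing comma
            urls.append(parts[0])
    return urls


if __name__ == "__main__":
    print(srcset_urls("small.jpg 480w, large.jpg 1080w"))
```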
Source code released under All Rights Reserved
Version 0.5.4
Released Sep 22, 2020 - 1.03 MB
Works with Firefox 69.0 and later
Fixed element selection in websites that were blocking it with CSP.
Source code released under All Rights Reserved
Version 0.5.3
Released Aug 18, 2020 - 1.14 MB
Works with Firefox 69.0 and later
* New data selection UI engine. It is faster and more resilient to websites having CSS rules that break the UI.
* A new page load detection engine has been added. The new page load detection system will handle a lot of edge cases:
  * immediate redirect after page load
  * service workers
  * hash tag changes
  * quicker load when there is a slowly loading asset
  * won't fail on an error page if there is an immediate redirect to a successful page
  * data extraction will be retried if a redirect occurs during data extraction
  * improved content type checking
  * window.history.push changes

* CouchDB has been deprecated. Users that were using it will be able to continue to use it, but new users won't be able to change data storage. We plan to replace the current data storage engine (PouchDB) with a simpler one to reduce problems with sitemap and data storage.
* Added a privacy policy page and an option to opt out of extension metric gathering via the options page.
* New users will see a welcome page with a quick startup guide
* Reduced the extension's permission requirements.
* Overall code quality improvements
Source code released under All Rights Reserved
