Web Scraping Tools Free
Most Recent
 
June 7, 2017

Frontera

Frontera is a web crawling framework consisting of a crawl frontier and distribution/scaling primitives, which together make it possible to build a large-scale online web crawler. Frontera takes care of the logic and policies to follow during the crawl. It stores and prioritises links extracted by the crawler to decide which pages to visit next, and it can do so in a distributed manner. The frontier is initialized with a list of start URLs, called the seeds. Once the frontier is initialized, the crawler asks it which pages should be visited next. As the crawler starts to visit the pages [...]

4.5
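
The frontier behaviour described for Frontera (seed URLs go in, the crawler asks which pages to visit next, newly extracted links are stored and prioritised) can be sketched in a few lines of Python. This is a minimal in-memory illustration and not Frontera's API: the CrawlFrontier class, its add and next_urls methods, and the numeric scores are all hypothetical names chosen for the example.

```python
import heapq
from itertools import count


class CrawlFrontier:
    """Toy in-memory crawl frontier (hypothetical; not Frontera's API)."""

    def __init__(self, seeds):
        self._heap = []        # entries: (negated score, insertion order, url)
        self._seen = set()     # URLs already scheduled, to avoid duplicates
        self._order = count()  # tie-breaker keeps insertion order stable
        for url in seeds:
            self.add(url, score=1.0)

    def add(self, url, score=0.5):
        """Schedule a URL unless it was scheduled before."""
        if url in self._seen:
            return False
        self._seen.add(url)
        heapq.heappush(self._heap, (-score, next(self._order), url))
        return True

    def next_urls(self, max_n=1):
        """Return up to max_n URLs, highest score first."""
        batch = []
        while self._heap and len(batch) < max_n:
            _, _, url = heapq.heappop(self._heap)
            batch.append(url)
        return batch

    def __len__(self):
        return len(self._heap)
```

A crawler loop would call next_urls, fetch each page, and pass every extracted link back through add. A distributed frontier would additionally have to persist and partition this state, which the sketch does not attempt.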
 
June 7, 2017

Scrapy

Scrapy is an open source and collaborative framework for extracting the data that users need from websites in a fast, simple, yet extensible way. It is an application framework for crawling websites and extracting structured data, which can be used for a wide range of applications such as data mining, information processing, or historical archival. Although Scrapy was originally designed for web scraping, it can also be used to extract data using APIs (such as Amazon Associates Web Services) or as a general-purpose web crawler. Scrapy is supported under Python 2.7 and Python 3.3+. Python 2.6 support was dropped starting at Scrapy 0.20. [...]

6
 
June 7, 2017

Portia

Portia is a tool that lets users scrape websites visually, with no programming knowledge required. The user annotates a web page to identify the data to be extracted, and Portia infers from these annotations how to scrape data from similar pages. Web scraping normally involves writing crawler code; for non-coders, Portia makes it possible to extract web content without doing so. This Scrapinghub tool provides a point-and-click UI for annotating (selecting) web content to be scraped and stored. One can use Portia within a [...]

2.5
 
June 7, 2017

DEiXTo

DEiXTo is a web data extraction tool based on the W3C Document Object Model (DOM). It allows users to create highly accurate extraction rules that describe which pieces of data to scrape from a website. DEiXTo consists of three separate components. GUI DEiXTo is an MS Windows application with a graphical user interface for managing extraction rules (build, test, fine-tune, save and modify). This is all that a user needs for small-scale extraction tasks. DEiXToBot is a Perl module implementing a flexible and efficient Mechanize agent capable of extracting data of interest using GUI DEiXTo generate [...]

2
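
To give a rough idea of what a DOM-based extraction rule does, the sketch below applies a path expression to a parsed document tree and collects the matching nodes as records. DEiXTo's own rule format is not shown in this summary, so the example substitutes the limited XPath subset of Python's standard xml.etree.ElementTree; the sample page, the RULE string, and the apply_rule helper are assumptions, and the parser only accepts well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page.
PAGE = """<html><body><ul>
  <li class="item"><a href="/a">Alpha</a></li>
  <li class="ad"><a href="/x">Sponsored</a></li>
  <li class="item"><a href="/b">Beta</a></li>
</ul></body></html>"""

# Stand-in "extraction rule": links inside list items marked class="item".
RULE = ".//li[@class='item']/a"


def apply_rule(markup, rule):
    """Parse markup into a tree and return one record per node matching rule."""
    root = ET.fromstring(markup)
    return [
        {"text": (node.text or "").strip(), "href": node.get("href")}
        for node in root.findall(rule)
    ]


records = apply_rule(PAGE, RULE)
```

Because the rule addresses positions in the tree rather than raw text, the same rule can be reused on other pages that share the structure.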
 
June 7, 2017

GNU Wget

GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely used Internet protocols. It is a non-interactive command-line tool, so it can easily be called from scripts, cron jobs, and terminals without X-Windows support. Recursive retrieval of HTML pages and FTP sites is supported; the user can use Wget to make mirrors of archives and home pages, or to traverse the web like a WWW robot (Wget understands /robots.txt). Wget works exceedingly well on slow or unstable connections, retrying until the document is fully retrieved. This allows freedom of movement as the user does not always need to be [...]

1.25
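
As an illustration of what honouring /robots.txt means for a recursive retriever, the sketch below uses Python's standard urllib.robotparser module to decide whether a URL may be fetched. It does not show how Wget itself is implemented; the sample rules, the ExampleBot user agent, and the load_rules and allowed helpers are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it
# from the site's /robots.txt before fetching anything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def load_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, url, user_agent="ExampleBot"):
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


rules = load_rules(SAMPLE_ROBOTS_TXT)
```

A recursive crawl would call allowed for every link it discovers and skip those under disallowed paths such as /private/.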
 
June 7, 2017

IEPY

IEPY is an open source tool for Information Extraction focused on Relation Extraction. IEPY has a corpus annotation tool with a web-based UI, an active-learning relation extraction tool pre-configured with convenient defaults, and a rule-based relation extraction tool for cases where the documents are semi-structured or high precision is required. To give an example of Relation Extraction, suppose the user is trying to find a birth date in: “John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and American pure and applied mathematician, physicist, inventor and polymath.” Then IEPY’s task is to identify “John von Neumann” and “December 28, [...]

1.25
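
The birth-date example can be made concrete with a hand-written rule. The sketch below is not IEPY's API; it is a minimal regular-expression rule, assuming the sentence starts with a name followed by a parenthesised birth date, and the BIRTH_RULE pattern, the extract_birth helper, and the was_born relation label are hypothetical.

```python
import re

# Hypothetical hand-written rule: "<person> (<Month D, YYYY> – ..." at the
# start of a sentence is read as a (person, was_born, date) relation.
BIRTH_RULE = re.compile(
    r"(?P<person>[^()]+?) "
    r"\((?P<born>[A-Z][a-z]+ \d{1,2}, \d{4}) [–-]"
)


def extract_birth(sentence):
    """Return (person, "was_born", date) or None if the rule does not match."""
    match = BIRTH_RULE.match(sentence)
    if match is None:
        return None
    return (match.group("person"), "was_born", match.group("born"))


SENTENCE = (
    "John von Neumann (December 28, 1903 – February 8, 1957) was a "
    "Hungarian and American pure and applied mathematician, physicist, "
    "inventor and polymath."
)
relation = extract_birth(SENTENCE)
```

A rule this rigid gives high precision on semi-structured text but misses any phrasing it was not written for, which is the gap a learned extractor is meant to cover.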
 
June 7, 2017

Octoparse

Octoparse is a cloud-based web scraper that helps users extract web data without coding. It is a visual web data extraction tool that provides a point-and-click UI for developing extraction patterns, which scrapers can then apply to structured websites. Both experienced and inexperienced users can use Octoparse to bulk extract information from websites, and most scraping tasks require no coding. Octoparse, being a Windows application, is designed to harvest data from both static and dynamic websites (including those whose web [...]

3.75
 
June 7, 2017

Pattern

Pattern is a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization. The pattern.web module is a web toolkit that contains APIs (Google, Gmail, Bing, Twitter, Facebook, Wikipedia, Wiktionary, DBPedia, Flickr, …), a robust HTML DOM parser and a web crawler. The pattern.en module is a natural language processing (NLP) toolkit for English. Because language is ambiguous (e.g., I can ↔ a can) [...]

2.25