PAT Index™
General Architecture for Text Engineering – GATE
RapidMiner Text Mining Extension
KH Coder
Coding Analysis Toolkit
QDA Miner Lite
Natural Language Toolkit
Apache Mahout
Apache UIMA
Most Recent
June 16, 2017

Apache Mahout

The Apache Mahout project’s goal is to build an environment for quickly creating scalable, performant machine learning applications. Apache Mahout is a simple and extensible programming environment and framework for building scalable algorithms, and it contains a wide variety of premade algorithms for Scala and Apache Spark, H2O, and Apache Flink. It also includes Samsara, a vector math experimentation environment with R-like syntax that works at scale. Apache™ Mahout is a library of scalable machine-learning algorithms, implemented on top of Apache Hadoop and using the MapReduce paradigm. While Mahout’s core algorithms for clustering, classification [...]

June 16, 2017

Natural Language Toolkit

NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum. Thanks to a hands-on guide introducing programming fundamentals alongside topics in computational linguistics, plus comprehensive API documentation, NLTK is suitable for linguists, engineers, students, educators, researchers, and industry users alike. NLTK is [...]

June 16, 2017

Apache UIMA

Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, and organizations, or relations, such as works-for or located-at. UIMA enables applications to be decomposed into components, for example “language identification” => “language specific segmentation” => “sentence boundary detection” => “entity detection (person/place names etc.)”. Each component implements interfaces [...]
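The component chain described above can be illustrated with a minimal sketch. This is plain Python, not the UIMA API (UIMA itself is a Java/C++ framework); the `Document` class, the component functions, and the toy heuristics inside them are all hypothetical names and assumptions chosen only to show how an analysis can be decomposed into stages that each add annotations to a shared document.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container: raw text plus annotations added by each stage."""
    text: str
    annotations: Dict[str, List[str]] = field(default_factory=dict)


def identify_language(doc: Document) -> Document:
    # Toy heuristic (assumption): a few English function words mark the text as "en".
    english_hints = {"the", "and", "for", "in"}
    words = set(re.findall(r"[a-z]+", doc.text.lower()))
    doc.annotations["language"] = ["en" if words & english_hints else "unknown"]
    return doc


def detect_sentences(doc: Document) -> Document:
    # Split after sentence-final punctuation followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", doc.text)
    doc.annotations["sentences"] = [p.strip() for p in parts if p.strip()]
    return doc


def detect_entities(doc: Document) -> Document:
    # Naive stand-in for entity detection: runs of capitalized words.
    # This also catches sentence-initial words, which a real detector would filter.
    pattern = r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b"
    doc.annotations["entities"] = re.findall(pattern, doc.text)
    return doc


def run_pipeline(text: str, components: List[Callable[[Document], Document]]) -> Document:
    """Pass one document through each component in order."""
    doc = Document(text)
    for component in components:
        doc = component(doc)
    return doc


doc = run_pipeline(
    "Ada Lovelace works for the Analytical Society. She lives in London.",
    [identify_language, detect_sentences, detect_entities],
)
```

Because every stage takes and returns the same document type, stages can be reordered, replaced, or tested in isolation, which is the design point the excerpt attributes to component-based decomposition.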

June 15, 2017


Gensim

Gensim is a free Python library for scalable statistical semantics. It analyzes plain-text documents for semantic structure and retrieves semantically similar documents. In addition, Gensim is a robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”. Gensim is licensed under the OSI-approved GNU LGPLv2.1 license. This means that it’s free for both personal and commercial use, but if users make any modification to gensim [...]
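To make "retrieving semantically similar documents" concrete, here is a small standard-library sketch of the general idea: represent each document as a weighted term vector and rank documents by cosine similarity to a query. It deliberately does not use Gensim's own API; the function names, the smoothed IDF formula, and the three sample documents are assumptions made for illustration, and plain TF-IDF is a much simpler stand-in for the topic models a library like Gensim provides.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(tokenized_docs: List[List[str]]) -> Dict[str, float]:
    """Inverse document frequency with +1 smoothing (an assumed variant)."""
    n = len(tokenized_docs)
    df = Counter(term for tokens in tokenized_docs for term in set(tokens))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query: str, docs: List[str]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    tokenized = [tokenize(d) for d in docs]
    idf = build_idf(tokenized)
    query_vec = vectorize(tokenize(query), idf)
    scored = [(cosine(query_vec, vectorize(tokens, idf)), i)
              for i, tokens in enumerate(tokenized)]
    # Highest score first; ties fall back to the original document order.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


docs = [
    "Human machine interface for lab computer applications",
    "A survey of user opinion of computer system response time",
    "Graph minors and trees",
]
ranking = rank_similar("graph trees", docs)
```

Here `ranking[0]` is the index of the document that shares the most weighted vocabulary with the query. This approach only matches literal terms; capturing similarity between documents that use different words for the same topic is the kind of problem unsupervised semantic modelling is meant to address.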

June 7, 2017


Pattern

Pattern is a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization. The pattern.web module is a web toolkit that contains APIs (Google, Gmail, Bing, Twitter, Facebook, Wikipedia, Wiktionary, DBPedia, Flickr, …), a robust HTML DOM parser and a web crawler. The pattern.en module is a natural language processing (NLP) toolkit for English. Because language is ambiguous (e.g., I can ↔ a can) [...]