RapidMiner Text Mining Extension

Overview
Synopsis

The RapidMiner Text Extension adds all operators necessary for statistical text analysis. Texts can be loaded from different data sources and transformed with different filtering techniques before the text data is analyzed.

Category

Text Analytics Software

Company

RapidMiner

Rating

Ease of use: 7.7
Features & Functionality: 7.7
Advanced Features: 7.7
Integration: 7.7
Customer Support: 7.7
Performance: 7.7
Training: (no score listed)
Implementation: (no score listed)
Renew & Recommend: (no score listed)

Bottom Line

The RapidMiner Text Extension adds all operators necessary for statistical text analysis. It supports several text formats, including plain text, HTML, and PDF. It also provides standard filters for tokenization, stemming, stopword filtering, and n-gram generation.

Our Rating: 7.7

RapidMiner Text Mining Extension: RapidMiner is an open source data mining framework that offers many operators, which can be combined into a process. A graphical user interface (GUI) lets users connect the operators to each other in the process view. The main function of a process is to analyze the data that is retrieved at the beginning of the process. Many packages are available for RapidMiner, such as text processing, Weka extension, parallel processing, web mining, reporting extension, series processing, PMML, community, and R extension packages.
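
The idea of a process as a chain of connected operators can be illustrated with a minimal Python sketch. This is not RapidMiner's API: the names `Operator`, `lowercase`, `drop_short`, and `run_process` are hypothetical and chosen only for illustration. Each operator takes the output of the previous one, and the process starts from the data retrieved at its beginning.

```python
from typing import Callable, List

# Hypothetical stand-in for an operator: a function from data to data.
Operator = Callable[[List[str]], List[str]]


def lowercase(tokens: List[str]) -> List[str]:
    """Operator that lowercases every token."""
    return [t.lower() for t in tokens]


def drop_short(tokens: List[str]) -> List[str]:
    """Operator that removes tokens shorter than three characters."""
    return [t for t in tokens if len(t) >= 3]


def run_process(data: List[str], operators: List[Operator]) -> List[str]:
    """Feed the data through each operator in the order they are connected."""
    for operator in operators:
        data = operator(data)
    return data


# The data retrieved at the start of the process flows through both operators.
result = run_process(["The", "Cat", "is", "here"], [lowercase, drop_short])
```

Reordering or swapping operators changes the process without touching the operators themselves, which is roughly what connecting boxes in a graphical process view achieves.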

The Text Processing package can be installed and updated through the Update RapidMiner menu item under the Help menu. The Text Mining extension uses a special class for handling documents, called the Document class. This class stores the whole document together with additional meta information.
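
As a rough illustration of a document object that keeps the full text together with meta information, consider the following Python sketch. It is not the extension's actual Document class; the field names `text` and `metadata` and the example metadata keys are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Document:
    """Hypothetical container: the whole text plus free-form meta information."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


# Example document; the metadata keys "source" and "format" are illustrative only.
doc = Document(
    text="RapidMiner supports text mining.",
    metadata={"source": "example.txt", "format": "plain text"},
)
```

Keeping the metadata next to the text means that later steps can, for example, group or label results by source without re-reading the original files.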

In text mining, the document is split into unique tokens. These tokens can be used to classify the complete document. Tokenization is the process of breaking a stream of text up into phrases, words, symbols, or other meaningful elements called tokens. Applying these tokenizers results in a sheet containing the tokens in the order in which they were found in the document. Each token carries a number that identifies the more general unit from which it was created. For example, each word token of a particular sentence contains the number of that sentence, and each sentence token of a document contains the number of that document.

The Tokenizer class can also be extended easily to create custom tokenizers. There are also features for eliminating all the stop words. Other features include filtering and stemming, a technique closely related to lemmatisation that reduces words to their stem, base, or root form.
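
The steps described above can be sketched in plain Python. This is a simplified illustration, not the extension's implementation: the stop word list, the regular expressions, and the suffix-stripping rule in `naive_stem` are assumptions made for the example (production stemmers, such as Porter's, use far more careful rules). Each word token keeps the number of the sentence it came from, as in the description above, and the final step builds the n-grams mentioned among the extension's standard filters.

```python
import re
from typing import List, Tuple

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def tokenize(document: str) -> List[Tuple[int, str]]:
    """Split into sentences, then words; each word keeps its sentence number."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
    tokens = []
    for sentence_no, sentence in enumerate(sentences):
        for word in re.findall(r"[A-Za-z0-9']+", sentence):
            tokens.append((sentence_no, word.lower()))
    return tokens


def remove_stop_words(tokens: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Drop tokens whose word is in the stop word list."""
    return [(n, w) for n, w in tokens if w not in STOP_WORDS]


def naive_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def ngrams(words: List[str], n: int = 2) -> List[str]:
    """Join every run of n consecutive words into one n-gram token."""
    return ["_".join(words[i : i + n]) for i in range(len(words) - n + 1)]


tokens = tokenize("The cats are sleeping. Dogs barked loudly.")
filtered = remove_stop_words(tokens)
stems = [naive_stem(word) for _, word in filtered]
bigrams = ngrams(stems, n=2)
```

In this sketch `tokens` holds pairs such as `(0, "cats")` and `(1, "dogs")`, so a later step can still tell which sentence a word came from after stop words have been removed.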
