MALLET

Overview
Synopsis

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

Category

Data Mining Software Free

Features

Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text
Tools for sequence tagging
Routines for transforming text documents into numerical representations
Add-on package called GRMM
Open Source Software

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Company

MALLET

Rating

Ease of use: 7.6
Features & Functionality: 7.6
Advanced Features: 7.6
Integration: 7.6
Customer Support: 7.6
Performance: 7.6
Bottom Line

MALLET includes sophisticated tools for document classification: efficient routines for converting text to "features", a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics.

7.6

MALLET (MAchine Learning for LanguagE Toolkit) is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. It provides sophisticated tools for document classification: efficient routines for converting text to “features”, a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics.

It also provides tools for sequence tagging for applications such as named-entity extraction from text. Algorithms include Hidden Markov Models, Maximum Entropy Markov Models, and Conditional Random Fields, all implemented in an extensible system for finite state transducers.

Topic models are useful for analyzing large collections of unlabeled text. The MALLET topic modeling toolkit contains efficient, sampling-based implementations of Latent Dirichlet Allocation, Pachinko Allocation, and Hierarchical LDA. MALLET also includes an efficient implementation of Limited Memory BFGS, among many other optimization methods.

Routines for transforming text documents into numerical representations are also included. This transformation is implemented through a flexible system of “pipes”, which handle distinct tasks such as tokenizing strings, removing stopwords, and converting sequences into count vectors. Data can be imported in two ways: from many separate files, or from a single file with one instance per line.

An add-on package called GRMM supports inference in general graphical models and training of CRFs with arbitrary graphical structure. The toolkit is Open Source Software, released under the Common Public License.
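To make the “pipes” idea concrete, here is a minimal conceptual sketch in Python. It is not MALLET's Java API: the function names, the tiny stopword list, and the `run_pipes` helper are all hypothetical, chosen only to illustrate how a chain of small steps (tokenize, remove stopwords, count) can turn one-instance-per-line text into count vectors.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration only; not MALLET's list.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def to_counts(tokens):
    """Convert a token sequence into a sparse count vector."""
    return Counter(tokens)


def run_pipes(data, pipes):
    """Feed the data through each pipe in order (hypothetical helper)."""
    for pipe in pipes:
        data = pipe(data)
    return data


# The pipeline: each step handles one distinct task.
PIPES = [tokenize, remove_stopwords, to_counts]


def vectorize_lines(lines):
    """Treat each non-empty line as one instance and vectorize it."""
    return [run_pipes(line, PIPES) for line in lines if line.strip()]
```

For example, `run_pipes("The cat and the hat", PIPES)` yields a count vector with `cat` and `hat` each counted once. Because each step is a separate function, a step can be swapped or added without touching the others, which is the flexibility the pipe design is meant to give.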
