Distributed Machine Learning Toolkit

Overview
Synopsis

Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft. Achieving better accuracy in machine learning applications often requires a large amount of computation resources, which has become a major challenge for many machine learning researchers and practitioners.

Category

Text Analytics Software Free

Features

• Bagging
• Column (feature) sub-sampling
• Continued training with an input GBDT model
• Continued training with an input score file
• Weighted training
• Validation metric output during training
• Multiple validation data sets
• Multiple metrics
• Early stopping (both training and prediction)
• Prediction for leaf index
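
A few of these features can be illustrated without any library. The following is a minimal sketch in plain Python, not the toolkit's own implementation: the helper names `sample_rows_and_columns` and `early_stopping_round` are hypothetical, and the fractions and patience values are arbitrary. It shows bagging (row sub-sampling), column (feature) sub-sampling, and early stopping driven by a validation metric.

```python
import random


def sample_rows_and_columns(n_rows, n_cols, row_fraction, col_fraction, rng):
    """Pick the rows (bagging) and columns (feature sub-sample) for one tree.

    Hypothetical helper: samples without replacement and always keeps
    at least one row and one column.
    """
    n_row_pick = max(1, int(n_rows * row_fraction))
    n_col_pick = max(1, int(n_cols * col_fraction))
    rows = sorted(rng.sample(range(n_rows), n_row_pick))
    cols = sorted(rng.sample(range(n_cols), n_col_pick))
    return rows, cols


def early_stopping_round(validation_losses, patience):
    """Return the 0-based round with the best validation loss.

    Scanning stops as soon as `patience` consecutive rounds have passed
    without a new best loss; later rounds are then ignored.
    """
    best_round = 0
    best_loss = float("inf")
    for i, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            best_round = i
        elif i - best_round >= patience:
            break
    return best_round


# Assumed toy inputs: 100 rows, 10 features, one loss value per boosting round.
rng = random.Random(0)
rows, cols = sample_rows_and_columns(100, 10, 0.8, 0.5, rng)
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.5]
stop_at = early_stopping_round(losses, patience=2)
```

With a patience of 2, the scan stops after two rounds without improvement, so the late improvement in the final round is never seen. That is the usual trade-off of early stopping: a small patience saves training time but can stop before a later gain.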

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Company

Distributed Machine Learning Toolkit

What is best?

• Column (feature) sub-sampling
• Continued training with an input GBDT model
• Continued training with an input score file
• Weighted training
• Validation metric output during training
• Multiple validation data sets
• Multiple metrics
• Early stopping (both training and prediction)

What are the benefits?

• Flexibility
• Efficiency
• Big Data
• Big Model

Rating
Our Rating

Ease of use: 7.6
Features & Functionality: 7.6
Advanced Features: 7.6
Integration: 7.6
Customer Support: 7.6
Performance: 7.6
Implementation: no score listed
Renew & Recommend: no score listed

Bottom Line

Microsoft Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft that contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible.

Our Rating: 7.6

Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft. Achieving better accuracy in machine learning applications often requires a large amount of computation resources, which has become a major challenge for many machine learning researchers and practitioners. To address this, Microsoft released DMTK, which contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible. DMTK comprises four components.

• LightLDA: an extremely fast and scalable topic model algorithm, with an O(1) Gibbs sampler and an efficient distributed implementation.
• Distributed (Multisense) Word Embedding: a distributed version of the (multi-sense) word embedding algorithm. Word embedding has become a very popular tool for computing semantic representations of words, which can serve as high-quality word features for natural language processing tasks.
• LightGBM: a very high-performance gradient boosting tree framework (supporting GBDT, GBRT, GBM, and MART), and its distributed implementation. LightGBM has been shown to be several times faster than existing implementations of gradient boosting trees, due to its fully greedy tree-growth method and histogram-based memory and computation optimization. It also has a complete solution for distributed training, based on the DMTK framework.
• DMTK Framework: a flexible framework that supports a unified interface for data parallelization, a hybrid data structure for big model storage, model scheduling for big model training, and automatic pipelining for high training efficiency. It has two main components: a Parameter Server, which supports a hybrid data structure for model storage, and a Client SDK, which supports pipelining between local training and model communication.
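
The split between a parameter server and a client that trains locally can be sketched in a few lines. This is a single-process toy under stated assumptions, not the DMTK API: the class names `ToyParameterServer` and `ToyClient`, the methods `pull` and `push`, and the function `run_rounds` are all hypothetical. The server holds the model, and each client pulls the current parameters, computes an update on its own data, and pushes a delta back. The model here is a single weight `w` fitted to `y = w * x` by gradient descent, and the sketch leaves out the pipelining between training and communication mentioned above.

```python
class ToyParameterServer:
    """Holds the shared model as a dict of name -> float (hypothetical)."""

    def __init__(self, initial):
        self._params = dict(initial)

    def pull(self, keys):
        # Clients read a snapshot of the parameters they ask for.
        return {k: self._params[k] for k in keys}

    def push(self, deltas):
        # Clients send additive updates; the server applies them in place.
        for key, delta in deltas.items():
            self._params[key] += delta


class ToyClient:
    """Trains on a local shard of (x, y) pairs (hypothetical)."""

    def __init__(self, server, data, learning_rate):
        self.server = server
        self.data = data
        self.learning_rate = learning_rate

    def train_step(self):
        w = self.server.pull(["w"])["w"]
        # Gradient of the mean of 0.5 * (w * x - y) ** 2 over the local shard.
        grad = sum((w * x - y) * x for x, y in self.data) / len(self.data)
        self.server.push({"w": -self.learning_rate * grad})


def run_rounds(server, clients, rounds):
    """Let every client take one step per round; return the final weight."""
    for _ in range(rounds):
        for client in clients:
            client.train_step()
    return server.pull(["w"])["w"]


# Assumed toy data: two shards drawn from y = 3 * x.
server = ToyParameterServer({"w": 0.0})
clients = [
    ToyClient(server, [(1.0, 3.0), (2.0, 6.0)], learning_rate=0.05),
    ToyClient(server, [(3.0, 9.0)], learning_rate=0.05),
]
final_w = run_rounds(server, clients, rounds=200)
```

Because the clients only exchange parameter deltas with the server, each data shard stays local. That is the data-parallel pattern the framework description refers to, reduced here to one process and one parameter.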

Text Analytics Software Free: Top Twenty (PAT Index™)

1. General Architecture for Text Engineering – GATE
2. RapidMiner Text Mining Extension
3. KH Coder
4. Coding Analysis Toolkit
5. TAMS
6. VisualText
7. QDA Miner Lite
8. Datumbox
9. Natural Language Toolkit
10. Apache Mahout
11. Twinword
12. Pattern
13. Apache UIMA
14. Carrot2
15. LingPipe
16. Textable
17. Gensim
18. Apache OpenNLP
19. Aika
20. tm – Text Mining Package