
Distributed Machine Learning Toolkit

Overview
Synopsis

Distributed Machine Learning Toolkit (DMTK) is an open-source project from Microsoft. Achieving high accuracy in distributed machine learning applications requires large amounts of computational resources, which has become a major challenge for ordinary machine learning researchers and practitioners.

Category

Free Text Analytics Software

Features

• Bagging
• Column (feature) sub-sampling
• Continued training from an input GBDT model
• Continued training from an input score file
• Weighted training
• Validation metric output during training
• Multiple validation data sets
• Multiple metrics
• Early stopping (both training and prediction)
• Prediction of leaf indices
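The early-stopping feature listed above can be sketched as a generic training loop that halts when the validation metric stops improving. This is a minimal illustration, not DMTK or LightGBM code; the function and parameter names (`train_step`, `validate`, `patience`) are hypothetical:

```python
def train_with_early_stopping(train_step, validate, max_rounds=1000, patience=10):
    """Run boosting rounds, stopping once the validation metric has not
    improved for `patience` consecutive rounds. `validate` returns a
    lower-is-better metric (e.g. validation loss)."""
    best_metric = float("inf")
    best_round = 0
    for round_no in range(1, max_rounds + 1):
        train_step()            # fit one more tree / iteration
        metric = validate()     # evaluate on held-out validation data
        if metric < best_metric:
            best_metric = metric
            best_round = round_no
        elif round_no - best_round >= patience:
            break               # no improvement for `patience` rounds
    return best_round, best_metric
```

In real gradient-boosting libraries the model is then truncated back to the best round, which is why early stopping applies to prediction as well as training.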

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Company

Microsoft

What is best?

• Column (feature) sub-sampling
• Continued training from an input GBDT model
• Continued training from an input score file
• Weighted training
• Validation metric output during training
• Multiple validation data sets
• Multiple metrics
• Early stopping (both training and prediction)

What are the benefits?

• Flexibility
• Efficiency
• Big Data
• Big Model

PAT Rating™ (Beta)
Editor Rating
Ease of use
7.6
Features & Functionality
7.6
Advanced Features
7.6
Integration
7.6
Performance
7.6
Customer Support
7.6
Implementation
Renew & Recommend
Bottom Line

Microsoft Distributed Machine Learning Toolkit (DMTK) is an open source project from the Microsoft Company, which contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible.

7.6
Editor Rating

Distributed Machine Learning Toolkit (DMTK) is an open-source project from Microsoft. Achieving high accuracy in distributed machine learning applications requires large amounts of computational resources, which has become a major challenge for ordinary machine learning researchers and practitioners. To address this, Microsoft released DMTK, which contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible. It comprises four components:

• LightLDA: an extremely fast and scalable topic model algorithm, with an O(1) Gibbs sampler and an efficient distributed implementation.
• Distributed (Multi-sense) Word Embedding: a distributed version of the (multi-sense) word embedding algorithm. Word embedding has become a very popular tool for computing semantic representations of words, which can serve as high-quality word features for natural language processing tasks.
• LightGBM: a very high-performance gradient boosting tree framework (supporting GBDT, GBRT, GBM, and MART), together with its distributed implementation. LightGBM has been shown to be several times faster than existing implementations of gradient boosting trees, due to its fully greedy tree-growth method and histogram-based memory and computation optimization. It also offers a complete solution for distributed training, based on the DMTK framework.
• DMTK Framework: a flexible framework that supports a unified interface for data parallelization, a hybrid data structure for big-model storage, model scheduling for big-model training, and automatic pipelining for high training efficiency. It has two main components: a Parameter Server, which supports a hybrid data structure for model storage, and a Client SDK, which supports pipelining between local training and model communication.
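The histogram-based optimization credited above for LightGBM's speed can be illustrated with a toy split-finding routine: feature values are bucketed into a fixed number of bins, gradient sums are accumulated per bin, and only the O(n_bins) histogram is scanned for the best split, rather than every sorted value. This is a simplified sketch under a squared-gradient-sum gain criterion, not LightGBM's actual implementation; the function name and parameters are illustrative:

```python
def best_histogram_split(values, gradients, n_bins=16):
    """Toy histogram-based split finding for one feature.
    Buckets `values` into `n_bins` equal-width bins, accumulates the
    gradient sum and count per bin, then scans bin boundaries for the
    split with the highest gain. Returns (threshold, gain)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):        # one O(n) binning pass
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_grad, total_count = sum(gradients), len(values)
    best_gain, best_thresh = float("-inf"), None
    left_grad, left_count = 0.0, 0
    for b in range(n_bins - 1):                # candidate split after bin b
        left_grad += grad_sum[b]
        left_count += count[b]
        right_count = total_count - left_count
        if left_count == 0 or right_count == 0:
            continue
        right_grad = total_grad - left_grad
        # squared-gradient-sum gain, a common boosting split criterion
        gain = (left_grad ** 2 / left_count
                + right_grad ** 2 / right_count
                - total_grad ** 2 / total_count)
        if gain > best_gain:
            best_gain = gain
            best_thresh = lo + (b + 1) * width
    return best_thresh, best_gain
```

Because the scan touches only bin boundaries instead of all distinct feature values, the per-node cost drops from O(n log n) sorting to one O(n) binning pass plus an O(n_bins) scan, which is the essence of the histogram optimization.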

