The Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft. Achieving high accuracy in distributed machine learning applications requires a large amount of computing resources, which has become a major challenge for many machine learning researchers and practitioners.
Category
Text Analytics Software Free
Features
• Bagging
• Column (feature) sub-sampling
• Continued training with an input GBDT model
• Continued training with an input score file
• Weighted training
• Validation metric output during training
• Multiple validation data sets
• Multiple metrics
• Early stopping (both training and prediction)
• Prediction of leaf index
License
Open Source
Price
Free
Pricing
Subscription
Free Trial
Available
Users Size
Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)
What are the benefits?
• Flexibility
• Efficiency
• Big Data
• Big Model
PAT Rating™
Criterion | Editor Rating | Aggregated User Rating
Ease of use | 7.6 | 7.0
Features & Functionality | 7.6 | 6.5
Advanced Features | 7.6 | 7.3
Integration | 7.6 | 7.6
Performance | 7.6 | —
Customer Support | 7.6 | —
Implementation | | —
Renew & Recommend | | 8.4
Bottom Line
Microsoft Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft that contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible.
Editor Rating: 7.6
Aggregated User Rating: 7.4 (2 ratings)
The Distributed Machine Learning Toolkit (DMTK) is an open source project from Microsoft. Achieving high accuracy in distributed machine learning applications requires a large amount of computing resources, which has become a major challenge for many machine learning researchers and practitioners. Microsoft released DMTK, which contains both algorithmic and system innovations. These innovations make machine learning tasks on big data highly scalable, efficient, and flexible. The toolkit comprises four components:
• LightLDA: an extremely fast and scalable topic model algorithm, with an O(1) Gibbs sampler and an efficient distributed implementation.
• Distributed (Multi-sense) Word Embedding: a distributed version of the (multi-sense) word embedding algorithm. Word embedding has become a very popular tool for computing semantic representations of words, which can serve as high-quality word features for natural language processing tasks.
• LightGBM: a high-performance gradient boosting tree framework (supporting GBDT, GBRT, GBM, and MART) and its distributed implementation. LightGBM is reported to be several times faster than existing implementations of gradient boosting trees, owing to its fully greedy tree-growth method and its histogram-based memory and computation optimizations. It also offers a complete solution for distributed training, based on the DMTK framework.
• DMTK Framework: a flexible framework that supports a unified interface for data parallelization, a hybrid data structure for big model storage, model scheduling for big model training, and automatic pipelining for high training efficiency. It has two main components: a Parameter Server, which supports the hybrid data structure for model storage, and a Client SDK, which supports pipelining between local training and model communication.
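To give a feel for what "histogram-based" means in the LightGBM description above, here is a minimal sketch of histogram-based split finding for a single feature. It is an illustration only, not LightGBM's actual code: the function names `build_histogram` and `best_split` are hypothetical, the feature values are assumed to be pre-bucketed into integer bins, and a plain squared-error loss is assumed, so each sample's second-order term is 1 and the per-bin sample count stands in for it.

```python
from typing import List, Tuple


def build_histogram(
    bins: List[int], gradients: List[float], num_bins: int
) -> Tuple[List[float], List[int]]:
    """Accumulate the gradient sum and the sample count for each bin."""
    grad_sum = [0.0] * num_bins
    count = [0] * num_bins
    for b, g in zip(bins, gradients):
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count


def best_split(
    bins: List[int], gradients: List[float], num_bins: int
) -> Tuple[int, float]:
    """Return (bin index, gain) of the best split "bin <= index".

    Hypothetical helper: scans bin boundaries instead of sorted raw
    values, so the scan costs O(num_bins) once the histogram is built.
    Returns (-1, 0.0) if no split improves on the unsplit node.
    """
    grad_sum, count = build_histogram(bins, gradients, num_bins)
    total_g, total_n = sum(grad_sum), sum(count)

    best_bin, best_gain = -1, 0.0
    left_g, left_n = 0.0, 0
    for b in range(num_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue  # a split must leave samples on both sides
        # Unregularized gain under the squared-error assumption.
        gain = (
            left_g ** 2 / left_n
            + right_g ** 2 / right_n
            - total_g ** 2 / total_n
        )
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


if __name__ == "__main__":
    bins = [0, 0, 1, 1, 2, 2, 3, 3]
    gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
    print(best_split(bins, gradients, num_bins=4))
```

The point of the design is that candidate thresholds are evaluated per bin rather than per distinct feature value, which is where the memory and computation savings mentioned above would come from. A production framework would add regularization, real second-order statistics, and many other refinements that this sketch omits.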