• Open source software for mining big data streams
• Spark Streaming extension
• Implemented methods: CluStream; Hoeffding Decision Trees; bagging; StreamKM++; HyperplaneGenerator
PAT Rating™ (Editor Rating / Aggregated User Rating)
Ease of use: 7.6 / 8.5
Features & Functionality: 7.6 / 7.4
Advanced Features: 7.6 / 8.0
Integration: 7.6 / 8.1
Performance: 7.6 / 8.6
Customer Support: 7.6 / 8.2
Implementation: 8.4 (single value listed)
Renew & Recommend: 8.3 (single value listed)
Bottom Line
Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible, programmable framework for massively distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs).
Editor Rating: 7.6
Aggregated User Rating: 8.2 (2 ratings)
streamDM is open source software for mining big data streams with Spark Streaming, developed at Huawei Noah's Ark Lab. It is licensed under the Apache Software License v2.0.
Today, learning from big data streams is more challenging because the data may not keep the same distribution over the lifetime of the stream. Learning algorithms need to be very efficient: each example that arrives in a stream can be processed only once, or else the examples need to be summarized with a small memory footprint.
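To make the "process once, small memory footprint" constraint concrete, here is a minimal sketch in plain Python. It is not taken from streamDM; the class name `RunningStats` is hypothetical, and the technique shown is Welford's single-pass method for mean and variance, chosen only because it summarizes an unbounded stream with three numbers.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's method) in constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Each example is seen exactly once and is not stored afterwards.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined for fewer than two examples.
        return self._m2 / (self.count - 1) if self.count > 1 else float("nan")


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)

print(stats.count, stats.mean, stats.variance)
```

The summary never grows with the length of the stream, which is the property stream learners rely on when the raw examples cannot be kept.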
Spark Streaming makes it easy to build scalable, fault-tolerant streaming applications. It is an extension of the core Spark API (a fast and general engine for large-scale data processing) that enables stream processing from a variety of sources.
Spark is an extensible and programmable framework for massively distributed processing of datasets, called Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which the Spark engine then processes to generate the results. The stream is exposed as a DStream, represented internally as a sequence of RDDs. Methods implemented in streamDM are: SGD learner and perceptron; naïve Bayes; CluStream; Hoeffding Decision Trees; bagging; StreamKM++; HyperplaneGenerator; RandomTreeGenerator; RandomRBFGenerator; RandomRBFEventsGenerator.
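The batching idea described above can be sketched without Spark at all. The following plain-Python example is an illustration only and does not use the Spark Streaming API; the helper names `micro_batches` and `process_batch` are hypothetical. It also simplifies by cutting batches at a fixed item count, whereas Spark Streaming cuts them by time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield consecutive lists of at most batch_size items from an iterable."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the input stream is exhausted
        yield batch


def process_batch(batch):
    # Stand-in for the work an engine would perform on one batch.
    return sum(batch)


events = range(1, 11)  # pretend these values arrive one at a time
results = [process_batch(b) for b in micro_batches(events, batch_size=4)]
print(results)  # [10, 26, 19]
```

Each batch here plays the role of one RDD in the sequence that makes up a DStream: the stream is never materialized as a whole, only one batch at a time.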
SampleDataWriter is also implemented; it can call the data generators to create sample data for simulation or testing. Planned additions for future releases include: classification (random forests); regression (Hoeffding regression tree, bagging, random forests); clustering (ClusTree, DenStream); and frequent itemset mining (IncMine, IncSecMine).
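As a rough idea of what a synthetic data generator for testing stream learners can look like, here is a plain-Python sketch of a hyperplane-style generator. It is an assumption-laden illustration, not streamDM's HyperplaneGenerator: the function name `hyperplane_stream`, its parameters, and the labeling rule (label 1 when the weighted sum of features reaches half the sum of the weights) are choices made for this example.

```python
import random


def hyperplane_stream(weights, n_examples, seed=0):
    """Yield (features, label) pairs labeled by their side of a hyperplane."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    threshold = 0.5 * sum(weights)
    for _ in range(n_examples):
        # Features are drawn uniformly from the unit hypercube.
        features = [rng.random() for _ in weights]
        score = sum(w * x for w, x in zip(weights, features))
        yield features, int(score >= threshold)


sample = list(hyperplane_stream([0.2, 0.5, 0.3], n_examples=100))
print(sample[0])
```

Because the generator is lazy, it can feed a learner one example at a time, and changing the weights partway through a run is a simple way to simulate a shift in the data distribution.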
Data mining big data streams using Spark Streaming
What are the benefits?
Spark Streaming, an extension of the core Spark API, enables stream processing from different sources. It receives data streams and divides the data into batches, which the Spark engine then processes to generate results.
Company size: Medium (50 to 1000)
User Role: Super User
User Industry: Manufacturing
Rating
Ease of use: 8.5
Features & Functionality: 8.1
Advanced Features: 8
Integration: 8.1
Performance: 8.6
Training: 8.4
Customer Support: 8.2
Implementation: 8.4
Renew & Recommend: 8.3
ADDITIONAL INFORMATION
StreamDM is one of the newer open source software packages for mining big data streams using Spark Streaming.