streamDM

Overview
Synopsis

streamDM is new open source software for mining big data streams using Spark Streaming, developed at Huawei Noah's Ark Lab.

Category

Data Mining Software Free

Features

•Open source software for mining big data streams
•Spark Streaming extension
•Implemented methods: CluStream, Hoeffding Decision Trees, bagging, StreamKM++, HyperplaneGenerator

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Website
Company

streamDM

What is best?

•Open source software for mining big data streams
•Spark Streaming extension
•Implemented methods: CluStream, Hoeffding Decision Trees, bagging, StreamKM++, HyperplaneGenerator

PAT Rating™

Criterion                  Editor Rating   Aggregated User Rating
Ease of use                7.6             8.5
Features & Functionality   7.6             7.4
Advanced Features          7.6             8.0
Integration                7.6             8.1
Performance                7.6             8.6
Customer Support           7.6             8.2
Implementation             -               8.4
Renew & Recommend          -               8.3

Bottom Line

Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible, programmable framework for massively distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs).

Editor Rating: 7.6
Aggregated User Rating: 8.2 (2 ratings)

streamDM is open source software for mining big data streams with Spark Streaming, developed at Huawei Noah's Ark Lab. It is licensed under the Apache Software License v2.0.

Big data stream learning is challenging because the data may not keep the same distribution over the lifetime of the stream. Learning algorithms need to be very efficient, because each example in a stream can be processed only once, or else the examples need to be summarized with a small memory footprint.
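To make the "process once, summarize in small memory" constraint concrete, here is a minimal Python sketch of a single-pass summary. It uses Welford's online algorithm for the running mean and variance; this is a generic illustration, not code taken from streamDM, and the class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """Single-pass summary of a numeric stream (Welford's algorithm).

    Hypothetical helper for illustration only; not part of streamDM.
    Memory use is constant: three numbers, however long the stream is.
    """

    def __init__(self):
        self.n = 0        # number of examples seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one example into the summary; the example is then discarded."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self._m2 / self.n if self.n > 0 else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:  # stand-in for an unbounded stream
    stats.update(value)

print(stats.n, stats.mean, stats.variance)
```

Each example is touched exactly once and never stored, which is the property stream learners depend on.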

Spark Streaming, which makes it easy to build scalable, fault-tolerant streaming applications, is an extension of the core Spark API (a fast and general engine for large-scale data processing) that enables stream processing from a variety of sources.

Spark is an extensible and programmable framework for massively distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which the Spark engine then processes to generate the results. The stream is exposed as a DStream, represented internally as a sequence of RDDs.

Methods implemented in streamDM are: SGD learner and perceptron; naïve Bayes; CluStream; Hoeffding Decision Trees; bagging; StreamKM++; HyperplaneGenerator; RandomTreeGenerator; RandomRBFGenerator; RandomRBFEventsGenerator.
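The batching idea described above can be sketched in a few lines of plain Python. This is only a conceptual illustration and does not use the Spark API: Spark Streaming cuts batches by time interval and distributes them across a cluster, whereas this sketch cuts by element count on one machine. The names `micro_batches` and `process` are hypothetical.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Split an incoming stream into consecutive small batches.

    Simplifying assumption: batches are cut by element count here,
    whereas Spark Streaming cuts them by time interval.
    """
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process(batch):
    """Stand-in for the work the engine would do on one batch."""
    return sum(batch)


# Each batch is processed independently, yielding one result per batch.
results = [process(batch) for batch in micro_batches(range(10), 4)]
print(results)  # [6, 22, 17]
```

The sequence of batches plays the role that the sequence of RDDs plays inside a DStream.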

SampleDataWriter is also implemented; it can call the data generators to create sample data for simulation or testing. Methods planned for upcoming releases include: classification (random forests); regression (Hoeffding regression tree, bagging, random forests); clustering (ClusTree, DenStream); and frequent itemset mining (IncMine, IncSecMine).

1 Review
  • Nubia Thibodaux
    April 22, 2017 at 10:48 am

    Data mining big data streams using Spark Streaming

    What are the benefits?

    Spark Streaming, an extension of the core Spark API, enables stream processing from different sources. It receives data streams and divides the data into batches, which are then processed by the Spark engine to generate results.

    Company size

    Medium (50 to 1000)

    User Role

    Super User

    User Industry

    Manufacturing

    Rating
    Ease of use: 8.5

    Features & Functionality: 8.1

    Advanced Features: 8

    Integration: 8.1

    Performance: 8.6

    Training: 8.4

    Customer Support: 8.2

    Implementation: 8.4

    Renew & Recommend: 8.3

    ADDITIONAL INFORMATION
    streamDM is one of the new open source software packages used for mining big data streams with Spark Streaming.
