Review

streamDM

Overview

Synopsis

streamDM is new open source software for mining big data streams using Spark Streaming, developed at Huawei Noah's Ark Lab.

Category

Data Mining Software Free

Features

•Open source software for mining big data streams
•Spark Streaming extension
•Implemented methods: CluStream, Hoeffding Decision Trees, bagging, StreamKM++, HyperplaneGenerator

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Website

streamDM

Company

streamDM

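PAT Rating™

Ease of use: Editor Rating 7.6, Aggregated User Rating 8.5
Features & Functionality: Editor Rating 7.6, Aggregated User Rating 7.4
Advanced Features: Editor Rating 7.6, Aggregated User Rating 8.0
Integration: Editor Rating 7.6, Aggregated User Rating 8.1
Performance: Editor Rating 7.6, Aggregated User Rating 8.6
Customer Support: Editor Rating 7.6, Aggregated User Rating 8.2
Implementation: 8.4
Renew & Recommend: 8.3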

Bottom Line

Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible and programmable framework for massively distributed processing of datasets, which it calls Resilient Distributed Datasets (RDDs).

Editor Rating: 7.6

Aggregated User Rating: 8.2 (2 ratings)

streamDM is open source software for mining big data streams with Spark Streaming, developed at Huawei Noah's Ark Lab. It is licensed under the Apache Software License v2.0.

Learning from big data streams is more challenging than batch or offline learning because the data may not keep the same distribution over the lifetime of the stream. Learning algorithms need to be very efficient, because each example arriving in a stream can be processed only once, or else the examples need to be summarized with a small memory footprint.
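To make the "process once, summarize in small memory" constraint concrete, here is a minimal sketch in plain Python. It is not streamDM code: the class `RunningStats` and the helper `summarize` are hypothetical names, and the technique shown is Welford's running mean and variance, chosen only because it is a simple example of a one-pass summary.

```python
class RunningStats:
    """Constant-memory summary of a numeric stream (Welford's algorithm).

    Hypothetical illustration, not part of streamDM.
    """

    def __init__(self):
        self.n = 0        # number of examples seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Consume one example; it is never stored or revisited."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self._m2 / self.n if self.n else 0.0


def summarize(stream):
    """Make a single pass over an iterable and return its summary."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats
```

The summary keeps three numbers no matter how long the stream is, which is the property the paragraph above asks of stream learners. Handling a distribution that changes over time would need something more, such as a sliding window or a decay factor, which this sketch does not attempt.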

Spark Streaming is an extension of the core API of Spark, a fast and general engine for large-scale data processing. It enables stream processing from a variety of sources and makes it easy to build scalable, fault-tolerant streaming applications.

Spark is an extensible and programmable framework for massively distributed processing of datasets, called Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which the Spark engine then processes to generate the results. The streaming data is organized into DStreams, each represented internally as a sequence of RDDs.

The methods implemented in streamDM are: SGD learner and perceptron; naïve Bayes; CluStream; Hoeffding Decision Trees; bagging; StreamKM++; HyperplaneGenerator; RandomTreeGenerator; RandomRBFGenerator; RandomRBFEventsGenerator.
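The batching model described above, where an incoming stream is cut into batches and each batch is processed in turn, can be pictured with a short plain-Python sketch. This is an illustration only and does not use the Spark API: `micro_batches` and `process` are hypothetical names, and the batches here are cut by item count for simplicity, whereas Spark Streaming cuts them by a time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield consecutive lists of up to batch_size items from an iterable."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:      # the stream is exhausted
            return
        yield batch


def process(stream, batch_size, batch_fn):
    """Apply batch_fn to each batch in arrival order and collect the results."""
    return [batch_fn(batch) for batch in micro_batches(stream, batch_size)]
```

For example, `process(range(1, 8), 3, sum)` returns `[6, 15, 7]`: one result per batch, produced in the order the batches arrived. In Spark Streaming the analogous per-batch unit is an RDD inside a DStream.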

SampleDataWriter is also implemented; it can call the data generators to create sample data for simulation or testing. More methods are planned for upcoming releases: classification (random forests); regression (Hoeffding regression tree, bagging, random forests); clustering (Clustree, DenStream); and frequent itemset mining (IncMine, IncSecMine).
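As a rough idea of what generated sample data for simulation or testing can look like, the sketch below produces labelled examples separated by a random hyperplane. It is a hypothetical stand-in written in plain Python, not streamDM's HyperplaneGenerator or SampleDataWriter: the function name `hyperplane_stream`, its parameters, and the labelling rule (weighted sum compared against half the total weight) are all assumptions made for illustration.

```python
import random


def hyperplane_stream(n_examples, n_features=3, seed=42):
    """Yield (features, label) pairs split by a random hyperplane.

    Hypothetical sketch: features are uniform in [0, 1), and the label is 1
    when the weighted sum reaches half of the total weight, otherwise 0.
    """
    rng = random.Random(seed)                       # seeded for repeatable test data
    weights = [rng.random() for _ in range(n_features)]
    threshold = 0.5 * sum(weights)
    for _ in range(n_examples):
        x = [rng.random() for _ in range(n_features)]
        score = sum(w * xi for w, xi in zip(weights, x))
        yield x, (1 if score >= threshold else 0)
```

Because the generator is seeded, two runs with the same arguments yield identical data, which is convenient when comparing learners on the same simulated stream.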

1 Review

Nubia Thibodaux
April 22, 2017 at 10:48 am

Data mining big data streams using Spark Streaming
What are the benefits?
Spark Streaming, an extension of the core Spark API, enables stream processing from different sources. It receives data streams and divides the data into batches, which are then processed by the Spark engine to generate results.

Company size
Medium (50 to 1000)

User Role
Super User
User Industry
Manufacturing
Rating
Ease of use: 8.5
Features & Functionality: 8.1
Advanced Features: 8
Integration: 8.1
Performance: 8.6
Training: 8.4
Customer Support: 8.2
Implementation: 8.4
Renew & Recommend: 8.3

ADDITIONAL INFORMATION
streamDM is a new open source software package for mining big data streams using Spark Streaming.