50 Top Free Data Mining Software

Data mining is the computational process of discovering patterns in large data sets using methods from artificial intelligence, machine learning, statistical analysis, and database systems, with the goal of extracting information from a data set and transforming it into an understandable structure for further use. Orange Data mining, R Software Environment, RapidMiner, Weka Data Mining, KNIME, SpagoBI Business Intelligence, Anaconda, Shogun, ELKI, Scikit-learn, CMSR Data Miner, Fityk, mlpy, Dlib, Rattle GUI, GNU Octave, Pandas, Natural Language Toolkit, OpenNN, TANAGRA, Alteryx Project Edition, Apache UIMA, Vowpal Wabbit, MALLET, CLUTO, streamDM, DataMelt, Chemicalize.org, Jubatus, MiningMart, Databionic ESOM, Apache Mahout, TraMineR, ROSETTA, KEEL, ADaM, ML-Flex, Modular toolkit for Data Processing, Dataiku, SenticNet API, LIBSVM and LIBLINEAR, Lattice Miner, Gnome datamine tools, yooreeka, AstroML, jHepWork, ARMiner, and arules are among the top free data mining tools.

Top Free Data Mining Software: Trending

Free Data Mining Software: Top Twenty (PAT Index™)

1. Orange Data mining
2. R Software Environment
3. Weka Data Mining
4. SpagoBI Business Intelligence
5. Anaconda
6. Shogun
7. DataMelt
8. Natural Language Toolkit
9. Apache Mahout
10. GraphLab Create
11. Scikit-learn
12. ELKI
13. GNU Octave
14. Lavastorm Analytics Engine
15. RapidMiner Starter Edition
16. Apache UIMA
17. KNIME Analytics Platform Community
18. CMSR Data Miner
19. LIBLINEAR
20. Rattle GUI

 

Top Free Data Mining Software

1. Orange Data mining

Orange is an open source data visualization and analysis tool. It is developed at the Bioinformatics Laboratory at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia, along with the open source community. Data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics. Orange is also a Python library: scripts can run in a terminal window, in integrated environments like PyCharm and PythonWin, or in shells like IPython. Orange consists of a canvas interface onto which the user places…

2. R Software Environment

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Its facilities include an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display either directly at the computer or on hardcopy; and a well-developed, simple and effective programming language which includes conditionals,…

3. RapidMiner

RapidMiner provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics. It is used for business and industrial applications as well as for research, education, training, rapid prototyping, and application development. RapidMiner supports all steps of the data mining process, including results visualization, validation and optimization. It uses a client/server model with the server offered as Software as a Service or on cloud infrastructures. RapidMiner provides data mining and machine learning procedures including data loading and transformation, data preprocessing and visualization, predictive analytics and statistical modeling, evaluation, and deployment. RapidMiner is written in…

4. Weka Data Mining

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes. Weka provides access to SQL databases…

5. KNIME

KNIME, the Konstanz Information Miner, is an open source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept, and provides a graphical user interface that allows assembly of nodes for data preprocessing, modeling, data analysis and visualization. KNIME Analytics Platform provides over 1000 data analytic routines, either natively or through R and Weka, for such topics as Univariate and Multivariate Statistics, Data Mining, Time Series, Image Processing, Web Analytics, Text Mining, Network Analysis and Social Media Analysis. KNIME analytic workflows can be run through the interactive…

6. SpagoBI Business Intelligence

SpagoBI is an open source Business Intelligence suite that offers a large range of analytical functions, a functional semantic layer and a set of advanced data visualization features including geospatial analytics. The modules of the SpagoBI suite are SpagoBI Server, SpagoBI Studio, SpagoBI Meta and SpagoBI SDK. SpagoBI Server is the main module of the suite, offering the core and analytical functionalities. It provides two conceptual models, called the Analytical Model and the Behavioural Model, along with administration tools and cross-platform services. SpagoBI Studio allows the developer to design and modify analytical documents such as reports, charts,…

7. Anaconda

Anaconda is an open data science platform powered by Python. The open source version of Anaconda is a high performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science. There is also access to over 720 packages that can easily be installed with conda, the package, dependency and environment manager included in Anaconda. It includes the most popular Python, R and Scala packages for stats, data mining, machine learning, deep learning, simulation and optimization, geospatial, text and NLP, graph and network, and image analysis. Featured packages include: NumPy, SciPy,…

8. Shogun

Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. It now offers features that span the whole space of machine learning methods, including many classical methods in classification, regression, dimensionality…

9. ELKI

The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined. ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest neighbor search, range/radius search, and distance…
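
The idea of combining interchangeable distance functions with nearest neighbor and range queries can be sketched in a few lines of Python. This is not ELKI code (ELKI is a Java framework): the function names `knn_query` and `range_query`, the linear scan in place of an index, and the sample points are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_query(points, query, k, distance):
    """Return the k points closest to `query` under `distance` (linear scan)."""
    return sorted(points, key=lambda p: distance(p, query))[:k]

def range_query(points, query, radius, distance):
    """Return all points within `radius` of `query` under `distance`."""
    return [p for p in points if distance(p, query) <= radius]

# Hypothetical toy data set.
points = [(0, 0), (1, 1), (2, 2), (5, 5), (6, 5)]
query = (1, 2)

# The same queries accept any distance function.
nearest_euclid = knn_query(points, query, 2, euclidean)
within_three_manhattan = range_query(points, query, 3, manhattan)
print(nearest_euclid)
print(within_three_manhattan)
```

Passing the distance as a parameter is what lets one query routine be reused with different metrics; a real framework would replace the linear scan with an index structure.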

10. Scikit-learn

Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Classification: identifying which category an object belongs to. Applications: spam detection, image recognition. Algorithms: SVM, nearest neighbors, random forest. Regression: predicting a continuous-valued attribute associated with an object. Applications: drug response, stock prices. Algorithms: SVR, ridge regression. Clustering: automatic grouping of similar objects into sets. Applications: customer segmentation, grouping experiment outcomes.…
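
To make the clustering task concrete, here is a minimal k-means sketch in plain Python. It deliberately uses only the standard library rather than scikit-learn's API, so the function `kmeans`, its parameters and the toy data are illustrative assumptions; initialization naively takes the first k points as starting centroids.

```python
def kmeans(points, k, iterations=10):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Hypothetical toy data: two well-separated groups, interleaved.
data = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2)
print(labels)
print(centroids)
```

With well-separated groups like these the assignment stabilizes after the first pass; on real data the result depends on the initialization, which is why library implementations use smarter seeding.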

11. CMSR Data Miner

StarProbe Data Miner, or CMSR Data Miner Suite, is software which provides an integrated environment for predictive modeling, segmentation, data visualization, statistical data analysis, and rule-based model evaluation. An integrated analytics and rule-engine environment is also provided for advanced power users. Its features include deep learning modeling and RME-EP, a powerful expert system shell rule engine that supports predictive models such as neural networks, self-organizing maps, decision trees and regression. It has been developed to use SQL-like expressions which users can learn easily and quickly. Also, RME-EP expert system rules can be written by non-IT…

12. Fityk

Fityk is a program for data processing and nonlinear curve fitting. It is primarily used by scientists who analyse data from powder diffraction, chromatography, photoluminescence and photoelectron spectroscopy, infrared and Raman spectroscopy, and other experimental techniques. It is mostly used to fit peaks, that is, bell-shaped functions (Gaussian, Lorentzian, Voigt, Pearson VII, bifurcated Gaussian, EMG, Doniach-Sunjic, etc.), but it is suitable for fitting any curve to 2D (x,y) data. Fityk offers an intuitive graphical interface (and also a command line interface), support for many data file formats thanks to the xylib library, dozens of built-in functions and support for…
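
As a rough illustration of peak fitting, the sketch below recovers the height, center and width of a single Gaussian peak from (x, y) data. It is not Fityk's method or API: Fityk does nonlinear fitting, whereas this sketch swaps in a simpler log-linear least-squares trick (fit a parabola to ln y) that is only appropriate for one clean, strictly positive Gaussian. The function names and the synthetic data are hypothetical.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def fit_gaussian(xs, ys):
    """Fit y = height * exp(-(x - center)^2 / (2 * sigma^2)) via a parabola in ln(y)."""
    logs = [math.log(y) for y in ys]
    # Normal equations for the least-squares fit ln(y) = c0 + c1*x + c2*x^2.
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(l * x ** p for x, l in zip(xs, logs)) for p in range(3)]
    a = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    c0, c1, c2 = solve3(a, t)
    # Map the parabola coefficients back to Gaussian parameters.
    sigma = math.sqrt(-1.0 / (2.0 * c2))
    center = -c1 / (2.0 * c2)
    height = math.exp(c0 - c1 ** 2 / (4.0 * c2))
    return height, center, sigma

# Synthetic, noise-free peak with height 3.0, center 2.0, sigma 0.8.
xs = [i * 0.5 for i in range(9)]
ys = [3.0 * math.exp(-(x - 2.0) ** 2 / (2 * 0.8 ** 2)) for x in xs]
height, center, sigma = fit_gaussian(xs, ys)
print(round(height, 3), round(center, 3), round(sigma, 3))
```

The log transform turns the Gaussian into a quadratic, so ordinary linear least squares suffices here; with noisy data or overlapping peaks an iterative nonlinear fit is the usual choice.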

13. mlpy

mlpy (Machine Learning Python) is a Python module for machine learning built on top of NumPy/SciPy (a Python-based ecosystem of open-source software for mathematics, science, and engineering) and the GNU Scientific Library (a numerical library for C and C++ programmers that provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting). It provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems, and it aims at a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency. It provides high-level functions and classes allowing, with few lines…

14. Dlib

Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. It is free of charge, which means that users can use it in any app. Major features of Dlib include: documentation: it provides complete and precise documentation for every class and function, and lots of example programs are provided; high quality portable code: good unit test coverage, tested on MS Windows,…

15. Rattle GUI

Rattle is free open source software, and the source code is available from the Bitbucket repository. Rattle gives the user the freedom to review the code, use it for whatever purpose the user likes, and extend it however they like, without restriction. Rattle is a popular GUI for data mining using R. It presents statistical and visual summaries of data, transforms data so that it can be readily modelled, builds both unsupervised and supervised models from the data, presents the performance of models graphically, and scores new datasets. One of the most important features is that all of the user’s interactions…

16. GNU Octave

GNU Octave is a high-level language intended for numerical computations. Through its command line interface, users can solve linear and nonlinear problems numerically and perform other numerical experiments using a language that is mostly compatible with Matlab. Its features include a powerful mathematics-oriented syntax with built-in plotting and visualization tools. It is free software that runs on GNU/Linux, macOS, BSD, and Windows, and it is compatible with many Matlab scripts. It can be run in several ways: in GUI mode, as a console, or invoked as…

17. Pandas

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is a NumFOCUS sponsored project. This helps ensure the success of pandas development as a world-class open-source project, and makes it possible to donate to the project. The best way to get pandas is to install it via conda. Builds for osx-64, linux-64, linux-32, win-64 and win-32 for Python 2.7, Python 3.4, and Python 3.5 are all available. This is a major release from 0.19.2 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements along with a large…

18. Natural Language Toolkit

NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum. Thanks to a hands-on guide introducing programming fundamentals alongside topics in computational linguistics, plus comprehensive API documentation, NLTK is suitable for linguists, engineers, students, educators, researchers, and industry users alike. NLTK is available for Windows, Mac OS X, and Linux. Best of all, NLTK…
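
Two of the text processing steps named above, tokenization and stemming, can be illustrated without any dependencies. The sketch below uses only Python's standard library and is not NLTK's API: the `tokenize` and `crude_stem` helpers are hypothetical, and the suffix-stripping rule is far cruder than a real stemmer.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Strip a few common English suffixes (toy rule, not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Hypothetical sample sentence.
text = "Miners mined the data; mining data reveals patterns."
tokens = tokenize(text)
stems = [crude_stem(t) for t in tokens]
counts = Counter(stems)
print(tokens)
print(counts.most_common(2))
```

Stemming lets "mined" and "mining" count as the same term, which is typically what a frequency analysis wants; a toolkit adds properly designed stemmers, taggers and corpora on top of this basic pipeline.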

19. OpenNN

OpenNN is an open source class library written in the C++ programming language which implements neural networks, a main area of machine learning research. The library implements any number of layers of non-linear processing units for supervised learning. This deep architecture allows the design of neural networks with universal approximation properties. The main advantage of OpenNN is its high performance. It is developed in C++ for better memory management and higher processing speed, and implements CPU parallelization by means of OpenMP and GPU acceleration with CUDA. OpenNN has been written in ANSI C++. This means that the library can be built…

20. TANAGRA

Tanagra is free data mining software for academic and research purposes. It provides several data mining methods from exploratory data…

6 Reviews
  • Mike
    March 17, 2014 at 9:23 am

    Hello bud, on your data mining softwares witch 1 would u recommend for email mining? Thank you

  • Phoenix
    April 1, 2014 at 11:50 pm

    Do any of these have non-English capabilities?

  • Venkatesh
    July 29, 2014 at 12:52 am

    Hi buddy! Are there any attempts to do cloud based data analytics softwares? I think such a thing can solve the problem Phoenix had mentioned.

  • K R Chin
    January 25, 2015 at 6:14 pm

    I’d like to know if there are any data mining programs which could be used to predict terrorist activities or analyze material movements (shipping, purchases, and orders) to search for indicators of suspicious activity.

    I’m a security consultant and advisor, this sort of information would be useful in my consultations.

  • Mahrez
    March 5, 2015 at 4:00 pm

    Hi KR Chin,

    To predict any activity you need to know which variables you want to base your prediction on. You also need a historical data to run your predictive analysis and find the possible correlations between different event. I know that somewhere in the US the police uses crime predictions based on historical criminality data (new Orleans if I am not mistaken)…bottom line : you need data to get the info ! have fun 🙂

  • February 17, 2017 at 11:50 am

    See AdvancedMiner by Algolytics. They provide free/community version http://algolytics.com/products/advancedminer/

About The Author
imanuel