Top 34 Free Data Analysis Software

Top 34 Free Data Analysis Software: a list of more than 34 free data analysis tools. Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. ELKI, Dataiku DSS, ITALASSI, R, Data Applied, DevInfo, Tanagra, Waffles, Weka, Gephi, OpenRefine, Fusion Tables, DataMelt, Orange, Wrangler, Encog, RapidMiner, PAW, SCaVis, ILNumerics.Net, ROOT, Julia, MOA, NumPy, SciPy, KNIME, NetworkX, matplotlib, IPython, SymPy, Scilab, FreeMat, jMatLab, NodeXL Basic, Fluentd, and Tableau Public are some of the top free or open source software packages for data analysis. Data analysis can be classified into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). Descriptive statistics quantitatively describes the main features of a collection of information. Exploratory data analysis focuses on discovering new features in the data. Confirmatory data analysis deals with confirming or falsifying existing hypotheses.
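
As a minimal illustration of the descriptive statistics category, the sketch below summarizes a small sample using only Python's standard `statistics` module. The sample values and the `orders` name are made up for illustration and do not come from any of the tools listed here.

```python
import statistics

# Hypothetical sample: daily order counts (made-up values for illustration).
orders = [12, 15, 11, 19, 15, 22, 15, 18]

# Quantitatively describe the main features of the sample.
summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "mode": statistics.mode(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
    "min": min(orders),
    "max": max(orders),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Exploratory and confirmatory analysis would typically build on a summary like this, for example by plotting the values or by testing a stated hypothesis about the mean.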

You may also like to review the top free data mining software list:
Top Free Data Mining Software

You may also like to review the top proprietary data mining software list:
Top Data Mining Software

Top Free Data Analysis Software

1

ELKI

ELKI stands for Environment for DeveLoping KDD-Applications Supported by Index Structures. It is a knowledge discovery in databases (KDD) software framework developed for use in research and teaching by the database systems research unit at the Ludwig Maximilian University of Munich, Germany. ELKI supports the development and evaluation of advanced data mining algorithms and their interaction with database index structures.

In ELKI, data mining algorithms and data management tasks are separated, which allows each to be evaluated independently. This separation makes ELKI unique among data mining frameworks. ELKI is open to arbitrary data types, distance or similarity measures, and file formats.

2

Dataiku DSS

Dataiku DSS Community Edition is a collaborative data science platform that enables teams to explore, prototype, build, and deliver their own data products more efficiently. It provides an interactive visual interface where users can point, click, and build, or use languages such as SQL to wrangle data, build models, re-run workflows, visualize results, and get up-to-date insights on demand. Dataiku DSS lets users draft data preparation and modeling steps in seconds, use their preferred ML libraries (scikit-learn, R, MLlib, H2O, and so on), and automate their work in a customizable interface. It also helps coordinate development and operations by handling workflow automation, creating predictive web services, and monitoring data and model health on a daily basis, so that teams do not have to worry about multi-technology platforms.

3

ITALASSI

ITALASSI is a freeware program from Provalis Research that facilitates the interpretation of regression models with two independent variables and an interaction term. The program lets users enter several regression models (such as two bivariate, one multiple additive, and one multiple with interaction) in the form of equations, or compute those equations from raw data, and it displays the various models using 2D and 3D graphs. The program can also be used in advanced statistics courses to illustrate statistical interactions or applied multiple regression.
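
To make the idea of a regression model with an interaction term concrete, here is a minimal sketch in Python with NumPy. It is not ITALASSI itself and does not use any Provalis Research API. The synthetic data, the true coefficients, and the helper name `slope_of_x1` are all assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y depends on x1, x2
# and their product, plus a little noise.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=200)
x2 = rng.uniform(-1.0, 1.0, size=200)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 3.0 * x1 * x2 + rng.normal(0.0, 0.05, size=200)

# Design matrix with columns: intercept, x1, x2, interaction x1*x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Ordinary least squares fit of y = b0 + b1*x1 + b2*x2 + b3*x1*x2.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = coef


def slope_of_x1(x2_value):
    """With an interaction term, the slope of y on x1 depends on x2."""
    return b1 + b3 * x2_value


print("coefficients:", np.round(coef, 2))
print("slope of x1 at x2 = -1:", round(float(slope_of_x1(-1.0)), 2))
print("slope of x1 at x2 = +1:", round(float(slope_of_x1(1.0)), 2))
```

The changing slope is the point of the exercise: without the `x1 * x2` column the effect of `x1` would be the same at every value of `x2`, which is what an additive model assumes.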

4

R

R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis. R is an implementation of the S programming language.

5

Tanagra

Tanagra supports several standard data mining tasks, such as visualization, descriptive statistics, instance selection, feature selection, feature construction, regression, factor analysis, clustering, classification, and association rule learning.

6

Waffles

The Waffles machine learning toolkit provides command-line tools for performing various operations related to machine learning, data mining, and predictive modeling. The tools are simple to use in scripted experiments or processes. Algorithms included in Waffles support multi-dimensional labels, classification and regression, automatically impute missing values, and automatically apply the filters needed to transform the data to a type that the algorithm supports.

7

Weka

Weka is a collection of visualization tools and algorithms for data analysis and predictive modeling with graphical user interfaces for easy access. Weka supports several standard data mining tasks, more specifically, data preprocessing, clustering, classification, regression, visualization, and feature selection.

8

Gephi

Gephi is free, open-source visualization and exploration software for all kinds of graphs and networks. It can be used for exploratory data analysis (intuition-oriented analysis through real-time network manipulation), link analysis (revealing the underlying structures of associations between objects), and social network analysis (easy creation of social data connectors to map community organizations and small-world networks).

9

OpenRefine

OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it and transforming it from one format into another. It can also extend the data with web services and external data. Features include dataset exploration, basic and advanced cell transformations, instantaneous links between datasets, filtering and partitioning, named-entity extraction on full-text fields to automatically identify topics, and advanced data operations.

10

Fusion Tables

Fusion Tables is an experimental data visualization web application for gathering, visualizing, and sharing data tables. It filters and summarizes across hundreds of thousands of rows and lets users create a chart, map, network graph, or custom layout, then embed or share it. It also lets users collaborate, merge two or three tables to generate a single visualization that combines the data sets, and find public data to combine with their own for a better visualization.

11

DataMelt

DataMelt (DMelt) is software for numeric computation, statistics, analysis of large data volumes, and scientific visualization. DMelt can be used with several scripting languages, such as Python/Jython, BeanShell, Groovy, and Ruby, as well as with Java. It includes more than 30,000 Java classes for computation and visualization. In addition, more than 4,000 classes come with the Java API, plus 500 Python modules.

12

Orange

Orange is an open source data visualization and data analysis tool for novices and experts. It provides a large toolbox for building interactive workflows to analyse and visualize data. Orange is packed with different visualizations, from scatter plots, bar charts, and trees to dendrograms, networks, and heat maps.

13

Wrangler

Wrangler is an interactive tool for data cleaning and transformation. It supports interactive transformation of messy, real-world data into the data tables that analysis tools expect, and it exports data for use in Excel, R, Tableau, and Protovis.

14

Data Applied

Data Applied is an online data mining and data visualization solution. It offers visualization tools and algorithms for data analysis and data mining. The product supports several types of analytical tasks, including visual reporting, tree maps, time series forecasting, correlation analysis, outlier detection, decision trees, association rules, clustering, and self-organizing maps.

15

DevInfo

DevInfo is a database system endorsed by the United Nations Development Group for monitoring and analyzing human development. DevInfo is distributed royalty-free to all UN member states.

DevInfo is a tool for organizing, storing, and presenting data in a uniform way to facilitate data sharing at the country level across government departments, UN agencies, and development partners. DevInfo has features that produce tables, graphs, and maps for inclusion in reports, presentations, and advocacy materials. The software supports both standard indicators (the Millennium Development Goal indicators) and user-defined indicators. DevInfo is compliant with international statistical standards to support open access and widespread data exchange, and it operates both as a desktop application and on the web.

16

RapidMiner

RapidMiner is an environment for machine learning, data mining, text mining, predictive analytics, and business analytics.

17

PAW

PAW (Physics Analysis Workstation) is a FORTRAN/C data analysis framework. It is an interactive, scriptable software tool for data analysis and graphical presentation in high-energy physics, developed at CERN.

18

SCaVis

SCaVis is a Java multi-platform data analysis framework developed at ANL. SCaVis can be used wherever analysis of large numerical data volumes, data mining, statistical data analysis, and mathematics are essential. The program can be used in the natural sciences, engineering, and the modeling and analysis of financial markets.

17

ROOT

ROOT is a C++ data analysis framework developed at CERN and H5. It was originally designed for particle physics data analysis and contains several features specific to this field, but it is also used in other applications such as astronomy and data mining.

18

Encog

Encog is an advanced machine learning framework. It supports a variety of advanced algorithms, as well as classes to normalize and process data. Supported machine learning algorithms include Support Vector Machines, Artificial Neural Networks, Genetic Programming, Bayesian Networks, Hidden Markov Models, and Genetic Algorithms. The Encog framework is available for Java, .NET, and C++.

19

ILNumerics.Net

ILNumerics is a numerical library for .NET. It supports the creation of algorithms and visualizations for scientific computing and all kinds of technical applications. ILNumerics simplifies the implementation of an array of numerical algorithms. ILNumerics does not come with an interpreter and directly utilizes features of modern development environments and programming languages like C#.

20

Julia

Julia is a high-level dynamic programming language designed to address the requirements of high-performance numerical and scientific computing. Julia is written in C, C++, and Scheme using the LLVM compiler framework, while most of Julia's standard library is implemented in Julia. Julia provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.

21

NodeXL Basic

NodeXL Basic is a free, open-source template for Microsoft Excel that makes it easy to explore network graphs. You enter a network edge list in a worksheet, click a button, and see your graph, all in the familiar environment of the Excel window.

22

Fluentd

Fluentd unifies data collection and consumption for better use and understanding of data. It decouples application logging from backend systems through a unified logging layer, which lets developers and data analysts use application logs as they are generated. Fluentd supports most of the common programming languages.

23

Tableau Public

Tableau Public lets users create and share interactive charts and graphs, maps, live dashboards, and applications, and publish them anywhere on the web. Visualizations ("vizzes") can be shared via email, Twitter, Facebook, LinkedIn, and Google+, and embedded on websites.

23

NumPy

NumPy is the fundamental package for scientific computing with Python. It contains a powerful N-dimensional array object, broadcasting functions, tools for integrating C/C++ and Fortran code, and useful linear algebra, Fourier transform, and random number capabilities.
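
The capabilities named above can be sketched in a few lines. This is a minimal example, assuming NumPy is installed. The array contents, the 2x2 system, and the variable names are made up for illustration.

```python
import numpy as np

# N-dimensional array: 3 rows, 4 columns.
a = np.arange(12, dtype=float).reshape(3, 4)

# Broadcasting: the column means have shape (4,) and are subtracted
# from every row of the (3, 4) array.
centered = a - a.mean(axis=0)

# Linear algebra: solve the system M @ x = b.
M = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(M, b)

# Fourier transform: find the dominant frequency bin of a sine wave
# that completes 5 cycles over 64 samples.
t = np.arange(64)
signal = np.sin(2 * np.pi * 5 * t / 64)
spectrum = np.abs(np.fft.rfft(signal))
peak_bin = int(np.argmax(spectrum))

# Random numbers: reproducible draws from a seeded generator.
rng = np.random.default_rng(42)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

print("solution:", x)
print("peak frequency bin:", peak_bin)
print("sample mean:", samples.mean())
```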

24

SciPy

SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
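
As a small taste of what that ecosystem offers, the sketch below uses the SciPy library itself (one part of the ecosystem) for integration, optimization, and a correlation statistic. It assumes SciPy is installed, and the functions and sample values are made up for illustration.

```python
from scipy import integrate, optimize, stats

# Numerical integration: integral of x^2 from 0 to 3 (exact value is 9).
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 3.0)

# Optimization: minimize a one-dimensional quadratic with its minimum at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

# Statistics: Pearson correlation of two small made-up samples.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
r, p_value = stats.pearsonr(xs, ys)

print("integral:", area)
print("minimizer:", result.x)
print("correlation:", r)
```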

25

MOA, Massive On-line Analysis

MOA (Massive On-line Analysis) is a framework for data stream mining. It includes tools for evaluation and a collection of machine learning algorithms. It contains a prequential evaluation method, the EDDM concept drift method, a reader for real datasets in ARFF format, and artificial stream generators such as SEA concepts, STAGGER, rotating hyperplane, random tree, and random radial basis functions. MOA supports bi-directional interaction with Weka.
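
Prequential evaluation is often described as "test-then-train": each instance from the stream is first used to test the current model and only then used to update it. The sketch below illustrates that idea in plain Python. It is not MOA's API (MOA is a Java framework), and the `MajorityClassLearner` class, the `synthetic_stream` generator, and the 70/30 label split are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


class MajorityClassLearner:
    """Predicts the most frequent label seen so far (a simple baseline)."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def synthetic_stream(n, seed=0):
    """Yield (features, label) pairs; label 1 appears about 70% of the time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield (x,), 1 if x < 0.7 else 0


def prequential_accuracy(learner, stream):
    """Test-then-train: predict each instance first, then learn from it."""
    correct = 0
    total = 0
    for features, label in stream:
        if learner.predict(features) == label:
            correct += 1
        learner.learn(features, label)
        total += 1
    return correct / total if total else 0.0


accuracy = prequential_accuracy(MajorityClassLearner(), synthetic_stream(5000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Because every instance is scored before the model has seen it, no separate hold-out set is needed, which suits streams that cannot be stored in full.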

26

SAP Lumira

SAP Lumira lets users analyze spreadsheet data through data manipulation and visual analysis. The personal edition is free.

27

NetworkX

NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
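
A minimal sketch of creating, manipulating, and studying a network with NetworkX follows. It assumes NetworkX is installed, and the node names and edge list are made up for illustration.

```python
import networkx as nx

# Creation: a small undirected graph built from a made-up edge list.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"), ("d", "e"),
])

# Manipulation: add a weighted edge, then remove the new node again.
G.add_edge("e", "f", weight=2.0)
G.remove_node("f")

# Study of structure: degrees, a shortest path, centrality, and density.
degrees = dict(G.degree())
path = nx.shortest_path(G, source="a", target="e")
centrality = nx.degree_centrality(G)
density = nx.density(G)

print("degrees:", degrees)
print("shortest path a -> e:", path)
print("most central node:", max(centrality, key=centrality.get))
print("density:", density)
```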

28

KNIME

KNIME, the Konstanz Information Miner, is a user-friendly and comprehensive data analytics framework that integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface allows the assembly of nodes for data preprocessing (ETL: extraction, transformation, loading), modeling, data analysis, and visualization.

29

SymPy

SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python; recent versions depend only on mpmath, which is itself a pure-Python library.
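
The following minimal sketch shows a few basic computer algebra operations in SymPy. It assumes SymPy is installed, and the expressions are arbitrary examples chosen for illustration.

```python
import sympy as sp

x = sp.symbols("x")

# Algebraic expansion of (x + 1)^2.
expanded = sp.expand((x + 1) ** 2)

# Symbolic differentiation of x*sin(x) with respect to x.
derivative = sp.diff(x * sp.sin(x), x)

# Definite integral of 2x from 0 to 3.
integral = sp.integrate(2 * x, (x, 0, 3))

# Exact roots of a quadratic equation.
roots = sp.solve(x ** 2 - 5 * x + 6, x)

print("expanded:", expanded)
print("derivative:", derivative)
print("integral:", integral)
print("roots:", roots)
```

Unlike a numerical library, SymPy keeps results exact: the integral and the roots come back as integers, not floating-point approximations.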

30

Scilab

Scilab is an open source, cross-platform numerical computational package and a high-level, numerically oriented programming language. It can be used for signal processing, statistical analysis, image enhancement, fluid dynamics simulations, numerical optimization, and the modeling and simulation of explicit and implicit dynamical systems.

31

FreeMat

FreeMat is a free, open source numerical computing environment and programming language, similar to MATLAB and GNU Octave. In addition to supporting many MATLAB functions and some IDL functionality, it features a codeless interface to external C, C++, and Fortran code, supports parallel distributed algorithm development (via MPI), and has plotting and 3D visualization capabilities.

32

jMatLab

jMatLab is a platform for mathematical and numerical computations. It is a clone of MATLAB and Octave. Unlike MATLAB, it is free. Unlike Octave, it runs on any platform where Java is installed. It can also run in a web browser.

33

IPython

IPython provides a rich architecture for interactive computing, with interactive shells (terminal and Qt-based), support for interactive data visualization and use of GUI toolkits, and flexible, embeddable interpreters to load into your own projects.

34

matplotlib

matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells, web application servers, and six graphical user interface toolkits.
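
The sketch below shows the script-style use mentioned above: it draws a simple line plot and renders it as a PNG without opening a window. It assumes matplotlib is installed. The data, labels, and the choice of an in-memory buffer are illustrative; passing a file name to `savefig` writes the figure to disk.

```python
import io
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render without a display
import matplotlib.pyplot as plt

# Made-up data: one period of a sine wave.
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(v) for v in xs]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(xs, ys, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

# Render to an in-memory PNG; other formats such as PDF and SVG work the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)

png_bytes = buffer.getvalue()
print(f"rendered {len(png_bytes)} bytes of PNG data")
```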

2 Reviews
  • Dmitri
    May 23, 2014 at 10:59 pm

    ADDITIONAL INFORMATION
    One correction: at the very top, ScaVi should be called “ScaVis”. I should say I like SCaVis best, since I can program in Python while accessing very rich Java numerical libraries.

  • Robert Nutt
    August 8, 2014 at 1:12 am

    ADDITIONAL INFORMATION
    A related tool for data analysis is json-csv.com. It is an online converter that can convert any JSON to CSV for processing within a spreadsheet.

About The Author
imanuel