Top 41 Free Data Analysis Software

Top Free Data Analysis Software: a list of 41+ top free data analysis tools. Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. Orange Data mining, R Software Environment, Weka Data Mining, Tableau Public, Arcadia Data, Microsoft R, ITALASSI, Shogun, Trifacta, ELKI, Scikit-learn, Data Applied, Lavastorm Analytics Engine, Gephi, DataMelt, TANAGRA, Julia, RapidMiner Starter Edition, SciPy, KNIME Analytics Platform Community, Dataiku DSS Community, Google Fusion Tables, Massive Online Analysis, NodeXL, DataPreparator, NetworkX, NumPy, OpenRefine, DataWrangler, PAW, ILNumerics, ROOT, DataCracker, Scilab, FreeMat, IPython, jMatLab, SymPy, Fluentd, EasyReg, and Matplotlib are some of the top free or open source software for data analysis. Data analysis can be classified into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). Descriptive statistics deals with quantitatively describing the main features of a collection of information. Exploratory data analysis focuses on discovering new features in the data. Confirmatory data analysis deals with confirming or falsifying existing hypotheses.
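As a minimal sketch of what descriptive statistics means in practice, the snippet below summarizes a small sample with Python's standard `statistics` module. The `orders` values are made up for illustration and do not come from any of the tools listed here.

```python
import statistics

# Hypothetical sample: daily order counts for one week (made-up numbers).
orders = [12, 15, 11, 19, 15, 22, 18]

# Quantitatively describe the main features of the sample.
summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
    "min": min(orders),
    "max": max(orders),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Exploratory and confirmatory analysis would go further, for example by plotting the data or testing a stated hypothesis, which this sketch does not attempt.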

 

Sisense

Sisense empowers even the most non-technical users to access data and build interactive dashboards and business intelligence reports. Sisense provides a variety of dashboard widgets to pinpoint the best visualization for your data, such as geographical maps, gauges to measure KPIs, line charts to determine trends, scatter plots to see correlations, and pie charts for clear comparisons. Sisense lets you customize the dashboard layout with drag-and-drop features to place each widget exactly where you want it for optimal representation.


Top Free Data Analysis Software: Trending

Top Data Analysis Software Free: Top Twenty (PAT Index™)

1. Orange Data mining
2. R Software Environment
3. Weka Data Mining
4. Tableau Public
5. Microsoft R
6. Arcadia Data
7. Shogun
8. DataMelt
9. RapidMiner Starter Edition
10. Lavastorm Analytics Engine
11. Julia
12. Scikit-learn
13. ITALASSI
14. ELKI
15. SciPy
16. KNIME Analytics Platform Community
17. Trifacta
18. Gephi
19. Scilab
20. Data Applied

Top Free Data Analysis Software

1

Orange Data mining

Orange is an open source data visualization and analysis tool. Orange is developed at the Bioinformatics Laboratory at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia, along with the open source community. Data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics. Orange is also a Python library. Python scripts can run in a terminal window, in integrated environments like PyCharm and PythonWin, or in shells like IPython. Orange consists of a canvas interface onto which the user places…

Bottom Line

Orange is an open source data visualization and analysis tool, where data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics.

Our Rating: 9.5
User Rating: 8.4 (2 ratings)

2

R Software Environment

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Some of the functionalities include an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display either directly at the computer or on hardcopy; and a well-developed, simple and effective programming language which includes conditionals,…

Bottom Line

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

Our Rating: 9.1
User Rating: 8.6 (2 ratings)

3

Weka Data Mining

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes. Weka provides access to SQL databases…

Bottom Line

Weka is a collection of machine learning algorithms for data mining tasks. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand.

Our Rating: 9.1
User Rating: 8.0 (1 rating)

4

Tableau Public

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, stunning maps, live dashboards and fun applications and publish them anywhere on the web. Tableau Public is a free service that lets anyone publish interactive data to the web. Tableau Public includes a free desktop product which can be downloaded and used to publish interactive data visualizations to the web. There is a 1 gigabyte limit on storage space for data. Tableau Public can connect to Microsoft Excel, Microsoft Access, and multiple text file formats. There is a limit of 1,000,000…

Bottom Line

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, stunning maps, live dashboards and fun applications and publish them anywhere on the web.

Our Rating: 7.8
User Rating: 8.6 (1 rating)

5

Arcadia Data

Arcadia Data unifies data discovery, visual analytics and business intelligence in a single, integrated platform that runs natively on Hadoop clusters. Arcadia Data does not require coding, and users can go straight into big data with an intuitive drag-and-drop self-service interface which provides exploration and semantic modeling across the breadth and depth of all business data. Arcadia Data allows working on multiple sources such as Hive, Impala, Postgres, Amazon Redshift, MySQL, Teradata Aster and many more. Its unique Active Data store models and tunes data structures continuously at Hadoop scale. Active Data automatically replaces sub-optimal curated schemas with intent…

Bottom Line

Arcadia Data unifies visual analysis, data discovery and business intelligence natively in Hadoop. Its unique on-cluster execution architecture runs analytics directly on your Hadoop nodes. No need to add hardware or pull standalone extracts.

Our Rating: 7.8
User Rating: 8.3 (2 ratings)

6

Microsoft R

R is the world’s most powerful, and preferred, programming language for statistical computing, machine learning, and graphics, and is supported by a thriving global community of users, developers, and contributors. The Microsoft R product family includes Microsoft R Server, Microsoft R Client, Microsoft R Open, and SQL Server R Services. Microsoft R Server is the most broadly deployable enterprise-class analytics platform for R. Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics: exploration, analysis, visualization and modeling based on open source R. Microsoft R Client is a free, community supported,…

Bottom Line

Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics – exploration, analysis, visualization and modeling based on open source R.

Our Rating: 8.0

7

ITALASSI

ITALASSI is a freeware program that facilitates the interpretation of regression models (2 independent variables) with an interaction term. The program allows you to enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) in the form of equations, or to compute those equations from raw data, and it displays the various models using 2D and 3D graphs. The program may also be used in advanced statistics courses to illustrate statistical interactions or applied multiple regression. It is developed by Provalis Research.

Bottom Line

The program allows you to enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) in the form of equations, or to compute those equations from raw data, and it displays the various models using 2D and 3D graphs.
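To make the idea of an interaction term concrete, here is a minimal sketch of a two-variable regression model with interaction, y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The function names and coefficients are hypothetical and chosen only for illustration; this is not ITALASSI's own code or output.

```python
def predict(b0, b1, b2, b3, x1, x2):
    """Regression with an interaction term: y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_of_x1(b1, b3, x2):
    """Change in y for a one-unit change in x1, holding x2 fixed."""
    return b1 + b3 * x2


# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.4

# With an interaction term, the effect of x1 depends on the level of x2.
for x2 in (0, 5, 10):
    print(f"x2={x2}: slope of x1 = {slope_of_x1(b1, b3, x2):+.1f}")
```

In a purely additive model (b3 = 0) the slope of x1 would be the same at every value of x2; the interaction term is what makes the lines fan out when the model is plotted.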

Our Rating: 7.1
User Rating: 5.0 (1 rating)

8

Shogun

Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. It now offers features that span the whole space of Machine Learning methods, including many classical methods in classification, regression, dimensionality…

Bottom Line

Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms.

Our Rating: 7.6
User Rating: 8.1 (1 rating)

9

Trifacta

Trifacta’s Visual Data Profiling features provide immediate visibility into unique elements of the data set, like data distributions and outliers, to inform the transformation and analysis process. Trifacta uses data inference techniques to introspect the data and automatically apply initial shaping and metadata recommendations for the user. This greatly accelerates the transformation process. Users can quickly un-nest and iterate on the shape of their data in preparation for the dataset’s downstream use. Trifacta’s data enrichment features make standardizing data, joining datasets and aggregating data outputs to the right level faster and more accurate. Advanced visual data profiling capabilities that guide…

Bottom Line

Trifacta’s data enrichment features make standardizing data, joining datasets and aggregating data outputs to the right level faster and more accurate. Advanced visual data profiling capabilities guide users to a deep understanding of the characteristics of any data set.

Our Rating: 7.7

10

ELKI

The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined. ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest neighbor search, range/radius search, and distance…

Bottom Line

ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases).

Our Rating: 7.5
User Rating: 8.3 (1 rating)

11

Scikit-learn

Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Classification: identifying which category an object belongs to. Applications: spam detection, image recognition. Algorithms: SVM, nearest neighbors, random forest. Regression: predicting a continuous-valued attribute associated with an object. Applications: drug response, stock prices. Algorithms: SVR, ridge regression. Clustering: automatic grouping of similar objects into sets. Applications: customer segmentation, grouping experiment outcomes.…

Bottom Line

Scikit-learn features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
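The difference between classification and clustering can be sketched in a few lines, assuming scikit-learn is installed. The points and labels below are made up; `KNeighborsClassifier` and `KMeans` are two of the estimators the library provides, picked here only as examples.

```python
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-D points forming two well-separated groups.
X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.7]]
y = [0, 0, 0, 1, 1, 1]  # known labels, used only for classification

# Classification: nearest neighbors learns from labeled examples.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
predicted = clf.predict([[1.0, 1.0], [8.0, 8.0]])

# Clustering: k-means groups the same points without seeing the labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
clusters = km.labels_

print("predicted classes:", list(predicted))
print("cluster assignments:", list(clusters))
```

The cluster numbers that k-means assigns are arbitrary, so only the grouping is meaningful, not which group is called 0 or 1.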

Our Rating: 7.6

12

Data Applied

Data Applied revolutionizes data-driven decision making by integrating rich analytics, data mining, and information visualization capabilities, all using a zero-footprint Web interface, collaboration features, and a secure XML Web API. By extracting valuable knowledge from data in domains as varied as Web Analytics, Sales, Marketing, Engineering, Social Sciences or Non-Profit, Data Applied helps organizations make better data-driven decisions and improve efficiency. Perform collaborative analysis by securely sharing data sets with others. Decide who has access to what, and with which access level. This helps the company in better management as the right people are given their own tasks to…

Bottom Line

Data Applied automatically discovers broad categories of records using a cluster detection algorithm. The clustering algorithm finds groups of records sharing common traits, and generates profile information for each group.

Our Rating: 7.6
User Rating: 7.8 (1 rating)

13

Lavastorm Analytics Engine

Lavastorm is a visual data discovery solution that lets users rapidly integrate diverse data, easily discover elusive insights, and continuously detect anomalies, outliers, or patterns. Lavastorm Analytics Engine provides self-service capability for business users and rapid development capabilities for IT users in the areas of integration, analytics, and business control. Features include acquiring, transforming, combining, and enriching data from virtually any source, including Big Data sources, without intensive modeling, pre-planning, or scripting. The solution discovers data issues, such as completeness, inconsistent formats, and accuracy, and automates the evaluation and cleansing process. Lavastorm Analytics Engine uses the visual analytic…

Bottom Line

Lavastorm Analytics Engine provides self-service capability for business users and rapid development capabilities for IT users in the areas of integration, analytics, and business control.

Our Rating: 7.5
User Rating: 8.2 (2 ratings)

14

Gephi

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, the user interacts with the representation and manipulates the structures, shapes and colors to reveal hidden patterns. The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structure singularities or faults during data sourcing. Gephi is open-source software for network visualization and analysis. It helps data analysts to intuitively reveal trends and patterns, highlight outliers and tell stories with their data. It uses a 3D render engine to display large graphs in real-time and to speed…

Bottom Line

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, the user interacts with the representation and manipulates the structures, shapes and colors to reveal hidden patterns.

Our Rating: 7.6
User Rating: 8.5 (1 rating)

15

DataMelt

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets. DMelt is a computational platform. It can be used with different programming languages on different operating systems. Unlike other statistical programs, it is not limited to a single programming language. DMelt can be used with several scripting languages, such as Python/Jython, BeanShell, Groovy, and Ruby, as well as with Java. It is a comprehensive package that includes more than 30,000 Java classes for computation…

Bottom Line

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets.

Our Rating: 7.5
User Rating: 8.2 (1 rating)

16

TANAGRA

Tanagra is free data mining software for academic and research purposes. It provides several data mining methods from the areas of exploratory data analysis, statistical learning, machine learning and databases. It is a successor of SIPINA, which means that various supervised learning algorithms are provided, especially an interactive and visual construction of decision trees. Because it contains supervised learning but also other paradigms such as clustering, factorial analysis, parametric and nonparametric statistics, association rules, and feature selection and construction algorithms, Tanagra is very powerful. The main goal of this project is to give researchers and students easy-to-use data mining software, and the second goal is…

Bottom Line

TANAGRA is an "open source project": every researcher can access the source code and add their own algorithms, as long as they agree and conform to the software distribution license. The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that conforms to the present norms of software development in this domain (especially in the design of its GUI and the way to use it) and allows the analysis of either real or synthetic data.

Our Rating: 7.5
User Rating: 8.1 (1 rating)

17

Julia

Julia is a high-level, high-performance programming language for numerical computation. Julia provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia programs are organized around multiple dispatch: functions are defined and compiled for different combinations of argument types, which can also be user-defined. Multiple dispatch gives scientists the ability to define function behavior across many combinations of arguments. Julia also features a dynamic type system, with types used for documentation, dispatch,…

Bottom Line

Julia is a high-level, high-performance programming language that offers distributed parallel execution, an extensive mathematical function library, numerical accuracy, and a sophisticated compiler.

Our Rating: 7.6

18

RapidMiner Starter Edition

RapidMiner Studio provides a wealth of functionality to speed & optimize data exploration, blending & cleansing tasks, reducing the time spent importing and wrangling your data. RapidMiner provides an integrated environment for data preparation, machine learning, deep learning, text mining, and predictive analytics. It is used for business and commercial applications as well as for research, education, training, rapid prototyping, and application development, and it supports all steps of the machine learning process including data preparation, results visualization, model validation and optimization. Hundreds of machine learning, text analytics, predictive modeling algorithms, automation, and process control features help you build better…

Bottom Line

RapidMiner Studio (limited to 10,000 data rows), RapidMiner Server (limited to 2 GB RAM) and RapidMiner Radoop (limited to a single user) are available in the Starter Edition.

Our Rating: 7.5
User Rating: 8.5 (1 rating)

19

SciPy

The SciPy Stack is a collection of open source software for scientific computing in Python, and particularly a specified set of core packages. SciPy is free, open source, Python-based software used for technical and scientific computing. SciPy is commonly used in solving science, engineering and mathematics problems. The SciPy Stack includes core packages that provide computing tools for Python. The first is Python itself, the general-purpose programming language underlying SciPy. Python provides users with an interactive interpreter and dynamic typing, and is suited for interactive work and fast…

Bottom Line

SciPy is open source software with a complete collection of features for solving engineering, mathematics and science problems, and it is also well suited to technical and scientific computing.

Our Rating: 7.5

20

KNIME Analytics Platform Community

KNIME Analytics Platform is the leading open solution for data-driven innovation, helping you discover the potential hidden in your data, mine for fresh insights, or predict new futures. With more than 1000 modules, hundreds of ready-to-run examples, a comprehensive range of integrated tools, and the widest choice of advanced algorithms available, together with a vast arsenal of native nodes, community contributions, and tool integrations, KNIME Analytics Platform is the perfect toolbox for any data scientist. https://www.youtube.com/watch?v=fw0Vb2gLsgA

Bottom Line

A vast arsenal of native nodes, community contributions, and tool integrations makes KNIME Analytics Platform the perfect toolbox for any data scientist.

Our Rating: 7.7
User Rating: 8.3 (1 rating)

21

Dataiku DSS Community

Dataiku DSS is a collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently. Dataiku develops an advanced analytics software solution that enables companies to build and deliver their own data products more efficiently. Dataiku DSS offers a collaborative, team-based user interface for data scientists and beginner analysts, a unified framework for both development and deployment of data projects, and immediate access to all the features and tools required to design data products from scratch. The visual interface of Dataiku…

Bottom Line

The visual interface of Dataiku DSS empowers people with a less technical background to learn the data mining process, and build projects from raw data to predictive application, without having to write a single line of code.

Our Rating: 7.5

22

Google Fusion Tables

Fusion Tables is a web application for visualizing data that allows users to share data sets and combine them to build data visualizations online. The application is still experimental, and version 2 of its API has been released. It allows users to easily create data visuals and publish them online instantly with provided subsets and an easy format similar to online files. Fusion Tables supports working with larger data sets, including filtering, sorting, and summarizing them in collaboration with other users online. Fusion Tables lets users combine multiple tables from other users and publicly available data, then merge them into one…

Bottom Line

An experimental application to store, share, query, and visualize data tables.
Make custom maps, charts, cards, and tables with your data or public data.

Our Rating: 7.6
User Rating: 8.7 (1 rating)

23

Massive Online Analysis

Massive Online Analysis (MOA) is an open source framework for data stream mining. Massive Online Analysis includes a collection of machine learning algorithms for regression, classification, clustering, outlier detection, recommender systems, and concept drift detection. Massive Online Analysis also features tools for evaluating data stream mining. Massive Online Analysis is well suited to data scientists, as it performs big data stream mining in real time and also performs large scale machine learning. MOA can be extended with new mining algorithms, stream generators, or evaluation measures. Massive Online Analysis features…

Bottom Line

Massive Online Analysis is an open source framework for data stream mining, with a collection of machine learning algorithms for regression, clustering, classification, outlier detection, concept drift detection, and recommender systems.

Our Rating: 7.6

24

NodeXL

NodeXL is an application for graphing networks. NodeXL comes in two editions: Basic and Pro. Basic is free, and the NodeXL application is available for Microsoft® Excel® 2007, 2010, 2013 and 2016, which makes exploration of network graphs easy. NodeXL Pro, on the other hand, extends the Basic edition with additional features such as access to social media network data streams, text analysis, sentiment analysis, and advanced network metrics. Both NodeXL Basic and Pro feature graph metric calculations; the only difference is that Pro can calculate the degree of centrality, PageRank, clustering coefficient…

Bottom Line

NodeXL is a graphing application that allows data generated from other websites and social media platforms to be analyzed through graphical presentation, making use of graph metric calculations, task automation, and dynamic filtering.

Our Rating: 7.6
User Rating: 8.3 (1 rating)

25

DataPreparator

DataPreparator is a free software tool which is designed to assist with common tasks of data preparation (or data preprocessing) in data analysis and data mining. DataPreparator offers features such as character removal, text replacement, date conversion, remove selected attributes, move selected attributes, equal width, equal frequency, equal frequency from grouped data, delete records containing missing values, remove attributes containing missing values, impute missing values, predict missing values from model (dependence tree, Naive Bayes model), include missing value patterns, Z-score method, Box-plot method, create binary attributes, replace nominal values by indices, reduce number of labels, decimal, linear, hyperbolic tangent, soft-max,…

Bottom Line

DataPreparator includes operators for cleaning, discretization, numeration, scaling, attribute selection, missing values, outliers, statistics, visualization, balancing, sampling, row selection, and several other tasks.
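Two of the operations named above, equal-width discretization and the Z-score method for outliers, can be sketched in plain Python. This is only an illustration of the general techniques, not DataPreparator's implementation; the function names, the `ages` data and the threshold of 2 standard deviations are assumptions.

```python
import statistics


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def z_scores(values):
    """Distance of each value from the mean, in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


ages = [18, 22, 25, 31, 40, 58, 90]  # hypothetical attribute values

bins = equal_width_bins(ages, 3)
# Assumed rule: flag values more than 2 standard deviations from the mean.
outliers = [v for v, z in zip(ages, z_scores(ages)) if abs(z) > 2]

print("bins:", bins)
print("outliers:", outliers)
```

Equal-frequency discretization, also listed among the features, would instead choose interval boundaries so that each bin holds roughly the same number of records.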

Our Rating: 7.6

26

NetworkX

NetworkX is a Python software package for creating, manipulating, and studying the structure, dynamics, and functions of complex networks. NetworkX is well suited to analyzing complex networks, and it lets results be presented graphically. It provides data structures for graphs, multigraphs, and digraphs. Because NetworkX is a Python package, it facilitates fast prototyping, is easy to teach, and runs on multiple platforms. Data scientists are also provided with several standard graph algorithms that are useful when dealing with complex networks. NetworkX also features generators. The generators…

Bottom Line

NetworkX is open source Python software for creating, manipulating, and studying the structure, dynamics, and functions of complex networks.
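A short sketch, assuming NetworkX is installed, shows the three data structures and two standard algorithms. The node names and edges are made up for illustration.

```python
import networkx as nx

# Undirected graph: a small hypothetical social network.
G = nx.Graph()
G.add_edges_from([("ann", "bob"), ("bob", "cat"),
                  ("cat", "dan"), ("bob", "dan")])

# Two standard graph algorithms.
path = nx.shortest_path(G, "ann", "dan")
centrality = nx.degree_centrality(G)

# Directed graph (digraph): edges have a direction.
D = nx.DiGraph()
D.add_edge("ann", "bob")

# Multigraph: the same pair of nodes can share several edges.
M = nx.MultiGraph()
M.add_edge("ann", "bob")
M.add_edge("ann", "bob")

print("shortest path:", path)
print("most central node:", max(centrality, key=centrality.get))
print("parallel edges in multigraph:", M.number_of_edges())
```

Choosing between `Graph`, `DiGraph` and `MultiGraph` up front matters because it determines whether direction and repeated connections are preserved.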

Our Rating: 7.6

27

NumPy

NumPy is a comprehensive package for scientific computing with the Python programming language. The NumPy library provides support for large multi-dimensional arrays and matrices. The package contains several features that make it well suited to scientific computing, calculations on multi-dimensional arrays and matrices, and high-level mathematics. The first feature of NumPy is the powerful N-dimensional array object. NumPy also provides detailed, easy-to-use broadcasting functions for data scientists and developers. In addition, NumPy provides tools for integrating C and C++ code. The C++…

Bottom Line

NumPy is a fundamental library for scientific computing in Python that supports large matrices, multi-dimensional arrays, and high-level mathematics.
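The N-dimensional array and broadcasting can be illustrated with a minimal sketch, assuming NumPy is installed. The `data` values are hypothetical.

```python
import numpy as np

# A 2-D array (matrix): 3 hypothetical samples with 2 features each.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Broadcasting: the length-2 vector of column means is applied to every row.
col_means = data.mean(axis=0)
centered = data - col_means

# Matrix product of the (2, 3) transpose with the (3, 2) array gives (2, 2).
gram = centered.T @ centered

print("shape:", data.shape)
print("centered:\n", centered)
print("gram:\n", gram)
```

Broadcasting avoids writing an explicit loop over rows, which is usually both shorter and faster for large arrays.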

Our Rating: 7.6

28

OpenRefine

OpenRefine is a sophisticated tool for working with big data and performing analytics. OpenRefine is able to perform various tasks on data. These include cleaning data, transforming it from one format into another, and extending it with web services and external data. OpenRefine provides the explore data feature, which enables data scientists to go through large data sets with ease. The explore data feature is easy to use, and it comes with a video explaining how to use it. The clean and transform data feature provided by OpenRefine also enables data scientists to clean…

Bottom Line

OpenRefine is a sophisticated tool that can clean data, transform it from one format into another, extend it with web services and external data, work on large data sets, and perform analytics.

7.6
Our Rating

29

DataWrangler

DataWrangler is a web-based service designed for cleaning and rearranging data into a form that other tools, such as a spreadsheet app, can use. One useful option for handling large data sets is exporting the transformation script as code: users first transform a sample of their data in the Wrangler interface, then run the resulting script on the full data set. Output scripts are supported in two languages: Python (for data-crunching on the back end) and JavaScript (for transforming in the browser, or using node.js). DataWrangler…

Bottom Line

Wrangler is an interactive tool for data cleaning and transformation.

7.6
Our Rating
8.6
User Rating
1 rating

30

PAW

PAW (Physics Analysis Workstation) is an instrument conceived to assist physicists in analyzing and presenting data. PAW offers interactive statistical or mathematical analysis and graphical presentation, letting physicists work on objects familiar to them such as event files, vectors, and histograms. The PAW presentation is a set of slides, mostly in PostScript format, that gives a general overview of the entire PAW system and an almost complete review of its functionality. The PAW functionalities presented in the set of slides in PostScript format are…

Bottom Line

PAW is an instrument physicists use to analyze and present their data through interactive statistical or mathematical analysis and graphical presentation of objects such as event files, histograms, and vectors.

7.6
Our Rating

31

ILNumerics

ILNumerics is based on modern software frameworks and provides tools and solutions for scientists and engineers in all industries. Its framework enables data scientists and engineers to develop and deploy highly configured technical applications in the shortest time possible. ILNumerics features the Array Visualizer, a graphical watch window used in Visual Studio. The Array Visualizer helps scientists debug large data in a broad range of technical applications. Its visual representation of arbitrary data lets you prototype your algorithms and find bugs quickly and also have…

Bottom Line

ILNumerics is sophisticated software, based on modern software frameworks, that provides engineers and data scientists with tools and solutions for developing and deploying complex technical applications.

7.6
Our Rating

32

ROOT

ROOT is a sophisticated scientific software application that provides the functions required for statistical analysis, large data processing, storage, and visualization. ROOT is written mainly in C++ but integrates with other languages such as R and Python. The save data feature lets users save their data, or any C++ object, in a compressed binary form in a ROOT file. ROOT files are self-descriptive: the object format is saved in the same ROOT file as the data. The ROOT file contains information that…

Bottom Line

ROOT is a modular scientific software framework, written mainly in C++, that provides the functions data scientists need for large data processing, statistical analysis, visualization, and storage.

7.6
Our Rating

33

DataCracker

DataCracker does all the work of data analysis and gives users choices that they can understand. DataCracker offers features such as an unlimited number of responses, data files up to 50MB, online reports that can be shared with others, export to Excel, PowerPoint, printable PDF, and images, embedding in a website, customizable colors and designs for charts and tables, new variables created with JavaScript, templates that automatically format charts, master slides that automatically lay out slides and control default text formatting, predictive modeling (decision trees/nonparametric regression) and Segments/Groups (latent class…

Bottom Line

DataCracker lets users import data into the survey data analysis software, which mines the data for insights and then automatically writes a report.

7.6
Our Rating

34

Scilab

Scilab is an interpreted programming language paired with an extensive collection of numerical algorithms that cover many aspects of scientific problems. Scilab is free software, so users do not pay for it. Binaries are provided for both 32-bit and 64-bit platforms. Its main features include optimization, statistics, maths and simulation, signal processing, application development, 2-D and 3-D visualization, and control system design and analysis. Scilab through the signal processing feature provides users with the…

Bottom Line

Scilab is free and open source software built around an interpreted programming language; it is available for 32-bit and 64-bit platforms and solves many aspects of scientific problems using its collection of numerical algorithms.

7.5
Our Rating

35

FreeMat

FreeMat is an environment for rapid engineering and scientific processing, similar to commercial systems such as MATLAB from MathWorks and IDL from Research Systems, but open source. FreeMat offers features such as a codeless interface to external C/C++/FORTRAN code, parallel/distributed algorithm development (via MPI), and advanced volume and 3D visualization capabilities. FreeMat now supports function handles (or function pointers), where a function handle is an alias for a function or script that is stored in a variable. FreeMat now also supports so-called dynamic-field indexing expressions, where the field name is supplied through an expression instead of…

Bottom Line

FreeMat offers native Windows support, native sparse matrix support, native support for Mac OS X (no X11 server required), function pointers (eval and feval are fully supported), classes, operator overloading, 3D plotting and visualization via OpenGL, handle-based graphics, and 3D volume rendering capability (via VTK).

7.6
Our Rating

36

IPython

IPython is an open source (BSD license) project that provides easy-to-use, high-performance tools for parallel computing. IPython offers features such as the Jupyter notebook and notebook file format, the Jupyter Qt console, the kernel messaging protocol, ipyparallel (formerly IPython.parallel), ipykernel (minimal docs, only release notes for the ipykernel package), ipywidgets (formerly IPython.html.widgets), Traitlets (the config system used by IPython and Jupyter), an interactive interpreter, an enhanced interactive Python shell, a decoupled two-process communication model, and an architecture for interactive parallel computing. IPython is known to work on Linux, most other Unix-like OSs (AIX, Solaris, BSD), Mac OS X and Windows (CygWin, XP,…

Bottom Line

IPython is a growing project with an increasing number of language-agnostic components, and it provides a rich architecture for interactive computing.

7.6
Our Rating

37

jMatLab

jMatLab is a free platform for mathematical and numerical computations; it is a clone of Matlab and Octave and runs on any platform where Java is installed, or in a Web browser. jMatLab provides features such as Arithmetic, Variables, String Manipulations, Commands and Operators, Functions, Polynomials, Vectors, Differentiation, Equations (Differential), Equations (Linear Systems), Equations (Nonlinear), Equations (Nonlinear Systems), Indefinite Integrals, Input and Output, Matrices, Numerical Integration, Plots, Programming, Statistics (Data Fitting), Statistics (Descriptive), Statistics (Histograms), Statistics (Random Numbers), Taylor Polynomials, and Transformations. jMathLab has its own help system where all programming modules are arranged in groups and users can list all…

Bottom Line

jMatLab can be used for simplification, differentials, integration, vectors and matrices.

7.6
Our Rating

38

SymPy

SymPy is a Python library for symbolic mathematics that can simplify expressions, compute derivatives, integrals, and limits, solve equations, and work with matrices. SymPy includes features such as plotting modules (coordinate modes, plotting geometric entities, 2D and 3D, interactive interface, colors, and Matplotlib support), printing (2D pretty-printed output of math formulas, or LaTeX), code generation, physics, statistics, combinatorics, number theory, geometry and logic, conversion from Python objects to SymPy objects, optional implicit multiplication and function application parsing, limited Mathematica and Maxima parsing, custom parsing transformations, and ciphers such as the Shift cipher, Affine cipher, Bifid…

Bottom Line

SymPy is free and written entirely in Python; it depends only on mpmath, a pure Python library for arbitrary-precision floating-point arithmetic, which makes it easy to install and use.
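The core operations listed above (simplifying, differentiating, integrating, taking limits, solving equations, and working with matrices) can be sketched in a few lines. The expressions are arbitrary examples chosen for illustration, and the sketch assumes the `sympy` package is installed.

```python
import sympy as sp

x = sp.symbols("x")

# Simplify a trigonometric identity.
simplified = sp.simplify(sp.sin(x) ** 2 + sp.cos(x) ** 2)

# Derivative and indefinite integral of simple polynomials.
derivative = sp.diff(x ** 3, x)
integral = sp.integrate(2 * x, x)

# Limit of sin(x)/x as x approaches 0.
limit_value = sp.limit(sp.sin(x) / x, x, 0)

# Solve x**2 - 4 = 0 for x.
roots = sp.solve(x ** 2 - 4, x)

# Exact determinant of a small matrix.
det = sp.Matrix([[1, 2], [3, 4]]).det()

print(simplified, derivative, integral, limit_value, roots, det)
```

All results are exact symbolic objects rather than floating-point approximations.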

7.6
Our Rating

39

Fluentd

Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. Fluentd offers community-driven support, installation through Ruby gems, self-service configuration, the OS default memory allocator, an implementation in C and Ruby, a memory footprint of 40MB, a dependency on a Ruby interpreter and a certain number of gems, and more than 650 available plugins. Fluentd tries to structure data as JSON as much as possible, which allows it to unify all facets of processing log data, such as collecting, filtering, buffering, and outputting logs across multiple sources and destinations (Unified Logging Layer).…

Bottom Line

Fluentd is an open source data collector for building the unified logging layer and runs in the background to collect, parse, transform, analyze and store various types of data.

7.6
Our Rating

40

EasyReg

EasyReg is open source software that performs econometric estimation and several testing tasks on all 32-bit and 64-bit Windows platforms, including Windows 7. Windows 8 users can also run EasyReg by setting its compatibility mode to Windows XP. EasyReg is programmed in Visual Basic 5 and Visual Basic 5 Enterprise Edition. EasyReg is designed for teaching econometrics and for empirical research. The software is called international because it accepts both commas and dots as delimiters in decimal…

Bottom Line

EasyReg is free open source software that lets users on Windows platforms such as Windows 7 and Windows 8 perform econometric estimation and a number of testing tasks.

7.5
Our Rating

41

Matplotlib

Matplotlib is a library for making 2D plots of arrays in Python; it produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib offers features such as the top-level matplotlib module, afm (Adobe Font Metrics interface), animation module, artist module, Axes class, axis and tick API, backends, cbook, cm (colormap), collections, colorbar, colors, container, dates, dviread, figure, finance, font_manager, gridspec, image, legend and legend_handler, lines, markers, mathtext, mlab, offsetbox, patches, path, patheffects, projections, pyplot, rcsetup, sankey, scale, spines, style, text, ticker, tight_layout, working with transformations, triangular grids, type1font, units and widgets. Users can…

Bottom Line

Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
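As a minimal sketch of using Matplotlib from a plain Python script, the example below draws a 2D line plot and renders it to PNG in memory. The data points are made up, the non-interactive `Agg` backend is chosen here only so the script runs without a display, and the `matplotlib` package is assumed to be installed.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no GUI toolkit is needed
import matplotlib.pyplot as plt

# Hypothetical data: y = x squared for a handful of points.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render the figure as PNG into an in-memory buffer instead of a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()

print(len(png_bytes) > 0)
```

Passing a file name such as `"plot.pdf"` to `savefig` instead of a buffer writes a hardcopy format to disk.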

7.6
Our Rating

You may also like to review the top free data mining software list:
Top Free Data Mining Software

You may also like to review the top proprietary data mining software list:
Top Data Mining Software

2 Reviews
  • Dmitri
    May 23, 2014 at 10:59 pm

    ADDITIONAL INFORMATION
    One correction: at the very top, ScaVi should be called “ScaVis”. I should say I like SCaVis best, since I can program in Python while accessing very rich Java numerical libraries.

  • Robert Nutt
    August 8, 2014 at 1:12 am

    ADDITIONAL INFORMATION
    A related tool for data analysis is json-csv.com. It is an online converter which can convert any JSON to CSV for processing within a spreadsheet.

About The Author
imanuel