Top 41 Free Data Analysis Software

Data Analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making.

For an organization to excel in its operations, it has to make timely and informed decisions. More often than not, decision making relies on the available data. What does this mean? Data alone is not enough; to get the best out of your data, you must ensure that it is authentic. For data to be authentic, it has to be current, accurate, and reliable.

The hardest part for any organization is retrieving and analyzing relevant data to gain business insight that can be used in decision making. To analyze data effectively, most organizations are now shifting their focus to data analysis software. With advances in technology, software developers have come up with modern data analysis software that makes it easy to retrieve, interact with, and visualize the available data, with the aim of improving the delivery of goods and services.

What are the top free data analysis software tools? Orange Data mining, Anaconda, R Software Environment, Scikit-learn, Weka Data Mining, Shogun, Tableau Public, DataMelt, Microsoft R, Trifacta, SciPy, ELKI, KNIME Analytics Platform Community, Scilab, TANAGRA, Dataiku DSS Community, DataPreparator, ITALASSI, HP Vertica Advanced Analytics, Google Fusion Tables, NodeXL, Fluentd, Displayr, NumPy, OpenRefine, Julia, Massive Online Analysis, DataWrangler, EasyReg, Matplotlib, IPython, SymPy, FreeMat, jMatLab, PAW, ILNumerics, ROOT, NetworkX, Arcadia Data Instant, SIGVIEW, and Gephi are some of the top free or open source software tools for data analysis.

What is Data Analysis Software?

Data analysis software is a tool that has the statistical and analytical capability to inspect, clean, transform, and model data with the aim of deriving important information for decision-making purposes. The software allows one to explore the available data and to understand and analyze complex relationships. Besides statistical analysis, such tools typically also offer visualization capabilities that allow one to share the data with other stakeholders.

Data analysis can be classified into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). Descriptive statistics deals with quantitatively describing the main features of a collection of information. Exploratory data analysis focuses on discovering new features in the data. Confirmatory data analysis deals with confirming or falsifying existing hypotheses.
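The three categories can be illustrated with a small sketch that uses only the Python standard library. The two samples, the function names, and the choice of a permutation test for the confirmatory step are all assumptions made for illustration; they do not come from any of the tools listed here.

```python
import random
import statistics

# Hypothetical samples: task completion times (minutes) for two groups.
group_a = [12.1, 11.8, 13.4, 12.9, 11.5, 12.7, 13.0, 12.2]
group_b = [13.9, 14.2, 13.1, 14.8, 13.6, 14.0, 15.1, 13.7]


def describe(values):
    """Descriptive statistics: summarize the main features of a sample."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def far_from_mean(values, k=1.5):
    """Exploratory step: list points more than k standard deviations from the mean."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) > k * s]


def permutation_p_value(a, b, rounds=5000, seed=0):
    """Confirmatory step: test the hypothesis that both groups share one mean."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / rounds


summary_a = describe(group_a)
summary_b = describe(group_b)
unusual_a = far_from_mean(group_a)
p_value = permutation_p_value(group_a, group_b)
print(summary_a, summary_b, unusual_a, p_value)
```

A small p-value here would count as evidence against the hypothesis that the two groups have the same mean, which is the kind of question confirmatory analysis is meant to answer.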

Top Free Data Analysis Software

  • Data acquisition: Allows one to import data from various sources using an import wizard. Once the data is imported, you can carry out the analysis with a query builder. All you need to do is identify your primary table and secondary tables. The query builder then automatically matches the key in your primary table with the corresponding key in the secondary tables, saving you time.
  • Data clean up: For data analysis to be effective, you must have clean and reliable data. The software has features that make it easy to clean up the data and make it more reliable for analysis. Besides clean up, the software can also be used to consolidate different categories from multiple entries for accurate tabulation.
  • Data visualization: This is a powerful tool that allows one to identify patterns and trends in a data set. The tool makes it easy to explore data in different formats, ranging from graphs to pie charts. The graph builder helps one explore the data and build interactive graphical displays with ease. You can also combine multiple graphs for easy and insightful analysis of your data.
  • Basic data analysis: Using a distribution platform, the software makes it easy to generate a statistical analysis from the available data. You can create interactive histograms and customized summary statistics from the distribution platform. All you need to do is identify your column of interest, and the distribution platform will automatically generate graphs and other statistics to your specification.
  • Text exploration: Analysing data in text format can be daunting, especially if you don’t have the right tools, because the data is unstructured and at times unruly. The text explorer feature has a set of highly interactive commands that make it possible to extract words and phrases from unstructured text, especially from surveys and engineering notes.
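The acquisition, clean-up, and basic-analysis steps above can be sketched in plain Python. The tables, column names, and helper functions below are hypothetical and only illustrate the idea of matching a key between a primary and a secondary table, consolidating category labels, and summarizing one column; they are not the interface of any particular product.

```python
from collections import Counter
import statistics

# Hypothetical primary table (orders) and secondary table (customers),
# linked by the shared key "customer_id".
orders = [
    {"order_id": 1, "customer_id": 10, "amount": "120.50", "channel": "Web "},
    {"order_id": 2, "customer_id": 11, "amount": "80", "channel": "web"},
    {"order_id": 3, "customer_id": 10, "amount": "", "channel": "STORE"},
    {"order_id": 4, "customer_id": 12, "amount": "45.25", "channel": "store"},
]
customers = [
    {"customer_id": 10, "region": "North"},
    {"customer_id": 11, "region": "South"},
]


def join_on_key(primary, secondary, key):
    """Left join: attach matching secondary columns to each primary row."""
    index = {row[key]: row for row in secondary}
    return [{**row, **index.get(row[key], {})} for row in primary]


def clean(rows):
    """Drop rows with a missing amount and consolidate category labels."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # missing value: skip the row
        cleaned.append({
            **row,
            "amount": float(row["amount"]),
            "channel": row["channel"].strip().lower(),
        })
    return cleaned


joined = join_on_key(orders, customers, "customer_id")
tidy = clean(joined)

# Basic analysis of one column of interest plus a category tabulation.
amounts = [row["amount"] for row in tidy]
channel_counts = Counter(row["channel"] for row in tidy)
summary = {"count": len(amounts), "mean": statistics.mean(amounts), "max": max(amounts)}
print(channel_counts, summary)
```

Note that the order whose customer has no match keeps its own columns but gains no `region`, which is the usual behavior of a left join.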

Sisense

Sisense empowers even the most non-technical users to access data and build interactive dashboards and business intelligence reports. Sisense provides a variety of dashboard widgets to pinpoint the best visualization for your data, such as geographical maps, gauges to measure KPIs, line charts to determine trends, scatter plots to see correlations, and pie charts for clear comparisons. Sisense lets you customize the dashboard layout with drag-and-drop features to place each widget exactly where you want it.


Sisense for Cloud Data Teams

Sisense for Cloud Data Teams, formerly Periscope Data, is an end-to-end BI and analytics solution that lets you quickly connect your data, then analyze, visualize and share insights. Periscope Data can securely connect and join data from any source, creating a single source of truth for your organization. Perform BI reporting and advanced analytics operations all from one integrated platform. Communicate insights more effectively by selecting from Periscope Data’s wide range of visualization options (including standard charts, statistical plots, maps and more) and instantly share real-time insights via direct linking, email or Slack.


1

Orange Data mining

Orange

Orange is an open source data visualization and analysis tool. Orange is developed at the Bioinformatics Laboratory at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia, along with the open source community. Data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics. Orange is a Python library. Python scripts can run in a terminal window, integrated environments like PyCharm and PythonWin, or shells like IPython. Orange consists of a canvas interface onto which the user places…

Features

• Open Source
• Interactive Data Visualization
• Visual Programming
• Supports Hands-on Training and Visual Illustrations
• Add-ons Extend Functionality

Price

Free

What is best?

• Open Source
• Interactive Data Visualization
• Visual Programming

What are the benefits?

• For everyone: beginners and professionals
• Execute simple and complex data analysis
• Create beautiful and interesting graphics

Bottom Line

Orange is an open source data visualization and analysis tool, where data mining is done through visual programming or Python scripting. The tool has components for machine learning, add-ons for bioinformatics and text mining and it is packed with features for data analytics.

Editor Rating: 9.5
Aggregated User Rating: 8.2 (112 ratings)

2

Anaconda

Anaconda is an open data science platform powered by Python. The open source version of Anaconda is a high performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science. There is also access to over 720 packages that can easily be installed with conda, the package, dependency and environment manager, that is included in Anaconda. Includes the most popular Python, R & Scala packages for stats, data mining, machine learning, deep learning, simulation & optimization, geospatial, text & NLP, graph & network, image analysis. Featured packages include: NumPy,…

Features

• Analytics Workflows
• Analytics Interaction
• High Performance Distribution
• Data Engineering
• Advanced Analytics
• High Performance Scale Up
• Reproducibility
• Analytics Deployment

Price

Contact for Pricing

What is best?

• Analytics Workflows
• Analytics Interaction
• High Performance Distribution

What are the benefits?

• Accelerate and streamline the data science workflow from ingest through deployment
• Connect all data sources to extract the most value from data
• Create, collaborate and share with the entire team

Bottom Line

Anaconda Distribution gives superpowers to people that change the world with high performance, cross-platform Python and R that includes the best innovative data science from open source.

Editor Rating: 7.7
Aggregated User Rating: 8.0 (23 ratings)

3

R Software Environment

R

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Some of the functionalities include an effective data handling and storage facility, a suite of operators for calculations on arrays, in particular matrices, a large, coherent, integrated collection of intermediate tools for data analysis, graphical facilities for data analysis and display either directly at the computer or on hardcopy, and a well-developed, simple and effective programming language which includes conditionals,…

Features

• Open Source - Free Software
• Provides a wide variety of Statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering) and Graphical Techniques
• Effective data handling and storage facility
• Suite of operators for calculations on arrays, in particular matrices
• Large, coherent, integrated collection of intermediate tools for data analysis
• Graphical facilities for data analysis and display either on-screen or on hardcopy
• Well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities

Price

Free

What is best?

• Open Source - Free Software
• Provides a wide variety of Statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering) and Graphical Techniques
• Effective data handling and storage facility

What are the benefits?

• Brings analytics to your data
• Runs on a wide variety of platforms- UNIX, Windows, MacOS
• Widely used statistical software

Bottom Line

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

Editor Rating: 9.1
Aggregated User Rating: 7.5 (21 ratings)

4

Scikit-learn

Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Classification: Identifying which category an object belongs to. Applications: Spam detection, Image recognition. Algorithms: SVM, nearest neighbors, random forest. Regression: Predicting a continuous-valued attribute associated with an object. Applications: Drug response, Stock prices. Algorithms: SVR, ridge regression. Clustering: Automatic grouping of similar objects into sets. Applications: Customer segmentation, Grouping experiment outcomes.…

Bottom Line

Scikit-learn features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
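As a rough illustration of the three task types, the sketch below runs one algorithm of each kind on the iris dataset that ships with scikit-learn. The choice of dataset, the split, the hyperparameters, and the decision to predict petal width from the other three measurements are assumptions made for this example only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Small bundled dataset: 150 flowers, 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Classification: predict the species with a random forest.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Regression: predict a continuous value (petal width, column 3)
# from the other three measurements with ridge regression.
reg = Ridge(alpha=1.0)
reg.fit(X_train[:, :3], X_train[:, 3])
r2 = reg.score(X_test[:, :3], X_test[:, 3])

# Clustering: group the flowers into three sets without using the labels.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = km.fit_predict(X)

print(f"accuracy={accuracy:.2f} r2={r2:.2f} clusters={sorted(set(cluster_labels))}")
```

All three estimators follow the same `fit` / `predict` or `score` pattern, which makes it straightforward to swap one algorithm for another.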

Editor Rating: 7.6
Aggregated User Rating: 7.8 (5 ratings)

5

Weka Data Mining

Weka

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes. Weka provides access to SQL databases…

Features

• Data Pre-Processing
• Data Classification
• Data Regression
• Data Clustering
• Data Association rules
• Data Visualization

Price

Free

What is best?

• Data Pre-Processing
• Data Classification
• Data Regression

What are the benefits?

• Portable
• Free to use
• Easy to use

Bottom Line

Weka is a collection of machine learning algorithms for data mining tasks. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java, developed at the University of Waikato, New Zealand.

Editor Rating: 9.1
Aggregated User Rating: 6.9 (30 ratings)

6

Shogun

Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. It now offers features that span the whole space of Machine Learning methods, including many classical methods in classification, regression,…

Features

• Free software, community-based development and machine learning education
• Supports many languages, including C++, Python, Octave, R, Java, Lua, C#, and Ruby
• Runs natively under Linux/Unix, macOS, and Windows
• Provides efficient implementations of all standard ML algorithms
• Integrates LibSVM/LibLinear, SVMLight, LibOCAS, libqp, VowpalWabbit, Tapkee, SLEP, GPML and more

Price

Free

What is best?

• Free software, community-based development and machine learning education
• Supports many languages, including C++, Python, Octave, R, Java, Lua, C#, and Ruby
• Runs natively under Linux/Unix, macOS, and Windows

What are the benefits?

• Completely free to use
• Runs on many operating systems
• Works on different platforms

Bottom Line

Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms.

Editor Rating: 7.6
Aggregated User Rating: 7.8 (4 ratings)

7

Tableau Public

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, stunning maps, live dashboards and fun applications and publish them anywhere on the web. Tableau Public is a free service that lets anyone publish interactive data to the web. Tableau Public includes a free desktop product which can be downloaded and used to publish interactive data visualizations to the web. There is a 10 gigabyte limit on storage space for data. Tableau Public can connect to Microsoft Excel, Microsoft Access, and multiple text file formats. There is a limit of 1,000,000 rows of…

Features

•Create interactive graphs, stunning maps
•Create Live dashboards in minutes
•Save your viz to your Tableau Public profile, and share it anywhere on the web
•Automatic mobile layouts
•Connect directly from Tableau Public to Google Sheets

What is best?

•Visualization can be embedded
•Mapping experience with vector-based maps.
•Build presentation-ready dashboards

What are the benefits?

• Content saved to Tableau Public is accessible to everyone on the internet.
• Create interactive graphs, stunning maps, and live dashboards in minutes
• Visually striking output that effectively highlights your analysis

Bottom Line

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, stunning maps, live dashboards and fun applications and publish it anywhere on the web.

Editor Rating: 9.5
Aggregated User Rating: 6.8 (18 ratings)

8

DataMelt

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets. DMelt is a computational platform. It can be used with different programming languages on different operating systems. Unlike other statistical programs, it is not limited to a single programming language. DMelt can be used with several scripting languages, such as Python/Jython, BeanShell, Groovy, Ruby, as well as with Java. It is comprehensive software that includes more than 30,000 Java classes for computation…

Features

• DMelt with all jar libraries and IDE. Mixed GPL and non-GPL licences (180 MB size)
• Online manual (basic introduction)
• Access to Java API of DMelt core library (600 classes)
• Community forum and bug tracker
• Updates of separate jar files via DMelt IDE
• Full version of DMelt manual
• Access to Java API (30,000 classes) with full search
• Access to Image gallery with code examples
• Web access to more than 500 DMelt examples with searchable database

Price

Many features are free. To access all features, the user must pay for membership.

What is best?

•DMelt with all jar libraries and IDE. Mixed GPL and non-GPL licences (180 MB size)
•Online manual (basic introduction)
•Access to Java API of DMelt core library (600 classes)

What are the benefits?

•Access to Java API of DMelt core library
•Community forum and bug tracker
•Access to Image gallery with code examples

Bottom Line

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets.

Editor Rating: 7.5
Aggregated User Rating: 6.6 (10 ratings)

9

Microsoft R

R is the world’s most powerful, and preferred, programming language for statistical computing, machine learning, and graphics, and is supported by a thriving global community of users, developers, and contributors. The Microsoft R product family includes: Microsoft R Server, Microsoft R Client, Microsoft R Open, SQL Server R Services. Microsoft R Server is the most broadly deployable enterprise-class analytics platform for R. Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics (exploration, analysis, visualization and modeling) based on open source R. Microsoft R Client is a free, community…

Features

• Bring analytics to your data
• Build artificial intelligence-enabled apps
• Experience enhanced, flexible deployment
• Adapt to future needs
• Choose the tools you prefer
• Scale R analytics for big data
• Access the latest innovations
• Get support you trust

Price

Contact for Pricing

What is best?

• Bring analytics to your data
• Build artificial intelligence-enabled apps
• Experience enhanced, flexible deployment

What are the benefits?

• Bring analytics to your data.
• Build artificial intelligence-enabled apps.
• Choose the tools you prefer.

Bottom Line

Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics – exploration, analysis, visualization and modeling based on open source R.

Editor Rating: 8.0
Aggregated User Rating: 6.3 (9 ratings)

10

Trifacta

Trifacta helps individuals and organizations unlock the potential of their data by providing a new approach to how data is explored and prepared for analysis. Whether you’re trying to improve the efficiency of an existing analysis process or utilize new sources of data for an analytics initiative, Trifacta’s data wrangling solutions will empower you to do more with data of all shapes and sizes. Trifacta’s Visual Data Profiling features provide immediate visibility into unique elements of the data set like data distributions and outliers to inform the transformation and analysis process. Trifacta uses data inference techniques to introspect the data and…

Bottom Line

Trifacta’s data enrichment features make standardizing data, joining datasets and aggregating data outputs to the right level faster and more accurate. Advanced visual data profiling capabilities guide users through a deep understanding of the characteristics of any data set.

Editor Rating: 7.7
Aggregated User Rating: 6.7 (11 ratings)

11

SciPy

The SciPy Stack is a collection of open source software for scientific computing in Python, and particularly a specified set of core packages. SciPy is free, open source, Python-based software used for technical and scientific computing. It is commonly used to solve science, engineering and mathematics problems. SciPy features core packages that provide computing tools for Python. The first is Python itself, which serves as the general-purpose programming language of the stack. Python provides an interactive, interpreted, dynamically typed environment suited for interactive work and fast…

Features

• Python
• NumPy
• SciPy library
• Matplotlib
• Pandas
• SymPy
• IPython
• Nose

Price

• Free

What is best?

• Python
• NumPy
• SciPy library

Bottom Line

SciPy is open source software with a comprehensive collection of features for solving engineering, mathematics and science problems, and it is well suited to technical and scientific computing.

Editor Rating: 7.5
Aggregated User Rating: 8.9 (4 ratings)

12

ELKI

The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined. ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest neighbor search, range/radius search, and distance…

Features

• Open source data mining software
• High performance and scalability
• Simple visualization window
• Data management tasks
• Standard Java API

Price

Free

What is best?

• Open source data mining software
• High performance and scalability
• Simple visualization window

What are the benefits?

• JAVA data mining software
• Allows R code
• Data mining and data management are worked as separate tasks

Bottom Line

ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases).
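The idea of a column-oriented core that answers nearest neighbor and range queries with interchangeable distance functions can be sketched in a few lines. ELKI itself is a Java framework, so the Python below is only a toy illustration: the `ColumnStore` class, the function names, and the linear scan (standing in for a real index) are all hypothetical and are not ELKI's API.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


class ColumnStore:
    """Toy vertical layout: one list per column instead of one tuple per row."""

    def __init__(self, columns):
        self.columns = columns  # dict: column name -> list of values
        self.size = len(next(iter(columns.values())))

    def vector(self, i, names):
        """Assemble the feature vector of object i from the chosen columns."""
        return tuple(self.columns[name][i] for name in names)


def knn(store, names, query, k, distance):
    """k-nearest-neighbor search by linear scan with a caller-supplied distance."""
    scored = ((distance(store.vector(i, names), query), i) for i in range(store.size))
    return heapq.nsmallest(k, scored)


def range_search(store, names, query, radius, distance):
    """Return the ids of all objects within `radius` of the query."""
    return [
        i for i in range(store.size)
        if distance(store.vector(i, names), query) <= radius
    ]


store = ColumnStore({"x": [0.0, 1.0, 5.0, 6.0], "y": [0.0, 1.0, 5.0, 5.0]})
nearest_euclid = knn(store, ("x", "y"), (0.9, 0.9), 2, euclidean)
nearest_manhattan = knn(store, ("x", "y"), (0.9, 0.9), 2, manhattan)
within = range_search(store, ("x", "y"), (5.5, 5.0), 1.0, euclidean)
print(nearest_euclid, nearest_manhattan, within)
```

Because the search functions take the distance as a parameter, any combination of query type and distance can be evaluated without changing the storage code, which mirrors the design goal described above.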

Editor Rating: 7.5
Aggregated User Rating: 8.3 (3 ratings)

13

KNIME Analytics Platform Community

KNIME Analytics Platform is the leading open solution for data-driven innovation, helping you discover the potential hidden in your data, mine for fresh insights, or predict new futures. With more than 1000 modules, hundreds of ready-to-run examples, a comprehensive range of integrated tools, and the widest choice of advanced algorithms available, KNIME Analytics Platform is the perfect toolbox for any data scientist. https://www.youtube.com/watch?v=fw0Vb2gLsgA

Features

• Powerful Analytics
• Data & Tool Blending
• Open Platform
• Over 1000 Modules and Growing
•Connectors for all major file formats and databases
•Support for a wealth of data types: XML, JSON, images, documents, and many more
•Native and in-database data blending & transformation
•Math & statistical functions
•Advanced predictive and machine learning algorithms
•Workflow control
•Tool blending for Python, R, SQL, Java, Weka, and many more
•Interactive data views & reporting

Price

Free

What is best?

•Native and in-database data blending & transformation
•Math & statistical functions
•Advanced predictive and machine learning algorithms

What are the benefits?

• Churn analysis
• Social media sentiment analysis
• Credit scoring

Bottom Line

A vast arsenal of native nodes, community contributions, and tool integrations makes KNIME Analytics Platform the perfect toolbox for any data scientist.

Editor Rating: 8.5
Aggregated User Rating: 8.7 (5 ratings)

14

Scilab

Scilab is an interpreted programming language that comes with a detailed collection of numerical algorithms covering many aspects of scientific problems. Scilab is free software; users do not pay for it. Scilab binaries are available for 32-bit and 64-bit platforms. Its main features include optimization, statistics, maths and simulation, signal processing, application development, 2-D and 3-D visualization, and control system design and analysis. Through its signal processing feature, Scilab provides users with the…

Features

• Optimization
• Statistics
• Signal processing
• Application development
• Maths and Simulation
• 2-D and 3-D visualization
• Control system design and analysis

Price

• Free

What is best?

• Application development
• Maths and Simulation
• 2-D and 3-D visualization

Bottom Line

Scilab is free and open source software built around an interpreted programming language. It is available for 32-bit and 64-bit platforms and addresses many aspects of scientific problems using its collection of numerical algorithms.

Editor Rating: 7.5
Aggregated User Rating: 7.1 (3 ratings)

15

TANAGRA

Tanagra is free data mining software for academic and research purposes. It provides several data mining methods from the areas of exploratory data analysis, statistical learning, machine learning and databases. It is the successor of SIPINA, which means that various supervised learning algorithms are provided, especially an interactive and visual construction of decision trees. Because it covers supervised learning as well as other paradigms such as clustering, factorial analysis, parametric and nonparametric statistics, association rules, feature selection and construction algorithms, Tanagra is very powerful. The main goal of this project is to give researchers and students easy-to-use data mining software, and the second goal is…

Features

•Free data mining software for academic and research purposes
•Provides several data mining methods from exploratory data analysis, statistical learning, machine learning and databases area
•Acts more as an experimental platform
•Open source project

Price

Free

What is best?

•Free data mining software for academic and research purposes
•Provides several data mining methods from exploratory data analysis, statistical learning, machine learning and databases area
•Acts more as an experimental platform

What are the benefits?

• Easy to use data mining software
• Interactive utilization
• A wide set of data sources

Bottom Line

TANAGRA is an "open source project": every researcher can access the source code and add their own algorithms, as long as they agree and conform to the software distribution license. The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that conforms to the present norms of software development in this domain (especially in the design of its GUI and the way it is used) and allows the analysis of either real or synthetic data.

Editor Rating: 7.5
Aggregated User Rating: 7.9 (6 ratings)

16

Dataiku DSS Community

Dataiku DSS is a collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently. Dataiku develops an advanced analytics software solution that enables companies to build and deliver their own data products more efficiently. Dataiku DSS gives data scientists and beginner analysts a collaborative, team-based user interface, a unified framework for both development and deployment of data projects, and immediate access to all the features and tools required to design data products from scratch. The visual interface of Dataiku…

Features

•Data connectors
•Data transformation
•Transformation engines
•Data Visualization
•Data Mining
•Machine Learning

Price

Free

What is best?

•Data connectors
•Data transformation
•Transformation engines

What are the benefits?

•Connect to more than 25 data storage systems
•Extend with plugins
•Visualize and re-run Workflows

Bottom Line

The visual interface of Dataiku DSS empowers people with a less technical background to learn the data mining process, and build projects from raw data to predictive application, without having to write a single line of code.

Editor Rating: 7.5
Aggregated User Rating: 5.1 (5 ratings)

17

DataPreparator

DataPreparator is a free software tool designed to assist with common data preparation (or data preprocessing) tasks in data analysis and data mining. DataPreparator offers features such as character removal, text replacement, date conversion, removing or moving selected attributes, equal-width and equal-frequency discretization (including equal frequency from grouped data), deleting records containing missing values, removing attributes containing missing values, imputing missing values, predicting missing values from a model (dependence tree, Naive Bayes model), including missing value patterns, the Z-score method, the box-plot method, creating binary attributes, replacing nominal values by indices, reducing the number of labels, decimal, linear, hyperbolic tangent, soft-max,…

Overview
Features

• Data access from text files, relational databases, and Excel workbooks
• Handling of large volumes of data (since data sets are not stored in the computer memory, with the exception of Excel workbooks and result sets of some databases where database drivers do not support data streaming)
• Stand alone tool, independent of any other tools
• User friendly graphical user interface
• Operator chaining to create sequences of preprocessing transformations (operator tree)
• Creation of a model tree for test/execution data

Price

• Free

What is best?

• Data access from text files, relational databases, and Excel workbooks
• Handling of large volumes of data (since data sets are not stored in the computer memory, with the exception of Excel workbooks and result sets of some databases where database drivers do not support data streaming)
• Stand alone tool, independent of any other tools

What are the benefits?

• Provides a variety of techniques for data cleaning, transformation, and exploration
• Chaining of preprocessing operators into a flow graph (operator tree)
• Handling of large volumes of data (since data sets are not stored in the computer memory)

Bottom Line

DataPreparator includes operators for cleaning, discretization, numeration, scaling, attribute selection, missing values, outliers, statistics, visualization, balancing, sampling, row selection, and several other tasks.
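To make a few of these operators concrete, here is a minimal sketch in plain Python of equal-width discretization, equal-frequency discretization, and Z-score outlier detection. It is not DataPreparator's own implementation; the function names, the sample data, and the threshold of 2.0 are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to bins that each hold roughly the same number of records."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def z_score_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical attribute with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
width_bins = equal_width_bins(data, 3)
freq_bins = equal_frequency_bins(data, 3)
outliers = z_score_outliers(data)
```

With this sample, equal-width binning puts nine of the ten records in the first bin because the extreme value stretches the range, while equal-frequency binning spreads the records evenly. That contrast is the usual reason to choose one method over the other.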

7.6
Editor Rating
9.6
Aggregated User Rating
3 ratings

DataPreparator

18

ITALASSI

Provalis Research

ITALASSI is a freeware program that facilitates the interpretation of regression models with two independent variables and an interaction term. The program lets you enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) as equations, or compute those equations from raw data, and displays the various models using 2D and 3D graphs. The program may also be used in advanced statistics courses to illustrate statistical interactions or applied multiple regression.
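As a rough illustration of what an interaction term does, the sketch below evaluates a model of the form y = b0 + b1*x1 + b2*x2 + b3*x1*x2 and the "simple slope" of x1 at several values of x2. The coefficients are made up for the example, and the code is not taken from ITALASSI.

```python
def predict(b0, b1, b2, b3, x1, x2):
    """Predicted y for the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def simple_slope(b1, b3, x2):
    """Slope of y with respect to x1 when x2 is held at a fixed value."""
    return b1 + b3 * x2


# Hypothetical coefficients, chosen only for illustration.
coef = dict(b0=1.0, b1=2.0, b2=0.5, b3=-0.25)

# With an interaction, the effect of x1 depends on where x2 sits.
slopes = {x2: simple_slope(coef["b1"], coef["b3"], x2) for x2 in (0, 4, 8)}
y_hat = predict(**coef, x1=2, x2=4)
```

Here the slope of x1 shrinks from 2.0 to 0.0 as x2 goes from 0 to 8. Plotting one regression line per value of x2 shows this kind of pattern in a 2D graph.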

Overview
Features

•Interpretation of regression models (2 independent variables)
•Enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction)
•2D and 3D graphs

Price

Free

What is best?

•Interpretation of regression models (2 independent variables)
•Enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction)
•2D and 3D graphs

What are the benefits?

• Improved 3D rotation
•Can perform analysis on SPSS for Windows .SAV files
•Interpretation of regression models

Bottom Line

The program lets you enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) as equations, or compute those equations from raw data, and displays the various models using 2D and 3D graphs.

7.1
Editor Rating
3.9
Aggregated User Rating
3 ratings

ITALASSI

19

HP Vertica Advanced Analytics

Vertica Advanced Analytics manages and organizes data for business users and analyzes it quickly. The software can manage both structured and semi-structured data. HPE Vertica is marketed as an advanced SQL database analytics portfolio, built from the very first line of code to address demanding Big Data analytics initiatives. According to the vendor, it delivers speed without compromise, scale without limits, and a broad range of consumption models: Vertica can run on premise, in the cloud, or on Hadoop. With support for all leading BI and visualization tools, open source technologies like Hadoop and R, and built-in analytical functions, Vertica…

Overview
Features

• Parallel approach to big data
• Faster data loads and higher concurrency
• Flexibility and scalability
• Columnar storage
• Intelligent compression
• Deploy on premise, in the cloud, and on Hadoop
• Complete and advanced SQL-based analytical functions
• Certification for common ETL and visualization tools
• Geospatial and other advanced analytic functions
• Machine learning models, including regression and k-means, that you can use for prediction and share with Spark
• Integration with Hadoop, including Parquet and ORC files

Price

Contact for Pricing

What is best?

• Parallel approach to big data
• Faster data loads and higher concurrency
• Flexibility and scalability

What are the benefits?

• Ability to deploy anywhere.
• Proactive and predictive analytics.
• In-place analysis through open source integration.

Bottom Line

HPE Vertica is marketed as an advanced SQL database analytics portfolio, built from the very first line of code to address demanding Big Data analytics initiatives. According to the vendor, it delivers speed without compromise, scale without limits, and a broad range of consumption models: on premise, in the cloud, or on Hadoop.

7.6
Editor Rating
3.9
Aggregated User Rating
3 ratings

HP Vertica Advanced Analytics

20

Google Fusion Tables

Fusion Tables is a web application for visualizing data that lets users share data sets and combine them to build data visualizations online. The application is still experimental, and its API is at version 2. It allows users to easily create data visuals and publish them online instantly with provided subsets and an easy format similar to online files. Fusion Tables supports working with larger data sets, including filtering, sorting, and summarizing them in collaboration with other users online. Fusion Tables lets users combine multiple tables from other users and publicly available data, then merge them into one…

Overview
Features

• Visualize Bigger Data Tables Online
• Merge multiple tables into one visualization
• Make a map in minutes
• Host Data Online

Price

•Professional Edition - $4,500/year
•Enterprise edition - $22,000/year

What is best?

• Visualize Bigger Data Tables Online
• Merge multiple tables into one visualization
• Make a map in minutes

Bottom Line

An experimental application to store, share, query, and visualize data tables.
Make custom maps, charts, cards, and tables with your data or public data.

8.5
Editor Rating
8.3
Aggregated User Rating
3 ratings

Google Fusion Tables

21

NodeXL

NodeXL is a network graph application. It comes in two packages: Basic and Pro. Basic is free, and NodeXL is available for Microsoft® Excel® 2007, 2010, 2013 and 2016, which makes exploring network graphs easy. NodeXL Pro extends the features of NodeXL Basic with additional capabilities such as access to social media network data streams, text analysis, sentiment analysis, and advanced network metrics. Both NodeXL Basic and Pro feature graph metric calculations; the only difference is that Pro can calculate degree centrality, PageRank, clustering coefficient…

Overview
Features

•Graph Metric Calculations
•Flexible Import and Export
•Direct Connections to Social Networks
•Zoom and Scale
•Flexible Layout
•Easily Adjusted Appearance
•Dynamic Filtering
•Powerful Vertex Grouping
•Task Automation
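
To give a sense of what graph metric calculations involve, here is a small sketch in plain Python that computes degree centrality and the local clustering coefficient for an undirected graph. The edge list and function names are hypothetical, and this is not how NodeXL (an Excel add-in) is implemented.

```python
from itertools import combinations

# Hypothetical undirected edge list, e.g. "who replied to whom".
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]

# Build an adjacency map: vertex -> set of neighboring vertices.
neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)


def degree_centrality(node):
    """Fraction of the other vertices that this vertex is connected to."""
    return len(neighbors[node]) / (len(neighbors) - 1)


def clustering_coefficient(node):
    """Fraction of pairs of a vertex's neighbors that are themselves connected."""
    nbrs = neighbors[node]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for a, b in pairs if b in neighbors[a])
    return linked / len(pairs)
```

In this toy graph, "cat" is connected to every other vertex, so its degree centrality is 1.0, but only one of the three pairs of its neighbors is linked, so its clustering coefficient is 1/3.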

Price

Free

What is best?

•Graph Metric Calculations
•Flexible Import and Export
•Direct Connections to Social Networks

Bottom Line

NodeXL is a graphing application that allows data from websites and social media platforms to be analyzed through graphical presentation, using graph metric calculations, task automation, and dynamic filtering.

7.6
Editor Rating
8.2
Aggregated User Rating
2 ratings

NodeXL

22

Fluentd

Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. Fluentd offers community-driven support, installation via Ruby gems, self-service configuration, the OS default memory allocator, an implementation in C and Ruby, a memory footprint of about 40 MB, and more than 650 available plugins; it requires a Ruby interpreter and a certain number of gems. Fluentd tries to structure data as JSON as much as possible, which allows it to unify all facets of processing log data: collecting, filtering, buffering, and outputting logs across multiple sources and destinations (the Unified Logging Layer).…
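
The idea of structuring logs as JSON can be sketched in a few lines of Python. The example below turns a plain-text log line into a document with a tag, a timestamp, and a record, which mirrors the shape of a Fluentd event. The log format, the regular expression, and the `to_event` helper are assumptions made for illustration; this is not Fluentd code or configuration.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical plain-text log format: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def to_event(tag, line):
    """Turn one raw log line into a (tag, time, record) style document."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep unparseable lines instead of silently dropping them.
        return {"tag": tag, "time": None, "record": {"message": line}}
    # Assumption: timestamps in the log are in UTC.
    timestamp = datetime.fromisoformat(match["time"]).replace(tzinfo=timezone.utc)
    return {
        "tag": tag,
        "time": int(timestamp.timestamp()),
        "record": {"level": match["level"], "message": match["message"]},
    }


event = to_event("app.web", "2024-01-01T00:00:00 ERROR disk almost full")
print(json.dumps(event))
```

Once every source emits the same structured shape, filtering, buffering, and routing can be written once against the record fields rather than separately for each log format.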

Overview
Features

• Unified Logging with JSON
• Pluggable Architecture
• Minimum Resources Required
• Built-in Reliability

Price

• Free

What is best?

• Unified Logging with JSON
• Pluggable Architecture
• Minimum Resources Required

What are the benefits?

• Fluentd decouples data sources from backend systems by providing a unified logging layer in between.
• Simple and Easy yet Flexible
• Fluentd is Apache 2.0 Licensed

Bottom Line

Fluentd is an open source data collector for building the unified logging layer and runs in the background to collect, parse, transform, analyze and store various types of data.

7.6
Editor Rating
7.0
Aggregated User Rating
5 ratings

Fluentd

23

Displayr

DataCracker

Displayr builds apps that bring data science, visualization, and reporting to everyone. Its two main products are Displayr, which the company describes as the world's first complete data science tool, and Q, a complete toolkit for market researchers. Users can discover the story in their data and create reports, dashboards, and visualizations without reformatting data, having specialist coding knowledge, or involving IT or outside consultants. Displayr is positioned as a BI tool built specifically with survey data in mind. Displayr makes it easy to connect your data from virtually any source (without complex reformatting) letting you focus on…

Overview
Features

• Analyze up to 100 responses per survey
• Easy-to-use web-based analysis tool
• Create tables and charts
• Create word clouds
• Reformat your data
• Interesting results automatically highlighted (significance testing)

Price

• Free
• Basic - $19 per month; Billed $228 annually
• Standard - $25 per month; Billed $300 annually
• The Lot - $65 per month; Billed $780 annually

What is best?

• Analyze up to 100 responses per survey
• Easy-to-use web-based analysis tool
• Create tables and charts

What are the benefits?

• Interesting results are automatically highlighted
• Easy and intuitive to use
• You can share your insights

Bottom Line

DataCracker lets users import data into the survey data analysis software, which mines the data for further insights and then automatically writes a report.

7.6
Editor Rating
8.0
Aggregated User Rating
4 ratings

Displayr

24

NumPy

NumPy is a comprehensive package for scientific computing in the Python programming language. The library supports large multi-dimensional arrays and matrices, and its tightly integrated features make it well suited to scientific computing, array and matrix calculations, and high-level mathematical operations. Its core feature is a powerful N-dimensional array object. NumPy also provides sophisticated, easy-to-use broadcasting functions, as well as tools for integrating C and C++ code. The C++…

Overview
Features

• Powerful N-dimensional array object
• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra
• Fourier transform
• Random number capabilities
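
The short sketch below touches each of the features listed above using standard NumPy calls. The array values are arbitrary and chosen only so the results are easy to check by hand.

```python
import numpy as np

# N-dimensional array object: a 2 x 3 matrix.
a = np.arange(6, dtype=float).reshape(2, 3)

# Broadcasting: the length-3 row is stretched across both rows of `a`.
shifted = a + np.array([10.0, 20.0, 30.0])

# Linear algebra: solve the system M @ x = b.
M = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(M, b)

# Fourier transform: a constant signal has all its energy in the zero-frequency bin.
spectrum = np.fft.fft(np.ones(4))

# Random numbers: a seeded generator gives reproducible draws.
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)
```

Broadcasting is what lets `a + row` work without an explicit loop or a copy of the row for every line of the matrix.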

Price

Free

What is best?

• Tools for integrating C/C++ and Fortran code
• Useful linear algebra
• Fourier transform

What are the benefits?

• Seamlessly and speedily integrate with a wide variety of databases
• Defines data types
• Efficient multi-dimensional container for generic data

Bottom Line

NumPy is a fundamental library for scientific computing in Python that supports large matrices, multi-dimensional arrays, and high-level mathematical functions.

7.6
Editor Rating
8.5
Aggregated User Rating
2 ratings

NumPy

25

OpenRefine

OpenRefine is a sophisticated tool for working with big data and performing analytics. It can clean data, transform it from one format into another, and extend it with web services and external data. The explore data feature lets data scientists go through large data sets with ease; it is easy to use and comes with a video explaining how it works. The clean and transform data feature provided by OpenRefine also lets data scientists clean…

Overview
Features

• Explore data
• Clean and transform data
• Reconcile and match data
• General Refine Expression Language

Price

Free

What is best?

• Explore data
• Clean and transform data
• Reconcile and match data

What are the benefits?

• Import data in various formats
• Explore datasets in a matter of seconds
• Apply basic and advanced cell transformations

Bottom Line

OpenRefine is a sophisticated tool that can clean data, transform it from one format into another, extend it with web services and external data, work on big data, and perform analytics.

7.6
Editor Rating
7.7
Aggregated User Rating
3 ratings

OpenRefine

26

Julia

Julia is a high-level, high-performance programming language for numerical computation. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia programs are organized around multiple dispatch: functions are defined and compiled for different combinations of argument types, which can also be user-defined. Multiple dispatch lets scientists define function behavior across many combinations of argument types. Julia also features a dynamic type system, with types used for documentation, dispatch,…
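
Multiple dispatch is easiest to grasp with a toy example. The sketch below imitates the idea in Python, which has no built-in equivalent (its `functools.singledispatch` looks only at the first argument). The `dispatch` decorator, the `call` helper, and the `combine` methods are all hypothetical names invented for this illustration. Unlike Julia, this sketch picks the first registered match rather than the most specific one.

```python
_registry = {}


def dispatch(*types):
    """Register an implementation for one combination of argument types."""
    def decorator(func):
        _registry[(func.__name__, types)] = func
        return func
    return decorator


def call(name, *args):
    """Run the first registered implementation whose types match every argument."""
    for (registered_name, types), func in _registry.items():
        if (registered_name == name and len(types) == len(args)
                and all(isinstance(a, t) for a, t in zip(args, types))):
            return func(*args)
    raise TypeError(f"no method {name} for {[type(a).__name__ for a in args]}")


@dispatch(int, int)
def combine(a, b):
    return a + b  # two numbers: add them


@dispatch(str, int)
def combine(a, b):
    return a * b  # text and a count: repeat the text


@dispatch(str, str)
def combine(a, b):
    return f"{a} {b}"  # two pieces of text: join them
```

Calling `call("combine", 2, 3)`, `call("combine", "ab", 2)`, and `call("combine", "a", "b")` each runs a different implementation, selected by the types of both arguments together.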

Overview
Features

• Dynamic type system
• Multiple dispatch
• Built-in package manager
• Call Python functions
• Call C functions directly

Price

Free

What are the benefits?

• Provides distributed parallel execution
• Provides library for random number generation
• Ability to overload different combinations of argument types

Bottom Line

Julia is a high-level, high-performance programming language that offers a sophisticated compiler, distributed parallel execution, an extensive mathematical function library, and numerical accuracy.

7.6
Editor Rating
8.5
Aggregated User Rating
2 ratings

Julia

27

Massive Online Analysis

Massive Online Analysis (MOA) is an open source framework for data stream mining. It includes a collection of machine learning algorithms for regression, classification, clustering, outlier detection, recommender systems, and concept drift detection, along with tools for evaluating data stream mining. MOA suits data scientists who need to mine big data streams in real time and perform large-scale machine learning. It can be extended with new mining algorithms, stream generators, or evaluation measures. Massive Online Analysis features…

Overview
Features

•Machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
•Stream mining in real time, and large scale machine learning.
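
Evaluating a learner on a stream differs from the usual train/test split because the data arrives once and in order. A common scheme is test-then-train (also called prequential) evaluation, sketched below in plain Python. MOA itself is a Java framework, so this is an illustration of the idea only; the toy learner, the stream, and the function names are assumptions.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner: always predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test on each example first, then train on it, in a single pass."""
    correct = total = 0
    for features, label in stream:
        if learner.predict(features) == label:
            correct += 1
        total += 1
        learner.learn(features, label)
    return correct / total if total else 0.0


# Hypothetical four-example stream of (features, label) pairs.
stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = prequential_accuracy(iter(stream), MajorityClassLearner())
```

Every example is used for testing before the learner has seen its label, so the accuracy estimate needs no separate hold-out set and the stream never has to be stored.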

Price

Free

What is best?

•Machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
•Stream mining in real time, and large scale machine learning.

What are the benefits?

• Easily used with Apache Flink, Apache Storm, S4 or Samza
• Handles complex knowledge workflows
• Enables multi-label classification

Bottom Line

Massive Online Analysis is an open source framework for data stream mining that includes a collection of machine learning algorithms for regression, clustering, classification, outlier detection, concept drift detection, and recommender systems.

7.6
Editor Rating
6.3
Aggregated User Rating
2 ratings

Massive Online Analysis

28

DataWrangler

DataWrangler is a web-based service designed for cleaning and rearranging data into a form that other tools, such as a spreadsheet app, can use. It can export a transformation script as code, which is useful for handling large data sets: users first transform a sample of their data in the Wrangler interface, then run the resulting script on the full data set. Output scripts are supported in two languages: Python (for data-crunching on the back end) and JavaScript (for transforming in the browser, or using node.js). DataWrangler…

Overview
Features

• Designed to accelerate analysis and visualization tools
• Interactive transformation of messy, real-world data into the data tables analysis tools expect.
• Export data for use in Excel, R, Tableau, Protovis

Price

Free

What is best?

• Designed to accelerate analysis and visualization tools
• Interactive transformation of messy, real-world data into the data tables analysis tools expect.
• Export data for use in Excel, R, Tableau, Protovis

Bottom Line

Wrangler is an interactive tool for data cleaning and transformation.

8.5
Editor Rating
8.6
Aggregated User Rating
1 rating

DataWrangler

29

EasyReg

EasyReg is open source software that conducts various econometric estimation and testing tasks on 32-bit and 64-bit Windows platforms, including Windows 7. Windows 8 users can also run EasyReg by setting its compatibility mode to Windows XP. EasyReg is programmed in Visual Basic 5 (Enterprise Edition). It is designed for teaching econometrics and for empirical research. The software is referred to as international because it accepts both commas and dots as delimiters in decimal…

Overview
Features

•Tabulating data.
•Calculating summary statistics of the data: sample mean and standard error, minimum, maximum
•Plotting time series.
•Drawing scatter diagrams.
•Kernel estimation of univariate and bivariate density functions (two versions: standard kernel density estimation, and Bierens' SMINK estimation).
•Auto- and cross-correlation functions for time series. In the autocorrelation case, the Box-Pierce Q statistics, Ljung-Box Q statistics, and the partial autocorrelations can also be computed.
•Periodogram of a time series
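
For readers unfamiliar with the autocorrelation statistics named above, here is a minimal sketch in plain Python using the textbook formulas: Box-Pierce Q = n * sum(r_k^2) and Ljung-Box Q = n(n+2) * sum(r_k^2 / (n-k)), where r_k is the sample autocorrelation at lag k. The function names and the sample series are assumptions, and the code is not taken from EasyReg.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a time series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def box_pierce_q(series, max_lag):
    """Box-Pierce Q statistic over lags 1..max_lag."""
    n = len(series)
    return n * sum(autocorrelation(series, k) ** 2 for k in range(1, max_lag + 1))


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic: Box-Pierce with a small-sample correction."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Hypothetical alternating series with strong negative lag-1 autocorrelation.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1 = autocorrelation(series, 1)
```

Both statistics test whether the first few autocorrelations are jointly zero. The Ljung-Box version weights each lag by (n+2)/(n-k), which makes it behave better in short samples.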

Price

• Free

What is best?

•Tabulating data.
•Calculating summary statistics of the data: sample mean and standard error, minimum, maximum
•Plotting time series.

Bottom Line

EasyReg is free open source software that lets users on Windows platforms such as Windows 7 and Windows 8 perform a range of econometric estimation and testing tasks.

7.5
Editor Rating
7.5
Aggregated User Rating
1 rating

EasyReg

30

Matplotlib

Matplotlib is a Python library for making 2D plots of arrays; it produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Its API includes the top-level matplotlib module, afm (Adobe Font Metrics interface), the animation module, the artist module, the Axes class, the axis and tick API, backends, cbook, cm (colormap), collections, colorbar, colors, container, dates, dviread, figure, finance, font_manager, gridspec, image, legend and legend_handler, lines, markers, mathtext, mlab, offsetbox, patches, path, patheffects, projections, pyplot, rcsetup, sankey, scale, spines, style, text, ticker, tight_layout, transformations, triangular grids, type1font, units, and widgets. Users can…

Overview
Features

• Improved color conversion API and RGBA support
• New Configuration (rcParams)
• Qualitative colormaps
• Axis offset label now responds to labelcolor
• Improved offset text choice
• Style parameter blacklist
• Change in default font
• Faster text rendering
• Improvements for the Qt figure options editor
• Improved image support
• Support for HiDPI (Retina) displays in the NbAgg and WebAgg backends
• Change in the default animation codec
• Deprecated support for mencoder in animation
• Boxplot Zorder Keyword Argument
• Filled + and x markers
• rcount and ccount for plot_surface()
• Streamplot Zorder Keyword Argument Changes
• Extension to matplotlib.backend_bases.GraphicsContextBase

Price

• Free

What is best?

• Improved color conversion API and RGBA support
• New Configuration (rcParams)
• Qualitative colormaps

What are the benefits?

• Produces publication quality figures
• Can be used in Python scripts
• Generate plots

Bottom Line

Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
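
As a minimal sketch of script usage, the example below draws a sine curve and saves it as a PNG file. It selects the non-interactive Agg backend so that it runs without a display; the output location (a temporary directory) and the file name are arbitrary choices for the example.

```python
import math
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Sample sin(x) on [0, 6.2].
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple 2D line plot")
ax.legend()

# Write the figure to a temporary directory as a PNG file.
out_path = os.path.join(tempfile.mkdtemp(), "sine.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
```

In a Jupyter notebook or an interactive shell the same plotting calls apply; only the backend and the final save or show step differ.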

7.6
Editor Rating
8.1
Aggregated User Rating
1 rating

Matplotlib

31

IPython

IPython is an open source project (BSD license) that provides easy-to-use, high-performance tools for parallel computing. IPython offers features such as the Jupyter notebook and notebook file format, the Jupyter Qt console, the kernel messaging protocol, ipyparallel (formerly IPython.parallel), ipykernel (minimal docs, only release notes for the ipykernel package), ipywidgets (formerly IPython.html.widgets), Traitlets (the config system used by IPython and Jupyter), an enhanced interactive Python shell, a decoupled two-process communication model, and an architecture for interactive parallel computing. IPython is known to work on Linux, most other Unix-like OSs (AIX, Solaris, BSD), Mac OS X and Windows (CygWin, XP,…

Overview
Features

• A powerful interactive shell.
• A kernel for Jupyter.
• Support for interactive data visualization and use of GUI toolkits.
• Flexible, embeddable interpreters to load into your own projects.
• Easy to use, high performance tools for parallel computing.

Price

• Free

What is best?

• Support for interactive data visualization and use of GUI toolkits.
• Flexible, embeddable interpreters to load into your own projects.
• Easy to use, high performance tools for parallel computing.

What are the benefits?

• Provides a rich architecture for interactive computing
• Used as a library by a range of other projects

Bottom Line

IPython is a growing project with increasingly language-agnostic components; it provides a rich architecture for interactive computing.

7.6
Editor Rating
7.9
Aggregated User Rating
1 rating

IPython

32

SymPy

SymPy is a Python library for symbolic mathematics that can simplify expressions; compute derivatives, integrals, and limits; solve equations; and work with matrices. SymPy includes modules for plotting (coordinate modes, plotting geometric entities, 2D and 3D, interactive interface, colors, and Matplotlib support); printing (2D pretty-printed output of math formulas, or LaTeX); code generation; physics; statistics; combinatorics; number theory; geometry; and logic. It also offers conversion from Python objects to SymPy objects, optional implicit multiplication and function application parsing, limited Mathematica and Maxima parsing, custom parsing transformations, and ciphers such as the Shift cipher, Affine cipher, Bifid…

Overview
Features

• Basic arithmetic: Support for operators such as +, -, *, /, ** (power)
• Simplification: trigonometry, polynomials
• Expansion: of a polynomial
• Functions: trigonometric, hyperbolic, exponential, roots, logarithms, absolute value, spherical harmonics, factorials and gamma functions, zeta functions, polynomials, special functions
• Numbers: arbitrary precision integers, rationals, and floats
• Noncommutative symbols
• Pattern matching
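
A few of these capabilities can be sketched with standard SymPy calls. The expressions below are arbitrary examples, chosen so the results are easy to verify by hand.

```python
import sympy as sp

x = sp.symbols("x")

# Simplification (trigonometry): sin^2 + cos^2 collapses to 1.
simplified = sp.simplify(sp.sin(x) ** 2 + sp.cos(x) ** 2)

# Expansion of a polynomial.
expanded = sp.expand((x + 1) ** 2)

# Calculus: derivative, indefinite integral, and limit.
derivative = sp.diff(x * sp.sin(x), x)
integral = sp.integrate(2 * x, x)
limit_value = sp.limit(sp.sin(x) / x, x, 0)

# Equation solving: roots of x^2 - 4.
roots = sp.solve(x ** 2 - 4, x)

# Numbers: exact rational arithmetic instead of floating point.
exact = sp.Rational(1, 3) + sp.Rational(1, 6)
```

Results stay symbolic or exact throughout; for instance, `exact` is the rational number 1/2 rather than a floating point approximation.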

Price

• Free

What is best?

• Basic arithmetic: Support for operators such as +, -, *, /, ** (power)
• Simplification: trigonometry, polynomials
• Expansion: of a polynomial

What are the benefits?

• Licensed under BSD
• Written entirely in Python
• Uses Python for its language

Bottom Line

SymPy is a free library written entirely in Python; it depends only on mpmath, a pure Python library for arbitrary-precision floating point arithmetic, which makes it easy to install and use.

7.6
Editor Rating
9.3
Aggregated User Rating
3 ratings

SymPy

33

FreeMat

FreeMat is an environment for rapid engineering and scientific processing, similar to commercial systems such as MATLAB from MathWorks and IDL from Research Systems, but open source. FreeMat offers features such as a codeless interface to external C/C++/FORTRAN code, parallel/distributed algorithm development (via MPI), and advanced volume and 3D visualization capabilities. FreeMat now supports function handles (function pointers): a function handle is an alias for a function or script that is stored in a variable. FreeMat also supports so-called dynamic-field indexing expressions, where the field name is supplied through an expression instead of…

Overview
Features

• N-dimensional array manipulation (by default, N is limited to 6)
• Support for 8, 16, and 32 bit integer types (signed and unsigned), 32 and 64 bit floating point types, and 64 and 128 bit complex types.
• Built in arithmetic for manipulation of all supported data types.
• Support for solving linear systems of equations via the divide operators.
• Eigenvalue and singular value decompositions
• Full control structure support (including for, while, break, continue, etc.)
• 2D plotting and image display
• Heterogeneous array types (called "cell arrays" in MATLAB-speak) fully supported
• Full support for dynamic structure arrays
• Split-radix based FFT support
• Pass-by-reference support (an IDL feature)
• Keyword support (an IDL feature)
• Codeless interface to external C/C++/FORTRAN code
• Native Windows support
• Native sparse matrix support
• Native support for Mac OS X (no X11 server required).
• Function pointers (eval and feval are fully supported)
• Classes, operator overloading
• 3D Plotting and visualization via OpenGL
• Handle-based graphics
• 3D volume rendering capability (via VTK)

Price

• Free

What is best?

• N-dimensional array manipulation (by default, N is limited to 6)
• Support for 8, 16, and 32 bit integer types (signed and unsigned), 32 and 64 bit floating point types, and 64 and 128 bit complex types.
• Built in arithmetic for manipulation of all supported data types.

What are the benefits?

• FreeMat is now easier to build.
• The top level README.TXT includes instructions on how to build FreeMat on all three platforms (Linux, Mac OS X, and Mingw32).
• Documentation has been migrated to doxygen.

Bottom Line

FreeMat offers native Windows support, native sparse matrix support, native support for Mac OS X (no X11 server required), function pointers (eval and feval are fully supported), classes, operator overloading, 3D plotting and visualization via OpenGL, handle-based graphics, and 3D volume rendering capability (via VTK).

7.6
Editor Rating
7.7
Aggregated User Rating
2 ratings

FreeMat

34

jMatLab

jMatLab is a free platform for mathematical and numerical computations; it is a clone of Matlab and Octave and runs on any platform where Java is installed or in a web browser. jMatLab provides features such as Arithmetic, Variables, String Manipulations, Commands and Operators, Functions, Polynomials, Vectors, Differentiation, Equations (Differential), Equations (Linear Systems), Equations (Nonlinear), Equations (Nonlinear Systems), Indefinite Integrals, Input and Output, Matrices, Numerical Integration, Plots, Programming, Statistics (Data Fitting), Statistics (Descriptive), Statistics (Histograms), Statistics (Random Numbers), Taylor polynomials, and Transformations. jMathLab has its own help system where all programming modules are arranged in groups and users can list all…

Overview
Features

• Used for Symbolic calculations
• Numeric Evaluation of mathematical functions, special functions
• Linear algebra with vectors and matrices
• Displaying data, vectors, matrices and functions using 2D and 3D interactive plots
• Saving data in CSV files
• Random numbers using the major distributions
• Solving linear and non-linear equations and systems of equations
• Basic statistical calculations and histogramming

Price

• Free

What is best?

• Used for Symbolic calculations
• Numeric Evaluation of mathematical functions, special functions
• Linear algebra with vectors and matrices

What are the benefits?

• A clone of Matlab and Octave
• Runs on any platform where Java is installed
• Can also run in a web browser

Bottom Line

jMatLab can be used for simplification, differentials, integration, vectors and matrices.

7.6
Editor Rating
6.0
Aggregated User Rating
1 rating

jMatLab

35

PAW

PAW is an instrument conceived to assist physicists in analyzing and presenting data. It provides interactive statistical or mathematical analysis and graphical presentation. The interactive graphical presentation lets physicists work on objects familiar to them, such as event files, vectors, and histograms. The PAW presentation feature provides a set of slides, mainly in PostScript format, that gives a general overview of the entire PAW system and an almost complete review of its functionalities. The PAW functionalities presented in the set of slides in PostScript format are…

Overview
Features

• Pawpict package
• PAW presentation
• Neural networks
• WebPAW
• Hints to speed up Ntuple analysis

Price

Free

What is best?

• Pawpict package
• PAW presentation
• Neural networks

What are the benefits?

• Easier to include pictures in LaTeX documents
• Able to send commands or execute through any Web browser and receive resulting pictures
• Easy to understand functionalities for people with no knowledge of PAW

Bottom Line

PAW is an instrument used by physicists to analyze and present their data. It combines statistical or mathematical analysis with interactive graphical presentation of familiar objects such as event files, histograms, and vectors.

7.6
Editor Rating
9.1
Aggregated User Rating
1 rating

PAW

36

ILNumerics

ILNumerics is based on modern software frameworks and provides tools and solutions for scientists and engineers in all industries. The framework enables data scientists and engineers to develop and deploy highly configured technical applications in the shortest time possible. ILNumerics features the array visualizer, a graphical watch window for Visual Studio. The array visualizer helps scientists debug large data in a broad range of technical applications. Its visual representation of arbitrary data lets you prototype your algorithms and find bugs quickly and also have…

Overview
Features

• Array visualizer
• Visualization engine
• Computing engine

Price

• Visualization engine - 89 EUR/month
• Computing engine - 89 EUR/month
• Interpolation Toolbox - 69 EUR/month
• Optimization Toolbox - 69 EUR/month
• Statistical Toolbox - 49 EUR/month
• Machine Learning Toolbox - 49 EUR/month
• HDF5 (Hierarchical Data Format) - 49 EUR/month
• Drawing2 Plotting Extensions - 49 EUR/month

Website
What is best?

• Array visualizer
• Visualization engine
• Computing engine

What are the benefits?

• No IT expert knowledge needed
• Finds bugs easily
• Eliminates multithreading concurrency issues

Bottom Line

ILNumerics is built on modern software frameworks and provides tools and solutions that help engineers and data scientists develop and deploy complex technical applications.

7.6
Editor Rating
8.1
Aggregated User Rating
2 ratings

ILNumerics

37

ROOT

ROOT

ROOT is a sophisticated scientific software framework that provides the functions needed for statistical analysis, large-scale data processing, storage, and visualization. ROOT is written mainly in C++ but integrates with other programming languages such as Python and R. The Save data feature lets users save their data, including C++ objects, in a compressed binary form in a ROOT file. The object format is saved in the same file, which makes ROOT files self-descriptive. The ROOT file contains information that…

Overview
Features

• Save data
• Access data
• Process data
• Show results
• Integration with other languages
• Interactive or built application

Price

Contact for Pricing

Website
What is best?

• Process data
• Show results
• Integration with other languages

What are the benefits?

• Save data in a compressed binary form
• Access data from your PC
• Able to simulate complex systems

Bottom Line

ROOT is a modular scientific software framework, written mainly in C++, that provides the functions data scientists need for large-scale data processing, statistical analysis, visualization, and storage.

7.6
Editor Rating
8.8
Aggregated User Rating
1 rating

ROOT

38

NetworkX

NetworkX

NetworkX is a Python software package for creating, manipulating, and studying the structure, dynamics, and functions of complex networks. Put simply, it is software well suited to analyzing complex networks, and it allows results to be presented graphically. Data structures are provided for graphs, digraphs, and multigraphs. Because NetworkX is a Python package, it supports fast prototyping, is easy to teach, and runs on multiple platforms. Data scientists also get several standard graph algorithms that are useful when dealing with complex networks. NetworkX also features generators. The generators…

Overview
Features

• Many standard graph algorithms
• Edges can hold arbitrary data
• Network structures and analysis measures
• Data structures for graphs, digraphs, and multigraphs
• Generators for classic graphs, random graphs and synthetic networks
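
The idea behind "edges can hold arbitrary data" and "standard graph algorithms" can be sketched in plain Python. The block below is only an illustration of the data model and does not use NetworkX itself: the class name `TinyGraph`, its methods, and the sample edges are hypothetical. In NetworkX the corresponding steps would start from `nx.Graph()` and `G.add_edge(u, v, weight=...)`.

```python
from collections import deque


class TinyGraph:
    """Hypothetical stand-in for an undirected graph; not part of NetworkX."""

    def __init__(self):
        # node -> neighbor -> dict of edge attributes
        self.adj = {}

    def add_edge(self, u, v, **attrs):
        # Both directions share one attribute dict, so the edge is undirected.
        self.adj.setdefault(u, {})[v] = attrs
        self.adj.setdefault(v, {})[u] = attrs

    def degree(self, node):
        return len(self.adj[node])

    def shortest_path(self, source, target):
        """Breadth-first search: fewest edges, ignoring edge attributes."""
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == target:
                # Walk back through the parents to rebuild the path.
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for neighbor in self.adj[node]:
                if neighbor not in parents:
                    parents[neighbor] = node
                    queue.append(neighbor)
        return None  # target not reachable from source


g = TinyGraph()
g.add_edge("a", "b", weight=0.5, label="friend")  # arbitrary edge data
g.add_edge("b", "c", weight=2.0)
g.add_edge("c", "d", weight=1.0)
g.add_edge("a", "d", weight=4.0)

path = g.shortest_path("a", "c")
print(path, g.adj["a"]["b"])
```

The nested-dictionary layout keeps neighbor lookups cheap and lets each edge carry any attributes the analysis needs, such as weights or labels.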

Price

Contact for Pricing

Website
What is best?

• Many standard graph algorithms
• Edges can hold arbitrary data
• Network structures and analysis measures

What are the benefits?

• Enables creation of complex networks
• Enables manipulation of complex networks
• Enables study of structure of complex networks

Bottom Line

NetworkX is an open source Python package for analyzing complex networks: creating them, manipulating them, and studying their structure, dynamics, and functions.

7.6
Editor Rating
8.4
Aggregated User Rating
2 ratings

NetworkX

39

Arcadia Data Instant

Arcadia Data Instant uses smart acceleration to enable ultra-fast analytics and BI with agile drag-and-drop access. It provides an in-cluster execution engine for scale-out performance on Apache Hadoop and other modern data platforms with no data movement. Arcadia Data Instant also supports visualizations on Apache Kafka, so users can quickly download a kit and start exploring visualizations of Kafka topics. The key features offered by Arcadia Data Instant include connect, discover, model, visualize, interact, manage, scale, optimize, security, share and publish, and advanced analytics. The connect feature allows accessing data inside Hadoop…

Overview
Features

• The discover feature lets users browse data sources, structure and content, with full granularity and transparency
• Set hierarchies and logical datasets, for blending visualizations across sources
• The visualize feature provides easy-to-use, familiar, web-based, self-service drag-and-drop authoring
• Flow and funnel algorithms that make it easy to measure correlation
• Create semantic relationships across multiple sources
• Assemble dashboards and applications of visuals that show the user’s work

Price

Contact for pricing

What is best?

• The discover feature lets users browse data sources, structure and content, with full granularity and transparency
• Set hierarchies and logical datasets, for blending visualizations across sources
• The visualize feature provides easy-to-use, familiar, web-based, self-service drag-and-drop authoring

What are the benefits?

• Provides an in-cluster execution engine for scale-out performance on Apache Hadoop
• Achieve linear scalability of records with native in-cluster execution
• Simplifies deployment and monitoring with certified integration

Bottom Line

Arcadia Data Instant is an analytics and BI platform that provides an in-cluster execution engine for scale-out performance on Apache Hadoop and other modern data platforms with no data movement.

7.6
Editor Rating
7.9
Aggregated User Rating
3 ratings

Arcadia Data Instant

40

SIGVIEW

SIGVIEW

SIGVIEW is a real-time and offline signal analysis software package that includes a wide range of powerful signal analysis tools, statistics functions and a comprehensive visualization system. Since it is distributed as shareware, users can download a fully functional version and try it for 21 days to establish whether it meets the needs of the business. SIGVIEW’s unique, friendly user interface and philosophy give users the freedom to combine different signal analysis methods in any way they like, which makes the software easier to use and lets them focus more on how it can help the…

Overview
Features

• Various statistics functions
• Custom filter curves can be freely defined and applied directly to time-domain signal or to the calculated spectrum
• Advanced signal display and handling options
• Signal generator
• Support for wide range of data acquisition devices
• Real time data display
• Import and export of signal files in numerous formats: WAV, MP3, ASCII, WMA, AU, AIFF, SND, 8/16/32-bit binary files, EDF...
• Optimized FFT algorithm
• Spectrogram and Time-FFT functions with powerful graphical display solutions
• Dual channel (cross-spectral) analysis
• Signal filtering (Bandstop, Bandpass, Lowpass, Highpass)
• Real-time arithmetic on signals (subtract, multiply, add, scale, normalize...)
• Graphical block diagram environment
• Custom tools and workspaces
• Various command-line options for automation and remote control from external applications or from simple batch files
• No artificial or license-based limitations
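
To make the spectrum-analysis features above more concrete, here is a minimal sketch of what an FFT-style analysis computes. It is not SIGVIEW code and does not use any SIGVIEW interface: the function name `dft_magnitudes`, the test signal, and the 64 Hz sample rate are all assumptions chosen for illustration, and the naive discrete Fourier transform shown here is far slower than an optimized FFT.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns normalized magnitudes for bins 0..n/2."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes


# Assumed test signal: a 5 Hz tone plus a weaker 12 Hz tone, 1 second long.
sample_rate = 64  # samples per second (illustrative value)
n = 64
signal = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(n)
]

mags = dft_magnitudes(signal)
# The strongest bin tells us the dominant frequency in the signal.
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * sample_rate / n
print(f"dominant frequency: {peak_hz} Hz")
```

With one second of data each bin is 1 Hz wide, so the bin index maps directly to a frequency. A band filter of the kind listed above would then keep or suppress selected bins.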

Price

• SIGVIEW Standard Version
Single Seat License $139.00/month
5-Pack Seat License $490.00/month
10-Pack Seat License $690.00/month
Site License $1290.00/month

• SIGVIEW Educational
5-Pack Seat License $390.00/month
10-Pack Seat License $590.00/month
Site License $1090.00/month

Website
What is best?

• Support for wide range of data acquisition devices
• Real time data display
• Import and export of signal files in numerous formats: WAV, MP3, ASCII, WMA, AU, AIFF, SND, 8/16/32-bit binary files, EDF...

What are the benefits?

• Real time data display, signal analysis and control.
• Import and export of signal files in numerous formats.
• Signal filtering.

Bottom Line

SIGVIEW’s unique, friendly user interface and philosophy give users the freedom to combine different signal analysis methods in any way they like, which makes the software easier to use and lets them focus on how it can help the company.

7.6
Editor Rating
4.8
Aggregated User Rating
5 ratings

SIGVIEW

41

Gephi

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, it lets the user interact with the representation and manipulate the structures, shapes and colors to reveal hidden patterns. The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structure singularities or faults during data sourcing. Gephi is open-source software for network visualization and analysis. It helps data analysts intuitively reveal trends and patterns, highlight outliers and tell stories with their data. It uses a 3D render engine to display large graphs in real-time and to…

Overview
Features

• Networks up to 100,000 nodes and 1,000,000 edges
• Iterate through visualization using dynamic filtering
• Rich tools for meaningful graph manipulation
• Force-based algorithms
• Optimize for graph readability
• Betweenness Centrality, Closeness, Diameter, Clustering Coefficient, PageRank
• Community detection (Modularity)
• Random generators
• Shortest path
• Import temporal graph with the GEXF file format
• Run metrics over time (clustering coefficient)
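
One of the metrics listed above, the clustering coefficient, is easy to work through by hand. The sketch below is plain Python rather than anything from Gephi, which is a desktop application. The four-node graph, the `adj` dictionary, and the function name `local_clustering` are hypothetical choices for illustration. For a node with k neighbors, the local coefficient is the number of links among those neighbors divided by k(k-1)/2.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Assumed example: a triangle a-b-c with one extra node d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

coefficients = {node: local_clustering(adj, node) for node in adj}
average = sum(coefficients.values()) / len(coefficients)
print(coefficients, average)
```

Nodes a and b sit entirely inside the triangle and score 1.0, node c scores 1/3 because only one of its three neighbor pairs is linked, and node d scores 0.0.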

What is best?

• Graph streaming ready
• Customizable PDF, SVG and PNG export
• Save presets

What are the benefits?

• User can interact with the representation
• User can manipulate the structures, shapes and colors to reveal hidden patterns
• User can intuitively discover patterns

Bottom Line

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, it lets the user interact with the representation and manipulate the structures, shapes and colors to reveal hidden patterns.

9.5
Editor Rating
6.2
Aggregated User Rating
7 ratings

Gephi

What are Data Analysis Software?

Data analysis software is a tool with the statistical and analytical capability to inspect, clean, transform, and model data in order to derive important information for decision making. The software allows one to explore the available data and to understand and analyze complex relationships.

What are the Top Free Data Analysis Software?

Orange Data mining, Anaconda, R Software Environment, Scikit-learn, Weka Data Mining, Shogun, Tableau Public, DataMelt, Microsoft R, Trifacta, SciPy, ELKI, KNIME Analytics Platform Community, Scilab, TANAGRA, Dataiku DSS Community, DataPreparator, ITALASSI, HP Vertica Advanced Analytics, Google Fusion Tables, NodeXL, Fluentd, Displayr, NumPy, OpenRefine, Julia, Massive Online Analysis, DataWrangler, EasyReg, Matplotlib, IPython, SymPy, FreeMat, jMatLab, PAW, ILNumerics, ROOT, NetworkX, Arcadia Data Instant, SIGVIEW, and Gephi are some of the top free or open source software for data analysis.

2 Reviews
  • Dmitri
    May 23, 2014 at 10:59 pm

    ADDITIONAL INFORMATION
    One correction: at the very top, ScaVi should be called “ScaVis”. I should say I like SCaVis the best, since I can program in Python while accessing very rich Java numerical libraries.

  • Robert Nutt
    August 8, 2014 at 1:12 am

    ADDITIONAL INFORMATION
    A related tool for data analysis is json-csv.com. It is an online converter which can convert any JSON to CSV for processing within a spreadsheet.

About The Author
imanuel