
Top 41 Free Data Analysis Software

Data Analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making.

For an organization to excel in its operations, it has to make timely and informed decisions. More often than not, decision making relies on the available data. But data alone is not enough: to get the best out of your data, you must ensure that it is authentic. For data to be authentic, it has to be current, accurate, and reliable.

The hardest part for any organization is retrieving and analyzing relevant data in order to gain business insight that can be used in decision making. To analyze data effectively, most organizations are now shifting their focus to data analysis software. With advances in technology, software developers have come up with modern data analysis software that makes it easy to retrieve, interact with, and visualize the available data, with the aim of improving the delivery of goods and services.

Top Free Data Analysis Software: Orange Data mining, R Software Environment, Weka Data Mining, Tableau Public, Microsoft R, Shogun, DataMelt, Arcadia Data, RapidMiner Starter Edition, Lavastorm Analytics Engine, Trifacta, ITALASSI, ELKI, Scikit-learn, Gephi, KNIME Analytics Platform Community, SciPy, Julia, Data Applied, TANAGRA, Google Fusion Tables, NodeXL, Dataiku DSS Community, Scilab, DataPreparator, DataCracker, NumPy, OpenRefine, Massive Online Analysis, DataWrangler, EasyReg, Matplotlib, IPython, SymPy, FreeMat, jMatLab, Fluentd, PAW, ILNumerics, ROOT, NetworkX, Watson Studio, and Arcadia Data Instant are some of the top free or open source software for data analysis.

What Is Data Analysis Software?

Data analysis software is a tool with the statistical and analytical capability to inspect, clean, transform, and model data in order to derive important information for decision making. The software allows one to explore the available data and to understand and analyze complex relationships. Besides statistical analysis, such a tool typically also has powerful visualization capabilities, which make it possible to share the data with other stakeholders.

Data analysis can be classified into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). Descriptive statistics deals with quantitatively describing the main features of a collection of information. Exploratory data analysis focuses on discovering new features in the data. Confirmatory data analysis deals with confirming or falsifying existing hypotheses.
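
To make the distinction concrete, here is a minimal sketch in Python using only the standard library. The two samples (delivery times before and after a process change) are hypothetical, and the permutation test is just one simple way to run a confirmatory check; the article does not prescribe a particular method.

```python
import random
import statistics

# Hypothetical samples: delivery times in days before and after a process change.
before = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 5.4, 5.7]
after = [4.6, 4.9, 4.4, 5.0, 4.7, 5.2, 4.5, 4.8]

# Descriptive statistics: summarize the main features of each sample.
summary = {
    name: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }
    for name, values in (("before", before), ("after", after))
}


def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Confirmatory check: how often does a random relabeling of the pooled
    data produce a mean difference at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(
            statistics.mean(pooled[: len(a)]) - statistics.mean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    return hits / n_resamples


p_value = permutation_p_value(before, after)
print(summary)
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value suggests that the hypothesis "the change made no difference" is hard to reconcile with the data; exploratory analysis would typically come first, by plotting or tabulating the samples to decide which hypothesis is worth testing.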

Typical features of data analysis software include:

  1. Data acquisition: Allows one to import data from various sources using an import wizard. Once the data is imported, you can carry out the analysis with a query builder. All you need to do is identify your primary table and secondary tables. From there, the query builder automatically matches the primary key of your primary table with the corresponding key in the secondary tables, saving you time.
  2. Data clean-up: For data analysis to be effective, you must have clean and reliable data. The software has features that make it easy to clean up the data and make it more reliable for analysis. Besides clean-up, the software can also be used to consolidate different categories from multiple entries for accurate tabulation.
  3. Data visualization: This is a powerful tool that allows one to identify patterns and trends in a data set. The tool makes it easy to explore data in different formats, ranging from graphs to pie charts. The graph builder helps one explore the data and build interactive graphical displays with ease. You can also combine multiple graphs for easy and insightful analysis of your data.
  4. Basic data analysis: Using a distribution platform, the software makes it easy to generate a statistical analysis from the available data. You can create interactive histograms and customized summary statistics from the distribution platform. All you need to do is identify your column of interest, and the distribution platform will automatically generate graphs and other statistics as per your specification.
  5. Text exploration: Analysing data in text format can be daunting, especially if you don't have the right tools, because the data is unstructured and at times unruly. The text explorer feature has a set of highly interactive commands that make it possible to extract words and phrases from unstructured text, especially from surveys and engineering notes.
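
The first, second, and fourth steps above can be sketched in a few lines of standard-library Python. The tables, column names, and values below are hypothetical; graphical tools do the same work through wizards and query builders rather than code.

```python
import statistics
from collections import Counter

# Hypothetical primary table (customers) and secondary table (orders).
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "north "},  # inconsistent spelling of a category
    {"customer_id": 3, "region": "South"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 120.0},
    {"order_id": 11, "customer_id": 2, "amount": 80.0},
    {"order_id": 12, "customer_id": 2, "amount": None},  # missing value
    {"order_id": 13, "customer_id": 3, "amount": 200.0},
]

# Data acquisition: join the two tables on the shared key.
customers_by_id = {c["customer_id"]: c for c in customers}
joined = [{**order, **customers_by_id[order["customer_id"]]} for order in orders]

# Data clean-up: drop rows with missing amounts and consolidate category labels.
clean = [
    {**row, "region": row["region"].strip().title()}
    for row in joined
    if row["amount"] is not None
]

# Basic data analysis: summary statistics for the column of interest,
# plus a tabulation per category.
amounts = [row["amount"] for row in clean]
summary = {
    "count": len(amounts),
    "mean": statistics.mean(amounts),
    "max": max(amounts),
}
region_counts = Counter(row["region"] for row in clean)

print(summary)
print(region_counts)
```

Consolidating "North" and "north " before counting is what makes the tabulation accurate; without the clean-up step the same region would be counted as two categories.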

 

Sisense

Sisense empowers even non-technical users to access data and build interactive dashboards and business intelligence reports. Sisense provides a variety of dashboard widgets to pinpoint the best visualization for your data, such as geographical maps, gauges to measure KPIs, line charts to determine trends, scatter plots to see correlations, and pie charts for clear comparisons. Sisense lets you customize the dashboard layout with drag-and-drop features, placing each widget exactly where you want it.


Top Free Data Analysis Software

1

Orange Data mining

Orange is an open source data visualization and analysis tool. Orange is developed at the Bioinformatics Laboratory at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia, along with the open source community. Data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics. Orange is also a Python library. Python scripts can run in a terminal window, in integrated environments like PyCharm and PythonWin, or in shells like IPython. Orange consists of a canvas interface onto which the user places…

Bottom Line

Orange is an open source data visualization and analysis tool, where data mining is done through visual programming or Python scripting. The tool has components for machine learning, add-ons for bioinformatics and text mining and it is packed with features for data analytics.

Editor Rating: 9.5
Aggregated User Rating: 26 ratings

2

R Software Environment

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Some of the functionalities include an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display either directly at the computer or on hardcopy; and a well-developed, simple and effective programming language which includes conditionals,…

Bottom Line

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

Editor Rating: 9.1
Aggregated User Rating: 8.8 (11 ratings)

3

Weka Data Mining

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes. Weka provides access to SQL databases…

Bottom Line

Weka is a collection of machine learning algorithms for data mining tasks. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand.

Editor Rating: 9.1
Aggregated User Rating: 4.8 (7 ratings)

4

Tableau Public

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, maps, live dashboards and applications, and to publish them anywhere on the web. Tableau Public is a free service that lets anyone publish interactive data to the web. It includes a free desktop product which can be downloaded and used to publish interactive data visualizations to the web. There is a 1 gigabyte limit on storage space for data. Tableau Public can connect to Microsoft Excel, Microsoft Access, and multiple text file formats. There is a limit of 1,000,000 rows of data…

Bottom Line

Tableau Public is a free data storytelling application used to create and share interactive charts and graphs, maps, live dashboards and applications, and to publish them anywhere on the web.

Editor Rating: 7.8
Aggregated User Rating: 7.9 (6 ratings)

5

Microsoft R

R is a widely used programming language for statistical computing, machine learning, and graphics, and is supported by a thriving global community of users, developers, and contributors. The Microsoft R product family includes Microsoft R Server, Microsoft R Client, Microsoft R Open, and SQL Server R Services. Microsoft R Server is a broadly deployable enterprise-class analytics platform for R. Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics (exploration, analysis, visualization and modeling) based on open source R. Microsoft R Client is a free, community supported,…

Bottom Line

Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics (exploration, analysis, visualization and modeling) based on open source R.

Editor Rating: 8.0
Aggregated User Rating: 6.2 (3 ratings)

6

Shogun

Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. It now offers features that span the whole space of Machine Learning methods, including many classical methods in classification, regression, dimensionality…

Bottom Line

Shogun also offers a full implementation of Hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms.

Editor Rating: 7.6
Aggregated User Rating: 7.9 (2 ratings)

7

DataMelt

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, and the modeling and analysis of financial markets. DMelt is a computational platform that can be used with different programming languages on different operating systems. Unlike other statistical programs, it is not limited to a single programming language: DMelt can be used with several scripting languages, such as Python/Jython, BeanShell, Groovy, and Ruby, as well as with Java. It includes more than 30,000 Java classes for computation…

Bottom Line

DataMelt, or DMelt, is software for numeric computation, statistics, analysis of large data volumes ("big data") and scientific visualization. The program can be used in many areas, such as natural sciences, engineering, and the modeling and analysis of financial markets.

Editor Rating: 7.5
Aggregated User Rating: 7.2 (2 ratings)

8

Arcadia Data

Arcadia Data unifies data discovery, visual analytics and business intelligence in a single, integrated platform that runs natively on Hadoop clusters. Arcadia Data does not require coding, and users can go straight into big data with an intuitive drag-and-drop self-service interface which provides exploration and semantic modeling on the breadth and depth of all business data. Arcadia Data allows working with multiple sources such as Hive, Impala, Postgres, Amazon Redshift, MySQL, Teradata Aster and more. Its unique Active Data store models and tunes data structures continuously at Hadoop scale. Active Data automatically replaces sub-optimal curated schemas with intent…

Bottom Line

Arcadia Data unifies visual analysis, data discovery and business intelligence natively in Hadoop. Its unique on-cluster execution architecture runs analytics directly on your Hadoop nodes. No need to add hardware or pull standalone extracts.

Editor Rating: 7.8
Aggregated User Rating: 8.0 (5 ratings)

9

RapidMiner Starter Edition

RapidMiner Studio provides a wealth of functionality to speed up and optimize data exploration, blending and cleansing tasks, reducing the time spent importing and wrangling your data. RapidMiner provides an integrated environment for data preparation, machine learning, deep learning, text mining, and predictive analytics. It is used for business and commercial applications as well as for research, education, training, rapid prototyping, and application development, and it supports all steps of the machine learning process, including data preparation, results visualization, model validation and optimization. Hundreds of machine learning, text analytics, predictive modeling algorithms, automation, and process control features help you build better…

Bottom Line

RapidMiner Studio (limited to 10,000 data rows), RapidMiner Server (2 GB RAM) and RapidMiner Radoop (limited to a single user) are available in the starter edition with limitations.

Editor Rating: 7.5
Aggregated User Rating: 8.0 (5 ratings)

10

Lavastorm Analytics Engine

Lavastorm is a visual data discovery solution that lets users rapidly integrate diverse data, easily discover elusive insights, and continuously detect anomalies, outliers, or patterns. Lavastorm Analytics Engine provides self-service capability for business users and rapid development capabilities for IT users in the areas of integration, analytics, and business control. Features include acquiring, transforming, combining, and enriching data from virtually any source, including Big Data sources, without intensive modeling, pre-planning, or scripting. The solution discovers data issues, such as completeness, inconsistent formats, and accuracy, and automates the evaluation and cleansing process. Lavastorm Analytics Engine uses the visual analytic environment and its configurable…

Bottom Line

Lavastorm Analytics Engine provides self-service capability for business users and rapid development capabilities for IT users in the areas of integration, analytics, and business control.

Editor Rating: 7.5
Aggregated User Rating: 7.3 (2 ratings)

11

Trifacta

Trifacta’s Visual Data Profiling features provide immediate visibility into unique elements of the data set, like data distributions and outliers, to inform the transformation and analysis process. Trifacta uses data inference techniques to introspect the data and automatically apply initial shaping and metadata recommendations for the user. This greatly accelerates the transformation process. Users can quickly un-nest and iterate on the shape of their data in preparation for the dataset’s downstream use. Trifacta’s data enrichment features make standardizing data, joining datasets and aggregating data outputs to the right level faster and more accurate. Advanced visual data profiling capabilities that guide users through…

Bottom Line

Trifacta’s data enrichment features make standardizing data, joining datasets and aggregating data outputs to the right level faster and more accurate. Advanced visual data profiling capabilities guide users through a deep understanding of the characteristics of any data set.

Editor Rating: 7.7
Aggregated User Rating: 9.0 (2 ratings)

12

ITALASSI

ITALASSI is a freeware program from Provalis Research that facilitates the interpretation of regression models (2 independent variables) with an interaction term. The program allows you to enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) in the form of equations, or to compute those equations from raw data, and it displays the various models using 2D and 3D graphs. The program may also be used in advanced statistics courses to illustrate statistical interactions or applied multiple regression.

Bottom Line

The program allows you to enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) in the form of equations, or to compute those equations from raw data, and it displays the various models using 2D and 3D graphs.
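
To show what an interaction term does, here is a minimal Python sketch of a two-predictor model of the form y = b0 + b1*x1 + b2*x2 + b3*x1*x2. The coefficient values are hypothetical and chosen only for illustration; this is not ITALASSI's code or output.

```python
def predict(b0, b1, b2, b3, x1, x2):
    """Multiple regression with an interaction term:
    y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def simple_slope(b1, b3, x2):
    """Effect of a one-unit change in x1 on y while x2 is held fixed."""
    return b1 + b3 * x2


# Hypothetical coefficients, for illustration only.
coef = {"b0": 1.0, "b1": 2.0, "b2": 0.5, "b3": -0.5}

# With an interaction, the slope of x1 depends on the level of x2.
slopes = {x2: simple_slope(coef["b1"], coef["b3"], x2) for x2 in (0, 2, 4)}
print(slopes)
print(predict(**coef, x1=1, x2=2))
```

In a purely additive model (b3 = 0) the three slopes would be identical; with a nonzero b3 they fan out, which is the pattern a 3D plot of such a model makes visible.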

Editor Rating: 7.1
Aggregated User Rating: 5.0 (2 ratings)

13

ELKI

The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined. ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest neighbor search, range/radius search, and distance…

Bottom Line

ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases).
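
The idea of combining arbitrary distance functions with generic queries can be sketched in Python (ELKI itself is Java, and none of the names below are ELKI's API). This illustration uses a plain linear scan; index structures such as those ELKI provides exist to avoid scanning every point.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.dist(a, b)


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_query(points, query, k, distance):
    """Return the k points closest to `query` under the given distance."""
    return heapq.nsmallest(k, points, key=lambda p: distance(p, query))


def range_query(points, query, radius, distance):
    """Return all points within `radius` of `query` under the given distance."""
    return [p for p in points if distance(p, query) <= radius]


# Hypothetical 2-D data set.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]

nearest = knn_query(points, (0.9, 0.9), k=2, distance=euclidean)
within = range_query(points, (0.0, 0.0), radius=2.0, distance=manhattan)
print(nearest)
print(within)
```

Because the distance function is passed in as a parameter, the same query code works unchanged with any other metric, which is the kind of reuse the modular design aims for.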

Editor Rating: 7.5
Aggregated User Rating: 8.3 (1 rating)

14

Scikit-learn

Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Classification: Identifying which category an object belongs to. Applications: Spam detection, Image recognition. Algorithms: SVM, nearest neighbors, random forest. Regression: Predicting a continuous-valued attribute associated with an object. Applications: Drug response, Stock prices. Algorithms: SVR, ridge regression. Clustering: Automatic grouping of similar objects into sets. Applications: Customer segmentation, Grouping experiment outcomes.…

Bottom Line

Scikit-learn features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
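
As a minimal sketch of the three task types, assuming scikit-learn and NumPy are installed, the following fits a support vector classifier, a ridge regression, and k-means on a tiny made-up data set. The data and variable names are illustrative and do not come from the article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

# Toy data: two well-separated groups of 2-D points.
X = np.array(
    [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
)
y_class = np.array([0, 0, 0, 1, 1, 1])  # category labels for classification
y_value = X[:, 0] + 2.0 * X[:, 1]       # continuous target for regression

# Classification: identify which category a new object belongs to.
classifier = SVC(kernel="linear").fit(X, y_class)
predicted_class = classifier.predict([[0.1, 0.1], [5.1, 5.0]])

# Regression: predict a continuous-valued attribute.
regressor = Ridge(alpha=1.0).fit(X, y_value)
predicted_value = regressor.predict([[1.0, 1.0]])

# Clustering: group similar objects without using any labels.
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_labels = clusterer.labels_

print(predicted_class, predicted_value, cluster_labels)
```

All three estimators share the same fit and predict pattern, which is a large part of what makes the library easy to combine with NumPy arrays and with each other.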

Editor Rating: 7.6
Aggregated User Rating: 8.6 (2 ratings)

15

Gephi

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, it lets the user interact with the representation and manipulate the structures, shapes and colors to reveal hidden patterns. The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing. Gephi is open-source software for network visualization and analysis. It helps data analysts intuitively reveal trends and patterns, highlight outliers and tell stories with their data. It uses a 3D render engine to display large graphs in real-time and to speed…

Bottom Line

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. Like Photoshop but for graph data, it lets the user interact with the representation and manipulate the structures, shapes and colors to reveal hidden patterns.

Editor Rating: 7.6
Aggregated User Rating: 8.9 (4 ratings)

16

KNIME Analytics Platform Community

KNIME Analytics Platform is an open solution for data-driven innovation, helping you discover the potential hidden in your data, mine for fresh insights, or predict new futures. It offers more than 1000 modules, hundreds of ready-to-run examples, a comprehensive range of integrated tools, and a wide choice of advanced algorithms. A large collection of native nodes, community contributions, and tool integrations makes KNIME Analytics Platform a versatile toolbox for data scientists. https://www.youtube.com/watch?v=fw0Vb2gLsgA

Bottom Line

A vast arsenal of native nodes, community contributions, and tool integrations makes KNIME Analytics Platform the perfect toolbox for any data scientist.

Editor Rating: 7.7
Aggregated User Rating: 8.0 (3 ratings)

17

SciPy

The SciPy Stack is a collection of open source software for scientific computing in Python, and particularly a specified set of core packages. SciPy is free, open source, Python-based software for technical and scientific computing, commonly used to solve science, engineering and mathematics problems. The stack's core packages provide computing tools for Python. The first is Python itself, the general-purpose programming language of the stack. Python is interpreted and dynamically typed, which makes it well suited for interactive work and fast…

Bottom Line

SciPy is open source software with a complete collection of features for solving engineering, mathematics and science problems, and it is well suited to technical and scientific computing.

Editor Rating: 7.5
Aggregated User Rating: 8.7 (1 rating)

18

Julia

Julia is a high-level, high-performance programming language for numerical computing. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia programs are organized around multiple dispatch: functions are defined and compiled for different combinations of argument types, which can also be user-defined. Multiple dispatch lets scientists define function behavior across many combinations of arguments. Julia also features a dynamic type system which is able to deal with various types of documentation, dispatch,…

Bottom Line

Julia is a high-level, high-performance programming language that offers distributed parallel execution, an extensive mathematical function library, numerical accuracy, and a sophisticated compiler.
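
The concept of multiple dispatch can be illustrated outside Julia. The sketch below is written in Python, not Julia, and builds a small hypothetical registry (the names `register` and `combine` are invented for this example). It matches exact argument types only, whereas Julia's real dispatch also handles subtyping and compiles specialized code, so treat it as a rough analogy.

```python
# Hypothetical registry mapping a tuple of argument types to an implementation.
_implementations = {}


def register(*types):
    """Decorator that registers a function for one combination of argument types."""
    def decorator(func):
        _implementations[types] = func
        return func
    return decorator


def combine(a, b):
    """Pick an implementation based on the runtime types of BOTH arguments."""
    func = _implementations.get((type(a), type(b)))
    if func is None:
        raise TypeError(
            f"no method for ({type(a).__name__}, {type(b).__name__})"
        )
    return func(a, b)


@register(int, int)
def _combine_ints(a, b):
    return a + b


@register(str, str)
def _combine_strings(a, b):
    return f"{a} {b}"


@register(str, int)
def _repeat_string(a, b):
    return a * b


print(combine(2, 3))
print(combine("data", "analysis"))
print(combine("ab", 2))
```

The point of the pattern is that the behavior of `combine` is chosen by the combination of argument types rather than by the type of the first argument alone, as it would be with ordinary method calls on an object.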

Editor Rating: 7.6
Aggregated User Rating: 8.5 (1 rating)

19

Data Applied

Data Applied supports data-driven decision making by integrating rich analytics, data mining, and information visualization capabilities, all through a zero-footprint Web interface, collaboration features, and a secure XML Web API. By extracting valuable knowledge from data in domains as varied as Web Analytics, Sales, Marketing, Engineering, Social Sciences or Non-Profit, Data Applied helps organizations make better data-driven decisions and improve efficiency. Users can perform collaborative analysis by securely sharing data sets with others, and can decide who has access to what, and with which access level. This helps the company in better management as the right people are given their own tasks to…

Bottom Line

Data Applied automatically discovers broad categories of records using a cluster detection algorithm. The clustering algorithm finds groups of records sharing common traits, and generates profile information for each group.

Editor Rating: 7.6
Aggregated User Rating: 7.8 (1 rating)

20

TANAGRA

Tanagra is free data mining software for academic and research purposes. It provides several data mining methods from exploratory data analysis, statistical learning, machine learning and the database field. It is a successor of SIPINA, which means that various supervised learning algorithms are provided, especially an interactive and visual construction of decision trees. Tanagra is powerful because it covers not only supervised learning but also other paradigms such as clustering, factorial analysis, parametric and nonparametric statistics, association rules, and feature selection and construction algorithms. The main goal of this project is to give researchers and students easy-to-use data mining software, and the second goal is…

Bottom Line

TANAGRA is an "open source project", as every researcher can access the source code and add their own algorithms, as long as they agree and conform to the software distribution license. The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that conforms to the present norms of software development in this domain (especially in the design of its GUI and the way it is used) and allows them to analyse either real or synthetic data.

Editor Rating: 7.5
Aggregated User Rating: 7.7 (6 ratings)

21

Google Fusion Tables

Fusion Tables is a web application for visualizing data that allows users to share data sets and combine them to build data visualizations online. The application is still experimental, and its API is at version 2. It allows users to easily create data visuals and publish them online instantly with provided subsets and an easy format similar to online files. Fusion Tables supports working with larger data sets, including filtering, sorting, and summarizing them in collaboration with other users online. Fusion Tables lets users combine multiple tables from other users and publicly available data, then merge them into one…

Bottom Line

An experimental application to store, share, query, and visualize data tables.
Make custom maps, charts, cards, and tables with your data or public data.

Editor Rating: 7.6
Aggregated User Rating: 8.8 (1 rating)

22

NodeXL

NodeXL is a network graphing application. It comes in two packages, Basic and Pro. Basic is free and is available for Microsoft® Excel® 2007, 2010, 2013 and 2016, making exploration of network graphs easy. NodeXL Pro extends the Basic package with additional features such as access to social media network data streams, text analysis, sentiment analysis and advanced network metrics. Both NodeXL Basic and NodeXL Pro feature graph metric calculations; the only difference is that Pro can calculate degree centrality, PageRank, clustering coefficient…

Bottom Line

NodeXL is a graphing application that lets data gathered from websites and social media platforms be analyzed through graphical presentation, using graph metric calculations, task automation and dynamic filtering.
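
Two of the graph metrics named in the description, degree centrality and the clustering coefficient, are simple enough to sketch in standard-library Python. The four-person network below is hypothetical, and this is an illustration of the metrics themselves, not of how NodeXL computes them.

```python
from itertools import combinations

# Hypothetical undirected network given as an edge list.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]

# Build an adjacency map: node -> set of neighbors.
neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

# Degree centrality: share of the other nodes a node is directly connected to.
n = len(neighbors)
degree_centrality = {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def clustering_coefficient(node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    adj = neighbors[node]
    k = len(adj)
    if k < 2:
        return 0.0
    linked_pairs = sum(1 for a, b in combinations(adj, 2) if b in neighbors[a])
    return linked_pairs / (k * (k - 1) / 2)


clustering = {node: clustering_coefficient(node) for node in neighbors}
print(degree_centrality)
print(clustering)
```

Here "cat" has the highest degree centrality because it touches every other node, while "ann" has the highest clustering coefficient because both of its neighbors also know each other.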

Editor Rating: 7.6
Aggregated User Rating: 8.3 (2 ratings)

23

Dataiku DSS Community

Dataiku DSS is a collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently. Dataiku DSS offers data scientists and beginner analysts a collaborative, team-based user interface, a unified framework for both development and deployment of data projects, and immediate access to all the features and tools required to design data products from scratch. The visual interface of Dataiku…

Bottom Line

The visual interface of Dataiku DSS empowers people with a less technical background to learn the data mining process, and build projects from raw data to predictive application, without having to write a single line of code.

Editor Rating: 7.5
Aggregated User Rating: 6.5 (1 rating)

24

Scilab

Scilab is an interpreted programming language paired with a detailed collection of numerical algorithms that address many aspects of scientific problems. Scilab is free software: users do not pay for it. Scilab binaries are available for 32-bit and 64-bit platforms. Scilab's main features include optimization, statistics, maths and simulation, signal processing, application development, 2-D and 3-D visualization, and control system design and analysis. Scilab through the signal processing feature provides users with the…

Bottom Line

Scilab is free, open source software built around an interpreted programming language. It is available for 32-bit and 64-bit platforms and addresses many aspects of scientific problems using its collection of numerical algorithms.

Editor Rating: 7.5
Aggregated User Rating: 8.6 (2 ratings)

25

DataPreparator

DataPreparator is a free software tool which is designed to assist with common tasks of data preparation (or data preprocessing) in data analysis and data mining. DataPreparator offers features such as character removal, text replacement, date conversion, remove selected attributes, move selected attributes, equal width, equal frequency, equal frequency from grouped data, delete records containing missing values, remove attributes containing missing values, impute missing values, predict missing values from model (dependence tree, Naive Bayes model), include missing value patterns, Z-score method, Box-plot method, create binary attributes, replace nominal values by indices, reduce number of labels, decimal, linear, hyperbolic tangent, soft-max,…

Bottom Line

DataPreparator includes operators for cleaning, discretization, numeration, scaling, attribute selection, missing values, outliers, statistics, visualization, balancing, sampling, row selection, and several other tasks.
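
Three of the operations listed above (equal width binning, equal frequency binning, and the Z-score method for outliers) can be sketched in a few lines of plain Python. This is only an illustration of the general techniques, not DataPreparator's own implementation; the function names and the sample data are assumptions made for the example, and the input is assumed to contain at least two distinct values.

```python
from statistics import mean, pstdev


def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def z_scores(values):
    """Standardize values using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


data = [1, 2, 3, 4, 5, 100]  # hypothetical sample with one extreme value
width_bins = equal_width_bins(data, 3)
freq_bins = equal_frequency_bins(data, 3)
outliers = [v for v, z in zip(data, z_scores(data)) if abs(z) > 2]
```

With this sample, equal width binning puts every value except 100 in the first bin, while equal frequency binning spreads the values two per bin, which is the usual reason to prefer it for skewed data.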

Editor Rating: 7.6
Aggregated User Rating: 9.6 (2 ratings)

26

DataCracker

DataCracker does the work of data analysis and presents users with choices they can understand. Its features include an unlimited number of responses, data files up to 50MB, online reports that can be shared with others, export to Excel, PowerPoint, printable PDF, and images, embedding in a website, customizable colors and designs for charts and tables, new variables created with JavaScript, templates that automatically format charts, master slides that automatically lay out slides and control default text formatting, predictive modeling (decision trees/nonparametric regression) and Segments/Groups (latent class…

Bottom Line

DataCracker is survey data analysis software: users import their data, the tool mines it for insights, and a report is written automatically.

Editor Rating: 7.6
Aggregated User Rating: 8.3 (1 rating)

27

NumPy

NumPy is a comprehensive package for scientific computing in the Python programming language. The library supports large multi-dimensional arrays and matrices, along with high-level mathematical functions that operate on them. Its core feature is a powerful N-dimensional array object. NumPy also provides sophisticated, easy to use broadcasting functions, as well as tools for integrating C or C++ code. The C++…

Bottom Line

NumPy is the fundamental library for scientific computing in Python, supporting large matrices, multi-dimensional arrays, and high-level mathematics.
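
A short sketch may make the N-dimensional array and broadcasting features more concrete. The array contents and variable names below are made up for illustration; only standard NumPy calls are used.

```python
import numpy as np

# A 2-D array with 3 rows and 4 columns, filled with 0..11.
grid = np.arange(12).reshape(3, 4)

# Broadcasting: the 1-D array of per-column offsets is applied to every row
# without writing an explicit loop.
offsets = np.array([10, 20, 30, 40])
shifted = grid + offsets

# Reduction along an axis: one mean per column.
column_means = shifted.mean(axis=0)

print(shifted.shape)
print(column_means)
```

The same `grid + offsets` expression works for any number of rows, which is what makes broadcasting convenient for column-wise adjustments.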

Editor Rating: 7.6
Aggregated User Rating: 8.5 (2 ratings)

28

OpenRefine

OpenRefine is a sophisticated tool for working with big data and performing analytics. It can clean data, transform it from one format into another, and extend it with web services and external data. The explore data feature lets data scientists go through large data sets with ease, and it comes with a video explaining how to use it. The clean and transform data feature provided by OpenRefine enables data scientists to also clean…

Bottom Line

OpenRefine is a sophisticated tool that can clean data, transform it from one format into another, extend it with web services and external data, work on big data, and perform analytics.

Editor Rating: 7.6
Aggregated User Rating: 8.0 (2 ratings)

29

Massive Online Analysis

Massive Online Analysis (MOA) is an open source framework for data stream mining. It includes a collection of machine learning algorithms for regression, classification, clustering, outlier detection, recommender systems, and concept drift detection, together with tools for evaluating them on data streams. MOA suits data scientists who need to mine big data streams in real time and perform large scale machine learning. It can be extended with new mining algorithms, new stream generators, or new evaluation measures. Massive Online Analysis features…

Bottom Line

Massive Online Analysis is an open source framework with a collection of machine learning algorithms for data stream mining: regression, clustering, classification, outlier detection, concept drift detection, and recommender systems.
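
Evaluation on a data stream is commonly done in a test-then-train (prequential) fashion: each arriving example is first used to test the model and only then to update it. The sketch below shows that idea in plain Python. It is not MOA's API (MOA itself is a Java framework); the `MajorityClassLearner` class, the `prequential_accuracy` function, and the tiny stream are hypothetical names and data chosen for the example.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental classifier: always predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test on each example first, then train on it; return overall accuracy."""
    correct = 0
    seen = 0
    for features, label in stream:
        if learner.predict(features) == label:
            correct += 1
        learner.learn(features, label)
        seen += 1
    return correct / seen if seen else 0.0


stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
print(accuracy)
```

Because every example is seen exactly once and in order, the same loop works on a generator that never holds the whole stream in memory.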

Editor Rating: 7.6
Aggregated User Rating: 7.5 (2 ratings)

30

DataWrangler

DataWrangler is a web-based service designed for cleaning and rearranging data so that it is in a form other tools, such as a spreadsheet app, can use. It can export a transformation script as code, which is useful for handling large data sets: users first transform a sample of their data in the Wrangler interface, then run the resulting script on the full data set. Output scripts are supported in two languages: Python (for data-crunching on the back end) and JavaScript (for transforming in the browser, or using node.js). DataWrangler…

Bottom Line

Wrangler is an interactive tool for data cleaning and transformation.
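
The sample-first workflow can be pictured with a small Python sketch: a cleaning function is tried on a few rows, then the identical function is run over the whole data set. The `transform` function and the data are hypothetical and are not the code that Wrangler actually exports.

```python
def transform(lines):
    """Hypothetical cleaning script: skip blank lines, split on commas, trim cells."""
    for line in lines:
        if not line.strip():
            continue
        yield [cell.strip() for cell in line.split(",")]


full_data = ["city, population", "", "Oslo , 700000", "Bergen,290000", ""]
sample = full_data[:3]

# Check the result on a small sample first...
preview = list(transform(sample))

# ...then run the very same script on the full data set.
cleaned = list(transform(full_data))
print(cleaned)
```

Writing the script as a generator means the full data set can be streamed through it line by line instead of being loaded at once.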

Editor Rating: 7.6
Aggregated User Rating: 8.6 (1 rating)

31

EasyReg

EasyReg is open source software that performs a range of econometric estimation and testing tasks on 32-bit and 64-bit Windows platforms, including Windows 7. Windows 8 users can also run EasyReg by setting its compatibility mode to Windows XP. EasyReg is programmed in Visual Basic 5 (Enterprise Edition). It is designed for teaching econometrics and for empirical research. The software is referred to as international because it is able to accept commas and dots as delimiters in decimal…

Bottom Line

EasyReg is free, open source software that lets users on Windows platforms such as Windows 7 and Windows 8 perform a number of econometric estimation and testing tasks.

Editor Rating: 7.5
Aggregated User Rating: 7.5 (1 rating)

32

Matplotlib

Matplotlib is a Python library for making 2D plots of arrays; it produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Its modules include the top level matplotlib module, afm (Adobe Font Metrics interface), animation, artist, the Axes class, the axis and tick API, backends, cbook, cm (colormap), collections, colorbar, colors, container, dates, dviread, figure, finance, font_manager, gridspec, image, legend and legend_handler, lines, markers, mathtext, mlab, offsetbox, patches, path, patheffects, projections, pyplot, rcsetup, sankey, scale, spines, style, text, ticker, tight_layout, working with transformations, triangular grids, type1font, units and widgets. Users can…

Bottom Line

Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
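
As a minimal sketch of script use, the example below draws one 2D line plot and renders it to PNG in memory. The data and labels are invented for the example, and the non-interactive Agg backend is selected on the assumption that no display is available.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, chosen so no display is needed
import matplotlib.pyplot as plt

xs = [0, 1, 2, 3, 4]
ys = [v * v for v in xs]

# One figure with a single set of axes and a labelled line.
fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o", label="y = x squared")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render to an in-memory PNG; a file path could be passed to savefig instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```

Passing a different `format` value to `savefig`, such as `"pdf"` or `"svg"`, produces other hardcopy formats from the same figure.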

Editor Rating: 7.6
Aggregated User Rating: 8.1 (1 rating)

33

IPython

IPython is open source (BSD license) and provides easy to use, high performance tools for parallel computing. IPython offers features such as Jupyter notebook and notebook file format, Jupyter Qt console, kernel messaging protocol, ipyparallel (formerly IPython.parallel), ipykernel (minimal docs, only release notes for the ipykernel package), ipywidgets (formerly IPython.html.widgets), Traitlets (the config system used by IPython and Jupyter), an interactive interpreter, an enhanced interactive Python shell, a decoupled two-process communication model and an architecture for interactive parallel computing. IPython is known to work on Linux, most other Unix-like OSs (AIX, Solaris, BSD), Mac OS X and Windows (CygWin, XP,…

Bottom Line

IPython is a growing project with increasingly language-agnostic components; it provides a rich architecture for interactive computing.

Editor Rating: 7.6
Aggregated User Rating: 7.9 (1 rating)

34

SymPy

SymPy is a Python library for symbolic mathematics that can simplify expressions, compute derivatives, integrals, and limits, solve equations, and work with matrices. SymPy includes plotting modules (coordinate modes, plotting geometric entities, 2D and 3D, interactive interface, colors, and Matplotlib support); printing, such as 2D pretty printed output of math formulas or LaTeX; code generation; physics, statistics, combinatorics, number theory, geometry and logic; conversion from Python objects to SymPy objects; optional implicit multiplication and function application parsing; limited Mathematica and Maxima parsing; custom parsing transformations; and Shift cipher, Affine cipher, Bifid…

Bottom Line

SymPy is free and written entirely in Python; it depends only on mpmath, a pure Python library for arbitrary floating point arithmetic, making it easy to use.
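
The core operations named above (simplifying, differentiating, integrating, taking limits, solving equations, and working with matrices) each map to a single SymPy call. The expressions below are arbitrary examples chosen for illustration.

```python
import sympy as sp

x = sp.symbols("x")

simplified = sp.simplify(sp.sin(x) ** 2 + sp.cos(x) ** 2)  # 1
derivative = sp.diff(x ** 3, x)                            # 3*x**2
integral = sp.integrate(2 * x, x)                          # x**2
limit_value = sp.limit(sp.sin(x) / x, x, 0)                # 1
roots = sp.solve(x ** 2 - 4, x)                            # -2 and 2
determinant = sp.Matrix([[1, 2], [3, 4]]).det()            # -2

print(simplified, derivative, integral, limit_value, roots, determinant)
```

All results are exact symbolic values rather than floating point approximations.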

Editor Rating: 7.6
Aggregated User Rating: 8.2 (1 rating)

35

FreeMat

FreeMat is an environment for rapid engineering and scientific processing that is similar to commercial systems such as MATLAB from Mathworks and IDL from Research Systems, but is open source. FreeMat offers features such as a codeless interface to external C/C++/FORTRAN code, parallel/distributed algorithm development (via MPI), and advanced volume and 3D visualization capabilities. FreeMat now supports function handles, or function pointers, where a function handle is an alias for a function or script that is stored in a variable. FreeMat now also supports so-called dynamic-field indexing expressions, where the fieldname is supplied through an expression instead of…

Bottom Line

FreeMat offers native Windows support, native sparse matrix support, native support for Mac OS X (no X11 server required), function pointers (eval and feval are fully supported), classes, operator overloading, 3D plotting and visualization via OpenGL, handle-based graphics and 3D volume rendering capability (via VTK).

Editor Rating: 7.6
Aggregated User Rating: 8.7 (2 ratings)

36

jMatLab

jMatLab is a free platform for mathematical and numerical computations; it is a clone of Matlab and Octave and runs on any platform where Java is installed, or in a web browser. jMatLab provides features such as Arithmetic, Variables, String Manipulations, Commands and Operators, Functions, Polynomials, Vectors, Differentiation, Equations (Differential), Equations (Linear Systems), Equations (Nonlinear), Equations (Nonlinear Systems), Indefinite Integrals, Input and Output, Matrices, Numerical Integration, Plots, Programming, Statistics (Data Fitting), Statistics (Descriptive), Statistics (Histograms), Statistics (Random Numbers), Taylor polynomial and Transformations. jMathLab has its own help system where all programming modules are arranged in groups and users can list all…

Bottom Line

jMatLab can be used for simplification, differentials, integration, vectors and matrices.

Editor Rating: 7.6
Aggregated User Rating: 6.0 (1 rating)

37

Fluentd

Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. Fluentd offers community-driven support, installation via Ruby gems, self-service configuration, the OS default memory allocator, an implementation in C and Ruby, a memory footprint of about 40MB, and more than 650 available plugins; it requires a Ruby interpreter and a certain number of gems. Fluentd tries to structure data as JSON as much as possible, which allows it to unify all facets of processing log data: collecting, filtering, buffering, and outputting logs across multiple sources and destinations (Unified Logging Layer).…

Bottom Line

Fluentd is an open source data collector for building the unified logging layer and runs in the background to collect, parse, transform, analyze and store various types of data.
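
The idea of structuring log data as JSON can be illustrated with a small Python sketch that turns a raw text line into a record with named fields. This is not Fluentd configuration or its plugin API; the line format, the `LINE_PATTERN` regular expression, and the `to_record` helper are assumptions made up for the example.

```python
import json
import re

# Hypothetical access-log layout: "<host> <METHOD> <path> <status>"
LINE_PATTERN = re.compile(
    r"(?P<host>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})"
)


def to_record(line):
    """Parse one raw log line into a dict, or return None if it does not match."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])  # keep numbers as numbers
    return record


record = to_record("10.0.0.1 GET /index.html 200")
encoded = json.dumps(record, sort_keys=True)
print(encoded)
```

Once every source emits records of this shape, filtering, buffering, and routing steps can operate on field names instead of re-parsing text at each stage.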

Editor Rating: 7.6
Aggregated User Rating: 9.2 (1 rating)

38

PAW

PAW is an instrument conceived to assist physicists in analyzing and presenting data. PAW provides interactive statistical or mathematical analysis and graphical presentation. The interactive graphical presentation lets physicists work on objects familiar to them, such as event files, vectors, and histograms. The PAW presentation feature provides a set of slides, mostly in PostScript format, that gives a general overview of the entire PAW system and an almost complete review of the PAW functionalities. The PAW functionalities presented in the set of slides in PostScript format are…

Bottom Line

PAW is an instrument used by physicists to analyze and present their data, combining interactive graphical presentation of objects such as event files, histograms, and vectors with statistical or mathematical analysis.

Editor Rating: 7.6
Aggregated User Rating: 9.1 (1 rating)

39

ILNumerics

ILNumerics is based on modern software frameworks and provides tools and solutions for scientists and engineers in all industries. It enables data scientists and engineers to develop and deploy complex technical applications in the shortest time possible. ILNumerics features the Array Visualizer, a graphical watch window used in Visual Studio. The array visualizer lets scientists debug large data in a broad range of technical applications. The array visualizer has a visual representation of arbitrary data that lets you prototype your algorithms and find bugs quickly and also have…

Bottom Line

ILNumerics is sophisticated software, based on modern software frameworks, that provides engineers and data scientists with tools and solutions for developing and deploying complex technical applications.

Editor Rating: 7.6
Aggregated User Rating: 8.0 (1 rating)

40

ROOT

ROOT is a sophisticated scientific software application that provides the functions required for statistical analysis, large data processing, storage, and visualization. ROOT is written mainly in C++ but integrates with other languages such as Python and R. The Save data feature lets users save their data, including C++ objects, in a compressed binary form in a ROOT file. ROOT files are self-descriptive because the object format is saved in the same file. The ROOT file contains information that…

Bottom Line

ROOT is a modular scientific software framework that provides the functions data scientists need for large data processing, statistical analysis, visualization, and storage; it is written mainly in C++.

Editor Rating: 7.6
Aggregated User Rating: 8.8 (1 rating)

41

NetworkX

NetworkX is a Python software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. In short, it is software for analyzing complex networks, and it lets results be presented graphically. It provides data structures for graphs, digraphs, and multigraphs. Because NetworkX is a Python package, it supports fast prototyping, is easy to teach, and is multi-platform. Data scientists are also provided with several standard graph algorithms that are useful when dealing with complex networks. NetworkX also features generators. The generators…

Bottom Line

NetworkX is open source Python software for creating, manipulating, and studying the structure, dynamics, and functions of complex networks.
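
A brief sketch shows the three graph data structures, a generator, and one standard algorithm side by side. The node names and edges are invented for the example.

```python
import networkx as nx

# Undirected graph built from an edge list.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("d", "e")])

# A standard graph algorithm: number of hops on a shortest path.
hops = nx.shortest_path_length(G, "b", "e")
degrees = dict(G.degree())

# Directed graph: edges have an orientation.
D = nx.DiGraph([("a", "b"), ("b", "c")])

# Multigraph: parallel edges between the same pair of nodes are kept.
M = nx.MultiGraph()
M.add_edge("a", "b")
M.add_edge("a", "b")

# Generator: a ready-made path graph on 4 nodes.
P = nx.path_graph(4)

print(hops, degrees, M.number_of_edges(), P.number_of_edges())
```

Nodes can be any hashable Python object, so the same calls work with integers, strings, or tuples as node labels.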

Editor Rating: 7.6
Aggregated User Rating: 8.6 (1 rating)

42

Watson Studio

Watson Studio is a data analysis application that accelerates the machine learning and deep learning workflows required for infusing AI into your business to drive innovation. It provides a suite of tools for application developers, data scientists, and subject matter experts to work with data collaboratively and easily, and to use that data to build, train, and deploy models. Watson Studio also provides a choice of tools for the full AI lifecycle such as IBM tools. With Watson Studio, you have the ability to choose between code or no-code tools to enable you to build and train your ML/DL…

Bottom Line

Watson Studio is an application powered by IBM that provides data scientists, developers, and subject matter experts with an excellent platform for performing data analysis by accelerating machine and deep learning workflows.

Editor Rating: 7.6
Aggregated User Rating: 9.3 (1 rating)

43

Arcadia Data Instant

Arcadia Data Instant uses smart acceleration to enable ultra-fast analytics and BI with agile drag-and-drop access. Arcadia Data Instant provides an in-cluster execution engine for scale-out performance on Apache Hadoop and other modern data platforms with no data movement. It also supports visualizations on Apache Kafka, so users can download a kit and quickly start exploring visualizations of Kafka topics. The key features offered by Arcadia Data Instant include connect, discover, model, visualise, interact, manage, scale, optimize, security, share and publish, and advanced analytics. The connect feature allows accessing data inside Hadoop…

Bottom Line

Arcadia Data Instant is an analytics and BI platform that provides an in-cluster execution engine for scale-out performance on Apache Hadoop and other modern data platforms with no data movement.

Editor Rating: 7.6
Aggregated User Rating: 8.3 (2 ratings)

2 Reviews
  • Dmitri
    May 23, 2014 at 10:59 pm

    One correction: at the very top, ScaVi should be called “ScaVis”. I should say I like SCaVis best, since I can program in Python while accessing very rich Java numerical libraries.

  • Robert Nutt
    August 8, 2014 at 1:12 am

    A related tool for data analysis is json-csv.com. It is an online converter which can convert any JSON to CSV for processing within a spreadsheet.

About The Author
imanuel