Weka Data Mining
Weka is a collection of machine learning algorithms for data mining tasks. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization.
Open Source Software
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Its features cover preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and was developed at the University of Waikato, New Zealand. All of Weka’s techniques assume that the data is available as a single flat file or relation, in which each data point is described by a fixed number of attributes. Weka can access SQL databases through Java Database Connectivity (JDBC) and process the result returned by a database query, but it is not capable of multi-relational data mining.
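To make the single-flat-relation assumption concrete, here is a minimal sketch in plain Java. It does not use Weka's API: the class FlatRelation and its methods are hypothetical names chosen for illustration. It simply holds a fixed list of attributes and rejects any data point that does not supply exactly one value per attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a single flat relation (not a Weka class):
// a fixed list of attribute names plus rows with one value per attribute.
final class FlatRelation {
    private final List<String> attributes;
    private final List<String[]> rows = new ArrayList<>();

    FlatRelation(List<String> attributes) {
        this.attributes = new ArrayList<>(attributes);
    }

    // Adds one data point given as a comma-separated line. A row whose value
    // count differs from the number of attributes is rejected.
    void addRow(String commaSeparated) {
        String[] values = commaSeparated.split(",", -1);
        if (values.length != attributes.size()) {
            throw new IllegalArgumentException(
                "Expected " + attributes.size() + " values but got " + values.length);
        }
        for (int i = 0; i < values.length; i++) {
            values[i] = values[i].trim();
        }
        rows.add(values);
    }

    int numAttributes() {
        return attributes.size();
    }

    int numRows() {
        return rows.size();
    }

    // Looks up the value of a named attribute in the given row.
    String value(int row, String attribute) {
        int column = attributes.indexOf(attribute);
        if (column < 0) {
            throw new IllegalArgumentException("Unknown attribute: " + attribute);
        }
        return rows.get(row)[column];
    }

    public static void main(String[] args) {
        FlatRelation weather =
            new FlatRelation(Arrays.asList("outlook", "temperature", "play"));
        weather.addRow("sunny,85,no");
        weather.addRow("rainy,70,yes");
        System.out.println(weather.numRows() + " rows, "
            + weather.numAttributes() + " attributes");
    }
}
```

Data spread over several linked tables does not fit this shape directly, which is why a multi-relational dataset would have to be flattened, for example by a database query, before techniques of this kind can be applied.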
Weka’s main user interface is the Explorer, but the same functionality can also be accessed through the component-based Knowledge Flow interface and from the command line. There is also the Experimenter, which allows the systematic comparison of the predictive performance of Weka’s machine learning algorithms on a collection of datasets. The Explorer interface features several panels that provide access to the main components of the workbench: the Preprocess panel has facilities for importing data; the Classify panel lets the user apply classification and regression algorithms; the Associate panel provides access to association rule learners; the Cluster panel gives access to the clustering techniques; the Select attributes panel provides algorithms for identifying the most predictive attributes in a dataset; and the Visualize panel shows a scatter plot matrix.
Weka provides a comprehensive set of data pre-processing tools, learning algorithms and evaluation methods, together with graphical user interfaces and an environment for comparing learning algorithms. Data can be imported from a file in various formats such as ARFF, CSV, C4.5 and binary. Data can also be read from a URL or from an SQL database (using JDBC). Pre-processing tools in Weka are called “filters”, and filters are available for discretization, normalization, resampling, attribute selection, and transforming and combining attributes.
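As a rough idea of what two of these filters do to a numeric attribute, the following plain-Java sketch shows min-max normalization and equal-width discretization. It is an illustration under stated assumptions, not Weka's filter code: the class SimpleFilters and its method names are hypothetical, and Weka's own filters offer more options than shown here.

```java
import java.util.Arrays;

// Hypothetical stand-ins for two pre-processing filters (not Weka classes).
final class SimpleFilters {

    // Min-max normalization: rescales the values linearly into [0, 1].
    // If all values are equal, every output is 0.
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    // Equal-width discretization: splits [min, max] into `bins` intervals of
    // the same width and returns the bin index (0 .. bins-1) of each value.
    static int[] discretize(double[] values, int bins) {
        if (bins < 1) {
            throw new IllegalArgumentException("bins must be at least 1");
        }
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double width = (max - min) / bins;
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int bin = width == 0.0 ? 0 : (int) ((values[i] - min) / width);
            // The maximum value would land one past the end, so clamp it
            // into the last bin.
            out[i] = Math.min(bin, bins - 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] temperature = {64.0, 70.0, 85.0};
        System.out.println(Arrays.toString(normalize(temperature)));
        System.out.println(Arrays.toString(discretize(temperature, 3)));
    }
}
```

Both methods return a new array and leave the input untouched, which mirrors the general idea of a filter as a transformation from one dataset to another.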
The implemented learning schemes include decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression and Bayes’ nets. The meta-classifiers included are bagging, boosting, stacking, error-correcting output codes and locally weighted learning. The implemented clustering schemes are k-Means, EM, Cobweb, X-means and FarthestFirst. Clusters can be visualized and compared to “true” clusters. For association rules, Apriori can compute all rules that have a given minimum support and exceed a given confidence. In Weka’s Knowledge Flow interface, data sources, classifiers, etc. are beans and can be connected graphically.
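The two thresholds used for association rules can be illustrated with a short plain-Java sketch. It only computes the support of an itemset and the confidence of a single rule over a list of transactions; it does not implement Apriori's candidate generation, and it is not Weka's code. The class RuleMeasures and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper showing the two measures that association rule
// mining thresholds on (not a Weka class).
final class RuleMeasures {

    // Support: the fraction of transactions that contain every item
    // of the given itemset.
    static double support(List<Set<String>> transactions, Set<String> itemset) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        long hits = transactions.stream()
            .filter(t -> t.containsAll(itemset))
            .count();
        return (double) hits / transactions.size();
    }

    // Confidence of the rule antecedent => consequent:
    // support(antecedent and consequent together) / support(antecedent).
    static double confidence(List<Set<String>> transactions,
                             Set<String> antecedent,
                             Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double antecedentSupport = support(transactions, antecedent);
        return antecedentSupport == 0.0
            ? 0.0
            : support(transactions, both) / antecedentSupport;
    }

    // Convenience factory for small item sets.
    static Set<String> items(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        List<Set<String>> baskets = Arrays.asList(
            items("bread", "milk"),
            items("bread", "butter"),
            items("bread", "milk", "butter"),
            items("milk"));
        System.out.println(support(baskets, items("bread", "milk")));
        System.out.println(confidence(baskets, items("bread"), items("milk")));
    }
}
```

A rule would be reported only if the support of its combined itemset reaches the minimum support and its confidence exceeds the chosen confidence threshold.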
You may also like to read: Predictive Analytics Freeware Software, Top Predictive Analytics Proprietary Software, Predictive Analytics Software API, Top Free Data Mining Software, Top Data Mining Software, and Data Ingestion Tools.
More Information on the Predictive Analytics Process
For more information on the predictive analytics process, please review the overview of each component in the process: data collection (data mining), data analysis, statistical analysis, predictive modeling and predictive model deployment.