RapidMiner : RapidMiner provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics and is used for business and industrial applications as well as for research, education, training, rapid prototyping, and application development. RapidMiner supports all steps of the data mining process including results visualization, validation and optimization. RapidMiner uses a client/server model with the server offered as Software as a Service or on cloud infrastructures. RapidMiner provides data mining and machine learning procedures including: data loading and transformation, data preprocessing and [...]
R is a widely used programming language for statistical computing, machine learning, and graphics, and is supported by a large global community of users, developers, and contributors. The Microsoft R product family includes: Microsoft R Server, Microsoft R Client, Microsoft R Open, and SQL Server R Services. Microsoft R Server is positioned as a broadly deployable enterprise-class analytics platform for R. With a variety of big data statistics, predictive modeling and machine learning capabilities, R Server supports the full range of analytics exploration, analysis, visualization and modeling based on open source R. Microsoft R [...]
Arcadia Data unifies data discovery, visual analytics and business intelligence in a single, integrated platform that runs natively on Hadoop clusters. Arcadia Data does not require coding: users can go straight into big data through an intuitive drag-and-drop self-service interface that provides exploration and semantic modeling across the breadth and depth of all business data. Arcadia Data allows working with multiple sources such as Hive, Impala, Postgres, Amazon Redshift, MySQL, Teradata Aster and more. Its unique Active Data store models and tunes data structures continuously at Hadoop scale. Active Data automatically replaces sub-optimal curated schemas [...]
Weka Data Mining : Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka features include machine learning, data mining, preprocessing, classification, regression, clustering, association rules, attribute selection, experiments, workflow and visualization. Weka is written in Java and developed at the University of Waikato, New Zealand. All of Weka’s techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes. Weka provides [...]
Orange Data mining : Orange is an open source data visualization and analysis tool. Orange is developed at the Bioinformatics Laboratory at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia, along with the open source community. Data mining is done through visual programming or Python scripting. The tool has components for machine learning and add-ons for bioinformatics and text mining, and it is packed with features for data analytics. Orange is also a Python library. Python scripts can run in a terminal window, in integrated environments like PyCharm and PythonWin, or in shells like IPython.
Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
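As a rough illustration of the algorithms named above, the following minimal sketch trains a support vector machine and a random forest on the iris dataset bundled with scikit-learn, then clusters the same data with k-means. The dataset choice, the 70/30 split, and the variable names `models`, `scores`, and `cluster_labels` are assumptions made for this example, not part of the original description.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small bundled dataset as NumPy arrays (features X, labels y).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for evaluation (illustrative split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Two of the classifiers mentioned in the text, sharing the same fit/score API.
models = {
    "svm": SVC(kernel="rbf", gamma="scale"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the held-out samples.
    scores[name] = model.score(X_test, y_test)

# Unsupervised clustering of the same feature matrix with k-means.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print(scores)
print(cluster_labels[:10])
```

Because every estimator exposes the same `fit` interface, swapping in another algorithm usually means changing only the entry in `models`.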
Classification : Identifying which category an object belongs to. Applications: spam detection, image recognition. Algorithms: SVM, nearest neighbors, random forest. Regression : Predicting a continuous-valued attribute associated with an object. Applications: Drug [...]
Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms.
It now offers features that span the whole space of Machine Learning methods, [...]
The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined.
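The idea of combining an algorithm with an arbitrary distance function can be sketched in a few lines of plain Python. This is not ELKI's Java API; it is a hypothetical illustration in which the function `knn_outlier_scores` and the sample `data` are invented for the example. It scores each point by the distance to its k-th nearest neighbor and accepts any distance function as a parameter.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_outlier_scores(points: List[Point], k: int, distance: Distance) -> List[float]:
    """Score each point by the distance to its k-th nearest neighbor.

    The algorithm never inspects the points directly; it only calls
    `distance`, so any distance function can be plugged in.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Brute-force neighbor search; an index structure could replace this.
        dists = sorted(distance(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical sample: four points on a unit square and one far away.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]

for dist in (euclidean, manhattan):
    scores = knn_outlier_scores(data, k=2, distance=dist)
    most_outlying = max(range(len(data)), key=scores.__getitem__)
    print(dist.__name__, data[most_outlying])
```

The same scoring routine runs unchanged with either distance, which is the kind of reuse the modular design aims for; the brute-force neighbor search is the part an index would accelerate.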
ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest [...]