R is a widely used programming language for statistical computing, machine learning, and graphics, and is supported by a thriving global community of users, developers, and contributors. The Microsoft R product family includes Microsoft R Server, Microsoft R Client, Microsoft R Open, and SQL Server R Services. Microsoft R Server is an enterprise-class analytics platform for R designed for broad deployment. With a variety of big data statistics, predictive modeling, and machine learning capabilities, R Server supports the full range of analytics exploration, analysis, visualization, and modeling based on open source R. Microsoft R [...]
Arcadia Data unifies data discovery, visual analytics, and business intelligence in a single, integrated platform that runs natively on Hadoop clusters. Arcadia Data does not require coding: users can go straight into big data through an intuitive drag-and-drop, self-service interface that provides exploration and semantic modeling across the breadth and depth of all business data. Arcadia Data works with multiple sources such as Hive, Impala, Postgres, Amazon Redshift, MySQL, Teradata Aster, and more. Its unique Active Data store models and tunes data structures continuously at Hadoop scale. Active Data automatically replaces sub-optimal curated schemas [...]
Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
Classification: Identifying which category an object belongs to. Applications: spam detection, image recognition. Algorithms: SVM, nearest neighbors, random forest.
Regression: Predicting a continuous-valued attribute associated with an object. Applications: Drug [...]
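As a rough illustration of the classification and clustering tasks listed above, the following minimal sketch fits two classifiers and a k-means model with scikit-learn. The synthetic dataset, the parameter values, and the variable names are assumptions chosen for the example, not recommendations from the library.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration): 300 points in 3 groups.
X, y = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: each estimator exposes the same fit/score interface.
classifiers = {
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
scores = {}
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data

# Clustering: k-means ignores the labels and groups points by proximity.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_labels = kmeans.labels_

print(scores)
print(kmeans.cluster_centers_)
```

Because the estimators share one interface, swapping in another algorithm (for example a nearest-neighbors classifier) would only change the entries of the `classifiers` dictionary.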
Shogun is a free, open source toolbox written in C++. It offers numerous algorithms and data structures for machine learning problems. The focus of Shogun is on kernel machines such as support vector machines for regression and classification problems. Shogun also offers a full implementation of hidden Markov models. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general-purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms.
It now offers features that span the whole space of Machine Learning methods, [...]
The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes and evaluate these combinations. When developing new algorithms or index structures, the existing components can be reused and combined.
ELKI is modeled around a database core, which uses a vertical data layout that stores data in column groups (similar to column families in NoSQL databases). This database core provides nearest [...]
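The idea of combining an algorithm with an arbitrary distance function can be sketched in a few lines. The example below is a Python analogy, not ELKI's Java API: the function names `knn_query`, `euclidean`, and `manhattan` and the sample points are hypothetical, and the linear scan stands in for the index structures a real database core would use.

```python
import heapq
import math
from typing import Callable, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Point, b: Point) -> float:
    """City-block (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_query(data: List[Point], query: Point, k: int, distance: Distance) -> List[int]:
    """Return the indices of the k points closest to `query`.

    The distance function is a parameter, so the same query code
    works with any metric. This is a linear scan without an index.
    """
    return heapq.nsmallest(k, range(len(data)), key=lambda i: distance(data[i], query))


points = [(3.0, 0.0), (2.0, 2.0), (5.0, 5.0)]
origin = (0.0, 0.0)

# The nearest neighbor depends on the metric that is plugged in.
print(knn_query(points, origin, 1, euclidean))  # index of (2.0, 2.0)
print(knn_query(points, origin, 1, manhattan))  # index of (3.0, 0.0)
```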
Dataiku DSS is a collaborative data science platform that enables teams to explore, prototype, build, and deliver their own data products more efficiently. Dataiku DSS provides an interactive visual interface where users can point, click, and build, or use languages like SQL to wrangle data, build models, easily re-run workflows, visualize results, and get up-to-date insights on demand.
Dataiku DSS provides tools to draft data preparation and modeling in seconds. It is aimed at teams that wish to leverage their favorite ML libraries (scikit-learn, R, MLlib, H2O, and so on) and that rely on automating their work in a completely customizable [...]
ITALASSI is a freeware program that facilitates the interpretation of regression models (two independent variables) with an interaction term. The program allows you to enter several regression models (two bivariate, one multiple additive, and one multivariate with interaction) in the form of equations, or to compute those equations from raw data, and displays the various models using 2D and 3D graphs. The program may also be used in advanced statistics courses to illustrate statistical interactions or applied multiple regression.
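To make the notion of an interaction term concrete, here is a minimal sketch of a regression with two independent variables and their product, fitted by ordinary least squares with NumPy. This is not ITALASSI's own method; the coefficients, the noise-free synthetic data, and the helper `simple_slope` are assumptions for illustration. The model is y = b0 + b1*x1 + b2*x2 + b3*x1*x2, so the effect of x1 on y depends on the value of x2.

```python
import numpy as np

# Synthetic, noise-free data generated from known coefficients (assumed values).
rng = np.random.default_rng(0)
x1 = rng.uniform(-2.0, 2.0, 200)
x2 = rng.uniform(-2.0, 2.0, 200)
true_coefs = np.array([1.0, 2.0, -0.5, 1.5])  # b0, b1, b2, b3
y = true_coefs[0] + true_coefs[1] * x1 + true_coefs[2] * x2 + true_coefs[3] * x1 * x2

# Design matrix: intercept, both predictors, and their product (the interaction).
design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)


def simple_slope(coefs: np.ndarray, x2_value: float) -> float:
    """Slope of y with respect to x1 at a fixed value of x2: b1 + b3 * x2."""
    return float(coefs[1] + coefs[3] * x2_value)


print(np.round(coefs, 3))
print(simple_slope(coefs, 0.0), simple_slope(coefs, 1.0))
```

With an interaction present, the slope for x1 is not a single number: it changes by b3 for every unit change in x2, which is what plots of the fitted surface make visible.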
RapidMiner provides an integrated environment for machine learning, data mining, text mining, predictive analytics, and business analytics, and is used for business and industrial applications as well as for research, education, training, rapid prototyping, and application development. RapidMiner supports all steps of the data mining process, including results visualization, validation, and optimization. RapidMiner uses a client/server model with the server offered as Software as a Service or on cloud infrastructures. RapidMiner provides data mining and machine learning procedures including: data loading and transformation, data preprocessing and [...]