Review

Gensim

Overview

Synopsis

Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”).

Category

Deep Learning Software

Features

• Scalability
• Efficient implementations
• Platform independent
• Converters & I/O formats
• Robust
• Similarity queries

License

GNU LGPLv2.1 (open source)

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Website

Gensim

Company

Gensim

PAT Rating™

Criterion                   Editor Rating   Aggregated User Rating
Ease of use                 7.6             6.8
Features & Functionality    7.6             6.6
Advanced Features           7.6             6.3
Integration                 7.6             6.4
Performance                 7.6             n/a
Customer Support            7.6             n/a
Implementation              n/a             n/a
Renew & Recommend           n/a             4.5
Overall                     7.6             6.1 (3 ratings)

Gensim is a free Python library for scalable statistical semantics. It analyzes plain-text documents for semantic structure and retrieves semantically similar documents. Gensim aims to be a robust, efficient and hassle-free piece of software for unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment implementations that do not scale on the one hand, and robust Java-esque projects that take forever just to run “hello world” on the other.
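To make "retrieving semantically similar documents" concrete, the sketch below ranks a few toy documents against a query using plain bag-of-words vectors and cosine similarity. It uses only the Python standard library and is not Gensim's implementation or API; the function names and the example documents are hypothetical, chosen purely for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count word occurrences."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    """Return (document index, score) pairs, best match first."""
    query_vec = bag_of_words(query)
    scores = [
        (index, cosine(query_vec, bag_of_words(doc)))
        for index, doc in enumerate(documents)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus, one short document per entry.
documents = [
    "graph theory and trees",
    "human computer interaction survey",
    "user interface response time",
]
ranking = most_similar("computer interaction", documents)
print(ranking)
```

A library such as Gensim typically replaces the raw word counts with weighted or topic-based vectors, but the query-then-rank shape of the problem stays the same.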

Gensim is licensed under the OSI-approved GNU LGPLv2.1 license. This means it is free for both personal and commercial use, but anyone who distributes a modified version of Gensim must disclose the source code of those modifications. Apart from that, users are free to redistribute Gensim in any way they like, though they may not change its license.

Gensim can process large, web-scale corpora using incremental online training algorithms, so the whole input corpus never needs to reside fully in RAM at any one time. The core algorithms use highly optimized math routines, and Gensim also contains distributed versions of several algorithms, intended to speed up processing and retrieval on machine clusters. Being pure Python, Gensim runs on Linux, Windows and OS X, as well as any other platform that supports Python and NumPy. Gensim further contains memory-efficient implementations of several popular data formats, such as Matrix Market, SVMlight and Blei’s LDA-C. These can be used for input, output, or to convert between one another.
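The idea that a corpus need not fit in RAM can be illustrated with a streamed iterable that yields one document at a time. The sketch below is a minimal standard-library illustration, not Gensim code: the `StreamedCorpus` class, the `document_frequencies` helper, and the three-line corpus file are all hypothetical, and the one-document-per-line layout is an assumption.

```python
import os
import tempfile
from collections import Counter


class StreamedCorpus:
    """Iterate over a text file one document (one line) at a time."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                # Only the current line is held in memory.
                yield line.lower().split()


def document_frequencies(corpus):
    """Make a single pass over the stream, updating counts incrementally."""
    frequencies = Counter()
    for tokens in corpus:
        # set() so each word counts at most once per document.
        frequencies.update(set(tokens))
    return frequencies


# Write a tiny hypothetical corpus to a temporary file, one document per line.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("human computer interaction\n")
    tmp.write("computer response time\n")
    tmp.write("graph of trees\n")
    corpus_path = tmp.name

corpus = StreamedCorpus(corpus_path)
df = document_frequencies(corpus)
os.remove(corpus_path)
print(df["computer"], df["graph"])
```

Because the statistics are updated document by document, memory use depends on the vocabulary size rather than on the number of documents, which is the property that lets online algorithms scale to large corpora.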
