Apache UIMA

Overview

Synopsis

Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user.

Category

Data Mining Software Free

Features

•Infrastructure
•Components
•Frameworks
•Annotators
•Tooling

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Website

Apache UIMA

Company

Apache UIMA

What is best?

•Infrastructure
•Components
•Frameworks
•Annotators
•Tooling

What are the benefits?

• Development, source code, and issue management
• Tooling
• Servers
• Community forums
• Sandbox

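PAT Rating™

Ease of use: Editor Rating 7.6, Aggregated User Rating 5.3
Features & Functionality: Editor Rating 7.6, Aggregated User Rating 6.9
Advanced Features: Editor Rating 7.6, Aggregated User Rating 8.1
Integration: Editor Rating 7.6, Aggregated User Rating 5.3
Performance: Editor Rating 7.6, Aggregated User Rating 4.9
Customer Support: Editor Rating 7.6, Aggregated User Rating 6.1
Implementation: no rating
Renew & Recommend: 9.4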

Bottom Line

UIMA can wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.

Editor Rating: 7.6

Aggregated User Rating: 6.6 (4 ratings)

Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user.

An example UIM application might ingest plain text and identify entities, such as persons, places, and organizations, or relations, such as works-for or located-at. UIMA enables applications to be decomposed into components, for example "language identification" => "language specific segmentation" => "sentence boundary detection" => "entity detection (person/place names etc.)".

Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them.
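The decomposition described above can be pictured with a small sketch. This is plain Python and not the UIMA API: the `Document` class, the three toy components, and `run_pipeline` are hypothetical names chosen for illustration, the detection rules are deliberately naive, and a dict stands in for the metadata that UIMA keeps in XML descriptor files.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str   # e.g. "Sentence" or "Person"
    begin: int  # start offset into the document text
    end: int    # end offset (exclusive)


@dataclass
class Document:
    """Hypothetical stand-in for the shared data passed between components."""
    text: str
    language: str = "unknown"
    annotations: list = field(default_factory=list)

    def covered_text(self, ann):
        return self.text[ann.begin:ann.end]


class LanguageIdentifier:
    # Self-describing metadata (an assumption: a dict instead of an XML descriptor).
    metadata = {"name": "language identification", "outputs": ["language"]}

    def process(self, doc):
        # Toy rule: call the text English if a common English word occurs.
        words = doc.text.lower().split()
        doc.language = "en" if "the" in words or "for" in words else "unknown"


class SentenceDetector:
    metadata = {"name": "sentence boundary detection", "outputs": ["Sentence"]}

    def process(self, doc):
        # Toy rule: a sentence is a run of characters ending in . ! or ?
        for m in re.finditer(r"[^.!?]+[.!?]", doc.text):
            leading = len(m.group()) - len(m.group().lstrip())
            doc.annotations.append(Annotation("Sentence", m.start() + leading, m.end()))


class PersonDetector:
    metadata = {"name": "entity detection", "outputs": ["Person"]}

    def process(self, doc):
        # Toy rule: two adjacent capitalised words are treated as a person name.
        for m in re.finditer(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", doc.text):
            doc.annotations.append(Annotation("Person", m.start(), m.end()))


def run_pipeline(components, doc):
    """The framework role: call each component in order on the shared document."""
    for component in components:
        component.process(doc)
    return doc


doc = run_pipeline(
    [LanguageIdentifier(), SentenceDetector(), PersonDetector()],
    Document("Ada Lovelace works for the society. She lives in London."),
)
people = [doc.covered_text(a) for a in doc.annotations if a.kind == "Person"]
sentences = [doc.covered_text(a) for a in doc.annotations if a.kind == "Sentence"]
```

Each component only reads from and writes to the shared document, which is what lets the runner reorder, swap, or extend the stages without the components knowing about one another.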

Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.
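The idea of scaling by replicating pipelines can be illustrated with a rough sketch. It is an analogy only: worker threads on one machine stand in for networked nodes, `make_pipeline` and `process_all` are hypothetical names, and nothing here reflects how UIMA itself distributes work.

```python
from concurrent.futures import ThreadPoolExecutor


def make_pipeline():
    """Build one independent pipeline instance from two toy stages."""
    def tokenize(text):
        return text.split()

    def count(tokens):
        return len(tokens)

    def pipeline(text):
        return count(tokenize(text))

    return pipeline


def process_all(documents, replicas=3):
    # One pipeline instance per replica; documents are assigned round-robin.
    pipelines = [make_pipeline() for _ in range(replicas)]
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        futures = [
            pool.submit(pipelines[i % replicas], text)
            for i, text in enumerate(documents)
        ]
        # Collect results in submission order.
        return [f.result() for f in futures]


counts = process_all(["a b c", "d e", "f", "g h i j"])
```

Because every replica runs the same stages and the documents are independent of one another, adding replicas raises throughput without changing any component.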

Apache UIMA is an Apache-licensed open source implementation of the UIMA specification (that specification is, in turn, being developed concurrently by a technical committee within OASIS, a standards organization).

The Frameworks run the components, and are available for both Java and C++. The Java Framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and TCL annotators.

UIMA-AS and UIMA-DUCC are both scale-out frameworks, and both are add-ons to the base Java framework. UIMA-AS supports very flexible scale-out capability based on JMS (Java Message Service) and ActiveMQ. UIMA-DUCC extends UIMA-AS by providing cluster management services that automate the scale-out of UIMA pipelines over computing clusters.

1 Review

Brook Sacarello
September 28, 2017 at 11:26 am

Wrap components as network services
Company size: Enterprise (>1001)
User Role: Super User
User Industry: Defense

Rating
Ease of use: 8.3
Features & Functionality: 8.1
Advanced Features: 8.1
Customer Support: 8.3

ADDITIONAL INFORMATION
Apache UIMA lets users wrap components as network services and scale to large volumes by replicating processing pipelines over a cluster of networked nodes. It is a good platform for analyzing large volumes of unstructured information in order to discover what is relevant to an end user.

Apache UIMA provides the Snowball annotator, which wraps the Snowball stemming algorithm. The annotator iterates over the token annotations available in the CAS and records the stem for each token. The stemming algorithm is a good choice because it is available for several languages. The whitespace tokenizer annotator is a UIMA annotator implementation that tokenizes text documents using simple whitespace segmentation.

Addons and Sandbox is a workspace provided by Apache UIMA for hosting analysis components and tooling around UIMA.
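The two annotators mentioned in this review can be approximated in a few lines. This is a sketch in plain Python, not the UIMA annotators themselves: `Token`, `whitespace_tokenize`, and `add_stems` are hypothetical names, and `toy_stem` is a placeholder suffix stripper that is explicitly not the Snowball algorithm.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    begin: int      # offset of the first character
    end: int        # offset one past the last character
    text: str
    stem: str = ""  # filled in by the stemming pass


def whitespace_tokenize(text):
    """Split on whitespace, keeping character offsets for each token."""
    return [Token(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


def toy_stem(word):
    # Placeholder only: strips a few English suffixes. NOT the Snowball algorithm.
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def add_stems(tokens, stemmer=toy_stem):
    """Iterate over existing tokens and record a stem on each one."""
    for token in tokens:
        token.stem = stemmer(token.text)
    return tokens


tokens = add_stems(whitespace_tokenize("Parsing tokens  quickly"))
```

Keeping offsets on each token means a later pass, such as the stemming step, can attach information to the token without copying or altering the original text.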