Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user.
Data Mining Software Free
Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)
• Development source code issue management
• Community forums
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at UIMA enables applications to be decomposed into components, for example "language identification" => "language specific segmentation" => "sentence boundary detection" => "entity detection (person/place names etc.)". Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them.
Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes. Apache UIMA is an Apache-licensed open source implementation of the UIMA specification [pdf] [doc] (that specification is, in turn, being developed concurrently by a technical committee within OASIS , a standards organization).
The Frameworks run the components, and are available for both Java and C++. The Java Framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and TCL annotators. The UIMA-AS and UIMA-DUCC are both Scaleout Frameworks and are addons to the base Java framework. The UIMA-AS supports very flexible scaleout capability based on JMS (Java Messaging Services) and ActiveMQ. The UIMA-DUCC extends UIMA-AS by providing cluster management services to automate the scale-out of UIMA pipelines over computing clusters. Visit the UIMA-DUCC live demo description and the UIMA-DUCC live demo itself.