UIMA (Unstructured Information Management Architecture)

  
 

UIMA: Middleware for text analysis

    Document search and text mining systems have to analyze unstructured content. This analysis usually draws on a variety of natural language technologies, including tokenization, parsing, and named entity extraction. Using each processing module requires detailed knowledge of its underlying technology, and many components that perform the same function have been developed separately. To make components easy to reuse and integrate, IBM Research has developed UIMA (Unstructured Information Management Architecture), an infrastructure for building UIM (Unstructured Information Management) applications, and has released the UIMA SDK on alphaWorks.

    UIMA defines the data structure that stores both the original information and the information extracted from it as the CAS (Common Analysis Structure). It also defines the interface of a processing module as the TAE (Text Analysis Engine). If a developer implements a module against this data structure and interface, making it UIMA compliant, another developer can reuse it on UIMA and integrate it into their own application.
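
    The division of labor between the CAS and the TAE can be illustrated with a minimal sketch. It is written in Python for brevity and is not the UIMA SDK API: the names SimpleCas, Annotation, TextAnalysisEngine, WhitespaceTokenizer, CapitalizedWordTagger, and run_pipeline are all hypothetical, and the "named entity" step is a toy rule that only marks capitalized tokens. The point is the shape of the design: one shared structure holds the text and its annotations, and every module exposes the same process interface.

```python
import re
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass(frozen=True)
class Annotation:
    """A typed span over the original text (hypothetical, not the UIMA type system)."""
    type: str
    begin: int
    end: int


@dataclass
class SimpleCas:
    """Stand-in for a CAS: the original text plus everything extracted from it."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def covered_text(self, ann: Annotation) -> str:
        return self.text[ann.begin:ann.end]

    def select(self, type_name: str) -> List[Annotation]:
        # Returns a new list, so callers may add annotations while iterating.
        return [a for a in self.annotations if a.type == type_name]


class TextAnalysisEngine(Protocol):
    """Stand-in for the TAE interface: read the CAS, add results to it."""
    def process(self, cas: SimpleCas) -> None: ...


class WhitespaceTokenizer:
    def process(self, cas: SimpleCas) -> None:
        for m in re.finditer(r"\S+", cas.text):
            cas.annotations.append(Annotation("Token", m.start(), m.end()))


class CapitalizedWordTagger:
    """Toy substitute for named entity extraction; depends only on Token annotations."""
    def process(self, cas: SimpleCas) -> None:
        for tok in cas.select("Token"):
            if cas.covered_text(tok)[0].isupper():
                cas.annotations.append(Annotation("Name", tok.begin, tok.end))


def run_pipeline(text: str, engines: List[TextAnalysisEngine]) -> SimpleCas:
    cas = SimpleCas(text)
    for engine in engines:
        engine.process(cas)  # each engine sees what earlier engines added
    return cas


cas = run_pipeline(
    "IBM released UIMA from alphaWorks",
    [WhitespaceTokenizer(), CapitalizedWordTagger()],
)
names = [cas.covered_text(a) for a in cas.select("Name")]
print(names)  # ['IBM', 'UIMA']
```

    Because the tagger reads only Token annotations from the shared structure, the tokenizer could be swapped for a different implementation without changing the tagger, which is the kind of reuse the common data structure and interface are meant to enable.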

    At TRL, we are building a text mining system on UIMA and developing a base system that can process documents efficiently.

  
 