TM4IP
Text Mining

The TM4IP project is a collaboration between Radboud University
Nijmegen (the Netherlands) and
Matrixware (Austria).
The goals of the TM4IP project are twofold:

About Text Mining

According to Marti Hearst, Text Mining is "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources. A key element is the linking together of the extracted information together to form new facts or new hypotheses to be explored further by more conventional means of experimentation."

Typically, Text Mining consists of a search phase, in which documents pertaining to a certain topic are sought in a very large collection, and an analysis phase, in which (parts of) those documents are presented in such a way as to make it easy for the human user to interpret them and obtain knowledge.

PHASAR and IP

The PHASAR system provides its users with a wholly new way of searching, using linguistically motivated search terms. It gives the user tight control over precision and recall (avoiding long lists of spurious hits) and offers unprecedented support for the search process, drawing on information from the index and the thesauri. These properties make it well suited for both exploratory and exhaustive search in large collections of patent documents.

For the analysis phase, PHASAR provides passage retrieval (focussing on relevant sentences) and re-usable search profiles.
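
To make the two phases concrete, here is a minimal Python sketch of a search phase followed by sentence-level passage retrieval. It is not PHASAR's implementation: it uses plain keyword matching rather than linguistically motivated search terms, and the documents, function names (`search_phase`, `analysis_phase`) and sentence splitter are all assumptions made up for illustration.

```python
import re

# Toy collection; the documents are invented for illustration only.
DOCUMENTS = {
    "doc1": "A valve controls the flow of fluid. The valve is made of steel.",
    "doc2": "The engine drives a pump. A sensor measures the temperature.",
    "doc3": "The pump moves fluid through the valve. The housing is plastic.",
}


def tokenize(text):
    """Lowercase the text and return the set of its word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_phase(documents, terms):
    """Search phase: return ids of documents containing every query term."""
    wanted = {t.lower() for t in terms}
    return [doc_id for doc_id, text in documents.items()
            if wanted <= tokenize(text)]


def split_sentences(text):
    """Naive sentence splitter: break on whitespace after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def analysis_phase(documents, doc_ids, terms):
    """Analysis phase: keep only sentences that mention at least one term."""
    wanted = {t.lower() for t in terms}
    passages = {}
    for doc_id in doc_ids:
        passages[doc_id] = [s for s in split_sentences(documents[doc_id])
                            if wanted & tokenize(s)]
    return passages


if __name__ == "__main__":
    query = ["valve", "fluid"]
    found = search_phase(DOCUMENTS, query)
    for doc_id, sentences in analysis_phase(DOCUMENTS, found, query).items():
        print(doc_id, sentences)
```

Requiring every term in the search phase favours precision, while the analysis phase accepts any single term so that each relevant sentence of a matching document is shown to the user.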