Linguistic Classification

the LCS system

The Linguistic Classification System (LCS) is a basic component for applications involving document classification. Developed in the course of the ESPRIT projects DORO and PEKING, it performs the following tasks:

  • learning a classifier from pre-classified training documents
  • applying a classifier to given documents to obtain a ranking of all documents for all categories
  • making, based on a computed ranking of documents for categories, an optimal assignment of documents to categories.
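The three tasks above can be illustrated with a minimal sketch. It is not the LCS implementation or its interface: the function names (train, rank, assign), the centroid-style scoring, the threshold rule and the toy documents are all assumptions chosen to keep the example short, and the simple threshold stands in for whatever assignment strategy LCS actually uses.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    # Hypothetical tokenizer: lower-case, split on whitespace.
    return text.lower().split()


def train(labelled_docs):
    """Task 1: learn one normalised term profile per category."""
    totals = defaultdict(Counter)
    for text, categories in labelled_docs:
        for cat in categories:
            totals[cat].update(tokenize(text))
    profiles = {}
    for cat, counts in totals.items():
        norm = math.sqrt(sum(c * c for c in counts.values()))
        profiles[cat] = {term: c / norm for term, c in counts.items()}
    return profiles


def score(profile, text):
    # Cosine similarity between the category profile and the document.
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return 0.0
    return sum(profile.get(t, 0.0) * c for t, c in counts.items()) / norm


def rank(profiles, docs):
    """Task 2: for every category, rank all documents by score."""
    return {
        cat: sorted(((score(p, text), doc_id) for doc_id, text in docs.items()),
                    reverse=True)
        for cat, p in profiles.items()
    }


def assign(ranking, threshold=0.5, min_cats=1, max_cats=2):
    """Task 3: turn scores into category assignments per document.

    Assumed rule: keep categories scoring at least `threshold`, capped at
    `max_cats`; fall back to the best `min_cats` if none qualify.
    """
    per_doc = defaultdict(list)
    for cat, scored in ranking.items():
        for s, doc_id in scored:
            per_doc[doc_id].append((s, cat))
    result = {}
    for doc_id, scored in per_doc.items():
        scored.sort(reverse=True)
        chosen = [cat for s, cat in scored if s >= threshold][:max_cats]
        if len(chosen) < min_cats:
            chosen = [cat for _, cat in scored[:min_cats]]
        result[doc_id] = chosen
    return result


# Toy data, invented for illustration only.
train_docs = [
    ("patent claim for an engine piston", ["mechanics"]),
    ("engine valve and piston ring", ["mechanics"]),
    ("neural network training data", ["computing"]),
    ("software compiler and network protocol", ["computing"]),
]
new_docs = {"d1": "piston engine design", "d2": "network software"}

profiles = train(train_docs)
ranking = rank(profiles, new_docs)
assignment = assign(ranking)
print(assignment)
```

Keeping ranking and assignment as separate steps means the same ranking can be reused with different thresholds or category limits without rescoring the documents.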
Other classification systems with this functionality are available, often in the public domain, but only LCS provides a framework that supports the classification process in practical situations, offering:
  • multiple classification algorithms
  • de novo training of classifiers
  • incremental training and feedback training
  • multi-classification
  • hierarchical classification
  • cross-lingual classification
  • selection according to different utility functions
  • quality monitoring and reporting
LCS differs from other systems in that it can use linguistic terms (dependency triples) to improve its accuracy. It has a proven track record in the classification of patent documents.


Cornelis H.A. Koster
Department of Computing Science
University of Nijmegen
6525ED Nijmegen, The Netherlands