A New Contour Based Invariant Feature Extraction Approach for the Recognition of Multi-lingual Documents
Date
2005-02-02
Publisher
INFLIBNET Centre
Abstract
Nowadays, a single OCR system that can recognize multilingual documents is essential
for improving the capability and performance of existing document analysis systems.
In this paper, we present a new technique based on contour detection and a distance
measure for recognizing multilingual characters, covering South Indian languages
(Kannada, Tamil, Telugu, Malayalam) as well as English upper case, English lower case,
English numerals, and Persian alphanumerics. The proposed method finds the boundary of a
character using contour detection, and the detected contour is passed to a feature
extraction scheme that produces distinct, invariant features for identifying characters
of different languages. The method extracts these invariant features by computing the
distance between the centroid and each contour pixel of the character image.
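
The centroid-to-contour distance computation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `contour_pixels` and `centroid_distances`, the 4-neighbourhood boundary test, and the choice to take the centroid over the contour pixels (rather than over the whole character) are all assumptions made for illustration.

```python
import math


def contour_pixels(img):
    """Return (row, col) of foreground pixels that touch background.

    Assumption: a pixel is on the contour if any of its 4-neighbours is
    background or lies outside the image. `img` is a list of rows of 0/1.
    """
    rows, cols = len(img), len(img[0])
    contour = []
    for r in range(rows):
        for c in range(cols):
            if not img[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not img[nr][nc]:
                    contour.append((r, c))
                    break
    return contour


def centroid_distances(img):
    """Euclidean distance from the contour centroid to every contour pixel."""
    pts = contour_pixels(img)
    if not pts:
        return []
    cy = sum(r for r, _ in pts) / len(pts)
    cx = sum(c for _, c in pts) / len(pts)
    return [math.hypot(r - cy, c - cx) for r, c in pts]


# Hypothetical test image: a filled 3x3 square inside a 5x5 frame.
image = [[0] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 1

distances = centroid_distances(image)
```

For a real character image, the binarization step and the contour detector would replace the toy square and the neighbourhood test used here.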
We compare the experimental results of the proposed method with those of existing methods
to evaluate its performance. In our experiments, the proposed method achieves
100% accuracy with minimal expense and time. In addition,
the method is invariant to rotation, scaling, and translation (RST) transformations.
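
The abstract does not state how RST invariance is obtained. One common way to make a centroid-distance feature invariant, shown here purely as an assumption, is to measure distances from the centroid (translation), divide by the largest distance (scaling), and sort the values so they do not depend on the starting point or orientation (rotation). The names `rst_signature` and `transform` and the L-shaped point set are hypothetical.

```python
import math


def rst_signature(points):
    """Sorted centroid distances, normalised by the largest distance."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    largest = max(dists)
    if largest == 0:
        return [0.0] * n
    return sorted(d / largest for d in dists)


def transform(points, angle, scale, tx, ty):
    """Rotate by `angle` (radians), scale uniformly, then translate."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [
        (scale * (ca * x - sa * y) + tx, scale * (sa * x + ca * y) + ty)
        for x, y in points
    ]


# Hypothetical contour samples of an L-shaped outline.
shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
moved = transform(shape, math.radians(37), 2.5, 10.0, -4.0)

a = rst_signature(shape)
b = rst_signature(moved)
same = all(math.isclose(u, v, abs_tol=1e-9) for u, v in zip(a, b))
```

On continuous coordinates the two signatures agree up to floating-point error. On a pixel grid, rotation and scaling resample the contour, so agreement would only be approximate and a fixed-length summary of the distances would be needed in practice.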
Keywords
Contour detection, Distance Measure, Invariant features, Character recognition, OCR