Detection of multi-oriented moving text in videos / Vijeta Khare

Khare, Vijeta (2016) Detection of multi-oriented moving text in videos / Vijeta Khare. PhD thesis, University of Malaya.



    Text, as one of the most significant creations of humankind, has played a vital role in human life since ancient times. The high-level semantics embodied in text are beneficial in a wide range of vision-based applications, such as image understanding, image indexing, geo-location, automatic navigation, license plate recognition, assistance for the visually impaired, and other surveillance applications. Approaches in the field of content-based image retrieval address these problems; however, they are inadequate for generating semantic annotations from the content of videos or images because of the gap between high-level and low-level features. Text detection and recognition in videos have therefore grown into active and important research areas in computer vision and document analysis, enabling the content of videos and images to be understood at a high level with the help of an Optical Character Recognizer (OCR). Recent years in particular have seen a surge of research effort and considerable progress in these fields, yet many challenges remain, e.g. low resolution, complex backgrounds, variations in color, font and font size, multiple orientations, multi-oriented text movement, noise, blur, and distortion. The objectives of this work are fourfold: (1) to introduce a new descriptor, called Histogram of Oriented Moments (HOM), for detecting multi-oriented text in videos. The HOM is built from orientations computed with second-order geometric moments. To verify the detected text, optical-flow properties are then used to estimate the motion of text candidates across temporal frames; the temporal information, however, is used only to eliminate false positives, not as a primary feature for finding text candidates.
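    The orientation-from-moments idea behind the HOM descriptor can be sketched as follows. This is a minimal illustration, not the thesis implementation: the window size, bin count, and normalization are assumptions, and the real descriptor may weight or aggregate orientations differently.

    ```python
    import numpy as np

    def patch_orientation(patch):
        """Dominant axis angle of a grayscale patch from its
        second-order central moments (mu20, mu02, mu11)."""
        h, w = patch.shape
        ys, xs = np.mgrid[0:h, 0:w]
        m00 = patch.sum()
        if m00 == 0:
            return 0.0
        cx = (xs * patch).sum() / m00          # intensity centroid
        cy = (ys * patch).sum() / m00
        mu20 = (((xs - cx) ** 2) * patch).sum()
        mu02 = (((ys - cy) ** 2) * patch).sum()
        mu11 = ((xs - cx) * (ys - cy) * patch).sum()
        # standard moment-based orientation, in [-pi/2, pi/2]
        return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)

    def hom_descriptor(image, win=16, bins=9):
        """Histogram of per-window moment orientations over the image
        (a simplified stand-in for the HOM descriptor)."""
        h, w = image.shape
        hist = np.zeros(bins)
        for y in range(0, h - win + 1, win):
            for x in range(0, w - win + 1, win):
                theta = patch_orientation(image[y:y + win, x:x + win].astype(float))
                # map [-pi/2, pi/2] onto bin indices 0..bins-1
                b = int((theta + np.pi / 2) / np.pi * bins) % bins
                hist[b] += 1
        return hist / max(hist.sum(), 1)
    ```

    Because the angle comes from second-order moments of the whole window rather than per-pixel gradients, the descriptor responds to the elongation of stroke-like structures at any orientation, which is what makes it suitable for multi-oriented text.
    
    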
    (2) To propose new models for detecting multi-oriented moving text in video and scene images through moments; motion vectors are used to identify moving regions with constant velocity. The model, however, is somewhat sensitive to the window size used for moment calculation and to the different scripts present in a video. (3) To develop automatic window-size determination for detecting text in videos, the next method explores the stroke width transform, based on the observation that stroke width remains constant throughout a character. Temporal frames are further used to identify text candidates, exploiting the fact that caption text stays at the same location for several frames. The performance of this method degrades, however, when blur is present in the video frames, because moments and the stroke width transform are sensitive to blur. (4) To enable text detection and recognition in blurred frames, a blind deconvolution model is introduced that enhances edge sharpness by suppressing blurred pixels. In summary, each method has been tested on benchmark datasets and on datasets created by the authors from various sources, using standard measures. The results of the proposed methods are also compared with state-of-the-art methods to show that the proposed methods are competitive with existing ones.
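    The temporal cue in objective (3) — caption text staying at the same location across frames — can be checked with a simple mask-overlap test. This is a hedged sketch under assumed inputs (per-frame binary candidate masks) and an assumed IoU threshold; the thesis method may use a different stability measure.

    ```python
    import numpy as np

    def iou(a, b):
        """Intersection-over-union of two boolean masks."""
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return inter / union if union else 0.0

    def is_stable_caption(masks, iou_thresh=0.8):
        """True if a text-candidate mask stays essentially fixed over
        consecutive frames, as caption text does for several frames."""
        return all(iou(masks[i], masks[i + 1]) >= iou_thresh
                   for i in range(len(masks) - 1))
    ```

    Scene text attached to a moving object fails this test (its mask drifts between frames), so the check separates static captions from moving text candidates before the more expensive verification steps.
    
    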

    Item Type: Thesis (PhD)
    Additional Information: Thesis (PhD) - Faculty of Engineering, University of Malaya, 2016.
    Uncontrolled Keywords: Human life; Optical Character Recognizer (OCR); Moving text
    Subjects: T Technology > T Technology (General)
    T Technology > TA Engineering (General). Civil engineering (General)
    Divisions: Faculty of Engineering
    Depositing User: Mrs Nur Aqilah Paing
    Date Deposited: 30 Sep 2016 12:27
    Last Modified: 08 Oct 2019 03:12
