Visual codebook analysis in image understanding / Hoo Wai Lam

Hoo, Wai Lam (2015) Visual codebook analysis in image understanding / Hoo Wai Lam. PhD thesis, University of Malaya.

    PDF (7 MB): Restricted to Repository staff only until 01 January 2018.

    Abstract

    Since web search engines became popular, computer vision researchers have spent the past decade actively investigating the image understanding problem, aiming for robust content-based image retrieval algorithms that can recognise objects and scenes in the environment. The visual codebook, which acts as a 'dictionary' for images, is widely used in the literature. This thesis investigates the limitations of current visual codebook algorithms and proposes new solutions to the identified problems.

    The first contribution of this thesis is to enhance the visual codebook with soft class labels. The visual codebook suffers from weakly supervised learning, because background image patches are wrongly assigned the semantic class label attached to the whole image. As a result, the visual codebook learns incorrect information, which degrades image classification performance. To address this problem, soft class labels are proposed that exploit both image-level and patch-level information. For each image patch, the soft class labels assign a weight to every object class, so the visual codebook is no longer affected by wrongly labelled image patches.

    The second contribution of this thesis is to reduce the human annotation effort in zero-shot learning by proposing a hierarchical class concept. In general, when only limited images are available, the effectiveness of the visual codebook suffers, so a zero-shot learning approach is needed to classify images from classes the classification model has never seen. State-of-the-art approaches often use attributes for zero-shot learning, but attributes require extensive human annotation. The proposed method instead performs zero-shot learning using newly defined Coarse Classes and Fine Classes, so that seen and unseen classes can be related and the extensive human annotation effort is no longer needed.

    The third contribution of this thesis is to reduce the biases that exist in image datasets. To do so, a visual codebook consisting of codewords that significantly represent the object classes, called a keybook, is built using a mutual information approach. The biases addressed include capture bias, where a dataset mostly contains images taken from certain viewpoints, and selection bias, which arises when researchers building a dataset unintentionally favour specific environments (e.g. street scenes). These biases are embedded in the datasets and cause a visual codebook generated from one dataset to perform poorly on another. To overcome this, the proposed approach selects, from all visual codebooks, the codewords that significantly represent the object classes and uses them to build the keybook, thereby reducing the effect of dataset bias in the visual codebook.

    In summary, these three research works address the current limitations of the visual codebook so that a better codebook representation can be built for image understanding tasks, with the ultimate goal of improving image classification performance.
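
    As a rough illustration of the third contribution, the sketch below shows one way a mutual-information-based keybook selection could be implemented: score each codeword by the mutual information between its presence in an image and the image's class label, then keep the most informative codewords. The function names, the binary presence assumption, and the use of bag-of-words histograms are illustrative assumptions for this sketch, not the thesis' actual algorithm.

        # Hypothetical sketch: rank codewords by mutual information with the class
        # labels and keep the top k as the "keybook". Names are illustrative only.
        import numpy as np

        def mutual_information(presence, labels):
            """Estimate I(X; Y) between a binary codeword-presence indicator X
            and the image class label Y from empirical counts."""
            labels = np.asarray(labels)
            mi = 0.0
            for x in (0, 1):
                px = np.mean(presence == x)
                if px == 0:
                    continue
                for y in np.unique(labels):
                    pxy = np.mean((presence == x) & (labels == y))
                    py = np.mean(labels == y)
                    if pxy > 0:
                        mi += pxy * np.log2(pxy / (px * py))
            return mi

        def select_keybook(histograms, labels, k):
            """Given bag-of-words histograms (n_images x n_codewords), return the
            indices of the k codewords most informative about the class labels."""
            presence = histograms > 0  # binarise: does the codeword appear in the image?
            scores = np.array([mutual_information(presence[:, j], labels)
                               for j in range(histograms.shape[1])])
            return np.argsort(scores)[::-1][:k]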

    Item Type: Thesis (PhD)
    Additional Information: Thesis (Ph.D.) -- Faculty of Computer Science and Information Technology, University of Malaya, 2015
    Uncontrolled Keywords: Visual codebook; Analysis; Image understanding
    Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
    Q Science > QA Mathematics > QA76 Computer software
    Divisions: Faculty of Computer Science & Information Technology
    Depositing User: Mrs Nur Aqilah Paing
    Date Deposited: 19 Oct 2015 15:44
    Last Modified: 19 Oct 2015 15:44
    URI: http://studentsrepo.um.edu.my/id/eprint/5903
