Architecture for biodiversity image retrieval using ontology and Content Based Image Retrieval (CBIR) / Arpah Abu

Abu, Arpah (2013) Architecture for biodiversity image retrieval using ontology and Content Based Image Retrieval (CBIR) / Arpah Abu. PhD thesis, University of Malaya.



    This research examines how an ontology can be used to pre-classify training set images to improve the efficiency of Content-Based Image Retrieval (CBIR) for biodiversity. The images used for retrieval are of Malaysian monogeneans belonging to the order Dactylogyridea Bychowsky, 1937. Monogeneans are parasitic platyhelminths and are distinguished by both soft reproductive anatomical features and the shapes and sizes of sclerotised hard parts (haptoral bar, anchor, marginal hook, and male and female copulatory organs). The diagnostic features of monogeneans, especially their sclerotised hard parts, are given as illustrations in the literature. In this study, two image retrieval models were built: one without image pre-classification and one with it. Model 1, without pre-classification, follows a typical CBIR approach in which all the images in the image database are used as training set images. Model 2, with pre-classification, integrates CBIR with an ontology, which pre-classifies the images in the image database for training purposes. In this approach, the images are annotated with taxonomic classification, diagnostic parts and image properties using the Taxonomic Databases Working Group (TDWG) Life Sciences Identifiers (LSID) structured vocabulary, represented in the form of an ontology. The purpose of image pre-classification is to classify the images in the training set according to certain parameters; this study focuses on the dorsal and ventral sides of the haptoral bars. As a result, the number of images in the training set decreases after pre-classification. In the CBIR approach implemented in both models, region-based shape information using the pixel mean value serves as the descriptor representing the shapes in the images.
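The difference between the two models' training sets, and the kind of region-based descriptor mentioned above, can be sketched as follows. This is a minimal illustration and not the thesis implementation: the `AnnotatedImage` record, its `bar_side` field, the `grid` parameter and the grid-of-block-means descriptor are all assumptions introduced here. The abstract states only that a region-based pixel mean value is used and that pre-classification separates dorsal and ventral haptoral bars.

```python
from dataclasses import dataclass
from typing import List, Optional

Image = List[List[int]]  # binary image as rows of 0/1 pixels (assumed format)


@dataclass
class AnnotatedImage:
    name: str
    bar_side: str  # hypothetical annotation value: "dorsal" or "ventral"
    pixels: Image


def block_means(pixels: Image, grid: int = 2) -> List[float]:
    """Split the image into grid x grid regions and return each region's mean pixel value.

    Assumes the image has at least `grid` rows and `grid` columns.
    """
    rows, cols = len(pixels), len(pixels[0])
    features = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(block) / len(block))
    return features


def training_set(database: List[AnnotatedImage],
                 bar_side: Optional[str] = None) -> List[AnnotatedImage]:
    """Model 1 style: bar_side=None keeps every image.
    Model 2 style: keep only images whose annotation matches the requested side."""
    if bar_side is None:
        return list(database)
    return [img for img in database if img.bar_side == bar_side]
```

In this sketch the annotation lookup is a plain field comparison; in the thesis the pre-classification is driven by the ontology, which this example does not attempt to model.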
For image classification, a minimum distance classifier is used to classify the retrieved images, and the relevant images among them are then assessed using Euclidean distance and visual comparison. Both systems were tested on 148 haptoral bar images. Their performance was assessed using R-Precision, Error Rate (ER), Mean Average Precision (MAP), the precision-recall (PR) graph, the Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC). According to these measures, Model 2 achieved better image retrieval. The results show that the relevancy rate increases as the training set shrinks, because most of the remaining images are relevant to the query image; that is, the relevancy rate of the retrieved images is inversely proportional to the size of the training set. In addition, the retrieval results present the retrieved images together with their annotations, giving the user more understanding and knowledge. Finally, a three-tier architecture for biodiversity image retrieval is proposed and developed in this study.
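To make the distance-based retrieval and two of the listed measures concrete, the sketch below ranks training images by Euclidean distance to a query and computes R-Precision and average precision for a single query. It is an illustration under assumptions, not the thesis code: the function names and the dictionary-of-feature-vectors input are hypothetical, and ranking by nearest distance is a simplification of the minimum distance classifier named in the abstract. MAP would be the mean of `average_precision` over all queries.

```python
import math
from typing import Dict, List, Sequence, Set


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query: Sequence[float],
                     training: Dict[str, Sequence[float]]) -> List[str]:
    """Return training image names ordered from nearest to farthest from the query."""
    return sorted(training, key=lambda name: euclidean(query, training[name]))


def r_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Fraction of the top-R results that are relevant, where R = number of relevant images."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for name in ranked[:r] if name in relevant) / r


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values taken at each rank where a relevant image appears."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, name in enumerate(ranked, start=1):
        if name in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)
```

A smaller, pre-classified training set changes only the `training` dictionary passed in; the ranking and scoring steps stay the same, which is what allows the two models to be compared with identical measures.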

    Item Type: Thesis (PhD)
    Additional Information: Thesis (Ph.D) -- Institut Sains Biologi, Fakulti Sains, Universiti Malaya, 2013
    Uncontrolled Keywords: Content-based image retrieval; Image processing--Digital techniques--Scientific applications; Biodiversity--Data processing; Ontologies (Information retrieval)
    Subjects: Q Science > Q Science (General)
    Q Science > QH Natural history
    Divisions: Faculty of Science
    Depositing User: Mrs Nur Aqilah Paing
    Date Deposited: 26 Sep 2014 11:01
    Last Modified: 26 Sep 2014 11:01
