Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan

Ali, Kohan (2020) Unsupervised monocular depth estimation with multi-scale structural similarity powered loss function / Ali Kohan. Masters thesis, Universiti Malaya.


      Depth estimation refers to a set of techniques and algorithms that aim to obtain a representation of the spatial structure of a scene. Today, dedicated hardware such as sensors, radars and multi-view cameras is used to acquire depth data of a scene. Modern approaches use deep learning to address this task by learning depth information in a supervised manner. However, this approach requires a large amount of ground-truth data for a particular scene so that a model can be trained successfully, and preparing ground-truth data for a range of environments is challenging and expensive. Most recent works in this context have proposed self-supervised learning approaches, which implicitly infer the target data from a stereo pair of images and use that self-obtained target data to train a deep neural network to learn the disparities between the two views of the image pair. The disparity between two horizontal views of the same object describes how far that object shifts along the horizontal line from one view to the other. Predicting the disparities makes it possible to calculate the depth of the scene using simple geometric formulas. This approach, however, has shown flaws in estimating depth on specular and transparent surfaces, for which it ends up predicting inconsistent depth. In this work, a novel training objective is proposed with which a deep convolutional neural network learns to predict depth from a single image while improving the quality of depth prediction for specular and transparent surfaces. The proposed method follows previous works that try to reconstruct the right view of a scene given the left one. In addition, considering the importance of loss layers to the performance of neural networks, it suggests a new image reconstruction and matching loss function that aims to improve depth estimation consistency on specular and transparent surfaces.
The proposed loss function is perceptually motivated by the human visual system, on the assumption that it will help increase image reconstruction quality while maintaining the key structures of a scene, and in the hope that this will directly improve depth prediction and resolve the aforementioned deficiencies of its predecessors.
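
The "simple geometric formulas" mentioned in the abstract are not spelled out in this record. As a minimal sketch, assuming a rectified stereo pair with a known focal length (in pixels) and camera baseline (in metres), the standard relation is depth = focal length × baseline / disparity. The function name `disparity_to_depth`, its parameters, and the `eps` guard below are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def disparity_to_depth(disparity_px, focal_length_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to a depth map (metres).

    Assumes a rectified stereo setup; all names here are hypothetical
    and chosen only for illustration.
    """
    disparity = np.asarray(disparity_px, dtype=np.float64)
    # Clamp very small disparities so the division stays finite;
    # such pixels map to a very large (effectively "far") depth.
    safe_disparity = np.maximum(disparity, eps)
    return focal_length_px * baseline_m / safe_disparity


# Example with made-up camera parameters: 720 px focal length, 0.5 m baseline.
example_depth = disparity_to_depth(np.array([10.0, 20.0]), 720.0, 0.5)
```

With these made-up parameters, a disparity of 10 pixels corresponds to a depth of 36 m and a disparity of 20 pixels to 18 m, which shows the inverse relation: objects that shift more between the two views are closer to the camera.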

      Item Type: Thesis (Masters)
      Additional Information: Dissertation (M.A.) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2020.
      Uncontrolled Keywords: Depth estimation; Unsupervised; Monocular; Binocular; Convolutional neural networks
      Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
      Q Science > QA Mathematics > QA76 Computer software
      Divisions: Faculty of Computer Science & Information Technology
      Depositing User: Mr Mohd Safri Tahir
      Date Deposited: 09 May 2023 04:31
      Last Modified: 09 May 2023 04:31
      URI: http://studentsrepo.um.edu.my/id/eprint/14369
