Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman

Mashahi Khalafalla, Dafaalla Abdelrahman (2020) Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman. Masters thesis, Universiti Malaya.


      Abstract

      Requirement traceability can be regarded as a measure of software quality that supports validation, verification, and reusability. Neglecting traceability leads to less maintainable software. Creating traceability links after the fact, known as traceability recovery, is tedious and time-consuming when done manually. Information retrieval (IR) methods have therefore been used to identify traceability links between artifacts automatically. However, the performance of IR methods is negatively affected by limitations of both the software engineer and the IR techniques themselves. No IR method is able to recover traceability links between artifacts with both high precision and high recall; in the Vector Space Model (VSM), for example, retrieved false positives result in low precision. Nevertheless, VSM is widely used because it is considered the simplest linear-algebraic method and is easy for non-IR experts to understand and use. It allows documents to be ranked according to their probable relevance, and many tools and open-source implementations, such as RETRO and ReqSimile, implement it. This research aims to assist software engineers (analysts) in recovering traceability links between software artifacts by suggesting the appropriate type of phrases to index in order to enhance the performance of the IR method. The research objectives are: 1) to investigate IR methods for traceability recovery; 2) to propose a method that achieves high performance (recall and precision as high as possible) in traceability recovery; 3) to empirically validate the proposed method through an experimental analysis that demonstrates its ability to improve performance (recall and precision as high as possible) in traceability recovery. A comparative experiment was conducted by extracting noun phrases (NP), verb phrases (VP), and combinations of noun and verb phrases (NPVP) from three benchmark datasets, namely CM1, MODIS, and PINE. VSM was applied and the results were evaluated in terms of recall and precision. The results showed that indexing NP only tends to outperform VP, NPVP, and all terms, achieving recall and precision that are as high as possible.
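
      To make the evaluation described in the abstract concrete, the following is a minimal sketch of VSM-based link recovery scored by precision and recall. It is not the thesis implementation: the TF-IDF weighting, the similarity threshold, the function names, and the toy artifacts are all assumptions chosen for illustration, and phrase extraction is not shown (each artifact is given as a list of already extracted index terms).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one TF-IDF weight dictionary per tokenised document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recover_links(sources, targets, threshold):
    """Return candidate (source, target) links at or above the threshold."""
    src_ids, tgt_ids = list(sources), list(targets)
    vecs = tfidf_vectors([sources[s] for s in src_ids] +
                         [targets[t] for t in tgt_ids])
    src_vecs = dict(zip(src_ids, vecs[:len(src_ids)]))
    tgt_vecs = dict(zip(tgt_ids, vecs[len(src_ids):]))
    return {(s, t) for s in src_ids for t in tgt_ids
            if cosine(src_vecs[s], tgt_vecs[t]) >= threshold}


def precision_recall(retrieved, relevant):
    """Precision and recall of retrieved links against the true links."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy artifacts (not taken from CM1, MODIS, or PINE).
high_level = {"HR1": ["user", "login", "password"],
              "HR2": ["report", "export", "pdf"]}
low_level = {"LR1": ["login", "password", "validation"],
             "LR2": ["pdf", "export", "format"],
             "LR3": ["user", "report", "log"]}
# Hypothetical answer set of true traceability links.
true_links = {("HR1", "LR1"), ("HR2", "LR2"), ("HR2", "LR3")}

for threshold in (0.3, 0.2):
    links = recover_links(high_level, low_level, threshold)
    print(threshold, precision_recall(links, true_links))
```

      In this toy setting, lowering the threshold retrieves more candidate links, which raises recall but admits a false positive and so lowers precision. Changing which terms are placed in each artifact's list (for example noun phrases only) is the point at which the comparison described in the abstract would plug in.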

      Item Type: Thesis (Masters)
      Additional Information: Dissertation (M.A.) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2020.
      Uncontrolled Keywords: Traceability recovery; Information retrieval; Vector space model; Software requirements; Noun phrases
      Subjects: Q Science > QA Mathematics > QA76 Computer software
      T Technology > TA Engineering (General). Civil engineering (General)
      Divisions: Faculty of Computer Science & Information Technology
      Depositing User: Mr Mohd Safri Tahir
      Date Deposited: 11 Mar 2022 09:58
      Last Modified: 11 Mar 2022 09:58
      URI: http://studentsrepo.um.edu.my/id/eprint/12935
