Minnu Helen, Joseph (2024) Increasing the accuracy of information retrieval systems evaluation by improving the quality of the relevant judgements / Minnu Helen Joseph. PhD thesis, Universiti Malaya.
Abstract
Information retrieval evaluation is the process of measuring how well the participating systems meet the user's information need. A system's performance is evaluated based on the quality of the relevance judgment set. The quality of the judgment set is measured by the ability of the participating systems to retrieve as many relevant documents per topic as possible into the judgment set, rank them well, and at the same time suppress the irrelevant ones. For smaller test collections, this assumption might be correct, but for large test collections like TREC (Text Retrieval Conference) it might not always hold. The judgment sets have been observed to fall short of the quality and completeness expected under the Cranfield paradigm, as seen especially through document similarity techniques. The main aim of this thesis is to increase the quality of the relevance judgment sets during the evaluation process. That quality can be increased by augmenting the number of relevant documents in the judgment sets, which indirectly helps to increase the accuracy of the evaluation process. The main contribution of this thesis is a set of methodologies for increasing the quality of the judgment sets. The first experiment explored the issues of partial relevance judgments in existing methodologies, in particular their inability to retrieve all the relevant documents into the relevance judgment sets. In view of these limitations, a methodology is proposed to increase the number of relevant documents in the judgment sets. The proposed methodology combines pooling with document similarity, using clustering and classification techniques. Document similarity is computed between the pooled documents and the clustered or classified unjudged documents. 
If a similarity is found, a new score is assigned to the unjudged document and the document is moved into the pooled list. The process continues until all the documents from the pooled list have been considered for similarity checking. The results show that the proposed methodology places a greater number of relevant documents in the judgment sets and also achieves better results with a smaller pool depth. The second experiment explored how to further improve or maintain the quality of the judgment set by considering the test collection itself. For this experiment, the topics and participating systems of the test collections were considered. The results show that a smaller number of the most effective topics, or easy topics, can maintain the quality of the judgment sets. In addition, an enhanced methodology based on the system contributions is proposed, and the results show that it yields a better-quality judgment set and better results with a smaller pool depth. Considering only the most effective topics and the documents of well-contributing systems also helps to reduce the computational cost of the evaluation process. Lastly, the results show that the proposed methodology helped to reduce the incompleteness of the judgment sets and the bias in the ranking of the judgment sets.
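The pooling and similarity step described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function names (`pool`, `cosine`, `augment_pool`), the bag-of-words cosine similarity, the `threshold` value, and the toy documents are all assumptions made for illustration, and the clustering or classification of unjudged documents mentioned above is skipped in favour of a direct comparison.

```python
from collections import Counter
from math import sqrt

def pool(runs, depth):
    """Depth-k pooling: union of the top `depth` documents of every run."""
    pooled = set()
    for ranking in runs.values():
        pooled.update(ranking[:depth])
    return pooled

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def augment_pool(pooled, unjudged, texts, threshold=0.8):
    """Add unjudged documents that closely resemble a pooled document.

    Returns the enlarged pool and the similarity score assigned to each
    added document. The threshold is a hypothetical choice.
    """
    vectors = {d: Counter(texts[d].lower().split()) for d in pooled | unjudged}
    scores = {}
    for cand in unjudged:
        best = max((cosine(vectors[cand], vectors[p]) for p in pooled), default=0.0)
        if best >= threshold:
            scores[cand] = best
    return pooled | set(scores), scores

# Toy data (invented): two system runs over six short documents.
runs = {"sysA": ["d1", "d2", "d3", "d6"], "sysB": ["d2", "d4", "d5"]}
texts = {
    "d1": "neural ranking models for retrieval",
    "d2": "pooling methods for test collections",
    "d3": "neural ranking models for web retrieval",
    "d4": "cooking recipes for pasta",
    "d5": "pooling methods for large test collections",
    "d6": "weather forecast for tomorrow",
}

pooled = pool(runs, depth=2)
all_docs = {d for ranking in runs.values() for d in ranking}
augmented, scores = augment_pool(pooled, all_docs - pooled, texts)
print(sorted(pooled), sorted(augmented))
```

With a pool depth of 2, the documents `d3` and `d5` lie below the cut-off but closely resemble pooled documents, so the sketch moves them into the pool, while the dissimilar `d6` stays unjudged.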
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2024. |
Uncontrolled Keywords: | Information retrieval evaluation; Pooling; Document similarity; Incomplete judgments; Rank biasness |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Computer Science & Information Technology |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 14 Mar 2025 02:31 |
Last Modified: | 14 Mar 2025 02:31 |
URI: | http://studentsrepo.um.edu.my/id/eprint/15599 |