Shatha Ali, Mohammed Al-Ashwal (2019) Accelerating data retrieval using index prioritization approach / Shatha Ali Mohammed Al-Ashwal. Masters thesis, Universiti Malaya.
Abstract
The last few decades have witnessed huge growth in the volume of generated data; the total amount of information that all of the world's technical devices can store has doubled about every 40 months since the 1980s. From 2012 to the present, 2.5 exabytes (2.5 × 10^18 bytes) of information have been produced daily. Database systems have to adapt to this rapid data growth. The capacity to store the generated data is available; the remaining concern is how to retrieve the stored data when needed, in a timely and accurate manner. Many researchers have studied different approaches to data retrieval, producing different methods that serve different scenarios. However, the most common way to speed up data retrieval is indexing. There are multiple types of database indexes, but the most widely used in relational databases are the B-Tree and Bitmap indexes. These indexes speed up query response time, but at a cost in storage and performance, as they need to be stored and maintained after each delete and write operation. Moreover, these indexes cover only one or two attributes rather than the whole record, which restricts them to the limited set of queries that contain these attributes in the ‘where’ clause. This research proposed a covering index that depends on the priority of the records. Records in a table are not all equally important: some need to be fetched in a timely manner, while others do not need to be retrieved very quickly. Each company knows its own criteria for important records, so it can decide how the records are ranked. Ranking can be done with triggers or procedures, created to match the company's definition of record priority. Once the records are prioritized, they are sorted according to the rank field.
When a query is run, the records are scanned in order of rank; the higher a record's rank, the earlier it is scanned. The Priority index overcomes the limitations of the classic indexes, as it does not need maintenance after each write or delete operation. Maintenance can be scheduled for nights or weekends. Moreover, it can be useful for a variety of bounded queries, as it indexes the whole record and not a single attribute. In addition, it is faster than the common indexes when querying highly ranked records. The Priority index is also smaller than the common indexes. This work involved multiple experiments running different types of queries on three tables: one indexed by a B-Tree index, another by a Bitmap index, and the third by the proposed index. The experimental results show that the Priority index is faster when retrieving highly ranked records, while its size remains smaller.
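The rank-ordered scan described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Record` fields, the `rank_of` rule (standing in for a company-defined trigger or procedure), and the helper names `build_priority_order` and `bounded_scan` are all assumptions made for illustration, and an in-memory list stands in for a database table.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Record:
    # Hypothetical record layout; the thesis does not specify one here.
    record_id: int
    customer: str
    amount: float


def rank_of(record: Record) -> int:
    """Hypothetical ranking rule (rank 1 is the highest priority).

    Stands in for the company-specific trigger or procedure.
    """
    if record.amount >= 1000:
        return 1
    if record.amount >= 100:
        return 2
    return 3


def build_priority_order(records: List[Record]) -> List[Record]:
    """Sort whole records by rank; this could be rerun on a schedule
    instead of after every write or delete."""
    return sorted(records, key=rank_of)  # sorted() is stable


def bounded_scan(
    ordered: List[Record],
    predicate: Callable[[Record], bool],
    limit: int,
) -> Tuple[List[Record], int]:
    """Scan in rank order and stop once `limit` matches are found.

    Returns the matches and the number of records examined.
    """
    matches: List[Record] = []
    scanned = 0
    for record in ordered:
        scanned += 1
        if predicate(record):
            matches.append(record)
            if len(matches) >= limit:
                break
    return matches, scanned


records = [
    Record(1, "zeta", 40.0),
    Record(2, "acme", 2500.0),
    Record(3, "acme", 150.0),
    Record(4, "orion", 75.0),
]
ordered = build_priority_order(records)

# Bounded query: first record for customer "acme", highest rank first.
matches, scanned = bounded_scan(ordered, lambda r: r.customer == "acme", limit=1)
```

In this toy data the highly ranked "acme" record is found after examining a single record, whereas a query for the low-ranked "orion" record has to scan the whole list. That matches the abstract's claim that the benefit applies to highly ranked records.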
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Dissertation (M.A.) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2019. |
Uncontrolled Keywords: | Index prioritization approach; Data; Priority index; Retrieval; Record |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Divisions: | Faculty of Computer Science & Information Technology |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 01 Apr 2022 02:01 |
Last Modified: | 01 Apr 2022 02:01 |
URI: | http://studentsrepo.um.edu.my/id/eprint/13338 |