Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan

Muhammad Sadiq, Khan (2018) Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan. PhD thesis, University of Malaya.



      Uncertain data cannot be processed with the regular tools and techniques used for crisp (certain) data. Special techniques such as fuzzy sets, rough sets, and soft sets are needed when dealing with uncertain data, and each comes with its own advantages and drawbacks. The soft set is considered the most appropriate of these techniques. A soft set application represents uncertain data in tabular form, where every value is either 0 or 1. Researchers use the soft set representation in a number of applications, including decision making, parameter reduction, medical diagnosis, and conflict analysis. Binary soft set data may go missing because of communication errors, virus attacks, and similar causes, and soft sets with incomplete data cannot be used in applications. Few researchers have worked on data filling and recalculation for incomplete soft sets. The current research focuses on predicting missing values and decision values from non-missing data or from aggregates.
      A soft set must be preprocessed to obtain aggregates, while no preprocessing is needed when aggregates are not required. This research therefore discusses the existing techniques in two categories: preprocessed and unprocessed soft sets. The currently available approaches in the preprocessed category recalculate part of the missing data from aggregates, yet cannot use the set of aggregates to recalculate all of the missing values. This research presents a mathematical technique capable of recalculating all missing values from the available aggregates. The techniques in the unprocessed category are also investigated; among them, DFIS (a novel data filling approach for an incomplete soft set) appears to be the most suitable for handling incomplete soft set data. The results show, however, that DFIS has a persistent accuracy problem in prediction. DFIS predicts missing values through the association between parameters, yet makes no distinction between the different associations. It thus ignores the role of the strongest association, which in turn results in low accuracy. This research rectifies this issue with a new technique, prediction through strongest association (PSA). Both techniques were implemented in MATLAB and tested for data filling on benchmark data sets, and the experimental results show that PSA is more accurate than DFIS.
      Further, this research applies PSA to online social networks (OSNs) and detects a new kind of network community made up of nodes that are associated with each other. The new network community is named a 'virtual community', and the inter-associated nodes are named 'prime nodes'. Researchers have found that incomplete information about OSN nodes lowers the accuracy of ranking algorithms. This research therefore predicts new links in two OSN data sets (Facebook and Twitter) through the association between prime nodes using PSA. By completing the OSNs in this way, the study demonstrates that the performance of well-known ranking algorithms (k-Core and PageRank) can be significantly improved.
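The two ideas in the abstract, recalculating a missing value from an aggregate and predicting a missing value from the association between parameters, can be illustrated with a small Python sketch. This is not the thesis's algorithm, nor DFIS or PSA as published: the function names, the use of row totals as the aggregate, and the signed agreement score used as the "association" are all assumptions made here for illustration only.

```python
from typing import List, Optional

# A soft set as a binary table: rows are objects, columns are parameters,
# and None marks a missing entry. (Representation assumed for this sketch.)
Table = List[List[Optional[int]]]


def fill_from_row_totals(table: Table, row_totals: List[int]) -> Table:
    """Recover a missing entry when its row has exactly one gap and the
    row total (an aggregate assumed to be known) is available."""
    filled = [row[:] for row in table]
    for row, total in zip(filled, row_totals):
        gaps = [j for j, v in enumerate(row) if v is None]
        if len(gaps) == 1:
            row[gaps[0]] = total - sum(v for v in row if v is not None)
    return filled


def association(table: Table, a: int, b: int) -> float:
    """Hypothetical signed association between parameters a and b, computed
    over rows where both are known: +1.0 means the values always agree,
    -1.0 means they always differ, 0.0 means no usable evidence."""
    pairs = [(r[a], r[b]) for r in table
             if r[a] is not None and r[b] is not None]
    if not pairs:
        return 0.0
    agree = sum(1 for x, y in pairs if x == y)
    return (2 * agree - len(pairs)) / len(pairs)


def fill_by_strongest_association(table: Table) -> Table:
    """Fill each missing entry from the single parameter with the largest
    absolute association, copying its value (positive association) or
    flipping it (negative association). Entries with no evidence stay None."""
    filled = [row[:] for row in table]
    n_params = len(table[0])
    for i, row in enumerate(table):
        for j, value in enumerate(row):
            if value is not None:
                continue
            best, best_score = None, 0.0
            for k in range(n_params):
                if k == j or row[k] is None:
                    continue
                score = association(table, j, k)
                if abs(score) > abs(best_score):
                    best, best_score = k, score
            if best is not None:
                filled[i][j] = row[best] if best_score > 0 else 1 - row[best]
    return filled


if __name__ == "__main__":
    example: Table = [
        [1, 0, 1],
        [0, 1, 0],
        [1, 0, None],  # third parameter missing for this object
        [0, 1, 0],
    ]
    print(fill_from_row_totals(example, [2, 1, 2, 1]))
    print(fill_by_strongest_association(example))
```

The first function works only when a row has a single gap; recovering several missing values at once requires combining more aggregates, which is the problem the thesis addresses for preprocessed soft sets. The second function relies on one strongest parameter rather than pooling all associations, which mirrors the motivation stated for PSA without claiming to reproduce it.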

      Item Type: Thesis (PhD)
      Additional Information: Thesis (PhD) - Faculty of Computer Science & Information Technology, University of Malaya, 2018.
      Uncontrolled Keywords: Soft set; Data recalculation; Data prediction
      Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
      Divisions: Faculty of Computer Science & Information Technology
      Depositing User: Mr Mohd Safri Tahir
      Date Deposited: 12 Jun 2018 11:35
      Last Modified: 12 Apr 2021 04:18
