Chow, Kingsley (2012) A comparative study of two orthologous gene identification methods on synteny block inference. Masters thesis, University of Malaya.
Abstract
A synteny block is a set of orthologous genes that share the same relative ordering on the chromosomes of two species. Synteny analysis at the genome scale is a powerful means of identifying orthologs in a set of genomes of interest for downstream phylogenetic studies. OrthoCluster is a data mining tool for inferring synteny blocks among multiple genomes. Before OrthoCluster can infer synteny blocks, the orthologous gene relationships between the species of interest must first be identified. In this study, we evaluated the effects of two orthologous gene identification methods, InParanoid and ad hoc BLAST, on the number, size and content of the synteny blocks returned by OrthoCluster, using the genomes of Oryza sativa and Arabidopsis thaliana. InParanoid identified 22 124 orthologous relationships, while ad hoc BLAST identified 14 928. Using input from InParanoid, OrthoCluster identified 942 conserved synteny blocks, that is, blocks that contain no mismatches. These blocks cover 1234 genes (5.97 Mb) in O. sativa and 1403 genes (2.76 Mb) in A. thaliana. With input from ad hoc BLAST, OrthoCluster detected only 314 conserved synteny blocks, which cover 427 genes (2.3 Mb) in O. sativa and 435 genes (1.1 Mb) in A. thaliana. When mismatches were allowed within a synteny block, OrthoCluster identified 1510 non-conserved synteny blocks from InParanoid input, which cover 3509 genes (25.1 Mb) in O. sativa and 3648 genes (9.06 Mb) in A. thaliana. Only 589 non-conserved synteny blocks were detected using ad hoc BLAST input, covering 1335 genes (8.22 Mb) in O. sativa and 1257 genes (3.32 Mb) in A. thaliana.
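To make the notion of a conserved synteny block (one with no mismatches) concrete, the following is a minimal Python sketch on toy data. It is not OrthoCluster's algorithm and does not reproduce its input format. The function name `perfect_blocks`, the `min_size` parameter and the representation of each ortholog pair as a tuple of gene-order indices on one chromosome per species are all assumptions made for illustration, as is the choice to accept runs in reversed order.

```python
def perfect_blocks(pairs, min_size=2):
    """Group one-to-one ortholog pairs into perfectly collinear runs.

    pairs: iterable of (pos_a, pos_b), where pos_a and pos_b are
    hypothetical gene-order indices on one chromosome of species A
    and one chromosome of species B.
    A run is extended only while both genes are immediate neighbours
    of the previous pair, so no mismatch or gap is tolerated.
    """
    blocks = []
    current = []
    direction = 0  # +1 same order in B, -1 reversed order, 0 not yet known

    for pair in sorted(pairs):
        if current:
            step_a = pair[0] - current[-1][0]
            step_b = pair[1] - current[-1][1]
            adjacent = step_a == 1 and step_b in (1, -1)
            if adjacent and direction in (0, step_b):
                direction = step_b
                current.append(pair)
                continue
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
            current = []
            direction = 0
        current.append(pair)

    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Toy ortholog pairs: one forward run, one isolated pair, one reversed run.
toy_pairs = [(0, 10), (1, 11), (2, 12), (3, 40), (5, 30), (6, 29), (7, 28)]
toy_blocks = perfect_blocks(toy_pairs)
for block in toy_blocks:
    print(len(block), block)
```

On this toy input the sketch reports two blocks of three genes each and discards the isolated pair. A tool that tolerates mismatches, as in the non-conserved blocks described above, would need to relax the strict adjacency test.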