Journal of
Bioinformatics and Sequence Analysis

  • Abbreviation: J. Bioinform. Seq. Anal.
  • Language: English
  • ISSN: 2141-2464
  • DOI: 10.5897/JBSA
  • Start Year: 2009
  • Published Articles: 49

Full Length Research Paper

Information fusion and multiple classifiers for haplotype assembly problem from SNP fragments and related genotype

M. Hossein Moeinzadeh1 and Ehsan Asgarian2*
1Department of Computer Science, University of Tehran, Iran. 2Department of Computer Engineering, Sharif University of Technology, Tehran, Iran.
Email: [email protected]

  •  Accepted: 28 April 2011
  •  Published: 31 May 2011

Abstract

 

Most positions of the human genome are invariant across individuals (about 99%); only a small fraction (about 1%) are commonly variant, and these variant positions are associated with complex genetic diseases. Because of this variation in the human genome, haplotype information has become increasingly important in analyzing fine-scale molecular genetics data. Single nucleotide polymorphisms (SNPs) are the most frequent form of genetic difference and are widely used to study genetic diseases. Haplotype assembly divides aligned SNP fragments into two classes and infers a pair of haplotypes from them. Minimum error correction (MEC) is an important model for this problem, but it is effective only when the error rate of the fragments is low. MEC/GI, an extension of MEC, employs the related genotype information in addition to the SNP fragments and so yields a more accurate inference. Because the haplotyping problem is NP-hard, it may have no efficient algorithm for an exact solution. In this paper, we focus on designing serial and parallel multiple-classifier systems, each built from two classifiers: a genetic algorithm and K-means. This combination helps to cover the weaknesses of each single classifier.

 

Key words: Multiple classifier systems, parallel classifiers, serial classifiers, haplotype, SNP fragments, genotype information, classification, reconstruction rate.
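
To make the MEC objective described in the abstract concrete, the following is a minimal Python sketch that evaluates one fixed two-class partition of SNP fragments. It is an illustration under stated assumptions, not the method of the paper: the encoding of fragments as strings over '0', '1' and '-' (for an uncovered site), the tie-breaking rule, and the function names consensus, distance and mec_score are all hypothetical. The MEC model itself seeks the partition that minimizes this score; the sketch only scores a given partition.

```python
from typing import List, Tuple

GAP = "-"  # assumed symbol for a SNP site that a fragment does not cover


def consensus(fragments: List[str], n_sites: int) -> str:
    """Majority vote per SNP site over one class of fragments (ties give '0')."""
    hap = []
    for j in range(n_sites):
        ones = sum(1 for f in fragments if f[j] == "1")
        zeros = sum(1 for f in fragments if f[j] == "0")
        hap.append("1" if ones > zeros else "0")
    return "".join(hap)


def distance(fragment: str, hap: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for a, b in zip(fragment, hap) if a != GAP and a != b)


def mec_score(fragments: List[str], labels: List[int]) -> Tuple[str, str, int]:
    """Return both consensus haplotypes and the number of corrections
    needed to make every fragment agree with the haplotype of its class."""
    n_sites = len(fragments[0])
    class0 = [f for f, lab in zip(fragments, labels) if lab == 0]
    class1 = [f for f, lab in zip(fragments, labels) if lab == 1]
    haps = (consensus(class0, n_sites), consensus(class1, n_sites))
    score = sum(distance(f, haps[lab]) for f, lab in zip(fragments, labels))
    return haps[0], haps[1], score


if __name__ == "__main__":
    # Toy data (invented for illustration): five fragments over five SNP sites.
    fragments = ["0011-", "00-10", "1100-", "-1001", "0111-"]
    labels = [0, 0, 1, 1, 0]  # one candidate partition into two classes
    print(mec_score(fragments, labels))
```

A search procedure such as a genetic algorithm or K-means would propose different label vectors and keep those with a lower score; how the two are combined serially or in parallel is specific to the paper and is not reproduced here.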