20 Jan 2017

ACCELERATION OF DISTANCE COMPUTATION FOR MULTIPLE SEQUENCE ALIGNMENT ON MULTI-CORE ARCHITECTURES.

  • Education College for Girls, University of Mosul, Mosul, IRAQ.
  • Department of Computer Science, University of Cihan/Sulaimanya, Kurdistan, Iraq.

High-quality multiple sequence alignment is an essential task in bioinformatics, but it has become a serious challenge because of the rapid growth in the amount of molecular data. The phase that consumes the most time and space is the distance matrix computation. This paper addresses the issue by proposing a vectorized parallel method that performs the large number of pairwise similarity comparisons faster and in linear space.
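To make the idea concrete, the following is a minimal sketch of a pairwise distance computation that keeps only two dynamic-programming rows per comparison, so each comparison needs memory linear in the sequence length. It is not the method of the paper: the choice of normalized edit distance as the dissimilarity measure, and the names `edit_distance`, `_pair_distance`, `distance_matrix` and `map_fn`, are assumptions made only for illustration. Note that the linear-space property here applies to each pairwise comparison; the resulting matrix still holds one value per pair.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two DP rows (memory linear in the shorter sequence)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence along the row
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr  # only the previous row is kept
    return prev[-1]


def _pair_distance(pair):
    """Edit distance normalized by the longer length (an assumed measure)."""
    a, b = pair
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def distance_matrix(seqs, map_fn=map):
    """Fill a symmetric matrix; each pair is independent of all the others."""
    n = len(seqs)
    index_pairs = list(combinations(range(n), 2))
    values = map_fn(_pair_distance, [(seqs[i], seqs[j]) for i, j in index_pairs])
    dist = [[0.0] * n for _ in range(n)]
    for (i, j), d in zip(index_pairs, values):
        dist[i][j] = dist[j][i] = d
    return dist
```

Because every pair is computed independently, the default `map` could be replaced by the `map` method of a `multiprocessing.Pool` to spread the comparisons over several cores; that substitution is a suggestion, not something described in the abstract.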



[Mohammed W. Al-Neama and Kasim A. Al-Salem. (2017); ACCELERATION OF DISTANCE COMPUTATION FOR MULTIPLE SEQUENCE ALIGNMENT ON MULTI-CORE ARCHITECTURES. Int. J. of Adv. Res. 5 (Jan). 1239-1245] (ISSN 2320-5407). www.journalijar.com


Mohammed W. Al-Neama
Education College for Girls, Mosul University, Mosul, IRAQ.

Article DOI: 10.21474/IJAR01/2877      
DOI URL: http://dx.doi.org/10.21474/IJAR01/2877