SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

Enzo Rucci; Carlos Garcia; Guillermo Botella; Armando De Giusti; Marcelo Naiouf; Manuel Prieto-Matias

doi:10.1186/s12918-018-0614-6

BMC Syst Biol. 2018 Nov 20;12(Suppl 5):96. doi: 10.1186/s12918-018-0614-6.

Authors

Enzo Rucci¹, Carlos Garcia², Guillermo Botella², Armando De Giusti³, Marcelo Naiouf⁴, Manuel Prieto-Matias²

Affiliations

¹ III-LIDI, CONICET, Facultad de Informática, Universidad Nacional de La Plata, La Plata (Buenos Aires), 1900, Argentina. erucci@lidi.info.unlp.edu.ar.
² Depto. Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, Madrid, 28040, Spain.
³ III-LIDI, CONICET, Facultad de Informática, Universidad Nacional de La Plata, La Plata (Buenos Aires), 1900, Argentina.
⁴ III-LIDI, Facultad de Informática, Universidad Nacional de La Plata, La Plata (Buenos Aires), 1900, Argentina.

Abstract

Background: The Smith-Waterman (SW) algorithm is the best choice for searching for similar regions between two DNA or protein sequences. However, it may become impractical in some contexts due to its high computational demands. Consequently, the computer science community has focused on the use of modern parallel architectures such as Graphics Processing Units (GPUs), Xeon Phi accelerators and Field Programmable Gate Arrays (FPGAs) to speed up large-scale workloads.
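The computational demand mentioned above comes from the SW recurrence, which fills one cell for every pair of positions in the two sequences. The following is a minimal Python sketch of the score-only computation, not the SWIFOLD kernel. The function name `sw_score`, the linear gap model and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions; the abstract does not state the scoring scheme used.

```python
def sw_score(a, b, match=1, mismatch=-3, gap=2):
    """Return the best local alignment score of sequences a and b.

    Assumptions (not taken from the paper): a linear gap penalty and
    the default scoring values above. Only two rows of the matrix are
    kept, so memory grows with len(b) while the work grows with
    len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the score matrix
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] - gap        # gap in b
            left = curr[j - 1] - gap  # gap in a
            h = max(0, diag, up, left)  # scores never drop below 0
            curr.append(h)
            if h > best:
                best = h
        prev = curr
    return best


if __name__ == "__main__":
    print(sw_score("TTACGTTT", "GGACGGG"))
```

Every cell depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal are independent of each other. This is the property that parallel implementations of SW typically exploit.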

Results: This paper presents and evaluates SWIFOLD: a Smith-Waterman parallel Implementation on FPGA with OpenCL for Long DNA sequences. First, we evaluate its performance and resource usage for different kernel configurations. Next, we carry out a performance comparison between our tool and other state-of-the-art implementations considering three different datasets. SWIFOLD offers the best average performance for small and medium test sets, achieving a performance that is independent of input size and sequence similarity. In addition, SWIFOLD provides competitive performance rates in comparison with GPU-based implementations on the latest GPU generation for the large dataset.

Conclusions: The results suggest that SWIFOLD can be a serious contender for accelerating the SW alignment of DNA sequences of unrestricted size in an affordable way, reaching 125 GCUPS on average and a peak of almost 270 GCUPS.
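GCUPS (giga cell updates per second) is the usual throughput metric for SW implementations: the number of matrix cells computed, divided by the run time in seconds and by 10^9. A small sketch of the arithmetic follows. The function name `gcups` and the sequence lengths and timing in the usage line are hypothetical values chosen for illustration, not measurements from the paper.

```python
def gcups(query_len, db_len, seconds):
    """Giga cell updates per second for one pairwise SW alignment.

    The matrix has query_len * db_len cells; dividing by the elapsed
    time and by 1e9 gives GCUPS.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (query_len * db_len) / (seconds * 1e9)


if __name__ == "__main__":
    # Hypothetical example: two 1,000,000-base sequences aligned in 8 s.
    print(gcups(1_000_000, 1_000_000, 8.0))
```

Under these assumed inputs, two sequences of one million bases each aligned in 8 seconds correspond to 125 GCUPS.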

Keywords: DNA; FPGA; High-performance computing; OpenCL; Smith-Waterman.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Base Sequence*
Computational Biology
DNA / chemistry
Sequence Alignment / methods*
Software*

Substances

DNA