Accelerated Sequence Alignment
A major time factor in the everyday workflow of a molecular biologist is comparing sequences with each other and finding the best-fitting alignment. Because of their long execution times, high-quality alignment algorithms such as Smith-Waterman have almost been replaced by "average-quality yet fast" heuristic approaches such as BLAST.
With SciEngines computers and clusters, it is now possible to revive non-heuristic alignment algorithms and to give biologists back high-quality alignment results within short time frames. The RIVYERA also benefits heuristic approaches, as new levels of analysis can be reached. The common constraint "it's going to take too long to calculate" simply doesn't apply anymore with an FPGA cluster that provides the performance of thousands of PC cores.
SciEngines is excited to provide a highly efficient and scalable implementation of the Smith-Waterman algorithm for any computer of the RIVYERA line. With speeds of up to 6 TCUPS (tera cell updates per second) per computer, this exact, high-quality alignment method has become practical again, including in next-generation sequencing scenarios, and helps to avoid both the time-consuming elimination of false positives and the loss of crucial information.
Main features of the accelerated Smith-Waterman:
User-definable scoring matrix. Use NUC22 or your own.
Affine gap penalties supported.
Calculation of reverse complement supported.
Multiple query files and multiple database files can be specified for comparison.
Plain-text output as well as SAM output supported.
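To make the feature list concrete, the following is a minimal software sketch of what Smith-Waterman with affine gap penalties and reverse-complement search computes. It is not the RIVYERA implementation and says nothing about its command-line options. The function names, the default match/mismatch scores, the gap costs, and the optional matrix argument are all assumptions chosen for illustration. A gap of length k is assumed to cost gap_open + (k - 1) * gap_extend.

```python
# Illustrative sketch only: names and default scores are hypothetical,
# not taken from the SciEngines software.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def sw_affine_score(query, target, match=2, mismatch=-3,
                    gap_open=5, gap_extend=2, matrix=None):
    """Best local alignment score with affine gaps (Gotoh recurrences).

    matrix, if given, is a user-defined scoring table keyed by
    (query_base, target_base); otherwise match/mismatch is used.
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0] * (n + 1)        # best score ending at previous row
    f_prev = [neg_inf] * (n + 1)  # previous row, alignment ends in a vertical gap
    best = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # current row, alignment ends in a horizontal gap
        for j in range(1, n + 1):
            a, b = query[i - 1], target[j - 1]
            if matrix is not None:
                s = matrix[(a, b)]
            else:
                s = match if a == b else mismatch
            # Open a new gap or extend an existing one.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def best_strand_score(query, target, **scoring):
    """Score the query and its reverse complement, keep the better one."""
    forward = sw_affine_score(query, target, **scoring)
    reverse = sw_affine_score(reverse_complement(query), target, **scoring)
    return max(forward, reverse)
```

Every cell of the query-by-target matrix is updated once, which is why throughput for this algorithm is quoted in cell updates per second. The sketch returns only the score; producing plain-text or SAM records would additionally require a traceback step.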
The currently available version of Smith-Waterman provides a command-line interface. It uses the same commands as the NCBI versions of the software, so integrating this accelerated solution into your existing infrastructure and analysis pipelines is hassle-free.
Performance: With the features listed above enabled and run on a real-life dataset, the enormous speed of FPGA computers in comparison to regular hardware, as well as their linear scaling, becomes obvious. Using this high-quality alignment algorithm, only half a rack of RIVYERA S3-5000 is sufficient to align, in one day, more than 3.5 billion base pairs and their complements against, for example, hg19 chromosome 1, or more than 18 billion base pairs and their complements against hg19 chromosome 21.
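As a rough plausibility check of these figures, the sustained rate they imply can be estimated in a few lines. The sketch below assumes that every query base is compared against every reference base on both strands (a full dynamic-programming matrix), and it uses approximate hg19 chromosome lengths of about 249 million bases for chromosome 1 and about 48 million for chromosome 21. These lengths and the helper name are assumptions, not values from SciEngines.

```python
# Back-of-the-envelope estimate; chromosome lengths are approximate
# and the helper name is hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60

CHR1_BASES = 249_000_000   # approx. length of hg19 chromosome 1
CHR21_BASES = 48_000_000   # approx. length of hg19 chromosome 21


def required_cups(query_bases, reference_bases, strands=2,
                  seconds=SECONDS_PER_DAY):
    """Cell updates per second needed to fill the full alignment matrix."""
    cells = query_bases * strands * reference_bases
    return cells / seconds


chr1_tcups = required_cups(3.5e9, CHR1_BASES) / 1e12
chr21_tcups = required_cups(18e9, CHR21_BASES) / 1e12

print(f"chr1:  {chr1_tcups:.1f} TCUPS")
print(f"chr21: {chr21_tcups:.1f} TCUPS")
```

Under these assumptions both workloads come out at roughly 20 TCUPS sustained over one day, which illustrates the linear scaling: a reference about five times shorter allows about five times more query bases in the same time.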
Additional resources for download:
Press Release: CLC & SciEngines collaboration