Data structure and algorithm analysis

Mapping – BWA-MEM

Mapping is a fundamental task in genomics. BWA-MEM is an implementation based on the FM-index data structure which used the Burrows-Wheeler transform (BWT). BWA-MEM is commonly used in many bioinformatics pipelines. The study showed that BWA-MEM’s data structure, the FM-index, is not adapted to PIM architecture.

Study report

Bloom Filters

Bloom filters are a well-known probabilistic set data structure used to compact data. In the context of genome sequencing, algorithms have to handle huge volumes of reads and perform computational analyses faster. Optimising this underlying data structure, which is inherently memory-bound, is a major challenge to accelerate further genomics applications. We design a C++ Bloom filter implementation on the UPMEM Processing-in-memory architecture. Our evaluation shows it provides interesting speedups compared to an efficient Bloom filter implementation running on CPU. For instance, inserting and querying 100 million items in a 1 GiB filter executes 2-3 times faster.

Study report – code link

Mapping – Minimap2

This work focus on profiling the widely used program Minimap2 to understand its bottlenecks and evaluate whether it could benefit from processing-in-memory (PIM) accelerators. We showed that the Minimap2 chaining stage becomes the most prevalent bottleneck when mapping reads with low divergence from the reference genome due to sequential constraints. Besides, the micro-instruction pipeline does not exhibit particular stalls regarding memory management. We conclude that the Minimap2 seed-chain-align paradigm may not be the best fit for an implementation on the UPMEM PIM architecture.

Study report

Genome matching – COBS

As bacterial databases are growing exponentially, finding genes or mutations require efficient and fast algorithms to perform approximate pattern matching of a query against a set of genomes. It is thus essential to assess the potential of modern processors and hardware accelerators for such task. We devise a proof-of-concept implementation on the UPMEM PiM architecture and compare it to a CPU-only equivalent that uses SIMD instructions. Our evaluation shows that the PiM program scales better as the database size increases, but suffers from the lack of vectorized instructions and fails to achieve significant speed-ups.

Study report
https://gitlab.inria.fr/pim/org.pim.gmatch

Comments are closed.