Skip to content

Dekkon/DPP-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DPP-Project

GPU-parallel implementation of the DC3-algorithm for computing suffix-arrays. Implemented in Futhark as part of the course Data Parallel Programming.

Suffix Arrays

The suffix array of a string is a lexicographically sorted list of all suffixes. As an example, the suffix array of fastfuthark is

ark
astfuthark
fastfuthark
futhark
hark
k
rk
stfuthark
tfuthark
thark
uthark

Suffix arrays have wide uses: For example pattern matching in DNA sequence analysis, and n-gram processing for language models.

Benchmarks

Input Size of input (~Characters) PBBS: parallelKs (s) PBBS: parallelRange (s) Futhark: DC3 (s) Futhark: gfg (s)
Trigram String 100_000_000 2.121 0.645 1.289 14.333
DNA-sequence 34_000_000 0.893 0.311 0.588 4.709
Project Gutenberg 105_000_000 3.089 1.751 2.029 15.125
Wikipedia XML 100_000_000 2.733 1.804 1.822 14.276

The parallelKs in PBBS (C++) algorithm approximately corresponds to the DC3 algorithm in Futhark.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors