Parallel Multi-BRWT query with one traversal by karasikov · Pull Request #559 · ratschlab/metagraph

karasikov · 2025-11-03T23:16:26Z

Added parallel querying to BRWT
Enabled parallel batch queries of annotations with counts
Refactoring and other improvements

Status of parallelization for different query setups with batch query:

| Annotation type \ Query type | matches | counts | coords |
|---|---|---|---|
| basic | ✅ | n/a | n/a |
| with counts | ✅ | ❌ -> ✅ | n/a |
| with coords | ✅ | ❌ -> ✅ | - (always non-batch) |

adamant-pwn

Thanks for the PR, it's nice to make it faster! Also sorry for the delay with review, the PR is quite large 👀

adamant-pwn · 2026-02-20T16:55:12Z

metagraph/src/annotation/binary_matrix/base/binary_matrix.cpp


+template <typename T>
+std::vector<T>
+BinaryMatrix::get_rows_parallel(const std::vector<Row> &rows, size_t num_threads,


I feel like the naming is somewhat confusing here. If it is just get_rows_..., shouldn't it be a (non-static) member function that calls get_rows directly?

But it looks like you also use this function with some other callers, where get_rows is actually something else, like get_row_values or get_column_ranks.

Maybe name it get_row_data_parallel, or similar?
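To illustrate the naming point, here is a minimal sketch of a helper that is generic over the per-batch getter, so the same code can wrap get_rows, get_row_values, or get_column_ranks. The name get_row_data_parallel, the Row alias, the callback signature, and the batching scheme are all assumptions made for illustration; this is not the code from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Row = uint64_t;  // assumption: stands in for BinaryMatrix::Row

// Hypothetical helper: splits `rows` into contiguous batches, calls
// `get_batch` on each batch, and stores the results in input order.
// T must not be bool (std::vector<bool> is not safe for concurrent writes).
// Compile with -fopenmp to run batches in parallel; without it the pragma
// is ignored and the loop runs serially.
template <typename T>
std::vector<T> get_row_data_parallel(
        const std::vector<Row> &rows,
        size_t num_threads,
        const std::function<std::vector<T>(const std::vector<Row> &)> &get_batch) {
    std::vector<T> result(rows.size());
    if (rows.empty())
        return result;

    num_threads = std::max<size_t>(1, num_threads);
    // Ceiling division: at most num_threads batches cover all rows.
    const size_t batch = (rows.size() + num_threads - 1) / num_threads;
    const size_t num_batches = (rows.size() + batch - 1) / batch;

    #pragma omp parallel for num_threads(num_threads) schedule(dynamic)
    for (size_t b = 0; b < num_batches; ++b) {
        const size_t begin = b * batch;
        const size_t end = std::min(rows.size(), begin + batch);
        std::vector<Row> slice(rows.begin() + begin, rows.begin() + end);
        std::vector<T> part = get_batch(slice);
        assert(part.size() == slice.size());
        // Each batch writes to a disjoint range of `result`.
        std::move(part.begin(), part.end(), result.begin() + begin);
    }
    return result;
}
```

A caller would pass the concrete getter as a lambda and spell out T explicitly, e.g. `get_row_data_parallel<uint64_t>(rows, 4, getter)`, since T cannot be deduced through std::function.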

adamant-pwn · 2026-02-24T17:15:31Z

metagraph/src/common/threads/threading.hpp

+                             size_t max_chunk_size,
+                             size_t num_threads,
+                             bool one_chunk_if_single_thread = true) {
+    num_threads = std::max<size_t>(1, num_threads);


Why and when do we ever have num_threads = 0?

adamant-pwn · 2026-02-24T17:16:01Z

metagraph/src/common/threads/threading.hpp

+                             size_t num_threads,
+                             bool one_chunk_if_single_thread = true) {
+    num_threads = std::max<size_t>(1, num_threads);
+    return (one_chunk_if_single_thread && num_threads < 2)


Suggested change

-    return (one_chunk_if_single_thread && num_threads < 2)
+    return (one_chunk_if_single_thread && num_threads == 1)

adamant-pwn · 2026-02-24T17:18:49Z

metagraph/src/annotation/binary_matrix/base/binary_matrix.cpp


-    size_t batch_size = std::min(kRowBatchSize,
-                                 (codes.size() + num_threads - 1) / num_threads);
+    const size_t batch_size = std::max<size_t>(1, std::min(kRowBatchSize, codes.size() / num_threads));


Why not call get_chunk_size?
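For reference, the two batch-size formulas from the hunk above can be compared side by side. This is only a sketch: the value of kRowBatchSize is a placeholder (the real constant is not shown in this diff), and the function names are made up for the comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Placeholder value; the real kRowBatchSize is not shown in this diff.
constexpr size_t kRowBatchSize = 10000;

// Removed line: ceiling division, capped at kRowBatchSize.
// Returns 0 when num_codes == 0.
size_t batch_size_old(size_t num_codes, size_t num_threads) {
    assert(num_threads > 0);
    return std::min(kRowBatchSize, (num_codes + num_threads - 1) / num_threads);
}

// Added line: floor division, capped at kRowBatchSize, never below 1.
size_t batch_size_new(size_t num_codes, size_t num_threads) {
    assert(num_threads > 0);
    return std::max<size_t>(1, std::min(kRowBatchSize, num_codes / num_threads));
}
```

With 10 codes and 4 threads the old formula gives 3 and the new one gives 2; with an empty input the old formula gives 0 while the new one gives 1.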

adamant-pwn · 2026-02-24T17:21:00Z

metagraph/src/annotation/binary_matrix/base/binary_matrix.cpp

 }

+std::vector<BinaryMatrix::SetBitPositions>
+RowMajor::get_rows(const std::vector<Row> &row_ids, size_t num_threads) const {


Why doesn't this version use get_rows_parallel?

adamant-pwn · 2026-02-24T18:36:12Z

metagraph/src/annotation/binary_matrix/row_diff/row_diff.hpp

-    rd_ids = std::vector<Row>();
+    std::vector<BinaryMatrix::Row>().swap(rd_ids);
+
+    #pragma omp parallel for num_threads(num_threads) schedule(dynamic, 1000)


Should 1000 be named?

adamant-pwn · 2026-02-24T18:59:08Z

metagraph/src/cli/query.cpp

    std::atomic<uint64_t> num_found_kmers = 0;
-    #pragma omp parallel for num_threads(num_threads) schedule(dynamic, 100)
+    assert(num_threads);
+    size_t chunk_size = min<size_t>(max<size_t>(1, contigs.size() / (num_threads * 10)), 100);


Let's make 10 and 100 named constants?
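Something along these lines, as a sketch only. The constant names and the function name are hypothetical; the values 10 and 100 and the formula are taken from the hunk above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical names for the two literals in the hunk above.
constexpr size_t kChunksPerThread = 10;      // target number of chunks per thread
constexpr size_t kMaxContigChunkSize = 100;  // upper bound on contigs per chunk

// Same computation as in the diff, with the literals named.
size_t contig_chunk_size(size_t num_contigs, size_t num_threads) {
    assert(num_threads > 0);
    return std::min<size_t>(
        std::max<size_t>(1, num_contigs / (num_threads * kChunksPerThread)),
        kMaxContigChunkSize);
}
```

For example, 400 contigs on 4 threads give a chunk size of 10, while very small inputs are clamped to 1 and very large ones to 100.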

adamant-pwn · 2026-02-24T19:01:50Z

metagraph/src/cli/server.cpp

    tsl::hopscotch_map<std::string, std::vector<std::pair<std::string, std::string>>> indexes;

-    ThreadPool graphs_pool(get_num_threads());
+    ThreadPool graphs_pool(get_num_threads(), 1000);


Suggested change

-    ThreadPool graphs_pool(get_num_threads(), 1000);
+    ThreadPool graphs_pool(get_num_threads(), /*max_num_tasks*/1000);

adamant-pwn · 2026-02-24T19:13:09Z

metagraph/CMakeLists.txt

+if(APPLE)
+  if(${CMAKE_OSX_ARCHITECTURES} MATCHES "arm64")
+    set(HOMEBREW_DIR /opt/homebrew)
+  else()
+    set(HOMEBREW_DIR /usr/local)
+  endif()
+endif()


Suggested change

-if(APPLE)
-  if(${CMAKE_OSX_ARCHITECTURES} MATCHES "arm64")
-    set(HOMEBREW_DIR /opt/homebrew)
-  else()
-    set(HOMEBREW_DIR /usr/local)
-  endif()
-endif()
+if(APPLE)
+  execute_process(COMMAND brew --prefix
+                  OUTPUT_VARIABLE HOMEBREW_DIR
+                  OUTPUT_STRIP_TRAILING_WHITESPACE)
+endif()

Would something like this work? I feel like it's a semantically better approach than guessing from the architecture.

adamant-pwn · 2026-02-24T19:14:12Z

metagraph/FindJemalloc.cmake

+elseif(${CMAKE_SYSTEM_PROCESSOR} MATCHES "x86_64")
+  set(HOMEBREW_DIR /usr/local)
+else()
+  set(HOMEBREW_DIR ~/.linuxbrew)


See the comment in the main CMakeLists.txt file.

karasikov added 13 commits October 25, 2025 22:08

parallel brwt query

0956cde

use omp tasks

5c8cd36

sort rows outside of the critical section

3aed51b

print the number of sliced rows

093d3d8

Merge remote-tracking branch 'origin/master' into mk/brwt

13779cd

use 4 threads, query the largest child inline

9cedae7

use thread pool

94f67b7

minor

831fc0d

join threads in thread pool only when all workers are done

48de715

20 blocks max

078238a

minor

21b00ac

minor

efebc54

query with all threads, also rd paths

f2516a2

adamant-pwn reviewed Nov 4, 2025

karasikov added 13 commits November 27, 2025 15:10

added multithreaded get_rows

5884217

Merge remote-tracking branch 'origin/master' into mk/brwt

dac113c

parallel get_rd_ids

a4bf47b

minor

b024867

cleanup

83a5f98

cleanup, optimization

36fdf9d

don't store skip_row bitmaps, other cleanups

3bb15c5

cleanup

25b98f1

fix

bad29b5

minor

d551f88

minor

7878fbe

revert

27bf9fd

removed redundant critical section

4c90815

karasikov requested a review from adamant-pwn November 27, 2025 18:34

karasikov added 2 commits February 16, 2026 20:18

Merge remote-tracking branch 'origin/master' into mk/brwt

64463ce

update

ae94b54

karasikov added 20 commits February 19, 2026 20:58

fix: first step without diff with zero (critical for coords)

c5189cc

more safety

dbebd61

minor

fc124f1

fixed build with both x86_64 and arm64 brew installed on mac

baf2654

minor

1a9569e

minor

f9c414a

fixed block sizes

cb19f64

support flag --threads-each in server mode to query each graph with multiple threads

e9d947e

break down long contigs into segments for better omp task balancing

51b2472

fix: set parallel_each to 1 during query to calculate threads correctly

470d836

make chunk size +1 on overflow

da01e04

more logging

3459081

set a larger chunk size, otherwise tests are too long

79ac040

even larger chunk size

ea78102

set madvice for DBGSuccinct loaded with mmap

bda8e01

set MADV_WILLNEED for the index of suffix ranges

ccb4ef4

load all graphs in parallel

2e9d249

minor

6ded0d2

free memory allocated in std::vector with guarantee

d58cc48

fix

6ac5568

adamant-pwn reviewed Feb 24, 2026

karasikov added 9 commits February 25, 2026 01:00

log size of queried rows in BRWT

267584c

tight vectors

dfcb27d

updated sdsl-lite: smaller footprint of select_support_mcl in RAM loaded with mmap

afb5c5b

fix

2d635e2

removed unused variable

129a285

up

9a42016

up

d5c21b8

up

e493c22

reserve allocates more in small_vector -> create a new vector instead

c90560e
