Commit c1747e0

Use suffix_array with cdivsufsort instead of psacak:
- Up to 200% performance boost of SAIS vs pSACAK in Rust vs Rust
- Up to 15% performance boost of SAIS vs SAIS in Rust vs Go
1 parent e3b1e51 commit c1747e0

3 files changed: 13 additions, 10 deletions

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -13,11 +13,11 @@ homepage = "https://spectre-network.org"

 [dependencies]
 fnv = "1.0.7"
-psacak = "0.1.0"
 rc4 = "0.1.0"
 salsa20 = "0.10.2"
 sha2 = "0.10.8"
 siphasher = "1.0.1"
+suffix_array = "0.5.0"
 xxhash-rust = { version = "0.8.10", features = ["xxh64"] }

 [dev-dependencies]

README.md

Lines changed: 9 additions & 6 deletions
@@ -34,13 +34,16 @@ The original algorithm utilized the [SA-IS](https://en.wikipedia.org/wiki/Suffix
 sorting algorithm. There exists an enhanced one with [SACA-K](https://www.sciencedirect.com/science/article/abs/pii/S0020019016301375)
 for induced sorting, improving the linear-time complexity to be
 in-place for constant alphabets. However, this remains a single-core
-variant. Our AstroBWTv3 implementation has switched to
-[pSACAK](https://ieeexplore.ieee.org/document/8371211), a fast
-linear-time, in-place parallel algorithm that leverages multi-core
-machines. It is fully compatible with the original AstroBWTv3 Suffix
-Array.
+variant. The [pSACAK](https://ieeexplore.ieee.org/document/8371211)
+algorithm offers a fast, linear-time, in-place parallel solution that
+utilizes multi-core machines. However, testing revealed that it still
+requires optimization. It is fully compatible with the original
+AstroBWTv3 Suffix Array.

-There are still numerous opportunities to enhance the computation of
+Our AstroBWTv3 implementation is using `cdivsufsort` as it is still
+the fastest single threaded Suffix Array construction implementation.
+
+There are numerous opportunities to enhance the computation of
 AstroBWTv3 hashes, including:

 * Replacing most steps with highly optimized inline assembler code on
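
Reviewer note: the README hunk says the suffix array stays compatible with the original AstroBWTv3 one. When swapping construction backends, one way to check that claim is to compare against a slow reference. The sketch below is not part of this commit; `naive_suffix_array` and `is_valid_suffix_array` are hypothetical helper names, and the O(n² log n) sort is only suitable for tests.

```rust
/// Naive reference suffix array: sort every suffix start position by the
/// bytes that follow it. Far too slow for hashing, useful as a test oracle.
fn naive_suffix_array(data: &[u8]) -> Vec<u32> {
    let mut sa: Vec<u32> = (0..data.len() as u32).collect();
    sa.sort_by(|&a, &b| data[a as usize..].cmp(&data[b as usize..]));
    sa
}

/// Check that `sa` is a permutation of 0..data.len() whose suffixes are in
/// strictly increasing lexicographic order.
fn is_valid_suffix_array(data: &[u8], sa: &[u32]) -> bool {
    if sa.len() != data.len() {
        return false;
    }
    let mut seen = vec![false; data.len()];
    for &i in sa {
        let i = i as usize;
        if i >= data.len() || seen[i] {
            return false;
        }
        seen[i] = true;
    }
    sa.windows(2)
        .all(|w| &data[w[0] as usize..] < &data[w[1] as usize..])
}
```

For example, `naive_suffix_array(b"banana")` yields the start positions of `a`, `ana`, `anana`, `banana`, `na`, `nana` in that order, and the output of any faster backend can be passed through `is_valid_suffix_array` on the same input.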

src/astrobwtv3.rs

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ use sha2::Digest;
 use sha2::Sha256;
 use siphasher::sip::SipHasher24;
 use std::hash::Hasher;
-use psacak::psacak;
+use suffix_array::SuffixArray;

 // This is the maximum.
 const MAX_LENGTH: u32 = (256 * 384) - 1;
@@ -2457,9 +2457,9 @@ pub fn astrobwtv3_hash(input: &[u8]) -> [u8; 32] {
     let data_len = (tries - 4) as u32 * 256 + (((data[253] as u64) << 8 | (data[254] as u64)) as u32 & 0x3ff);

     // Step 6: build our suffix array.
-    let scratch_sa = psacak(&scratch_data[..data_len as usize]);
+    let scratch_sa = SuffixArray::new(&scratch_data[..data_len as usize]);
     let mut scratch_sa_bytes: Vec<u8> = vec![];
-    for vector in &scratch_sa {
+    for vector in &scratch_sa.into_parts().1[1..(data_len as usize + 1)] {

         // Little and big endian.
         if cfg!(target_endian = "little") {
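
Reviewer note: the new loop iterates over `[1..(data_len as usize + 1)]`. That suggests the index vector from `into_parts()` holds `data_len + 1` entries with one extra leading entry that the old `psacak` output did not have. A common convention is that this entry is the empty suffix, which sorts before all others. This is an assumption read off the slice bounds in the diff, not a statement about the crate's documented layout. The sketch below reproduces that layout with a naive std-only construction. Both function names are hypothetical, and the little-endian serialization is an illustrative choice, since the loop body is not visible in this hunk.

```rust
/// Naive suffix array that also ranks the empty suffix (index `data.len()`).
/// The empty suffix compares less than every non-empty suffix, so it always
/// ends up as entry 0 of the result.
fn suffix_array_with_empty_suffix(data: &[u8]) -> Vec<u32> {
    let mut sa: Vec<u32> = (0..=data.len() as u32).collect();
    sa.sort_by(|&a, &b| data[a as usize..].cmp(&data[b as usize..]));
    sa
}

/// Skip the leading empty-suffix entry, then append each remaining index as
/// four little-endian bytes (byte order chosen here for illustration only).
fn indices_to_le_bytes(sa_with_sentinel: &[u32], data_len: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(data_len * 4);
    for index in &sa_with_sentinel[1..data_len + 1] {
        out.extend_from_slice(&index.to_le_bytes());
    }
    out
}
```

Under that assumption, dropping entry 0 leaves exactly `data_len` indices, which matches the length of the array the previous `psacak` call returned for the same input slice.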
