Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Introduces a new hash-sorted-map crate implementing the first part of an insertion-only HashSortedMap with SIMD-accelerated group scanning and overflow chaining, plus accompanying benchmarks and design documentation.
Changes:
- Adds core
HashSortedMapimplementation (insert/get/entry API) with growth and overflow chaining. - Adds platform-specific group scanning primitives (SSE2/NEON/scalar) and a basic test suite.
- Adds a Criterion benchmarks crate and initial README/optimization docs.
Show a summary per file
| File | Description |
|---|---|
| crates/hash-sorted-map/src/lib.rs | Exposes group_ops and hash_sorted_map modules. |
| crates/hash-sorted-map/src/hash_sorted_map.rs | Implements the HashSortedMap data structure, growth logic, and Entry API, plus tests. |
| crates/hash-sorted-map/src/group_ops.rs | Adds SIMD/scalar helpers for ctrl-byte matching and mask iteration. |
| crates/hash-sorted-map/benchmarks/performance.rs | Adds Criterion benchmarks comparing multiple map implementations. |
| crates/hash-sorted-map/benchmarks/lib.rs | Benchmark utilities (trigram generation, identity hasher). |
| crates/hash-sorted-map/benchmarks/Cargo.toml | Defines the benchmarks crate and its dependencies. |
| crates/hash-sorted-map/README.md | Documents motivation/design and includes benchmark results + running instructions. |
| crates/hash-sorted-map/OPTIMIZATIONS.md | Design/optimization analysis and rationale. |
| crates/hash-sorted-map/Cargo.toml | Declares the new hash-sorted-map crate metadata. |
| Cargo.toml | Adds the benchmarks crate to the workspace members. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (3)
crates/hash-sorted-map/src/hash_sorted_map.rs:560
insert_after_growtreatsInsertion::NeedsOverflowas unreachable after a resize, but it can still happen for adversarial / extremely-colliding hash distributions (the table may still need overflow groups, even immediately after growing). Instead ofunreachable!, handleNeedsOverflowby allocating/linking an overflow group (with the same capacity check as the normal insert path) or by retrying via the normalinsert_hashedlogic.
// After grow, the new primary group for `key` cannot be full (see
// function docs), and the key wasn't in the table before grow.
FindResult::Vacant(Insertion::NeedsOverflow { .. }) | FindResult::Found(_) => {
unreachable!("post-grow walk must hit an empty slot")
}
crates/hash-sorted-map/README.md:90
- The benchmark invocation doesn’t match the benchmark that’s defined in
benchmarks/Cargo.toml([[bench]] name = "performance").cargo bench --bench hashmap_insertwill fail; update the README to the correct command (likelycargo bench -p hash-sorted-map-benchmarks --bench performance, or run from thebenchmarks/crate).
```sh
cargo bench --bench hashmap_insert
**crates/hash-sorted-map/src/hash_sorted_map.rs:367**
* `insert_for_grow` allocates overflow groups without checking `self.num_groups` against `self.groups.len()`. Under heavy hash collisions during rehash, `new_gi` can reach `self.groups.len()` and the subsequent indexing will panic. Consider mirroring the capacity check used in `insert_hashed` (e.g., trigger another `grow()` / allocate more groups) so growth remains panic-free even for pathological hash distributions.
} else {
let new_gi = self.num_groups as usize;
self.groups[gi].overflow = new_gi as u32;
self.num_groups += 1;
group = &mut self.groups[new_gi];
break;
</details>
- **Files reviewed:** 10/10 changed files
- **Comments generated:** 3
| │ Vec<Group<K,V>> where each Group (AoS): │ | ||
| │ { ctrl: [u8; 8], keys: [MaybeUninit<K>; 8], │ | ||
| │ values: [MaybeUninit<V>; 8], overflow: u32 } │ | ||
| │ │ | ||
| │ • Overflow chaining (linked groups) │ | ||
| │ • 8-byte groups with NEON/SSE2/scalar SIMD scan │ | ||
| │ • EMPTY / FULL tag states only (insertion-only, no deletion) │ |
There was a problem hiding this comment.
This architecture diagram hard-codes 8-slot groups (ctrl: [u8; 8], keys: ...; 8], etc.) and says “8-byte groups with NEON/SSE2”, but the implementation uses GROUP_SIZE = 16 on x86_64. Please update the documentation to reflect the 8-or-16 group size (or describe it as GROUP_SIZE).
jorendorff
left a comment
There was a problem hiding this comment.
This is cool.
Can I see the merge/iteration operations you have in mind, before stamping this?
| scanning the group. Gives a direct hit on most inserts at low load. | ||
| - **SIMD group scanning** — uses NEON on aarch64, SSE2 on x86\_64, and a | ||
| scalar fallback elsewhere to scan 8–16 control bytes in parallel. | ||
| - **AoS group layout** — each group stores its control bytes, keys, and values |
There was a problem hiding this comment.
Suggest linking Wikipedia here: https://en.wikipedia.org/wiki/AoS_and_SoA#Array_of_structures_of_arrays
Wikipedia calls this AoSoA (since each Group is a struct of arrays) or "tiled AoS"
| } | ||
|
|
||
| pub fn with_capacity_and_hasher(capacity: usize, hash_builder: S) -> Self { | ||
| let adjusted = capacity.checked_mul(8).unwrap_or(usize::MAX) / 7; |
There was a problem hiding this comment.
I think we need to adjust more than that, because the expected load factor is not as high as hashbrown.
I did a few simulations and we need to multiply by about 1.30 here, or 1.42 if GROUP_SIZE is 8. That is assuming alloc_groups overallocates by a further 1/8 and that we want with_capacity to be an honest effort to accommodate that many entries without growing.
There was a problem hiding this comment.
👍
(I had copilot compute those numbers as well, or at least something similar, just forgot to change the code)
Co-authored-by: Jason Orendorff <jorendorff@github.com>
|
Here is the very simple sorting of each group: In the ported version, we will need to introduce a new type, since the hash map will be broken after the sorting. Merging is implemented here: We could provide a helper function in the rust crate for this. Feel free to merge/make changes, since I will be OOO tomorrow. |
In a subsequent step, merging functions and sorted iterators will be added.