Skip to content

feat(datastructures, hashmap): design a hashmap#182

Merged
BrianLusina merged 7 commits intomainfrom
feat/datastructures-hashmap
Feb 27, 2026
Merged

feat(datastructures, hashmap): design a hashmap#182
BrianLusina merged 7 commits intomainfrom
feat/datastructures-hashmap

Conversation

@BrianLusina
Copy link
Owner

@BrianLusina BrianLusina commented Feb 27, 2026

Describe your change:

HashMap datastructure using buckets with configurable key space

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms have a URL in its comments that points to Wikipedia or other similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

Summary by CodeRabbit

  • New Features

    • Added Jewels and Stones solution and expanded search- and string-related algorithm entries across the library.
  • Documentation

    • Major enhancement of the HashMap design guide and added comprehensive problem documentation for Jewels and Stones.
    • Directory updated with many new puzzle/algorithm listings and removal of some older cryptography and puzzle entries.
  • Tests

    • Added and updated tests to cover new algorithm implementations and hashmap behaviour.

@BrianLusina BrianLusina self-assigned this Feb 27, 2026
@BrianLusina BrianLusina added enhancement Algorithm Algorithm Problem Datastructures Datastructures Documentation Documentation Updates Array Array data structure Hash Map Hash Map Data structure Design labels Feb 27, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78a2a3f and 060ad3c.

📒 Files selected for processing (3)
  • algorithms/hash_table/jewels_and_stones/__init__.py
  • datastructures/hashmap/README.md
  • datastructures/hashmap/hash_map.py

📝 Walkthrough

Walkthrough

Adds a new "Jewels and Stones" algorithm (two implementations, README, tests), refactors HashMap into modular components (Item, Bucket, HashMap) with updated exports and tests, enhances HashMap README, updates DIRECTORY entries, and adds a minor type annotation for Caesar cipher.

Changes

Cohort / File(s) Summary
Jewels and Stones
algorithms/hash_table/jewels_and_stones/__init__.py, algorithms/hash_table/jewels_and_stones/README.md, algorithms/hash_table/jewels_and_stones/test_jewels_and_stones.py
Added two implementations (set-based and counter-based), comprehensive README documenting problem, algorithm and complexity, and parameterised tests covering multiple cases.
HashMap Modular Refactor
datastructures/hashmap/__init__.py, datastructures/hashmap/item.py, datastructures/hashmap/bucket.py, datastructures/hashmap/hash_map.py
Split prior HashMap implementation into modules: Item model, Bucket for per-bucket storage, HashMap core class; updated package __init__ to export HashMap and Item.
HashMap Docs & Tests
datastructures/hashmap/README.md, datastructures/hashmap/test_hashmap.py
Replaced README with full design narrative, API surface and complexity notes; tests updated to use public set/get/remove API (no direct table access).
Strings
algorithms/strings/caeser_cipher/__init__.py
Added type annotation for encode(self, plaintext: str) parameter.
Directory Listing
DIRECTORY.md
Expanded directory with many new entries (Jewels and Stones, multiple Binary Search problems, Strings items, HashMap components) and removed some cryptography/puzzle test entries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~23 minutes

Possibly related PRs

Poem

🐰 I hopped through buckets, keys in a line,
Jewels counted bright, in stones they shine,
Items tucked neatly in modular rows,
HashMap refactored — how proud my nose! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.58% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: designing a HashMap datastructure with bucket-based implementation.
Description check ✅ Passed The description provides a clear change summary and completes all checklist items except the optional issue-resolution item, demonstrating alignment with repository contribution guidelines.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/datastructures-hashmap

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Feb 27, 2026

Warnings
⚠️ ❗ Big PR

: Pull Request size seems relatively large. If Pull Request contains multiple changes, split each into separate PR will helps faster, easier review.

Generated by 🚫 dangerJS against 060ad3c

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (4)
algorithms/strings/caeser_cipher/__init__.py (1)

85-87: Consider adding return type annotation for consistency.

The plaintext: str parameter annotation is a good addition. For completeness, consider also adding the return type annotation -> str.

-    def encode(self, plaintext: str):
+    def encode(self, plaintext: str) -> str:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@algorithms/strings/caeser_cipher/__init__.py` around lines 85 - 87, Add an
explicit return type annotation to the encode method for consistency: change the
signature of encode (the function named encode) from def encode(self, plaintext:
str): to def encode(self, plaintext: str) -> str: so callers and type checkers
know it returns a string; no other logic changes needed.
datastructures/hashmap/item.py (1)

4-17: LGTM!

Simple and clean data container class. The implementation is appropriate for its purpose.

Consider adding __repr__ for easier debugging:

def __repr__(self) -> str:
    return f"Item(key={self.key!r}, data={self.data!r})"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@datastructures/hashmap/item.py` around lines 4 - 17, Add a readable
representation for debugging by implementing __repr__ on the Item class: define
a __repr__ method on class Item that returns a string showing the key and data
(e.g., include both self.key and self.data) so instances print clearly during
debugging and logging.
datastructures/hashmap/test_hashmap.py (1)

54-58: Consider adding test for removing a non-existent key.

The test verifies that get raises KeyError for non-existent keys, but there's no test verifying that remove raises KeyError when attempting to remove a non-existent key (as documented in the HashMap.remove docstring).

Note: This requires fixing Bucket.remove to actually raise KeyError as flagged in the bucket.py review.

# Test removing non-existent key raises KeyError
with self.assertRaises(KeyError):
    hashmap.remove(999)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@datastructures/hashmap/test_hashmap.py` around lines 54 - 58, Add a test that
asserts HashMap.remove raises KeyError for a non-existent key and update the
underlying Bucket.remove implementation to raise KeyError when the key is not
found; specifically, in the test file add a case that calls hashmap.remove(999)
(or another absent key) inside assertRaises(KeyError), and in Bucket.remove make
sure it does not silently return when the item is missing but instead raises
KeyError so HashMap.remove propagates the exception.
algorithms/hash_table/jewels_and_stones/__init__.py (1)

21-34: Comments are misleading and appear copy-pasted from the other function.

The implementation logic is correct, but the comments don't accurately describe what this function does:

  • Line 22: "Store all jewel types for fast membership checking" — but stone_counts stores stone character frequencies, not jewel types.
  • Line 28: "Check each stone and increment count" — but we're iterating over jewels, not stones.
📝 Proposed fix for comments
 def num_jewels_in_stones_with_dict(jewels: str, stones: str) -> int:
-    # Store all jewel types for fast membership checking
+    # Count occurrences of each stone character
     stone_counts: Counter[str] = Counter(stones)
 
     # Count how many stones are jewels
     count = 0
 
-    # Check each stone and increment count if it's a jewel
+    # For each jewel type, add its count from stones
     for jewel in jewels:
         if jewel in stone_counts:
             count += stone_counts[jewel]
 
     # Return the total number of jewels found in stones
     return count
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@algorithms/hash_table/jewels_and_stones/__init__.py` around lines 21 - 34,
Update the misleading comments inside num_jewels_in_stones_with_dict: replace
the comment above stone_counts so it correctly states that stone_counts is a
frequency map of stones (not "jewel types"), change the loop comment to indicate
we're iterating over each jewel and summing matching stone frequencies, and
ensure the final comment notes returning the total matched jewels; reference
symbols: num_jewels_in_stones_with_dict, stone_counts, jewels, stones, count.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@algorithms/hash_table/jewels_and_stones/__init__.py`:
- Around line 1-2: Remove the incorrect duplicate import of Counter from typing
and keep the runtime Counter from collections; update the import line so only
the needed typing symbols are imported (e.g., remove "Counter" from "from typing
import Set, Counter") or switch to builtin generics (e.g., use "set[str]" and
"Counter[str]" in annotations) while keeping "from collections import Counter"
for runtime use; ensure references in the module (e.g., any type annotations or
uses of Counter) use the collections.Counter at runtime and appropriate typing
generics for annotations.

In `@datastructures/hashmap/bucket.py`:
- Around line 37-46: Bucket.remove currently iterates self.bucket and silently
returns if the key isn't found and also binds an unused variable current_value;
update the remove method (Bucket.remove) to raise KeyError when no matching
item.key is found (e.g., use a for/else or track a found flag and after loop
raise KeyError(f"Key not found: {key}")), and remove the unused current_value
binding (only use item.key to compare). Ensure the raised exception matches the
HashMap.remove docstring.
- Around line 23-31: Fix the minor typos and the unused variable in the for-loop
inside the bucket handling: change the comment "key-valur" to "key-value" and
"blag" to "flag", and replace the unused assignment "current_key, current_value
= item.key, item.data" with "current_key, _ = item.key, item.data" (or simply
assign only current_key) so the unused value is intentional; keep the existing
logic that updates self.bucket[idx] = Item(key, value) and sets found = True
before breaking.

In `@datastructures/hashmap/hash_map.py`:
- Around line 30-38: Docstring for HashMap.set uses "it's" instead of "its" and
the public API name is inconsistent with README which references put; update the
docstring in set to use the possessive "its" and then either rename the method
set to put (and update all references) or add a thin alias method put that calls
set to keep backward/README compatibility (ensure to update any tests/imports
that expect put). Target the HashMap.set method and the README references when
making the change.

In `@datastructures/hashmap/README.md`:
- Around line 5-12: The README's API and the implementation disagree: README
documents put(int key, int value) but the HashMap class in hash_map.py exposes
set(key, value); make them consistent by either renaming the method in the
HashMap class from set to put (update the method name and any internal
references/calls to set) or by updating README to document set(key, value)
instead of put; reference the HashMap class and the set method in your changes
so callers/tests use the same name.

In `@datastructures/hashmap/test_hashmap.py`:
- Around line 26-33: The test comment is wrong because lines using hashmap.set
with the same key (2) exercise update, not collision; change the test in
test_hashmap.py to insert two different keys that collide under the HashMap's
key_space (e.g., with key_space=3 use keys 1 and 4), call hashmap.set(1, ...)
and hashmap.set(4, ...), then assert both hashmap.get(1) and hashmap.get(4)
return their respective values to verify collision handling in the HashMap
implementation (refer to the hashmap.set and hashmap.get calls).

In `@DIRECTORY.md`:
- Around line 373-377: Fix the typographical error by renaming "Caeser Cipher"
to "Caesar Cipher" in the DIRECTORY.md entry and update the linked label "Test
Caeser" to "Test Caesar"; also rename any matching directory and test file names
(e.g., the directory containing caeser_cipher and the test file test_caeser.py)
to use "caesar_cipher" and "test_caesar.py" and update their import/links
accordingly to avoid broken references.

---

Nitpick comments:
In `@algorithms/hash_table/jewels_and_stones/__init__.py`:
- Around line 21-34: Update the misleading comments inside
num_jewels_in_stones_with_dict: replace the comment above stone_counts so it
correctly states that stone_counts is a frequency map of stones (not "jewel
types"), change the loop comment to indicate we're iterating over each jewel and
summing matching stone frequencies, and ensure the final comment notes returning
the total matched jewels; reference symbols: num_jewels_in_stones_with_dict,
stone_counts, jewels, stones, count.

In `@algorithms/strings/caeser_cipher/__init__.py`:
- Around line 85-87: Add an explicit return type annotation to the encode method
for consistency: change the signature of encode (the function named encode) from
def encode(self, plaintext: str): to def encode(self, plaintext: str) -> str: so
callers and type checkers know it returns a string; no other logic changes
needed.

In `@datastructures/hashmap/item.py`:
- Around line 4-17: Add a readable representation for debugging by implementing
__repr__ on the Item class: define a __repr__ method on class Item that returns
a string showing the key and data (e.g., include both self.key and self.data) so
instances print clearly during debugging and logging.

In `@datastructures/hashmap/test_hashmap.py`:
- Around line 54-58: Add a test that asserts HashMap.remove raises KeyError for
a non-existent key and update the underlying Bucket.remove implementation to
raise KeyError when the key is not found; specifically, in the test file add a
case that calls hashmap.remove(999) (or another absent key) inside
assertRaises(KeyError), and in Bucket.remove make sure it does not silently
return when the item is missing but instead raises KeyError so HashMap.remove
propagates the exception.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98bf909 and 78a2a3f.

⛔ Files ignored due to path filters (13)
  • datastructures/hashmap/images/solutions/hash_map_design_key_collision.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_key_collision_with_buckets.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_1.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_10.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_11.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_2.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_3.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_4.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_5.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_6.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_7.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_8.png is excluded by !**/*.png
  • datastructures/hashmap/images/solutions/hash_map_design_solution_9.png is excluded by !**/*.png
📒 Files selected for processing (16)
  • DIRECTORY.md
  • algorithms/hash_table/jewels_and_stones/README.md
  • algorithms/hash_table/jewels_and_stones/__init__.py
  • algorithms/hash_table/jewels_and_stones/test_jewels_and_stones.py
  • algorithms/strings/caeser_cipher/README.md
  • algorithms/strings/caeser_cipher/__init__.py
  • algorithms/strings/caeser_cipher/test_caeser.py
  • algorithms/strings/run_length_encoding/README.md
  • algorithms/strings/run_length_encoding/__init__.py
  • algorithms/strings/run_length_encoding/test_run_length.py
  • datastructures/hashmap/README.md
  • datastructures/hashmap/__init__.py
  • datastructures/hashmap/bucket.py
  • datastructures/hashmap/hash_map.py
  • datastructures/hashmap/item.py
  • datastructures/hashmap/test_hashmap.py

Comment on lines +23 to +31
for idx, item in enumerate(self.bucket):
current_key, current_value = item.key, item.data
# If the key matches the key of the current key-valur pair
if current_key == key:
# Update the value of the key-value pair
self.bucket[idx] = Item(key, value)
# Set the blag to true, indicating the key is found and break out of the loop
found = True
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor typos in comments.

  • Line 25: "key-valur" should be "key-value"
  • Line 29: "blag" should be "flag"

Also, current_value is assigned but never used. Consider using _ to indicate an intentionally unused variable.

✏️ Suggested fix
         for idx, item in enumerate(self.bucket):
-            current_key, current_value = item.key, item.data
-            # If the key matches the key of the current key-valur pair
+            current_key = item.key
+            # If the key matches the key of the current key-value pair
             if current_key == key:
                 # Update the value of the key-value pair
                 self.bucket[idx] = Item(key, value)
-                # Set the blag to true, indicating the key is found and break out of the loop
+                # Set the flag to true, indicating the key is found and break out of the loop
                 found = True
                 break
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@datastructures/hashmap/bucket.py` around lines 23 - 31, Fix the minor typos
and the unused variable in the for-loop inside the bucket handling: change the
comment "key-valur" to "key-value" and "blag" to "flag", and replace the unused
assignment "current_key, current_value = item.key, item.data" with "current_key,
_ = item.key, item.data" (or simply assign only current_key) so the unused value
is intentional; keep the existing logic that updates self.bucket[idx] =
Item(key, value) and sets found = True before breaking.

Comment on lines +37 to +46
def remove(self, key: Any) -> None:
# Iterate through each key-value pair in the bucket
for idx, item in enumerate(self.bucket):
current_key, current_value = item.key, item.data
# If the key matches the key of the current key-value pair
if current_key == key:
# Delete the key-value pair from the bucket
del self.bucket[idx]
# Exit the loop as the key has been removed
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

remove does not raise KeyError when key is not found.

The HashMap.remove docstring states it "Raises a Key error if the key does not exist", but this Bucket.remove implementation silently does nothing if the key isn't found. This inconsistency could lead to unexpected behaviour.

Additionally, current_value on line 40 is unused.

🐛 Proposed fix to raise KeyError when key not found
     def remove(self, key: Any) -> None:
         # Iterate through each key-value pair in the bucket
         for idx, item in enumerate(self.bucket):
-            current_key, current_value = item.key, item.data
+            current_key = item.key
             # If the key matches the key of the current key-value pair
             if current_key == key:
                 # Delete the key-value pair from the bucket
                 del self.bucket[idx]
-                # Exit the loop as the key has been removed
-                break
+                return
+        raise KeyError(f"Key {key} does not exist")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@datastructures/hashmap/bucket.py` around lines 37 - 46, Bucket.remove
currently iterates self.bucket and silently returns if the key isn't found and
also binds an unused variable current_value; update the remove method
(Bucket.remove) to raise KeyError when no matching item.key is found (e.g., use
a for/else or track a found flag and after loop raise KeyError(f"Key not found:
{key}")), and remove the unused current_value binding (only use item.key to
compare). Ensure the raised exception matches the HashMap.remove docstring.

Comment on lines 26 to +33
# can handle collisions

item_5 = 2, "something"
hashmap.set(item_5[0], item_5[1])
item_5_key = 2
item_5_value = "something"
hashmap.set(item_5_key, item_5_value)

expected_item_5 = Item(item_5[0], item_5[1])
actual_5 = hashmap.table[2][0]
self.assertEqual(actual_5.key, expected_item_5.key)
self.assertEqual(actual_5.data, expected_item_5.data)
actual_5 = hashmap.get(item_5_key)
self.assertEqual(actual_5, item_5_value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

The "collision" comment is misleading - this tests update, not collision.

Lines 28-30 set the same key (2) twice, which tests the update functionality, not collision handling. A true hash collision occurs when different keys map to the same bucket.

With key_space=3, keys 1 and 4 both hash to bucket 1 (since 1 % 3 == 1 and 4 % 3 == 1). Consider testing actual collisions:

# Test actual collision: keys 1 and 4 both hash to bucket 1
hashmap.set(1, "value_for_1")
hashmap.set(4, "value_for_4")  # Collides with key 1
self.assertEqual(hashmap.get(1), "value_for_1")
self.assertEqual(hashmap.get(4), "value_for_4")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@datastructures/hashmap/test_hashmap.py` around lines 26 - 33, The test
comment is wrong because lines using hashmap.set with the same key (2) exercise
update, not collision; change the test in test_hashmap.py to insert two
different keys that collide under the HashMap's key_space (e.g., with
key_space=3 use keys 1 and 4), call hashmap.set(1, ...) and hashmap.set(4, ...),
then assert both hashmap.get(1) and hashmap.get(4) return their respective
values to verify collision handling in the HashMap implementation (refer to the
hashmap.set and hashmap.get calls).

BrianLusina and others added 5 commits February 27, 2026 15:34
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@BrianLusina BrianLusina merged commit 5b5bb62 into main Feb 27, 2026
3 checks passed
@BrianLusina BrianLusina deleted the feat/datastructures-hashmap branch March 2, 2026 06:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Algorithm Algorithm Problem Array Array data structure Datastructures Datastructures Design Documentation Documentation Updates enhancement Hash Map Hash Map Data structure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant