feat(re): Enhance CFFI re module and add tests #7

Closed
adsharma wants to merge 3 commits into main from feature/re-module-enhancements

@adsharma
Collaborator

This commit includes several enhancements to the CFFI-based re module:

  • Introduced re.error for regex-specific exceptions.
  • Implemented MatchObject for search() results, providing access to captured groups via group() and groups() methods.
  • Enhanced findall() to correctly return:
    • list[str] of full matches if no capture groups.
    • list[str] of captured strings if one capture group.
    • list[tuple[str,...]] if multiple capture groups.
  • Ensured sub() supports standard ECMAScript group references (e.g., $1, $&).
  • Added comprehensive tests to tests/test_re.py to cover:
    • Basic matching, searching, character classes, quantifiers, anchors.
    • Detailed behavior of search() with MatchObject and groups.
    • Detailed behavior of findall() with 0, 1, and multiple groups.
    • sub() with various group backreferences.
    • Error handling for invalid regex patterns.
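The findall() return-shape rule listed above can be sketched in a few lines. This is a minimal illustration, not the PR's implementation: it uses CPython's built-in `re` as a stand-in for the native backend, and `shape_findall` is a hypothetical helper name.

```python
import re as pyre  # CPython's re, used here only as a stand-in backend


def shape_findall(pattern: str, text: str) -> list:
    """Hypothetical helper applying the 0 / 1 / many capture-group rule."""
    compiled = pyre.compile(pattern)
    mark_count = compiled.groups  # number of capture groups in the pattern
    results = []
    for m in compiled.finditer(text):
        if mark_count == 0:
            # No capture groups: collect the full match.
            results.append(m.group(0))
        elif mark_count == 1:
            # One capture group: collect just that group ("" if unmatched).
            results.append(m.group(1) or "")
        else:
            # Several capture groups: collect a tuple of all groups.
            results.append(m.groups(""))
    return results
```

The branch on the group count is the whole rule: the same matches are reported, only the shape of each list element changes.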

The C++ backend in native/src/regex_wrapper.cpp was updated to support these features, including functions to get mark counts and return structured group information for search. The CFFI interface in src/stdlib/re.py and the stub file native/src/regex_wrapper.pyi were updated accordingly.
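As an illustration of how structured group information could back group() and groups(), the sketch below assumes the backend reports one `(start, end)` span per group, with a negative start for a group that did not participate in the match. That span layout and the class body are assumptions made for this example; the actual `MatchObject` in src/stdlib/re.py may be organized differently.

```python
from typing import List, Optional, Tuple


class MatchObject:
    """Illustrative match result built from per-group (start, end) spans."""

    def __init__(self, text: str, spans: List[Tuple[int, int]]):
        # spans[0] covers the whole match; spans[1:] are the capture groups.
        self._text = text
        self._spans = spans

    def group(self, index: int = 0) -> Optional[str]:
        if not 0 <= index < len(self._spans):
            raise IndexError("no such group")
        start, end = self._spans[index]
        if start < 0:
            # Assumed convention: a negative start marks an unmatched group.
            return None
        return self._text[start:end]

    def groups(self) -> Tuple[Optional[str], ...]:
        # All capture groups, excluding the whole match at index 0.
        return tuple(self.group(i) for i in range(1, len(self._spans)))


# Spans as they might be reported for the pattern (\w+)=(\w+) on "key=value".
m = MatchObject("key=value", [(0, 9), (0, 3), (4, 9)])
```

Here `m.group(0)` is the full match, `m.group(1)` is the first capture, and `m.groups()` returns both captures as a tuple.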

This commit adds an `__all__` list to `src/stdlib/re.py` to explicitly
define the public API of the module. This ensures that `re.error`
and other key components like `match`, `compile`, `search`, `findall`,
`sub`, `CompiledRegex`, and `MatchObject` are correctly exported and
accessible via `from stdlib.re import ...`.

This addresses the issue where `re.error` was not directly importable.

Adds an `__all__` list to `src/stdlib/re.py` to explicitly define
the public API of the module. This ensures that `re.error`
and other key components like `match`, `compile`, `search`, `findall`,
`sub`, `CompiledRegex`, and `MatchObject` are correctly exported.

This addresses the issue where `from stdlib.re import error` would fail.
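For readers unfamiliar with `__all__`, the sketch below shows its effect in standard CPython, where it controls which names `from module import *` exports. The module name `demo_re` and its contents are hypothetical stand-ins, not the real `src/stdlib/re.py`. In CPython an explicit `from module import name` only needs the name to be defined in the module, so whether this project's import machinery depends on `__all__` more strongly is an open assumption here.

```python
import sys
import types

# Hypothetical module source standing in for a re-like module.
source = '''
__all__ = ["error", "search"]


class error(Exception):
    """Regex-specific exception."""


def search(pattern, text):
    return None


def helper():
    return None
'''

# Build the module in memory and register it so it can be imported.
demo = types.ModuleType("demo_re")
exec(source, demo.__dict__)
sys.modules["demo_re"] = demo

# A star-import copies only the names listed in __all__.
namespace = {}
exec("from demo_re import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

After this runs, `exported` contains `error` and `search` but not `helper`, even though `helper` is a public-looking name defined in the module.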
@adsharma adsharma closed this May 28, 2025
@adsharma adsharma deleted the feature/re-module-enhancements branch May 28, 2025 18:15