4 changes: 2 additions & 2 deletions .github/workflows/test.yml
@@ -62,7 +62,7 @@ jobs:
uses: actions/cache@v5
with:
path: ~/.cargo
- key: cargo-cache-${{ steps.rust-toolchain.outputs.cachekey }}-${{ hashFiles('Cargo.lock') }}
+ key: cargo-cache-${{ matrix.toolchain }}-${{ hashFiles('Cargo.lock') }}
Contributor

this seems unrelated?

Contributor Author

AFAICT this was already broken, but my change here invalidated the cache and exposed the fact that this line is no longer valid. This was required to make CI happy.
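To illustrate why touching `Cargo.lock` can surface a stale key, here is a minimal Python sketch of a key built the same way. `cache_key` is a hypothetical helper, and the single SHA-256 digest only stands in for `hashFiles`; it does not reproduce the digest GitHub computes. The empty-toolchain case assumes the old step output no longer resolved to a value, which the thread does not confirm.

```python
import hashlib


def cache_key(toolchain: str, lockfile: bytes) -> str:
    """Build a key shaped like ``cargo-cache-<toolchain>-<hash>``.

    Hypothetical helper: the digest is a stand-in for ``hashFiles``.
    """
    digest = hashlib.sha256(lockfile).hexdigest()
    return f"cargo-cache-{toolchain}-{digest}"


old_key = cache_key("stable", b"lockfile contents before the change")
new_key = cache_key("stable", b"lockfile contents after the change")

# Any edit to the lockfile yields a different key, so the cache is rebuilt.
print(old_key == new_key)

# Assumption: an expression that resolves to nothing leaves an empty segment,
# so every toolchain in the matrix would share the same key.
print(cache_key("", b"lockfile contents after the change")[:14])
```

With a key like this, a long-lived cache can hide a broken segment until something else in the key changes.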


- name: Install dependencies
uses: astral-sh/setup-uv@v7
@@ -106,7 +106,7 @@ jobs:
RUST_BACKTRACE: 1
run: |
git submodule update --init
- uv run --no-project pytest -v . --import-mode=importlib
+ uv run --no-project pytest -v --import-mode=importlib

- name: FFI unit tests
run: |
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -70,6 +70,9 @@ features = ["substrait"]
[tool.pytest.ini_options]
asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "function"
+ addopts = "--doctest-modules"
Contributor

Why is this here? (Sorry, I don't know Python well enough.)

Contributor Author

By default, the examples in docstrings aren't executed. There are a few different ways to turn that functionality on, and this seemed the least intrusive. I split my commits up somewhat (apart from the one giant "here are loads of examples" commit): the first commit enables testing of the examples in the docstrings. Then I ran pytest (which now automatically runs the examples in the docstrings) and fixed the few cases we already had.
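For readers less familiar with Python, a minimal sketch of what "executing the examples in a docstring" means, using only the standard-library `doctest` module. The `add` function and the `run_docstring_examples` helper are made up for illustration and are not part of this PR; pytest's `--doctest-modules` option does the collection step for you.

```python
import doctest


def add(a: int, b: int) -> int:
    """Return the sum of two integers.

    Examples:
        >>> add(2, 3)
        5
    """
    return a + b


def run_docstring_examples(func) -> doctest.TestResults:
    """Run the examples found in the docstring of func and return the counts."""
    parser = doctest.DocTestParser()
    # Make the function visible under its own name inside the example.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring_examples(add)
print(results.attempted, results.failed)
```

If the output printed after `>>> add(2, 3)` stopped matching `5`, the run would report a failure, which is how stale examples get caught.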

+ doctest_optionflags = ["NORMALIZE_WHITESPACE", "ELLIPSIS"]
+ testpaths = ["python/tests", "python/datafusion"]

# Enable docstring linting using the google style guide
[tool.ruff.lint]
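The two flags named in `doctest_optionflags` above relax how expected output is compared. A small sketch with the standard-library `doctest` module, assuming nothing about this repository; `run_example` is a hypothetical helper that runs one snippet and counts failures.

```python
import doctest


def run_example(source: str, optionflags: int = 0) -> int:
    """Run one doctest-formatted snippet and return the number of failures."""
    test = doctest.DocTestParser().get_doctest(source, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report text; only the count matters here.
    return runner.run(test, out=lambda text: None).failed


# Expected output uses one space where the real output has several.
whitespace_example = ">>> print('a    b')\na b\n"

# Expected output elides the middle of the list with "...".
ellipsis_example = ">>> print(list(range(20)))\n[0, 1, ..., 19]\n"

strict_ws = run_example(whitespace_example)
relaxed_ws = run_example(whitespace_example, doctest.NORMALIZE_WHITESPACE)
strict_el = run_example(ellipsis_example)
relaxed_el = run_example(ellipsis_example, doctest.ELLIPSIS)

print(strict_ws, relaxed_ws, strict_el, relaxed_el)
```

Both snippets fail under strict comparison and pass once the matching flag is set, which is useful for docstrings whose output contains aligned tables or long values.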
14 changes: 9 additions & 5 deletions python/datafusion/dataframe.py
@@ -327,8 +327,9 @@ def into_view(self, temporary: bool = False) -> Table:
>>> df = ctx.sql("SELECT 1 AS value")
>>> view = df.into_view()
>>> ctx.register_table("values_view", view)
- >>> df.collect() # The DataFrame is still usable
- >>> ctx.sql("SELECT value FROM values_view").collect()
+ >>> result = ctx.sql("SELECT value FROM values_view").collect()
+ >>> result[0].column("value").to_pylist()
+ [1]
"""
from datafusion.catalog import Table as _Table

@@ -1389,9 +1390,12 @@ def fill_null(self, value: Any, subset: list[str] | None = None) -> DataFrame:
DataFrame with null values replaced where type casting is possible

Examples:
- >>> df = df.fill_null(0) # Fill all nulls with 0 where possible
- >>> # Fill nulls in specific string columns
- >>> df = df.fill_null("missing", subset=["name", "category"])
+ >>> from datafusion import SessionContext, col
+ >>> ctx = SessionContext()
+ >>> df = ctx.from_pydict({"a": [1, None, 3], "b": [None, 5, 6]})
+ >>> filled = df.fill_null(0)
+ >>> filled.sort(col("a")).collect()[0].column("a").to_pylist()
+ [0, 1, 3]

Notes:
- Only fills nulls in columns where the value can be cast to the column type