Remove XSS potential using format_html helpers#2950
Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to reduce XSS risk in Django admin + template output by replacing string interpolation + mark_safe() with format_html() / format_html_join() (and enabling Ruff’s S308 check by removing the global ignore).
Changes:
- Replace several HTML-producing code paths with
format_html()/format_html_join(). - Enable Ruff S308 (suspicious
mark_safeusage) by removing the global ignore and adding targeted# noqa: S308in a few places. - Refactor some admin/template-tag HTML builders for readability (e.g.,
dedent()blocks).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Enables Ruff S308 by removing the global ignore. |
| apps/successstories/admin.py | Switches link rendering to format_html() with arguments. |
| apps/sponsors/forms.py | Replaces mark_safe labels with format_html() labels. |
| apps/sponsors/admin.py | Converts multiple admin display helpers to format_html*() and refactors HTML assembly. |
| apps/downloads/templatetags/download_tags.py | Refactors wbr_wrap to use format_html_join() and format_html(). |
| apps/companies/templatetags/companies.py | Refactors render_email to build HTML via format_html_join() / concatenation. |
| apps/boxes/templatetags/boxes.py | Adds an inline # noqa: S308 for an intentional mark_safe(). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -801,8 +807,7 @@ def get_custom_benefits_added_by_user(self, obj): | |||
| if not benefits: | |||
| return "---" | |||
|
|
|||
| html = "".join([f"<p>{b}</p>" for b in benefits]) | |||
| return mark_safe(html) | |||
| return format_html_join("", "<p>{}</p>", benefits) | |||
There was a problem hiding this comment.
format_html_join() expects an iterable of tuples; passing benefits (a list of strings) will splat each string into characters and only the first character will be used for {}. Wrap each benefit as (benefit,) before joining.
| def wbr_wrap(value: str | None) -> str: | ||
| """Insert <wbr> tags for optional line breaking, prioritising halfway break. | ||
|
|
||
| Uses inline-block spans for halves so the browser prefers breaking | ||
| at the midpoint first, then within each half if still too wide. | ||
| """ | ||
| if not value: | ||
| return value or "" | ||
|
|
||
| interval = 16 | ||
| chunks = [value[i : i + interval] for i in range(0, len(value), interval)] | ||
|
|
||
| # Split into two halves, each half has internal <wbr> breaks | ||
| midpoint = len(chunks) // 2 | ||
| first_half = "<wbr>".join(chunks[:midpoint]) | ||
| second_half = "<wbr>".join(chunks[midpoint:]) | ||
| first_half = format_html_join(mark_safe("<wbr>"), "{}", chunks[:midpoint]) | ||
| second_half = format_html_join(mark_safe("<wbr>"), "{}", chunks[midpoint:]) | ||
|
|
||
| return mark_safe( | ||
| f'<span class="checksum-half">{first_half}</span><wbr><span class="checksum-half">{second_half}</span>' | ||
| return format_html( | ||
| '<span class="checksum-half">{}</span><wbr><span class="checksum-half">{}</span>', first_half, second_half | ||
| ) |
There was a problem hiding this comment.
wbr_wrap() behavior is user-visible and there are existing template-tag tests in apps/downloads/tests/test_template_tags.py, but this filter isn’t covered. Please add a unit test that asserts the full string is preserved and <wbr> insertion works (including a case that spans multiple chunks).
| @@ -16,8 +16,8 @@ def render_email(value): | |||
| mailbox_tokens = mailbox.split(".") | |||
| domain_tokens = domain.split(".") | |||
|
|
|||
| mailbox = "<span>.</span>".join(mailbox_tokens) | |||
| domain = "<span>.</span>".join(domain_tokens) | |||
| mailbox = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in mailbox_tokens]) | |||
| domain = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in domain_tokens]) | |||
|
|
|||
| return format_html(f"{mailbox}<span>@</span>{domain}") | |||
| return mailbox + mark_safe("<span>@</span>") + domain | |||
There was a problem hiding this comment.
render_email() currently builds HTML by concatenating strings/SafeStrings, which produces a plain str and will be auto-escaped when used as a template filter (regressing the rendered output). Build and return the final value via format_html(...) (and avoid mark_safe constants, which will also trigger S308 now).
| return mark_safe(html) | ||
| html += format_html( | ||
| dedent(""" | ||
| <li><b>{year}</b>:" |
There was a problem hiding this comment.
There’s an extra literal double-quote in the generated HTML (<li><b>{year}</b>:") which will show up in the admin output. Remove the stray " from the template string.
| <li><b>{year}</b>:" | |
| <li><b>{year}</b>: |
| if obj.value and getattr(obj.value, "url", None): | ||
| html = f"<a href='{obj.value.url}' target='_blank'>{obj.value}</a>" | ||
| return mark_safe(html) | ||
| html = format_html("<a href='{}' target='_blank'>{}</a>", (obj.value.url, obj.value)) |
There was a problem hiding this comment.
This format_html() call is passing a single tuple as the first positional argument, so the rendered link will contain the tuple string (and the second {} will be left unresolved). Pass obj.value.url and obj.value as separate arguments.
| html = format_html("<a href='{}' target='_blank'>{}</a>", (obj.value.url, obj.value)) | |
| html = format_html("<a href='{}' target='_blank'>{}</a>", obj.value.url, obj.value) |
| def get_estimated_cost(self, obj): | ||
| """Return the estimated cost HTML for customized sponsorships.""" | ||
| cost = None | ||
| html = "This sponsorship has not customizations so there's no estimated cost" |
There was a problem hiding this comment.
Grammar in the admin message: “has not customizations” reads incorrectly. Consider changing to “has no customizations” (or similar) since this is user-facing text.
| html = "This sponsorship has not customizations so there's no estimated cost" | |
| html = "This sponsorship has no customizations so there's no estimated cost" |
| @@ -16,8 +16,8 @@ def render_email(value): | |||
| mailbox_tokens = mailbox.split(".") | |||
| domain_tokens = domain.split(".") | |||
|
|
|||
| mailbox = "<span>.</span>".join(mailbox_tokens) | |||
| domain = "<span>.</span>".join(domain_tokens) | |||
| mailbox = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in mailbox_tokens]) | |||
| domain = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in domain_tokens]) | |||
|
|
|||
| return format_html(f"{mailbox}<span>@</span>{domain}") | |||
| return mailbox + mark_safe("<span>@</span>") + domain | |||
There was a problem hiding this comment.
There are existing tests for render_email() in apps/companies/tests.py, but they only assert the returned string, not template rendering/escaping. Add a test that renders a template using the filter to ensure the output remains HTML-safe (i.e., spans are not escaped).
Instead of using formatting-strings with values and calling
mark_safe(), change most uses toformat_html[_join](). Thedjango-markupfieldspackage seems unmaintained, and I am not sure what the best way to handle arbitrary markup fields securely in Django.