Skip to content

Remove XSS potential using format_html helpers#2950

Merged
JacobCoffee merged 4 commits intopython:mainfrom
sethmlarson:xss
Mar 5, 2026
Merged

Remove XSS potential using format_html helpers#2950
JacobCoffee merged 4 commits intopython:mainfrom
sethmlarson:xss

Conversation

@sethmlarson
Copy link
Contributor

Instead of using formatting-strings with values and calling mark_safe(), change most uses to format_html[_join](). The django-markupfields package seems unmaintained, and I am not sure what the best way to handle arbitrary markup fields securely in Django.

@sethmlarson sethmlarson requested a review from JacobCoffee as a code owner March 3, 2026 20:18
@JacobCoffee JacobCoffee requested a review from Copilot March 5, 2026 15:52
@JacobCoffee JacobCoffee merged commit 1eff475 into python:main Mar 5, 2026
7 checks passed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce XSS risk in Django admin + template output by replacing string interpolation + mark_safe() with format_html() / format_html_join() (and enabling Ruff’s S308 check by removing the global ignore).

Changes:

  • Replace several HTML-producing code paths with format_html() / format_html_join().
  • Enable Ruff S308 (suspicious mark_safe usage) by removing the global ignore and adding targeted # noqa: S308 in a few places.
  • Refactor some admin/template-tag HTML builders for readability (e.g., dedent() blocks).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
pyproject.toml Enables Ruff S308 by removing the global ignore.
apps/successstories/admin.py Switches link rendering to format_html() with arguments.
apps/sponsors/forms.py Replaces mark_safe labels with format_html() labels.
apps/sponsors/admin.py Converts multiple admin display helpers to format_html*() and refactors HTML assembly.
apps/downloads/templatetags/download_tags.py Refactors wbr_wrap to use format_html_join() and format_html().
apps/companies/templatetags/companies.py Refactors render_email to build HTML via format_html_join() / concatenation.
apps/boxes/templatetags/boxes.py Adds an inline # noqa: S308 for an intentional mark_safe().

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 804 to +810
@@ -801,8 +807,7 @@ def get_custom_benefits_added_by_user(self, obj):
if not benefits:
return "---"

html = "".join([f"<p>{b}</p>" for b in benefits])
return mark_safe(html)
return format_html_join("", "<p>{}</p>", benefits)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_html_join() expects an iterable of tuples; passing benefits (a list of strings) will splat each string into characters and only the first character will be used for {}. Wrap each benefit as (benefit,) before joining.

Copilot uses AI. Check for mistakes.
Comment on lines 99 to 118
def wbr_wrap(value: str | None) -> str:
"""Insert <wbr> tags for optional line breaking, prioritising halfway break.

Uses inline-block spans for halves so the browser prefers breaking
at the midpoint first, then within each half if still too wide.
"""
if not value:
return value or ""

interval = 16
chunks = [value[i : i + interval] for i in range(0, len(value), interval)]

# Split into two halves, each half has internal <wbr> breaks
midpoint = len(chunks) // 2
first_half = "<wbr>".join(chunks[:midpoint])
second_half = "<wbr>".join(chunks[midpoint:])
first_half = format_html_join(mark_safe("<wbr>"), "{}", chunks[:midpoint])
second_half = format_html_join(mark_safe("<wbr>"), "{}", chunks[midpoint:])

return mark_safe(
f'<span class="checksum-half">{first_half}</span><wbr><span class="checksum-half">{second_half}</span>'
return format_html(
'<span class="checksum-half">{}</span><wbr><span class="checksum-half">{}</span>', first_half, second_half
)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wbr_wrap() behavior is user-visible and there are existing template-tag tests in apps/downloads/tests/test_template_tags.py, but this filter isn’t covered. Please add a unit test that asserts the full string is preserved and <wbr> insertion works (including a case that spans multiple chunks).

Copilot uses AI. Check for mistakes.
Comment on lines 10 to +22
@@ -16,8 +16,8 @@ def render_email(value):
mailbox_tokens = mailbox.split(".")
domain_tokens = domain.split(".")

mailbox = "<span>.</span>".join(mailbox_tokens)
domain = "<span>.</span>".join(domain_tokens)
mailbox = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in mailbox_tokens])
domain = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in domain_tokens])

return format_html(f"{mailbox}<span>@</span>{domain}")
return mailbox + mark_safe("<span>@</span>") + domain
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

render_email() currently builds HTML by concatenating strings/SafeStrings, which produces a plain str and will be auto-escaped when used as a template filter (regressing the rendered output). Build and return the final value via format_html(...) (and avoid mark_safe constants, which will also trigger S308 now).

Copilot uses AI. Check for mistakes.
return mark_safe(html)
html += format_html(
dedent("""
<li><b>{year}</b>:"
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s an extra literal double-quote in the generated HTML (<li><b>{year}</b>:") which will show up in the admin output. Remove the stray " from the template string.

Suggested change
<li><b>{year}</b>:"
<li><b>{year}</b>:

Copilot uses AI. Check for mistakes.
if obj.value and getattr(obj.value, "url", None):
html = f"<a href='{obj.value.url}' target='_blank'>{obj.value}</a>"
return mark_safe(html)
html = format_html("<a href='{}' target='_blank'>{}</a>", (obj.value.url, obj.value))
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This format_html() call is passing a single tuple as the first positional argument, so the rendered link will contain the tuple string (and the second {} will be left unresolved). Pass obj.value.url and obj.value as separate arguments.

Suggested change
html = format_html("<a href='{}' target='_blank'>{}</a>", (obj.value.url, obj.value))
html = format_html("<a href='{}' target='_blank'>{}</a>", obj.value.url, obj.value)

Copilot uses AI. Check for mistakes.
def get_estimated_cost(self, obj):
"""Return the estimated cost HTML for customized sponsorships."""
cost = None
html = "This sponsorship has not customizations so there's no estimated cost"
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar in the admin message: “has not customizations” reads incorrectly. Consider changing to “has no customizations” (or similar) since this is user-facing text.

Suggested change
html = "This sponsorship has not customizations so there's no estimated cost"
html = "This sponsorship has no customizations so there's no estimated cost"

Copilot uses AI. Check for mistakes.
Comment on lines 10 to +22
@@ -16,8 +16,8 @@ def render_email(value):
mailbox_tokens = mailbox.split(".")
domain_tokens = domain.split(".")

mailbox = "<span>.</span>".join(mailbox_tokens)
domain = "<span>.</span>".join(domain_tokens)
mailbox = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in mailbox_tokens])
domain = format_html_join(mark_safe("<span>.</span>"), "{}", [(token,) for token in domain_tokens])

return format_html(f"{mailbox}<span>@</span>{domain}")
return mailbox + mark_safe("<span>@</span>") + domain
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are existing tests for render_email() in apps/companies/tests.py, but they only assert the returned string, not template rendering/escaping. Add a test that renders a template using the filter to ensure the output remains HTML-safe (i.e., spans are not escaped).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants