FEAT Beam Search for OpenAIResponseTarget #1346
riedgar-ms wants to merge 126 commits into Azure:main from
Conversation
```python
await self._setup_async(context=current_context)

message = self._get_message(current_context)
beam.id = current_context.conversation_id
```
_propagate_beam_async calls _setup_async, which mutates shared instance state (self._start_context) and is executed concurrently across beams. This creates a race condition and also overwrites the original start context for subsequent iterations. Refactor so per-beam setup does not write to shared state (e.g., create a local conversation_id + call ConversationManager.initialize_context_async directly without touching _start_context).
Suggested change:

```diff
-await self._setup_async(context=current_context)
-message = self._get_message(current_context)
-beam.id = current_context.conversation_id
+local_conversation_id = str(uuid.uuid4())
+current_context.conversation_id = local_conversation_id
+await self._conversation_manager.initialize_context_async(
+    context=current_context,
+    conversation_reference=ConversationReference(
+        conversation_id=local_conversation_id,
+        conversation_type=ConversationType.ATTACK,
+    ),
+    prepended_conversation_config=self._prepended_conversation_config,
+)
+message = self._get_message(current_context)
+beam.id = local_conversation_id
```
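The hazard the original code creates can be seen in a minimal asyncio sketch (the class and method names here are illustrative stand-ins, not PyRIT's implementation): each concurrent task writes its context to shared instance state, yields, and then reads back whatever the last writer stored.

```python
import asyncio

class SharedStateAttack:
    def __init__(self):
        self._start_context = None

    async def _setup_async(self, context):
        self._start_context = context  # shared mutable state
        await asyncio.sleep(0)         # yield; another beam's setup may run now
        return self._start_context     # may already hold another beam's context

async def main():
    attack = SharedStateAttack()
    # All three "beams" run setup concurrently against the same instance
    return await asyncio.gather(*(attack._setup_async(i) for i in range(3)))

results = asyncio.run(main())
```

Each task ends up observing the last writer's context rather than its own, which is why the suggested fix keeps the conversation id in a local variable instead of instance state.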
```python
# Log the attack configuration
self._logger.info(f"Starting {self.__class__.__name__} with objective: {context.objective}")

beams = [Beam(id=context.conversation_id, text="", score=0.0) for _ in range(self._num_beams)]
```
BeamSearchAttack adds significant new behavior (beam propagation/scoring/pruning), but the current tests mainly cover grammar/reviewer and init validation. Add unit tests that mock PromptNormalizer.send_prompt_async and scorers to validate _perform_async end-to-end (beam expansion across iterations, scoring-driven pruning, and AttackResult fields like related_conversations).
```python
if beam.objective_score and beam.objective_score.get_value():
    # We have a positive score, so it's a success
    return AttackOutcome.SUCCESS, "Objective achieved according to scorer"

# No response at all (all attempts filtered/failed)
return AttackOutcome.FAILURE, "All attempts were filtered or failed to get a response"
```
The failure outcome_reason here is misleading when a response was produced but the objective scorer returned a negative/false score. As written, any non-success (including “objective not achieved”) reports that all attempts were filtered/failed. Consider distinguishing between “no scorable response” vs “objective not achieved” (e.g., check beam.message / beam.objective_score and craft a reason accordingly).
Suggested change:

```diff
-if beam.objective_score and beam.objective_score.get_value():
-    # We have a positive score, so it's a success
-    return AttackOutcome.SUCCESS, "Objective achieved according to scorer"
-# No response at all (all attempts filtered/failed)
-return AttackOutcome.FAILURE, "All attempts were filtered or failed to get a response"
+if beam.objective_score is not None:
+    if beam.objective_score.get_value():
+        # We have a positive score, so it's a success
+        return AttackOutcome.SUCCESS, "Objective achieved according to scorer"
+    # A response was scored but did not achieve the objective
+    return AttackOutcome.FAILURE, "Objective not achieved according to scorer"
+# No objective score was produced
+if beam.message is None:
+    # No response at all (all attempts filtered/failed)
+    return AttackOutcome.FAILURE, "All attempts were filtered or failed to get a response"
+# We have a response but no positive objective score
+return AttackOutcome.FAILURE, "Objective not achieved: no positive score for the response"
```
```python
def fresh_instance(
    self,
    *,
    extra_body_parameters: Optional[dict[str, Any]] = None,
    grammar_name: Optional[str] = None,
) -> "OpenAIResponseTarget":
    """
    Create a fresh instance of the OpenAIResponseTarget with the same configuration.

    Optionally override extra body parameters or a grammar name for the new instance.

    Args:
        extra_body_parameters (Optional[dict[str, Any]]): Optional overrides for the
            extra body parameters of the new instance.
        grammar_name (Optional[str]): Optional override for the grammar name of the
            new instance.

    Returns:
        OpenAIResponseTarget: A new instance of OpenAIResponseTarget.
    """
    init_args: dict[str, Any] = deepcopy(self._init_args)
    if extra_body_parameters is not None:
        init_args["extra_body_parameters"] = deepcopy(extra_body_parameters)
    result = OpenAIResponseTarget(**init_args)
    if grammar_name is not None:
        result._grammar_name = grammar_name
    return result
```
The new fresh_instance method on OpenAIResponseTarget has no test coverage in tests/unit/target/test_openai_response_target.py. Given that the rest of the file has comprehensive test coverage for all public methods, this is inconsistent. Tests should verify that fresh_instance returns an instance with the same base config, that extra_body_parameters override is applied correctly, and that grammar_name is set correctly on the result.
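The properties such a test should verify can be sketched against a stand-in class that follows the same "deepcopy the stored init args" pattern as the quoted `fresh_instance` (this is not the real `OpenAIResponseTarget`, and the constructor arguments here are illustrative):

```python
from copy import deepcopy
from typing import Any, Optional

class StandInTarget:
    """Stand-in sketch of the fresh_instance pattern, not the real target class."""

    def __init__(self, **kwargs: Any) -> None:
        self._init_args: dict[str, Any] = deepcopy(kwargs)
        self._grammar_name: Optional[str] = None

    def fresh_instance(self, *, grammar_name: Optional[str] = None) -> "StandInTarget":
        # Rebuild from a deep copy so the clone cannot alias the original's state
        result = StandInTarget(**deepcopy(self._init_args))
        if grammar_name is not None:
            result._grammar_name = grammar_name
        return result

original = StandInTarget(model="test-model", extra_body_parameters={"a": 1})
clone = original.fresh_instance(grammar_name="my_grammar")
```

A real test would assert the same three things on `OpenAIResponseTarget`: equal but independent init args, and the grammar name applied to the clone only.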
```python
objective_scorer=self._objective_scorer,
auxiliary_scorers=self._auxiliary_scorers,
role_filter="assistant",
objective=context.objective,
```
In _score_beam_async, the call to Scorer.score_response_async does not pass skip_on_error_result=True, unlike the analogous call in PromptSendingAttack._evaluate_response_async (which explicitly passes skip_on_error_result=True). If any message piece has a response_error that is not "none", the scorer will not skip it and may raise an unexpected error. This could cause the entire asyncio.TaskGroup to fail rather than gracefully skipping the problematic beam. The _propagate_beam_async method has a broad try/except that handles this case there, but scoring errors are not handled here.
Suggested change:

```diff
-objective=context.objective,
+objective=context.objective,
+skip_on_error_result=True,
```
```python
# Review beams at the top of the loop for simplicity
beams = self._beam_reviewer.review(beams=beams)
```
The _perform_async method calls self._beam_reviewer.review(beams=beams) on the very first iteration (step 0) when all beams have text="" and score=0.0. On the first iteration, reviewing uninitialized beams causes the reviewer to sort and potentially duplicate beams that all have identical state, which is a wasted operation. The reviewer is designed to improve beam quality based on previous scores, but in the first iteration no scores exist yet.
Consider skipping the review on the first iteration (when all beams are uninitialized), or initializing beams after the first propagation step.
Suggested change:

```diff
-# Review beams at the top of the loop for simplicity
-beams = self._beam_reviewer.review(beams=beams)
+# Review beams only after the first iteration when they have been updated
+if step > 0:
+    beams = self._beam_reviewer.review(beams=beams)
```
```python
scorer.get_identifier.return_value = get_mock_scorer_identifier()
return scorer
```
The mock_non_true_false_scorer fixture is defined (line 57) but not used in any test. It should either be removed or have a corresponding test that validates that using a non-FloatScale scorer as an auxiliary scorer raises a ValueError.
Suggested change:

```python
class TestAttackScoringConfigValidation:
    def test_non_float_scale_auxiliary_scorer_raises_value_error(
        self,
        mock_true_false_scorer,
        mock_non_true_false_scorer,
    ):
        with pytest.raises(ValueError):
            AttackScoringConfig(
                primary_scorer=mock_true_false_scorer,
                auxiliary_scorers=[mock_non_true_false_scorer],
            )
```
```python
prefix = self.text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

# Prune non-printable characters from the prefix to avoid issues in the grammar
end_index = 0
for i in range(len(prefix) - 1, -1, -1):
    if str.isprintable(prefix[i]):
        end_index = i + 1
        break
prefix = prefix[:end_index]

return grammar_template.format(prefix=prefix, n_chars=n_chars)
```
The get_grammar method uses grammar_template.format(prefix=prefix, n_chars=n_chars) but doesn't escape curly braces ({ and }) in the prefix string. If the model generates text containing { or } characters (e.g., code snippets like function() { return 1; }), Python's str.format() will attempt to interpret them as format placeholders and raise a KeyError.
To fix this, escape curly braces in the prefix before calling .format(), by adding .replace("{", "{{").replace("}", "}}") to the prefix transformations at line 61. Alternatively, use an f-string or a different templating approach that doesn't use str.format() for dynamic insertion of potentially unsafe content.
Have added a test..... this doesn't appear to be a problem?
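A quick check supports this reply: `str.format` only parses placeholders in the template string itself; braces inside substituted *values* are passed through untouched, so a prefix containing `{` or `}` does not trigger a `KeyError`. The template below mimics the quoted grammar template shape:

```python
# Braces in the *template* must be doubled, but braces in substituted
# values are never re-parsed by str.format.
template = 'start: PREFIX CONTINUATION\nPREFIX: "{prefix}"\nCONTINUATION: /.{{0,{n_chars}}}/'
rendered = template.format(prefix="function() { return 1; }", n_chars=10)
```

So the only brace-escaping concern is in the template literal itself (already handled via `{{` / `}}`), not in the model-generated prefix.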
```python
except Exception as e:
    # Just log the error and skip the update
    self._logger.warning(f"Error propagating beam, skipping this update: {e}")
```
After the beam reviewer creates deepcopies of beams from the previous iteration (for non-top-k beams), those copied beams retain the message field from the previous propagation. If a beam's propagation fails in a subsequent iteration (exception caught at line 398), the beam's message is not cleared — it still holds the old response from the prior iteration. This means scoreable_beams (line 307) will still include this beam (since message is not None), and _score_beam_async will score the stale old response, potentially overwriting a valid score with misleading results.
Consider resetting beam.message = None at the start of _propagate_beam_async for beams that are being re-propagated, so that failed propagations produce beams with no message rather than stale old messages.
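The staleness mechanism can be shown with a simplified stand-in for the PR's `Beam` (field names follow the quoted code, but this dataclass is illustrative):

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Optional

@dataclass
class BeamStub:
    """Simplified stand-in for the PR's Beam dataclass."""
    id: str
    text: str = ""
    score: float = 0.0
    message: Optional[str] = None

survivor = BeamStub(id="a", text="abc", score=0.9, message="old response")
clone = deepcopy(survivor)  # the reviewer duplicates high-scoring beams this way

# If the clone's next propagation fails, the previous iteration's message is
# still attached, so the clone would be scored against a stale response.
stale = clone.message

# Clearing the field at the start of re-propagation avoids that:
clone.message = None
```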
```python
@pytest.fixture
def mock_non_true_false_scorer():
    """Create a mock scorer that is not a true/false type"""
    scorer = MagicMock(spec=Scorer)
    scorer.get_identifier.return_value = get_mock_scorer_identifier()
    return scorer
```
The mock_non_true_false_scorer fixture defined at line 56–61 is never referenced in any test method in this file. It should either be used in a test or removed to avoid dead code.
Suggested change:

```diff
-@pytest.fixture
-def mock_non_true_false_scorer():
-    """Create a mock scorer that is not a true/false type"""
-    scorer = MagicMock(spec=Scorer)
-    scorer.get_identifier.return_value = get_mock_scorer_identifier()
-    return scorer
```
```python
beam_reviewer=TopKBeamReviewer(k=2, drop_chars=1),
attack_scoring_config=scoring_config,
num_chars_per_step=0,
)
```
The test file only covers the initialization/validation logic of BeamSearchAttack, TopKBeamReviewer, and Beam.get_grammar. There are no tests for the core execution paths: _perform_async, _propagate_beam_async, _score_beam_async, _get_target_for_beam, and _determine_attack_outcome. Other single-turn attacks like SkeletonKeyAttack and PromptSendingAttack have comprehensive execution tests that mock the target and normalizer (see tests/unit/executor/attack/single_turn/test_skeleton_key.py and test_prompt_sending.py). The _propagate_beam_async and _score_beam_async can be tested by mocking self._prompt_normalizer and the scoring infrastructure, without needing a real OpenAIResponseTarget. Similarly, _get_target_for_beam can be tested by mocking fresh_instance.
Suggested change (the `Beam` constructor call is adjusted to the `id`/`text`/`score` signature used elsewhere in the PR):

```diff
 )
+
+def test_get_target_for_beam_uses_fresh_instance(self, mock_true_false_scorer, mock_float_scale_scorer):
+    """Test that _get_target_for_beam obtains a fresh target instance via fresh_instance."""
+    # Arrange: create an objective target whose fresh_instance method is mocked
+    objective_target = MagicMock(spec=OpenAIResponseTarget)
+    fresh_target_instance = MagicMock(spec=PromptTarget)
+    objective_target.fresh_instance.return_value = fresh_target_instance
+    scoring_config = AttackScoringConfig(
+        objective_scorer=mock_true_false_scorer,
+        auxiliary_scorers=[mock_float_scale_scorer],
+    )
+    attack = BeamSearchAttack(
+        objective_target=objective_target,
+        beam_reviewer=TopKBeamReviewer(k=2, drop_chars=1),
+        attack_scoring_config=scoring_config,
+    )
+    # Use a simple Beam instance; the specific content is not important for this interaction
+    beam = Beam(id="test-conversation", text="test", score=0.0)
+    # Act: call the internal helper
+    target_for_beam = attack._get_target_for_beam(beam)
+    # Assert: fresh_instance was used and its return value was propagated
+    objective_target.fresh_instance.assert_called_once()
+    assert target_for_beam is fresh_target_instance
```
```python
PREFIX: "{prefix}"
CONTINUATION: /.{{0,{n_chars}}}/
"""
prefix = self.text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
```
The get_grammar method escapes \\, ", and \n from the beam text before embedding it in the Lark grammar string. However, \r (carriage return, U+000D) is not escaped. Since '\r'.isprintable() returns False in Python, any trailing \r characters will be pruned by the trailing non-printable pruning loop. However, an embedded \r that appears before printable characters in the beam text will NOT be pruned and will be inserted unescaped into the grammar string literal, causing a Lark parse error at runtime.
If the model ever produces a response containing a Windows-style \r\n line ending, the \r would be embedded unescaped in the grammar. The escaping should be extended to handle \r (replacing it with \\r), and similarly any other non-printable characters that could appear in an intermediate position should be stripped or escaped before the trailing-prune step.
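The gap is easy to demonstrate with the exact escaping chain from the quoted code: a `\r` embedded before printable text survives all three `replace` calls, and since `'\r'.isprintable()` is `False`, only a *trailing* `\r` would be caught by the pruning loop.

```python
# Apply the same escaping the quoted get_grammar code uses.
text = "line1\r\nline2"
prefix = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

has_raw_cr = "\r" in prefix        # the carriage return is still embedded, unescaped
cr_printable = "\r".isprintable()  # False, so the trailing-prune loop would stop at it
```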
Suggested change:

```diff
-prefix = self.text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
+escaped_parts = []
+for ch in self.text:
+    if ch == "\\":
+        escaped_parts.append("\\\\")
+    elif ch == '"':
+        escaped_parts.append('\\"')
+    elif ch == "\n":
+        escaped_parts.append("\\n")
+    elif ch == "\r":
+        escaped_parts.append("\\r")
+    elif ch.isprintable():
+        escaped_parts.append(ch)
+prefix = "".join(escaped_parts)
```
```python
# Store the prepended conversation configuration
self._prepended_conversation_config = prepended_conversation_config
self._start_context: Optional[SingleTurnAttackContext[Any]] = None
```
BeamSearchAttack does not override get_attack_scoring_config(), while the closely related PromptSendingAttack (which also stores _objective_scorer and _auxiliary_scorers) does override it. This means the attack's objective scorer and auxiliary scorers are not included in the attack's ComponentIdentifier, making it harder to distinguish between BeamSearchAttack instances configured with different scorers. For consistency with PromptSendingAttack (at pyrit/executor/attack/single_turn/prompt_sending.py:121-131), BeamSearchAttack should override get_attack_scoring_config() to return an AttackScoringConfig built from self._objective_scorer and self._auxiliary_scorers.
Suggested change:

```python
def get_attack_scoring_config(self) -> AttackScoringConfig:
    """
    Get the attack scoring configuration for this beam search attack.

    Returns:
        AttackScoringConfig: The scoring configuration with objective and auxiliary scorers.
    """
    return AttackScoringConfig(
        objective_scorer=self._objective_scorer,
        auxiliary_scorers=self._auxiliary_scorers,
    )
```
```python
if beam.objective_score and beam.objective_score.get_value():
    # We have a positive score, so it's a success
    return AttackOutcome.SUCCESS, "Objective achieved according to scorer"

# No response at all (all attempts filtered/failed)
return AttackOutcome.FAILURE, "All attempts were filtered or failed to get a response"
```
The comment at line 508 says "No response at all (all attempts filtered/failed)" but this outcome_reason applies to two distinct cases: (1) all attempts genuinely failed/were filtered (so beam.objective_score is None), and (2) the objective scorer returned a non-positive score (the beam got a response but didn't achieve the objective). The misleading outcome_reason ("All attempts were filtered or failed to get a response") is reported in both cases, even when a response was successfully received but failed the objective check. The outcome reason should reflect the actual cause: when beam.objective_score is None, use "No score was produced"; when it's present but non-positive, use "Objective not achieved according to scorer".
Suggested change:

```diff
-if beam.objective_score and beam.objective_score.get_value():
-    # We have a positive score, so it's a success
-    return AttackOutcome.SUCCESS, "Objective achieved according to scorer"
-# No response at all (all attempts filtered/failed)
-return AttackOutcome.FAILURE, "All attempts were filtered or failed to get a response"
+objective_value = None
+if beam.objective_score is not None:
+    objective_value = beam.objective_score.get_value()
+if objective_value is not None and objective_value > 0:
+    # We have a positive score, so it's a success
+    return AttackOutcome.SUCCESS, "Objective achieved according to scorer"
+if beam.objective_score is None:
+    # No score was produced for this beam
+    return AttackOutcome.FAILURE, "No score was produced"
+# A response was scored but did not achieve the objective
+return AttackOutcome.FAILURE, "Objective not achieved according to scorer"
```
Description
Use the Lark grammar feature of the OpenAIResponseTarget to create a beam search for PyRIT. This is a single-turn attack, where a collection of candidate responses (the beams) is maintained. On each iteration, the model's response is allowed to extend a little for each beam. The beams are scored, with the worst performing ones discarded and replaced with copies of higher scoring beams.

Tests and Documentation
Basic unit tests of the classes have been added, but since this requires features currently only in the OpenAIResponseTarget, there did not seem much point in mocking it. There is a notebook which runs everything end-to-end.
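The per-iteration review step described above (discard the worst beams, refill with copies of the best) can be sketched as follows; this is illustrative only, and `TopKBeamReviewer`'s actual implementation may differ:

```python
from copy import deepcopy

def review_top_k(beams: list[dict], k: int) -> list[dict]:
    """Keep the k best-scoring beams; refill pruned slots with copies of them."""
    ranked = sorted(beams, key=lambda b: b["score"], reverse=True)
    survivors = ranked[:k]
    # Pruned slots are refilled with deep copies of the top-k beams,
    # cycling through the survivors in order.
    refills = [deepcopy(survivors[i % k]) for i in range(len(beams) - k)]
    return survivors + refills

beams = [
    {"text": "a", "score": 0.1},
    {"text": "b", "score": 0.9},
    {"text": "c", "score": 0.5},
    {"text": "d", "score": 0.3},
]
reviewed = review_top_k(beams, k=2)
```

The deep copies matter: each refilled beam must be able to diverge from its source on the next propagation step without aliasing its state.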