feat(#79): GetPercentileAtOrBelowValue#135
Conversation
af33f5d to
6c9aae1
Compare
LeeCampbell
left a comment
There was a problem hiding this comment.
PR #135 Review: feat(#79): GetPercentileAtOrBelowValue
Author: leecampbell-codeagent | Base: main | Files: 2 (+151, -0)
Overview
This PR adds GetPercentileAtOrBelowValue(long value) to HistogramBase — the inverse of GetValueAtPercentile. Given a measurement value, it returns the percentile (0-100) at which that value falls in the distribution. This closes issue #79 and matches the already-documented method in spec/tech-standards/api-reference.md.
Implementation Review (HistogramBase.cs, +32 lines)
Algorithm correctness: Good. The implementation correctly mirrors the bucket-traversal loop in GetValueAtPercentile but in the forward direction:
- Guard for
TotalCount == 0→ returns0.0 - Computes target bucket/sub-bucket indices via
GetBucketIndex/GetSubBucketIndex - Guard for
bucketIndex >= BucketCount→ returns100.0 - Accumulates counts from
(0, 0)through(targetBucket, targetSubBucket) - Returns
(100.0 * runningCount) / TotalCount
The inner loop start index (j = 0 for bucket 0, j = SubBucketCount / 2 for subsequent buckets) correctly handles the logarithmic bucketing scheme, consistent with GetValueAtPercentile at line 378.
Issues found:
-
Typo in branch name and PR title —
GetPerecentileOfValueshould beGetPercentileOfValue. The branch isagent/79-getperecentileofvalue. Not blocking but worth fixing the PR title. -
Math.Min(100.0, ...)is unnecessary but harmless — SincerunningCount <= TotalCountalways holds (we only accumulate up to the target bucket),100.0 * runningCount / TotalCount <= 100.0is guaranteed. The JavapercentileAtOrBelowValueomits this clamp. It's defensive and won't cause issues, but it's a minor divergence from the reference implementation. -
No handling of negative input values — If
valueis negative,GetBucketIndexwill compute a very large bucket index (becauseNumberOfLeadingZerosreturns 0 for negative longs), which thebucketIndex >= BucketCountguard catches — returning100.0. This is incorrect semantics: a negative value should logically return0.0(no recorded values can be below a negative value since the histogram only tracks non-negative values). The Java implementation has the same behaviour, so this is consistent, but worth being aware of. Consider adding a guard:if (value < 0) return 0.0;
-
XML doc is good — Follows existing
<summary>,<param>,<returns>pattern. Minor note: the<param>tag lacks a period at the end, which is inconsistent with some other methods in the file, but this is a nit.
Test Review (HistogramTestBase.cs, +119 lines)
Coverage: Comprehensive. All 7 acceptance criteria are covered:
| Criterion | Test(s) |
|---|---|
| 1. Public method exists | All tests exercise it |
2. Returns [0.0, 100.0] |
ResultIsAlwaysInRange |
3. Returns % of values <= v |
KnownValueSet, SingleRecordedValue, BoundaryAtLowestTrackableValue |
4. Returns 100.0 for values at/above max |
ValueAtOrAboveHighestTrackable_Returns100 |
5. Returns 0.0 when empty |
EmptyHistogram_ReturnsZero |
| 6. Round-trip consistency | RoundTrip_ConsistentWithGetValueAtPercentile |
| 7. Monotonicity | IsMonotonic |
Plus a bonus test: LargeMagnitudeValues_ReturnsCorrectPercentile — tests values across different magnitude ranges. Good addition.
Issues found:
-
ResultIsAlwaysInRangehas a redundant assertion —histogram.GetPercentileAtOrBelowValue(0).Should().Be(0.0)asserts outside the loop, but0is also included in thequeryValuearray inside the loop where it's checked with.BeInRange(0.0, 100.0). The standalone assertion is stricter (checks exact 0.0 vs range), so it's not exactly redundant, but the test would read more clearly if the exact-0.0 check was either inside the loop with a conditional or documented with a comment explaining why it's tested separately. -
Missing edge case test: value = 0 — No test explicitly records values and then queries
GetPercentileAtOrBelowValue(0). TheResultIsAlwaysInRangetest queries 0 but only asserts it's0.0without context for why (no values recorded at 0). A dedicated test would strengthen coverage. -
SingleRecordedValuetests assumeTestValueLevel - 1maps to a different bucket —TestValueLevelis 4.TestValueLevel - 1is 3. Both values are small enough to be in bucket 0 but in different sub-buckets with 3 significant digits, so this works correctly. But the test would be more robust with a comment explaining this assumption. -
RoundTriptest could be fragile with edge values — The round-trip assertsGetValueAtPercentile(p) == HighestEquivalentValue(LowestEquivalentValue(v)). This is correct for the histogram's equivalent-value semantics, but values at bucket boundaries could theoretically behave differently. The chosen test values (1, 10, 100, 1000) are well within safe ranges, so this is fine in practice.
Commit Hygiene
-
5 commits with planning artifacts — The commit history includes:
plan(#79): initial brief from issueplan(#79): review briefplan(#79): create task breakdownfeat(#79): implement tasksfeat(#79): complete implementation
The first three commits added/moved
plan/directory files that were subsequently cleaned up. The net result is clean (no plan files in the branch), but the commit history exposes internal workflow noise. Recommend squash-merging this PR to collapse into a single clean commit. -
PR title typo — "GetPerecentileOfValue" → "GetPercentileOfValue" (extra 'e').
Summary
| Category | Verdict |
|---|---|
| Correctness | The algorithm is correct and matches the Java reference implementation |
| Tests | Comprehensive — all acceptance criteria covered with good additional cases |
| Code style | Consistent with existing codebase |
| Documentation | XML doc present and adequate |
| PR hygiene | Needs title typo fix; recommend squash merge |
Recommendation: Approve with minor fixes
The implementation is solid and well-tested. The blocking items before merge:
- Fix the PR title typo ("Perecentile" → "Percentile")
- Consider squash merge to clean up the planning commits
Optional improvements (non-blocking):
- Add a guard for negative input values (
if (value < 0) return 0.0;) - Add a test for
GetPercentileAtOrBelowValue(0)on a histogram with recorded positive values - Remove the unnecessary
Math.Min(100.0, ...)clamp to match the Java implementation exactly (or keep it — it's harmless)
- Add guard for negative input values returning 0.0 - Add test for GetPercentileAtOrBelowValue(0) with positive recorded values - Add test for negative input values - Fix missing period in XML doc param tag Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Issue #79: GetPercentileOfValue
Summary
The issue requests a method to answer: "Given a measurement value, what percentile does it fall under?"
This is the inverse of
GetValueAtPercentile.The API reference spec already names it
GetPercentileAtOrBelowValue(long value)— that name is more precise and matches the Java HdrHistogram library'spercentileAtOrBelowValue.The method must be implemented in
HistogramBaseand exposed as a public instance method.What Needs to Change and Why
HistogramBasecurrently hasGetValueAtPercentile(double percentile) → longbut no inverse.Users recording response times need to ask: "300ms is at what percentile?" without iterating the distribution themselves.
The spec (
spec/tech-standards/api-reference.mdline 132) already listsGetPercentileAtOrBelowValueas part of the public API surface, so this is a missing implementation.Affected Files (confirmed by exploration)
HdrHistogram/HistogramBase.cs— addGetPercentileAtOrBelowValue(long value)method (nearGetValueAtPercentile, around line 388)HdrHistogram.UnitTests/HistogramTestBase.cs— add shared tests covering all histogram typesspec/tech-standards/api-reference.md— no change needed; method is already listedAcceptance Criteria
double GetPercentileAtOrBelowValue(long value)exists onHistogramBase.[0.0, 100.0].v,GetPercentileAtOrBelowValue(v)returns the percentage of all recorded values that are<= v(using the histogram's equivalent-value range semantics).100.0whenvalue >= HighestTrackableValueor whenvalueis beyond the tracked range.0.0whenTotalCount == 0(empty histogram).GetValueAtPercentile:GetValueAtPercentile(GetPercentileAtOrBelowValue(v))should equal the highest equivalent value of the bucket containingv(within histogram resolution).v1 < v2,GetPercentileAtOrBelowValue(v1) <= GetPercentileAtOrBelowValue(v2).Algorithm
Mirrors the bucket-traversal loop in
GetValueAtPercentilebut in the forward direction:targetBucketIndex = GetBucketIndex(value)andtargetSubBucketIndex = GetSubBucketIndex(value, targetBucketIndex).targetBucketIndex >= BucketCount, return100.0.(targetBucketIndex, targetSubBucketIndex).(100.0 * countAtValue) / TotalCount.TotalCount == 0, return0.0.Test Strategy
Add tests to
HistogramTestBase(runs across all histogram types via the abstract factory):GetPercentileAtOrBelowValuereturns the expected percentage.GetValueAtPercentile(GetPercentileAtOrBelowValue(v))equalsHighestEquivalentValue(LowestEquivalentValue(v)).TotalCount == 0returns0.0.HighestTrackableValuereturns100.0.>= that valuereturns100.0, any query< that valuereturns0.0.LowestTrackableValuereturns(count_at_lowest / TotalCount) * 100.Tests should use
FluentAssertionswith appropriate tolerance for floating-point comparisons (e.g.,BeApproximately(..., precision: 0.1)).Risks and Open Questions
GetPercentileOfValue; the spec saysGetPercentileAtOrBelowValue.Use
GetPercentileAtOrBelowValueto match the spec and Java library.100.0for values aboveHighestTrackableValue(consistent with Java implementation).GetBucketIndex/GetSubBucketIndex, matching the same resolution logic used throughoutHistogramBase.All values within the same equivalent range map to the same percentile.
LongConcurrentHistogram,IntConcurrentHistogram) inherit the same implementation and their atomic reads ensure a consistent (if momentarily stale) snapshot.Task breakdown
Task Checklist: Issue #79 —
GetPercentileAtOrBelowValueCross-reference: every acceptance criterion in
brief.mdis covered by at least one task below (see criterion labels in parentheses).Implementation
HdrHistogram/HistogramBase.cs— AddGetPercentileAtOrBelowValue(long value)as a public instance method returningdouble, placed directly afterGetValueAtPercentile(around line 388).GetValueAtPercentilein the forward direction):TotalCount == 0, return0.0. (criterion 5)bucketIndex = GetBucketIndex(value)andsubBucketIndex = GetSubBucketIndex(value, bucketIndex).bucketIndex >= BucketCount, return100.0. (criterion 4)(bucketIndex, subBucketIndex)inclusive, accumulating counts viaGetCountAt(i, j). Use the same loop structure asGetValueAtPercentile: first bucket starts atj = 0; subsequent buckets start atj = SubBucketHalfCount. (criterion 3)(100.0 * runningCount) / TotalCount, clamped to[0.0, 100.0]. (criteria 1, 2)HistogramBase, compiles without error, and all tests below pass.HdrHistogram/HistogramBase.cs— Add XML doc comment forGetPercentileAtOrBelowValuefollowing the existing style (summary,<param>,<returns>).GetValueAtPercentile; the return range[0.0, 100.0]; the equivalent-value-range semantics.dotnet buildproduces no XML doc warnings for the new method.Unit Tests
All tests go in
HdrHistogram.UnitTests/HistogramTestBase.csas[Fact]methods on the abstract base class, so they run forShortHistogram,IntHistogram, andLongHistogramautomatically.Use
FluentAssertionsand the existingCreate(highestTrackableValue, significantFigures)/Create(lowestTrackableValue, highestTrackableValue, significantFigures)factory helpers.Floating-point comparisons must use
.BeApproximately(..., precision: 0.1)unless an exact value is expected.Empty histogram returns 0.0 (criterion 5)
histogram.GetPercentileAtOrBelowValue(TestValueLevel).Should().Be(0.0).Value at or above
HighestTrackableValuereturns 100.0 (criterion 4)histogram.GetPercentileAtOrBelowValue(DefaultHighestTrackableValue).Should().Be(100.0).histogram.GetPercentileAtOrBelowValue(long.MaxValue / 2).Should().Be(100.0).Basic percentile — known value set (criterion 3)
1through100(one each), soTotalCount == 100.GetPercentileAtOrBelowValue(50)is approximately50.0(±0.1).GetPercentileAtOrBelowValue(100)is approximately100.0(±0.1).GetPercentileAtOrBelowValue(1)is approximately1.0(±0.1).Return value is always in range
[0.0, 100.0](criterion 2)HighestTrackableValue), assert each result.Should().BeInRange(0.0, 100.0).Round-trip consistency with
GetValueAtPercentile(criterion 6)1,10,100,1000.v, computep = histogram.GetPercentileAtOrBelowValue(v).histogram.GetValueAtPercentile(p)equalshistogram.HighestEquivalentValue(histogram.LowestEquivalentValue(v)).Monotonicity — non-decreasing percentiles for increasing values (criterion 7)
1,50,100,500,1000.1, 10, 50, 100, 500, 1000, 5000in order.>=the previous one.Single recorded value — queries below return 0.0, queries at or above return 100.0 (criteria 3, 4, 5)
TestValueLevel(= 4) once.GetPercentileAtOrBelowValue(TestValueLevel - 1).Should().Be(0.0)(value below the recorded value).GetPercentileAtOrBelowValue(TestValueLevel).Should().Be(100.0)(value equals the recorded value).GetPercentileAtOrBelowValue(TestValueLevel + 1000).Should().Be(100.0)(value above the recorded value).Boundary — query at
LowestTrackableValue(criterion 3)lowestTrackableValue = 1.1once and value1000once (TotalCount == 2).GetPercentileAtOrBelowValue(1)is approximately50.0(±0.1) — one of two values is<= 1.Acceptance Criterion Coverage Matrix
double GetPercentileAtOrBelowValue(long value)onHistogramBase[0.0, 100.0]<= vusing equivalent-value semantics100.0whenvalue >= HighestTrackableValueor beyond range0.0whenTotalCount == 0GetValueAtPercentile(GetPercentileAtOrBelowValue(v))equals highest equivalent value of bucketCloses #79