
feat(ilp): binary wire protocol #7

Open
mtopolnik wants to merge 147 commits into main from jh_experiment_new_ilp

Conversation


@mtopolnik mtopolnik commented Feb 25, 2026

TODO in run_tests_pipeline.yaml! Change before merging!

# TODO: remove branch once jh_experiment_new_ilp is merged
- script: git clone --depth 1 -b jh_experiment_new_ilp https://github.com/questdb/questdb.git ./questdb

Change to

- script: git clone --depth 1 https://github.com/questdb/questdb.git ./questdb

bluestreak01 and others added 30 commits February 14, 2026 20:05
sendPongFrame() used the shared sendBuffer, calling reset()
which destroyed any partially-built frame the caller had in
progress via getSendBuffer(). This could happen when a PING
arrived during receiveFrame()/tryReceiveFrame() while the
caller was mid-way through constructing a data frame.

Add a dedicated 256-byte controlFrameBuffer for sending pong
responses. RFC 6455 limits control frame payloads to 125 bytes
plus a 14-byte max header, so 256 bytes is sufficient and never
needs to grow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
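The sizing argument can be sanity-checked in a few lines. This is a sketch of the arithmetic only; the constant names are illustrative, not the PR's code:

```java
// Sketch: why a fixed 256-byte buffer is always enough for a
// client-sent WebSocket control frame. Names are illustrative.
final class ControlFrameSizing {
    static final int MAX_CONTROL_PAYLOAD = 125; // RFC 6455, section 5.5
    static final int MAX_HEADER = 14;           // 2 base + 8 ext length + 4 mask

    static int worstCaseControlFrame() {
        return MAX_HEADER + MAX_CONTROL_PAYLOAD; // 139 bytes
    }

    static boolean fitsInControlBuffer(int bufferSize) {
        return worstCaseControlFrame() <= bufferSize;
    }
}
```

Because the payload can never exceed 125 bytes, the buffer never needs growth logic.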
sendCloseFrame() used reason.length() (UTF-16 code units) to
calculate the payload size, but wrote reason.getBytes(UTF_8)
(UTF-8 bytes) into the buffer. For non-ASCII close reasons,
UTF-8 encoding can be longer than the UTF-16 length, causing
writes past the declared payload size. This corrupted the
frame header length, the masking range, and could overrun the
allocated buffer.

Compute the UTF-8 byte array upfront and use its length for
all sizing calculations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
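The mismatch is easy to reproduce: for non-ASCII reasons, `String.length()` undercounts the bytes actually written. A minimal sketch (method names are illustrative, not the PR's code):

```java
import java.nio.charset.StandardCharsets;

// Sketch of the fix: size the close-frame payload from the UTF-8
// byte array that is actually written, not from String.length()
// (which counts UTF-16 code units).
final class ClosePayloadSizing {
    static int buggyPayloadSize(String reason) {
        return 2 + reason.length();         // 2 bytes for the close code
    }

    static int correctPayloadSize(String reason) {
        byte[] utf8 = reason.getBytes(StandardCharsets.UTF_8);
        return 2 + utf8.length;             // matches the bytes written
    }
}
```

For `"déconnexion"` (11 chars, 12 UTF-8 bytes) the buggy size is 13 while 14 bytes are written, corrupting the declared frame length.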
When receiving a CLOSE frame from the server, the client now echoes
a close frame back before marking the connection as no longer
upgraded. This is required by RFC 6455 Section 5.5.1.

The close code parsing was moved out of the handler-null check so
the code is always available for the echo. The echo uses the
dedicated controlFrameBuffer to avoid clobbering any in-progress
frame in the main send buffer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle CONTINUATION frames (opcode 0x0) in tryParseFrame()
which were previously silently dropped. Fragment payloads are
accumulated in a lazily-allocated native memory buffer and
delivered as a complete message to the handler when the final
FIN=1 frame arrives.

The FIN bit is now checked on TEXT/BINARY frames: FIN=0 starts
fragment accumulation, FIN=1 delivers immediately. Protocol
errors are raised for continuation without an initial fragment
and for overlapping fragmented messages.

The fragment buffer is freed in close() and the fragmentation
state is reset on disconnect().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
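The reassembly state machine described above can be sketched as follows, using a heap buffer in place of the lazily-allocated native memory; opcodes and names are illustrative, not the PR's API:

```java
import java.io.ByteArrayOutputStream;

// Simplified sketch of WebSocket fragment reassembly: CONTINUATION
// (0x0) frames append to an accumulator; FIN=1 delivers the message.
final class FragmentAssembler {
    private ByteArrayOutputStream fragments; // lazily allocated

    /** Returns the complete message when FIN=1, else null. */
    static final int OPCODE_CONTINUATION = 0x0;

    byte[] onFrame(int opcode, boolean fin, byte[] payload) {
        boolean continuation = opcode == OPCODE_CONTINUATION;
        if (continuation && fragments == null) {
            throw new IllegalStateException("continuation without initial fragment");
        }
        if (!continuation && fragments != null) {
            throw new IllegalStateException("overlapping fragmented messages");
        }
        if (fin && fragments == null) {
            return payload;                  // unfragmented fast path
        }
        if (fragments == null) {
            fragments = new ByteArrayOutputStream();
        }
        fragments.write(payload, 0, payload.length);
        if (!fin) {
            return null;                     // wait for more fragments
        }
        byte[] message = fragments.toByteArray();
        fragments = null;                    // reset fragmentation state
        return message;
    }
}
```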
Add a configurable maximum size for the WebSocket receive buffer,
mirroring the pattern already used by WebSocketSendBuffer. Previously,
growRecvBuffer() doubled the buffer without any upper bound, allowing
a malicious server to trigger out-of-memory by sending arbitrarily
large frames.

Add getMaximumResponseBufferSize() to HttpClientConfiguration
(defaulting to Integer.MAX_VALUE for backwards compatibility) and
enforce the limit in both growRecvBuffer() and
appendToFragmentBuffer(), which had the same unbounded growth issue
for fragmented messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
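A sketch of the bounded-doubling pattern: grow toward the required capacity, but fail once the configured maximum would be exceeded. Names are illustrative, not the PR's API:

```java
// Sketch: doubling growth clamped to a configurable maximum, so a
// hostile peer cannot force unbounded allocation.
final class BoundedGrowth {
    static int grownCapacity(int current, int needed, int max) {
        if (needed > max) {
            throw new IllegalStateException(
                    "receive buffer would exceed maximum size " + max);
        }
        int cap = current;
        while (cap < needed) {
            cap = cap <= max / 2 ? cap * 2 : max; // double, clamped to max
        }
        return cap;
    }
}
```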
Tests that expect connection failure were hardcoding ports
(9000, 19999) which could collide with running services. When
a QuestDB server is running on port 9000, the WebSocket
connection succeeds and the test fails with "Expected
LineSenderException".

Replace hardcoded ports with dynamically allocated ephemeral
ports via ServerSocket(0). The port is bound and immediately
closed, guaranteeing nothing is listening when the test tries
to connect.

Affected tests:
- testBuilderWithWebSocketTransportCreatesCorrectSenderType
- testConnectionRefused
- testWsConfigString
- testWsConfigString_missingAddr_fails
- testWsConfigString_protocolAlreadyConfigured_fails

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
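The ephemeral-port trick looks like this (a sketch, not the test code itself):

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch: bind port 0 so the OS assigns a free ephemeral port, then
// close the socket so nothing is listening on that port.
final class FreePort {
    static int acquireFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort(); // OS-assigned ephemeral port
        } catch (IOException e) {
            throw new RuntimeException(e);
        }                                 // socket closed: port is free again
    }
}
```

There remains a small race window in which another process could claim the port, but for a connection-refused test that window is far safer than a hardcoded port a real service may occupy.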
The Sec-WebSocket-Accept header validation used case-sensitive
String.contains(), which violates RFC 7230 (HTTP headers are
case-insensitive). A server sending the header in a different
casing (e.g., sec-websocket-accept) would cause the handshake
to fail.

Replace with a containsHeaderValue() helper that uses
String.regionMatches(ignoreCase=true) for the header name
lookup, avoiding both the case-sensitivity bug and unnecessary
string allocation from toLowerCase().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
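A sketch in the spirit of that helper, matching the header name case-insensitively at line starts via `regionMatches` instead of lowercasing the whole response; the method name and exact matching rules are illustrative:

```java
// Sketch: allocation-free, case-insensitive header lookup in a raw
// HTTP response. Matches "name:" at the start of each line.
final class HeaderLookup {
    static String headerValue(String response, String name) {
        int n = name.length();
        int lineStart = 0;
        while (lineStart < response.length()) {
            if (lineStart + n < response.length()
                    && response.regionMatches(true, lineStart, name, 0, n)
                    && response.charAt(lineStart + n) == ':') {
                int valueStart = lineStart + n + 1;
                int lineEnd = response.indexOf('\n', valueStart);
                if (lineEnd < 0) lineEnd = response.length();
                return response.substring(valueStart, lineEnd).trim();
            }
            int nl = response.indexOf('\n', lineStart);
            if (nl < 0) break;
            lineStart = nl + 1;
        }
        return null;
    }
}
```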
Replace byte-by-byte native-heap copies in writeToSocket and
readFromSocket with Unsafe.copyMemory(), using the 5-argument
form that bridges native memory and Java byte arrays via
Unsafe.BYTE_OFFSET.

Add WebSocketChannelTest with a local echo server that verifies
data integrity through the copy paths across various payload
sizes and patterns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
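The bridging copy can be sketched as below, assuming access to `sun.misc.Unsafe` via the usual reflection trick (shown for illustration only; the PR uses the client's own Unsafe wrapper and BYTE_OFFSET constant):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Sketch: bulk copy between native memory and a Java byte[] using the
// 5-argument Unsafe.copyMemory, replacing a per-byte loop. A null base
// object plus an absolute address denotes the native-memory side.
final class BulkCopy {
    static final Unsafe UNSAFE;
    static final long BYTE_OFFSET;
    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
            BYTE_OFFSET = UNSAFE.arrayBaseOffset(byte[].class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Copies len bytes from native memory at srcAddr into dst[dstPos..]. */
    static void nativeToArray(long srcAddr, byte[] dst, int dstPos, int len) {
        UNSAFE.copyMemory(null, srcAddr, dst, BYTE_OFFSET + dstPos, len);
    }

    /** Copies len bytes from src[srcPos..] into native memory at dstAddr. */
    static void arrayToNative(byte[] src, int srcPos, long dstAddr, int len) {
        UNSAFE.copyMemory(src, BYTE_OFFSET + srcPos, null, dstAddr, len);
    }
}
```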
Move maxSentSymbolId and sentSchemaHashes updates to after the
send/enqueue succeeds in both async and sync flush paths. Previously
these were updated before the send, so if sealAndSwapBuffer() threw
(async) or sendBinary()/waitForAck() threw (sync), the next batch's
delta dictionary would omit symbols the server never received,
silently corrupting subsequent data.

Also move sentSchemaHashes.add() inside the messageSize > 0 guard
in the sync path, where it was incorrectly marking schemas as sent
even when no data was produced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The validate() range check used TYPE_DECIMAL256 (0x15) as the upper
bound, which excluded TYPE_CHAR (0x16). CHAR columns would throw
IllegalArgumentException on validation.

Extend the upper bound to TYPE_CHAR and add tests covering all valid
type codes, nullable CHAR, and invalid type rejection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace raw AssertionError with LineSenderException when a token
parameter is provided in ws:: or wss:: configuration strings. The
else branch in config string parsing was unreachable when the code
only supported HTTP and TCP, but became reachable after WebSocket
support was added. Users now get a clear "token is not supported
for WebSocket protocol" error instead of a cryptic AssertionError.

Add test assertions for both ws:: and wss:: schemas with token.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mtopolnik and others added 30 commits March 3, 2026 10:14
QwpVarint provided standalone LEB128 varint encode/decode utilities,
but no production code uses it. The actual varint encoding path goes
through WebSocketSendBuffer.putVarint(), which has built-in capacity
checking via ensureCapacity() on each byte write.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
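For reference, the LEB128 encoding that the surviving putVarint() path performs byte by byte looks like this (a standalone sketch, not the removed class or the production code):

```java
import java.io.ByteArrayOutputStream;

// Sketch: unsigned LEB128 varint. Each byte carries 7 value bits;
// the high bit marks continuation.
final class Leb128 {
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            int b = (int) (value & 0x7F);
            value >>>= 7;
            if (value == 0) {
                out.write(b);        // final byte: high bit clear
                return out.toByteArray();
            }
            out.write(b | 0x80);     // continuation: high bit set
        }
    }

    static long decode(byte[] bytes) {
        long value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return value;
    }
}
```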
recvOrTimeout() caught all HttpClientException from ioWait(),
treating every error as a timeout. This silently swallowed real
I/O errors such as epoll/kqueue/select failures ("queue error")
and control setup failures ("epoll_ctl failure").

Add an isTimeout flag to HttpClientException and tag it via
flagAsTimeout() at the two throw sites: dieWaiting() and
getRemainingTimeOrThrow(). The catch in recvOrTimeout() now
re-throws non-timeout exceptions instead of swallowing them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
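The flag-and-rethrow pattern can be sketched as follows; class and method names are illustrative, not the client's actual API:

```java
// Sketch: tag exceptions thrown at timeout sites so a catch-all can
// distinguish them from real I/O failures instead of swallowing both.
final class TimeoutFlagging {
    static class ClientException extends RuntimeException {
        private boolean timeout;
        ClientException(String message) { super(message); }
        ClientException flagAsTimeout() { timeout = true; return this; }
        boolean isTimeout() { return timeout; }
    }

    /** Returns false on timeout; propagates everything else. */
    static boolean recvOrTimeout(Runnable ioWait) {
        try {
            ioWait.run();
            return true;
        } catch (ClientException e) {
            if (e.isTimeout()) {
                return false;        // expected: caller retries or gives up
            }
            throw e;                 // real I/O error: do not swallow
        }
    }
}
```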
Add two tests that verify recvOrTimeout() correctly distinguishes
timeout exceptions from real I/O errors. A FakeSocket that always
returns 0 from recv() forces the ioWait path, and a controllable
ioWait override injects either a timeout or a queue error.

- testRecvOrTimeoutReturnsFalseOnTimeout: verifies that a
  timeout-flagged exception causes receiveFrame() to return false.
- testRecvOrTimeoutPropagatesNonTimeoutError: verifies that a
  non-timeout exception (e.g., epoll/kqueue failure) propagates
  out of receiveFrame() instead of being silently swallowed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove classes with zero production callers:

- ParanoiaState.java
- cairo/arr/BorrowedArray.java
- cutlass/http/HttpCookie.java
- cutlass/http/HttpHeaderParameterValue.java
- cutlass/qwp/protocol/QwpNullBitmap.java
- cutlass/qwp/protocol/QwpZigZag.java
- std/AbstractLowerCaseCharSequenceHashSet.java
- std/Base64Helper.java
- std/ConcurrentHashMap.java
- std/ConcurrentIntHashMap.java
- std/FilesFacade.java
- std/GenericLexer.java
- std/LongObjHashMap.java
- std/LowerCaseCharSequenceHashSet.java
- std/ex/BytecodeException.java
- std/str/DirectCharSequence.java

Remove corresponding test files:

- test/cutlass/qwp/protocol/QwpNullBitmapTest.java
- test/cutlass/qwp/protocol/QwpZigZagTest.java
- test/std/ConcurrentHashMapTest.java
- test/std/ConcurrentIntHashMapTest.java

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The core module already has a superset QwpBitReader, and
QwpGorillaEncoderTest (the only user of the client-side reader)
has moved to core. This eliminates the duplicated reader class.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove classes that lost all production callers after the
previous round of dead-code removal:

- BufferWindowCharSequence (only implementor was GenericLexer)
- ImmutableIterator (only implementor was GenericLexer)
- CharSequenceHashSet (only used in GenericLexer)
- BiIntFunction (only used in ConcurrentIntHashMap)
- GeoHashes (only caller was Rnd.nextGeoHash())

Also remove Rnd.nextGeoHash() which has zero callers, and
delete the CharSequenceHashSetTest.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 16 dead public methods/constants that have zero callers
in non-test code within the java-questdb-client submodule:

Numbers: isPow2(), parseFloat(), parseIPv4Quiet(),
parseInt000Greedy(), parseIntQuiet(), parseIntSafely(),
parseLong000000Greedy()

Unsafe: arrayGetVolatile() (2 overloads), arrayPutOrdered()
(2 overloads), defineAnonymousClass(), getNativeAllocator(),
setRssMemLimit()

Also: Misc.getWorkerAffinity(), Hash.hashLong128_32(),
Hash.hashLong128_64(), IOOperation.HEARTBEAT

Clean up infrastructure that only served the removed methods:
AnonymousClassDefiner interface and its two implementations,
NATIVE_ALLOCATORS array and constructNativeAllocator(),
INT_OFFSET/INT_SCALE fields, and newly unused imports.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove dead methods with no production callers from five
classes:

Numbers.java: appendLong256(), appendUuid(),
appendHexPadded(long, int), intToIPv4Sink(), and their
private helpers (appendLong256Four/Three/Two).

Chars.java: contains(), noMatch().

Utf8s.java: stringFromUtf8BytesSafe(), both toString()
overloads, validateUtf8() and its four private helpers.

ColumnType.java: encodeArrayTypeWithWeakDims().

Update test code that referenced deleted methods:
TestUtils replaces Chars.contains() with String.contains()
and inlines ipv4ToString(). TestHttpClient replaces
Utf8s.toString() with a direct null-safe call.
ColumnTypeTest and NumbersTest drop tests for removed
methods.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The client's Unsafe class had getRssMemLimit(),
checkAllocLimit(), and RSS_MEM_LIMIT_ADDR, but no
setRssMemLimit() method. The limit address was always zero,
making the allocation limit check in malloc() and realloc()
dead code. Remove the field, both methods, the static
initializer slot, and the now-stale allocator.rs layout
comment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QwpWebSocketEncoder had two buffer fields: ownedBuffer (the
allocated NativeBufferWriter) and buffer (the active write target,
typed as QwpBufferWriter interface). The indirection existed so
callers could inject an external buffer via setBuffer(), but no
code ever called that method.

The split caused a use-after-free: close() freed and nulled
ownedBuffer but left buffer pointing to the closed writer.

Merge the two fields into a single NativeBufferWriter buffer.
Delete setBuffer(), isUsingExternalBuffer(), and the conditional
guard in reset(). close() now nulls the only reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ColumnBuffer constructor leaks allocateStorage() buffers when the
subsequent Unsafe.calloc() for the null bitmap throws. Wrap both
allocateStorage() and calloc in try/catch so close() frees any
partially allocated buffers before rethrowing.

Similarly, allocateStorage() leaks stringOffsets when the stringData
allocation fails for STRING/VARCHAR columns. Guard the second
allocation with try/catch that closes stringOffsets on failure.
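The exception-safe two-phase allocation can be sketched as below; the Allocator interface stands in for the Unsafe malloc/calloc calls and is illustrative, not the PR's API:

```java
// Sketch: when a constructor performs two native allocations, free the
// first if the second throws, so nothing leaks before rethrowing.
final class ColumnBufferSketch implements AutoCloseable {
    interface Allocator {
        long alloc(long bytes);
        void free(long ptr);
    }

    private final Allocator allocator;
    private long storage;
    private long nullBitmap;

    ColumnBufferSketch(Allocator allocator, long storageBytes, long bitmapBytes) {
        this.allocator = allocator;
        storage = allocator.alloc(storageBytes);
        try {
            nullBitmap = allocator.alloc(bitmapBytes);
        } catch (RuntimeException e) {
            close();                 // release storage before rethrowing
            throw e;
        }
    }

    @Override
    public void close() {
        if (storage != 0) { allocator.free(storage); storage = 0; }
        if (nullBitmap != 0) { allocator.free(nullBitmap); nullBitmap = 0; }
    }
}
```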

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QwpSchemaHash.computeSchemaHash() and computeSchemaHashDirect()
encoded lone surrogates (high surrogate at end of string, or
lone low surrogate) as 3-byte UTF-8, while
OffHeapAppendMemory.putUtf8() replaced them with '?' (1 byte).
This mismatch caused the schema hash to diverge from the actual
wire encoding when a column name contained a lone surrogate.

Add a Character.isSurrogate(c) guard before the 3-byte else
branch in both methods, so lone surrogates hash as '?' —
consistent with putUtf8(). Add tests covering lone high
surrogate at end of string and lone low surrogate for both
computeSchemaHash and computeSchemaHashDirect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
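A sketch of a surrogate-aware UTF-8 length calculation in the same spirit: a lone surrogate counts as one byte (the `'?'` replacement) rather than three, so the computed length matches what a replacing encoder actually writes. Names are illustrative, not the PR's code:

```java
// Sketch: UTF-8 byte length of a string, counting lone surrogates as
// one byte ('?') to stay consistent with a replacing encoder.
final class SurrogateSafeLength {
    static int utf8Length(CharSequence s) {
        int len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                len += 1;
            } else if (c < 0x800) {
                len += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                len += 4;            // valid surrogate pair: 4-byte UTF-8
                i++;
            } else if (Character.isSurrogate(c)) {
                len += 1;            // lone surrogate: encoder writes '?'
            } else {
                len += 3;
            }
        }
        return len;
    }
}
```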
Add a Cache@2 task to the Azure Pipelines BuildAndTest matrix
so that the 4 platform jobs (linux, mac-arm, mac-x64, windows)
reuse downloaded Maven dependencies across runs instead of
fetching them from scratch each time.

The cache key incorporates the OS and all pom.xml hashes.
A restoreKeys fallback on just the OS ensures partial cache
hits when dependencies change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add always() to the condition expressions in questdb_stop.yaml so
the stop steps execute regardless of whether prior steps succeeded.
Without this, Azure DevOps defaults to succeeded() and skips the
cleanup when Maven tests fail, leaving the QuestDB server process
orphaned until the VM is torn down.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Cache@2 task uses $(HOME)/.m2/repository as the cache path.
On Linux/macOS agents, HOME is a native environment variable that
Azure DevOps imports as a pipeline variable. On Windows, HOME is
not set natively (Windows uses USERPROFILE), so $(HOME) remains
unexpanded as a literal string, causing tar to fail with
"could not chdir to 'D:\a\1\s\$(HOME)\.m2\repository'".

Add a step before the cache task that sets HOME from USERPROFILE
on Windows agents via ##vso[task.setvariable]. The step runs only
when Agent.OS is Windows_NT, leaving Linux/macOS unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QwpWebSocketSender.shouldAutoFlush() accepted an
autoFlushBytes parameter but never evaluated it. This
commit implements the byte threshold by querying actual
column buffer sizes rather than maintaining an estimated
counter.

This counts the bytes held in column buffers before wire encoding;
the encoded size will be smaller due to Gorilla compression,
bit-packing, etc.

QwpTableBuffer.getBufferedBytes() sums OffHeapAppendMemory
append offsets and array data across all columns.
QwpWebSocketSender.getPendingBytes() aggregates this across
all table buffers. shouldAutoFlush() now checks this sum
against autoFlushBytes between the row-count and interval
checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
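The resulting decision order (rows, then bytes, then interval) can be sketched as a pure function; the parameter names are illustrative, not the sender's API:

```java
// Sketch: auto-flush decision checking row count first, then buffered
// (pre-encoding) bytes, then elapsed interval. A threshold of 0 or
// less disables that check.
final class AutoFlushPolicy {
    static boolean shouldAutoFlush(
            long pendingRows, long autoFlushRows,
            long pendingBytes, long autoFlushBytes,
            long nanosSinceLastFlush, long autoFlushIntervalNanos) {
        if (autoFlushRows > 0 && pendingRows >= autoFlushRows) {
            return true;
        }
        // Conservative: pendingBytes is measured before wire encoding,
        // which shrinks the data further.
        if (autoFlushBytes > 0 && pendingBytes >= autoFlushBytes) {
            return true;
        }
        return autoFlushIntervalNanos > 0
                && nanosSinceLastFlush >= autoFlushIntervalNanos;
    }
}
```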
Add testSenderBuilderWebSocket() to QwpSenderTest which exercises the
production Sender.builder(Transport.WEBSOCKET) path end-to-end. All
existing E2E tests use QwpWebSocketSender.connect() directly, but
production users would use Sender.builder(). The new test writes rows
with mixed column types (symbol, double, long, boolean, string) and
verifies data arrives correctly via SQL queries.

Also convert string concatenation in assertion expected values to
text blocks throughout QwpSenderTest for improved readability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QwpWebSocketSender.table() accepted any CharSequence without
validation, unlike the HTTP sender which validates via
TableUtils. Empty, null, or whitespace-only table names passed
through to the server, causing internal errors instead of
clear client-side diagnostics.

Add validateTableName() in table() and checkedColumnName() in
all column methods. Both use TableUtils.isValidTableName/
isValidColumnName to reject empty, null, too-long, and names
with illegal characters, throwing LineSenderException with a
descriptive message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QwpTableBuffer.addDecimal64/128/256 silently rescale values
when scales differ, but when rescaling down would lose
precision, Decimal256.rescale() threw a raw NumericException
that leaked to the caller unwrapped.

Catch NumericException from rescale() and wrap it in
LineSenderException with a clear message naming the column
and the incompatible scales. Add unit tests for the
Decimal64 and Decimal128 precision-loss paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>