Improvements to partition export#1402
Conversation
…xport request in case pool is full
…ly preserve lock inside task setting
|
@codex please review it deeply |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6fad4e2402
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /// what to do if it's not a valid value? | ||
| } | ||
|
|
||
| manifest.lock_inside_the_task = json->getValue<bool>("lock_inside_the_task"); |
There was a problem hiding this comment.
Keep manifest parsing backward-compatible
Make lock_inside_the_task optional when deserializing metadata. Existing metadata.json znodes created before this change do not contain that key, so json->getValue<bool>("lock_inside_the_task") throws during polling/system-table reads and can prevent processing of in-flight exports after upgrade until those znodes are manually cleaned up.
Useful? React with 👍 / 👎.
| LOG_INFO(storage.log, "ExportPartFromPartitionExportTask: Failed to lock part {}, skipping", part_name); | ||
| return false; |
There was a problem hiding this comment.
Remove export manifest when lock acquisition fails
Clean up the pre-inserted export_manifests entry on the lock-failure path. In lock_inside_the_task mode, the scheduler inserts the manifest before execution, but if tryCreate(.../locks/<part>) fails here, the task exits without erasing that manifest, leaving the part permanently marked as already-exporting on this replica and blocking retries if the current lock owner dies before finishing.
Useful? React with 👍 / 👎.
| info.transaction_id = metadata.transaction_id; | ||
| info.create_time = metadata.create_time; |
There was a problem hiding this comment.
Populate query_id in partition export system rows
Assign info.query_id when building ReplicatedPartitionExportInfo. This method now fills many fields from manifest metadata but never sets query_id, so system.replicated_partition_exports.query_id is always empty even for active exports, which is a regression from the previous implementation and breaks query-to-export correlation.
Useful? React with 👍 / 👎.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Forward port of unmerged PR #1177
Documentation entry for user-facing changes
As of now, this PR does the following things:
solvesthe problem of one replica locking all parts in a task just because it ran faster even if it did not have bandwidth to execute all of them.ExportPartFromPartitionExportTasksystem.replicated_partition_exportsoption. Refactor queryingsystem.replicated_partition_exportsto usemulti_readrequests instead of several different read requests. Throws iffmulti_readis not supported (this was vibe coded).Settingsobject instead of onlyFormatSettings- now more settings should be preserved (part export only)lock_inside_the_taskis not production ready as of now (the entire feature is not production ready tbh) - there is a possible crash in case the user schedules an export, changes the schema of the destination table, and the export executes. This is because validation is not being done on the fly for this setting. But I think it is ok to ignore this corner case for now.This partially tackles the following:
CI/CD Options
Exclude tests:
Regression jobs to run: