[FLINK-38881][postgres] Support partition table routing for PostgreSQL CDC #4216
Open
tchivs wants to merge 1 commit into apache:master
…L CDC

This PR adds partition.tables configuration to support PostgreSQL 10+ partition table routing, significantly improving performance for databases with many partitions.

New Features:
- Child-to-parent routing for partition tables
- Pattern-based matching for child partitions
- Primary key inheritance from representative child

Performance Improvements:
- Fix frequent schema refresh on every table access
- Fix full schema load when requesting single table
- Consolidate child partition events to parent table
- Cache parent table schema to reduce DB queries
Summary
Add partition.tables configuration option to support PostgreSQL 10+ partition table routing, significantly improving performance for databases with many partitions.
JIRA: https://issues.apache.org/jira/browse/FLINK-38881
Background
Related to Discussion #4079 and PR #2571.
As noted in PR #2571: "It's very time-consuming for postgres to refresh schema if there are many tables to read. According to our testing, refreshing 400 tables takes 15 minutes."
This PR addresses similar performance issues specifically for PostgreSQL partition tables, where hundreds of child partitions can cause severe performance degradation.
New Features
PostgreSQL 10+ Partition Table Routing
Performance Improvements
Before (Problems)
After (Optimizations)
readSchema() now fetches only the requested table
Configuration
Flink SQL:
Pipeline YAML:
Format: parent:child_regex (comma-separated for multiple entries)
Test Plan
- PostgresPartitionRulesTest - Rule parsing tests
- PostgresPartitionRouterTest - Routing logic tests
- PostgresPartitionConnectorConfigTest - Config and filter tests
- PostgreSQLPartitionTablesConfigITCase - End-to-end integration test
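For reviewers who want a feel for what the rule parsing and routing tests exercise, here is a minimal sketch of the parent:child_regex format and child-to-parent routing. It is not the code in this PR: the class PartitionRoutingSketch, its Rule type, the methods parseRules and route, and the sample table names are all hypothetical. It also assumes that the child regex is matched against the full schema-qualified table name, which the description does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical names, not the classes added by this PR. */
public class PartitionRoutingSketch {

    /** One parsed entry: a parent table and the pattern its child partitions must match. */
    public static final class Rule {
        public final String parent;
        public final Pattern childPattern;

        Rule(String parent, Pattern childPattern) {
            this.parent = parent;
            this.childPattern = childPattern;
        }
    }

    /**
     * Parses a value such as "parent1:regex1,parent2:regex2".
     * Limitation of this sketch: entries are split on every comma, so a regex
     * that itself contains a comma (for example \d{1,3}) is not supported.
     */
    public static List<Rule> parseRules(String config) {
        List<Rule> rules = new ArrayList<>();
        for (String entry : config.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // The first colon separates the parent name from the child regex.
            int sep = trimmed.indexOf(':');
            if (sep <= 0 || sep == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected parent:child_regex but got: " + trimmed);
            }
            String parent = trimmed.substring(0, sep).trim();
            Pattern child = Pattern.compile(trimmed.substring(sep + 1).trim());
            rules.add(new Rule(parent, child));
        }
        return rules;
    }

    /** Returns the parent of the first rule whose pattern fully matches, else the table itself. */
    public static String route(List<Rule> rules, String table) {
        for (Rule rule : rules) {
            if (rule.childPattern.matcher(table).matches()) {
                return rule.parent;
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Rule> rules = parseRules("public.orders:public\\.orders_\\d+, public.logs:public\\.logs_.*");
        System.out.println(route(rules, "public.orders_202401")); // prints public.orders
        System.out.println(route(rules, "public.users"));         // prints public.users
    }
}
```

In this sketch the first matching rule wins and unmatched tables pass through unchanged. Whether the PR resolves overlapping patterns the same way is not stated in the description.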