MDEV-28213 Skip ignored domain IDs during GTID validation #4677

Open
bodyhedia44 wants to merge 1 commit into MariaDB:10.6 from bodyhedia44:MDEV-28213-fix-ignored-domain-validation

@bodyhedia44

When a slave connects to a master using MASTER_USE_GTID=Slave_Pos and the
master has purged old binlogs, the master validates the slave's GTID state
against the oldest available binlog's Gtid_list event. If the Gtid_list
references domains that the slave is configured to ignore (via
CHANGE MASTER IGNORE_DOMAIN_IDS or DO_DOMAIN_IDS), validation incorrectly
fails with error 1236:

"Could not find GTID state requested by slave in any binlog files.
Probably the slave state is too old and required binlog files have
been purged."

This is a false rejection -- the slave does not need events from those domains.

Fix: the slave now sends its IGNORE_DOMAIN_IDS and DO_DOMAIN_IDS to the master
as user variables (@slave_connect_state_domain_ids_ignore and
@slave_connect_state_domain_ids_do) before COM_BINLOG_DUMP. The master reads
these and skips validation for ignored domains in three code paths:

  • contains_all_slave_gtid(): skip domains not needed by the slave when
    searching for the right binlog file
  • check_slave_start_position(): skip validation for domains the slave
    does not care about
  • gtid_find_binlog_file(): pass the ignore/do lists through to the above
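
A minimal sketch of the skipping rule described above, under stated assumptions: `DomainFilter`, `domain_is_filtered()` and `slave_state_is_valid()` are hypothetical names and not the server's actual types or functions, and validation is reduced to a per-domain presence check (the real code also compares sequence numbers). It also assumes that a non-empty DO_DOMAIN_IDS list filters out every domain not listed in it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the slave's filter lists; not the server's types.
struct DomainFilter
{
  std::vector<uint32_t> ignore_ids;  // from IGNORE_DOMAIN_IDS
  std::vector<uint32_t> do_ids;      // from DO_DOMAIN_IDS
};

static bool contains(const std::vector<uint32_t> &ids, uint32_t id)
{
  return std::find(ids.begin(), ids.end(), id) != ids.end();
}

// True when the slave filters this domain out, so it needs no validation.
// Assumption: a non-empty DO list filters every domain outside it.
static bool domain_is_filtered(const DomainFilter &f, uint32_t domain_id)
{
  if (contains(f.ignore_ids, domain_id))
    return true;
  return !f.do_ids.empty() && !contains(f.do_ids, domain_id);
}

// Returns true if every domain in the binlog's Gtid_list that the slave
// actually replicates is present in the slave's connect state.
static bool slave_state_is_valid(const std::vector<uint32_t> &gtid_list_domains,
                                 const std::vector<uint32_t> &slave_state_domains,
                                 const DomainFilter &filter)
{
  for (uint32_t domain_id : gtid_list_domains)
  {
    if (domain_is_filtered(filter, domain_id))
      continue;      // skip validation for domains the slave does not want
    if (!contains(slave_state_domains, domain_id))
      return false;  // a needed domain is missing: the error 1236 case
  }
  return true;
}
```

With `ignore_ids` set to `{2}`, a Gtid_list covering domains 1 and 2 validates against a slave state that only has domain 1; with an empty filter the same input is rejected.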

This is backwards compatible: older masters store the unknown user variables
harmlessly, and older slaves simply do not send them.

Includes MTR test rpl.rpl_gtid_ignored_domain_ids_validation covering both
IGNORE_DOMAIN_IDS and DO_DOMAIN_IDS scenarios with purged binlogs.

CLAassistant commented Feb 21, 2026

All committers have signed the CLA.

@bodyhedia44 force-pushed the MDEV-28213-fix-ignored-domain-validation branch from 988af46 to 155d0fe on February 21, 2026 14:28
@gkodinov added the External Contribution label (All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.) on Feb 23, 2026
@gkodinov (Member) left a comment

Thank you for your contribution. This is a preliminary review.

@bodyhedia44 requested a review from gkodinov on February 23, 2026 17:10
@gkodinov (Member) left a comment

Please do not do multiple commits. Please stick to a single commit and amend it.

sql/slave.cc Outdated
sprintf(err_buff, "%s Error: Out of memory", errmsg);
goto err;
}
for (uint i= 0; i < do_ids->elements; i++)

A good quantity of repeating code here. I'd consider making a helper function and passing down the list and the name to print as arguments.
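
One possible shape for such a helper, as a sketch only: the function name, its signature, and the comma-separated value format are assumptions made for illustration, and `std::string` stands in for whatever string type the server code actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: builds "SET @<var_name>='1,2,3'" from a list of
// domain IDs, so the IGNORE and DO lists can share one code path.
static std::string build_domain_ids_query(const std::string &var_name,
                                          const std::vector<uint32_t> &ids)
{
  std::string query= "SET @" + var_name + "='";
  for (std::size_t i= 0; i < ids.size(); i++)
  {
    if (i > 0)
      query+= ',';  // separator between IDs, none before the first
    query+= std::to_string(ids[i]);
  }
  query+= "'";
  return query;
}
```

The caller would invoke it once per list, passing `slave_connect_state_domain_ids_ignore` or `slave_connect_state_domain_ids_do` as the variable name.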

sql/sql_repl.cc Outdated
bool expect_number= true;

/* Skip leading whitespace */
while (p < end && *p == ' ')

do you really need to skip leading space twice?

sql/sql_repl.cc Outdated

while (p < end)
{
char *endptr;

I would move this inside if(expect_number).

sql/sql_repl.cc Outdated
while (p < end)
{
char *endptr;
ulong domain_id;

ditto this one
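
To illustrate the three parser comments above (a single whitespace skip, and `endptr` and `domain_id` scoped inside the number branch), here is a hedged sketch. `parse_domain_ids()` is a hypothetical stand-in, not the function under review, and it assumes the list is comma-separated with optional spaces. It returns `true` on error, mirroring the 0 = success, 1 = error convention in the reviewed code.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical parser for a list such as "1, 2,3".
// Returns false on success and true on error.
static bool parse_domain_ids(const std::string &list, std::vector<uint32_t> *out)
{
  const char *p= list.c_str();
  const char *end= p + list.size();
  bool expect_number= true;

  out->clear();
  while (p < end)
  {
    if (*p == ' ')
    {
      p++;  // whitespace is skipped in exactly one place
      continue;
    }
    if (expect_number)
    {
      // Reject signs and other characters that strtoul() would accept.
      if (!std::isdigit(static_cast<unsigned char>(*p)))
        return true;
      // Declared here: only meaningful while a number is being parsed.
      char *endptr;
      errno= 0;
      unsigned long domain_id= std::strtoul(p, &endptr, 10);
      if (errno == ERANGE || domain_id > UINT32_MAX)
        return true;
      out->push_back(static_cast<uint32_t>(domain_id));
      p= endptr;
      expect_number= false;
    }
    else
    {
      if (*p != ',')
        return true;
      p++;
      expect_number= true;
    }
  }
  // An empty list is accepted; a trailing comma is not.
  return expect_number && !out->empty();
}
```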

sql/sql_repl.cc Outdated
@retval 0 success
@retval 1 error
*/
static int

any specific reason why you're not returning a bool?

sql/sql_repl.cc Outdated
const DYNAMIC_ARRAY *do_ids, ulong domain_id)
{
/* If IGNORE_DOMAIN_IDS is set, check if this domain is in it */
for (uint32 i= 0; i < ignore_ids->elements; i++)

any specific reason why you're not sorting this array and then using bsearch?

sql/sql_repl.cc Outdated
*/
if (do_ids->elements > 0)
{
for (uint32 i= 0; i < do_ids->elements; i++)

ditto for this one: please sort and then use bsearch.
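
The sort-then-bsearch approach requested in the two comments above could look roughly like this. It is a sketch: the function names are illustrative, and `std::vector<unsigned long>` stands in for the `DYNAMIC_ARRAY` of `ulong` values used in the patch.

```cpp
#include <cstdlib>
#include <vector>

// Three-way comparator usable with both qsort() and bsearch().
static int ulong_cmp(const void *a, const void *b)
{
  unsigned long x= *static_cast<const unsigned long *>(a);
  unsigned long y= *static_cast<const unsigned long *>(b);
  return (x > y) - (x < y);  // avoids the wraparound of plain subtraction
}

// Sort once, right after the list has been loaded.
static void sort_domain_ids(std::vector<unsigned long> *ids)
{
  if (ids->empty())
    return;  // never hand qsort() a null base pointer
  std::qsort(ids->data(), ids->size(), sizeof(unsigned long), ulong_cmp);
}

// O(log n) membership test; the list must already be sorted.
static bool contains_sorted(const std::vector<unsigned long> &ids,
                            unsigned long domain_id)
{
  if (ids.empty())
    return false;
  return std::bsearch(&domain_id, ids.data(), ids.size(),
                      sizeof(unsigned long), ulong_cmp) != nullptr;
}
```

Sorting costs O(n log n) once per connection, after which each lookup is O(log n) instead of a linear scan. For the short lists typical of domain filters the difference is small, so this is mainly a readability and consistency choice.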

When a slave is configured with IGNORE_DOMAIN_IDS or DO_DOMAIN_IDS,
the master's binlog dump thread should skip GTID state validation for
those filtered domains. This avoids false ER_GTID_POSITION_NOT_FOUND
errors when the slave does not have (or need) the current GTID state
for domains it is filtering.

The slave now sends its IGNORE/DO domain ID lists to the master via
user variables @slave_connect_state_domain_ids_ignore and
@slave_connect_state_domain_ids_do, which the master reads in
mysql_binlog_send() and passes to check_slave_start_position().

Changes:
- sql/sql_repl.cc: load_ignore_domain_ids() returns bool, fix parser
  to avoid redundant whitespace skip and scope local variables tightly.
  Add ulong_cmp() comparator. Replace O(n) linear scans in
  is_domain_id_ignored() with bsearch() after sorting the arrays.
- sql/slave.cc: Add build_domain_ids_query() helper to construct SET
  queries for domain ID user variables. Refactor duplicate code into a
  loop using a struct array.
- mysql-test/suite/rpl/t/rpl_gtid_ignored_domain_ids_validation.test:
  New test validating end-to-end GTID replication with domain filtering.
@bodyhedia44 force-pushed the MDEV-28213-fix-ignored-domain-validation branch from 8f36cde to 2f4d520 on February 24, 2026 21:41
@bodyhedia44 requested a review from gkodinov on February 24, 2026 21:42
@bodyhedia44
Author

done

Comment on lines +2597 to +2599
Send the slave's IGNORE_DOMAIN_IDS and DO_DOMAIN_IDS to the master,
so it can skip GTID state validation for domains the slave doesn't
care about. See MDEV-28213.
Contributor

Why on the master side?

Let’s see – if the ignored domains are not provided in the @@gtid_slave_pos, then the master will think the slave wants to replicate those domains from the beginning, even though the domain will end up ignored, regardless of where the master starts.

So this problem is really overlapping with (but not necessarily entirely part of) MDEV-9345 filtering on master, #4086.

4 participants