DBD::Pg Async Query Ownership and Data Preservation Fixes#150
DBD::Pg Async Query Ownership and Data Preservation Fixes#150oetiker wants to merge 4 commits intobucardo:masterfrom
Conversation
|
ping ? |
|
Please pull the recent async-related changes and retest. |
|
Also, this PR has merge conflicts. Please rebase/resolve, @oetiker. |
I will have to re-evaluate after the changes ... |
e40cc86 to
fe5c547
Compare
|
The PR is rebased and the entire testsuite is passing including the new testfile. |
|
@turnstep : Could you approve the CI workflow to run for this PR? |
Thank you, @oetiker ! Looks promising. |
|
@esabol well, let me know when you are of a mind to merge this, or find issues. I do have other fish to fry in the meantime. Once you are ready I will rebase. If this were my code base it would bother me extremely to know that these bugs persist. |
To be clear, I can't merge anything, unfortunately. I'm not a maintainer. I'm just trying to help things along where I can. |
turnstep
left a comment
There was a problem hiding this comment.
Looks good! Will review the other sections shortly
dbdimp.c
Outdated
| STH_ASYNC, | ||
| STH_ASYNC_PREPARE | ||
| STH_ASYNC_PREPARE, | ||
| STH_ASYNC_AUTORETRIEVED = 100 /* PG_OLDQUERY_WAIT auto-retrieved results */ |
There was a problem hiding this comment.
Not sure why this is being set to 100 rather than leaving it to autonumber?
dbdimp.c
Outdated
There was a problem hiding this comment.
Need the new check here too?
dbdimp.c
Outdated
| statement. This prevents finishing an unrelated statement from invalidating a pending async query. */ | ||
| if (imp_dbh->async_sth == imp_sth) { | ||
| imp_dbh->async_sth = NULL; | ||
| imp_dbh->async_status = 0; |
There was a problem hiding this comment.
async_status already set above - and it should use the named constant, not 0
dbdimp.c
Outdated
| /* Ownership will be checked below for statement-based calls when needed */ | ||
|
|
||
| /* Check if this statement had auto-retrieved results from PG_OLDQUERY_WAIT */ | ||
| if (imp_sth && imp_sth->async_status == STH_ASYNC_AUTORETRIEVED) { |
There was a problem hiding this comment.
Put constants on the left for comparisons.
dbdimp.c
Outdated
| TRC(DBILOGFP, "%sProcessing auto-retrieved results (rows: %ld)\n", | ||
| THEADER_slow, imp_sth->rows); | ||
| } | ||
| /* Don't return early - let the full processing happen below */ |
dbdimp.c
Outdated
| TRACE_PQERRORMESSAGE; | ||
| pg_error(aTHX_ h, status, PQerrorMessage(imp_dbh->conn)); | ||
| rows = -2; | ||
| imp_sth->async_status = STH_NO_ASYNC; /* Mark as consumed */ |
There was a problem hiding this comment.
All three branches do this, so move to a single call.
dbdimp.c
Outdated
|
|
||
| if (NULL != imp_dbh->async_sth) { | ||
| /* Store metadata in the appropriate statement handle */ | ||
| /* CRITICAL: Only store results if the calling statement owns the async query */ |
There was a problem hiding this comment.
We don't need things like upper case "critical" :)
There was a problem hiding this comment.
It's all critical, ha ha ha!
dbdimp.c
Outdated
| /* Free the last_result as needed */ | ||
| if (imp_dbh->last_result && imp_dbh->result_clearable) { | ||
| TRACE_PQCLEAR; | ||
| #if DEBUG_LAST_RESULT |
dbdimp.c
Outdated
| } | ||
|
|
||
| /* Mark as auto-retrieved, not discarded */ | ||
| orig_sth->async_status = STH_ASYNC_AUTORETRIEVED; /* Special value: auto-retrieved */ |
There was a problem hiding this comment.
Do not need comments like this.
dbdimp.c
Outdated
|
|
||
| imp_sth_t *orig_sth = imp_dbh->async_sth; | ||
|
|
||
| /* Store the error result for the original statement to discover */ |
There was a problem hiding this comment.
This comment doesn't really match the code
turnstep
left a comment
There was a problem hiding this comment.
Wordy in the wrong places, but very thorough coverage.
t/08async_regression.t
Outdated
| use strict; | ||
| use warnings; | ||
| use Test::More; | ||
| use DBI; |
There was a problem hiding this comment.
I know we are inconsistent, but should leave this out as we want to make sure use DBD::Pg does it for us always.
t/08async_regression.t
Outdated
| use DBI; | ||
| use DBD::Pg qw(:async); | ||
|
|
||
| my $dsn = $ENV{DBI_DSN} || "dbi:Pg:dbname=postgres;host=localhost"; |
There was a problem hiding this comment.
Not sure why all of this instead of connect_database() like the other tests have?
t/08async_regression.t
Outdated
| plan skip_all => "Cannot connect to PostgreSQL: $DBI::errstr"; | ||
| } | ||
|
|
||
| # Test 1: Basic ownership - wrong statement cannot steal results |
There was a problem hiding this comment.
Wayyy too wordy. The test names should be most of the documentation. See the other tests.
t/08async_regression.t
Outdated
| # BEHAVIOR: | ||
| # - BEFORE: sth2->pg_result() would successfully steal sth1's pending results | ||
| # - AFTER: sth2->pg_result() fails with "wrong statement handle" error | ||
| subtest 'Basic ownership - wrong statement' => sub { |
There was a problem hiding this comment.
Do not need subtests - just make the test names verbose
t/08async_regression.t
Outdated
|
|
||
| $sth1->finish; | ||
| $sth2->finish; | ||
| }; |
t/08async_regression.t
Outdated
|
|
||
| # Also start a good query | ||
| my $good = $dbh->prepare(q{SELECT 42}, { pg_async => PG_ASYNC + PG_OLDQUERY_WAIT }); | ||
| ok($good->execute, 'good executes'); |
There was a problem hiding this comment.
Really poor test name. Compare to a random one from 04misc.t:
Method 'private_attribute_info' is available without a database handle and returns an empty hashref
t/08async_regression.t
Outdated
| ok($sth2->execute, 'sth2 executes with WAIT'); | ||
|
|
||
| # sth1's results should have been auto-retrieved and preserved | ||
| # This is the critical test - data must not be lost |
There was a problem hiding this comment.
Claude sure loves the word "critical", doesn't it? 😆
82292d2 to
5c9c691
Compare
|
Thanks for the thorough review @turnstep! I've rebased onto current master and addressed all your feedback in a single squashed commit: C code (
Test file (
|
898f315 to
28413ed
Compare
|
Thanks for the changes, will look those over soon. Debating releasing a 3.19.0 then making this part of 3.20.0 with a short turnaround. |
|
@turnstep : Please scroll down to the bottom and approve running the CI workflow for this PR. Click on "Approve workflows to run." |
- Fix double-free when imp_sth->result and imp_dbh->last_result point to the same PGresult by adding pointer equality checks - Remove redundant NULL assignments before immediate reassignment - Remove unreachable wrong-statement-handle branch in PGRES_TUPLES_OK (already caught by ownership check that returns -2) - Use PQcmdTuples for PGRES_COMMAND_OK row count in auto-retrieve path instead of hardcoding 0 - Pass imp_sth->result directly to _sqlstate instead of via local var - Add TRACE_PQERRORMESSAGE before pg_error in AUTOERROR null-check - Use -2 (error convention) for auto-error rows placeholder - Rewrite OLDQUERY_WAIT test with named variables instead of confusing array index arithmetic - Add skip_all for failed second DB connection in regression tests - Remove redundant isnt connection tests (covered by skip_all) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
I've been tinkering with dbdimp.c a lot - can you rebase? |
- Fix double-free when imp_sth->result and imp_dbh->last_result point to the same PGresult by adding pointer equality checks - Remove redundant NULL assignments before immediate reassignment - Remove unreachable wrong-statement-handle branch in PGRES_TUPLES_OK (already caught by ownership check that returns -2) - Use PQcmdTuples for PGRES_COMMAND_OK row count in auto-retrieve path instead of hardcoding 0 - Pass imp_sth->result directly to _sqlstate instead of via local var - Add TRACE_PQERRORMESSAGE before pg_error in AUTOERROR null-check - Use -2 (error convention) for auto-error rows placeholder - Rewrite OLDQUERY_WAIT test with named variables instead of confusing array index arithmetic - Add skip_all for failed second DB connection in regression tests - Remove redundant isnt connection tests (covered by skip_all) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8f1e267 to
a86053a
Compare
|
CI tests all failed with 3 failures in t/99_lint.t:
|
|
Looking into this more closely, I think 99_lint.t line 153 should be changed to this: in order to eliminate the |
|
hmmm it looks as if pg has died during the tests ? |
It's an intentional test. The problem comes from changes @turnstep made this morning in 74d0612, and it's unrelated to the work in this PR. The check @turnstep added doesn't work on Perl 5.36+. Once that's fixed, you'll need to rebase again, unfortunately, @oetiker. |
|
No worries, my goal this year is to do more PRs, less random pushes.
Starting soon :)
|
|
@oetiker , the CI has been fixed. Please rebase one more time. |
…#105) - Only clear async_sth if the statement being finished/destroyed owns it - Add ownership verification in pg_db_result() to prevent result theft - Auto-retrieve results in handle_old_async() when PG_OLDQUERY_WAIT is used, instead of discarding them - Add STH_ASYNC_AUTORETRIEVED and STH_ASYNC_AUTOERROR status values - Support both $sth->pg_result and $dbh->pg_result patterns - Add regression tests in t/08async_regression.t
- Fix double-free when imp_sth->result and imp_dbh->last_result point to the same PGresult by adding pointer equality checks - Remove redundant NULL assignments before immediate reassignment - Remove unreachable wrong-statement-handle branch in PGRES_TUPLES_OK (already caught by ownership check that returns -2) - Use PQcmdTuples for PGRES_COMMAND_OK row count in auto-retrieve path instead of hardcoding 0 - Pass imp_sth->result directly to _sqlstate instead of via local var - Add TRACE_PQERRORMESSAGE before pg_error in AUTOERROR null-check - Use -2 (error convention) for auto-error rows placeholder - Rewrite OLDQUERY_WAIT test with named variables instead of confusing array index arithmetic - Add skip_all for failed second DB connection in regression tests - Remove redundant isnt connection tests (covered by skip_all) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c74c7f1 to
5f6f84e
Compare
|
claude fortunately is unperturbed ... |
|
All the CI tests pass! There's just one thing that seems a bit odd, and it's the message
in the test output. It's completely harmless and expected whenever you do a DROP TABLE IF EXISTS and the table doesn't exist, of course, but it seems less than ideal. @oetiker and @turnstep, what do you think about adding the following right before the DROP TABLE IF EXISTS line?
|
|
Few early-morning thoughts on that table:
|
|
All of those things make sense to me. @oetiker , can you get Claude to either rename the |
The table was created and dropped but never referenced by any query, producing a spurious NOTICE and using a non-standard name (missing the dbd_pg_ prefix). Addresses PR bucardo#150 review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
done |
DBD::Pg Async Query Ownership and Data Preservation Fixes
Caveat! I developed this code and the tests tests using Claude code in order to fix to #105 and mojolicious/mojo#2276
Overview
This document describes the comprehensive fixes for DBD::Pg's asynchronous query handling, addressing GitHub issue #105 and related problems with
PG_OLDQUERY_WAITdata loss.Problems Addressed
1. GitHub Issue #105: Statement Destruction Cancels Unrelated Async Queries
Problem: Destroying any statement handle would cancel the active async query of another statement, even if the destroyed statement never executed an async query.
Root Cause: The
dbd_st_destroy()anddbd_st_finish()functions unconditionally cleared the globalimp_dbh->async_sthpointer and calledhandle_old_async()without checking if the statement being destroyed actually owned the async query.Impact: Applications using multiple statement handles with async queries would experience random query cancellations when unrelated statements were destroyed or went out of scope.
2. PG_OLDQUERY_WAIT Data Loss
Problem: Using
PG_OLDQUERY_WAITto wait for a previous async query would successfully wait but would discard the query results instead of preserving them.Root Cause: The
handle_old_async()function would consume and discard results when waiting, with no mechanism to preserve them for the owning statement.Impact: Data loss - results from async queries would be irretrievably lost when another query used
PG_OLDQUERY_WAIT.3. Missing Ownership Verification
Problem: Any statement handle could call
pg_result()and retrieve another statement's async results, violating ownership semantics.Root Cause: The
pg_db_result()function did not verify that the calling statement was the one that initiated the async query.Impact: Race conditions, incorrect data retrieval, and unpredictable behavior in multi-statement applications.
Technical Background
PostgreSQL's async query limitations:
imp_dbh->async_sthpointing to the statement that owns the current async queryThe Fixes
Fix 1: Ownership Checking in Statement Lifecycle Functions
Location:
dbdimp.c-dbd_st_finish()(lines 3977-3980)Location:
dbdimp.c-dbd_st_destroy()(lines 4122-4127, 4194-4197)Fix 2: Auto-Retrieve Mechanism for PG_OLDQUERY_WAIT
Location:
dbdimp.c-handle_old_async()(line ~5715)Instead of discarding results, the function now:
async_status = 100to indicate auto-retrieved resultsFix 3: Ownership Verification in pg_db_result()
Location:
dbdimp.c-pg_db_result()(lines 5292-5397)Added comprehensive ownership checking:
Fix 4: Auto-Retrieved Result Handling
Location: dbdimp.c - pg_db_result()
Special handling for auto-retrieved results:
New Async Status Values
The implementation uses named constants for async_status values on statement handles:
Test Coverage
The fix includes comprehensive test coverage in
t/08async_regression.twith 7 test scenarios:Behavior Changes
Before the Fix
PG_OLDQUERY_WAITwould discard previous query resultsAfter the Fix
PG_OLDQUERY_WAITauto-retrieves and preserves resultsMigration Notes
The fixes are backward compatible with one exception: