fix(#15): ERROR log when crawl upserts fewer than 50 campaigns #22
Merged
Jing-yilin merged 2 commits into develop on Feb 27, 2026
Jing-yilin (Contributor, Author) commented on Feb 27, 2026
Jing-yilin left a comment
Blocking gap in the sanity check: upserted is not a count of unique campaigns anymore. After the earlier stacked changes, the same PID can show up under a root category, a subcategory, and multiple sort passes, and upserted += len(campaigns) increments on every occurrence. A partial parser break that only extracts a handful of repeated projects can still clear the >= 50 threshold, so this log won't reliably detect the failure mode described here. We should base the check on distinct PIDs or successful non-empty pages instead.
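The gap described in this comment can be illustrated with a minimal sketch. It assumes each parsed campaign is a dict with a `pid` key and that the crawl yields one list of campaigns per page; `count_campaigns` and the sample data are hypothetical and not taken from the crawler itself.

```python
from __future__ import annotations

from typing import Iterable


def count_campaigns(pages: Iterable[list[dict]]) -> tuple[int, int]:
    """Return (raw_occurrences, distinct_pids) across all crawled pages."""
    raw = 0
    seen: set[str] = set()
    for campaigns in pages:
        # This is what `upserted += len(campaigns)` measures: every occurrence.
        raw += len(campaigns)
        # This is what the review asks for: each PID counted once.
        seen.update(c["pid"] for c in campaigns)
    return raw, len(seen)


# Hypothetical partial parser break: only 5 projects are extracted, but each
# reappears across 12 category/subcategory/sort passes.
pages = [[{"pid": f"p{i}"} for i in range(5)] for _ in range(12)]
raw, distinct = count_campaigns(pages)
```

In this made-up scenario the raw count clears the 50 threshold while the distinct count does not, which is the case the comment says the current check would miss.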
Add post-crawl sanity check: if total upserted < 50, emit an ERROR-level log line naming the likely cause (HTML selector change or ScrapingBee issue). Distinguishes silent parse failure from a legitimately quiet day.
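As a rough sketch of what such a post-crawl check could look like, assuming the crawler uses the standard `logging` module: the logger name `crawler`, the function `check_crawl_volume`, and the exact message wording are illustrative assumptions, not the code merged in this PR.

```python
import logging

logger = logging.getLogger("crawler")  # hypothetical logger name

MIN_EXPECTED_UPSERTS = 50  # threshold stated in the PR description


def check_crawl_volume(upserted: int, threshold: int = MIN_EXPECTED_UPSERTS) -> bool:
    """Log at ERROR level and return False when the crawl looks too small."""
    if upserted < threshold:
        logger.error(
            "Crawl upserted only %d campaigns (expected >= %d); "
            "likely an HTML selector change or a ScrapingBee issue",
            upserted,
            threshold,
        )
        return False
    return True
```

Returning a boolean as well as logging lets a caller decide whether to fail the run or only alert; that choice is a design assumption here, not something the PR specifies.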
Closes #15
Changes
If upserted < 50, emit an ERROR-level log naming the likely cause (HTML selector change or ScrapingBee degradation)
Stack
PR 4/6. Base: #21 (page depth). Next: #16 (backers_count)