fix(#15): ERROR log when crawl upserts fewer than 50 campaigns#22

Merged
Jing-yilin merged 2 commits into develop from feature/15-parse-alerting
Feb 27, 2026

Conversation

@Jing-yilin
Contributor

Closes #15

Changes

  • Post-crawl sanity check: if upserted < 50, emit ERROR-level log naming the likely cause (HTML selector change or ScrapingBee degradation)
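
A minimal sketch of what such a post-crawl check could look like, not the PR's actual code. The logger name `crawler`, the constant `MIN_EXPECTED_UPSERTS`, and the function `check_upsert_count` are hypothetical names chosen for illustration; only the threshold of 50 and the two likely causes come from the description above.

```python
import logging

# Hypothetical logger name; the real project may configure logging differently.
logger = logging.getLogger("crawler")

# Threshold taken from the PR description.
MIN_EXPECTED_UPSERTS = 50


def check_upsert_count(upserted: int, threshold: int = MIN_EXPECTED_UPSERTS) -> bool:
    """Return True if the crawl looks healthy; otherwise log at ERROR level."""
    if upserted < threshold:
        logger.error(
            "Crawl upserted only %d campaigns (expected at least %d); "
            "likely cause: HTML selector change or ScrapingBee degradation",
            upserted,
            threshold,
        )
        return False
    return True
```

Returning a boolean as well as logging is a design choice in this sketch: it lets a caller decide whether to fail the job or only alert.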

Stack

PR 4/6. Base: #21 (page depth). Next: #16 (backers_count)

Contributor Author

@Jing-yilin Jing-yilin left a comment

Blocking gap in the sanity check: upserted is not a count of unique campaigns anymore. After the earlier stacked changes, the same PID can show up under a root category, a subcategory, and multiple sort passes, and upserted += len(campaigns) increments on every occurrence. A partial parser break that only extracts a handful of repeated projects can still clear the >= 50 threshold, so this log won't reliably detect the failure mode described here. We should base the check on distinct PIDs or successful non-empty pages instead.
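
To illustrate the concern, here is a rough sketch that compares the raw running total with a distinct-PID count. It assumes each parsed campaign is a dict carrying a `pid` key and that pages arrive as lists of such dicts; both the data shape and the helper names `raw_upsert_total` and `count_distinct_pids` are assumptions for illustration, not the crawler's real structures.

```python
from typing import Dict, Iterable, List, Set


def raw_upsert_total(pages: Iterable[List[Dict]]) -> int:
    """Mimics `upserted += len(campaigns)`: counts every occurrence."""
    total = 0
    for campaigns in pages:
        total += len(campaigns)
    return total


def count_distinct_pids(pages: Iterable[List[Dict]]) -> int:
    """Counts each campaign once, however many pages it appears on."""
    seen: Set = set()
    for campaigns in pages:
        for campaign in campaigns:
            seen.add(campaign["pid"])  # "pid" key is an assumed field name
    return len(seen)


# Two repeated projects seen across 30 category/sort passes.
pages = [[{"pid": 1}, {"pid": 2}] for _ in range(30)]
raw = raw_upsert_total(pages)
distinct = count_distinct_pids(pages)
```

In this example the raw total is 60, which clears the threshold, while only 2 distinct campaigns were actually parsed. Checking the distinct count against the threshold would flag the partial parser break.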

@Jing-yilin Jing-yilin force-pushed the feature/14-page-depth branch from 87d5b1e to 487c6e3 Compare February 27, 2026 10:26
@Jing-yilin Jing-yilin force-pushed the feature/15-parse-alerting branch from 7c6a763 to 2922605 Compare February 27, 2026 10:26
Add post-crawl sanity check: if total upserted < 50, emit an ERROR-level
log line naming the likely cause (HTML selector change or ScrapingBee issue).
Distinguishes silent parse failure from a legitimately quiet day.
@Jing-yilin Jing-yilin force-pushed the feature/15-parse-alerting branch from 2922605 to 7721938 Compare February 27, 2026 10:28
@Jing-yilin Jing-yilin force-pushed the feature/14-page-depth branch from 487c6e3 to 151e123 Compare February 27, 2026 10:29
@Jing-yilin Jing-yilin changed the base branch from feature/14-page-depth to develop February 27, 2026 10:29
@Jing-yilin Jing-yilin merged commit 9795c7d into develop Feb 27, 2026
3 checks passed
