@jeremyestein jeremyestein commented Jan 8, 2026

Implements #29 and fixes #26.

Snakemake is run according to a cron specification (default is once a day in the early hours of the morning).

Each run converts the CSVs output by the waveform-controller to both kinds of parquet, then uploads the de-id parquet to the DSH. It leaves behind marker files with upload stats, so that snakemake (and the humans!) know which uploads are done.
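
To illustrate the marker-file idea, here is a minimal Python sketch. It is not the code from this PR: the function name `upload_once`, the `upload` callable, and the stats fields are all hypothetical, and the real pipeline expresses this as snakemake rules rather than a plain function.

```python
import json
from pathlib import Path
from typing import Callable


def upload_once(parquet_path: Path, marker_path: Path,
                upload: Callable[[Path], None]) -> bool:
    """Upload parquet_path unless a marker file shows it was already done.

    Returns True if an upload happened, False if it was skipped.
    All names here are illustrative assumptions, not the PR's API.
    """
    if marker_path.exists():
        # A previous run already uploaded this file; nothing to do.
        return False

    upload(parquet_path)

    # Hypothetical stats; the real marker files may record different fields.
    stats = {
        "file": parquet_path.name,
        "size_bytes": parquet_path.stat().st_size,
    }
    # Write the marker only after the upload returned without raising,
    # so a failed upload is retried on the next scheduled run.
    marker_path.write_text(json.dumps(stats, indent=2))
    return True
```

Because the marker is written last, its presence means the upload completed, and its contents give a human something to inspect when debugging.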

The toy hasher needed a fix because Python's built-in hash() function is not stable from run to run for strings. (The switch to the real hasher is in #35.)
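
For context, a minimal sketch of one way to get a run-to-run stable toy hash, assuming a digest from hashlib is acceptable. The name `stable_toy_hash` and the truncation length are hypothetical, and this is not necessarily how the PR fixed it; it is also not suitable for real de-identification.

```python
import hashlib


def stable_toy_hash(value: str, length: int = 8) -> str:
    """Return a short hex digest that is identical across interpreter runs.

    The built-in hash() salts str hashes per process unless PYTHONHASHSEED
    is fixed, so its output cannot be used in file names that must match
    between runs. A hashlib digest depends only on the input bytes.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    # Truncation keeps names short; this is a toy, not a secure pseudonym.
    return digest[:length]
```

The same input now always maps to the same output, so files produced on different days can be matched up.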

Also add a pipeline debugging guide.

@jeremyestein jeremyestein linked an issue Jan 8, 2026 that may be closed by this pull request
1 task
@jeremyestein jeremyestein mentioned this pull request Jan 8, 2026
6 tasks
@jeremyestein jeremyestein linked an issue Jan 8, 2026 that may be closed by this pull request
6 tasks
@jeremyestein jeremyestein marked this pull request as ready for review January 23, 2026 12:49
Fix docformatter errors in place, and stop ruff and docformatter from fighting with each other.
@skeating skeating left a comment

I couldn't get it running locally and I certainly couldn't find a 'tools' directory as given in the develop.md file

I think this could be filed as an issue to revisit if we have time; so not a blocker

@jeremyestein
Collaborator Author

> I couldn't get it running locally and I certainly couldn't find a 'tools' directory as given in the develop.md file
>
> I think this could be filed as an issue to revisit if we have time; so not a blocker

Interesting. @thompson318, is that your script, and did you intend to commit it to this repo?

@jeremyestein jeremyestein merged commit d07b7e4 into dev Jan 27, 2026
2 checks passed
@jeremyestein jeremyestein deleted the jeremy/pipeline branch January 27, 2026 17:45

Development

Successfully merging this pull request may close these issues.

Pipeline logic
CSN is still present in de-id file names

3 participants