Audio statistics #3833
base: maeb
When I see length, I think in seconds. I like the frames approach too, and I'd like it spelled out explicitly (num_frames or whatever). I'd like to see:
Would love to hear other feedback as well while I read into it a bit more.

Added seconds and sampling rates
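
For reference, the frames-versus-seconds distinction discussed above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper name `audio_length` is hypothetical, and it assumes a mono clip stored as a flat sequence of samples with a known integer sampling rate.

```python
def audio_length(array, sampling_rate):
    """Return (num_frames, duration_seconds) for one mono clip.

    Hypothetical helper: assumes `array` is a flat sequence of samples
    and `sampling_rate` is a positive number of samples per second.
    """
    num_frames = len(array)
    duration_seconds = num_frames / sampling_rate
    return num_frames, duration_seconds


# A 2-second clip of silence at 16 kHz, in the same dict layout the diff uses.
clip = {"array": [0.0] * 32000, "sampling_rate": 16000}
num_frames, seconds = audio_length(clip["array"], clip["sampling_rate"])
```

Reporting both values keeps the unit explicit: `num_frames` depends on the sampling rate, while seconds are comparable across datasets.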
isaac-chung
left a comment
Sorry for adding more. Revisited some papers and maybe we should use the standard measure of audio dataset size.
isaac-chung
left a comment
for audio in audios:
    array = audio["array"]
    sampling_rate = audio["sampling_rate"]
This line assumes there is the sampling_rate key. Based on what you mentioned, this will fail for some datasets then?
Yes, but it's better to fix those datasets to improve benchmark quality overall.
Could you please open an issue to track this?
I'm not sure what to open. Possible missing sampling rate?
Yeah, all audio should have sampling_rate. So if you say that's not true, then it's an issue.
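
One way to find the datasets affected by the issue discussed in this thread is a small check that lists which items lack the key instead of failing on the first one. This is only a sketch under assumptions: `find_missing_sampling_rate` is a hypothetical name, not part of this PR, and it assumes each item is a dict like the ones in the loop above.

```python
def find_missing_sampling_rate(audios):
    """Return the indices of items that have no usable 'sampling_rate'.

    Hypothetical helper: treats a missing key and an explicit None
    the same way, since neither can be used to compute a duration.
    """
    return [
        index
        for index, audio in enumerate(audios)
        if audio.get("sampling_rate") is None
    ]


audios = [
    {"array": [0.0, 0.1], "sampling_rate": 16000},
    {"array": [0.0, 0.1]},  # key missing entirely
    {"array": [0.0, 0.1], "sampling_rate": None},
]
missing = find_missing_sampling_rate(audios)
```

A non-empty result could be attached to the tracking issue so the dataset is fixed at the source rather than worked around in the statistics code.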
unique_audios: int

average_sampling_rate: float
sampling_rates: dict[int, int]
Could this just be a unique set of sampling rates? OK either way.
- sampling_rates: dict[int, int]
+ sampling_rates: list[int]
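
To make the two options concrete, here is a rough sketch of how both shapes could be derived from the same pass over the data. It is an illustration only: the function name `sampling_rate_summary` and the `unique_sampling_rates` key are hypothetical, and it assumes `audios` is non-empty and every item carries an integer `sampling_rate`.

```python
from collections import Counter


def sampling_rate_summary(audios):
    """Summarise sampling rates for a non-empty list of audio dicts.

    Hypothetical helper: the key names mirror the fields in the diff,
    plus an extra 'unique_sampling_rates' list for comparison.
    """
    counts = Counter(audio["sampling_rate"] for audio in audios)
    total = sum(counts.values())
    return {
        # Mean over clips, so frequent rates weigh more.
        "average_sampling_rate": sum(r * c for r, c in counts.items()) / total,
        # dict[int, int]: sampling rate -> number of clips at that rate.
        "sampling_rates": dict(counts),
        # list[int]: the same information without the counts.
        "unique_sampling_rates": sorted(counts),
    }


summary = sampling_rate_summary(
    [
        {"array": [0.0], "sampling_rate": 16000},
        {"array": [0.0], "sampling_rate": 16000},
        {"array": [0.0], "sampling_rate": 44100},
    ]
)
```

The dict form is a superset of the list form, since the unique rates are its keys; the list is simply more compact when the counts are not needed.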
Co-authored-by: Isaac Chung <[email protected]>
Minor things - generally think this looks good (of course Isaac's comments still apply, but nothing more to add)
Co-authored-by: Kenneth Enevoldsen <[email protected]>

Ref #3498
I’ve started integrating audio statistics. For now, I’ve come up with this format. Do you have any suggestions?