Utf8 regression #1000
Merged
wasade merged 6 commits into biocore:master from Dec 23, 2025
ebolyen
approved these changes
Dec 23, 2025
def test_parse_biom_table_hdf5_accent_utf8_regression(self):
    tab = example_table.copy()
    tab.update_ids({i: f'{i}fóó' for i in tab.ids()}, inplace=True)
    with NamedTemporaryFile(delete=False) as tmpfile:
Member
Should delete be False? I think this would leave behind a file on the system.
Member
Author
Potentially, but I'm erring on the side of following the style of the rest of the script. And a leftover file in TMPDIR isn't the best, but it isn't the worst thing either.
Member
Oh I see. It looks like those add this to the end of the test:
self.to_remove.append(tmpfile.name)
It's mostly GitHub's problem these days I suppose. So your call, but wanted to mention it.
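For readers unfamiliar with the pattern discussed above, here is a minimal sketch of how a `delete=False` temporary file can be registered for later cleanup. The class name, the `setUp`/`tearDown` bodies, and the test name are hypothetical; only `self.to_remove.append(tmpfile.name)` comes from the thread, and the real test suite may clean up differently.

```python
import os
import unittest
from tempfile import NamedTemporaryFile


class TempFileCleanupExample(unittest.TestCase):
    """Hypothetical test case; not the project's actual test class."""

    def setUp(self):
        # Paths queued for deletion once each test finishes.
        self.to_remove = []

    def tearDown(self):
        # Assumed cleanup step: delete whatever the test registered.
        for path in self.to_remove:
            if os.path.exists(path):
                os.remove(path)

    def test_write_then_reopen(self):
        with NamedTemporaryFile(delete=False) as tmpfile:
            tmpfile.write('f\u00f3\u00f3'.encode('utf-8'))
        # delete=False keeps the file after the `with` block closes it,
        # so it can be reopened by name.
        with open(tmpfile.name, 'rb') as fh:
            self.assertEqual(fh.read().decode('utf-8'), 'f\u00f3\u00f3')
        # Register the path so tearDown removes it.
        self.to_remove.append(tmpfile.name)
```

Registering the path at the end of the test mirrors the comment above; a failure earlier in the test would skip the registration and leave the file behind, so appending right after the file is created is the more defensive variant.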
    ids_dtype = 'U%d' % max([len(v) for v in ids])
    ids = np.asarray(ids, dtype=ids_dtype)
else:
    # .asstr does not handle an empty dataset
Member
Author
Might be HDF5, as the C API is touchy. Or we may not be correctly enforcing types on empty tables.
A regression in parsing non-trivial UTF-8 characters was introduced. Reported here.
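As general background on why accented IDs such as `fóó` are a useful regression case, the sketch below shows that a string's character count and its UTF-8 byte length differ once non-ASCII characters appear. It is illustrative only and is not claimed to be the cause of this particular regression; the sample IDs are made up.

```python
# Hypothetical sample IDs; the second carries two accented characters.
ids = ['O1', 'O2f\u00f3\u00f3']

# Width measured in characters, as a 'U<n>'-style dtype would count it.
char_width = max(len(v) for v in ids)

# Width measured in UTF-8 bytes, as a fixed-width byte field would need.
byte_width = max(len(v.encode('utf-8')) for v in ids)

# Sizing a byte field by character count drops the tail of the ID.
truncated = ids[1].encode('utf-8')[:char_width].decode('utf-8')

print(char_width, byte_width, truncated)
```

Here the longest ID has 5 characters but needs 7 bytes, because each `ó` encodes to two bytes in UTF-8, so a round-trip test with ASCII-only IDs would not exercise this difference.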