feat: empaia parser#48
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a new EMPAIAParser for reading and filtering EMPAIA JSON annotations, exposes it via the parsers package, includes docs and tests, and bumps package version to 1.3.1 while updating authors and modernizing a Shapely import. Changes
Sequence Diagram(s)sequenceDiagram
participant File as EMPAIA JSON (file)
participant Parser as EMPAIAParser
participant Shapely as Shapely Geometry
File->>Parser: provide JSON stream or path
Parser->>Parser: load JSON into self.annotations
Parser->>Parser: compile name regex, filter items by name & type
Parser->>Shapely: convert coords -> Polygon / Point
Shapely-->>Parser: geometry object
Parser-->>Client: yield geometry
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~28 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the library's annotation parsing capabilities by integrating a new parser for EMPAIA-formatted JSON files, allowing for the extraction of both polygon and point annotations. Alongside this new feature, the project's version has been updated, and new contributors have been acknowledged. Minor code improvements were also applied to an existing parser for better consistency and type hinting. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
ratiopath/parsers/empaia_parser.py (2)
31-49: Note:re.match()only matches at the start of the string.This means
name="1"won't match"Annotation 1". If you intend for the pattern to match anywhere in the name, usere.search()instead. However, if matching from the start is intentional (consistent with how other parsers work), this is fine.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ratiopath/parsers/empaia_parser.py` around lines 31 - 49, The current _get_filtered_annotations function uses name_regex.match(...) which only checks the start of annotation["name"]; if you want the provided pattern to match anywhere in the string, replace the use of re.match with re.search (i.e., compile the pattern into name_regex and call name_regex.search(annotation["name"]) when testing annotations in self.annotations["items"]) so names like "Annotation 1" will match a pattern "1"; if start-anchored matching is desired, leave as-is but add a comment clarifying the intentional behavior.
60-66: Avoid creating intermediatePointobjects when constructingPolygon.Shapely's
Polygonconstructor accepts coordinate tuples directly. CreatingPointobjects adds unnecessary overhead.♻️ Suggested simplification
for annotation in self._get_filtered_annotations(name, "polygon"): - yield Polygon( - [ - Point(float(coordinate[0]), float(coordinate[1])) - for coordinate in annotation["coordinates"] - ] - ) + yield Polygon( + [ + (float(coord[0]), float(coord[1])) + for coord in annotation["coordinates"] + ] + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ratiopath/parsers/empaia_parser.py` around lines 60 - 66, The code is creating intermediate Point objects when building a Polygon; update the polygon construction in the loop over self._get_filtered_annotations(name, "polygon") to pass coordinate tuples directly to shapely.geometry.Polygon instead of creating Point instances: transform annotation["coordinates"] into an iterable of (float(x), float(y)) tuples and pass that to Polygon (no Point instantiation), keeping the surrounding loop and yield as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ratiopath/parsers/empaia_parser.py`:
- Around line 31-49: The current _get_filtered_annotations function uses
name_regex.match(...) which only checks the start of annotation["name"]; if you
want the provided pattern to match anywhere in the string, replace the use of
re.match with re.search (i.e., compile the pattern into name_regex and call
name_regex.search(annotation["name"]) when testing annotations in
self.annotations["items"]) so names like "Annotation 1" will match a pattern
"1"; if start-anchored matching is desired, leave as-is but add a comment
clarifying the intentional behavior.
- Around line 60-66: The code is creating intermediate Point objects when
building a Polygon; update the polygon construction in the loop over
self._get_filtered_annotations(name, "polygon") to pass coordinate tuples
directly to shapely.geometry.Polygon instead of creating Point instances:
transform annotation["coordinates"] into an iterable of (float(x), float(y))
tuples and pass that to Polygon (no Point instantiation), keeping the
surrounding loop and yield as-is.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7e7888e3-652f-47f6-89b0-b8ba4679d90a
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
docs/reference/parsers/empaia.mdpyproject.tomlratiopath/parsers/__init__.pyratiopath/parsers/asap_parser.pyratiopath/parsers/empaia_parser.pytests/test_parsers.py
There was a problem hiding this comment.
Code Review
This pull request introduces a new EMPAIAParser for handling EMPAIA format annotation files, supporting both polygon and point features. The changes include the parser implementation, integration into the package's __init__.py, and comprehensive unit tests. The project version has been updated to 1.3.1, and new authors have been added. Feedback includes suggestions to improve efficiency in EMPAIAParser.get_polygons by directly using coordinate tuples for shapely.Polygon construction and to enhance the robustness of polygon assertions in tests by verifying the full closed coordinate sequence.
Parser for annotations downloaded from EMPAIA.
Summary by CodeRabbit
New Features
Documentation
Tests
Chores