-
Notifications
You must be signed in to change notification settings - Fork 4
Refactoring node and edge attribute key API #244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add AttrSchema dataclass with defensive copying - Create process_attr_key_args() helper to eliminate duplication - Refactor dtype inference and SQLAlchemy mapping to use dictionaries - Add overloaded method signatures for convenience and schema modes - Replace _node/edge_attr_keys lists with _attr_schemas dicts - Add comprehensive test suite (16 tests passing) Breaking change: dtype parameter now required. Next phase will update all calling sites throughout the codebase.
…add_edge_attr_key This is a breaking change that requires dtype (polars.DataType) to be explicitly specified when adding attribute keys to graphs. Core changes: - Add AttrSchema dataclass to store key, dtype, and default_value together - Add utility functions: infer_default_value_from_dtype, polars_dtype_to_sqlalchemy_type, validate_default_value_dtype_compatibility, and process_attr_key_args - Update BaseGraph abstract methods with overloaded signatures supporting both direct parameters and AttrSchema objects - Replace _node_attr_keys/_edge_attr_keys lists with _node_attr_schemas dicts in RustWorkXGraph and SQLGraph - Remove _boolean_columns tracking in SQLGraph (now uses AttrSchema) - Fix _polars_schema_override to use == instead of isinstance for pl.Boolean check Backend implementations: - RustWorkXGraph: Use process_attr_key_args helper, store AttrSchema objects - SQLGraph: Use process_attr_key_args helper, pass AttrSchema to _add_new_column - GraphView: Delegate to root graph with all parameters preserved - Add dtype inference fallback for complex objects that polars can't parse Updated all callers: - Graph methods: from_other(), match() - Node operators: RandomNodes, RegionPropsNodes, GenericNodesOperator - Edge operators: DistanceEdges, GenericEdgesOperator - Solvers: ILPSolver, NearestNeighborsSolver - IO: numpy_array, ctc
Update all test files to use the new required dtype parameter for add_node_attr_key and add_edge_attr_key methods. Key test fixes: - Add dtype inference logic in test_add_node_attr_key based on value type - Fix bbox attribute calls to use pl.Array(pl.Int64, size) instead of passing numpy arrays as dtype - Add missing polars imports where needed - Fix argument order from (key, False, dtype=pl.Boolean) to (key, pl.Boolean, default_value=False)
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #244 +/- ##
==========================================
- Coverage 88.35% 88.15% -0.21%
==========================================
Files 56 56
Lines 4104 4203 +99
Branches 714 733 +19
==========================================
+ Hits 3626 3705 +79
- Misses 291 311 +20
Partials 187 187 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @TeunHuijben and @yfukai, I would appreciate it if you both could review this PR, at least the public API, since it involves breaking changes. |
| rx_graph = self.rx_graph | ||
| for node_id in rx_graph.node_indices(): | ||
| rx_graph[node_id][key] = default_value | ||
| rx_graph[node_id][schema.key] = schema.default_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to add the default_value?
This could be solved when returning the attrs value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think returning the attrs value for unset attribute would be better.
| # Add to all existing edges | ||
| for edge_attr in self.rx_graph.edges(): | ||
| edge_attr[key] = default_value | ||
| edge_attr[schema.key] = schema.default_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
|
Haven't personally tested the code, but the API looks good! |
|
Hi @yfukai , this should be ready for review. |
|
Hi sorry @TeunHuijben I'm still reviewing it, I'll finish that within this weekend! |
|
Hi @yfukai, no worries, take your time! Was just wondering whether your thumbs-up meant approval or whether you were reviewing. Thank you! 😊 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks much nicer than what we currently have, thanks @JoOkuma!
I think this PR is good to merge, but
- It would be even better if the dtype does not change with a round-trip with
from_other. For this purpose, maybe we can store theAttrSchemas into a read-only metadata. For RustworkXGraph, maybe we can skip setting the default value to the rx_graph and return the default value for unset attr to save the initialization time.→ I found this addressed in the next PR, thanks!
| rx_graph = self.rx_graph | ||
| for node_id in rx_graph.node_indices(): | ||
| rx_graph[node_id][key] = default_value | ||
| rx_graph[node_id][schema.key] = schema.default_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think returning the attrs value for unset attribute would be better.
Closes #193 and #243