@suvayu
Contributor

@suvayu suvayu commented May 26, 2025

Migrate old JSON parameter_values into the new schema, which is closer to a flat table (for time_series, array, and map) and uses single pyarrow-compatible values for date_time, duration, and time_pattern.

Re spine-tools/Spine-Toolbox#2506

Checklist before merging

  • Documentation (also in Toolbox repo) is up-to-date
  • Release notes have been updated
  • Unit tests have been added/updated accordingly
  • Code has been formatted by black & isort
  • Unit tests pass

Authors

Since GitHub doesn't support setting multiple people as authors of a PR, documenting them here:

@OleMussmann, @suvayu

Contributor Author

@suvayu suvayu left a comment

Added some notes/questions as review comments.

suvayu added 5 commits May 29, 2025 15:00
TimePattern was implemented as an annotated type for schema generation;
however, this is not distinguishable at runtime, so add an alternate
dataclass implementation.
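
A minimal sketch of the runtime problem described in this commit message, not the code from the PR. The names `TimePatternAnnotated`, `TimePatternValue`, and `describe` are hypothetical, and the pattern string is only illustrative. An `Annotated` alias leaves the value a plain `str`, so `isinstance` cannot tell it apart from any other string, whereas a dataclass is a real class that can be checked.

```python
from dataclasses import dataclass
from typing import Annotated

# Hypothetical annotated alias: the metadata is visible only to tools
# that inspect type hints (e.g. schema generators), not on the value.
TimePatternAnnotated = Annotated[str, "time_pattern"]


@dataclass(frozen=True)
class TimePatternValue:
    """Hypothetical dataclass wrapper that exists as a class at runtime."""

    pattern: str


def describe(value: object) -> str:
    """Report what kind of value we can detect at runtime."""
    if isinstance(value, TimePatternValue):
        return "time_pattern"
    if isinstance(value, str):
        return "str"
    return type(value).__name__


annotated_value: TimePatternAnnotated = "M1-4,M9-12"
wrapped_value = TimePatternValue("M1-4,M9-12")

print(describe(annotated_value))  # the annotation is lost: plain str
print(describe(wrapped_value))    # the dataclass is detectable
```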
make columns instead of records from old format parameter_value
Ole Mussmann and others added 12 commits June 2, 2025 21:05
- update only rows that need changes
- batch row updates
- convert types to `table` where necessary
- add `transition_data` function override for debugging
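
The first two bullets above can be sketched with the standard-library `sqlite3` module. This is an illustration under assumptions, not the migration code from the PR: the simplified `parameter_value` table, the set `OLD_TABLE_LIKE_TYPES`, and the helper `pending_updates` are hypothetical, and only the `type` column is rewritten here, whereas a real migration would presumably also convert the stored value.

```python
import sqlite3

# Assumption: these old types become "table" in the new schema.
OLD_TABLE_LIKE_TYPES = {"time_series", "array", "map"}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parameter_value (id INTEGER PRIMARY KEY, type TEXT, value BLOB)"
)
conn.executemany(
    "INSERT INTO parameter_value (id, type, value) VALUES (?, ?, ?)",
    [
        (1, "float", b"2.5"),
        (2, "time_series", b"{}"),
        (3, "map", b"{}"),
        (4, "str", b'"a"'),
    ],
)


def pending_updates(connection: sqlite3.Connection) -> list[tuple[str, int]]:
    """Collect (new_type, id) pairs only for rows that need a change."""
    rows = connection.execute("SELECT id, type FROM parameter_value")
    return [
        ("table", row_id)
        for row_id, value_type in rows
        if value_type in OLD_TABLE_LIKE_TYPES
    ]


updates = pending_updates(conn)
with conn:  # one transaction for the whole batch
    conn.executemany("UPDATE parameter_value SET type = ? WHERE id = ?", updates)

types = dict(conn.execute("SELECT id, type FROM parameter_value ORDER BY id"))
print(types)
```

Rows 1 and 4 are never touched, and the two changed rows go through a single `executemany` call inside one transaction.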
Remove unnecessary nullable types, and unused union type (ValueTypes)
Do not use pandas as intermediate step, instead transform ourselves
from record based to column based - easier for type inspection.

TODO: factor out into a part specific to the data transition and a
part generally useful for inserting into spinedb from outside sources
when using spinedb_api as a library.
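
A rough sketch of a record-to-column transformation without pandas, as described in this commit message. The function `records_to_columns` and the sample records are hypothetical and do not come from the PR. Missing keys are filled with `None` so that all columns keep the same length.

```python
from typing import Any


def records_to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Turn a list of row dicts into a dict of equally long columns."""
    columns: dict[str, list[Any]] = {}
    for i, record in enumerate(records):
        # A key seen for the first time gets back-filled for earlier rows.
        for key in record:
            if key not in columns:
                columns[key] = [None] * i
        # Every known column receives exactly one entry per record.
        for key, column in columns.items():
            column.append(record.get(key))
    return columns


records = [
    {"index": "2025-01-01T00:00", "value": 1.5},
    {"index": "2025-01-01T01:00", "value": 2.5},
    {"index": "2025-01-01T02:00"},  # value missing on purpose
]
columns = records_to_columns(records)

# With plain lists, the types in each column can be inspected directly.
column_types = {
    name: {type(item).__name__ for item in column}
    for name, column in columns.items()
}
print(columns)
print(column_types)
```

Keeping the columns as plain Python lists means the element types are exactly what was read from the records, with no implicit coercion by an intermediate library.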


# types
class TimePeriod(str):
Contributor

Could this class be replaced by typing.NewType?

TimePeriod = NewType("TimePeriod", str)
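
For context on the trade-off, here is a small sketch comparing the two options at runtime. The names `TimePeriodClass` and `TimePeriodNew` are hypothetical, chosen only so both variants can coexist in one snippet. A `str` subclass keeps its identity at runtime, while a `NewType` call returns the plain `str` it was given, so the distinction exists only for static type checkers.

```python
from typing import NewType


class TimePeriodClass(str):
    """Subclass variant, as in the diff (renamed here to avoid a clash)."""


# NewType variant, as suggested in the comment (also renamed).
TimePeriodNew = NewType("TimePeriodNew", str)

a = TimePeriodClass("monthly")
b = TimePeriodNew("monthly")

print(type(a).__name__)  # the subclass survives at runtime
print(type(b).__name__)  # NewType returns its argument unchanged: a plain str

# Runtime dispatch works only for the subclass variant.
print(isinstance(a, TimePeriodClass))
print(isinstance(b, TimePeriodClass))
```

If the code needs to tell a `TimePeriod` apart from an ordinary string at runtime, the subclass is the variant that allows it; if the distinction only matters to type checkers, `NewType` is lighter.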

- simplify with early return
- document implementation detail
- improve error message
- add pyarrow "duration" type
- add pyarrow types to type_map
- function to convert pyarrow arrays to model arrays (requirement to
  be able to serialise pyarrow for the DB)
- separate schema validation (aka pydantic models) & additional
  conventions like metadata and restrictions related to value/index
  columns into models.py & parameter_value.py respectively
- improve error message when raising SpineDBAPIError
- remove unused functions after refactor
previous workaround was breaking any_array
4 participants