Skip to content

fix(tracking)!: switch mlflow from deprecated mlruns to sqllite db#122

Open
pizzajojo wants to merge 13 commits intomainfrom
109-migrate-mlflow-to-the-newer-db-storage
Open

fix(tracking)!: switch mlflow from deprecated mlruns to sqllite db#122
pizzajojo wants to merge 13 commits intomainfrom
109-migrate-mlflow-to-the-newer-db-storage

Conversation

@pizzajojo
Copy link
Copy Markdown
Collaborator

@pizzajojo pizzajojo commented May 8, 2026

Summary

- Migrate the default local MLflow tracking backend from deprecated filesystem storage to SQLite.

- Add persistent MLflow UI behavior so `tracking.open_when_done=true` leaves the local UI/server available after a run.

- Remove the stale `mlflow_server.yaml` preset from the old standalone launcher path.

- Update MLflow tracking docs and tests for the new SQLite-backed default.

Key Changes

- Default backend store URI is now `sqlite:///mlflow/mlflow.db`.

- Default artifact root is now `./mlflow/artifacts`.

- Local MLflow server startup now passes `--backend-store-uri` and `--default-artifact-root`.

- Direct SQLite UI startup uses `mlflow ui --backend-store-uri ...`.

- When reusing an already-open direct SQLite UI port, RAITAP now logs a warning that the existing UI may point to a different backend.

- Added constants for the default MLflow UI host/port instead of hardcoded literals.

- `tracking.output_forwarding_url` remains the supported way to connect to an existing/company MLflow server.

Checklist

  • CI — Required checks are green
  • Breaking changes — If this PR breaks compatibility (API, configs, file formats, etc.), the title uses ! after the type or scope (e.g. feat!: or feat(transparency)!:). I understand that this change will have a direct, disruptive impact on package users. I ensured that this change is absolutely necessary and cannot be delayed.
  • Contributor guide — I’ve read Pull requests and commit messages before requesting review.

Optional

  • Issue — A GitHub issues is linked to this PR ("Development" section in the right sidebar).
  • Docs — User or contributor docs updated where needed.
  • Tests — New or updated tests cover the change.

Copilot AI review requested due to automatic review settings May 8, 2026 15:32
@pizzajojo pizzajojo linked an issue May 8, 2026 that may be closed by this pull request
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Warning

The PR contains breaking changes, but the tiles does not contain "!". Please mark it as breaking by adding "!" after the type or scope (e.g. feat!: or feat(api)!:).

This PR migrates RAITAP’s default local MLflow tracking storage from the deprecated file-store layout (./mlruns) to a SQLite-backed store, and updates the MLflow launcher behavior so tracking.open_when_done=true keeps the local UI/server available after a run.

Changes:

  • Default local MLflow backend store switched to sqlite:///mlflow/mlflow.db with artifact root ./mlflow/artifacts, and local server/UI startup now passes these explicitly.
  • MLFlowTracker now ensures local paths exist, ensures direct-store experiments exist (with an explicit artifact location), and preserves spawned UI/server processes when open_when_done=true.
  • Tests and documentation updated for the SQLite-backed default; legacy mlflow_server.yaml removed.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/raitap/tracking/mlflow_tracker.py Implements SQLite default backend/artifact roots, server/UI CLI args, local path preparation, and new UI persistence behavior.
src/raitap/configs/tracking/mlflow.yaml Adds default backend_store_uri and default_artifact_root to the MLflow tracking preset.
src/raitap/configs/tracking/mlflow_server.yaml Removes obsolete standalone launcher preset.
src/raitap/configs/schema.py Extends TrackingConfig with backend_store_uri and default_artifact_root.
src/raitap/tracking/tests/test_mlflow_tracker.py Adds/updates mock-driven tests covering SQLite defaults, server/UI invocation, and terminate behavior.
src/raitap/tracking/README.md Updates tracking smoke-test docs and adds migration guidance for existing mlruns histories.
docs/modules/tracking/frameworks-and-libraries.md Documents the new SQLite default and migration command.
docs/modules/tracking/configuration.md Updates output_forwarding_url default behavior text and CLI example for enabling tracking.

Comment thread src/raitap/tracking/mlflow_tracker.py Outdated
Comment thread src/raitap/tracking/mlflow_tracker.py
Comment thread src/raitap/tracking/mlflow_tracker.py Outdated
Comment thread docs/modules/tracking/configuration.md
Comment thread src/raitap/tracking/tests/test_mlflow_tracker.py
@pizzajojo pizzajojo requested a review from stanlrt May 8, 2026 16:59
@pizzajojo pizzajojo self-assigned this May 8, 2026
@pizzajojo pizzajojo changed the title fix(tracking): switch mlflow from deprecated mlruns to sqllite db fix(tracking)!: switch mlflow from deprecated mlruns to sqllite db May 8, 2026
log_model: true

:cli: tracking=mlflow tracking.log_model=true
:cli: +tracking=mlflow tracking.log_model=true
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why +

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because tracking is not in our default config.yaml. + is used in mflow to add a property/config without it you overwrite existing config/properties. see below:\

❯ uv run raitap tracking=mlflow
      Built raitap @ file:///Users/jonasvonderhagen/raitap
Uninstalled 1 package in 7ms
Installed 1 package in 2ms
Could not override 'tracking'. No match in the defaults list.
To append to your default list use +tracking=mlflow

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also add tracking=null to config.yaml

Comment thread src/raitap/configs/tracking/mlflow.yaml Outdated
_target_: MLFlowTracker
output_forwarding_url: http://127.0.0.1:5000
output_forwarding_url: http://127.0.0.1:5001
backend_store_uri: sqlite:///mlflow/mlflow.db
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we setting the default values?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed backend_store_uri, default_artifact_root, log_model, and open_when_done from the preset so it only defines the MLflow target and local server URL.

Comment thread src/raitap/configs/tracking/mlflow.yaml Outdated
default_artifact_root: ./mlflow/artifacts
log_model: false
open_when_done: true No newline at end of file
open_when_done: true
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isnt this the default?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

class TrackingConfig:
_target_: str = "MLFlowTracker"
output_forwarding_url: str | None = None
backend_store_uri: str | None = None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that does not match the docs. if the defaults are MLFlow specific, it is correct to not set them here. but the docs should be updated (config and mlflow pages)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs updated

Comment thread src/raitap/tracking/mlflow_tracker.py Outdated
else:
logger.info("MLflow UI is ready at %s", url)
else:
logger.warning(
Copy link
Copy Markdown
Collaborator

@stanlrt stanlrt May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #124 and

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waiting for merge

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased and addapted logging

pizzajojo added a commit that referenced this pull request May 9, 2026
Remove duplicated default values from the +tracking=mlflow preset and
document MLFlowTracker-specific fallback behavior separately from generic
TrackingConfig defaults.

Align MLflow examples to the preset's 127.0.0.1:5001 server URL.

Refs:
- #122 (comment)
- #122 (comment)
@pizzajojo pizzajojo force-pushed the 109-migrate-mlflow-to-the-newer-db-storage branch from 1ffc521 to 38abd17 Compare May 10, 2026 06:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate MLFlow to the newer DB storage

3 participants