@Samoed
Member

@Samoed Samoed commented Jan 4, 2026

uv.lock is too large (10 MB on main and 25 MB on maeb), which makes switching between branches really hard. I think we can delete it to make development easier.

@Samoed Samoed requested a review from isaac-chung January 4, 2026 09:31
Collaborator

@isaac-chung isaac-chung left a comment

I'm unsure about the consequences, but I'm willing to try it out.

@KennethEnevoldsen
Contributor

I would really keep it. The lock file is there to communicate that this worked with exactly these dependencies. It helps us and others track problematic dependencies.

From the docs:

Unlike the pyproject.toml, which is used to specify the broad requirements of your project, the lockfile contains the exact resolved versions that are installed in the project environment. This file should be checked into version control, allowing for consistent and reproducible installations across machines.

A lockfile ensures that developers working on the project are using a consistent set of package versions. Additionally, it ensures when deploying the project as an application that the exact set of used package versions is known.

We could consider using pylock.toml, but uv only supports that as an export format, so I don't see it as a real alternative.

Maybe it is better to examine why it is so large and what we can do to reduce it

@Samoed
Member Author

Samoed commented Jan 4, 2026

The lock file prevents us from testing on the latest versions, so we can't see problems before users face them.

Maybe it is better to examine why it is so large and what we can do to reduce it

We have a lot of conflicting dependencies:

mteb/pyproject.toml

Lines 322 to 338 in 44e9b20

conflicts = [
    [{ extra = "timm" }, { extra = "blip2" }],
    [{ extra = "llm2vec" }, { extra = "model2vec" }],
    [{ extra = "llm2vec" }, { extra = "pylate" }], # conflicting versions of transformers
    [{ extra = "llm2vec" }, { extra = "llama-embed-nemotron" }], # conflicting versions of transformers
    [{ extra = "colpali-engine" }, { extra = "pylate" }],
    [{ extra = "colpali-engine" }, { extra = "llm2vec" }],
    [{ extra = "colpali-engine" }, { extra = "llama-embed-nemotron" }], # conflicting version of transformers
    [{ extra = "colqwen3" }, { extra = "pylate" }], # conflicting versions of transformers
    [{ extra = "colqwen3" }, { extra = "llm2vec" }], # conflicting versions of transformers
    [{ extra = "colqwen3" }, { extra = "llama-embed-nemotron" }], # conflicting versions of transformers
    [{ extra = "jina-v4" }, { extra = "llm2vec" }],
    [{ extra = "jina-v4" }, { extra = "llama-embed-nemotron" }], # conflicting versions of transformers
    [{ extra = "sauerkrautlm-colpali" }, { extra = "pylate" }],
    [{ extra = "sauerkrautlm-colpali" }, { extra = "llm2vec" }],
    [{ extra = "sauerkrautlm-colpali" }, { extra = "llama-embed-nemotron" }],
]
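
To see which extras drive most of these conflicts, here is a rough sketch that tallies how often each extra appears in the pairs listed above. The pairs are copied by hand from the excerpt; the names `CONFLICTS` and `conflicts_per_extra` are only illustrative.

```python
from collections import Counter

# Conflict pairs copied from the pyproject.toml excerpt above.
CONFLICTS = [
    ("timm", "blip2"),
    ("llm2vec", "model2vec"),
    ("llm2vec", "pylate"),
    ("llm2vec", "llama-embed-nemotron"),
    ("colpali-engine", "pylate"),
    ("colpali-engine", "llm2vec"),
    ("colpali-engine", "llama-embed-nemotron"),
    ("colqwen3", "pylate"),
    ("colqwen3", "llm2vec"),
    ("colqwen3", "llama-embed-nemotron"),
    ("jina-v4", "llm2vec"),
    ("jina-v4", "llama-embed-nemotron"),
    ("sauerkrautlm-colpali", "pylate"),
    ("sauerkrautlm-colpali", "llm2vec"),
    ("sauerkrautlm-colpali", "llama-embed-nemotron"),
]


def conflicts_per_extra(pairs: list[tuple[str, str]]) -> Counter:
    """Count how many conflict pairs each extra takes part in."""
    counts: Counter = Counter()
    for left, right in pairs:
        counts[left] += 1
        counts[right] += 1
    return counts


if __name__ == "__main__":
    for extra, count in conflicts_per_extra(CONFLICTS).most_common():
        print(f"{extra}: {count}")
```

On this list, llm2vec takes part in 7 of the 15 pairs, llama-embed-nemotron in 5, and pylate in 4, so a handful of extras account for most of the grid.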

@KennethEnevoldsen
Contributor

The lock file prevents us from testing on the latest versions, so we can't see problems before users face them.

It would be doable to just update it in our test CI.

We have a lot of conflicting dependencies

Hmm I actually thought conflicting dependencies would reduce the solve time (fewer constraints), but it seems like literally half the file is conflicting dependencies.

What is the best approach to combat these? My current thoughts are:

  1. [Probably the most likely] Figure out if there is some setting in uv that could help us.
  2. Make mteb-models a separate package (as seen in other packages like pydantic-eval). This can be done in a backward-compatible manner and reduces complexity in the core mteb package. Probably not the short-term solution. We could even move certain models to some sort of legacy section.
  3. Forgo maintaining implementations for certain models (e.g. llm2vec). This is a reproducibility vs. development cost trade-off.
  4. Work with underlying packages to expand support

@Samoed
Member Author

Samoed commented Jan 4, 2026

Hmm I actually thought conflicting dependencies would reduce the solve time (fewer constraints),

It is the opposite. They increase complexity by a lot, see astral-sh/uv#16779.

Work with underlying packages to expand support

I think we have a lot of packages and most of them are not maintained

@KennethEnevoldsen
Contributor

Seems like a solution might be to simplify the conflicting dependencies:

# sauerkrautlm-colpali can rely on colpali-engine, which should reduce the grid
sauerkrautlm-colpali = [{include-group = "colpali-engine"}, ...]
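
As a rough check on where such simplification might apply, the sketch below groups extras that conflict with exactly the same set of other extras, using the pairs from the pyproject.toml excerpt earlier in this thread. It only shows that the conflict sets coincide; it does not show that the extras' dependencies are actually compatible. The helper names are hypothetical.

```python
from collections import defaultdict

# Conflict pairs copied from the pyproject.toml excerpt in this thread.
CONFLICTS = [
    ("timm", "blip2"),
    ("llm2vec", "model2vec"),
    ("llm2vec", "pylate"),
    ("llm2vec", "llama-embed-nemotron"),
    ("colpali-engine", "pylate"),
    ("colpali-engine", "llm2vec"),
    ("colpali-engine", "llama-embed-nemotron"),
    ("colqwen3", "pylate"),
    ("colqwen3", "llm2vec"),
    ("colqwen3", "llama-embed-nemotron"),
    ("jina-v4", "llm2vec"),
    ("jina-v4", "llama-embed-nemotron"),
    ("sauerkrautlm-colpali", "pylate"),
    ("sauerkrautlm-colpali", "llm2vec"),
    ("sauerkrautlm-colpali", "llama-embed-nemotron"),
]


def conflict_partners(pairs: list[tuple[str, str]]) -> dict[str, set[str]]:
    """For each extra, collect the extras it is declared to conflict with."""
    partners: dict[str, set[str]] = defaultdict(set)
    for left, right in pairs:
        partners[left].add(right)
        partners[right].add(left)
    return dict(partners)


def extras_with_identical_conflicts(pairs: list[tuple[str, str]]) -> list[list[str]]:
    """Group extras whose conflict partners are exactly the same set."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for extra, others in conflict_partners(pairs).items():
        groups[frozenset(others)].append(extra)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    for group in extras_with_identical_conflicts(CONFLICTS):
        print(", ".join(group))
```

On the current list this yields a single group: colpali-engine, colqwen3, and sauerkrautlm-colpali all conflict with pylate, llm2vec, and llama-embed-nemotron, and with nothing else.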

There are also a few of these where I am not sure why they are there (the model2vec and llm2vec conflict seems odd).

We can also set the environments (probably the way to go), e.g.:

[tool.uv]
environments = [
    "python_version >= '3.10' and python_version < '3.13' and sys_platform == 'linux'",
    "python_version >= '3.10' and python_version < '3.13' and sys_platform == 'darwin'",
    "python_version >= '3.10' and python_version < '3.13' and sys_platform == 'win32'",
]

I could see us opening a uv issue on this as well. I think it is a pretty good case for them, and it seems like this has been missing from previous issues (astral-sh/uv#9735).

@KennethEnevoldsen
Contributor

I think we have a lot of packages and most of them are not maintained

Yeah, if we want to keep support for these, I think we need to figure out a decent approach.

4 participants