125 changes: 125 additions & 0 deletions cep-recipe-jinja.md
@@ -0,0 +1,125 @@
# Jinja functions in recipes

The new recipe format has some Jinja functionalities. We want to specify what functions exist and their expected behavior.

Contributor

I feel this is lacking a short intro. At least we should link into the previous CEPs.

Suggested change:
- The new recipe format has some Jinja functionalities. We want to specify what functions exist and their expected behavior.
+ The new recipe format (introduced by CEP-XX, CEP-XY) has some Jinja functionalities. We want to specify what functions exist and their expected behavior.


## The `compiler` function

The `compiler` function assembles a compiler package spec from the `{lang}_compiler` and `{lang}_compiler_version` keys of the variant config.

The function looks as follows:

```yaml
${{ compiler('c') }}
```

This would pull in the `c_compiler` and `c_compiler_version` values from the variant config. The compiler function suffixes the `{lang}_compiler` value with the `target_platform` to render to something such as:

```
gcc_linux-64 8.9
clang_osx-arm64 12
msvc_win-64 19.29
```

The function thus evaluates to `{compiler}_{target_platform} {compiler_version}`.
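
As a rough illustration of the rule above, here is a minimal Python sketch. It is not the rattler-build or conda-build implementation: the function signature, the flat `variant` dictionary, and the choice to treat both variant keys as required are assumptions made for the example.

```python
def compiler(lang: str, variant: dict[str, str], target_platform: str) -> str:
    """Render `compiler(lang)` as `{compiler}_{target_platform} {compiler_version}`.

    Hypothetical helper: `variant` stands in for a single resolved variant,
    and both keys are assumed to be present.
    """
    name = variant[f"{lang}_compiler"]
    version = variant[f"{lang}_compiler_version"]
    return f"{name}_{target_platform} {version}"


variant = {
    "c_compiler": "gcc",
    "c_compiler_version": "8.9",
    "cxx_compiler": "clang",
    "cxx_compiler_version": "12",
}
print(compiler("c", variant, "linux-64"))
print(compiler("cxx", variant, "osx-arm64"))
```

What should happen when `{lang}_compiler_version` is missing is not stated in the draft, so the sketch simply raises a `KeyError` in that case.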

The variant config could look something like:

```
c_compiler:
- gcc
c_compiler_version:
- "8.9"
cxx_compiler:
- clang
cxx_compiler_version:
- "12"
```

Contributor

This seems a bit out of place; I think the example variant config should be provided before using its values to evaluate the function.

Additionally, we should provide an example variant config showing how to select different compilers for different platforms; e.g., are we keeping the conda_build_config.yaml mechanism?

c_compiler:
  - gcc       # [linux]
  - clang     # [osx]
  - vc        # [win]
c_compiler_version:
  - "14.1"    # [linux]
  - "17.0"    # [osx]
  - "2022"    # [win]

Contributor Author

yeah we need to write a spec for the variant config (or even more global "config"). But IMO that shouldn't be part of this CEP.


## The `cdt` function

CDT stands for "core dependency tree". CDT packages are repackaged from CentOS.

My comment disappeared, so re-adding it:

Does "centos" have to be mentioned here? CDT packages could technically contain binaries from any Linux distribution right? I'm also wondering if we should add a link to some documentation that describe CDT more in details?

Contributor Author

It's the current default value in conda-build. idk what to do, i am basically just documenting what's already in conda-build.


The function expands to the following:

- `package-name-<cdt_name>-<cdt_arch>`

Where `cdt_name` and `cdt_arch` are loaded from the variant config. If they are undefined, they default to:

- `cos6` for `cdt_name` on `x86_64` and `x86`, otherwise `cos7`

I feel a little bit uneasy with having default values hardcoded here. Do we really need to define default values in the spec? I feel like this kind of thing should be explicitly defined in an ecosystem (the central variant config in an ecosystem) instead of relying on some defaults.

Contributor

I agree that we should NOT have default values for CDTs. The choice of CentOS 6 and 7 was made by Anaconda and conda-forge long ago, but I don't think we should necessarily impose such choices on the ecosystem as a whole.

Further, imposing these defaults means we either have to indefinitely support CDTs from EOL Linux distributions (CentOS 6 hit EOL in 2020-Nov, and CentOS 7 will hit EOL in 2024-Jun), or the evaluation of ${{ cdt("...") }} changes depending on the version of {conda,rattler}-build you use. Neither of those outcomes seems ideal.

Contributor Author

Sure, we can get rid of this. I think we adopted this from conda-build where these values are also there by default.

FWIW, in the new CDTs for Alma 8, we're planning to get rid of cdt_name.

Member

I'd also leave/move these out (see comment regarding defaults for compilers above).

Contributor Author

@h-vetinari - curious what that means exactly? A change to conda-build?

I'll remove the defaults from the CEP!

- To the `platform::arch` for `cdt_arch`, except for `x86` where it defaults to `i686`.
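
The expansion described in this draft section, including the defaults that the thread above proposes to drop, could be sketched in Python roughly as follows. The function signature and the `arch` parameter (standing in for `platform::arch`, assumed to yield strings such as `x86_64`) are hypothetical and not taken from conda-build or rattler-build.

```python
def cdt(package_name: str, variant: dict[str, str], arch: str) -> str:
    """Expand `cdt(package_name)` to `<package-name>-<cdt_name>-<cdt_arch>`.

    `arch` is an assumed stand-in for `platform::arch`. The fallbacks below
    mirror the defaults listed in the draft text.
    """
    cdt_name = variant.get("cdt_name")
    if cdt_name is None:
        # Draft default: cos6 on x86_64 and x86, cos7 everywhere else.
        cdt_name = "cos6" if arch in ("x86_64", "x86") else "cos7"

    cdt_arch = variant.get("cdt_arch")
    if cdt_arch is None:
        # Draft default: the platform arch, except x86 which maps to i686.
        cdt_arch = "i686" if arch == "x86" else arch

    return f"{package_name}-{cdt_name}-{cdt_arch}"


print(cdt("mesa-libgl-devel", {}, "x86_64"))
print(cdt("mesa-libgl-devel", {"cdt_name": "cos7"}, "aarch64"))
```

If the defaults are removed from the CEP as discussed, the two fallback branches would presumably become errors for undefined variant keys.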

## The `pin_subpackage` function

pin_subpackage is defined in another section. This section seems to only define the arguments of pin_subpackage and pin_compatible?


### Pin definition

A pin has the following arguments:

- `min_pin`: The lower bound of the dependency spec. This is expressed as a `x.x....` version where the `x` are filled in from the corresponding version of the package. For example, `x.x` would be `1.2` for a package version `1.2.3`. The resulting pin spec would look like `>=1.2` for the `min_pin` argument of `x.x` and a version of `1.2.3` for the package.
- `max_pin`: This defines the upper bound and follows the same `x.x` semantics but adds `+1` to the last segment. For example, `x.x` would be `1.(2 + 1)` for a package version `1.2.3`. The resulting pin spec would look like `<1.3` for the `max_pin` argument of `x.x` and a version of `1.2.3` for the package.

Contributor

There seems a strong but unstated assumption here that version strings are integers separated by dots (.). If that is the case, that assumption should be explicitly stated and guidance provided as to what should happen if the assumption is violated. (Just sayin', because 1!24.4.0+91_gbfa8ccdce is a perfectly valid PEP-440/conda package version string.)

Contributor

Also, as written, it seems like min_pin and max_pin do not support an explicit version; e.g., recipes cannot simply have max_pin="1.2.3".

Also, as written, it seems like min_pin and max_pin do not support an explicit version; e.g., recipes cannot simply have max_pin="1.2.3".

Conda's existing pin_compatible jinja has both {min,max}_pin as well as {lower,upper}_bound, in order to be able to distinguish 'x.x' from an explicit version limit.

Perhaps it's easier to understand to overload both styles into two kwargs (one for lower, one for upper), and allow either 'x(.x)*' or an explicit version number? It's more work to implement, but less confusing IMO (otherwise it's easy to specify conflicting things e.g. by setting both max_pin and upper_bound)

Contributor Author

@wolfv wolfv May 10, 2024

Wow, I wasn't aware! I found a few places where it's used but I don't know how it works. For example, in the onnx-feedstock it says:

    - {{ pin_compatible('numpy', lower_bound='1.19', upper_bound='3.0') }}  # [py>38]

Does this still take the lower bound from the build-time resolution? Or is this equivalent to just writing numpy >=1.19,<3.0?

(https://github.com/conda-forge/onnx-feedstock/blob/49b7f95ea64d2b0e174f23ac62fe0f59441491d7/recipe/meta.yaml#L49)

Contributor Author

I also don't really understand why we would want this in the function. One can just as well write something like:

- numpy >=1.5
- ${{ pin_compatible("numpy", max_pin="x.x", min_pin="x.x.x") }}

If one needs additional (hard-coded) constraints. The solver doesn't really care about more constraints.

Contributor

@h-vetinari I'm on board with your proposal.

I suppose that could in theory create a problem if some upstream project decided to use the string literals x.x, x.y, and x.z as their versions, but if that ever happens, I'm happy to say "NOPE" to attempting to package it. 😆

@wolfv, given your changes in 0fda213, I presume you're not on board with this idea (or didn't see this thread)?

If you want to keep {min,max}_pin separate from {lower,upper}_bound, I think we need to enforce that at most one {min_pin,lower_bound} resp. {max_pin,upper_bound} each may be specified, everything else should raise an error.

I'd still prefer a unified interface, because I don't see the gain to keep these arguments separate (but rather requires a lot of complexity to ensure it is unambiguous/consistent)

Contributor Author

I indeed didn't see it or didn't read carefully enough. I also think this is a good change. The logic is already in place, so I don't think it will be a big change on the rattler-build side.

Contributor Author

OK, I just implemented that we only use lower_bound and upper_bound, but now I am wondering if we shouldn't have used min_pin and max_pin instead (as more people are using that these days). But this PR has the implementation: prefix-dev/rattler-build#918

@h-vetinari h-vetinari Jun 10, 2024

I proposed the API that I think is more natural and IMO more desirable long-term.

Obviously the naming choice is just bike shedding and both would work, but max_pin=1.2.3 seems off to me, whereas upper_bound='x.x' is fine because it's a more generic term, where users don't have to guess what a pin is or isn't.

- `exact`: This is a boolean that specifies whether the pin should be exact. It defaults to `False`. If `exact` is `True`, the `min_pin` and `max_pin` are irrelevant. We pin to a `==` version and also include the build string exactly (e.g. `==1.2.3=h1234`).

#### Example

If we consider a package like `numpy-1.21.3-h123456_5` we could apply some pin expressions.

- `min_pin=x.x, max_pin=x.x` would result in `>=1.21,<1.22`
- `min_pin=x.x.x, max_pin=x` would result in `>=1.21.3,<2`
- `exact=True` would result in `==1.21.3=h123456_5`
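
The three results above can be reproduced with a small Python sketch of the pin logic as the draft describes it. This is an illustration under assumptions, not the rattler-build implementation: the `pin` signature is hypothetical, an omitted `min_pin` or `max_pin` is taken to mean "no bound on that side", and versions are assumed to be plain dot-separated integers (the limitation raised in the review thread above).

```python
from typing import Optional


def _take(version: str, pin_expr: str) -> list[int]:
    """Keep as many leading version segments as the pin expression has `x`s."""
    count = len(pin_expr.split("."))
    return [int(segment) for segment in version.split(".")[:count]]


def pin(
    version: str,
    build_string: str,
    min_pin: Optional[str] = None,
    max_pin: Optional[str] = None,
    exact: bool = False,
) -> str:
    """Build a pin spec from a resolved version (hypothetical helper)."""
    if exact:
        # exact pins ignore min_pin/max_pin and include the build string.
        return f"=={version}={build_string}"

    bounds = []
    if min_pin is not None:
        lower = _take(version, min_pin)
        bounds.append(">=" + ".".join(str(n) for n in lower))
    if max_pin is not None:
        upper = _take(version, max_pin)
        upper[-1] += 1  # bump the last kept segment for the upper bound
        bounds.append("<" + ".".join(str(n) for n in upper))
    return ",".join(bounds)


print(pin("1.21.3", "h123456_5", min_pin="x.x", max_pin="x.x"))
print(pin("1.21.3", "h123456_5", min_pin="x.x.x", max_pin="x"))
print(pin("1.21.3", "h123456_5", exact=True))
```

A version such as `1!24.4.0+91_gbfa8ccdce` would make `int()` fail here, which is exactly the unspecified case the reviewers point out.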


## The `pin_compatible` function

Pin compatible will pin the dependency to the same version as "previously" resolved in the `host` or `build` environment. This is useful to ensure that the same package is used at run time as was used at build time.

Example:

```yaml
requirements:
  host:
    - numpy
  run:
    - ${{ pin_compatible('numpy', exact=True) }}
```

Does it support min_pin and max_pin? It's unclear right now in this spec.

Contributor Author

Yes, it does. Both use the same pin specification as mentioned earlier.

## The `pin_subpackage` function

Pin subpackage will pin the dependency to the same version as another sub-package from the recipe (or the current package itself).
This is useful to ensure that multiple outputs from a recipe are linked together or to export the correct `run_exports` for a package.

Example:

```yaml
outputs:
  - package:
      name: libfoo
      version: "1.2.3"
  - package:
      name: foo
      version: "1.2.3"
    requirements:
      run:
        - ${{ pin_subpackage('libfoo', exact=True) }}
```

Does it support min_pin and max_pin? It's unclear right now in this spec.

Contributor Author

They both are derived from the same "pin" function. So yeah, they both support the same inputs.

## The `cmp` function

The `cmp` function is used to compare two versions. It returns `True` if the comparison is true and `False` otherwise.

Should we add an example of how it can be used?

Contributor

Yes, what operators are available and what is the syntax?

Contributor Author

Added some examples!

Contributor

Ah so it's more like a version matcher. When I read compare I thought we were comparing two packages together like "is numpy's version greater than python's", which didn't make much sense to me.

But this is more like asking "Does the version of Python match this version expression?". So in that sense the name could be better expressed as:

  • version(python, "<3.8")
  • version_match(python, "<3.8")
  • match_version(python, "<3.8")
  • satisfy(python, "<3.8")

Also this assumes that python will be represented by a concrete version x.y.z and not a spec like x.y.* right? IOW, it's already resolved to a specific package record.

If that's the case, should we specify that this function can only return after the solve has completed?


## The `hash` variable

`${{ hash }}` is the variant hash and is useful in the build string computation. This used to be `PKG_HASH` in the old recipe format. Since the `hash` variable depends on the variant computation, it is only available in the `build.string` field and is computed after the entire variant computation is finished.

Contributor

Should we explicitly say that this is a read-only variable? E.g., is providing "hash: deadbeef123455" in the variant config file explicitly prohibited?

Contributor

Shall we call it variant_hash?

Contributor Author

it's not really prohibited in the implementation right now. But we can say that it should be...

I don't have strong feelings about the name. We can change it to variant_hash if that's preferred!


## The `version_to_buildstring` function

- `${{ python | version_to_buildstring }}` converts a version from the variant to a build string (it removes the `.` character and takes only the first two elements of the version).

Contributor

Should we provide an example here?

Contributor

This is the first time we introduce the | syntax as well. Can we do version_to_buildstring(python)?

Contributor Author

Yeah using a function would also work. But we also expose the default minijinja filters (such as lower, replace, ...), that can be used like ${{ "mystring" | upper }}

or ${{ val | replace('foo', 'moo') }}

Contributor Author

@jaimergp I added some more text regarding jinja filters to the bottom of the CEP.
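
For reference, the behavior the draft describes for `version_to_buildstring` (drop the `.` separators and keep only the first two version elements) can be approximated in a few lines of Python. The Python function is purely illustrative; how versions with fewer than two segments or with non-numeric parts should be handled is not specified in the draft.

```python
def version_to_buildstring(version: str) -> str:
    """Join the first two dot-separated segments of a version without the dot."""
    return "".join(version.split(".")[:2])


print(version_to_buildstring("3.11.4"))
print(version_to_buildstring("3.9"))
```

Under this reading, a variant value of `3.11.4` for `python` would contribute `311` to the build string.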


## The `env` object

You can use the `env` object to retrieve environment variables and forward them to your build script. There are two ways to do this:

- `${{ env.get("MY_ENV_VAR") }}` will return the value of the environment variable `MY_ENV_VAR` or throw an error if it is not set.
- `${{ env.get_default("MY_ENV_VAR", "default_value") }}` will return the value of the environment variable `MY_ENV_VAR` or `"default_value"` if it is not set.

Contributor

Could this be overloaded to a single function? With the 2nd param being optional.

Contributor Author

yeah, that can work. We can also make it a named argument, e.g.

env.get("FOO", default="baz")

What do you think?

Contributor Author

@jaimergp I implemented this change: prefix-dev/rattler-build#917

Let me know what you think.


You can also check for the existence of an environment variable:

- `${{ env.exists("MY_ENV_VAR") }}` will return `true` if the environment variable `MY_ENV_VAR` is set and `false` otherwise.
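
The semantics of the three `env` accessors, in the two-function form written in this draft, can be mimicked with a short Python sketch. The `Env` class is a hypothetical stand-in for the object exposed to the recipe, and the use of `KeyError` for the "throw an error" case is an assumption.

```python
import os


class Env:
    """Hypothetical stand-in for the `env` object available in recipes."""

    def get(self, name: str) -> str:
        # Error out when the variable is not set, as the draft requires.
        if name not in os.environ:
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]

    def get_default(self, name: str, default: str) -> str:
        # Fall back to the given default when the variable is not set.
        return os.environ.get(name, default)

    def exists(self, name: str) -> bool:
        # A real boolean, not the string "true" or "false".
        return name in os.environ


env = Env()
print(env.get_default("MY_ENV_VAR", "default_value"))
print(env.exists("MY_ENV_VAR"))
```

The review thread above discusses folding `get_default` into `get` via a `default` argument; in this sketch that would amount to giving `get` an optional second parameter.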

Nitpick. false and False are both used in the spec. It's hard to know if it returns a string or a bool.

Contributor Author

I added the word boolean true to make it clear that this is a bool value.