Regression stability + new regression tests by scarlehoff · Pull Request #2480 · NNPDF/nnpdf

scarlehoff · 2026-06-03T13:13:27Z

Add a test for vp-setupfit
Improve the stability of the regression tests
Include a theory covmat test
Generate the data in a dedicated worker
Set the tolerance with respect to said worker

scarlehoff · 2026-06-04T19:59:14Z

OK, so the good news is that after many label/unlabel cycles I now have two samples that produce different results and make the test fail.
The failure seems to come from workers in different Azure regions, and there is no way to choose among them. There is also no guarantee that all workers within the same region are the same.

There are two options:

Run the regression on our own hardware.
Use those two samples to set the tolerances.

I think the solution I'll go for is:

Regenerate the regression on our own hardware (if our GitHub plan allows for it...)
Use the two samples to select a threshold compatible with the drift with respect to our own hardware.

This should be the most robust option.
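As a rough illustration of the second step, the sketch below derives a relative tolerance from the drift of two samples with respect to a reference generated on fixed hardware. All names (`max_relative_drift`, `select_tolerance`, `safety_factor`) and all numbers are hypothetical and are not taken from the NNPDF code or from the actual CI results; the safety factor in particular is an assumption.

```python
def max_relative_drift(reference, sample, floor=1e-12):
    """Largest element-wise relative deviation of `sample` from `reference`."""
    if len(reference) != len(sample):
        raise ValueError("reference and sample must have the same length")
    # `floor` avoids dividing by zero when a reference value vanishes
    return max(abs(s - r) / max(abs(r), floor) for r, s in zip(reference, sample))


def select_tolerance(reference, samples, safety_factor=2.0):
    """Pick a tolerance that covers the worst observed drift, with some margin."""
    worst = max(max_relative_drift(reference, s) for s in samples)
    return safety_factor * worst


def is_compatible(reference, result, rtol):
    """Regression check: accept `result` if it stays within `rtol` of the reference."""
    return max_relative_drift(reference, result) <= rtol


# Made-up numbers: a reference from the dedicated worker and two CI samples
reference = [1.0, 2.0, 4.0]
sample_a = [1.0, 2.0, 4.0]      # worker that reproduces the reference
sample_b = [1.001, 2.0, 3.998]  # worker that drifts slightly
rtol = select_tolerance(reference, [sample_a, sample_b])
```

With this choice both observed samples pass by construction, while a result that deviates by much more than the observed drift still fails the check.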

Sorry for the spam, whoever is still subscribed to this issue/PR, but making the CI do it for me in a loop was the lowest-effort way to fish for the errors 😅

Radonirinaunimi · 2026-06-04T20:05:29Z

> Regenerate the regression in our own hardware (if our github plan allows for it...)

For this, I think we can host our own runner on the Nikhef cluster (or somewhere else), as we did with pineko for the regression test.

scarlehoff · 2026-06-04T20:36:50Z

Ah, fantastic. We should be able to use the same one, I think. I was worried the custom runner was a paid feature; I didn't realize we were already using it.

Radonirinaunimi · 2026-06-08T21:19:57Z

Btw, I will be waiting for this before resuming #2478.

scarlehoff added the redo-regressions Recompute the regression data label Jun 3, 2026

scarlehoff force-pushed the regression_stability branch from b2fd1ad to b1fc97f Compare June 3, 2026 13:20

scarlehoff added redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data labels Jun 3, 2026

scarlehoff force-pushed the regression_stability branch from b1fc97f to e3ae1d8 Compare June 3, 2026 13:42

scarlehoff added redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data labels Jun 3, 2026

scarlehoff force-pushed the regression_stability branch from a78efb4 to 9979d66 Compare June 3, 2026 14:07

scarlehoff added redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data labels Jun 3, 2026

scarlehoff force-pushed the regression_stability branch from a727524 to 0b6df3c Compare June 3, 2026 18:59

scarlehoff added buildmaster redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data buildmaster labels Jun 3, 2026

scarlehoff force-pushed the regression_stability branch from ef46748 to bd1f82d Compare June 3, 2026 19:38

scarlehoff added redo-regressions Recompute the regression data devtools Build, automation and workflow and removed redo-regressions Recompute the regression data labels Jun 3, 2026

scarlehoff added 4 commits June 4, 2026 08:20

test for changed files on setupfit

ec8b332

bugfix for vp-setupfit where a second loader was being used; add a no-eko-download for debug purposes

c89ad76

add a starting point for hyperopt runcard

38a9871

automatically regenerate also the setupfit files

22cd99e

scarlehoff force-pushed the regression_stability branch from 882ce46 to 32b2e7a Compare June 4, 2026 07:04

scarlehoff added devtools Build, automation and workflow and removed devtools Build, automation and workflow labels Jun 4, 2026

scarlehoff and others added 2 commits June 4, 2026 12:05

add a min delta parameter to patience to stabilize a bit the regression

b3fd3aa

Automatically regenerated regressions from PR 2480, branch regression_stability.

3dd49bb
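To illustrate what a minimum delta on the patience (commit b3fd3aa above) can do, here is a generic early-stopping sketch. It is not the NNPDF implementation: the class `PatienceStopper`, the helper `stopping_epoch`, and the loss values are hypothetical. It also assumes that the relevant instability is tiny numerical differences in the loss changing the epoch at which training stops.

```python
class PatienceStopper:
    """Stop once the loss has not improved by more than `min_delta` for `patience` epochs."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, loss):
        """Record a new loss value; return True when training should stop."""
        if loss < self.best - self.min_delta:
            # Only improvements larger than min_delta reset the counter
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def stopping_epoch(losses, patience, min_delta):
    """Return the index of the epoch at which the stopper triggers, or None."""
    stopper = PatienceStopper(patience, min_delta)
    for epoch, loss in enumerate(losses):
        if stopper.update(loss):
            return epoch
    return None


# Two made-up loss histories that differ only by numerical noise at epoch 3
base = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4]
noisy = [1.0, 0.5, 0.4, 0.4 - 1e-9, 0.4, 0.4]
```

With `min_delta=0.0` the two histories stop at different epochs, because the noise counts as an improvement; with a small positive `min_delta` such as `1e-6` they stop at the same epoch.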

scarlehoff force-pushed the regression_stability branch from 22afeea to 3dd49bb Compare June 4, 2026 10:05

docs update; change the delta for hyperopt and polarized

e6451da

add a test for the thcovmat

f301c4d

scarlehoff added redo-regressions Recompute the regression data and removed devtools Build, automation and workflow redo-regressions Recompute the regression data labels Jun 4, 2026

Automatically regenerated regressions from PR 2480, branch regression_stability.

69ba511

scarlehoff added redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data labels Jun 4, 2026

Automatically regenerated regressions from PR 2480, branch regression_stability.

a8a44f9

scarlehoff added redo-regressions Recompute the regression data and removed redo-regressions Recompute the regression data labels Jun 4, 2026

scarlehoff marked this pull request as draft June 4, 2026 19:59

scarlehoff wants to merge 10 commits into master from regression_stability

2 participants
