Add apt retries in systemtest Dockerfiles #774

Closed
PranjalManhgaye wants to merge 2 commits into precice:develop from PranjalManhgaye:fix-768-systemtest-download-retries

@PranjalManhgaye
Contributor

Summary

This PR adds limited retries to the network-sensitive package and download commands in the systemtest Dockerfiles.

When we build the systemtest Docker images, we depend on external package mirrors and download servers. Sometimes these fail temporarily, even though nothing is wrong with our code or the tutorial setup. In those cases, a local run or CI job can fail just because a mirror was slow or unreachable.

This is especially relevant for steps such as installing packages, setting up OpenFOAM, and downloading CalculiX or SU2 sources. With a few bounded retries, we keep the normal behavior the same, but make the builds a bit more robust against temporary network issues.

Changes

This PR adds bounded retry behavior in both systemtest Dockerfiles:

  • apt-get update and apt-get install now use Acquire::Retries=3
  • wget downloads now use --tries=3 --waitretry=5
  • Scope is limited to:
    • tools/tests/dockerfiles/ubuntu_2204/Dockerfile
    • tools/tests/dockerfiles/ubuntu_2404/Dockerfile

This keeps the Dockerfile behavior the same in normal cases, but makes builds more robust against temporary network or mirror issues.

Fixes #768.

Validation

image

Both ran successfully.

Checklist:

  • I added a summary of any user-facing changes (compared to the last release) in the changelog-entries/<PRnumber>.md.
  • I will remember to squash-and-merge, providing a useful summary of the changes of this PR.

Make apt and wget commands in the systemtest Dockerfiles retry transient network failures so local and CI builds are less flaky.
Member

@MakisH MakisH left a comment

Thanks for the PR! Something in this direction could prove helpful, but I still have some doubts.

Comment thread tools/tests/dockerfiles/ubuntu_2204/Dockerfile Outdated
Comment on lines +23 to +24
RUN apt-get -o Acquire::Retries=3 -qq update && \
apt-get -o Acquire::Retries=3 -qq -y install \
Member

Reading https://manpages.debian.org/buster/apt/apt.conf.5.en.html#THE_ACQUIRE_GROUP I understand that there is no default value, so three retries here sounds valid.

Do we know if this option actually has an effect on the update? I am not sure if "acquire" here means both "download the metadata" and "download the packages".

Contributor Author

I looked at the apt.conf manpage. From what I understand, the "Acquire" group is not only about installing .deb packages, but about all files apt fetches through its acquire mechanism. The Retries option says that apt retries failed files when this value is non-zero. Since apt-get update also fetches files, like Release/InRelease and Packages index files, I think this should also apply there, so I kept it on both update and install.

Member

@PranjalManhgaye it looks like the Acquire::Retries=3 is also the default, since APT 2.3.2, which is included in both Ubuntu 22.04 and 24.04, which we currently use:

https://github.com/Debian/apt/blob/main/debian/changelog
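One way to check this inside a given image is to compare the installed APT version against 2.3.2. The sketch below is only an illustration and not part of this PR: `version_ge` is a hypothetical helper built on GNU `sort -V`, which approximates Debian version ordering (`dpkg --compare-versions` is the more exact tool), and the printed messages are made up for the example.

```shell
#!/bin/sh
# Return success (0) if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed APT is at least 2.3.2, the release named
# in the comment above as making three retries the default.
if command -v dpkg-query >/dev/null 2>&1; then
    apt_version="$(dpkg-query -W -f='${Version}' apt 2>/dev/null || true)"
    if [ -n "$apt_version" ] && version_ge "$apt_version" "2.3.2"; then
        echo "APT $apt_version: Acquire::Retries should already default to 3"
    else
        echo "APT ${apt_version:-unknown}: consider setting Acquire::Retries explicitly"
    fi
fi
```

On systems without `dpkg-query` the check is skipped and only the helper is defined.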

Member

Besides that, the easier implementation would be to add a system-wide rule to retry: https://askubuntu.com/a/1107071/142834
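A system-wide rule of this kind would typically be a one-line fragment in APT's configuration directory. The following is a minimal sketch under assumptions: the helper name `write_apt_retries_conf` and the file name `80-retries` are hypothetical, and the demonstration writes to a temporary directory instead of `/etc/apt/apt.conf.d` so that it runs without root.

```shell
#!/bin/sh
# Write an APT configuration fragment that sets the retry count.
# $1: target directory (on a real system: /etc/apt/apt.conf.d)
# $2: number of retries (defaults to 3)
write_apt_retries_conf() {
    conf_dir="$1"
    retries="${2:-3}"
    # apt.conf syntax: a quoted value terminated by a semicolon.
    printf 'Acquire::Retries "%s";\n' "$retries" > "$conf_dir/80-retries"
}

# Demonstration against a temporary directory, so no root access is needed.
demo_dir="$(mktemp -d)"
write_apt_retries_conf "$demo_dir" 3
cat "$demo_dir/80-retries"   # prints: Acquire::Retries "3";
```

In a Dockerfile, the same line could be written once in an early `RUN` step, so that later `apt-get` calls would not need the `-o Acquire::Retries=3` option repeated.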

Contributor Author

I think it may be better to close this PR. If Acquire::Retries=3 is already the default in the APT versions used by both Ubuntu 22.04 and 24.04, then this PR no longer changes the behavior after removing the wget options. I will come up with a different approach to address this issue.

Remove the explicit wget retry options because wget already has more permissive defaults, while keeping apt Acquire::Retries for package fetches.
PranjalManhgaye changed the title from "Add retries for systemtest Docker package downloads" to "Add apt retries in systemtest Dockerfiles" on May 12, 2026
@PranjalManhgaye
Contributor Author

Thanks. Looking at it again, this PR now only covers the apt-get part of #768 by adding explicit retries for apt fetches. For the rest of the issue, wget already seems to have retry defaults, and the bigger remaining problem may be things like git clone/git fetch, especially for external dependencies such as DUNE/GitLab. Do you think it is fine to keep this PR as a small apt-only improvement, or would you prefer a broader follow-up for the non-apt downloads?
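For commands such as git clone that, as far as I know, have no retry option of their own, a broader follow-up could wrap them in a small bounded-retry helper. The sketch below is one possible shape, not something taken from the Dockerfiles: the `retry` function, its argument order, and the repository URL in the usage comment are all hypothetical.

```shell
#!/bin/sh
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success, or the last exit status after the
# final failed attempt.
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while true; do
        "$@" && return 0
        status=$?
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry: giving up after $attempt attempts: $*" >&2
            return "$status"
        fi
        echo "retry: attempt $attempt failed (exit $status), retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Hypothetical usage inside a Dockerfile RUN step (placeholder URL):
#   retry 3 10 git clone https://example.org/hypothetical/repo.git
```

The attempt count stays bounded, so a persistent outage still fails the build instead of hanging it.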

@MakisH
Member

MakisH commented May 12, 2026

@PranjalManhgaye let's make this even smaller for now: I would focus on OpenFOAM and add a large --waitretry (e.g., 2 min). If even that fails, then there is probably a more significant issue with the package repository.

@PranjalManhgaye
Contributor Author

@MakisH I will open a new, small, focused PR that only changes the OpenFOAM repository setup download in the systemtest Dockerfiles by adding a larger wget --waitretry=120.
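To get a feeling for what a larger --waitretry changes, the wget manual describes the option as a cap on a linear backoff (1 s after the first failure, 2 s after the second, and so on) and documents defaults of 20 tries and a 10 s cap. The sketch below is a rough model of that description, not a measurement: `wget_max_wait` is a hypothetical helper, and it assumes one wait after each failed attempt except the last.

```shell
#!/bin/sh
# Estimate the worst-case total time (seconds) wget spends waiting
# between retries, assuming linear backoff 1s, 2s, 3s, ... capped at
# the --waitretry value, with no wait after the final attempt.
# $1: --tries value, $2: --waitretry value in seconds
wget_max_wait() {
    tries="$1"
    cap="$2"
    total=0
    i=1
    while [ "$i" -lt "$tries" ]; do
        if [ "$i" -lt "$cap" ]; then wait_s="$i"; else wait_s="$cap"; fi
        total=$((total + wait_s))
        i=$((i + 1))
    done
    echo "$total"
}

wget_max_wait 3 5      # values first proposed in this PR
wget_max_wait 20 10    # wget's documented defaults
wget_max_wait 20 120   # default tries with the larger cap
```

Under this model the three calls print 3, 145, and 190. If the manual's description holds, the 120 s cap would not be reached with the default 20 tries, so the follow-up PR may want to check whether --waitretry alone gives the intended two-minute pause. The manual also notes that fatal errors such as "connection refused" or 404 are not retried by default.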

Development

Successfully merging this pull request may close these issues.

Retries in retrieving artifacts
