CA-311475: do not change a domain's memory allocation while it is being… #6890

Open
edwintorok wants to merge 1 commit into xapi-project:master from edwintorok:private/edvint/squeeze-me-not

@edwintorok
Member

… built

When a domain takes a long time to build (e.g. >1 TiB), squeezed might run in the meantime and attempt to change maxmem, causing the domain build to fail.

2026-02-04T13:59:12.915844+00:00 orca squeezed: [debug||9 ||squeeze_xen] Xenctrl.domain_setmaxmem domid=717 max=6370254848 (was=0)
2026-02-04T13:59:22.878301+00:00 orca squeezed: [debug||3 ||squeeze_xen] Xenctrl.domain_setmaxmem domid=717 max=2075287552 (was=6370254848)

Squeezed shouldn't change the maxmem setting on domains that have never been run (other than to initialize it when it is 0). Another module in squeezed used to detect whether a domain had ever been run; that check was later replaced with one for an active balloon driver, since changing maxmem before the domain has reported a balloon driver is still not very safe.

But that check missed one place that was still setting maxmem, ignoring the balloon driver's presence. Fix this (hopefully last!) place: if there is no balloon driver and we attempt to decrease maxmem then just log a message instead.

Fixes: 9819bdb ("CA-32810: prevent the memory ballooning daemon capping a domain's memory usage before it has written feature-balloon.")
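
The guard described above can be sketched roughly as follows. This is an illustrative Python model only, not the squeezed code (which is OCaml): `Domain`, `request_maxmem` and the logger name are hypothetical, and the sketch assumes that only decreases are blocked, with a current value of 0 treated as "not yet initialized".

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("squeeze_sketch")  # hypothetical logger name


@dataclass
class Domain:
    """Hypothetical stand-in for the per-domain state squeezed tracks."""
    domid: int
    maxmem: int               # current limit; 0 means not yet initialized
    has_balloon_driver: bool  # True once the guest has reported a balloon driver


def request_maxmem(dom: Domain, new_max: int) -> bool:
    """Apply new_max unless it would lower maxmem on a domain without a balloon driver.

    Returns True if the value was applied, False if the request was only logged.
    """
    decreasing = dom.maxmem != 0 and new_max < dom.maxmem
    if decreasing and not dom.has_balloon_driver:
        # Assumption from the description: log the attempt instead of applying it.
        log.debug(
            "domid=%d has no balloon driver; not lowering maxmem from %d to %d",
            dom.domid, dom.maxmem, new_max,
        )
        return False
    dom.maxmem = new_max
    return True
```

With the values from the log, a domain that is still being built (no balloon driver) accepts the initial `max=6370254848` because its current value is 0, and the later request for `max=2075287552` is logged and skipped.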

@edwintorok
Member Author

edwintorok commented Feb 4, 2026

Draft. I tested this on the same machine again, and squeezed no longer changes the VM's maxmem while it is being built.
(The operation still fails, so there are multiple bugs here; this change is required but not sufficient to fix it.)

edwintorok force-pushed the private/edvint/squeeze-me-not branch from 35ddca5 to c05f997 (February 4, 2026 17:11)

Signed-off-by: Edwin Török <edwin.torok@citrix.com>
edwintorok force-pushed the private/edvint/squeeze-me-not branch from c05f997 to 0c3aaca (February 4, 2026 18:08)
edwintorok changed the title from "CA-NNNN: do not change a domain's memory allocation while it is being…" to "CA-311465: do not change a domain's memory allocation while it is being…" (Feb 5, 2026)
edwintorok changed the title from "CA-311465: do not change a domain's memory allocation while it is being…" to "CA-311475: do not change a domain's memory allocation while it is being…" (Feb 5, 2026)
@edwintorok
Member Author

Actually the underlying bug here is a 64-bit -> 32-bit truncation bug in Xenctrl, but anyway we should still fix squeezed.
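
As a quick arithmetic check, the two `max=` values in the log quoted in the description differ by exactly 2^32, which is what dropping everything above the low 32 bits would produce. The Python snippet below only demonstrates that consistency; `truncate_to_u32` is a hypothetical helper, and nothing here identifies where in Xenctrl the truncation happens.

```python
# Values copied from the squeezed debug log quoted in the description.
first_max = 6370254848   # max= in the first Xenctrl.domain_setmaxmem call
second_max = 2075287552  # max= in the second call, about 10 seconds later


def truncate_to_u32(value: int) -> int:
    """Keep only the low 32 bits, as a conversion to an unsigned 32-bit integer would."""
    return value & 0xFFFFFFFF


# The logged values differ by exactly 2**32 ...
assert first_max - second_max == 2**32
# ... so truncating the first value to 32 bits yields the second.
assert truncate_to_u32(first_max) == second_max
```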

edwintorok marked this pull request as ready for review (February 5, 2026 14:01)
@edwintorok
Member Author

edwintorok commented Feb 5, 2026

This passed the DMC tests in the BST (there is one failed test in the BST related to a reverted blktap bug, unrelated to this change).

edwintorok enabled auto-merge (February 5, 2026 14:01)