hypervisor: refactor VMM overhead into shared library and use in eve-k by zedi-pramodh · Pull Request #5681 · lf-edge/eve

zedi-pramodh · 2026-03-16T22:30:51Z

This commit adds VMM Overhead support to eve-k

Extract VMM overhead calculation functions from kvm.go into a new shared file vmm_overhead.go (no build tag) so they are available to all hypervisor backends:

vmmOverhead: top-level function respecting global config, per-app override, and automatic estimation priority
estimatedVMMOverhead: combines all overhead components
ramVMMOverhead: 2.5% of domain RAM for page table overhead
qemuVMMOverhead: 20 MB for QEMU binaries/libraries
mmioVMMOverhead: 1% of total MMIO size for GPU/VGA passthrough devices
cpuVMMOverhead: 3 MB per vCPU
undefinedVMMOverhead: 350 MB base overhead for QEMU internal use

kvm.go: remove the above functions (now in vmm_overhead.go); CountMemOverhead wrapper is unchanged.

kubevirt.go: add CountMemOverhead method on kubevirtContext that delegates to the shared vmmOverhead function, replacing the inherited zero-overhead ctrdContext implementation. Add uuid import.

How to test and validate this PR

Create a VM with VMM overhead value
Verify that value is calculated and published in zedmanager AppInstanceStatus
VM still starts with actual RAM value provided, verified that getRemainingMemory(ctx) uses this overhead in remaining memory calculation

Changelog notes

Users can now use VMM overhead value in AppInstance config for eve-k

Checklist

I've provided a proper description
I've added the proper documentation
I've tested my PR on amd64 device
I've written the test verification instructions
I've set the proper labels to this PR

And the last but not least:

I've checked the boxes above, or I've provided a good reason why I didn't
check them.

Please, check the boxes above after submitting the PR in interactive mode.

zedi-pramodh · 2026-03-17T00:21:09Z

Verified domain metrics to see overhead got added..
[kube] root@c2c8906f-4230-4fd6-99c7-4456fcebc6d4:/run/domainmgr/DomainMetric$ cat 5454a0a1-7da4-4278-bb5a-a6c91c09fbb3.json | jq
{
"UUIDandVersion": {
"UUID": "5454a0a1-7da4-4278-bb5a-a6c91c09fbb3",
"Version": "1"
},
"CPUTotalNs": 0,
"CPUScaled": 2,
"AllocatedMB": 2648, --> 600MB overhead
"UsedMemory": 305,
"MaxUsedMemory": 305,
"AvailableMemory": 1598,
"UsedMemoryPercent": 0.11518126888217523,
"LastHeard": "2026-03-17T00:17:13.383722193Z",
"Activated": true,
"NodeName": ""
}

codecov · 2026-03-17T01:12:39Z

Codecov Report

❌ Patch coverage is 32.53968% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 21.08%. Comparing base (2caf795) to head (58203bc).

Files with missing lines	Patch %	Lines
pkg/pillar/hypervisor/vmm_overhead.go	38.46%	43 Missing and 5 partials ⚠️
pkg/pillar/hypervisor/kubevirt.go	30.55%	24 Missing and 1 partial ⚠️
pkg/pillar/cmd/zedmanager/updatestatus.go	0.00%	12 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5681      +/-   ##
==========================================
+ Coverage   20.64%   21.08%   +0.44%     
==========================================
  Files         489      500      +11     
  Lines       90431    92160    +1729     
==========================================
+ Hits        18667    19432     +765     
- Misses      70187    70963     +776     
- Partials     1577     1765     +188

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

This PR refactors KVM VMM memory overhead estimation into a shared hypervisor helper and applies the same overhead calculation to the KubeVirt (“eve-k”) backend so that AppInstance memory accounting can include hypervisor overhead.

Changes:

Extracted VMM overhead calculation helpers from kvm.go into a new shared vmm_overhead.go.
Updated KubeVirt backend to report/track VMM memory overhead (including adding it into AllocatedMB metrics output).
Removed the duplicated overhead helpers from kvm.go (KVM continues to call the same vmmOverhead wrapper).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File	Description
pkg/pillar/hypervisor/vmm_overhead.go	New shared VMM overhead estimation implementation (RAM/QEMU/MMIO/CPU/base components + global/per-app override priority).
pkg/pillar/hypervisor/kvm.go	Removes overhead helper implementations now moved into the shared file.
pkg/pillar/hypervisor/kubevirt.go	Adds CountMemOverhead for kubevirt and propagates overhead into domain/pod metrics accounting via stored metadata.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

zedi-pramodh · 2026-03-23T21:14:25Z

+	overhead, err := vmmOverhead(domainName, config.UUIDandVersion.UUID,
+		int64(config.Memory), int64(config.VMMMaxMem),
+		int64(config.MaxCpus), int64(config.VCpus), config.IoAdapterList, aa, nil)
+	if err != nil {
+		logrus.Warnf("CreateReplicaVMIConfig: vmmOverhead failed for %s: %v, using 0", domainName, err)
+		overhead = 0
+	}


will fix this

zedi-pramodh · 2026-03-23T21:14:42Z

+		// Add VMM overhead to AllocatedMB so it reflects total host memory
+		// consumption (guest RAM + hypervisor overhead), consistent with KVM.
+		if vmis, ok := ctx.vmiList[n]; ok && vmis.memOverhead > 0 {
+			r.AllocatedMB += uint32(vmis.memOverhead / (1024 * 1024))


valid point, will fix this.

+	// overhead so it reflects total host memory consumption, consistent with KVM.
+	allocatedMB := uint32(memoryLimits.Value()) / BytesInMegabyte
+	if vmis != nil && vmis.memOverhead > 0 {
+		allocatedMB += uint32(vmis.memOverhead / uint64(BytesInMegabyte))


+}
+
+// memory allocated by QEMU for its own purposes.
+// statistical analysis did not revile any correlation between


eriknordmark

The copilot comments seems relevant. Please review them and see what might need to be changed.

zedi-pramodh · 2026-03-20T17:49:31Z

The copilot comments seems relevant. Please review them and see what might need to be changed.

Looks like Claude and copilot does not like each other :) let me review and get back. thanks

eriknordmark · 2026-03-25T22:45:44Z

@zedi-pramodh can you rebase this on master so we can get the eve-k build to work?

zedi-pramodh · 2026-03-26T04:29:31Z

Rebased and pushed the commits

rucoder

@zedi-pramodh squash both commits together. it doesn't make sens to have "fix review comments" on master

OhmSpectator · 2026-03-30T11:02:04Z

+		var err error
+		overhead, err = vmmOverhead(domainName, config.UUIDandVersion.UUID,
+			int64(config.Memory), int64(config.VMMMaxMem),
+			int64(config.MaxCpus), int64(config.VCpus), config.IoAdapterList, aa, ctx.globalConfig)


Here, you should check extra carefully that the adapters list is up to date: the adapters have already been reserved, and the domain knows what it owns. We had an issue with that in the past: when we called the function before it was resolved, it just silently returned zero.
You can check the fix here and decide if the code in the kubevirt version is already fine. https://github.com/lf-edge/eve/pull/5413/changes

But I blindly moved all your fixes to common file. So those are already considered.

OhmSpectator

I have some doubts about making this code generic. The functions we use to estimate vmmOverhead were tuned exactly for KVM + QEMU VMs. We did a set of experiments and code research to figure them out. Other setups (not pure KVM) most probably have other overhead. So, I would say we should not blindly copy the logic and expect it to work the same. Even if kubevirt or any other hyp also uses a similar setup, I guess they can have some other overheads that might be reflected in THEIR formula.

zedi-pramodh · 2026-03-30T17:27:57Z

I have some doubts about making this code generic. The functions we use to estimate vmmOverhead were tuned exactly for KVM + QEMU VMs. We did a set of experiments and code research to figure them out. Other setups (not pure KVM) most probably have other overhead. So, I would say we should not blindly copy the logic and expect it to work the same. Even if kubevirt or any other hyp also uses a similar setup, I guess they can have some other overheads that might be reflected in THEIR formula.

Good question. kubevirt is a Kubernetes API wrapper for kvm + qemu. So all the work done for eve-kvm is valid here too. Coming to kubevirt itself having the overhead, they handle that inside their implementation and user need not know about that AFAIU. https://kubevirt.io/user-guide/compute/virtual_hardware/#memory-overhead

zedi-pramodh · 2026-04-03T21:31:08Z

Merged the commits and rebased to master.

zedi-pramodh · 2026-04-21T21:23:19Z

Rebased to latest master

zedi-pramodh · 2026-04-28T17:58:57Z

Rebased to latest master.

rene · 2026-04-29T09:23:29Z

@zedi-pramodh I know you didn't touch this part, but it doesn't hurt to fix a codespell and get 100% Yetus passing. Appreciate if you could fix it.

ok will do that

rene

LGTM. @rucoder , pls, take another look and remove your request to changes if you are ok with the current changes.

zedi-pramodh · 2026-05-07T03:52:54Z

rebased to master again :)

zedi-pramodh · 2026-05-17T02:42:15Z

rebased to master again :)

…nto KubeVirt Extract VMM overhead calculation functions from kvm.go into a new shared file vmm_overhead.go (no build tag) so they are available to all hypervisor backends: - vmmOverhead: top-level function respecting global config, per-app override, and automatic estimation priority - estimatedVMMOverhead: combines all overhead components - ramVMMOverhead: 2.5% of domain RAM for page table overhead - qemuVMMOverhead: 20 MB for QEMU binaries/libraries - mmioVMMOverhead: 1% of total MMIO size for GPU/VGA passthrough devices - cpuVMMOverhead: 3 MB per vCPU - undefinedVMMOverhead: 350 MB base overhead for QEMU internal use kvm.go: remove the above functions (now in vmm_overhead.go); CountMemOverhead wrapper is unchanged. kubevirt.go: add CountMemOverhead method on kubevirtContext that delegates to the shared vmmOverhead function, replacing the inherited zero-overhead ctrdContext implementation. Add uuid import. include VMM overhead in DomainMetric.AllocatedMB Previously AllocatedMB in KubeVirt DomainMetric was set to the Kubernetes resource limit (guest RAM only), so reported metrics and device-level AllocatedAppsMB undercounted total host memory consumption by the full VMM overhead. Calculate and store the VMM overhead (via the shared vmmOverhead function) in vmiMetaData when the domain config is created — both for VMI replicasets (CreateReplicaVMIConfig) and pod replicasets (CreateReplicaPodConfig). In GetDomsCPUMem (VMI virt-handler metrics path), add the stored overhead to AllocatedMB in the post-processing loop after all metrics are filled. In getPodMetrics (pod replicaset path), add the stored overhead to AllocatedMB before constructing the DomainMetric. This makes KubeVirt consistent with KVM, where AllocatedMB is read from the cgroup HierarchicalMemoryLimit which naturally includes QEMU overhead. - updatestatus.go: skip CountMemOverhead call when VirtualizationMode is NOHYPER - kubevirt.go: skip vmmOverhead in CreateReplicaVMIConfig and CreateReplicaPodConfig when VirtualizationMode is NOHYPER Store globalConfig on kubevirtContext in Setup() so that CreateReplicaVMIConfig and CreateReplicaPodConfig can pass it to vmmOverhead instead of nil. This ensures the global VmmMemoryLimitInMiB override is respected when calculating VMM overhead during domain config creation, consistent with the CountMemOverhead path. Replace truncating integer division (memOverhead / (1024 * 1024)) with roundFromBytesToMbytes() when adding VMM overhead to AllocatedMB. This avoids under-reporting overhead by up to ~1 MB and is consistent with how other memory conversions are done in the codebase. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Pramodh Pallapothu <pramodh@zededa.com>

zedi-pramodh · 2026-05-26T02:35:28Z

rebased to master again !!! This has been sitting here for 2.5 months now :)

andrewd-zededa

The memory limit cast will cause issues with larger VMs

andrewd-zededa · 2026-05-26T21:11:58Z

 	}
+	// AllocatedMB is the Kubernetes resource limit (guest RAM only); add VMM
+	// overhead so it reflects total host memory consumption, consistent with KVM.
+	allocatedMB := uint32(memoryLimits.Value()) / BytesInMegabyte


The memory limits value is an int64, this division must be inside the cast, otherwise the uint32 cast will wrap for a VM using 4GB+

andrewd-zededa · 2026-05-26T21:17:04Z

+// CountMemOverhead returns the memory overhead estimation for a KubeVirt VM domain.
+func (ctx kubevirtContext) CountMemOverhead(domainName string, domainUUID uuid.UUID, domainRAMSize int64, vmmMaxMem int64,
+	domainMaxCpus int64, domainVCpus int64, domainIoAdapterList []types.IoAdapter, aa *types.AssignableAdapters,
+	globalConfig *types.ConfigItemValueMap) (uint64, error) {


globalConfig *types.ConfigItemValueMap is passed in as a func parameter but already available as ctx.globalConfig. Can this be consolidated?

zedi-pramodh requested review from rene, rouming, rucoder and shjala as code owners March 16, 2026 22:30

github-actions Bot requested review from OhmSpectator and eriknordmark March 16, 2026 22:31

rene requested a review from Copilot March 17, 2026 08:24

Copilot started reviewing on behalf of rene March 17, 2026 08:25 View session

Copilot AI reviewed Mar 17, 2026

View reviewed changes

zedi-pramodh force-pushed the vmm-overhead branch from 1103db0 to 438c972 Compare March 19, 2026 01:48

eriknordmark reviewed Mar 20, 2026

View reviewed changes

github-actions Bot requested a review from eriknordmark March 23, 2026 21:33

zedi-pramodh force-pushed the vmm-overhead branch from 075f601 to 45af414 Compare March 26, 2026 04:29

rucoder requested changes Mar 30, 2026

View reviewed changes

OhmSpectator reviewed Mar 30, 2026

View reviewed changes

zedi-pramodh force-pushed the vmm-overhead branch from 45af414 to 6d13aaa Compare April 3, 2026 21:30

github-actions Bot requested review from OhmSpectator and rucoder April 3, 2026 21:30

zedi-pramodh force-pushed the vmm-overhead branch from 6d13aaa to 55a4bb1 Compare April 21, 2026 21:23

zedi-pramodh force-pushed the vmm-overhead branch 2 times, most recently from dd84610 to 726553a Compare April 28, 2026 17:58

rene reviewed Apr 29, 2026

View reviewed changes

rene approved these changes Apr 29, 2026

View reviewed changes

zedi-pramodh force-pushed the vmm-overhead branch from 726553a to 25bf391 Compare May 7, 2026 03:52

github-actions Bot requested a review from rene May 7, 2026 03:52

rene approved these changes May 7, 2026

View reviewed changes

zedi-pramodh force-pushed the vmm-overhead branch from 25bf391 to 58203bc Compare May 17, 2026 02:41

github-actions Bot requested a review from rene May 17, 2026 02:41

zedi-pramodh force-pushed the vmm-overhead branch from 58203bc to e6f05a9 Compare May 26, 2026 02:34

andrewd-zededa suggested changes May 26, 2026

View reviewed changes

Conversation

zedi-pramodh commented Mar 16, 2026

How to test and validate this PR

Changelog notes

Checklist

Uh oh!

zedi-pramodh commented Mar 17, 2026

Uh oh!

codecov Bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eriknordmark left a comment

Choose a reason for hiding this comment

Uh oh!

zedi-pramodh commented Mar 20, 2026

Uh oh!

eriknordmark commented Mar 25, 2026

Uh oh!

zedi-pramodh commented Mar 26, 2026

Uh oh!

rucoder left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

OhmSpectator left a comment

Choose a reason for hiding this comment

Uh oh!

zedi-pramodh commented Mar 30, 2026

Uh oh!

zedi-pramodh commented Apr 3, 2026

Uh oh!

zedi-pramodh commented Apr 21, 2026

Uh oh!

zedi-pramodh commented Apr 28, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rene left a comment

Choose a reason for hiding this comment

Uh oh!

zedi-pramodh commented May 7, 2026

Uh oh!

zedi-pramodh commented May 17, 2026

Uh oh!

zedi-pramodh commented May 26, 2026

Uh oh!

andrewd-zededa left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

codecov Bot commented Mar 17, 2026 •

edited

Loading