Bug: systemd-networkd races CNI on AL2023, causing missing routing rules & packet drops on secondary ENIs #3524

@mariot8

Description

What happened:

On Amazon Linux 2023 EKS nodes, systemd-networkd sometimes configures secondary ENIs before the VPC CNI is initialized, resulting in missing policy routing rules for pod IPs. This leads to intermittent network failures, stale routes, and martian source kernel messages.

The behavior is inconsistent across node groups: some have secondary ENIs marked as Unmanaged by systemd-networkd, as expected, while others end up with systemd-networkd fully managing those ENIs, causing duplicate or missing routes.

# networkctl status ens5
● 2: ens5
                     Link File: /usr/lib/systemd/network/99-default.link
                  Network File: /run/systemd/network/70-eks-ens5.network
                         State: routable (configured)
                  Online state: online
                          Type: ether
                          Path: pci-0000:00:05.0
                        Driver: ena
                        Vendor: Amazon.com, Inc.
                         Model: Elastic Network Adapter (ENA)
             Alternative Names: enp0s5
              Hardware Address: 0a:ff:e1:47:45:f3
                           MTU: 9001 (min: 128, max: 9216)
                         QDisc: mq
  IPv6 Address Generation Mode: eui64
      Number of Queues (Tx/Rx): 2/2
                       Address: 10.101.41.199 (DHCP4 via 10.101.41.1)
                                fe80::8ff:e1ff:fe47:45f3
                       Gateway: 10.101.41.1
                           DNS: 10.101.0.2
                Search Domains: ec2.internal
             Activation Policy: up
           Required For Online: yes
               DHCP4 Client ID: IAID:0xed10bdb8/DUID
             DHCP6 Client IAID: 0xed10bdb8
             DHCP6 Client DUID: DUID-EN/Vendor:0000ab11cc500c456718037f
# networkctl status ens6
● 7: ens6
                     Link File: /usr/lib/systemd/network/99-default.link
                  Network File: /run/systemd/network/70-eks-ens6.network
                         State: routable (configured)
                  Online state: online
                          Type: ether
                          Path: pci-0000:00:06.0
                        Driver: ena
                        Vendor: Amazon.com, Inc.
                         Model: Elastic Network Adapter (ENA)
             Alternative Names: enp0s6
              Hardware Address: 0a:ff:c9:51:6a:c7
                           MTU: 9001 (min: 128, max: 9216)
                         QDisc: mq
  IPv6 Address Generation Mode: eui64
      Number of Queues (Tx/Rx): 2/2
                       Address: 10.101.41.164 (DHCP4 via 10.101.41.1)
                                fe80::8ff:c9ff:fe51:6ac7
                       Gateway: 10.101.41.1
                           DNS: 10.101.0.2
                Search Domains: ec2.internal
             Activation Policy: up
           Required For Online: yes
               DHCP4 Client ID: IAID:0x6618dd42/DUID
             DHCP6 Client IAID: 0x6618dd42
             DHCP6 Client DUID: DUID-EN/Vendor:0000ab11cc500c456718037f

IPv4: martian source 10.101.40.133 from 10.101.40.1, on dev ens6

$ ip route
default via 10.101.40.1 dev ens5 proto dhcp src 10.101.40.117 metric 512
default via 10.101.40.1 dev ens6 proto dhcp src 10.101.40.196 metric 513
default via 10.101.40.1 dev ens7 proto dhcp src 10.101.40.20 metric 514

_Note: IP addresses differ because the outputs above were collected from different failing nodes._
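A quick way to tell which state a given node landed in is to compare the SETUP state that networkd reports for each ENI. This is a sketch, assuming the column layout of `networkctl list --no-legend` (IDX LINK TYPE OPERATIONAL SETUP) and the `ens*` naming seen above:

```shell
#!/bin/sh
# Triage sketch: print each ENI with networkd's SETUP state.
# On a healthy node, secondary ENIs (ens6 and above) should report
# "unmanaged"; "configured" means systemd-networkd took them over.
list_eni_setup() {
  # $1: output of `networkctl list --no-legend`
  # (columns: IDX LINK TYPE OPERATIONAL SETUP)
  printf '%s\n' "$1" | awk '$2 ~ /^ens/ {print $2, $5}'
}

list_eni_setup "$(networkctl list --no-legend 2>/dev/null)"
```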

What you expected to happen:
Secondary ENIs (ens6 and above) should remain unmanaged by systemd-networkd so that the AWS VPC CNI can fully configure policy routing, IP rules, and ENI-specific routing tables.

Actual Behavior:

Occasionally, systemd-networkd takes over configuration of secondary ENIs during boot, preventing the AWS VPC CNI from setting correct routing rules. This results in pod connectivity failures, missing route tables, and dropped packets.

How to reproduce it (as minimally and precisely as possible):

This issue is difficult to reproduce reliably because it appears to be a boot-time race condition between:
• systemd-networkd, which configures interfaces as soon as they appear
• AWS VPC CNI, which expects to configure secondary ENIs before systemd touches them

However, the problem can be reproduced more consistently by intentionally slowing down early boot steps so the secondary ENIs attach before kubelet and aws-node are fully initialized.

Suggested reproduction strategy:
1. Create an AL2023 EKS node group on EKS 1.34
2. Configure your cluster or nodegroup to ensure secondary ENIs will be attached:
• set WARM_ENI_TARGET >= 1
• or set high WARM_IP_TARGET
3. Delay node initialization so ENIs attach before CNI is running.
For example:
• inject an artificial delay in the pre user data, before nodeadm init runs (e.g., sleep 300)
• add a slow external proxy or metadata throttling
• use a large cloud-init payload
4. Check network state after each boot:
• networkctl status
• ip rule
• ip route show table 10001
• dmesg | grep -i martian

Eventually, a node will come up with missing policy routing rules and pod connectivity failures.
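The per-boot checks above can be scripted. This is a sketch that flags the broken state, assuming (as in this report) that the missing CNI rules would reference route table 10001:

```shell
#!/bin/sh
# Sketch: flag a node whose CNI policy routing rules are missing.
# Assumes the VPC CNI installs "from <pod-ip> lookup 10001" rules for
# the first secondary ENI, as observed in this report.
has_cni_rules() {
  # $1: output of `ip rule`
  printf '%s\n' "$1" | grep -q 'lookup 10001'
}

if has_cni_rules "$(ip rule 2>/dev/null)"; then
  echo "OK: CNI policy routing rules present"
else
  echo "BROKEN: no 'lookup 10001' rules; node may have hit the race"
fi
```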

Expected result in broken state:
• systemd-networkd configures the secondary ENI
• CNI skips configuring the ENI because systemd marked it as “managed”
• The routing table for the ENI is missing (no from <POD_IP> lookup 10001)
• Duplicate DHCP-added default routes appear
• Kernel logs martian source messages
• Pods scheduled to that ENI lose connectivity

Anything else we need to know?:

Impact
• Pods on affected nodes cannot reach services or other pods
• DNS resolves but traffic drops on reply path (asymmetric routing)
• Kernel drops packets (rp_filter)
• EKS upgrades are blocked due to instability
• Production workloads experience intermittent failures
• AL2023 migration becomes unreliable without manual overrides

Hotfix
A reliable short-term mitigation is to prevent systemd-networkd from managing secondary ENIs entirely, ensuring that the AWS VPC CNI has exclusive control over routing and interface configuration.

We confirmed that placing the following override file on AL2023 nodes resolves the issue:

--BOUNDARY
Content-Type: text/cloud-config; charset="us-ascii"

#cloud-config
write_files:
  - path: /etc/systemd/network/10-vpc-cni-secondary.network
    owner: root:root
    permissions: '0644'
    content: |
      [Match]
      Name=ens[6-9]* ens[1-9][0-9]*

      [Link]
      Unmanaged=yes

runcmd:
  - [systemctl, daemon-reload]
  - [systemctl, restart, systemd-networkd]
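As a sanity check on the `[Match]` section: systemd's `Name=` option uses fnmatch-style patterns, which shell `case` patterns mirror, so the coverage of the two globs above can be sketched as:

```shell
#!/bin/sh
# Sketch: which interface names the [Match] Name= globs above cover.
# Shell case patterns use the same fnmatch-style globbing as systemd.
matches_secondary() {
  case "$1" in
    ens[6-9]*|ens[1-9][0-9]*) return 0 ;;  # secondary ENIs: unmanaged
    *) return 1 ;;                         # primary (e.g. ens5): managed
  esac
}

for ifname in ens5 ens6 ens7 ens10 ens123; do
  if matches_secondary "$ifname"; then
    echo "$ifname: unmanaged by networkd"
  else
    echo "$ifname: left managed"
  fi
done
```

The primary interface (ens5 on the nodes above) intentionally falls through and stays networkd-managed, so host DHCP on the primary ENI is unaffected.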

This seems related to the discussion in:
awslabs/amazon-eks-ami#1738
Environment:

  • Kubernetes version: Server Version: v1.34.1-eks-3cfe0ce
  • CNI Version: v1.20.2-eksbuild.1
  • OS (e.g: cat /etc/os-release):
NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023.9.20251105"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2023"
HOME_URL="https://aws.amazon.com/linux/amazon-linux-2023/"
DOCUMENTATION_URL="https://docs.aws.amazon.com/linux/"
SUPPORT_URL="https://aws.amazon.com/premiumsupport/"
BUG_REPORT_URL="https://github.com/amazonlinux/amazon-linux-2023"
VENDOR_NAME="AWS"
VENDOR_URL="https://aws.amazon.com/"
SUPPORT_END="2029-06-30"
  • Kernel: Linux 6.12.53-69.119.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 21 22:19:00 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
