
Debugging

Thomas Mangin edited this page Nov 13, 2025 · 4 revisions

Debugging ExaBGP

Troubleshooting guide for common ExaBGP issues

πŸ” Most issues are configuration or API process problems - not ExaBGP bugs




Quick Diagnosis

Start here for fast troubleshooting:

1. Is ExaBGP Running?

# Check process
ps aux | grep exabgp

# Check with pgrep
pgrep -f exabgp

# Check via systemd
systemctl status exabgp

2. Is BGP Session Established?

# In ExaBGP logs, look for:
grep "neighbor.*up" /var/log/exabgp.log

# On router (Cisco):
show bgp summary | grep 192.168.1.2

# On router (Juniper):
show bgp summary | match 192.168.1.2

3. Are Routes Being Announced?

# Check ExaBGP process output
tail -f /var/log/exabgp.log | grep announce

# Check router (Cisco):
show ip bgp neighbors 192.168.1.2 received-routes

# Check router (Juniper):
show route receive-protocol bgp 192.168.1.2

4. Is API Process Running?

# Check if health check / API script is running
ps aux | grep healthcheck.py

# Check for errors in stderr
tail -f /var/log/exabgp.log | grep ERROR

Common Issues

Issue 1: ExaBGP Won't Start

Symptoms: ExaBGP exits immediately after starting

Check logs:

exabgp -d /etc/exabgp/exabgp.conf 2>&1 | tee /tmp/exabgp-debug.log

Common causes:

A. Configuration Syntax Error

Error message:

configuration issue: syntax error

Fix:

# Test configuration
exabgp --test /etc/exabgp/exabgp.conf

# Check for common mistakes:
# - Missing semicolons
# - Unbalanced braces
# - Typos in directives

Example error:

# ❌ WRONG (missing semicolon)
neighbor 192.168.1.1 {
    router-id 192.168.1.2
    local-as 65001
}

# ✅ CORRECT
neighbor 192.168.1.1 {
    router-id 192.168.1.2;
    local-as 65001;
}
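As a rough pre-check before running `exabgp --test`, the missing-semicolon case above can be spotted with a few lines of Python. This is only a heuristic sketch: the function name `find_missing_semicolons` is hypothetical, it treats every `#` as the start of a comment (so a password containing `#` would confuse it), and it may flag statements that legitimately span several lines. The ExaBGP parser remains the authority.

```python
def find_missing_semicolons(config_text):
    """Return (line_number, line) pairs that look like unterminated statements."""
    suspects = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Block openers/closers and terminated statements are fine.
        if line.endswith(("{", "}", ";")):
            continue
        suspects.append((number, line))
    return suspects


sample = """neighbor 192.168.1.1 {
    router-id 192.168.1.2
    local-as 65001;
}"""
for number, line in find_missing_semicolons(sample):
    print(f"line {number}: possibly missing ';' -> {line}")
```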

B. API Process Not Found

Error message:

process healthcheck run /etc/exabgp/healthcheck.py - [Errno 2] No such file or directory

Fix:

# Verify script exists
ls -l /etc/exabgp/healthcheck.py

# Make executable
chmod +x /etc/exabgp/healthcheck.py

# Test manually
/etc/exabgp/healthcheck.py
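The same three checks (file exists, is executable, can be launched directly) can be scripted. The sketch below assumes a Unix-like system; `check_api_script` is a hypothetical helper name, and the shebang test only matters when the script is run by path rather than through an explicit interpreter.

```python
import os


def check_api_script(path):
    """Return a list of human-readable problems; an empty list means it looks fine."""
    problems = []
    if not os.path.isfile(path):
        problems.append(f"{path}: no such file")
        return problems
    if not os.access(path, os.X_OK):
        problems.append(f"{path}: not executable (try chmod +x)")
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append(f"{path}: no shebang line, so it cannot be executed directly")
    return problems
```

Calling `check_api_script("/etc/exabgp/healthcheck.py")` and printing each returned string gives a quick report before restarting ExaBGP.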

C. Python Version Mismatch

Error message:

python3: No module named exabgp

Fix:

# Check Python version
python3 --version

# ExaBGP requires Python 3.8.1+
# Reinstall if needed
pip3 install --upgrade exabgp

# Verify installation
python3 -m exabgp version

Issue 2: BGP Session Not Establishing

Symptoms: BGP session stuck in "Connect" or "Active" state

Debug:

# Run ExaBGP in debug mode
exabgp -d /etc/exabgp/exabgp.conf 2>&1 | grep -i "neighbor\|tcp"

Common causes:

A. TCP Connection Failure

Error message:

Connection refused

Check:

# Test TCP connection to router
telnet 192.168.1.1 179

# Check if router is listening
# On router (Cisco):
show tcp brief | include 179

# Verify firewall allows BGP
iptables -L -n | grep 179
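When `telnet` is not installed, the TCP test can be done from Python's standard library. This is a minimal sketch; `bgp_port_reachable` is a hypothetical name, and the result only says whether a TCP handshake to port 179 succeeds, not whether BGP itself will come up.

```python
import socket


def bgp_port_reachable(host, port=179, timeout=3.0):
    """Return (True, "connected") or (False, reason) for a TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except ConnectionRefusedError:
        # Peer answered with a reset: nothing listening, or neighbor not configured.
        return False, "refused"
    except socket.timeout:
        # No answer at all: often a firewall or routing problem.
        return False, "timeout"
    except OSError as exc:
        return False, f"error: {exc}"
```

A "refused" result usually points at the router side, while "timeout" points at something in between dropping packets.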

Fix:

# On router, ensure BGP neighbor configured
router bgp 65000
 neighbor 192.168.1.2 remote-as 65001

B. Authentication Failure

Error message:

NOTIFICATION sent to peer 192.168.1.1 code 2 (OPEN Message Error)

Check:

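# Verify MD5 password matches
# Note: with TCP MD5 a wrong password often produces no NOTIFICATION at all;
# mismatched segments are dropped silently and the connection just times out
grep md5-password /etc/exabgp/exabgp.conf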

Fix:

# ExaBGP config
neighbor 192.168.1.1 {
    md5-password "secret123";  # Must match router
}
# Router config (must match!)
router bgp 65000
 neighbor 192.168.1.2 password secret123

C. ASN Mismatch

Error message:

NOTIFICATION sent code 2 subcode 2 (Bad Peer AS)

Fix:

# Verify ASNs match
# ExaBGP:
local-as 65001;
peer-as 65000;

# Router:
# router bgp 65000
#  neighbor 192.168.1.2 remote-as 65001
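The code and subcode in a NOTIFICATION line can be looked up rather than memorised. The tables below follow RFC 4271 (plus RFC 5492 for subcode 7); the function name `describe_notification` is hypothetical, and only the OPEN subcodes are listed to keep the sketch short.

```python
# Top-level NOTIFICATION error codes (RFC 4271, section 4.5)
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Subcodes for error code 2 (subcode 5 is deprecated and omitted)
OPEN_SUBCODES = {
    1: "Unsupported Version Number",
    2: "Bad Peer AS",
    3: "Bad BGP Identifier",
    4: "Unsupported Optional Parameter",
    6: "Unacceptable Hold Time",
    7: "Unsupported Capability",
}


def describe_notification(code, subcode):
    """Translate a NOTIFICATION code/subcode pair into readable text."""
    name = ERROR_CODES.get(code, f"Unknown code {code}")
    if code == 2:
        detail = OPEN_SUBCODES.get(subcode, f"subcode {subcode}")
        return f"{name} / {detail}"
    return f"{name} / subcode {subcode}"


print(describe_notification(2, 2))
```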

Issue 3: Routes Not Being Announced

Symptoms: BGP session up, but routes not on router

Debug:

# Watch API process output
tail -f /var/log/exabgp.log | grep -i "announce\|withdraw"

Common causes:

A. API Process Not Sending Commands

Check:

# Is API process running?
ps aux | grep healthcheck.py

# Test API process manually
/etc/exabgp/healthcheck.py

# Should see output like:
# announce route 100.10.0.100/32 next-hop self

Fix:

# Common mistake: forgetting to flush stdout
import sys

sys.stdout.write("announce route 100.10.0.100/32 next-hop self\n")
sys.stdout.flush()  # ← CRITICAL! Otherwise the command can sit in Python's buffer
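One way to make the newline and flush hard to forget is to route every command through a single helper. This is a sketch, and `send_command` is a hypothetical name rather than part of ExaBGP.

```python
import sys


def send_command(command, stream=None):
    """Write one API command, newline-terminated, and flush immediately."""
    stream = sys.stdout if stream is None else stream
    stream.write(command.rstrip("\n") + "\n")
    stream.flush()
```

With this in place, `send_command("announce route 100.10.0.100/32 next-hop self")` replaces the write/flush pair. `print(command, flush=True)` has the same effect if you prefer no helper.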

B. Address Family Not Enabled

Error: No error, routes just not visible

Check:

# Verify address family in config
grep "family" /etc/exabgp/exabgp.conf

Fix:

neighbor 192.168.1.1 {
    family {
        ipv4 unicast;  # Must be enabled!
    }
}
# Router must also enable address family
router bgp 65000
 neighbor 192.168.1.2 remote-as 65001
 !
 address-family ipv4 unicast
  neighbor 192.168.1.2 activate
 !

C. Routes Filtered by Router Policy

Check:

# Cisco - check for route-map
show bgp neighbors 192.168.1.2 | include route-map

# Check if routes rejected
show ip bgp neighbors 192.168.1.2 received-routes

Fix:

# Remove or adjust route-map
router bgp 65000
 neighbor 192.168.1.2 route-map ACCEPT in

route-map ACCEPT permit 10

Issue 4: Routes Announced but Not Installed

Symptoms: Routes visible in BGP table but not in routing table

Check:

# Cisco
show ip bgp 100.10.0.100  # Shows in BGP table?
show ip route 100.10.0.100  # Shows in routing table?

Common causes:

A. Invalid Next-Hop

Error: Route in BGP table but marked invalid

Fix:

# ✅ Use "next-hop self"
announce route 100.10.0.100/32 next-hop self

# Or explicit reachable next-hop
announce route 100.10.0.100/32 next-hop 192.168.1.2
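If you set an explicit next-hop, a quick sanity check is whether it sits on the subnet shared with the router. The sketch below assumes that the peering subnet is 192.168.1.0/24, which the examples on this page suggest but do not state; `next_hop_on_link` is a hypothetical helper. The router makes the real reachability decision, so treat this only as a first filter.

```python
import ipaddress


def next_hop_on_link(next_hop, peering_subnet):
    """True if next_hop is an address inside peering_subnet."""
    network = ipaddress.ip_network(peering_subnet, strict=False)
    return ipaddress.ip_address(next_hop) in network


print(next_hop_on_link("192.168.1.2", "192.168.1.0/24"))
print(next_hop_on_link("10.0.0.1", "192.168.1.0/24"))
```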

B. Better Path Exists

Router prefers different route (lower MED, shorter AS-PATH, etc.)

Check:

show ip bgp 100.10.0.100
# Look for "best" marker

Fix:

# Adjust BGP attributes to make route preferred
# (local-preference is only carried on iBGP sessions; on eBGP, adjust MED or AS-path instead)
announce route 100.10.0.100/32 next-hop self local-preference 200

Issue 5: API Process Crashes

Symptoms: ExaBGP runs but API process keeps exiting

Check logs:

tail -f /var/log/exabgp.log | grep -i "process.*exit\|error"

Common causes:

A. Python Exception

Error:

Process healthcheck exited with code 1

Debug:

# Run API process manually
python3 /etc/exabgp/healthcheck.py

# Add error handling
import sys
import traceback

try:
    # Your code
    pass
except Exception as e:
    sys.stderr.write(f"ERROR: {e}\n")
    traceback.print_exc(file=sys.stderr)

B. Missing Python Modules

Error:

ModuleNotFoundError: No module named 'requests'

Fix:

# Install missing module
pip3 install requests

# Or add to requirements
echo "requests" >> requirements.txt
pip3 install -r requirements.txt

Issue 6: FlowSpec Rules Not Applied

Symptoms: FlowSpec announced but traffic not filtered

Check:

# Cisco
show flowspec ipv4
show flowspec ipv4 detail

# Juniper
show firewall filter __flowspec_default_inet__

Common causes:

A. FlowSpec Not Enabled on Router

Fix:

# Cisco IOS XE style syntax (IOS-XR enables the address family under the neighbor rather than with "activate")
router bgp 65000
 address-family ipv4 flowspec
  neighbor 192.168.1.2 activate
 !
!
flowspec
 local-install interface-all
!

B. FlowSpec Validation Failing

Error: Rules received but not installed

Fix:

# Disable validation (testing only!)
flowspec
 validation off  # or "local"

Debug Mode

Enable Full Debugging

Command line:

exabgp -d /etc/exabgp/exabgp.conf

Environment variables:

# Enable all debug logging
export exabgp_log_all=true
export exabgp_log_level=DEBUG

exabgp /etc/exabgp/exabgp.conf

Selective Debugging

Enable specific subsystems:

# Debug BGP packets
export exabgp_log_packets=true

# Debug BGP messages
export exabgp_log_message=true

# Debug configuration parsing
export exabgp_log_configuration=true

# Debug process communication
export exabgp_log_processes=true

# Debug network events
export exabgp_log_network=true

exabgp /etc/exabgp/exabgp.conf

Decode BGP Messages

Decode captured BGP packets:

# Capture BGP traffic
tcpdump -i eth0 -w bgp.pcap port 179

# Decode with ExaBGP
env exabgp_tcp_bind='' exabgp decode -c /etc/exabgp/exabgp.conf \
  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:003C:02:0000001C4001010040020040030465016501800404000000C840050400000064000000002001010101
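Before handing a message to the decoder, it can help to confirm that the hex string is a well-formed BGP message. The sketch below checks only the fixed 19-byte header defined in RFC 4271 (16-byte marker, 2-byte length, 1-byte type) and accepts the colon-separated form used above; `parse_bgp_header` is a hypothetical helper, not an ExaBGP function.

```python
BGP_MARKER = b"\xff" * 16
MESSAGE_TYPES = {
    1: "OPEN",
    2: "UPDATE",
    3: "NOTIFICATION",
    4: "KEEPALIVE",
    5: "ROUTE-REFRESH",
}


def parse_bgp_header(hex_message):
    """Validate the BGP header and return its length, type name and body bytes."""
    raw = bytes.fromhex(hex_message.replace(":", ""))
    if len(raw) < 19:
        raise ValueError("shorter than the 19-byte BGP header")
    if raw[:16] != BGP_MARKER:
        raise ValueError("marker is not 16 bytes of 0xFF")
    length = int.from_bytes(raw[16:18], "big")
    if length != len(raw):
        raise ValueError(f"header says {length} bytes, got {len(raw)}")
    type_name = MESSAGE_TYPES.get(raw[18], f"unknown ({raw[18]})")
    return {"length": length, "type": type_name, "body": raw[19:]}
```

A length mismatch usually means the capture was truncated or several messages were pasted together.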

Logging

Log Levels

# Set log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
export exabgp_log_level=DEBUG

Log Destinations

Stdout (default):

exabgp /etc/exabgp/exabgp.conf

File:

exabgp /etc/exabgp/exabgp.conf > /var/log/exabgp.log 2>&1

Syslog:

[exabgp.log]
destination = syslog
level = INFO

Useful Log Patterns

Search for errors:

grep -i error /var/log/exabgp.log

Track BGP session state:

grep "neighbor.*up\|neighbor.*down" /var/log/exabgp.log

Monitor route announcements:

grep "announce\|withdraw" /var/log/exabgp.log

Find process crashes:

grep -i "process.*exit" /var/log/exabgp.log
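The four searches can be combined into one pass that prints a count per category. This sketch roughly mirrors the grep patterns above; the exact wording of ExaBGP log lines varies between versions, so the patterns and the `summarize` helper are assumptions to adapt rather than a fixed format.

```python
import re
from collections import Counter

# Same ideas as the grep patterns; error and process matching ignore case.
PATTERNS = {
    "errors": re.compile(r"error", re.IGNORECASE),
    "session": re.compile(r"neighbor.*(up|down)"),
    "routes": re.compile(r"announce|withdraw"),
    "process_exit": re.compile(r"process.*exit", re.IGNORECASE),
}


def summarize(lines):
    """Count how many lines match each pattern category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Typical use is `summarize(open("/var/log/exabgp.log", errors="replace"))`, then printing the resulting counter.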

BGP Session Issues

Session Won't Establish

Check TCP connectivity:

# Test connection
telnet 192.168.1.1 179

# Check routing
traceroute 192.168.1.1

# Verify firewall
iptables -L -n | grep 179

Session Flapping

Symptoms: BGP session repeatedly going up/down

Check:

# Monitor session state
watch -n 1 'grep "neighbor.*up\|neighbor.*down" /var/log/exabgp.log | tail'

Common causes:

  • Network instability
  • Keepalive/hold-time mismatch
  • Process crashes
  • Memory/CPU exhaustion

Fix:

# Adjust BGP timers
neighbor 192.168.1.1 {
    hold-time 180;  # Increase if needed
}
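When tuning timers, remember that the hold time actually used is negotiated: per RFC 4271, both sides use the smaller of the two advertised values, and a value of 0 disables keepalives. The sketch below shows that arithmetic; the keepalive of one third of the hold time is the common convention, not a guarantee for every implementation, and `negotiated_timers` is a hypothetical name.

```python
def negotiated_timers(local_hold, peer_hold):
    """Return (hold_time, keepalive_interval) in seconds after negotiation."""
    for value in (local_hold, peer_hold):
        # RFC 4271: hold time must be zero or at least three seconds.
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    hold = min(local_hold, peer_hold)
    keepalive = hold // 3  # 0 when hold is 0, meaning keepalives are disabled
    return hold, keepalive


print(negotiated_timers(180, 90))
```

So raising `hold-time` on the ExaBGP side alone has no effect if the router advertises a smaller value.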

API Process Issues

Process Not Starting

Check:

# Verify script exists and is executable
ls -l /etc/exabgp/healthcheck.py
chmod +x /etc/exabgp/healthcheck.py

# Test manually
/etc/exabgp/healthcheck.py

Process Crashes Immediately

Debug:

# Run with Python directly
python3 /etc/exabgp/healthcheck.py

# Check for:
# - Syntax errors
# - Missing imports
# - Exceptions

Process Hangs

Symptoms: API process runs but doesn't send commands

Debug:

# Add debug output on stderr so it does not mix with commands on stdout
import sys

# is_healthy() stands for your own service check
sys.stderr.write("[DEBUG] Script started\n")
sys.stderr.write(f"[DEBUG] Service healthy: {is_healthy()}\n")
sys.stderr.write("[DEBUG] Announcing route\n")
sys.stdout.write("announce route 100.10.0.100/32 next-hop self\n")
sys.stdout.flush()
sys.stderr.write("[DEBUG] Route announced\n")

Route Announcement Issues

Routes Not Visible on Router

Checklist:

  1. ✅ BGP session established?
  2. ✅ Address family enabled?
  3. ✅ API process sending commands?
  4. ✅ Commands have newline + flush?
  5. ✅ Router policy allowing routes?

Routes Announced but Invalid

Check next-hop:

show ip bgp 100.10.0.100
# Look for "inaccessible" or "invalid"

Fix:

# Use next-hop self or reachable IP
announce route 100.10.0.100/32 next-hop self

Performance Issues

High CPU Usage

Symptoms: ExaBGP consuming excessive CPU

Check:

# Monitor CPU (pgrep -d, joins multiple PIDs with commas, as top -p expects)
top -p "$(pgrep -d, -f exabgp)"

# Check API process
ps aux | grep healthcheck.py

Common causes:

  • Tight loop in API process
  • No sleep in while loop
  • Processing large BGP tables

Fix:

import time

while True:
    # Your code
    time.sleep(5)  # ← Add sleep!
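A fuller version of that loop might look like the sketch below: it sleeps between checks and only writes a command when the health state changes, which also avoids flooding ExaBGP with repeated announcements. The `run` function, its parameters, and the route are illustrative assumptions; `is_healthy` is whatever check you supply.

```python
import sys
import time

ROUTE = "100.10.0.100/32"  # example prefix used throughout this page


def run(is_healthy, out=sys.stdout, sleep=time.sleep, interval=5, max_iterations=None):
    """Announce ROUTE while healthy, withdraw it when not, sleeping between checks."""
    announced = False
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        healthy = is_healthy()
        if healthy and not announced:
            out.write(f"announce route {ROUTE} next-hop self\n")
            out.flush()
            announced = True
        elif not healthy and announced:
            out.write(f"withdraw route {ROUTE} next-hop self\n")
            out.flush()
            announced = False
        sleep(interval)  # keeps the loop from spinning at 100% CPU
        iteration += 1
```

In production you would call `run(my_check)` with no `max_iterations`; the `out` and `sleep` parameters exist so the loop can be exercised quickly in a test.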

High Memory Usage

Symptoms: ExaBGP using excessive RAM

Check:

# Monitor memory
ps aux | grep exabgp | awk '{print $6 " " $11}'

Common causes:

  • Large number of routes
  • Memory leak in API process
  • Large BGP tables from peers

Tools and Commands

Useful ExaBGP Commands

# Test configuration
exabgp --test /etc/exabgp/exabgp.conf

# Show version
exabgp --version

# Decode BGP message
exabgp decode -c config.conf <hex>

# Run health check
exabgp --run healthcheck --help

Network Debugging Tools

# Capture BGP traffic
tcpdump -i eth0 port 179 -w bgp.pcap

# Test TCP connection
telnet 192.168.1.1 179
nc -zv 192.168.1.1 179

# Monitor connections
ss -tan | grep :179
netstat -an | grep :179

# Check routing
ip route get 192.168.1.1
traceroute 192.168.1.1

Router Commands

Cisco IOS/IOS-XR:

! BGP summary
show bgp summary
show ip bgp summary

! Specific neighbor
show bgp neighbors 192.168.1.2
show ip bgp neighbors 192.168.1.2 received-routes
show ip bgp neighbors 192.168.1.2 routes

! Route details
show ip bgp 100.10.0.100

! FlowSpec
show flowspec ipv4
show flowspec ipv4 detail

Juniper Junos:

# BGP summary
show bgp summary

# Specific neighbor
show bgp neighbor 192.168.1.2

# Routes from peer
show route receive-protocol bgp 192.168.1.2

# Route details
show route 100.10.0.100 detail

Getting Help

Before Asking for Help

Gather this information:

  1. ExaBGP version

    exabgp --version
  2. Full debug log

    exabgp -d /etc/exabgp/exabgp.conf 2>&1 | tee debug.log
  3. Configuration file

    cat /etc/exabgp/exabgp.conf
  4. API process code

    cat /etc/exabgp/healthcheck.py
  5. Router BGP config (sanitized)

  6. Error messages (exact text)


Where to Get Help

GitHub Issues: https://github.com/Exa-Networks/exabgp/issues

Slack: https://exabgp.slack.com/

Documentation:


How to Report Bugs

Include:

  1. ExaBGP version (exabgp --version)
  2. Python version (python3 --version)
  3. Operating system (uname -a)
  4. Full configuration file (sanitized)
  5. Complete debug output (exabgp -d)
  6. Steps to reproduce
  7. Expected vs actual behavior

Format:

## Environment
- ExaBGP version: 4.2.25
- Python version: 3.9.2
- OS: Ubuntu 20.04

## Configuration
```ini
neighbor 192.168.1.1 {
    ...
}
```

## Steps to Reproduce

1. Start ExaBGP with config
2. Run health check script
3. Observe...

## Expected Behavior

Routes should be announced

## Actual Behavior

No routes announced, error: ...

## Debug Log

<paste debug output>

---

## Quick Reference

### Checklist for Common Issues

**ExaBGP won't start:**
- [ ] Configuration syntax correct?
- [ ] API process script exists?
- [ ] Script is executable?
- [ ] Python version >= 3.8.1?

**BGP session won't establish:**
- [ ] TCP connection works? (telnet)
- [ ] ASNs match?
- [ ] MD5 password matches?
- [ ] Address family enabled?

**Routes not announced:**
- [ ] API process running?
- [ ] stdout.flush() called?
- [ ] Address family enabled?
- [ ] Router policy allowing?

**FlowSpec not working:**
- [ ] FlowSpec family enabled?
- [ ] Router supports FlowSpec?
- [ ] FlowSpec locally installed?
- [ ] Validation passing?

---

## Next Steps

### Learn More

- **[Monitoring](Monitoring)** - Monitor ExaBGP in production
- **[API Overview](API-Overview)** - API architecture
- **[Quick Start](Quick-Start)** - Getting started

### Configuration

- **[Configuration Syntax](Configuration-Syntax)** - Config reference
- **[Directives Reference](Directives-Reference)** - A-Z directives

---

**Still stuck?** Join our [Slack community](https://exabgp.slack.com/) or [file an issue](https://github.com/Exa-Networks/exabgp/issues) →

---

**👻 Ghost written by Claude (Anthropic AI)**