Add OTel auto-instrumentation for ddtrace-equivalent fidelity #60
When INSTALL_OTEL=true, the image now also installs:
- opentelemetry-distro (provides the opentelemetry-instrument CLI)
- opentelemetry-instrumentation-urllib3 (outbound HTTP — webhook +
HTTP-style storage uploads)
- opentelemetry-instrumentation-botocore (boto3 — S3 native uploads)
- opentelemetry-instrumentation-logging (otelTraceID/otelSpanID on
LogRecord, equivalent to DD_LOGS_INJECTION)
- opentelemetry-instrumentation-dnspython (DNS during SPF/DKIM/DMARC)

entrypoint.sh wraps the milter launch with opentelemetry-instrument
when OTEL_TRACING_ENABLED=true, which auto-discovers and activates all
installed instrumentations. The ddtrace path stays unchanged (its
auto-instrumentation activates via import, no wrapper needed).

Net effect on the OTel path: child spans for every outbound HTTP
request, S3 op, and DNS lookup, plus trace and span IDs on every log
line, the same depth ddtrace provides today. Manual spans
(process_email, storage_upload, webhook_call) become parents of the
auto-instrumentation child spans.

No degradation for ddtrace deployers (gated on the INSTALL_OTEL build
arg + OTEL_TRACING_ENABLED runtime flag); auto-instrumentation can be
disabled per-library via OTEL_PYTHON_DISABLED_INSTRUMENTATIONS.
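
As an illustration of the per-library switch, the fragment below is a minimal sketch of a runtime environment that keeps log correlation but drops the urllib3 and botocore child spans. The particular combination is an assumption chosen for the example; the instrumentation names are the ones listed above.

```shell
# Sketch only: runtime environment for a container built with INSTALL_OTEL=true.
# Which instrumentations to disable is an illustrative choice, not a recommendation.
export OTEL_TRACING_ENABLED=true

# Comma-separated instrumentation names. These two stop emitting child spans,
# while logging correlation and the manual spans are left in place.
export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=urllib3,botocore
```

Leaving the variable unset keeps every installed instrumentation active.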
Greptile Summary
This PR adds four …

Confidence Score: 5/5
Safe to merge: strictly additive, doubly gated, and the prior reviewer concern is resolved.

No P0 or P1 findings. The implementation is clean: version ranges follow the OTel Python convention (0.48b0 ↔ api 1.27), opentelemetry-instrument execs into the Python process so MILTER_PID tracking and signal propagation are unaffected, the silent-fallback gap from the prior review is closed with three explicit stderr warnings, and the change is completely opt-in behind two independent gates (build arg + runtime flag). All findings are P2 or lower, so full marks.

No files require special attention.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[entrypoint.sh starts] --> B{OTEL_TRACING_ENABLED=true?}
B -- No --> C[Launch milter directly]
B -- Yes --> D{opentelemetry-instrument\nin PATH?}
D -- Found --> E[opentelemetry-instrument\npython3 primitivemail_milter.py]
D -- Missing --> F[Print 3 stderr WARNINGs]
F --> G[Launch milter directly\nmanual spans only]
E --> H[MILTER_PID captured]
G --> H
C --> H
H --> I[Readiness check port 9900]
I --> J[Postfix starts]
subgraph OTel_auto_instrumentations
E --> K[urllib3 spans]
E --> L[botocore S3 spans]
E --> M[logging correlation]
E --> N[dnspython spans]
end
Reviews (3): Last reviewed commit: "Greptile P2: warn loudly when openteleme..."
Previously, OTEL_TRACING_ENABLED=true with the wrapper absent fell silently through to a plain milter launch — auto-instrumentation didn't attach, but the earlier 'OpenTelemetry tracing enabled' log made it look like everything was working. Now an explicit three-line WARNING surfaces the fidelity-loss case and points at the rebuild fix (INSTALL_OTEL=true). Manual spans still emit, so this is a degradation alarm, not a hard failure.
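
The fallback described above could look roughly like the following sketch. It is not the actual entrypoint.sh: the function name `milter_launch_cmd` and the warning wording are assumptions made for illustration, and the function prints the command line it would run instead of launching it.

```shell
# Hypothetical helper: decide how the milter should be launched.
# Prints the chosen command line on stdout; warnings go to stderr.
milter_launch_cmd() {
    milter_cmd="python3 primitivemail_milter.py"

    # Runtime gate off: plain launch, no wrapper.
    if [ "${OTEL_TRACING_ENABLED:-false}" != "true" ]; then
        echo "$milter_cmd"
        return 0
    fi

    # Wrapper present: auto-instrumentation attaches at process start.
    if command -v opentelemetry-instrument >/dev/null 2>&1; then
        echo "opentelemetry-instrument $milter_cmd"
        return 0
    fi

    # Wrapper missing: warn loudly, then degrade to manual spans only.
    echo "WARNING: OTEL_TRACING_ENABLED=true but opentelemetry-instrument is not in PATH." >&2
    echo "WARNING: Auto-instrumentation will not attach; only manual spans will be emitted." >&2
    echo "WARNING: Rebuild the image with INSTALL_OTEL=true to restore full tracing." >&2
    echo "$milter_cmd"
}
```

Printing the command keeps the decision easy to check in isolation. Per the flowchart, the real script launches the result and records MILTER_PID before the readiness check on port 9900.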
…61)
PR #60 listed it among the auto-instrumentation packages by analogy with the others, but there's no upstream OTel auto-instrumentation for dnspython on PyPI. The build fails with:

ERROR: Could not find a version that satisfies the requirement opentelemetry-instrumentation-dnspython<1,>=0.48b0 (from versions: none)
ERROR: No matching distribution found for opentelemetry-instrumentation-dnspython<1,>=0.48b0

DNS lookups during SPF/DKIM/DMARC checks won't have per-call spans, but the manual milter.process_email parent still wraps them: trace context is preserved, only the per-DNS-call breakdown is lost.

Comment on the Dockerfile updated to call this out explicitly so the next person who looks at the list knows why it's three packages, not four.

Co-authored-by: prim-8 <prim-8@users.noreply.github.com>
Summary
The previous PR (#59) added OpenTelemetry as a parallel runtime tracing option to ddtrace, but the OTel path only emitted the three manual spans hand-coded in the milter (process_email, storage_upload, webhook_call). This PR closes the fidelity gap so the OTel path matches what ddtrace's auto-instrumentation provides.
- Installs the opentelemetry-instrumentation-* packages that match the libraries primitivemail uses (urllib3, botocore, logging, dnspython) plus opentelemetry-distro for the CLI wrapper.
- entrypoint.sh wraps the milter launch with opentelemetry-instrument when OTEL_TRACING_ENABLED=true so all installed instrumentations attach at process start.
- Doubly gated (INSTALL_OTEL=true build arg and OTEL_TRACING_ENABLED=true runtime flag). Zero impact on ddtrace deployers and untraced deployers.
What this delivers
For an OTel deployer, the trace graph for an inbound email becomes:
Same depth ddtrace currently produces in Datadog. Manual spans become parents of the auto-instrumented child spans naturally.
Backwards compatibility
- INSTALL_OTEL=false (default)
- INSTALL_OTEL=true, OTEL_TRACING_ENABLED=false
- INSTALL_OTEL=true, OTEL_TRACING_ENABLED=true
- INSTALL_DDTRACE=true, DATADOG_TRACING_ENABLED=true
Auto-instrumentations can be disabled per-library at runtime via OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=urllib3,botocore,... for deployers who only want the manual spans.
Test plan
- pytest: existing tests pass (auto-instrumentation only attaches when both gates are set; default-off path is undisturbed).
- Build with INSTALL_DDTRACE=true INSTALL_OTEL=true, run with DATADOG_TRACING_ENABLED=true: confirm Datadog APM still shows existing trace structure (no behavior change on this path).
- Run with OTEL_TRACING_ENABLED=true + OTEL_EXPORTER_OTLP_ENDPOINT=... pointing at a local OTel collector or Tempo; confirm:
  - Manual spans (milter.process_email, milter.storage_upload, milter.webhook_call) appear
  - Log records carry otelTraceID/otelSpanID fields
  - … SPOOF_PROTECTION=enforce exercises SPF/DKIM/DMARC
- Set OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=urllib3,botocore,logging,dnspython: confirm only the manual spans land (escape hatch works).