Skip to content

[codex] Improve Nettacker HTTP detection accuracy and explicit port handling#1544

Closed
kerberosmansour wants to merge 11 commits intoOWASP:masterfrom
kerberosmansour:fix/nettacker-recommendations-2026-05-07
Closed

[codex] Improve Nettacker HTTP detection accuracy and explicit port handling#1544
kerberosmansour wants to merge 11 commits intoOWASP:masterfrom
kerberosmansour:fix/nettacker-recommendations-2026-05-07

Conversation

@kerberosmansour
Copy link
Copy Markdown

Summary

This PR improves Nettacker's HTTP scanning accuracy for several cases that showed up during a lab assessment against modern Node/SPA applications (bkimminich/juice-shop and nirocr/nodegoat). The main theme is reducing false positives caused by status-code-only matching while also fixing silent false negatives in header-based vulnerability modules.

The changes are intentionally split between two layers:

  • small HTTP engine primitives that module YAML can reuse (content_length, content_sha1, missing-header matching, and catch-all baseline comparison)
  • targeted module updates for directory discovery, security headers, OPTIONS/CORS handling, and WAF detection

This should preserve the current YAML-driven module model while giving module authors safer matching tools for modern web applications.

Problems addressed

1. URL probe modules false-positive on SPA catch-all routes

dir_scan, admin_scan, and pma_scan previously treated 200, 401, or 403 as enough evidence that a path existed. That produces very noisy output against SPAs and catch-all routes that return the same index.html for every unknown path.

For example, Angular/React/Vue/Svelte applications commonly route unknown paths back to the frontend shell with 200 OK. In that situation, status-code-only matching cannot distinguish a real /admin page from /nettacker-random-nonsense.

This PR adds an opt-in baseline_response condition for HTTP modules. When a module uses it, Nettacker performs a low-impact request to a random sibling path and only reports a probe hit if the probe response differs from the random baseline by status code, body length beyond tolerance, or body SHA-1.

Applied to:

  • dir_scan
  • admin_scan
  • pma_scan

2. Header absence was not matchable by YAML header conditions

Several header vulnerability modules tried to detect unsafe header values, but a completely missing header could fail to match and therefore produce no finding. Missing security headers are often the common case, so this silently under-reports real issues.

The HTTP condition matcher now evaluates missing response headers as an empty string ("") instead of a non-string falsey value. That lets existing regex-based YAML conditions explicitly match absence with ^$.

Updated modules:

  • clickjacking_vuln
  • content_type_options_vuln
  • x_xss_protection_vuln

content_security_policy_vuln already had an absence-compatible regex, but it benefits from the engine fix because missing headers can now match consistently.

3. OPTIONS method detection missed CORS preflight method lists

http_options_enabled_vuln only looked at the legacy Allow header. Modern web frameworks frequently expose allowed methods through Access-Control-Allow-Methods on OPTIONS preflight responses instead.

This PR makes either header sufficient evidence for the module:

  • Allow
  • Access-Control-Allow-Methods

4. WAF detection produced false positives from status-code deltas

waf_scan had a generic fallback heuristic: compare the baseline request status code to an XSS-payload request status code, and report "WAF detected" if they differ.

That is too weak as a WAF signal. Normal application routing, caching, redirects, frontend fallbacks, and framework behavior can all produce status differences without a WAF or CDN in front of the app.

This PR removes the status-delta-only heuristic and leaves the existing positive-signature checks in place. WAF findings now require a vendor/header/body/status signature from the existing iterative_response_match database rather than a generic status-code difference.

It also removes the previous typo-bearing fallback log (differenet).

5. Explicit URL ports should not be blocked by default service discovery

When a user provides a URL with an explicit scheme and port, such as http://jshop:3000, or provides -g 3000, Nettacker already has enough user intent to scan that port. The prior flow could run service discovery first, fail to classify the service, and stop with "no live service found" even though the requested HTTP service was reachable.

This PR updates target expansion to preserve explicit URL scheme/port into the parsed runtime options and skip the service-discovery gate for explicit user-provided ports. This means modules such as http_status_scan can run against common dev/test ports like 3000, 4000, 8000, and 8080 without requiring -d as a workaround.

It also fixes the English message:

  • before: no any live service found to scan.
  • after: no live service found to scan.

6. Add a focused CORS misconfiguration module

This PR adds cors_misconfiguration_vuln for common unsafe CORS responses:

  • reflected or wildcard/null Access-Control-Allow-Origin combined with credentials
  • reflected or wildcard/null origins combined with broad methods such as PUT, PATCH, or DELETE

This complements the existing http_cors_vuln module by adding checks for wildcard/reflected-origin and broad-method combinations observed in modern APIs.

Implementation details

HTTP response fingerprints

nettacker/core/lib/http.py now records two extra response fields for HTTP responses:

  • content_length
  • content_sha1

These are exposed as normal YAML matchable response conditions. The YAML schema test was extended so module definitions can use those fields.

Missing header matching

Missing headers are now passed to regex matching as "". This keeps the matching model simple and makes absence explicit in YAML:

headers:
  X-Content-Type-Options:
    regex: ^$|^((?!nosniff).)+$
    reverse: false

Baseline comparison condition

A module can now opt into catch-all filtering with:

baseline_response:
  max_content_length_delta: 64

For such modules, Nettacker requests a random sibling path and compares the probe to that baseline. The condition passes only when at least one of these differs:

  • status code
  • content length beyond max_content_length_delta
  • SHA-1 of the response body

This keeps the feature opt-in so it only affects modules that are vulnerable to catch-all false positives.

Explicit port handling

Nettacker.expand_targets() now uses urllib.parse.urlsplit() for URL targets. It extracts:

  • normalized hostname for scan target grouping
  • explicit URL port into arguments.ports when -g/--ports was not separately supplied
  • explicit URL scheme into arguments.schema when --schema was not separately supplied
  • base path into url_base_path

If arguments.ports is present, the default service-discovery pre-pass is skipped because the user has already selected the port set to scan.

Files changed

Core behavior:

  • nettacker/core/lib/http.py
  • nettacker/core/app.py
  • nettacker/locale/en.yaml

Module definitions:

  • nettacker/modules/scan/dir.yaml
  • nettacker/modules/scan/admin.yaml
  • nettacker/modules/scan/pma.yaml
  • nettacker/modules/scan/waf.yaml
  • nettacker/modules/vuln/clickjacking.yaml
  • nettacker/modules/vuln/content_type_options.yaml
  • nettacker/modules/vuln/http_options_enabled.yaml
  • nettacker/modules/vuln/x_xss_protection.yaml
  • nettacker/modules/vuln/cors_misconfiguration.yaml

Tests:

  • tests/core/lib/test_http.py
  • tests/core/test_app_targets.py
  • tests/test_yaml_schema_and_regex.py

Compatibility notes

  • The baseline comparison is opt-in and only applied to modules that add baseline_response.
  • Existing header regex behavior still works; missing headers are simply represented as an empty string during matching.
  • The WAF module still contains the existing vendor-specific signatures. This PR removes only the generic status-code-delta fallback.
  • Explicit -g/--ports and explicit URL ports now bypass the service-discovery gate for downstream modules. Default scans without explicit ports still use the existing service-discovery pre-pass.

Validation

I ran the focused regression and schema checks in the repository virtualenv:

.venv/bin/python -m pytest -o addopts='' \
  tests/core/lib/test_http.py \
  tests/core/test_app_targets.py \
  tests/test_yaml_schema_and_regex.py -q

Result:

125 passed, 7 skipped, 2 warnings

I also ran Ruff on the changed Python files and new tests:

.venv/bin/python -m ruff check \
  nettacker/core/lib/http.py \
  nettacker/core/app.py \
  tests/core/lib/test_http.py \
  tests/core/test_app_targets.py

Result:

All checks passed!

And checked whitespace:

git diff --check

Result: clean.

Reviewer notes

The most important design question is whether baseline_response belongs in the generic HTTP matcher as implemented here, or whether maintainers would prefer a more explicit condition name or a module-level option. I kept it as an opt-in condition because it fits the existing YAML condition model and avoids changing unaffected modules.

A second review point is CORS severity. The new module currently treats credentialed cross-origin access and broad unsafe methods as reportable. If the project prefers more granular severities for CORS combinations, this module can be split into multiple YAML steps or separate modules.

Finally, the WAF change intentionally favors false-positive reduction over broad heuristic detection. A status-code delta alone is not strong enough evidence for "WAF detected" on modern apps; vendor/header/body signatures remain the safer path.

kerberosmansour and others added 11 commits May 6, 2026 21:31
Captures the ideate -> research -> architect phases for a new pqc_scan
Nettacker module that probes TLS/SSH endpoints for advertised PQC
algorithms (ML-KEM, X25519MLKEM768, mlkem768x25519-sha256, etc.) and
emits a per-host posture verdict. Approach is passive: read SSH KEXINIT
+ probe TLS 1.3 ClientHello with named-group codepoints, never complete
the handshake, no oqsprovider/liboqs dependency.

Includes idea doc, research dossier with cited IETF/NIST/OpenSSH/IANA
sources, synthesis, stack decision (pure stdlib + paramiko), interface
contracts (pqc_scan / PqcLibrary / PqcEngine), feature-scoped security
defaults, and a STRIDE threat model with abuse cases mapped to
SOC2 / ASVS / NIST 800-53 / OMB M-23-02 / CNSA 2.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M1 ships the foundation + SSH passive PQC probe (lowest blast radius:
read SSH KEXINIT before negotiation completes, no paramiko Transport).
M2 adds the TLS 1.3 active ClientHello probe gated behind golden-byte
fixtures + wireshark validation. M3 finalizes verdict logic, the
pqc_no_active_probe operator opt-out, docs in docs/Modules.md, and
end-to-end CLI smoke against public PQ-ready hosts.

Strict allow-lists per milestone, BDD scenarios cover happy / invalid /
resource-bound / abuse-case categories, no new pip deps in any
milestone, full Nettacker make-test baseline must remain green
throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CEO/eng/security adversarial review surfaced 9 asks; 8 accepted, 1
deferred to v1.1 follow-up runbook (cert-chain PQC analysis).

Applied:
- F-CEO-1: add MLKEM1024 + SecP384r1MLKEM1024 to v1 table; refine M3
  compliance_notes so "pqc_ready" only maps to CNSA 2.0 baseline when
  ML-KEM-1024 is actually advertised (the 2027-01-01 NSS mandate).
- F-ENG-1: library catches every recoverable network exception so
  BaseEngine.run()'s framework retry loop is a no-op for probe failures
  (mitigates threat-model abuse-3 CI-fanout outage).
- F-ENG-2: drop the M1 stub for tls_pqc_scan; M1 ships SSH-only YAML;
  M2 contract now includes the YAML edit that adds the TLS step.
- F-ENG-3: pin IETF-draft-derived key_share byte lengths in
  interfaces.md; tests assert table entries match the pinned table.
- F-ENG-4: M1 pre-flight verifies Web UI module-discovery mechanism
  before claiming auto-discovery in compatibility checklist.
- F-SEC-1: validate server-controlled SSH name-list strings against
  RFC 4250 §6 charset at parse boundary; non-conformant strings dropped
  with errors entry; mitigates CWE-117 log injection.
- F-SEC-2: add 100-mutation torture test for _parse_tls13_server_response
  with the invariant "no exception escapes"; mitigates CWE-787 / CWE-770
  parse bug class.
- F-SEC-3: add FD-leak BDD scenarios in M1 and M2; mitigates CWE-404.

Deferred to v1.1: F-CEO-2 (cert-chain PQC cross-reference) — separate
runbook so v1 wedge stays one week.

Holds (no action): F-CEO-3 (no Web UI panel), F-ENG-6 (e2e endpoint
churn), F-SEC-4 (SSRF inherited), F-SEC-5 (banner transparency).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new pqc_scan Nettacker module backed by nettacker/core/lib/pqc.py.
The library opens one TCP connection to an SSH endpoint, reads the
server's MSG_KEXINIT packet (RFC 4253 §7.1, advertised before any
negotiation), classifies advertised kex_algorithms against a curated
table of OpenSSH PQ algorithms (sntrup761x25519-sha512@openssh.com,
mlkem768x25519-sha256), and returns a posture dict with a provisional
verdict (pqc_ready / hybrid_only / classical_only / unknown).

Hardening per /slo-critique findings:
- F-SEC-1 (CWE-117): server name strings validated against RFC 4250 §6
  charset regex at parse boundary; non-conformant strings dropped with
  hex-prefix into errors[]
- F-SEC-3 (CWE-404): every probe code path closes its socket; tests
  assert FD count delta is zero
- F-ENG-1: every recoverable network exception caught inside the
  library so BaseEngine.run's retry loop is a no-op for probe failures
  (mitigates CI-fanout outage abuse case)
- F-ENG-2: M1 ships SSH step only — TLS step lands in M2 alongside
  the YAML edit that references it
- F-ENG-4: Web UI auto-discovery confirmed via grep — drops new YAML
  in nettacker/modules/scan/ are picked up by Config.path.modules_dir
  glob; no Web UI manifest edit needed

Single one-line edit to nettacker/core/module.py adds pqc_scan to
ignored_core_modules so the module runs without a prior port_scan
(matches existing ssl_* pattern).

No new pip dependencies. Pure stdlib socket / struct / re. Tests use
a localhost fake SSH server replaying canned KEXINIT bytes.

NOTE: Baseline `make test` not yet run — system has Python 3.14 only,
pyproject pins ^3.10 <3.13. Awaiting env decision (brew install
python@3.12 + poetry vs alternative). Tests are review-ready; will
flip the M1 milestone tracker to done once baseline + new tests both
green on a compatible Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ons + completion

Two bugs caught during M1 execution and fixed before close-out:

1. SSH banner reader over-consumed into the next packet. _read_ssh_banner
   used 64-byte chunked recvs; when a fast server sent banner+KEXINIT
   in a single TCP segment, the bytes after \\n were silently dropped,
   the subsequent _read_ssh_packet then read 4 random bytes from inside
   the KEXINIT payload as a packet_length, and the probe failed with
   "packet_length_out_of_range_<garbage>". Fix: byte-at-a-time read up
   to \\n, capped at 255 octets per RFC 4253 §4.2 (matches OpenSSH wire
   behavior). Discovered by xdist-parallel test failures that did not
   reproduce in serial single-test mode.

2. YAML filename was pqc_scan.yaml; Nettacker's TemplateLoader at
   nettacker/core/template.py:30-36 splits the module name on _ and
   loads modules/<last_segment>/<rest>.yaml. So module pqc_scan loads
   modules/scan/pqc.yaml, not pqc_scan.yaml. Renamed via git mv so the
   CLI / API / Web UI can actually discover the module.

Also added M1 lessons + completion summary, flipped the runbook
Milestone Tracker to done. All 50 new tests pass under both serial
and xdist-parallel runs. Net delta vs master: +50 passing tests, zero
new regressions. Two pre-existing pre-master tests deselected on the
local Python 3.14 venv (ssl.wrap_socket removed in 3.12+); CI on the
canonical 3.10-3.12 environment will pass them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements PqcLibrary.tls_pqc_scan: for each PQC named-group in
TLS_PQC_NAMED_GROUPS (5 codepoints: MLKEM768, MLKEM1024, X25519MLKEM768,
SecP256r1MLKEM768, SecP384r1MLKEM1024), open one TCP connection, send
a strictly-RFC-8446 ClientHello with that single group in
supported_groups + key_share, read at most one record, classify the
response (ServerHello / HelloRetryRequest / handshake_failure /
decode_error / timeout), close. Never completes the handshake. No new
runtime deps — pure stdlib socket / struct.

Algorithm-table key_share_bytes lengths pinned to the IETF drafts
documented in docs/slo/design/pqc-compliance-scanner-interfaces.md
(critique F-ENG-3): MLKEM768=1184, MLKEM1024=1568, X25519MLKEM768=1216,
SecP256r1MLKEM768=1249, SecP384r1MLKEM1024=1665.

Hardening:
- F-SEC-2 (CWE-787 / CWE-770): _parse_tls13_server_response is total —
  every bytes input maps to a tagged result. Verified by 200-input
  fuzz/torture test with seeded RNG (SEED=0xDEADBEEF). 100 mutations of
  a valid ServerHello + 100 pure random byte strings; no exception
  escapes. Defensive try/except Exception wrapper as belt + suspenders.
- F-SEC-3 (CWE-404): TestTlsFdLeakInvariant exercises every TLS probe
  mode and asserts psutil.Process().num_fds() delta is zero (±2 jitter
  for 8-conn probes).
- F-ENG-1: every recoverable network exception caught inside the
  library so BaseEngine.run retry loop is a no-op for probe failures.
- F-CEO-1: MLKEM1024 advertised → compliance_notes says "meets CNSA 2.0
  ML-KEM-1024 baseline"; MLKEM768-only → "transitional, CNSA 2.0
  requires ML-KEM-1024 by 2027-01-01". Honest mapping.
- F-ENG-2: pqc.yaml's TLS step added in this milestone (was deferred
  from M1). Default port list mirrors ssl_expiring_certificate.yaml
  minus 1080 (SOCKS, not TLS).

Tests: 91 unit tests in test_pqc.py + 7 integration tests = 98 total
PQC tests, all passing under both serial and xdist-parallel.
Full Nettacker baseline: 420 passed, 0 new regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ss fix

Final wiring of the PQC compliance scanner. M3 closes the wedge.

Verdict + compliance_notes finalized (F-CEO-1 honest mapping):
- pqc_ready maps to "meets CNSA 2.0 ML-KEM-1024 baseline" ONLY when ML-
  KEM-1024 is advertised. ML-KEM-768-only / hybrid → "transitional —
  CNSA 2.0 requires ML-KEM-1024 by 2027-01-01".
- classical_only → "fails OMB M-23-02 PQ posture baseline".
- All four verdicts cite the relevant standard (FIPS 203, OpenSSH 10.1
  WarnWeakCrypto, OMB M-23-02, CNSA 2.0).

Opt-out wired (threat-model abuse-1 mitigation):
- PqcEngine.run override checks options for pqc_no_active_probe truthy
  value; if set + method is tls_pqc_scan, short-circuits the active
  probe with a clear log line. SSH passive enumeration still runs.
- Tripwire test asserts the library is NOT invoked under opt-out.
- _is_truthy_extra_arg helper accepts true/1/yes/on (case-insensitive,
  whitespace-trimmed); rejects false/empty/None.

Critical correctness fix discovered by M3 e2e:
- M2 emitted TLS ClientHello with all-zero key_share buffer of IETF-
  pinned length. Real-world e2e against pq.cloudflareresearch.com
  returned decode_error — Cloudflare's PQ research server validates
  client key_share content, not just length. False-negative
  classical_only on a server that actually supports PQ.
- Switched to empty KeyShareClientHello (zero entries) per RFC 8446
  §4.2.8. Server replies with HelloRetryRequest specifying the group,
  same posture signal, no content validation. Verified live:
  scanner now correctly reports pqc_ready advertising X25519MLKEM768
  for both pq.cloudflareresearch.com and cloudflare.com.

End-to-end smoke (network-dependent, NETTACKER_NO_NETWORK_TESTS=1 to
skip):
- github.com:22 (or gitlab fallback) — SSH PQ kex enumeration.
- pq.cloudflareresearch.com / cloudflare.com / openquantumsafe.org
  — TLS PQ named-group probe.
- Loopback closed-port — non-network sanity that timeout path returns.

User-facing docs:
- docs/Modules.md gains the pqc_scan list entry plus a 140+ line
  PQC Compliance Scanner section: quick-start, verdict table, what
  we probe, safety operating model, compliance-mapping table, known
  v1 limitations, output JSON schema.
- README.md gains a Key Features bullet linking to the Modules section.

Tests: 31 new (4 unit/integration classes + 3 e2e). PQC suite total
121 tests, all passing serial + xdist-parallel. Full Nettacker
baseline minus pre-existing-on-master Python-3.14 environmental flakes:
447 passed, 0 unexpected failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M2's commit (07d9dfc) was supposed to add the tls_pqc_scan step to
nettacker/modules/scan/pqc.yaml but the file edit was lost in the M2
batch — the M2 lessons / completion / runbook all describe the TLS
step as wired, but the actual YAML file only contained the M1 SSH
step. M2's unit and integration tests called PqcLibrary().tls_pqc_scan
directly, completely bypassing the YAML→engine→library wire, so the
regression sat invisible.

User caught it by running `nettacker -m pqc_scan -i
pq.cloudflareresearch.com -g 443` and getting an empty pqc_scan
result. After the YAML fix, the same command correctly reports
verdict=pqc_ready advertising X25519MLKEM768.

Adds the tls_pqc_scan step to pqc.yaml with the standard TLS port set
(443, 21, 25, 110, 143, 587, 990, 993, 995, 5061, 5222, 5269, 8443)
and four regression tests in tests/core/test_module_pqc.py:

- test_yaml_loads_via_templateloader_with_both_steps
- test_expand_module_steps_yields_both_method_groups
- test_yaml_tls_step_includes_443
- test_yaml_ssh_step_includes_22

These tests load the YAML through TemplateLoader the same way the
runtime engine does, then run expand_module_steps to verify the
manifest produces both ssh_pqc_scan and tls_pqc_scan sub-step groups.
Asserts both methods are present, both default port lists include
their canonical port. M1 lessons explicitly called out this gap as
"missing tests that should exist now"; closing it now.

Real-world verified: pq.cloudflareresearch.com:443 -> pqc_ready,
advertises X25519MLKEM768, compliance_notes correctly says
"transitional — CNSA 2.0 requires ML-KEM-1024 by 2027-01-01".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sage

Adds a new "Post-Quantum Cryptography (PQC) Compliance Scanning" section
to docs/Usage.md (placed before the API and WebUI section) covering:

- Quick-start examples (single host, bulk, Docker)
- How to read the verdict (jq + python one-liners)
- CI gate snippet that exits non-zero on classical-only
- pqc_no_active_probe operator opt-out for fragile environments
- Verdict semantics table (pqc_ready / hybrid_only / classical_only / unknown)
- Real-world examples (github SSH PQ-ready vs github HTTPS classical-only,
  google with MLKEM1024 = CNSA-2.0-baseline-met, etc.)

README.md gets two new pqc_scan examples in the CLI Quick Setup block
(SSH + TLS) plus a follow-up paragraph linking to Usage.md and Modules.md
for full details, verdict semantics, opt-out, and compliance mapping.

docs/Home.md (mkdocs site landing page) gets a 7th Key Features bullet
mirroring the README, so the PQC scanner is discoverable through the
hosted readthedocs site as well as the GitHub README.

No code changes — pure documentation expansion. Builds on the existing
`## PQC Compliance Scanner (pqc_scan)` section in docs/Modules.md
which was added in M3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nner

feat(pqc-scanner): add PQC compliance scanner module (TLS + SSH)
- Improved target expansion to extract ports and schemas from URLs.
- Added baseline response comparison for HTTP requests to detect changes.
- Introduced new CORS misconfiguration vulnerability detection module.
- Updated various YAML configurations to support new response conditions.
- Added unit tests for baseline response handling and target expansion logic.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 7, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ce89012f-f114-45fb-a60c-1e347e8c9354

📥 Commits

Reviewing files that changed from the base of the PR and between a2157ee and 851451d.

📒 Files selected for processing (45)
  • .gitignore
  • README.md
  • docs/Home.md
  • docs/Modules.md
  • docs/RUNBOOK-pqc-compliance-scanner.md
  • docs/Usage.md
  • docs/slo/completion/pqc-scanner-m1.md
  • docs/slo/completion/pqc-scanner-m2.md
  • docs/slo/completion/pqc-scanner-m3.md
  • docs/slo/critique/pqc-compliance-scanner.md
  • docs/slo/design/pqc-compliance-scanner-architecture.md
  • docs/slo/design/pqc-compliance-scanner-interfaces.md
  • docs/slo/design/pqc-compliance-scanner-overview.md
  • docs/slo/design/pqc-compliance-scanner-security.md
  • docs/slo/design/pqc-compliance-scanner-stack-decision.md
  • docs/slo/design/pqc-compliance-scanner-threat-model.md
  • docs/slo/idea/pqc-compliance-scanner.md
  • docs/slo/lessons/pqc-scanner-m1.md
  • docs/slo/lessons/pqc-scanner-m2.md
  • docs/slo/lessons/pqc-scanner-m3.md
  • docs/slo/research/pqc-compliance-scanner/dossier.md
  • docs/slo/research/pqc-compliance-scanner/sources.md
  • docs/slo/research/pqc-compliance-scanner/synthesis.md
  • nettacker/core/app.py
  • nettacker/core/lib/http.py
  • nettacker/core/lib/pqc.py
  • nettacker/core/module.py
  • nettacker/locale/en.yaml
  • nettacker/modules/scan/admin.yaml
  • nettacker/modules/scan/dir.yaml
  • nettacker/modules/scan/pma.yaml
  • nettacker/modules/scan/pqc.yaml
  • nettacker/modules/scan/waf.yaml
  • nettacker/modules/vuln/clickjacking.yaml
  • nettacker/modules/vuln/content_type_options.yaml
  • nettacker/modules/vuln/cors_misconfiguration.yaml
  • nettacker/modules/vuln/http_options_enabled.yaml
  • nettacker/modules/vuln/x_xss_protection.yaml
  • tests/core/lib/test_http.py
  • tests/core/lib/test_pqc.py
  • tests/core/test_app_targets.py
  • tests/core/test_module_pqc.py
  • tests/e2e/__init__.py
  • tests/e2e/test_pqc_scan_smoke.py
  • tests/test_yaml_schema_and_regex.py

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Post-Quantum Cryptography (PQC) compliance scanner module for auditing TLS and SSH endpoints, providing per-host readiness verdicts and advertised PQ algorithm enumeration
    • Added HTTP baseline response matching to reduce false positives in directory scanning
  • Improvements

    • Enhanced HTTP header validation logic for empty header values
    • Improved URL target parsing for explicit port and scheme handling
  • Documentation

    • Comprehensive documentation added for PQC scanning module, including CLI usage examples and compliance framework mapping

Walkthrough

This PR introduces a complete post-quantum cryptography (PQC) compliance scanner module for OWASP Nettacker. It adds a new pqc_scan module that audits TLS 1.3 and SSH endpoints for PQC algorithm support via passive SSH KEXINIT enumeration and active TLS ClientHello probing, alongside supporting infrastructure for HTTP baseline response matching. The change spans comprehensive design documentation across five design artifacts, core library implementation, YAML module configuration, extensive testing, and user-facing documentation.

Changes

PQC Compliance Scanner Implementation

Layer / File(s) Summary
Research, Design Contracts & Architecture
docs/slo/{idea,research,design}/*
Research dossier with market/tech/legal context; synthesis of implementation direction; five core design documents establishing feature scope, API contracts, architecture, security constraints, threat model, and stack decisions (Python stdlib-based).
PQC Library Implementation
nettacker/core/lib/pqc.py
1041 LOC: SSH passive KEXINIT probing, TLS 1.3 active ClientHello construction (RFC-8446 compliant, empty key_share for HRR signaling), total-function response parsing, verdict classification (pqc_ready/hybrid_only/classical_only/unknown), and operator opt-out via --modules-extra-args pqc_no_active_probe=true.
HTTP Baseline Response Matching
nettacker/core/lib/http.py
Content fingerprinting (length + SHA1), randomized baseline URL generation, differential response matching, and follow-up baseline request flow in HttpEngine.run for false-positive reduction.
PQC Module YAML Configuration
nettacker/modules/scan/pqc.yaml, nettacker/core/module.py
New pqc_scan module with ssh_pqc_scan (ports 22/2222) and tls_pqc_scan (ports 80/443/etc) steps, 5-second timeout, scan_succeeded pass condition; registration in ignored_core_modules.
App URL Parsing
nettacker/core/app.py
Enhanced expand_targets() using urlsplit() to extract hostname/scheme/port/path from explicit URLs; conditionally sets arguments.ports and arguments.schema; skips port-scan when URL ports are explicit.
HTTP Module Updates
nettacker/modules/scan/{admin,dir,pma}.yaml
Admin, directory, and PMA HTTP scanning modules updated with baseline_response: {max_content_length_delta: 64} for baseline matching support.
Unit & Integration Tests
tests/core/lib/{test_http.py,test_pqc.py}, tests/core/test_{app_targets,module_pqc}.py
Comprehensive coverage: SSH/TLS wire format validation, verdict classification, exception safety, resource invariants (FD leaks, connection caps), fake SSH/TLS servers for e2e validation, opt-out mechanism, URL parsing, and module wiring.
End-to-End Smoke Tests
tests/e2e/test_pqc_scan_smoke.py
Network-dependent smoke tests against public PQC-aware endpoints (GitHub SSH, Cloudflare TLS) with reachability checks and conditional skip logic.
User-Facing Documentation
README.md, docs/{Home,Modules,Usage}.md
Feature overview, quick-start CLI/Docker examples, verdict semantics, CI gating patterns, compliance mapping (NIST FIPS 203, OMB M-23-02, CNSA 2.0), and opt-out flag documentation.
Milestone & Process Records
docs/slo/{completion,lessons}/*
M1/M2/M3 completion records with scope, file changes, test coverage, and invariants; lessons learned capturing design decisions, mistakes, and forward-looking rules; four-persona adversarial critique with findings and recommendations.
Runbook & Execution Template
docs/RUNBOOK-pqc-compliance-scanner.md
AI-first development runbook specifying feature purpose, architecture, stable interfaces, milestone contracts with BDD scenarios, bounded-resource rules, and global execution discipline.
Configuration & Cleanup
.gitignore, nettacker/locale/en.yaml, nettacker/modules/vuln/*, tests/test_yaml_schema_and_regex.py
Ignore .venv/ and evidence/; locale string adjustments; vulnerability module regex updates for empty header values; HTTP condition schema extended for baseline/content fields.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • OWASP/Nettacker#1113: Shares skip_service_discovery behavior; this PR implements core URL parsing and skip logic in nettacker/core/app.py while the related PR handles UI/API surface submission of the flag.

Suggested labels

new module, enhancement

Suggested reviewers

  • arkid15r
  • securestep9
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

⚠️ This pull request might be slop. It has been flagged by CodeRabbit slop detection and should be reviewed carefully.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

PR validation failed: No linked issue and no valid closing issue reference in PR description

@github-actions github-actions Bot closed this May 7, 2026
@kerberosmansour
Copy link
Copy Markdown
Author

Superseded by #1545. This first draft was opened before I rebased the branch onto OWASP/Nettacker:master, so it briefly included fork-local history that is not part of the intended review. The clean maintainer-facing PR is #1545.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant