
Commit fe50ea5

ila and claude committed

Add systematic clipping parameter evaluation and composition attack tests

Sweep pac_clip_support threshold, iterative mean-sigma clipping, their combination, composition attacks (up to 100 queries), and power-law skewed data. Key finding: composition breaks unclipped PAC (87% at 100 queries) but hard-zero clipping holds (60% flat). Mean-sigma clipping outperforms level-based clipping for small-group queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 5a24ac6 commit fe50ea5

9 files changed: 1772 additions & 0 deletions

.claude/skills/explain-dp/SKILL.md

Lines changed: 19 additions & 0 deletions
@@ -80,3 +80,22 @@ models. E.g., stronger regularization in ridge regression.

For databases: this suggests that queries producing high-variance outputs (due to outliers, small groups, etc.) are inherently harder to privatize. Clipping reduces variance and thus the noise needed, improving the privacy-utility tradeoff.

### DP vs PAC: Worst-Case vs Instance-Based Sensitivity

DP calibrates noise to **global sensitivity**: the maximum, over ALL possible datasets, of how much the output changes when one row is added or removed. This is a worst-case quantity independent of the actual data.

PAC calibrates noise to the **actual data geometry**: the variance of the query output across subsamples of the real table. Stable queries on stable data get less noise automatically.

The calibration transfer conjecture (Blueprint, April 2026) bridges the two: PAC's instance-based noise from one subsampling distribution D₀, augmented by a small compensation Δ for the spectral gap, transfers to a nearby D₁ (e.g., a different query or a different population). Δ is instance-based (proportional to the actual distributional distance d(D₀,D₁)) and much smaller than DP's global sensitivity. Clipping bounds per-PU influence on the variance, keeping d small. This gives PAC "universal MIA resistance" that degrades gracefully with the effective distributional distance, rather than DP's uniform worst-case bound.
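The worst-case vs instance-based contrast can be illustrated with a toy Monte Carlo sketch. This is not the PAC DB mechanism: `DOMAIN_MAX`, the data distribution, and the rescaled half-sample estimator are illustrative assumptions (the 64-world count mirrors the counters mentioned elsewhere in this commit).

```python
import random
import statistics

random.seed(0)
data = [random.randint(1, 100) for _ in range(1000)]  # a stable, well-behaved table
DOMAIN_MAX = 10**6  # a single row COULD be this large, per the (assumed) schema

# DP-style global sensitivity for SUM: the largest value any row could take,
# over ALL possible datasets -- independent of the actual data.
global_scale = DOMAIN_MAX

# PAC-style instance scale (toy): spread of the rescaled query answer across
# 64 random 50%-subsamples of the REAL table.
def half_sample_sum():
    half = random.sample(data, len(data) // 2)
    return 2 * sum(half)  # rescale the half-sample to estimate the full sum

estimates = [half_sample_sum() for _ in range(64)]
instance_scale = statistics.stdev(estimates)

# On stable data, the instance-based scale is orders of magnitude below the
# worst-case bound, so far less noise is needed.
print(f"global={global_scale}, instance={instance_scale:.0f}")
```

With outlier-heavy data or tiny filtered groups the two scales converge, which is exactly the regime clipping targets.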

.claude/skills/explain-pac/SKILL.md

Lines changed: 36 additions & 0 deletions
@@ -89,6 +89,42 @@ sampling per query** (each query uses a fresh random subset).

- `pac_clip_support`: Minimum distinct contributors per magnitude level (NULL = disabled)
- `pac_hash_repair`: Ensure pac_hash outputs exactly 32 bits set

### Calibration Transfer (Blueprint, April 2026)

PAC calibrates noise per query from the 64 counters' variance. The data is fixed; the "distribution" D is over random 50%-subsamples (the 64 worlds). The calibration transfer conjecture asks when noise calibrated under one subsampling distribution D₀ also protects under a different distribution D₁.

**What D₀ and D₁ represent** (NOT two table versions — the data is fixed):
- Different effective subsampling distributions arising from different queries or different populations. E.g., D₀ = variance profile of a broad query, D₁ = variance profile of a narrow-filter query targeting one PU.
- Or: D₀ = subsampling with Alice present, D₁ = without Alice. The covariance Σ changes because Alice's contribution affects the 64 counters.

**Why narrow-filter attacks succeed**: The noise was calibrated from the full query's variance (D₀). But the attacker's distinguishing task operates on a narrow slice (D₁) where one PU dominates. If d(D₀, D₁) is large, calibration doesn't transfer → the noise is insufficient → the attack succeeds.

**Conjecture**: If d(D₀, D₁) ≤ t, noise Q₀ from D₀ augmented by Δ = N(0, spectral_gap) is valid for D₁. The compensation Δ is instance-based (proportional to the actual distributional distance, not worst-case like DP).

**Connection to clipping**: pac_clip_support bounds per-PU influence on Σ. This keeps d(D₀, D₁) small regardless of the filter → calibration transfers → attacks fail. Clipping is the mechanism that makes the transfer bound tight.

**Open questions**: the optimal distance metric (Wasserstein vs Fisher-Rao), sharp transfer constants, and extending from the continuous (SGD) to the discrete (PAC DB's 64-out-of-128 subsampling) setting.

Reference: "Calibration Transfer Between Close Distributions — Blueprint for Universal Membership Inference Resistance" (working document, 05 April 2026). Thesis: Sridhar, "Toward Provable Privacy for Black-Box Algorithms via Algorithmic Stability" (MIT PhD, February 2026), Chapter 3.
### DDL

```sql
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
===================================================
 COMPOSITION ATTACK: clip=off, filt<=3
 15 trials x 100 queries each
===================================================

--- NQ=1 ---
| truth | mean    | std     |  n |
|-------|---------|---------|---:|
| in    | 1739821 | 6678043 | 13 |
| out   | 6835    | 117410  | 14 |

| best_accuracy |
|---------------|
| 74.1%         |

--- NQ=5 ---
| truth | mean   | std     |  n |
|-------|--------|---------|---:|
| in    | -38831 | 2791251 | 15 |
| out   | 6918   | 39297   | 15 |

| best_accuracy |
|---------------|
| 73.3%         |

--- NQ=10 ---
| truth | mean    | std     |  n |
|-------|---------|---------|---:|
| in    | -182683 | 1878886 | 15 |
| out   | 7183    | 35196   | 15 |

| best_accuracy |
|---------------|
| 70.0%         |

--- NQ=25 ---
| truth | mean   | std     |  n |
|-------|--------|---------|---:|
| in    | 833152 | 1515532 | 15 |
| out   | 16222  | 18137   | 15 |

| best_accuracy |
|---------------|
| 83.3%         |

--- NQ=50 ---
| truth | mean   | std    |  n |
|-------|--------|--------|---:|
| in    | 434678 | 894849 | 15 |
| out   | 18224  | 16098  | 15 |

| best_accuracy |
|---------------|
| 73.3%         |

--- NQ=100 ---
| truth | mean   | std    |  n |
|-------|--------|--------|---:|
| in    | 709941 | 987083 | 15 |
| out   | 17313  | 13044  | 15 |

| best_accuracy |
|---------------|
| 86.7%         |

===================================================
 COMPOSITION ATTACK: clip=2, filt<=3
 15 trials x 100 queries each
===================================================

--- NQ=1 ---
| truth | mean  | std    |  n |
|-------|-------|--------|---:|
| in    | 31260 | 108105 | 11 |
| out   | 4746  | 64576  | 10 |

| best_accuracy |
|---------------|
| 66.7%         |

--- NQ=5 ---
| truth | mean  | std   |  n |
|-------|-------|-------|---:|
| in    | 33914 | 60252 | 15 |
| out   | 7056  | 27635 | 15 |

| best_accuracy |
|---------------|
| 70.0%         |

--- NQ=10 ---
| truth | mean  | std   |  n |
|-------|-------|-------|---:|
| in    | 20631 | 38586 | 15 |
| out   | 5626  | 19507 | 15 |

| best_accuracy |
|---------------|
| 66.7%         |

--- NQ=25 ---
| truth | mean  | std   |  n |
|-------|-------|-------|---:|
| in    | 18989 | 20841 | 15 |
| out   | 15556 | 14957 | 15 |

| best_accuracy |
|---------------|
| 63.3%         |

--- NQ=50 ---
| truth | mean  | std   |  n |
|-------|-------|-------|---:|
| in    | 16671 | 14026 | 15 |
| out   | 12200 | 11412 | 15 |

| best_accuracy |
|---------------|
| 66.7%         |

--- NQ=100 ---
| truth | mean  | std   |  n |
|-------|-------|-------|---:|
| in    | 17200 | 9530  | 15 |
| out   | 15816 | 11039 | 15 |

| best_accuracy |
|---------------|
| 60.0%         |

attacks/clip_composition_test.sh

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
#!/usr/bin/env bash
# Composition attack: does averaging many queries break clipping?
#
# METHODOLOGY:
# Run N independent queries (each with a different pac_seed) on the same data.
# The attacker averages the N results. Noise decreases as 1/sqrt(N) but the
# outlier signal stays constant. With enough queries, noise → 0 and the signal
# should be detectable.
#
# For each NQ (number of queries), we compute the average over those queries
# per trial, then find the best classification threshold across trials.
# 50% = random, 100% = perfect attack.
set -euo pipefail

DUCKDB="/home/ila/Code/pac/build/release/duckdb"
PAC_EXT="/home/ila/Code/pac/build/release/extension/pac/pac.duckdb_extension"

N=1000; TV=999999; MI=0.0078125; FILT=3; NT=15

run_sum() {
    local cond=$1 seed=$2 clip=$3
    local insert=""
    [ "$cond" = "in" ] && insert="INSERT INTO users VALUES (0, ${TV});"
    local clip_sql=""
    [ "$clip" != "off" ] && clip_sql="SET pac_clip_support = ${clip};"
    $DUCKDB -noheader -list 2>/dev/null <<SQL
LOAD '${PAC_EXT}';
CREATE TABLE users(user_id INTEGER, acctbal INTEGER);
INSERT INTO users SELECT i, ((hash(i*31+7)%10000)+1)::INTEGER FROM generate_series(1,${N}) t(i);
${insert}
ALTER TABLE users ADD PAC_KEY(user_id);
ALTER TABLE users SET PU;
SET pac_mi = ${MI};
SET pac_seed = ${seed};
${clip_sql}
SELECT SUM(acctbal) FROM users WHERE user_id <= ${FILT} OR user_id = 0;
SQL
}

MAX_NQ=100  # max queries per trial

for CLIP in off 2; do
    echo "==================================================="
    echo " COMPOSITION ATTACK: clip=${CLIP}, filt<=${FILT}"
    echo " ${NT} trials x ${MAX_NQ} queries each"
    echo "==================================================="
    echo ""

    # Collect all queries upfront
    IN_F=$(mktemp); OUT_F=$(mktemp)
    for trial in $(seq 1 $NT); do
        for q in $(seq 1 $MAX_NQ); do
            s=$((trial * 10000 + q))
            echo "in,${trial},${q},$(run_sum in $s $CLIP)" >> "$IN_F"
            echo "out,${trial},${q},$(run_sum out $s $CLIP)" >> "$OUT_F"
        done
        echo "  trial ${trial}/${NT} done" >&2
    done

    # Analyze at different NQ cutoffs
    for NQ in 1 5 10 25 50 100; do
        echo "--- NQ=${NQ} ---"
        $DUCKDB -markdown <<SQL
CREATE TABLE raw AS
SELECT split_part(c,',',1) AS truth,
       TRY_CAST(split_part(c,',',2) AS INTEGER) AS trial,
       TRY_CAST(split_part(c,',',3) AS INTEGER) AS qid,
       TRY_CAST(split_part(c,',',4) AS DOUBLE) AS v
FROM (
  SELECT column0 AS c FROM read_csv('${IN_F}',columns={'column0':'VARCHAR'},header=false)
  UNION ALL
  SELECT column0 FROM read_csv('${OUT_F}',columns={'column0':'VARCHAR'},header=false)
) WHERE split_part(c,',',4) != '';

-- Average first NQ queries per trial
WITH avgs AS (
  SELECT truth, trial, AVG(v) AS v
  FROM raw WHERE qid <= ${NQ} AND v IS NOT NULL
  GROUP BY truth, trial
)
SELECT truth, printf('%.0f', AVG(v)) AS mean, printf('%.0f', STDDEV(v)) AS std, COUNT(*) AS n
FROM avgs GROUP BY truth ORDER BY truth;

-- Best threshold classifier on averaged values
WITH avgs AS (
  SELECT truth, trial, AVG(v) AS v
  FROM raw WHERE qid <= ${NQ} AND v IS NOT NULL
  GROUP BY truth, trial
),
ths AS (SELECT UNNEST(generate_series(
    (SELECT (MIN(v))::BIGINT FROM avgs),
    (SELECT (MAX(v))::BIGINT FROM avgs),
    GREATEST(1, ((SELECT MAX(v)-MIN(v) FROM avgs)/50)::BIGINT)
  )) AS t),
accs AS (
  SELECT t, 100.0*SUM(CASE
      WHEN truth='in' AND v > t THEN 1 WHEN truth='out' AND v <= t THEN 1
      ELSE 0 END)::DOUBLE / COUNT(*) AS acc
  FROM avgs, ths GROUP BY t
)
SELECT printf('%.1f%%', MAX(acc)) AS best_accuracy FROM accs;
SQL
        echo ""
    done
    rm -f "$IN_F" "$OUT_F"
done
