[Draft preprint · April 2026 · not yet published]

Input-Level Differential Privacy for SHAP Values Is Fundamentally Expensive

If a bank runs a credit model on your encrypted data and sends you back a SHAP explanation of the decision, the explanation can often be reverse-engineered back into your original input. The textbook fix — adding differential-privacy noise — does not work at the privacy budgets anyone actually uses. This draft explains why, by how much, and what to publish instead.

Author: Bader Alissaei
Affiliation: VaultBytes Innovations Ltd
Status: Draft preprint
Topics: Differential Privacy · SHAP · FHE · Explainable AI
Keywords: Coalition Sensitivity · Reconstruction · Gaussian Mechanism · Report-Noisy-Max

The setting, in plain terms

A bank holds a model. A client wants a prediction. The client does not want the bank to see their financial details. The fix is fully homomorphic encryption. The new problem is what happens after the prediction — specifically, what happens when the client asks "why?".

A bank, hospital, or insurer holds a machine-learning model — say, a credit-default classifier. A client sends an encrypted feature vector x: age, income, debt-to-income, number of open accounts. Fully homomorphic encryption (FHE) lets the server evaluate the model over the ciphertext and return an encrypted prediction f(x). The server never sees the plaintext. Good.

Modern deployments go further and also return a SHAP explanation — a per-feature score φ₁, φ₂, …, φ_d telling the client which of their features drove the decision. This is often a regulatory requirement: the client has to be able to audit, dispute, or appeal the decision. The server computes the explanation under FHE; the client decrypts it.

Here is the problem. If the client publishes the explanation — shows it to a regulator, attaches it to a complaint, uploads it to a portal — an attacker who sees it can often reconstruct most of x, because φ is a function of x. Published SHAP vectors leak. The standard response is to add differential-privacy noise before releasing φ. This paper shows that fix has no good operating point.

// End-to-end flow · where the leak happens
CLIENT holds plaintext x → sends enc(x) → SERVER evaluates under FHE: f(x) (prediction) and φ(x) (SHAP explanation), ciphertext throughout → CLIENT decrypts φ(x) and publishes it → ATTACKER reconstructs x from φ.
The encryption is not the problem. The leak is in what the client voluntarily publishes after decrypting: φ(x) is a rich function of x, so sharing the explanation is enough to reverse-engineer the input.

Why DP noise fails here

Differential privacy works by adding just enough noise to hide whether the input was x or a neighbour x'. The cost depends on how much the output can change when one feature changes. For SHAP, that quantity grows with the square root of K — the number of coalition queries SHAP runs internally.

// How kernel SHAP is actually computed

SHAP does not read x directly into a single scoring function. It builds K different coalitions — each a random subset S ⊆ {1…d} of features — evaluates the model on each masked input x_S (keeping features in S, replacing the rest with a baseline b), and solves a weighted regression on the results to allocate credit per feature.

// Kernel SHAP φ(x)  =  (Zᵀ W Z)⁻¹ Zᵀ W  ·  y(x)  =  R · y(x)
where  y(x)ₖ = f(x_{Sₖ}),  K coalition queries,  Z ∈ {0,1}^{K×d},  W diagonal kernel weights
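A minimal sketch of this pipeline, under simplifying assumptions: uniform coalition weights instead of the exact Shapley kernel, a zero baseline, and a toy linear model. All names here are illustrative, not the paper's code.

```python
import numpy as np

def kernel_shap_sketch(f, x, baseline, K=200, seed=None):
    """Sketch of the regression above: phi(x) = (Z^T W Z)^{-1} Z^T W y(x) = R y(x).
    Uniform weights W = I stand in for the Shapley kernel (an assumption of
    this sketch, not the real kernel-SHAP weighting)."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # K coalitions: each row z_k in {0,1}^d marks the features kept from x
    Z = rng.integers(0, 2, size=(K, d))
    # masked inputs x_S: keep features in S, replace the rest with the baseline
    X_masked = Z * x + (1 - Z) * baseline
    y = np.array([f(row) for row in X_masked])   # K coalition queries
    R = np.linalg.pinv(Z.T @ Z) @ Z.T            # regression operator R (W = I)
    return R @ y                                 # per-feature credit

# toy check: for a linear model with zero baseline, this regression
# recovers the coefficients exactly (a property of the sketch)
coef = np.array([1.0, -2.0, 0.5])
f = lambda v: v @ coef
phi = kernel_shap_sketch(f, x=np.ones(3), baseline=np.zeros(3), K=400, seed=0)
```

The toy model makes the structure visible: y(x) is linear in the masked inputs, and the regression allocates the credit per feature.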

// The √K amplification

Changing one feature of x by 1 changes each masked input x_{S_k} by 1 only in coalitions that include that feature — roughly K/2 of them. So a single-feature change perturbs Θ(K) entries of y(x), each by a bounded amount. When you stack those perturbations into a vector and take its L₂ norm, you get √K growth. The regression matrix R redistributes that across the d output coordinates but does not shrink it.
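The √K growth is easy to check numerically with a toy additive model and zero baseline (names and constants here are illustrative):

```python
import numpy as np

def f(v):
    # toy additive model: unit weight per feature
    return v.sum()

rng = np.random.default_rng(0)
d = 20
for K in (100, 400, 1600):
    Z = rng.integers(0, 2, size=(K, d))
    x = np.zeros(d)
    x2 = x.copy()
    x2[0] = 1.0                                  # one-feature neighbour: flip feature 0
    y = np.array([f(z * x) for z in Z])          # y_k = f(x_{S_k}), zero baseline
    y2 = np.array([f(z * x2) for z in Z])
    dy = y2 - y                                  # non-zero only where the coalition keeps feature 0
    # ~K/2 entries are touched, each by 1, so ||dy||_2 tracks sqrt(K/2)
    print(K, np.linalg.norm(dy), np.sqrt(K / 2))
```

The printed norm hugs √(K/2) at every scale, which is exactly the Θ(√K) amplification the text describes.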

// Coalition amplification · why sensitivity grows as √K
STEP 1 · INPUT — flip one feature j (Δx = e_j, one-feature adjacency) and evaluate f on the K masked inputs.
STEP 2 · K COALITION QUERIES — each entry of y is f(x_S) for one coalition S ⊂ {1..d}; the ~K/2 coalitions that contain j are touched (Δy_k ≠ 0), the rest are unchanged.
STEP 3 · STACK & TAKE L₂ NORM — ‖Δy‖₂ = Θ(√K): ~K/2 non-zero entries, each O(1). The regression operator R does not shrink the norm.
STEP 4 · SHAP SENSITIVITY — ‖Δφ‖₂ = Θ(√K / σ_d(Z)); the DP noise σ must match it, ≈ 20× larger than a typical |φ| at K = 400.
Flipping one feature of x perturbs roughly K/2 of the K coalition queries. Stacking those perturbations into a vector and taking its L₂ norm gives Θ(√K) growth. The regression operator R = (ZᵀWZ)⁻¹ZᵀW redistributes across d coordinates without shrinking the norm. The Gaussian-mechanism DP noise must absorb that entire Θ(√K/ε) budget — for K ≈ 400 that is about 20× larger than a typical SHAP value.

// The impossibility, one sentence

Under the Gaussian mechanism with global or smooth sensitivity, there is no (ε, δ) with ε ≤ 5 and SNR ≥ 1 — for linear, tree-ensemble, or shallow-MLP models — at any K ≥ 100. The noise overwhelms the signal at every privacy budget tight enough to resist the reconstruction attack.

The numbers

Plugging the sensitivity into the Gaussian mechanism and varying the privacy budget.

Θ(√K/ε) · Gaussian-mechanism noise floor (Theorem 1)
K ≈ 400 · typical coalition count in encrypted SHAP
SNR < 0.15 · at ε = 20 with aggressive clipping
Tree ensembles vs logistic regression · model classes compared
2 datasets · UCI Adult (d ≈ 87), German Credit (d = 24)

// Signal-to-noise across privacy budgets

Privacy budget ε | Model class   | Gaussian σ (global sens.) | Median |φ| | SNR  | Verdict
1                | logistic      | ~3.2                      | ~0.04      | 0.01 | explanation destroyed
5                | logistic      | ~0.7                      | ~0.04      | 0.06 | explanation destroyed
10               | tree-ensemble | ~1.4                      | ~0.08      | 0.06 | explanation destroyed
20               | logistic      | ~0.35                     | ~0.04      | 0.12 | nominal DP only
20               | shallow-MLP   | ~0.5                      | ~0.05      | 0.10 | nominal DP only

SNR = median SHAP magnitude / Gaussian σ. Any useful explanation needs SNR ≥ 1. Every row above falls short by roughly an order of magnitude or more — and these are the favourable rows, using smooth sensitivity and aggressive clipping.
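The calibration behind these rows can be sketched with the classical Gaussian mechanism; the clipping bound c below is a placeholder, not the paper's measured constant, so the absolute σ values will not match the table — only the SNR ≪ 1 pattern will.

```python
import numpy as np

def gaussian_sigma(l2_sensitivity, eps, delta=1e-5):
    # classical Gaussian-mechanism calibration:
    # sigma = Delta_2 * sqrt(2 ln(1.25/delta)) / eps
    return l2_sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps

# illustrative constants: assumed per-query clipping bound c, K coalition
# queries, so Delta_2 ~ c * sqrt(K) (the Theta(sqrt K) amplification)
c, K, median_phi = 0.05, 400, 0.04
for eps in (1, 5, 10, 20):
    sigma = gaussian_sigma(c * np.sqrt(K), eps)
    print(eps, round(sigma, 3), round(median_phi / sigma, 4))  # eps, sigma, SNR
```

Even at ε = 20 the SNR stays below 1, which is the whole point of the table: there is no budget where the noise is both protective and ignorable.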

The reconstruction attack

Sanity check: at the budgets where the explanation is still useful, an attacker really can recover the input. At the budgets where the attack fails, the explanation is already noise.

We trained ridge, kernel-ridge, and 2-layer MLP attackers to predict the standardised input x from the released φ. Data: UCI Adult (d ≈ 87 after one-hot), German Credit (d = 24). K = 200 coalitions. Reported: MSE in standardised-feature units, lower is better; random baseline is 1.0.

Privacy budget ε | Attacker | Adult MSE | German MSE | Interpretation
∞ (no DP)        | ridge    | 0.12      | 0.15       | strong recovery
∞ (no DP)        | MLP      | 0.09      | 0.13       | strong recovery
10               | ridge    | 0.62      | 0.68       | partial recovery
5                | ridge    | 0.91      | 0.94       | marginal · explanation also gone
1                | ridge    | ~1.0      | ~1.0       | no leak · no signal either
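The shape of the ridge attacker is simple enough to sketch on synthetic data — a random linear map stands in for φ(x), and the noise level stands in for the DP release; this is an illustration, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 24
X = rng.standard_normal((n, d))            # standardised inputs (random baseline MSE = 1.0)
A = rng.standard_normal((d, d)) / np.sqrt(d)
Phi = X @ A                                # stand-in for phi(x): a rich function of x
lam = 1e-2                                 # ridge penalty

# ridge regression Phi -> X in closed form: W = (Phi^T Phi + lam I)^{-1} Phi^T X
W = np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ X)
mse = np.mean((Phi @ W - X) ** 2)          # near 0: a noiseless release leaks x

# DP-style release: noise comparable to the signal destroys recovery
Phi_noisy = Phi + rng.standard_normal(Phi.shape)
W2 = np.linalg.solve(Phi_noisy.T @ Phi_noisy + lam * np.eye(d), Phi_noisy.T @ X)
mse_noisy = np.mean((Phi_noisy @ W2 - X) ** 2)
```

The two regimes in the table fall out directly: noiseless explanations reconstruct the input almost exactly, and signal-level noise pushes the attacker back toward the random baseline.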

What you can publish instead

If releasing cardinal SHAP values under DP is impossible, what can you release? Four constructions, all with tighter sensitivity than the raw vector.

01

Report-noisy-max (top-1)

Publish only the identity of the top-ranked feature. Sensitivity depends on the gap between the top and second SHAP value, not on the absolute magnitudes. Roughly √K cheaper than cardinal release.

02

Hybrid: top-1 index + noisy magnitude

Release the top-1 feature via report-noisy-max, plus its DP-noised magnitude as a single scalar. One noisy scalar, not a full vector — the √K is gone.

03

Sparse thresholded release

Release only features whose |φ| exceeds a threshold, under propose-test-release. For models where most features contribute near zero, the effective sensitivity is the sensitivity of the subset, not of the full d-dimensional vector.

04

Exponential-mechanism ranking

Release the top-t ranking with the exponential mechanism over permutations. Useful as a baseline but absolute budgets for t > 1 are still large at realistic SHAP gaps — more a research direction than a deployment recipe.
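As a concrete sketch of construction 01, report-noisy-max over per-feature magnitudes can look like this. The Laplace scale 2Δ/ε is the textbook calibration; the per-coordinate sensitivity bound passed in is an assumed placeholder.

```python
import numpy as np

def report_noisy_max(phi, eps, coord_sensitivity, seed=None):
    """Release only the index of the top-ranked feature.
    Adds Laplace(2 * coord_sensitivity / eps) noise to each |phi_j| and
    returns the argmax — never the magnitudes themselves."""
    rng = np.random.default_rng(seed)
    scores = np.abs(phi)
    noisy = scores + rng.laplace(scale=2 * coord_sensitivity / eps, size=scores.shape)
    return int(np.argmax(noisy))

# toy SHAP vector: feature 1 dominates, with a wide gap to the runner-up
phi = np.array([0.02, -0.41, 0.05, 0.11])
top = report_noisy_max(phi, eps=5.0, coord_sensitivity=0.05, seed=0)
```

The accuracy of the release is governed by the gap between the top two magnitudes relative to the Laplace scale, which is exactly why the sensitivity is so much cheaper than the full cardinal vector.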

// The one-line recommendation

If you have to release explanations from FHE inference under DP, release ranks, not values. Release top-1, not top-d. Pair DP with access control, not instead of it.

Lower bound · proof sketch + open gap

Upper bounds tell you the Gaussian mechanism can't do better than Θ(√K/ε). Lower bounds would tell you no mechanism can. We have a sketch, not a theorem — with one gap resolved and one gap that now looks like it invalidates the √K part of the bound entirely.

// Proposition (lower bound, sketch · status: weakened)

// What the fingerprinting argument would give
Any (ε, δ)-DP mechanism that releases φ̃ with ‖φ̃ − φ(x)‖₂ ≤ α must satisfy
α  =  Ω( L₀ √(Kd) / (ε + log(1/δ)) )
…but the √K factor is likely spurious. See Gap 2 below.

// The packing + fingerprinting argument

The sketch adapts Bun–Ullman–Vadhan's fingerprinting lower bounds. Build a packing {x_u} indexed by sign vectors u ∈ {±1}^d such that Hamming-adjacent packing points have ‖φ(x_u) − φ(x_u')‖₂ = Ω(L₀ √(K/d)). Then:

A

Construct packing

Choose 2^d points so that adjacent ones differ in exactly one coordinate, with SHAP-vector separation growing as √(K/d) per coordinate.

B

Decode via accuracy

Any mechanism with error α ≪ L₀√(K/d) lets a codebook decoder recover u from φ̃. Accurate release ⇒ recoverable identity.

C

Contradict group privacy

DP's group-privacy blowup at distance d bounds the probability of correct recovery. Steinke–Ullman's tracing argument gives the contradiction.

D

Aggregate across d

Summing the d coordinates of u gives the √d factor. Combined with the √K from the packing separation, the sketch would yield Ω(√(Kd)/ε) — if the packing separation survived.

// Gap 1 · rank condition on Z   — resolved

Step B needs the coalition matrix Z to be well-conditioned so that changes in y translate into changes in φ without cancellation. We prove a matrix-Chernoff bound: for K ≥ 32 d ln(d/η), the smallest singular value satisfies σ_d(Z)² ≥ K/8 with probability ≥ 1 − η.

// Lemma (rank condition · uniform Bernoulli sampling)
K ≥ 32 d ln(d/η)   ⇒   Pr[ σ_d(Z)² ≥ K/8 ] ≥ 1 − η
Proof via matrix Chernoff (Tropp 2012). The population matrix 𝔼[zzᵀ] = ¼I + ¼11ᵀ has λ_min = 1/4 on 1^⊥.

At our experimental scale (d = 87, K = 200) the Chernoff threshold is K* ≈ 18,900, so the quantitative bound doesn't bind. We verify empirically instead: σ_d(Z) measured on 100 draws has median 2.53, min 2.22, max 2.76 — comfortably bounded. Full rank holds almost surely by coupon-collector at Pr ≥ 1 − 8.4·10⁻³³.
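The empirical check is easy to reproduce. A minimal sketch at the draft's experimental scale, with i.i.d. Bernoulli(1/2) coalition sampling (the paper's exact sampler may differ):

```python
import numpy as np

# Sample Z at the experimental scale (d = 87, K = 200) and measure the
# smallest singular value over repeated draws.
rng = np.random.default_rng(0)
d, K, draws = 87, 200, 100
sigmas = []
for _ in range(draws):
    Z = rng.integers(0, 2, size=(K, d)).astype(float)
    sigmas.append(np.linalg.svd(Z, compute_uv=False)[-1])  # sigma_d(Z)
sigmas = np.array(sigmas)
# sigma_d(Z) should be bounded well away from 0 on every draw, matching
# the draft's observation that the quantitative Chernoff bound is loose
# at this scale but the rank condition holds comfortably
print(np.median(sigmas), sigmas.min(), sigmas.max())
```

The exact summary statistics depend on the seed and sampler, but σ_d(Z) stays bounded away from zero on every draw, which is all Step B of the sketch needs.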

// Gap 2 · the √K cancellation   — existence proof for a better mechanism

This is the positive result hiding inside a "failed" lower bound. The argument assumed ‖RΔy‖₂ inherits the √K growth from ‖Δy‖₂. Closer analysis shows the opposite: for additive and threshold-stump models, ‖RΔy‖₂ = Θ(1) independent of K.

Why this matters: it is a constructive proof that a smarter mechanism than Gaussian exists in principle. The Gaussian mechanism pays the √K upfront because it noises y before the regression. But the SHAP output itself doesn't have a √K sensitivity — the regression step erases it. A mechanism that noises after the regression, or uses the algebraic structure of R, could avoid the amplification entirely. We haven't built that mechanism yet; but we now know the search space is non-empty.

// Lemma (column cancellation · additive + stump models)
‖Δy‖₂ = Θ(√K)   but   ‖R · Δy‖₂ = Θ(1)
because (ZᵀWZ)⁻¹ZᵀW acting on columns of Z exactly cancels the √K growth.
// What the fingerprinting route actually gives vs. what we'd want
DESIRED BOUND — α = Ω(√(Kd)/ε): would match the Gaussian upper bound and close the impossibility.
SENSITIVITY ROUTE — α = Ω(√d/ε): the √K factor cancels for additive and stump models.
GAP (open) — there may exist a mechanism that beats the Gaussian mechanism by up to a factor of √K.
The √K factor in the conjectured lower bound was expected to survive the regression step. It doesn't — at least for additive and stump models. The true lower bound for cardinal kernel-SHAP under DP may be substantially below the Gaussian-mechanism upper bound, leaving room for a better mechanism.

// Net status · the good news

Upper bound (Gaussian mechanism must use σ = Θ(√K/ε)): tight for that mechanism class, hence the impossibility under Gaussian.

Lower bound against all DP mechanisms: only Ω(√d/ε) is established. The conjectured Ω(√(Kd)/ε) is provably unreachable via the sensitivity route.

Constructive consequence: the column-cancellation lemma is itself evidence that a DP mechanism with accuracy o(√K/ε) exists for cardinal kernel SHAP. The √K amplification is an artefact of the Gaussian mechanism's "noise-y-first" structure, not of the SHAP-release problem itself. Finding an explicit construction — e.g. a mechanism that noises in the post-regression basis or exploits R's structure — is the main open constructive question this paper leaves on the table.


Conclusion

The short version: when a client does encrypted SHAP and wants to publish the result, the right design choice is almost never "slap Gaussian noise on it." It is one of three alternatives — and the formal "impossibility" is narrower than earlier versions of this draft claimed.

Under global or smooth sensitivity, adding DP noise to a full-vector cardinal SHAP release is not a standalone mitigation at conventional ε. The √K sensitivity of the kernel-SHAP regression forces noise orders of magnitude larger than the SHAP values themselves at any privacy budget worth claiming. Clipping and smooth sensitivity shrink the constants by up to 5× but preserve the rate.

// Three design choices

1

Don't publish

FHE already protects the input from the server. If the client consumes the explanation internally — e.g. displays it in a local dashboard, never exports it — no further privacy machinery is needed. Most deployments should start here.

2

Publish a ranking, not numbers

Report-noisy-max on per-coordinate sensitivity is roughly √K cheaper than cardinal DP-SHAP but still needs ε in the tens at realistic gaps (ε ≈ 70 at gap 0.1 on logistic Adult under the Shapley-weighted constant c_W ≈ 148). Ranking is the right axis — not a small-ε rescue.

3

DP as one defence among several

Aggressive clipping at ε = 10–20 makes reconstruction harder without qualifying as DP in any formal sense. Combine it with rate-limiting, access control, and FHE to raise the overall bar. Don't claim formal DP at these budgets.


// In one line

Don't publish cardinal SHAP under DP. Publish rankings, or nothing at all, and use DP — if at all — as one layer of a defence-in-depth stack, not as the headline guarantee.

Cite as (draft)

// BibTeX-ready
@misc{alissaei2026shapdp,
  author = {Alissaei, Bader},
  title  = {Input-Level Differential Privacy for {SHAP} Values Is Fundamentally Expensive},
  year   = {2026},
  note   = {Draft preprint, VaultBytes Innovations Ltd. Available on request from the author.}
}