Security Research

Independent cryptography research — analysing known vulnerabilities, building provable fixes, and publishing everything open access.

New · FHE · CKKS · Kernel SHAP · Explainability · BSGS
April 2026

BHDR: A 43× Rotation Reduction for Encrypted Kernel SHAP

BHDR stands for BSGS-Hoisted Diagonal Regression (BSGS = Baby-Step Giant-Step). It runs Kernel SHAP under CKKS fully homomorphic encryption on a single server, non-interactively: the client's input, the prediction, and the explanation all stay encrypted on the server. The closing weighted-least-squares regression φ = (ZᵀWZ)⁻¹ ZᵀW · y took 7.7s, which was 84% of the algorithm-level pipeline cost. The regression matrix depends only on public coalition sampling parameters, so we precompute it offline in plaintext and evaluate the encrypted matrix-vector product with a BSGS-hoisted diagonal transform over a K′-periodic replicate encoding. The rotation count falls from about 2,200 to 51 at d = 50 (43× fewer), and the regression step drops to 0.60s on Apple M1. The deployed end-to-end pipeline at d = 50 measures p50 13.4s, p95 16.3s, p99 24.2s over Nq = 300 UCI Adult queries on M1, with 0/300 silent CKKS overflows after the input-guard fix. The technical report is on Zenodo (DOI 10.5281/zenodo.19889993).

43× rotation reduction vs. production BSGS
51 total rotations for full d = 50 regression
13.4 s deployed end-to-end p50 (Nq = 300 UCI Adult, M1)
0 / 300 silent CKKS overflows post input-guard fix
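
The baby-step giant-step diagonal evaluation mentioned in the summary above can be illustrated on plaintext vectors. The sketch below is not the BHDR implementation: it models a ciphertext rotation as a cyclic list rotation, ignores hoisting and the K′-periodic replicate encoding, and uses hypothetical helper names (`rot`, `diagonals`, `matvec_naive`, `matvec_bsgs`). It only shows why splitting the rotation index k = g·n1 + b lets a known plaintext matrix be applied with roughly n1 + n/n1 rotations of the encrypted vector instead of n − 1.

```python
def rot(v, k):
    """Cyclic left rotation by k slots (plaintext stand-in for a CKKS rotation)."""
    k %= len(v)
    return v[k:] + v[:k]

def diagonals(M):
    """Generalised diagonals: diag_k[i] = M[i][(i + k) mod n]."""
    n = len(M)
    return [[M[i][(i + k) % n] for i in range(n)] for k in range(n)]

def mul_add(acc, d, x):
    """Slot-wise acc + d * x."""
    return [a + di * xi for a, di, xi in zip(acc, d, x)]

def matvec_naive(M, v):
    """Diagonal method: one rotation of v per non-zero shift."""
    acc, rotations = [0] * len(v), 0
    for k, d in enumerate(diagonals(M)):
        term = v
        if k:
            term = rot(v, k)
            rotations += 1
        acc = mul_add(acc, d, term)
    return acc, rotations

def matvec_bsgs(M, v, n1):
    """BSGS diagonal method with baby-step count n1 (assumed to divide n)."""
    n = len(v)
    assert n % n1 == 0
    diags, rotations = diagonals(M), 0
    # Baby steps: rotate the (encrypted) vector by 0 .. n1-1 once and reuse.
    baby = [v]
    for b in range(1, n1):
        baby.append(rot(v, b))
        rotations += 1
    acc = [0] * n
    for g in range(n // n1):
        inner = [0] * n
        for b in range(n1):
            # The matrix is public, so this pre-rotation of its diagonal
            # happens offline in plaintext and costs no ciphertext rotation.
            d = rot(diags[g * n1 + b], -g * n1)
            inner = mul_add(inner, d, baby[b])
        if g:
            # Giant step: one rotation per group of n1 diagonals.
            inner = rot(inner, g * n1)
            rotations += 1
        acc = [a + t for a, t in zip(acc, inner)]
    return acc, rotations
```

For a 64 × 64 matrix with n1 = 8, `matvec_naive` counts 63 rotations and `matvec_bsgs` counts 14. Both return the same product. These toy counts are not the 2,200 and 51 reported above, which depend on the encoding and hoisting details in the technical report.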
HQC · Power Analysis · Reed-Muller · Constant-Time
April 2026

PermNet-RM: A Constant-Time Reed-Muller Encoder for HQC via GF(2) Zeta-Transform Butterfly Decomposition

Every HQC implementation ships the same Reed-Muller encoder. It leaks the full 128-bit encapsulation message with 96.9% success from a single power trace on ARM Cortex-M4 (Jeon et al., 2026/071; 5,000 total profiling + evaluation traces). The root cause is algorithmic: the encoder tests each message bit to decide which generator rows to include. Any implementation of that algorithm will leak. PermNet-RM reformulates RM(1,m) encoding as the GF(2) zeta transform of a fixed indicator vector, computed via a fixed-topology butterfly network with per-stage compiler barriers. Message bits enter as initial register state and are never read again. ELMO power simulation on Cortex-M0: 9.1× mean signal reduction vs BIT0MASK; 11.1× peak / 31× bit-6 reduction under the shared-output Boolean-masked d=1 variant. Drop-in replacement for reed_muller_encode().

0 cyc timing spread across all 256 inputs
21 ns total overhead per HQC-128 encapsulation
6 GCC optimisation levels confirmed branch-free
100% exhaustive correctness verification (256 + 512 inputs)
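
The zeta-transform view of first-order Reed-Muller encoding described above can be checked with a short model. This is a sketch under stated assumptions, not the PermNet-RM code: the bit ordering (bit 0 as the constant term, bits 1..m as the linear terms), the function names, and the use of a Python list as the "register state" are all illustrative, and Python offers no timing or power guarantees. It shows that seeding the message bits at fixed positions and running a butterfly whose access pattern depends only on indices reproduces the RM(1,m) codeword.

```python
def rm1_encode_reference(msg, m=7):
    """Truth table of the affine function c(x) = b0 XOR <(b1..bm), x> over GF(2)."""
    word = 0
    for x in range(1 << m):
        linear = bin((msg >> 1) & x).count("1") & 1
        word |= ((msg & 1) ^ linear) << x
    return word

def rm1_encode_butterfly(msg, m=7):
    """RM(1,m) encoding as a GF(2) zeta transform (assumed bit layout)."""
    n = 1 << m
    # Seed: message bits become the initial state at fixed positions
    # (index 0 and the powers of two); they are not inspected afterwards.
    a = [0] * n
    a[0] = msg & 1
    for i in range(m):
        a[1 << i] = (msg >> (i + 1)) & 1
    # Butterfly: m stages with a fixed topology. The condition below tests
    # the index x, never the data held in a[].
    for i in range(m):
        step = 1 << i
        for x in range(n):
            if x & step:
                a[x] ^= a[x ^ step]
    return sum(bit << x for x, bit in enumerate(a))
```

With m = 7 the two functions agree on all 256 one-byte messages, which mirrors the exhaustive check reported in the figures above. In a real encoder the list would be packed into machine words and each stage would become a fixed shift, mask, and XOR.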
HQC · Compilers · Timing Side-Channel · Post-Quantum
April 2026

We Found a 4-Year-Old Security Hole in Post-Quantum Encryption — and a Fix That Also Makes It 3× Faster

Every version of Clang released since June 2022 silently transforms constant-time post-quantum code into timing-leaky binaries. The responsible component is x86-cmov-converter inside the LLVM x86 backend: it detects the BIT0MASK pattern, decides a conditional jump would be faster, and emits one — branching on a secret key bit. Confirmed across 9 Clang versions, 20 compiler/platform combinations, on Linux and Windows. A single build flag fixes all of them. It also makes the code 3.07× faster, because the "optimisation" was causing millions of branch mispredictions per operation.

9 Clang versions tested (14 → 22)
20 compiler/platform combinations, all leaky
4 yrs scope of the vulnerability
3.07× speed improvement from the fix
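
The BIT0MASK pattern named above can be illustrated as arithmetic. The sketch assumes the usual shape of the idiom, in which a secret bit is expanded into an all-ones or all-zeros 32-bit word and ANDed with a generator row. The names `bit0mask`, `select_masked`, and `select_branchy` are hypothetical, and Python only models the logic; it does not reproduce compiler behaviour or timing. The two functions return the same value, which is why a compiler may legally rewrite the first into the second, and the second branches on the secret bit.

```python
MASK32 = 0xFFFFFFFF

def bit0mask(x):
    """Expand the low bit of x into 0x00000000 or 0xFFFFFFFF (assumed idiom)."""
    return (-(x & 1)) & MASK32

def select_masked(bit, row, acc):
    """Branch-free form: the row is always read and XORed under a mask."""
    return acc ^ (row & bit0mask(bit))

def select_branchy(bit, row, acc):
    """Functionally identical form with a data-dependent branch."""
    if bit & 1:
        return acc ^ row
    return acc
```

Because the two forms agree for every input, functional tests cannot distinguish them. Only inspecting the emitted machine code or measuring timing shows which one a given build contains.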