Score-Debiased Kernel Density Estimation (SD-KDE)

TL;DR

Problem setting

KDE estimates an unknown density from samples, but its accuracy is sensitive to the choice of bandwidth h. The paper asks whether score estimates can reduce KDE bias without a large increase in variance.

Key idea

Use the score to "undo" part of KDE's smoothing bias:

  1. Shift each sample once in the direction of the estimated score.
  2. Run standard KDE on the shifted samples.

This is the core "shift-then-smooth" idea.

Method (as described)

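Inputs:

  - Samples x_1, ..., x_n drawn from the unknown density.
  - A bandwidth h.
  - A score estimate \hat{s}, i.e. an estimate of the gradient of the log-density.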

Algorithm:

  1. Shift each point by a small step proportional to h^2 times the score.
  2. Evaluate KDE on the shifted points.

The shift step is:

\tilde{x}_i = x_i + \frac{h^2}{2}\,\hat{s}(x_i)
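
The shift-then-smooth procedure can be sketched in a few lines of NumPy. This is a minimal one-dimensional illustration, not the paper's implementation: the function names (`gaussian_kde_eval`, `sd_kde_eval`, `oracle_score`), the Gaussian kernel, the standard normal target, and the values of n and h are all assumptions chosen for the example. The standard normal is used only because its exact score, s(x) = -x, is known in closed form.

```python
import numpy as np

def gaussian_kde_eval(query, samples, h):
    """Evaluate a 1D Gaussian-kernel KDE with bandwidth h at the query points."""
    diffs = (query[:, None] - samples[None, :]) / h
    kernel_sum = np.exp(-0.5 * diffs**2).sum(axis=1)
    return kernel_sum / (len(samples) * h * np.sqrt(2.0 * np.pi))

def sd_kde_eval(query, samples, h, score_fn):
    """Shift each sample by (h^2 / 2) * score, then run standard KDE."""
    shifted = samples + 0.5 * h**2 * score_fn(samples)
    return gaussian_kde_eval(query, shifted, h)

def oracle_score(x):
    """Exact score of the standard normal (illustrative target, an assumption)."""
    return -x

rng = np.random.default_rng(0)
samples = rng.standard_normal(2000)   # hypothetical sample size
query = np.linspace(-3.0, 3.0, 61)
h = 0.5                               # hypothetical bandwidth

true_pdf = np.exp(-0.5 * query**2) / np.sqrt(2.0 * np.pi)
err_kde = np.mean((gaussian_kde_eval(query, samples, h) - true_pdf) ** 2)
err_sd = np.mean((sd_kde_eval(query, samples, h, oracle_score) - true_pdf) ** 2)
print(f"KDE mean squared error on grid:    {err_kde:.2e}")
print(f"SD-KDE mean squared error on grid: {err_sd:.2e}")
```

In practice the score would come from an estimator rather than an oracle, in which case `score_fn` would be replaced by that estimate and the accuracy would depend on its quality.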

The paper provides a bias expansion showing that, with an exact score, the leading-order bias term is removed.
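
A heuristic version of that cancellation can be written out as follows. This is a sketch under stated assumptions, not the paper's proof: it assumes a kernel with identity second-moment matrix (for example a standard Gaussian), a sufficiently smooth density p, and the exact score s = \nabla \log p. The symbol q, denoting the density of the shifted samples, is notation introduced here for illustration.

```latex
\begin{aligned}
\mathbb{E}\,\hat{p}_{\mathrm{KDE}}(x)
  &= p(x) + \tfrac{h^2}{2}\,\Delta p(x) + O(h^4), \\
q(x)
  &= p(x) - \tfrac{h^2}{2}\,\nabla\!\cdot\!\big(p(x)\,s(x)\big) + O(h^4)
   = p(x) - \tfrac{h^2}{2}\,\Delta p(x) + O(h^4), \\
\mathbb{E}\,\hat{p}_{\mathrm{SD}}(x)
  &= q(x) + \tfrac{h^2}{2}\,\Delta q(x) + O(h^4)
   = p(x) + O(h^4).
\end{aligned}
```

The second line uses p\,s = \nabla p, so the shift moves the sample density by exactly the negative of the leading KDE bias term, and the subsequent smoothing adds it back.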

Spiral synthetic example

SD-KDE with oracle score

2D mixture of Gaussians (oracle score)

Evidence

The paper reports:

With an exact score, the paper states an improved asymptotic rate:

\mathrm{MISE} = \mathcal{O}\left(n^{-8/(d+8)}\right)
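
The exponent is what the usual bias-variance balance gives once the pointwise bias is of order h^4 rather than h^2. The sketch below assumes the standard KDE variance scaling 1/(n h^d) and ignores constants; the comparison with the classical second-order-kernel rate n^{-4/(d+4)} is a textbook fact added for context, not a claim taken from the paper.

```latex
\begin{aligned}
\mathrm{MISE}(h) &\asymp \underbrace{h^{8}}_{\text{bias}^2} + \underbrace{\frac{1}{n h^{d}}}_{\text{variance}}, \\
h^{8} \asymp \frac{1}{n h^{d}} &\;\Longrightarrow\; h \asymp n^{-1/(d+8)}
  \;\Longrightarrow\; \mathrm{MISE} \asymp n^{-8/(d+8)}, \\
\text{standard KDE:}\quad \mathrm{MISE}(h) &\asymp h^{4} + \frac{1}{n h^{d}}
  \;\Longrightarrow\; \mathrm{MISE} \asymp n^{-4/(d+4)}.
\end{aligned}
```

For d = 1, for instance, this is n^{-8/9} compared with n^{-4/5}.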

Limitations and caveats

Takeaway

SD-KDE is a clean, theoretically motivated way to reduce KDE bias using score information, with promising synthetic and small-scale empirical evidence.