Point KL-Divergence is not Very Negative Very Often

If $X\sim P$, then for any distribution $Q$ it is unlikely that $Q$ ascribes much greater density to $X$'s outcome than $P$ does. In fact, if $P,Q$ have PDFs $f_P, f_Q$, then:

\begin{align} \mathbb{P}(f_P(X)\leq c f_Q(X)) &= \int \mathbf{1}_{\{x:f_P(x)\leq cf_Q(x) \}} f_P(x)dx \\ &\leq \int \mathbf{1}_{\{x:f_P(x)\leq cf_Q(x) \}} c f_Q(x)dx \\ &\leq \int c f_Q(x) dx \\ &= c. \end{align}
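As a quick numerical sanity check, here is a Monte Carlo sketch using only Python's standard library; the choices $P=\mathcal{N}(0,1)$, $Q=\mathcal{N}(1,1)$, and $c=0.25$ are arbitrary, just for illustration:

```python
import math
import random

random.seed(0)

def pdf_normal(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

N = 200_000
c = 0.25
hits = 0
for _ in range(N):
    x = random.gauss(0.0, 1.0)  # X ~ P = N(0, 1)
    # count how often Q's density beats P's by a factor of 1/c
    if pdf_normal(x, 0.0, 1.0) <= c * pdf_normal(x, 1.0, 1.0):
        hits += 1
est = hits / N
print(est, "<=", c)
```

For these parameters the event $\{f_P(x)\leq c f_Q(x)\}$ works out to $\{x \geq (1-2\ln c)/2\}$, so the true probability is well below $c$: the bound is loose here, but it holds for every $c$ and every pair of densities.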

This carries over to relative entropy:

Note that $D_{KL}(P\|Q)=\mathbb{E}[Z]$ for $Z=\ln \frac{f_P(X)}{f_Q(X)}$. Then for any $z$, taking $c=e^z$ in the bound above,

\begin{align} \mathbb{P}(Z\leq z) &= \mathbb{P}\left(\ln \frac{f_P(X)}{f_Q(X)} \leq z \right) \\ &= \mathbb{P}(f_P(X)\leq e^z f_Q(X)) \\ &\leq e^z. \end{align}

For $z\geq 0$ this is vacuous; the content is at negative $z$: the pointwise log-likelihood ratio $Z$ is rarely much below zero, even though its expectation $D_{KL}(P\|Q)$ is nonnegative and $Z$ can be arbitrarily large.
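The tail bound on $Z$ can be checked directly for a negative $z$ with the same kind of Monte Carlo sketch (again with the illustrative pair $P=\mathcal{N}(0,1)$, $Q=\mathcal{N}(1,1)$):

```python
import math
import random

random.seed(1)

def log_pdf_normal(x, mu):
    """Log density of N(mu, 1) at x."""
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

N = 200_000
z = -2.0
count = 0
for _ in range(N):
    x = random.gauss(0.0, 1.0)  # X ~ P = N(0, 1)
    # pointwise log-likelihood ratio Z = ln(f_P(X) / f_Q(X))
    Z = log_pdf_normal(x, 0.0) - log_pdf_normal(x, 1.0)
    if Z <= z:
        count += 1
print(count / N, "<=", math.exp(z))
```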

This is in fact an instance of the Chernoff bound: since $\mathbb{E}[e^{-Z}]=\int \frac{f_Q(x)}{f_P(x)}f_P(x)dx \leq 1$, Markov's inequality gives $\mathbb{P}(Z\leq z)=\mathbb{P}(e^{-Z}\geq e^{-z})\leq e^{z}\,\mathbb{E}[e^{-Z}]\leq e^z$. The same argument goes through when $P,Q$ aren't over $\mathbb{R}$ or don't have PDFs, by replacing the density ratio with a Radon–Nikodym derivative, and analogous bounds can be derived for other divergences.
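The Chernoff view rests on $\mathbb{E}[e^{-Z}]=\mathbb{E}\left[\frac{f_Q(X)}{f_P(X)}\right]$ being at most $1$; it equals $1$ exactly when $P$ and $Q$ share support, as the illustrative Gaussians $P=\mathcal{N}(0,1)$, $Q=\mathcal{N}(1,1)$ do. A sketch of that check:

```python
import math
import random

random.seed(2)

N = 200_000
total = 0.0
for _ in range(N):
    x = random.gauss(0.0, 1.0)  # X ~ P = N(0, 1)
    # e^{-Z} = f_Q(X) / f_P(X); for N(0,1) vs N(1,1) this simplifies to e^{x - 1/2}
    total += math.exp(x - 0.5)
print(total / N)  # sample mean of e^{-Z}, close to 1
```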