cotangent space
Let $M$ be a smooth manifold and $p \in M$. The tangent space at $p$, denoted $T_p M$, is defined as the space of derivations at $p$, i.e., linear maps
$$ v : C^\infty(M) \to \mathbb{R} $$
satisfying the Leibniz rule:
$$ v(fg) = …
Read More → | posts |
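For reference, the standard form of the Leibniz rule mentioned in the note above, together with the canonical example of a derivation (textbook facts, not quoted from the truncated post):
$$ v(fg) \;=\; f(p)\, v(g) + g(p)\, v(f), \qquad f, g \in C^\infty(M). $$
In a coordinate chart $(x^1, \dots, x^n)$ around $p$, the partial derivatives $\left.\frac{\partial}{\partial x^i}\right|_p$ are derivations in this sense and form a basis of $T_p M$.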
push-pull
Pushforward (for tangent vectors) Let $\varphi : M \to N$ be a smooth map and let $v \in T_p M$ be a tangent vector at a point $p \in M$. Then:
$$ d \varphi_p: T_p M \rightarrow T_{\varphi(p)} N $$
is the pushforward map (also called the differential of $\varphi$).
$$ …
Read More → | posts |
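A one-line supplement to the pushforward note above, stating how $d\varphi_p$ acts in the derivation picture (the standard definition, added here for reference):
$$ (d\varphi_p\, v)(f) \;=\; v(f \circ \varphi), \qquad f \in C^\infty(N), $$
so a derivation at $p$ on $M$ is sent to a derivation at $\varphi(p)$ on $N$, which is what makes the map above well defined.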
key results of wibisono 2018
Under strong logconcavity, the Wasserstein distance between the biased limit of ULA and $\nu$ is bounded by terms proportional to $\sqrt{\epsilon}$, with $\epsilon$ being the discretization parameter.
[…] If the covariances …
Read More → | posts |
bakry-emery
For a smooth function $f:M\to\mathbb{R}$, define $$ \Gamma(f) \;:=\; |\nabla f|^2. $$ The operator $\Gamma_2$ is given by $$ \Gamma_2(f) \;:=\; \frac12\,\bigl(\mathcal{L}\Gamma(f) - 2\,\Gamma(f,\mathcal{L} f)\bigr), $$ where …
Read More → | posts |
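As a pointer for why $\Gamma_2$ matters, here is the standard Bakry-Émery criterion, stated for reference rather than quoted from the post: if there is a $\rho > 0$ such that
$$ \Gamma_2(f) \;\ge\; \rho\, \Gamma(f) \quad \text{for all smooth } f, $$
then the invariant measure $\mu$ of $\mathcal{L}$ satisfies a log-Sobolev inequality with constant $\rho$,
$$ \operatorname{Ent}_{\mu}(f^2) \;\le\; \frac{2}{\rho} \int \Gamma(f)\, d\mu. $$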
manyfolds
is a smooth manifold that sits inside some ambient euclidean space $\mathbb{R}^n$, with the subspace topology and differential structure inherited from $\mathbb{R}^n$. It is the image of a smooth injective immersion which is also a …
Read More → | posts |
wish: to annihilate ambient dimension
The SDE for the underdamped Langevin dynamics is modified to incorporate anisotropic smoothness through a position-dependent metric tensor: $$ dX_t = -G(x)^{-1} \nabla U(x) \, dt + \sqrt{2G(x)^{-1}} \, dB_t, $$ where $G(x) \in …
Read More → | posts |
good resources
sinho chewi’s book
gradient flows, ambrosio
optimal transport, villani (not topics)
Read More → | posts |
some literature review
\cite{dalalyan2017} and \cite{durmus_non-asymptotic_2016} established error bounds for ULA based on step size constraints.
\cite{eberle_couplings_2017} used reflection and synchronous couplings to quantify contraction rates in …
Read More → | posts |
exponential convergence of langevin
Tweedie deals with conditions under which the convergence of the langevin diffusion is exponentially fast, and whether it extends to higher moments.
The Unadjusted Langevin Algorithm (ULA) is a discrete-time Markov chain $\mathbf{U}_n$ …
Read More → | posts |
wgf or wasserstein gradient flows
The Langevin diffusion and the Unadjusted Langevin Algorithm (ULA) can be interpreted as gradient flow dynamics minimizing the relative entropy (KL divergence) over the space of probability measures \cite{jordan_variational_1998}. …
Read More → | posts |
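For orientation, the discrete-time scheme behind this interpretation in \cite{jordan_variational_1998} (the JKO minimizing-movement scheme, written here in the KL form used above; the free energy in the original paper equals $\mathrm{KL}(\rho \,\|\, \nu)$ up to an additive constant):
$$ \rho_{k+1} \;\in\; \operatorname*{arg\,min}_{\rho} \left\{ \mathrm{KL}(\rho \,\|\, \nu) + \frac{1}{2\tau} W_2^2(\rho, \rho_k) \right\}, $$
which recovers the Fokker-Planck flow as the step size $\tau \to 0$.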
probability theory and mcmc background
A set $C \in \mathcal{B}(\mathrm{X})$ is called a small set if there exists an $m>0$, and a non-trivial measure $\nu_m$ on $\mathcal{B}(\mathrm{X})$, such that for all $x \in C, B \in \mathcal{B}(\mathrm{X})$,
$$ P^m(x, B) \geq …
Read More → | posts |
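One standard consequence worth keeping next to this definition (a textbook Doeblin-type fact, not quoted from the truncated post): if the whole space $\mathrm{X}$ is small, i.e. the minorization $P^m(x, B) \geq \nu_m(B)$ holds for every $x \in \mathrm{X}$, then the chain is uniformly ergodic toward its invariant distribution $\pi$, with
$$ \left\| P^{n}(x, \cdot) - \pi \right\|_{\mathrm{TV}} \;\le\; \bigl(1 - \nu_m(\mathrm{X})\bigr)^{\lfloor n/m \rfloor} \quad \text{for all } x \in \mathrm{X}. $$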
glossary of sorts
The probability density function $\rho_t$ of $X_t$ evolves according to the Fokker-Planck equation: $$ \frac{\partial \rho_t}{\partial t} = \nabla \cdot \left(\rho_t \nabla \log \frac{\rho_t}{\nu}\right) = \nabla \cdot (\rho_t …
Read More → | posts |
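A one-line check that follows directly from the form of the equation above (added for reference): the target $\nu$ is a stationary solution, since
$$ \rho_t = \nu \;\Longrightarrow\; \nabla \log \frac{\rho_t}{\nu} = 0 \;\Longrightarrow\; \frac{\partial \rho_t}{\partial t} = \nabla \cdot (\nu \cdot 0) = 0. $$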
underdamped langevin
\cite{cheng_underdamped_2018} studied ULMC as a variant of the HMC algorithm, which converges to $\varepsilon$ error in 2-Wasserstein distance after $\mathcal{O}\left(\frac{\sqrt{d} \kappa^2}{\varepsilon}\right)$ iterations, under the …
Read More → | posts |
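For reference, one common parameterization of the underdamped Langevin SDE that ULMC discretizes (the generic textbook form; the scaling used in \cite{cheng_underdamped_2018} may differ by constants):
$$ dX_t = V_t \, dt, \qquad dV_t = -\gamma V_t \, dt - \nabla f(X_t)\, dt + \sqrt{2\gamma}\, dB_t, $$
with friction $\gamma > 0$ and invariant density proportional to $\exp\!\left(-f(x) - \tfrac{1}{2}|v|^2\right)$.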
discretization langevin
In discrete time, we usually want to use a step size $\eta$ to construct an iterative update rule for the discretization.
[…] The Unadjusted Langevin Algorithm (ULA) is a simple discretization of the Langevin diffusion: $$ …
Read More → | posts |
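The displayed update is truncated in this excerpt, so here is a minimal numerical sketch of the standard ULA iteration $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\, \xi_k$, $\xi_k \sim \mathcal{N}(0, I)$; the function names and the Gaussian example target are illustrative, not taken from the post.

```python
import numpy as np

def ula(grad_f, x0, eta, n_steps, rng=None):
    """Run the standard Unadjusted Langevin Algorithm and return all iterates."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    xs = [x.copy()]
    for _ in range(n_steps):
        xi = rng.standard_normal(x.shape)            # xi_k ~ N(0, I)
        x = x - eta * grad_f(x) + np.sqrt(2 * eta) * xi
        xs.append(x.copy())
    return np.stack(xs)

# illustrative target: standard Gaussian, f(x) = ||x||^2 / 2, so grad_f(x) = x
samples = ula(grad_f=lambda x: x, x0=np.zeros(2), eta=0.1, n_steps=5000)
print(samples[2000:].mean(axis=0), samples[2000:].var(axis=0))  # ~0 mean, ~1 variance up to discretization bias
```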
convergence of langevin
We note that the logarithmic sobolev inequality (LSI) provides a gradient domination condition that ensures exponential convergence of langevin dynamics under the wasserstein metric \cite{otto2000generalization}. $$ …
Read More → | posts |
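Since the display above is cut off in this excerpt, here is the standard form of the statement for reference (not quoted from the post): an LSI with constant $\alpha$ for $\nu$ reads
$$ \mathrm{KL}(\rho \,\|\, \nu) \;\le\; \frac{1}{2\alpha}\, I(\rho \,\|\, \nu), \qquad I(\rho \,\|\, \nu) := \int \rho \left| \nabla \log \frac{\rho}{\nu} \right|^2 dx, $$
and along the Fokker-Planck flow $\frac{d}{dt}\mathrm{KL}(\rho_t \,\|\, \nu) = -I(\rho_t \,\|\, \nu)$, so $\mathrm{KL}(\rho_t \,\|\, \nu) \le e^{-2\alpha t}\, \mathrm{KL}(\rho_0 \,\|\, \nu)$; Talagrand's inequality, which is implied by LSI \cite{otto2000generalization}, then transfers this exponential decay to $W_2$.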
gradient flows
Langevin dynamics connects probability theory and optimization through its formulation as a gradient flow in the Wasserstein-2 space. The time evolution of the Fokker-Planck equation \cite{jordan_variational_1998} can be …
Read More → | posts |
langevin diffusion
The goal is to efficiently sample from a target distribution $\pi \propto e^{-f(x)}$. In high-dimensional or non-convex settings, direct sampling is difficult, so we usually define a stochastic process whose stationary distribution …
Read More → | posts |
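For reference (the sentence above is truncated in this excerpt), the canonical choice of such a process is the overdamped Langevin diffusion, stated here in its standard form:
$$ dX_t \;=\; -\nabla f(X_t)\, dt + \sqrt{2}\, dB_t, $$
whose stationary distribution is exactly $\pi \propto e^{-f}$ under mild conditions on $f$.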
defective lsi
the point of this exposition is thinking about the log-sobolev inequality in situations where you have high probability guarantees that the conditions for LSI are met. does this mean exponential convergence is still guaranteed? …
Read More → | posts |
about
this webpage serves as a dump of my personal notes on sampling, optimization and algorithms. these are always a work in progress, feel free to email me with corrections or ideas, but please note that these are not maintained for …
Read More → | posts |