Posts

cotangent space

Let $M$ be a smooth manifold and $p \in M$. The tangent space at $p$, denoted $T_p M$, is defined as the space of derivations at $p$, i.e., linear maps $$ v : C^\infty(M) \to \mathbb{R} $$ satisfying the Leibniz rule: $$ v(fg) = … Read More →
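A quick consequence of the Leibniz rule (a standard first check, added here as a sketch): derivations annihilate constants. Taking $f = g = 1$, $$ v(1) = v(1 \cdot 1) = 1 \cdot v(1) + 1 \cdot v(1) = 2\,v(1) \implies v(1) = 0, $$ and by linearity $v(c) = c\,v(1) = 0$ for every constant $c$.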

push-pull

Pushforward (for tangent vectors) Let $\varphi: M \to N$ be a smooth map and $v \in T_p M$ a tangent vector at $p \in M$. Then: $$ d \varphi_p: T_p M \rightarrow T_{\varphi(p)} N $$ is the pushforward map (also called the differential of $\varphi$). $$ … Read More →
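In local coordinates (a standard formula, not taken from the post): if $(x^i)$ is a chart around $p$ and $(y^j)$ a chart around $\varphi(p)$, the differential acts by the Jacobian of the coordinate representation of $\varphi$, $$ (d\varphi_p v)^j = \sum_i \frac{\partial \varphi^j}{\partial x^i}(p)\, v^i, $$ so the pushforward of a coordinate basis vector is $d\varphi_p\bigl(\partial/\partial x^i\big|_p\bigr) = \sum_j \frac{\partial \varphi^j}{\partial x^i}(p)\, \partial/\partial y^j\big|_{\varphi(p)}$.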

key results of wibisono 2018

Under strong logconcavity, the Wasserstein distance between the biased limit of ULA and $\nu$ is bounded by terms proportional to $\sqrt{\epsilon}$, with $\epsilon$ being the discretization parameter. […] If the covariances … Read More →

bakry-emery

For a smooth function $f:M\to\mathbb{R}$, define $$ \Gamma(f) := |\nabla f|^2. $$ The operator $\Gamma_2$ is given by $$ \Gamma_2(f) := \frac12\bigl(\mathcal{L}\Gamma(f) - 2\,\Gamma(f, \mathcal{L} f)\bigr), $$ where … Read More →
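As a worked instance (the standard Bakry-Émery computation, added as a sketch, not taken from the post): for the Langevin generator $\mathcal{L} = \Delta - \nabla V \cdot \nabla$ on $\mathbb{R}^n$, Bochner's formula gives $$ \Gamma_2(f) = \|\nabla^2 f\|_{\mathrm{HS}}^2 + \langle \nabla f, \nabla^2 V\, \nabla f \rangle, $$ so $\nabla^2 V \succeq \alpha I$ implies the curvature condition $\Gamma_2(f) \geq \alpha\, \Gamma(f)$, i.e. $CD(\alpha, \infty)$.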

manyfolds

is a smooth manifold that sits inside some ambient Euclidean space $\mathbb{R}^n$, with the subspace topology and differential structure inherited from $\mathbb{R}^n$. It is a smooth injective immersion which is also a … Read More →

wish: to annihilate ambient dimension

The SDE for the underdamped Langevin dynamics is modified to incorporate anisotropic smoothness through a position-dependent metric tensor: $$ dX_t = -G(x)^{-1} \nabla U(x) \, dt + \sqrt{2G(x)^{-1}} \, dB_t, $$ where $G(x) \in … Read More →
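A minimal Euler-Maruyama sketch of this metric-preconditioned dynamics (my own illustration, not from the post; the quadratic potential, the choice $G = \Lambda$, and names like `preconditioned_ula` are all assumptions):

```python
import numpy as np

def preconditioned_ula(grad_U, G_inv_sqrt, x0, eta, n_steps, rng):
    """Euler-Maruyama discretization of
    dX = -G(x)^{-1} grad U(x) dt + sqrt(2 G(x)^{-1}) dB:
    x <- x - eta * G^{-1} grad U(x) + sqrt(2 eta) * G^{-1/2} xi."""
    x = np.array(x0, dtype=float)
    out = np.empty((n_steps, x.size))
    for k in range(n_steps):
        Gis = G_inv_sqrt(x)              # G(x)^{-1/2}, assumed symmetric
        G_inv = Gis @ Gis                # G(x)^{-1}
        x = x - eta * (G_inv @ grad_U(x)) \
              + np.sqrt(2 * eta) * (Gis @ rng.standard_normal(x.size))
        out[k] = x
    return out

# illustration: U(x) = (1/2) x^T Lam x with Lam = diag(100, 1); choosing
# G = Lam makes the drift isotropic (-x), and the target is N(0, Lam^{-1})
Lam = np.array([100.0, 1.0])
grad_U = lambda x: Lam * x
G_inv_sqrt = lambda x: np.diag(1.0 / np.sqrt(Lam))
rng = np.random.default_rng(1)
samples = preconditioned_ula(grad_U, G_inv_sqrt, np.zeros(2),
                             eta=0.01, n_steps=100_000, rng=rng)
```

With $G = \Lambda$ the two coordinates relax at the same rate despite the 100:1 anisotropy in $U$, which is the point of the preconditioning.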

good resources

sinho chewi’s book; gradient flows (ambrosio); optimal transport (villani, not topics) Read More →

some literature review

\cite{dalalyan2017} and \cite{durmus_non-asymptotic_2016} established error bounds for ULA based on step-size constraints. \cite{eberle_couplings_2017} used reflection and synchronous couplings to quantify contraction rates in … Read More →

exponential convergence of langevin

Tweedie deals with conditions under which the convergence of the Langevin diffusion is exponentially fast, and whether it extends to higher moments. The Unadjusted Langevin Algorithm (ULA) is a discrete-time Markov chain $\mathbf{U}_n$ … Read More →

wgf or wasserstein gradient flows

The Langevin diffusion and the Unadjusted Langevin Algorithm (ULA) can be interpreted as gradient flow dynamics minimizing the relative entropy (KL divergence) over the space of probability measures \cite{jordan_variational_1998}. … Read More →

probability theory and mcmc background

A set $C \in \mathcal{B}(\mathrm{X})$ is called a small set if there exists an $m>0$, and a non-trivial measure $\nu_m$ on $\mathcal{B}(\mathrm{X})$, such that for all $x \in C, B \in \mathcal{B}(\mathrm{X})$, $$ P^m(x, B) \geq … Read More →
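A numerical illustration of this minorization condition (my own example, not from the post), using the Gaussian random-walk kernel $P(x,\cdot) = \mathcal{N}(x,1)$ on $\mathbb{R}$ with $C = [-1,1]$ and $m = 1$; here $\inf_{x \in C} p(x,y) = \varphi(|y|+1)$, so the minorizing measure has total mass $2(1 - \Phi(1)) \approx 0.317$:

```python
import numpy as np

def normal_pdf(y, mean, sd=1.0):
    """Density of N(mean, sd^2) at y."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# kernel P(x, .) = N(x, 1); candidate small set C = [-1, 1], m = 1
xs = np.linspace(-1.0, 1.0, 201)             # points of C
ys = np.linspace(-10.0, 10.0, 4001)          # grid covering the state space
dens = normal_pdf(ys[None, :], xs[:, None])  # p(x, y) for each x in C
nu = dens.min(axis=0)                        # nu(y) := inf_{x in C} p(x, y)
dy = ys[1] - ys[0]
eps = float(nu.sum() * dy)                   # mass of the minorizing measure
# analytically eps = 2 * (1 - Phi(1)) ~ 0.317 > 0, so C is small
```

The measure $\nu_1(B) := \int_B \nu(y)\,dy$ is non-trivial, so $P(x, B) \geq \nu_1(B)$ for all $x \in C$, exactly the condition in the definition.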

glossary of sorts

The probability density function $\rho_t$ of $X_t$ evolves according to the Fokker-Planck equation: $$ \frac{\partial \rho_t}{\partial t} = \nabla \cdot \left(\rho_t \nabla \log \frac{\rho_t}{\nu}\right) = \nabla \cdot (\rho_t … Read More →
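A quick sanity check (not in the excerpt): $\nu$ is stationary for this equation, since substituting $\rho_t = \nu$ gives $$ \nabla \cdot \left( \nu\, \nabla \log \frac{\nu}{\nu} \right) = \nabla \cdot (\nu \cdot 0) = 0. $$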

underdamped langevin

\cite{cheng_underdamped_2018} studied ULMC as a variant of the HMC algorithm, which converges to $\varepsilon$ error in the 2-Wasserstein distance after $\mathcal{O}\left(\frac{\sqrt{d} \kappa^2}{\varepsilon}\right)$ iterations, under the … Read More →

discretization langevin

In discrete time, we use a step size $\eta$ to construct an iterative update rule. […] The Unadjusted Langevin Algorithm (ULA) is a simple discretization of the Langevin diffusion: $$ … Read More →
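A concrete sketch of the ULA update $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\,\xi_k$ (my own illustration, with target $\pi \propto e^{-f}$ for $f(x) = \|x\|^2/2$, i.e. a standard Gaussian; step size and iteration counts are arbitrary choices):

```python
import numpy as np

def ula(grad_f, x0, eta, n_steps, rng):
    """Unadjusted Langevin Algorithm:
    x_{k+1} = x_k - eta * grad_f(x_k) + sqrt(2 * eta) * xi_k, xi_k ~ N(0, I)."""
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x - eta * grad_f(x) + np.sqrt(2 * eta) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# target: standard Gaussian, f(x) = ||x||^2 / 2, so grad f(x) = x
rng = np.random.default_rng(0)
samples = ula(lambda x: x, x0=np.zeros(2), eta=0.01, n_steps=200_000, rng=rng)
burn = samples[50_000:]  # discard burn-in
```

For this linear drift the chain is an AR(1) process whose stationary variance is $1/(1 - \eta/2)$ per coordinate, slightly above the target value $1$: a small instance of the discretization bias that ULA never removes.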

convergence of langevin

We note that the logarithmic Sobolev inequality (LSI) provides a gradient-domination condition that ensures exponential convergence of the Langevin dynamics under the Wasserstein metric \cite{otto2000generalization}. $$ … Read More →
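To spell out the standard entropy argument (a sketch, not taken verbatim from the post): along the Langevin dynamics the entropy dissipates at the rate of the relative Fisher information, $$ \frac{d}{dt}\, \mathrm{KL}(\rho_t \,\|\, \nu) = -\, \mathrm{FI}(\rho_t \,\|\, \nu), \qquad \mathrm{FI}(\rho \,\|\, \nu) := \int \rho \left| \nabla \log \frac{\rho}{\nu} \right|^2, $$ and if $\nu$ satisfies an LSI with constant $\alpha$, i.e. $\mathrm{KL} \leq \frac{1}{2\alpha}\, \mathrm{FI}$, then $\frac{d}{dt}\, \mathrm{KL} \leq -2\alpha\, \mathrm{KL}$, so Gronwall gives $\mathrm{KL}(\rho_t \,\|\, \nu) \leq e^{-2\alpha t}\, \mathrm{KL}(\rho_0 \,\|\, \nu)$.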

gradient flows

Langevin dynamics connects probability theory and optimization through its formulation as a gradient flow in the Wasserstein-2 space. The time evolution of the Fokker-Planck equation\cite{jordan_variational_1998} can be … Read More →

langevin diffusion

The goal is to efficiently sample from a target distribution $\pi \propto e^{-f(x)}$. In high-dimensional or non-convex settings, direct sampling is difficult, so we usually define a stochastic process whose stationary distribution … Read More →

defective lsi

the point of this exposition is thinking about the log-sobolev inequality in situations where you have high probability guarantees that the conditions for LSI are met. does this mean exponential convergence is still guaranteed? … Read More →

about

this webpage serves as a dump of my personal notes on sampling, optimization and algorithms. these are always a work in progress, feel free to email me with corrections or ideas, but please note that these are not maintained for … Read More →