cotangent space
Let $M$ be a smooth manifold and $p \in M$. The tangent space at $p$, denoted $T_p M$, is defined as the space of derivations at $p$, i.e., linear maps
$$ v : C^\infty(M) \to \mathbb{R} $$
satisfying the Leibniz rule:
$$ v(fg) = …
Read More → | posts |
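For reference, the standard form of the Leibniz rule mentioned in the note above, together with the canonical example of a derivation (textbook facts, not quoted from the truncated post):
$$ v(fg) \;=\; f(p)\, v(g) + g(p)\, v(f), \qquad f, g \in C^\infty(M). $$
In a coordinate chart $(x^1, \dots, x^n)$ around $p$, the partial derivatives $\left.\frac{\partial}{\partial x^i}\right|_p$ are derivations in this sense and form a basis of $T_p M$.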
push-pull
Pushforward (for tangent vectors) Let $\varphi : M \to N$ be a smooth map and let $v \in T_p M$ be a tangent vector at a point $p \in M$. Then:
$$ d \varphi_p: T_p M \rightarrow T_{\varphi(p)} N $$
is the pushforward map (also called the differential of $\varphi$).
$$ …
Read More → | posts |
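A one-line supplement to the pushforward note above, stating how $d\varphi_p$ acts in the derivation picture (the standard definition, added here for reference):
$$ (d\varphi_p\, v)(f) \;=\; v(f \circ \varphi), \qquad f \in C^\infty(N), $$
so a derivation at $p$ on $M$ is sent to a derivation at $\varphi(p)$ on $N$, which is what makes the map above well defined.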
key results of wibisono 2018
Under strong logconcavity, the Wasserstein distance between the biased limit of ULA and $\nu$ is bounded by terms proportional to $\sqrt{\epsilon}$, with $\epsilon$ being the discretization parameter.
[…] If the covariances …
Read More → | posts |
bakry-emery
For a smooth function $f:M\to\mathbb{R}$, define $$ \Gamma(f) \;:=\; |\nabla f|^2. $$ The operator $\Gamma_2$ is given by $$ \Gamma_2(f) \;:=\; \frac12\,\bigl(\mathcal{L}\Gamma(f) - 2\,\Gamma(f,\mathcal{L} f)\bigr), $$ where …
Read More → | posts |
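As a pointer for why $\Gamma_2$ matters, here is the standard Bakry-Émery criterion, stated for reference rather than quoted from the post: if there is a $\rho > 0$ such that
$$ \Gamma_2(f) \;\ge\; \rho\, \Gamma(f) \quad \text{for all smooth } f, $$
then the invariant measure $\mu$ of $\mathcal{L}$ satisfies a log-Sobolev inequality with constant $\rho$,
$$ \operatorname{Ent}_{\mu}(f^2) \;\le\; \frac{2}{\rho} \int \Gamma(f)\, d\mu. $$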
manyfolds
is a smooth manifold that sits inside some ambient euclidean space $\mathbb{R}^n$, with the subspace topology and differential structure inherited from $\mathbb{R}^n$. It is the image of a smooth injective immersion which is also a …
Read More → | posts |
wish: to annihilate ambient dimension
The SDE for the underdamped Langevin dynamics is modified to incorporate anisotropic smoothness through a position-dependent metric tensor: $$ dX_t = -G(x)^{-1} \nabla U(x) \, dt + \sqrt{2G(x)^{-1}} \, dB_t, $$ where $G(x) \in …
Read More → | posts |
good resources
sinho chewi’s book
gradient flows, ambrosio
optimal transport, villani (not topics)
Read More → | posts |
some literature review
\cite{dalalyan2017} and \cite{durmus_non-asymptotic_2016} established error bounds for ULA based on step size constraints.
\cite{eberle_couplings_2017} used reflection and synchronous couplings to quantify contraction rates in …
Read More → | posts |
exponential convergence of langevin
Tweedie deals with conditions under which the convergence of the langevin diffusion is exponentially fast, and whether it extends to higher moments.
The Unadjusted Langevin Algorithm (ULA) is a discrete-time Markov chain $\mathbf{U}_n$ …
Read More → | posts |
wgf or wasserstein gradient flows
The Langevin diffusion and the Unadjusted Langevin Algorithm (ULA) can be interpreted as gradient flow dynamics minimizing the relative entropy (KL divergence) over the space of probability measures \cite{jordan_variational_1998}. …
Read More → | posts |
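For orientation, the discrete-time scheme behind this interpretation in \cite{jordan_variational_1998} (the JKO minimizing-movement scheme, written here in the KL form used above; the free energy in the original paper equals $\mathrm{KL}(\rho \,\|\, \nu)$ up to an additive constant):
$$ \rho_{k+1} \;\in\; \operatorname*{arg\,min}_{\rho} \left\{ \mathrm{KL}(\rho \,\|\, \nu) + \frac{1}{2\tau} W_2^2(\rho, \rho_k) \right\}, $$
which recovers the Fokker-Planck flow as the step size $\tau \to 0$.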
probability theory and mcmc background
A set $C \in \mathcal{B}(\mathrm{X})$ is called a small set if there exists an $m>0$, and a non-trivial measure $\nu_m$ on $\mathcal{B}(\mathrm{X})$, such that for all $x \in C, B \in \mathcal{B}(\mathrm{X})$,
$$ P^m(x, B) \geq …
Read More → | posts |
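One standard consequence worth keeping next to this definition (a textbook Doeblin-type fact, not quoted from the truncated post): if the whole space $\mathrm{X}$ is small, i.e. the minorization $P^m(x, B) \geq \nu_m(B)$ holds for every $x \in \mathrm{X}$, then the chain is uniformly ergodic toward its invariant distribution $\pi$, with
$$ \left\| P^{n}(x, \cdot) - \pi \right\|_{\mathrm{TV}} \;\le\; \bigl(1 - \nu_m(\mathrm{X})\bigr)^{\lfloor n/m \rfloor} \quad \text{for all } x \in \mathrm{X}. $$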
glossary of sorts
The probability density function $\rho_t$ of $X_t$ evolves according to the Fokker-Planck equation: $$ \frac{\partial \rho_t}{\partial t} = \nabla \cdot \left(\rho_t \nabla \log \frac{\rho_t}{\nu}\right) = \nabla \cdot (\rho_t …
Read More → | posts |
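A one-line check that follows directly from the form of the equation above (added for reference): the target $\nu$ is a stationary solution, since
$$ \rho_t = \nu \;\Longrightarrow\; \nabla \log \frac{\rho_t}{\nu} = 0 \;\Longrightarrow\; \frac{\partial \rho_t}{\partial t} = \nabla \cdot (\nu \cdot 0) = 0. $$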
underdamped langevin
\cite{cheng_underdamped_2018} studied ULMC as a variant of the HMC algorithm, which converges to $\varepsilon$ error in 2-Wasserstein distance after $\mathcal{O}\left(\frac{\sqrt{d} \kappa^2}{\varepsilon}\right)$ iterations, under the …
Read More → | posts |
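For reference, one common parameterization of the underdamped Langevin SDE that ULMC discretizes (the generic textbook form; the scaling used in \cite{cheng_underdamped_2018} may differ by constants):
$$ dX_t = V_t \, dt, \qquad dV_t = -\gamma V_t \, dt - \nabla f(X_t)\, dt + \sqrt{2\gamma}\, dB_t, $$
with friction $\gamma > 0$ and invariant density proportional to $\exp\!\left(-f(x) - \tfrac{1}{2}|v|^2\right)$.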
discretization langevin
In discrete time, we usually want to use a step size $\eta$ to construct an iterative update rule for the discretization.
[…] The Unadjusted Langevin Algorithm (ULA) is a simple discretization of the Langevin diffusion: $$ …
Read More → | posts |
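The displayed update is truncated in this excerpt, so here is a minimal numerical sketch of the standard ULA iteration $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\, \xi_k$, $\xi_k \sim \mathcal{N}(0, I)$; the function names and the Gaussian example target are illustrative, not taken from the post.

```python
import numpy as np

def ula(grad_f, x0, eta, n_steps, rng=None):
    """Run the standard Unadjusted Langevin Algorithm and return all iterates."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    xs = [x.copy()]
    for _ in range(n_steps):
        xi = rng.standard_normal(x.shape)            # xi_k ~ N(0, I)
        x = x - eta * grad_f(x) + np.sqrt(2 * eta) * xi
        xs.append(x.copy())
    return np.stack(xs)

# illustrative target: standard Gaussian, f(x) = ||x||^2 / 2, so grad_f(x) = x
samples = ula(grad_f=lambda x: x, x0=np.zeros(2), eta=0.1, n_steps=5000)
print(samples[2000:].mean(axis=0), samples[2000:].var(axis=0))  # ~0 mean, ~1 variance up to discretization bias
```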
convergence of langevin
We note that the logarithmic sobolev inequality (LSI) provides a gradient domination condition that ensures exponential convergence of langevin dynamics under the wasserstein metric \cite{otto2000generalization}. $$ …
Read More → | posts |
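Since the display above is cut off in this excerpt, here is the standard form of the statement for reference (not quoted from the post): an LSI with constant $\alpha$ for $\nu$ reads
$$ \mathrm{KL}(\rho \,\|\, \nu) \;\le\; \frac{1}{2\alpha}\, I(\rho \,\|\, \nu), \qquad I(\rho \,\|\, \nu) := \int \rho \left| \nabla \log \frac{\rho}{\nu} \right|^2 dx, $$
and along the Fokker-Planck flow $\frac{d}{dt}\mathrm{KL}(\rho_t \,\|\, \nu) = -I(\rho_t \,\|\, \nu)$, so $\mathrm{KL}(\rho_t \,\|\, \nu) \le e^{-2\alpha t}\, \mathrm{KL}(\rho_0 \,\|\, \nu)$; Talagrand's inequality, which is implied by LSI \cite{otto2000generalization}, then transfers this exponential decay to $W_2$.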
gradient flows
Langevin dynamics connects probability theory and optimization through its formulation as a gradient flow in the Wasserstein-2 space. The time evolution of the Fokker-Planck equation \cite{jordan_variational_1998} can be …
Read More → | posts |
langevin diffusion
The goal is to efficiently sample from a target distribution $\pi \propto e^{-f(x)}$. In high-dimensional or non-convex settings, direct sampling is difficult, so we usually define a stochastic process whose stationary distribution …
Read More → | posts |
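For reference (the sentence above is truncated in this excerpt), the canonical choice of such a process is the overdamped Langevin diffusion, stated here in its standard form:
$$ dX_t \;=\; -\nabla f(X_t)\, dt + \sqrt{2}\, dB_t, $$
whose stationary distribution is exactly $\pi \propto e^{-f}$ under mild conditions on $f$.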
defective lsi
the point of this exposition is thinking about the log-sobolev inequality in situations where you have high probability guarantees that the conditions for LSI are met. does this mean exponential convergence is still guaranteed? …
Read More → | posts |
about
this webpage serves as a dump of my personal notes on sampling, optimization and algorithms. these are always a work in progress, feel free to email me with corrections or ideas, but please note that these are not maintained for …
Read More → | posts |