What should you be acquainted with? Measure Theory and some basics in Functional Analysis.

Examples

Discrete Dynamical Systems

In the sequel $(\O,\F,\mu)$ will always denote a $\s$-finite measure space. All spaces $S$ are assumed to be Polish, i.e. there is some metric $d$ on $S$ such that $(S,d)$ is a complete and separable metric space. This ensures in particular that $S^\N$ is Polish and the Borel $\s$-algebra $\B(S^\N)$ on $S^\N$ coincides with the product $\s$-algebra; thus $F:\O\rar S^\N$ is measurable iff all components $\Prn_n\circ F$ are measurable. Moreover any finite Borel measure $\mu$ on $S$ is regular, i.e. for all Borel sets $A$ and all $\e > 0$ there is a compact set $K$ and an open set $U$, such that $K\sbe A\sbe U$ and $\mu(U\sm K) < \e$. It follows that e.g. bounded Lipschitz functions are dense in $L_p(\mu)$ for all $1\leq p < \infty$:
Put $$ f(x)=\frac{d(x,U^c)}{d(x,U^c)+d(x,K)}, $$ then $f$ is Lipschitz, $f|K=1$, $f|U^c=0$ and for all $p > 0$: $\int|f-I_A|^p\,d\mu < \e$. Solution by T. Speckhofer
Suppose $S=M$ is a (smooth) manifold (i.e. for convenience, a connected Polish space locally homeomorphic to a euclidean space with a differentiable structure); if you don't care about manifolds in general, open subsets of euclidean spaces suffice - though we assume that all manifolds don't have a boundary! The space $C_c^\infty(M)$ of smooth functions with compact support is dense in $L_p(\mu)$ for all $1\leq p < \infty$ and all Radon measures $\mu$ on $M$, i.e. $\mu$ is a Borel measure and all compact subsets of $M$ have finite measure.
A measurable mapping $\theta:S\rar S$ is called measure preserving if for all $A\in\F$: $\mu(\theta\in A)=\mu(A)$ i.e. the image measure $\mu_\theta$ of $\mu$ under $\theta$ equals $\mu$.
Thus $\theta$ is measure preserving, if $\mu$ is invariant under the mapping $\theta$. We will also say that $\mu$ is stationary under $\theta$ and call $(S,\F,\mu,\theta)$ a dynamical system. By the transformation theorem of measure theory $\theta:S\rar S$ is measure preserving if and only if for all $f\in L_1(\mu)$: $$ \int f\circ\theta\,d\mu=\int f\,d\mu~. $$ Suppose $\theta$ is measure preserving, then for all $f\in L_p(\mu)$ and all $n\in\N_0$ we define $$ P_nf(x)\colon=f(\theta^n(x)) $$ where $\theta^0(x)\colon=x$. This is a simple example of what is known as a semigroup $P_n$ on $L_p(\mu)$, i.e. $P_0=1$ and $P_{n+m}=P_nP_m$; since $\theta$ is measure preserving, we have: $\norm{P_nf}_p=\norm f_p$, in particular $P_n$ is a contraction semigroup.
Put $S=\TT\colon=\R/2\pi\Z\simeq S^1$ and let $\l(dx)=dx/2\pi$ be the normalized Haar measure on $\TT$. Then for all $\theta\in\R$ the mapping $\Theta:S^1\rar S^1$, $z\mapsto ze^{2\pi i\theta}$ (or $\Theta:\TT\rar\TT$, $x\mapsto x+2\pi\theta$) is measure preserving.
Put $S=[0,1)$ and $\theta(x)=2x(\modul1)$. Then the Lebesgue measure is invariant under $\theta$.
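A quick Monte Carlo sanity check (a sketch, not a proof): invariance of Lebesgue measure under $\theta(x)=2x(\modul1)$ means that if $X$ is uniformly distributed on $[0,1)$, then so is $\theta(X)$. The following C snippet (using the standard rand() generator, which is good enough for a sketch) histograms $X$ and $\theta(X)$; both columns should be flat up to sampling noise. Note that iterating $\theta$ in floating point arithmetic is pointless: every step discards one binary digit, so the orbit of a double ends up at $0$ after roughly 53 steps.
\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>

#define BINS 10
#define N 1000000L

int main(void) {
    long hx[BINS] = {0}, ht[BINS] = {0};
    srand(42);
    for (long i = 0; i < N; i++) {
        double x = rand() / (RAND_MAX + 1.0);  /* uniform on [0,1) */
        double t = 2.0 * x; t -= (long)t;      /* theta(x) = 2x mod 1 */
        hx[(int)(BINS * x)]++;
        ht[(int)(BINS * t)]++;
    }
    for (int b = 0; b < BINS; b++)             /* both ~ N/BINS = 100000 */
        printf("bin %d: X %ld  theta(X) %ld\n", b, hx[b], ht[b]);
    return 0;
}
\end{verbatim}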
$\theta:(0,1)\rar(0,1)$ has an invariant measure with density $\r$ iff (cf. exam): $$ \forall x\in(0,1):\quad \sum_{y:\theta(y)=x}\frac{\r(y)}{|\theta^\prime(y)|}=\r(x)~. $$
Put $\theta:\R\rar\R$, $\theta(x)=x-1/x$. Then the Lebesgue measure is $\theta$-invariant. Cf. example.
Suppose $\theta:\R^+\rar\R^+$ is an increasing bijection. Then $\mu_\theta(0,\theta(x))=\mu(0,x)$.
Suppose $\theta:S\rar S$ has a fixed point $x\in S$, i.e. $\theta(x)=x$. Then $\d_x$ is $\theta$-invariant.
Put $S=[0,1)$ and $\theta(x)=ax(\modul1)$ for $a-1=1/a$, $a > 0$. Then Lebesgue measure is not invariant under $\theta$. Find a measure $\mu$ on $S$ invariant under $\theta$. Hint: Assume $\mu$ has constant density $m_1$ on $(0,1/a)$ and constant density $m_2$ on $(1/a,1)$. Suggested solution
Show that the map $\theta:(0,1)\rar(0,1)$ $$ \theta(x)\colon=\left\{\begin{array}{cl} \frac x{1-x}&\mbox{if $x\in(0,1/2]$}\\ \frac{1-x}x&\mbox{if $x\in(1/2,1)$} \end{array}\right. $$ has an invariant measure with density $1/x$.
Suppose $\mu$ is a Borel probability measure on the Polish space $S$ and let $\P\colon=\mu^{\otimes\N}$ be the product measure on $\O\colon=S^\N$. The so-called shift operator $\Theta:\O\rar\O$ defined by $\Prn_n(\Theta(\o))=\o_{n+1}$ is obviously measurable (with respect to the product $\s$-algebra) and $\P$ is invariant under $\Theta$. $\Theta$ is also known as the Bernoulli shift and the dynamical system as a Bernoulli scheme (cf. exam).
Since the Borel $\s$-algebra of $\O$ is generated by the projections $\Prn_n$, $n\in\N$, we only have to show that for all $A\in{\cal B}(S)$ and all $n\in\N$: $\P(\Theta\in[\Prn_n\in A])=\P([\Prn_n\in A])$. Now by definition $\P([\Prn_n\in A])=\mu(A)$ and since $\Prn_n\circ\Theta=\Prn_{n+1}$ we conclude that $$ \P(\Theta\in[\Prn_n\in A]) =\P(\Prn_n\circ\Theta\in A) =\P(\Prn_{n+1}\in A) =\mu(A)~. $$ Remark: In probability the projections $\Prn_n$ are independent and identically distributed $S$-valued random variables, a so called i.i.d. sequence in $S$ with distribution $\mu$.
On $S=[0,1]$ define $\theta:S\rar S$ by $$ \theta(x)=\left\{\begin{array}{cl} 2x&\mbox{if $x < 1/2$}\\ 2(1-x)&\mbox{if $x\geq1/2$} \end{array}\right. $$ Then the Lebesgue measure is invariant under $\theta$. Similarly $\theta(x)=2x-[2x]$ on $[0,1]$ or on $[0,1)=\R/\Z$: $\theta(x)=2x\,\modul(1)$.
On the set $S=[0,1]^2$ define $$ \theta(x,y)=\left\{\begin{array}{cl} (2x,y/2)&\mbox{if $x < 1/2$}\\ (2-2x,1-y/2)&\mbox{if $x\geq1/2$} \end{array}\right. $$ Then the Lebesgue measure is invariant under $\theta$. The transformation $\theta$ is known as the folded baker transformation. Suggested solution. Solution by T. Speckhofer. The unfolded baker transformation $\theta:S\rar S$ is given by $\theta(x,y)=(2x-[2x],(y+[2x])/2)$, cf. e.g. wikipedia. Both the folded and the unfolded map are invertible!

The Gauß map

On the set $S=[0,1)$ define the Gauss map $\theta(x)\colon=1/x-[1/x]$ for $x\neq0$ and $\theta(0)\colon=0$. Now let $\mu$ be the probability measure on $S$ with density $(\log 2)^{-1}(1+x)^{-1}$, then $\mu$ is $\theta$-invariant.
[Figure: the Gauss map]
For $0 < t < 1$ we have $\mu_\theta((0,t])=\mu(0 < \theta\leq t)$. Moreover $x\in[0 < \theta\leq t]$ iff $1/x\leq[1/x]+t$, i.e. $1/x\in\bigcup_{n\geq1}[n,n+t]$, and this holds if and only if $x\in\bigcup_{n\geq1}[1/(n+t),1/n]$. \begin{eqnarray*} \log 2\,\mu_\theta((0,t]) &=&\sum_{n=1}^\infty\int_{1/(n+t)}^{1/n}\frac1{1+x}\,dx =\sum_{n=1}^\infty\log\frac{1+1/n}{1+1/(n+t)}\\ &=&\sum_{n=1}^\infty\Big(\log\frac{1+n}{1+n+t}-\log\frac n{n+t}\Big) =-\log\frac1{1+t} =\log 2\,\mu((0,t])~. \end{eqnarray*}
Let $\theta$ be the Gauss map. For real numbers $x_1,\ldots,x_k\geq1$ put $$ \la x_1\ra\colon=\frac1{x_1} \quad\mbox{and}\quad \la x_1,\ldots,x_{k+1}\ra=\frac1{x_1+\la x_2,\ldots,x_{k+1}\ra}~. $$ In particular: $$ \la x_1,x_2\ra=\frac1{x_1+\frac1{x_2}}, \la x_1,x_2,x_3\ra=\frac1{x_1+\frac1{x_2+\frac1{x_3}}},\ldots $$ Provided $\theta^k(x)\neq 0$ define for $x\in(0,1)$: $a_1,a_2,\ldots:(0,1)\rar\N$ by $$ a_1(x)\colon=[1/x],\quad a_{k+1}(x)\colon=[1/\theta^k(x)]~. $$ The sequence $a_1(x),a_2(x),\ldots$ is called the continued fraction expansion of $x$; a small C sketch computing these digits follows the list below. Verify the following statements: (suggested solution)
  1. For all $x\in[0,1)$ and all $k\in\N$ such that $x,\theta(x),\ldots,\theta^{k-1}(x)\neq0$: $$ x=\la a_1(x),\ldots,a_k(x)+\theta^k(x)\ra \quad\mbox{and}\quad \theta(x)=\la a_2(x),\ldots,a_k(x)+\theta^k(x)\ra~. $$
  2. For $n_1,\ldots,n_k\in\N$ and $t\in(0,1)$: $$ \theta(\la n_1,\ldots,n_k+t\ra)=\la n_2,\ldots,n_k+t\ra~. $$
  3. Put $I=[0,1)\sm\Q$. For all $k\in\N$ we have $\theta^k(I)\sbe I$. Hence the functions $a_k:I\rar\N$ are defined for all $k\in\N$.
  4. For all $x\in[0,1)\cap\Q$ there is some $k\in\N$ such that $\theta^k(x)=0$. Conversely, if $\theta^k(x)=0$, then $x$ must be rational.
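The digits $a_k$ are easily computed by iterating the Gauss map; a minimal C sketch (beware: the floating point error roughly doubles with every step, so only the first dozen or so digits obtained from a double are trustworthy). For $x=(\sqrt5-1)/2$ all digits equal $1$:
\begin{verbatim}
#include <stdio.h>
#include <math.h>

int main(void) {
    double x = (sqrt(5.0) - 1.0) / 2.0;  /* golden ratio conjugate */
    for (int k = 1; k <= 15 && x != 0.0; k++) {
        double y = 1.0 / x;
        long a = (long)y;                /* a_k = [1/theta^{k-1}(x)] */
        printf("a_%d = %ld\n", k, a);
        x = y - a;                       /* one step of the Gauss map */
    }
    return 0;
}
\end{verbatim}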
For $n_1,n_2,\ldots\in\N$ we put \begin{eqnarray*} I_{n_1}&\colon=&(\la n_1+1\ra,\la n_1\ra),\\ I_{n_1,n_2}&\colon=&(\la n_1,n_2\ra,\la n_1,n_2+1\ra),\\ I_{n_1,n_2,n_3}&\colon=&(\la n_1,n_2,n_3+1\ra,\la n_1,n_2,n_3\ra), \quad\mbox{etc.} \end{eqnarray*} Then the following holds (suggested solution):
  1. $\theta$ is a homeomorphism from $I_{n_1,\ldots,n_k}$ onto $I_{n_2,\ldots,n_k}$.
  2. $I_{n_1,\ldots,n_k,n_{k+1}}\sbe I_{n_1,\ldots,n_k}$.
  3. For all $j\leq k$: $a_j|I_{n_1,\ldots,n_k}=n_j$.
  4. The open interval $I_{n_1,\ldots,n_k}$ has length at most $2^{-k}$. Thus if $a_j(x)=a_j(y)=n_j$ for all $j\leq k$, then $|x-y|\leq2^{-k}$.
The mapping $a:I\rar\N^\N$, $x\mapsto(a_n(x))$ is a homeomorphism with inverse $(n_1,\ldots,n_k,\ldots)\mapsto\lim_k\la n_1,\ldots,n_k\ra$.

An example on the real line

Suppose $\a\in\R$, $a_1 < a_2 <\cdots < a_n$, $a_0=-\infty$, $a_{n+1}=\infty$ and $c_1,\ldots,c_n > 0$. Define $\theta:\cl\R\rar\cl\R$ by $$ \theta(x)\colon=\left\{ \begin{array}{cl} x+\a-\sum_{j=1}^nc_j(x-a_j)^{-1}&\mbox{if $x\notin\{a_0,\ldots,a_{n+1}\}$}\\ \infty&\mbox{if $x\in\{a_0,\ldots,a_{n+1}\}$} \end{array}\right. $$
  1. For all $t\in\R$ the equation $\theta(x)=t$ has exactly $n+1$ real solutions $x_j(t)$ and $a_j < x_j(t) < a_{j+1}$.
  2. $\sum_{j=0}^nx_j^\prime(t)=1$ and for all $f\in L_1(\R)$: $$ \int f\circ\theta\,d\l=\int f\,d\l~. $$
1. $\theta|(a_j,a_{j+1})$ is continuous, $\theta^\prime > 1$ and $$ \lim_{x\dar a_j}\theta(x)=-\infty,\quad \lim_{x\uar a_{j+1}}\theta(x)=\infty~. $$ Moreover $\lim_{x\to\pm\infty}\theta(x)=\pm\infty$.
2. Fix $t\in\R$ and put $q(x)\colon=(x-x_0)\ldots(x-x_n)$, $p(x)\colon=(x-a_1)\ldots(x-a_n)=x^n-x^{n-1}\sum a_j+\cdots$ and $p_j(x)=p(x)/(x-a_j)$. Then \begin{eqnarray*} q(x) &=&(\theta(x)-t)p(x) =(x+\a-t)p(x)-\sum_{j=1}^nc_jp_j(x)\\ &=&x(x^n-x^{n-1}\sum a_j)+(\a-t)x^n+r(x), \end{eqnarray*} where $r$ is some polynomial of degree $n-1$ at most. Comparing the coefficients of $x^n$ we find: $$ -\sum_{j=0}^n x_j=-\sum_{j=1}^na_j+\a-t~. $$ Therefore $\theta:(a_j,a_{j+1})\rar\R$ is a diffeomorphism with inverse $x_j$ and by the transformation theorem of measure theory: $$ \int f\circ\theta\,d\l =\sum_{j=0}^{n}\int_{a_j}^{a_{j+1}}f(\theta(x))\,dx =\sum_{j=0}^{n}\int f(t)x_j^\prime(t)\,dt =\int f\,d\l~. $$
Lebesgue measure on $\R^+$ is invariant under $\theta(x)=|x-1/x|$.
Verify that $$ \int_\R e^{-x^2-1/x^2}\,dx=\frac{\sqrt\pi}{e^2}~. $$
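Hint (a one line reduction, using the invariance of Lebesgue measure under $\theta(x)=x-1/x$ from the example above): since $x^2+1/x^2=(x-1/x)^2+2$, $$ \int_\R e^{-x^2-1/x^2}\,dx =e^{-2}\int_\R e^{-(x-1/x)^2}\,dx =e^{-2}\int_\R e^{-x^2}\,dx =\frac{\sqrt\pi}{e^2}~. $$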
For $t > 0$ let $\mu_t$ be the measure on $\R^+$ with density $$ x\mapsto\frac{t\,\exp(-t^2/4x)}{\sqrt{4\pi x^3}}~. $$ 1. Show that the Laplace transform $\o_t(y)\colon=\int_0^\infty e^{-xy}\,\mu_t(dx)$ is given by $\o_t(y)=e^{-t\sqrt y}$. 2. Conclude that $\mu_s*\mu_t=\mu_{s+t}$. The measures $\mu_t$ are called $1/2$-stable probability measures on $\R^+$ ($1/2$ refers to the square root of $y$ in $\o_t$). Suggested solution.

Homomorphic and isomorphic systems

A dynamical system $(\wt S,\wt\F,\wt\mu,\wt\theta)$ is said to be homomorphic to $(S,\F,\mu,\theta)$, if there is a measurable mapping $F:(S,\F)\rar(\wt S,\wt\F)$ such that $\wt\mu=\mu_F$ and $\wt\theta\circ F=F\circ\theta$. If in addition $(S,\F,\mu,\theta)$ is also homomorphic to $(\wt S,\wt\F,\wt\mu,\wt\theta)$, then $(S,\F,\mu,\theta)$ and $(\wt S,\wt\F,\wt\mu,\wt\theta)$ are said to be isomorphic.
If $\mu$ is $\theta$-invariant, then $\mu_F$ is $\wt\theta$-invariant, because for all $B\in\wt\F$ we have by definition: \begin{eqnarray*} \mu_F(\wt\theta^{-1}(B)) &=&\mu(F^{-1}(\wt\theta^{-1}(B))) =\mu((\wt\theta\circ F)^{-1}(B)) =\mu((F\circ\theta)^{-1}(B))\\ &=&\mu(\theta^{-1}(F^{-1}(B))) =\mu(F^{-1}(B)) =\mu_F(B)~. \end{eqnarray*} Suppose $R$ is an equivalence relation on a Polish space $S$ such that $S/R$ is again Polish. Let $\pi:S\rar S/R$ be the quotient map. If a continuous map $\theta:S\rar S$ is constant on all sets $R(x)$, then there is a continuous map $\wh\theta:S/R\rar S/R$ such that $\wh\theta\circ\pi=\pi\circ\theta$. Finally, if $\mu$ is a $\theta$-invariant probability measure on $S$, then its image measure $\mu_\pi$ is $\wh\theta$-invariant.
For $S\colon=(0,1)$ and $\theta(x)=4x(1-x)$ let $\mu$ be the probability measure with density $f(x)=\pi^{-1}(x(1-x))^{-1/2}$. Then $\mu$, $\d_0$ and $\d_{3/4}$ are $\theta$-invariant. Moreover, put $F(x)=\sin^2(\pi x/2)$, then (suggested solution) $$ F^{-1}\circ\theta\circ F(x) =\left\{\begin{array}{cl} 2x&\mbox{if $x < 1/2$}\\ 2(1-x)&\mbox{if $x\geq1/2$} \end{array}\right. $$
Let $S=\R^+$, $\mu(dx)=e^{-x}dx$ and $\theta(x)=-\log|1-2e^{-x}|$. Then $\mu$ is $\theta$-invariant. Suggested solution.

Recurrence

Suppose $\theta$ is a measure preserving map on the probability space $(\O,\F,\P)$. Then for all $A\in\F$: $$ \P\Big(A\cap\limsup_n[\theta^n\in A]\Big)=\P(A)~. $$
$\proof$ Let $B$ be the set of all $\o\in A$, such that for all $k\geq1$: $\theta^k(\o)\notin A$, i.e. $$ B \colon=A\cap\bigcap_{k\geq1}[\theta^k\in A^c] =A\cap\bigcap_{k\geq1}\theta^{-k}(A^c)~. $$ Then for all $n\geq1$: $B\cap\theta^{-n}(B)=\emptyset$, for $\theta^{-n}(B)\sbe\theta^{-n}(A)$ and $B\sbe\theta^{-n}(A^c)$. It follows that: $$ \forall m\neq n\qquad\theta^{-m}(B)\cap\theta^{-n}(B)=\emptyset $$ and thus the sets $[\theta^n\in B]$, $n\in\N$, are pairwise disjoint. Since all of these sets have equal probability this probability is necessarily zero. The same reasoning applied to $\theta^n$ instead of $\theta$ shows that $$ \forall n\in\N\qquad\P\Big(A\cap\bigcap_{k\geq1}[\theta^{nk}\in A^c]\Big)=0 \quad\mbox{i.e.}\quad \P\Big(A\cap\bigcup_n\bigcap_{k\geq1}[\theta^{nk}\in A^c]\Big)=0~. $$ Finally $$ \liminf_n[\theta^n\in A^c] =\bigcup_n\bigcap_{k\geq n}[\theta^k\in A^c] \sbe\bigcup_n\bigcap_{k\geq1}[\theta^{nk}\in A^c] $$ and therefore $\P\Big(A\cap\Big(\limsup_n[\theta^n\in A]\Big)^c\Big)=\P(A\cap\liminf_n[\theta^n\in A^c])=0$. $\eofproof$
Hence for $\P$ almost all $\o\in A$ there is an infinite number of $n\in\N$ such that $\theta^n(\o)\in A$; in other words the sequence $\theta^n(\o)$ hits $A$ infinitely many times!
Suppose $\theta$ is a measure preserving map on the probability space $(\O,\F,\P)$. Then for all $A\in\F$ (suggested solution): $$ \limsup_n\P(A\cap[\theta^n\in A])\geq\P(A)^2~. $$
The following result is a topological version of the previous proposition. Recall the following definitions from topology:
  1. A subset $G$ of a metric space $S$ is called a $G_\d$-set if it's the intersection of a sequence of open sets.
  2. A basis for the topology of $S$ is a collection $U_\a$, $\a\in I$, of open subsets, such that for every open subset $V$ of $S$ there is a subset $J\sbe I$ such that $V=\bigcup_{\a\in J}U_\a$.
For example, the irrational numbers form a $G_\d$ subset of the reals. Any closed or open subset of a metric space is a $G_\d$ subset, but a countable dense subset of a complete metric space without isolated points - such as $\Q\sbe\R$ - is never a $G_\d$! If $D$ is a dense subset of a metric space $S$, then the collection $$ \{B_r(x):x\in D,r\in\Q^+\} $$ is a basis for the topology of $S$.
For any $\e > 0$ find an open subset $U_\e$ of $\R$ such that $\l(U_\e) < \e$ and $U_\e\spe\Q$. Find a dense $G_\d$-subset $A$ of $\R$ such that $\l(A)=0$.
Suppose $\theta:S\rar S$ is a continuous transformation on a Polish space $S$ and for each pair of open sets $U,V$ there is some $n\in\N$ such that $U\cap[\theta^n\in V]\neq\emptyset$. Then the set of points $x\in S$ for which the set $\{\theta^n(x):n\in\N\}$ is dense is itself a dense $G_\d$-set.
$\proof$ $S$ being separable and metrizable there is a countable basis $V_k$, $k\in\N$, for the topology of $S$. By assumption $$ U_k\colon=\bigcup_{n\in\N}[\theta^n\in V_k] $$ is a dense subset of $S$ and it's evidently open. Since $S$ is a Baire space the set $G\colon=\bigcap U_k$ is a dense $G_\d$-set. Moreover, for each $x\in G$ and each $k$ there is some $n$ with $\theta^n(x)\in V_k$; as the sets $V_k$ form a basis for the topology of $S$, the set $\{\theta^n(x):n\in\N\}$ must be dense. $\eofproof$
Suppose $\theta$ is measurable and $A\in\F$ satisfies $\mu(\theta^{-1}(A)\D A)=0$. Then there is a subset $B$ in the $\mu$-completion $\F^\mu$ of $\F$ such that $\mu(A\D B)=0$ and $\theta^{-1}(B)=B$. Suggested solution
For a measure space $(S,\F,\mu)$ and $A,B\in\F$ we will write $A=B$ if $\mu(A\D B)=0$.
If $\mu$ is finite then $\F$ can be seen as a subspace of $L_1(\mu)$: Prove that $\norm{I_A-I_B}_1=\mu(A\D B)$ and that $\F$ with the metric inherited from $L_1(\mu)$ is a complete metric space. Solution by T. Speckhofer

Ergodic transformations

Let $(S,\F,\mu)$ be a $\s$-finite measure space. A subset $A\in\F$ is said to be $\theta$-invariant, if $[\theta\in A]=A$. $\theta$ will be said to be ergodic on $(S,\F,\mu)$ if $A\in\F$ and $[\theta\in A]=A$ implies: $\mu(A)=0$ or $\mu(A^c)=0$.
We will mostly assume that $\mu$ is a probability measure; in this case $\theta$ is ergodic iff $[\theta\in A]=A$ implies: $\mu(A)\in\{0,1\}$.
Let $x_0\in S$ and let $\theta:S\rar S$ be the transformation $x\mapsto x_0$. Then $\theta$ is ergodic with respect to any probability measure on $S$. However, the only $\theta$-invariant probability measure is the Dirac measure $\d_{x_0}$.
When talking about ergodicity of a transformation $\theta$ with respect to a probability measure $\mu$ we implicitly assume that $\mu$ is $\theta$-invariant, i.e. $\theta$ preserves $\mu$!
Suppose $(\wt S,\wt\F,\wt\mu,\wt\theta)$ is homomorphic to $(S,\F,\mu,\theta)$, i.e. there is a measurable mapping $F:(S,\F)\rar(\wt S,\wt\F)$ such that $\wt\mu=\mu_F$ and $\wt\theta\circ F=F\circ\theta$. If $(S,\F,\mu,\theta)$ is ergodic, then $(\wt S,\wt\F,\wt\mu,\wt\theta)$ is ergodic as well.
For any countable set $S$ the counting measure is invariant under every permutation (i.e. bijection) $\theta:S\rar S$. 2. If $S$ is finite, then a permutation $\theta:S\rar S$ is ergodic with respect to the counting measure iff $\theta$ is a cycle of length $|S|$.
Suppose $\mu(S) < \infty$, then $\theta$ is ergodic if and only if the constant functions are the only functions $f\in L_p(\mu)$ such that $Pf=f$ (and thus for all $n$: $P_nf=f$): If $\theta$ is not ergodic then there exists $A\in\F$ such that $\mu(A),\mu(A^c) > 0$ and $[\theta\in A]=A$; it follows that $PI_A=I_A$ and $I_A$ is not constant. Conversely if $Pf=f\circ\theta=f$ for some non-constant function $f$, then for all Borel sets $B$ in $\R$: $$ A \colon=[f\in B] =[f\circ\theta\in B] =[\theta\in[f\in B]] =[\theta\in A]~. $$ Since $f$ is not constant we may choose $B$ such that $\mu(A) > 0$ and $\mu(A^c) > 0$, i.e. $\theta$ is not ergodic.
Suppose $S$ is a metric space. 1. The space $C_b(S)$ of all bounded, continuous, real valued functions with $\norm f\colon=\sup\{|f(x)|:x\in S\}$ is a Banach space. 2. The subspace $C_{bu}(S)$ of all bounded and uniformly continuous functions is a closed subspace of $C_b(S)$. 3. If in addition $S$ is locally compact (i.e. every point has a compact neighborhood), then $$ C_0(S)\colon=\{f\in C_b(S):\forall\e > 0\,\exists K\mbox{ compact: }\sup_{x\in K^c}|f(x)| < \e\} $$ is a closed subspace of $C_b(S)$. If $S$ is discrete we also write $c_0(S)$.
If $S$ is separable and locally compact and $\theta:S\rar S$ is continuous and proper (i.e. preimages of compact sets are compact), then $P:C_0(S)\rar C_0(S)$, $f\mapsto f\circ\theta$ is a linear operator on $C_0(S)$ satisfying $\norm P=1$. By the Riesz Representation Theorem the space $M(S)$ of all finite signed Borel measures $\mu$ on $S$ is isometrically isomorphic to the dual $C_0(S)^*$ of $C_0(S)$: for any $x^*\in C_0(S)^*$ there is exactly one signed Borel measure $\mu$ such that for all $f\in C_0(S)$: $$ x^*(f)=\int f\,d\mu \quad\mbox{and}\quad \norm{x^*}=|\mu|(S)=\sup\Big\{\int f\,d\mu:\norm f\leq1\Big\}~. $$ Using this identification the dual (or adjoint) mapping $P^*:M(S)\rar M(S)$ is given by $\mu\mapsto\mu_\theta$: indeed for all $f\in C_0(S)$ we have by definition and the transformation theorem of measure theory $$ \int f\,dP^*\mu =\int Pf\,d\mu =\int f\circ\theta\,d\mu =\int f\,d\mu_\theta \quad\mbox{i.e.}\quad P^*\mu=\mu_\theta~. $$ The Banach space $M(S)$ is the space of all finite signed Borel measures $\mu$ on $S$ equipped with the total variation norm: $\norm{\mu}\colon=|\mu|(S)$. If in addition $S$ is discrete, then $M(S)=\ell_1(S)$. A formally more general type of operators are the so called Markov operators, which we are going to discuss in the subsequent section.

Translations on $\TT^d$

Suppose $h_1,\ldots,h_d$ are rationally independent, i.e. for all $(n_1,\ldots,n_d)\in\Z^d\sm\{0\}$: $$ \sum n_jh_j\neq0\,\modul(1) \quad\mbox{or equivalently}\quad \la n,h\ra\colon=\sum n_jh_j\notin\Z~. $$ Denote by $\theta:\TT^d\rar\TT^d$ the mapping $$ (x_1,\ldots,x_d)\mapsto(x_1+2\pi h_1,\ldots,x_d+2\pi h_d)=x+2\pi h~. $$ Then $\theta$ is ergodic. Conversely, if $\theta$ is ergodic, then $h_1,\ldots,h_d$ are rationally independent.
$\proof$ We recall that the functions $e_n:x\mapsto\exp(i\la n,x\ra)$, $n\in\Z^d$, form an orthonormal basis of $L_2(\TT^d)$. Hence for all $f\in L_2(\TT^d)$: $$ f=\sum_n c_ne_n \quad\mbox{where}\quad c_n\colon=\la f,e_n\ra =\frac1{(2\pi)^d}\int_0^{2\pi}\ldots\int_0^{2\pi} f(t)e^{-i\la n,t\ra}\,dt_1\cdots dt_d~. $$ If $f$ is not constant and $\theta$-invariant, i.e. $f=f\circ\theta$, then for all $n\in\Z^d$ satisfying $c_n\neq0$: $\exp(2\pi i\la n,h\ra)=1$, i.e. $\la n,h\ra\in\Z$. Since $f$ is not constant, there must be such an $n\in\Z^d\sm\{0\}$ and therefore $h$ must be rationally dependent. Conversely, if $h$ is rationally dependent, then there is some $n\in\Z^d\sm\{0\}$, such that $\la n,h\ra\in\Z$. It follows that $e_n$ is $\theta$-invariant. $\eofproof$
$h_1,\ldots,h_d$ are rationally independent iff the set $2\pi h\Z\colon=\{2\pi nh:n\in\Z\}$ is a dense subgroup of $\TT^d$.
Let $\SU(d)$ be the special unitary group $\SU(d)\colon=\{U\in\Gl(d,\C): U^*U=1,\det U=1\}$. Find a condition on the eigenvalues of $U\in\SU(d)$ such that the set $\{U^n:n\in\Z\}$ is finite.

Markov chains

Conditional expectation

The conditional expectation $\E(X|\F^\prime)$
of an integrable random variable $X:(\O,\F)\rar\R$ with respect to a sub-$\s$-algebra $\F^\prime$ of $\F$ is an $\F^\prime$-measurable random variable, i.e. $\E(X|\F^\prime):(\O,\F^\prime)\rar\R$, such that $$ \forall A^\prime\in\F^\prime:\quad \E(\E(X|\F^\prime);A^\prime)=\E(X;A^\prime)\colon=\int_{A^\prime}X\,d\P~. $$ By the Radon-Nikodym-Theorem $\E(X|\F^\prime)$ exists and it's $\P$-a.s. unique. The function $\E(I_A|\F^\prime)$ is called the conditional probability of $A\in\F$ given $\F^\prime$ and it's denoted by $\P(A|\F^\prime)$. In general the mapping $A\mapsto\P(A|\F^\prime)(\o)$ is not a probability measure for $\P$ almost all $\o\in\O$ - this is due to the fact that for every sequence of pairwise disjoint sets $A_n\in\F$ the equality $$ \P(\bigcup A_n|\F^\prime)=\sum\P(A_n|\F^\prime) $$ only holds a.e. But there might be loads of such sequences, and excluding the union of loads of null sets may wind up excluding a set which is not null! In case it is a probability measure for $\P$ almost all $\o\in\O$ it's called a regular conditional probability. For these and any other probabilistic notions we refer to R. Durrett, Probability: Theory and Examples.
Suppose $\O_j$, $1\leq j\leq n$ is a partition of $\O$ such that $\P(\O_j) > 0$ and $\F^\prime=\s(\O_j,1\leq j\leq n)$. Prove that $$ \E(X|\F^\prime)=\sum_j\frac1{\P(\O_j)}\E(X;\O_j)I_{\O_j}~. $$ Moreover on the set $\O_j$ we have $\P(A|\F^\prime)=\P(A\cap\O_j)/\P(\O_j)=\colon\P(A|\O_j)$ and therefore $$ \P(A|\F^\prime)=\sum_j\P(A|\O_j)I_{\O_j}~. $$
Let $\P$ be a probability measure on $\R^2$ with density $\r$ and $X,Y$ denote the projections $(x,y)\mapsto x$ and $(x,y)\mapsto y$ respectively. If $\int\r(x,y)\,dx > 0$, then for all measurable $f:\R^2\rar[0,\infty]$ (suggested solution): $$ \E(f|Y=y) =\frac{\int f(x,y)\r(x,y)\,dx}{\int\r(x,y)\,dx}, $$ where $\E(f|Y=y)\in\R$ is a convenient notation for the value of $\E(f|Y)\colon=\E(f|\s(Y))$ on the set $[Y=y]$. If $\P$ is the uniform distribution on $(a,b)\times(c,d)$, then $$ \E(f|X=x)=\frac1{d-c}\int_c^d f(x,y)\,dy\quad\mbox{and}\quad \E(f|Y=y)=\frac1{b-a}\int_a^b f(x,y)\,dx~. $$
In polar coordinates $(x,y,z)=(\cos\vp\cos\theta,\sin\vp\cos\theta,\sin\theta)$, $\vp\in(0,2\pi)$, $\theta\in(-\pi/2,\pi/2)$, the normalized Haar measure $\s$ on the $2$-sphere $S^2$ has density $(4\pi)^{-1}\cos\theta$. Thus the conditional expectations of $f:S^2\rar\R$ given $\theta$ and $\vp$, respectively, are given by \begin{eqnarray*} \E(f|\theta) &=&\frac1{2\pi}\int_0^{2\pi}f(\cos\vp\cos\theta,\sin\vp\cos\theta,\sin\theta)\,d\vp \quad\mbox{and}\\ \E(f|\vp) &=&\frac1{2}\int_{-\pi/2}^{\pi/2}f(\cos\vp\cos\theta,\sin\vp\cos\theta,\sin\theta)\cos\theta\,d\theta~. \end{eqnarray*}

Markov chains and Markov operators

Suppose $(X_n,\F_n,\P^x)$ is a (homogeneous) Markov chain in a Polish space $S$, i.e.
  1. $\F_n$ is an increasing sequence of sub $\s$-algebras of $\F$.
  2. $X_n$ is an $\F_n$-measurable random variable, i.e. $X_n:(\O,\F_n)\rar(S,\B(S))$.
  3. For all $x\in S$: $\P^x$ is a probability measure on $(\O,\F)$ such that $\P^x(X_0=x)=1$ and $x\mapsto\P^x(A)$ is measurable for all $A\in\F$.
  4. There is a positive (i.e. $f\geq0$ implies $Pf\geq0$) linear operator $P:B(S)\rar B(S)$ mapping bounded measurable functions to bounded measurable functions such that for all $x\in S$ and all $f\in B(S)$: $$ \E^x(f(X_{n+1})|\F_n)=Pf(X_n) \quad\P^x\mbox{-a.s.} $$ That's called the Markov property!
$P$ is called the Markov operator and $X_n$ is a (homogeneous) Markov chain under $\P^x$ starting in $x$. If in addition $P:C_b(S)\rar C_b(S)$, then the Markov chain is called a Feller chain and $P$ a Feller operator. Sometimes $C_b(S)$ is replaced with $C_{bu}(S)$ or $C_0(S)$ - in the latter case $S$ is assumed to be locally compact (cf. section) - this is just a technical issue.
We have for all bounded measurable $f:S\rar\R$ and all $x\in S$: $\E^x(f(X_{n+1})|\F_n)=\E^x(f(X_{n+1})|X_n)$ - because the left hand side is $Pf(X_n)$, which is measurable with respect to the $\s$-algebra generated by $X_n$ - and $$ \E^x f(X_1) =\E^x\E^x(f(X_1)|\F_0) =\E^x Pf(X_0) =Pf(x)~. $$
Prove by induction on $m$ that for all $n,m\in\N_0$: $\E^x(f(X_{n+m})|\F_n)=P^mf(X_n)$. Suggested solution.

Transition functions

Let us put for $A\in\B(S)$ and $x\in S$: $$ P(x,A)\colon=PI_A(x)=\P^x(X_1\in A). $$ Then $A\mapsto P(x,A)$ is a probability measure, $x\mapsto P(x,A)$ is measurable and by the Markov property $P(X_n,A)$ is the conditional probability that $X_{n+1}\in A$ given $\F_n$, i.e. $$ P(X_n,A)=\P^x(X_{n+1}\in A|\F_n) \quad\P^x -a.s~. $$ These conditions simply say that $P(X_n,A)$ is a regular conditional probability for $X_{n+1}$ given $\F_n$. Therefore we also have $$ \forall f\in B(S):\quad \E^x(f(X_{n+1})|\F_n) =\int f(y)\,P(X_n,dy)~. $$ Finally the Markov operator is given by \begin{equation}\label{mareq1}\tag{MAR1} Pf(x)=\int f(y)\,P(x,dy) \end{equation} The mapping $P:S\times\B(S)\rar[0,1]$, $(x,A)\mapsto P(x,A)$, is called a Markovian transition function.
If $P_n(x,A)$ is a sequence of transition functions on $S$ and $p_n:S\rar[0,1]$ a sequence of measurable functions such that $\sum p_n(x)=1$, then $P(x,A)\colon=\sum_n p_n(x)P_n(x,A)$ is a transition function. Describe the corresponding Markov chain.
For all $n,m\in\N_0$: $$ \P^x(X_n\in A,X_{n+m}\in B)=\int_A P^mI_B(y)\,\P_{X_n}^x(dy)~. $$
By exam we have: \begin{eqnarray*} \P^x(X_n\in A,X_{n+m}\in B) &=&\E^x(I_A(X_n)\E^x(I_B(X_{n+m})|\F_n))\\ &=&\E^x(I_A(X_n)P^mI_B(X_n)) =\int_A P^mI_B(y)\,\P_{X_n}^x(dy)~. \end{eqnarray*} Any transformation $\theta:S\rar S$ defines a simple Markov chain, just put $P(x,A)=1$ if $\theta(x)\in A$ and otherwise $P(x,A)=0$, i.e. $P(x,A)=\d_{\theta(x)}(A)$.
Let $G=(V,E)$ be a finite undirected graph, $V$ the set of vertices and $E$ the set of edges. For $x\neq y\in V$ write $x\sim y$ if $\{x,y\}\in E$ and put $d(x)\colon=|\{y\in V:\,y\sim x\}|$ - the degree of the vertex $x$. As $G$ is undirected we have $x\sim y$ iff $y\sim x$. $G$ is said to be regular, if every vertex has the same degree. Now put $P(x,\{y\})=1/d(x)$ if $y\sim x$ and $P(x,\{y\})=0$ otherwise. Then, provided $G$ has no isolated vertices, $P(x,A)$ is a Markovian transition function. We say the corresponding Markov chain $X_n$ performs a random walk on the graph $G$. Compute its Markov operator.
A lonely knight starts a random walk at the corner of a chess board (performing permissible moves only). Determine all vertices of the graph and all edges: two vertices i.e. two positions $x$ and $y$ of the knight are connected iff the knight can make a permissible move from position $x$ to position $y$. Finally compute the degree of each vertex.
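A small C sketch computing the degree of every square of the knight's graph (the board squares are the vertices; the corners have degree $2$):
\begin{verbatim}
#include <stdio.h>

int main(void) {
    const int dr[8] = {1, 2, 2, 1, -1, -2, -2, -1};
    const int dc[8] = {2, 1, -1, -2, -2, -1, 1, 2};
    for (int r = 0; r < 8; r++) {          /* all 64 vertices */
        for (int c = 0; c < 8; c++) {
            int d = 0;
            for (int m = 0; m < 8; m++) {  /* count permissible moves */
                int nr = r + dr[m], nc = c + dc[m];
                if (nr >= 0 && nr < 8 && nc >= 0 && nc < 8) d++;
            }
            printf("%d ", d);              /* degree of square (r,c) */
        }
        printf("\n");
    }
    return 0;
}
\end{verbatim}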
Define a Markov operator on a directed graph $G=(V,E)$.
More generally, suppose $S$ is a finite (or countable) set and $p(x,y)$, $x,y\in S$, is a so-called stochastic matrix, i.e.
  1. for all $x,y\in S$: $p(x,y)\geq0$,
  2. for all $x\in S$: $\sum_yp(x,y)=1$.
Then $P(x,\{y\})\colon=p(x,y)$ is a Markovian transition function. Compute its Markov operator. Cf. e.g.
R. Durrett, Probability: Theory and Examples
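For a finite state space both $P$ and its adjoint are plain matrix-vector products; a minimal sketch in C (the matrix is the first one from the exercise further below, used purely as an illustration):
\begin{verbatim}
#include <stdio.h>

#define NS 3

/* Pf(x) = sum_y p(x,y) f(y): P acts on functions (column vectors). */
void apply_P(const double p[NS][NS], const double f[NS], double Pf[NS]) {
    for (int x = 0; x < NS; x++) {
        Pf[x] = 0.0;
        for (int y = 0; y < NS; y++) Pf[x] += p[x][y] * f[y];
    }
}

/* P*mu(y) = sum_x mu(x) p(x,y): P* acts on measures (row vectors). */
void apply_Pstar(const double p[NS][NS], const double mu[NS], double out[NS]) {
    for (int y = 0; y < NS; y++) {
        out[y] = 0.0;
        for (int x = 0; x < NS; x++) out[y] += mu[x] * p[x][y];
    }
}

int main(void) {
    double p[NS][NS] = {{0, 1, 0}, {0, .5, .5}, {.5, 0, .5}};
    double f[NS] = {1, 2, 3}, Pf[NS];
    apply_P(p, f, Pf);
    printf("Pf = (%g, %g, %g)\n", Pf[0], Pf[1], Pf[2]);
    return 0;
}
\end{verbatim}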
1. Show that the spectrum $\Spec(P)$ of every stochastic matrix $P$ - i.e. the set of its eigenvalues - is contained in the set $\{z\in\C:|z|\leq1\}$. 2. Which stochastic matrices $P\in\Ma(n,\R)$ are orthogonal?
Suppose $P,Q$ are stochastic matrices. Then their Kronecker product $P\otimes Q$ is also a stochastic matrix. If $S$ and $T$ are the state spaces of $P$ and $Q$, respectively, then $S\times T$ is a state space for $P\otimes Q$ and the Markov chain is $(X_n,Y_n)$ and $$ \P^{(x,y)}(X_1=u,Y_1=v)=\P^x(X_1=u)\P^y(Y_1=v)=p(x,u)q(y,v)~. $$
Suppose $Y_1,Y_2,\ldots:(\O,\P)\rar S$ is an i.i.d. sequence in $S$. Put for $x\in S$: $\P^x\colon=\d_x\otimes\P$, $X_0(x,\o)=x$, $X_j(x,\o)=Y_j(\o)$ and $\F_n=\s(X_0,\ldots,X_n)$. Then $(X_n,\F_n,\P^x)$ is a Markov chain defined on $S\times\O$ with values in $S$ and the Markov operator maps any $f\in B(S)$ to the constant function $\E f(Y_1)$.
If $X_1,X_2,\ldots$ is an i.i.d. sequence in $\R^d$ and $\P^x(S_0=x)=1$, then $S_{n+1}\colon=S_n+X_{n+1}$ is a Markov chain on $\R^d$ with $\F_n\colon=\s(S_0,X_1,\ldots,X_n)$. The Markov operator is given by: $$ Pf(y)=\int_{\R^d}f(y+z)\,\mu(dz) $$ where $\mu$ is the distribution of $X_1$ under $\P^x$. Moreover if $\E^x X_1=0$, then $(S_n,\F_n)$ is a martingale (under $\P^x$), i.e. for all $n\in\N_0$: $\E^x(S_{n+1}|\F_n)=S_n$.
Let $(X_n,\F_n,\P^x)$ be a Markov chain with Markov operator $P$. Then for all bounded measurable $f:S\rar\R$: $\E^x(f(X_n)|\F_m)=P^{n-m}f(X_m)$. In particular $P^{n-m}f(X_m)$, $m=0,\ldots,n$, is a martingale.
By the Markov property and exam we have: $\E^x(f(X_n)|\F_m)=P^{n-m}f(X_m)$.
Let $(X_n,\F_n,\P^x)$ be a Markov chain with Markov operator $P$ and put $L\colon=P-1$. Then for all bounded measurable $f:S\rar\R$ $$ M_n^f\colon=f(X_n)-f(X_0)-\sum_{j=0}^{n-1}Lf(X_j) $$ is a martingale with respect to $\P^x$, i.e. for all $n\in\N_0$: $\E^x(M_{n+1}^f|\F_n)=M_n^f$. Suggested solution.
Let $Z_1,Z_2,\ldots$ be i.i.d. random variables uniformly distributed on the sphere $S^{d-1}$ (cf. e.g. section). For a bounded domain $D$ in $\R^d$ put for any $x\in\cl D$: $R(x)\colon=d(x,D^c)$, $S_0\colon=x$ and $S_{n+1}\colon=S_n+R(S_n)Z_{n+1}$. Then $S_n$ is a Markov chain with respect to $\F_n\colon=\s(Z_1,\ldots,Z_n)$ and the Markov operator is given by: $$ Pf(y)=\int_{S^{d-1}}f(y+R(y)z)\,\s(dz)~. $$ where $\s$ denotes the normalized Haar measure on $S^{d-1}$.
For $f\in B(\R^d)$ we get by independence of $Z_{n+1}$ from $\F_n$: \begin{eqnarray*} Pf(S_n) &=&\E(f(S_{n+1})|\F_n)\\ &=&\E(f(S_n+R(S_n)Z_{n+1})|\F_n) =\int f(S_n+R(S_n)z)\,\s(dz)~. \end{eqnarray*}
Write a program which generates a random variable uniformly distributed on the sphere $S^{d-1}$, cf. section. Suggested solution.
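A sketch of one standard recipe (assuming nothing beyond the C library): fill a vector with independent standard normal variables - via Box-Muller - and normalize; by rotation invariance of the Gaussian distribution the result is uniformly distributed on $S^{d-1}$:
\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define PI 3.14159265358979323846

/* fill z[0..d-1] with a point uniformly distributed on S^{d-1} */
void uniform_sphere(double *z, int d) {
    double s = 0.0;
    for (int i = 0; i < d; i++) {
        double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);    /* u1 in (0,1) */
        double u2 = rand() / (RAND_MAX + 1.0);
        z[i] = sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2); /* N(0,1) */
        s += z[i] * z[i];
    }
    s = sqrt(s);                          /* s > 0 almost surely */
    for (int i = 0; i < d; i++) z[i] /= s;
}

int main(void) {
    double z[3];
    srand(1);
    uniform_sphere(z, 3);
    printf("%f %f %f\n", z[0], z[1], z[2]);
    return 0;
}
\end{verbatim}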
Suppose $|a| < 1$, $Z_1,Z_2,\ldots$ i.i.d. in $S=\R^d$ with distribution $\mu$ and put $X_n=aX_{n-1}+Z_n$. Then $X_n$ is a Markov chain with respect to $\F_n=\s(Z_1,\ldots,Z_n)$. This chain is called an autoregressive moving average process, ARMAP for short. 1. Show that its Markov operator is given by $$ Pf(y)=\int_{\R^d} f(ay+z)\,\mu(dz)~. $$ 2. In case $\mu$ is standard normal we get (suggested solution): $$ P^nf(y)=\int_{\R^d} f\Big(a^ny+z\sqrt{\frac{1-a^{2n}}{1-a^2}}\Big)\,\mu(dz)~. $$
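A Monte Carlo sanity check of the formula in 2. for $d=1$ (a sketch): starting at $y$, the variable $X_n$ should be normal with mean $a^ny$ and variance $(1-a^{2n})/(1-a^2)$:
\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define PI 3.14159265358979323846

static double gauss(void) {                        /* Box-Muller N(0,1) */
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = rand() / (RAND_MAX + 1.0);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2);
}

int main(void) {
    double a = 0.8, y = 1.0;
    int n = 10; long paths = 200000;
    double s = 0.0, s2 = 0.0;
    srand(7);
    for (long p = 0; p < paths; p++) {
        double x = y;
        for (int k = 0; k < n; k++) x = a * x + gauss();  /* ARMAP step */
        s += x; s2 += x * x;
    }
    double mean = s / paths, var = s2 / paths - mean * mean;
    printf("mean %f (expected %f)\n", mean, pow(a, n) * y);
    printf("var  %f (expected %f)\n", var,
           (1 - pow(a, 2 * n)) / (1 - a * a));
    return 0;
}
\end{verbatim}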

Shift operator and invariant measures

As we have already seen any transformation $\theta:S\rar S$ defines a simple Markov chain. Next we are going to establish the converse result: every Markov chain $X_n$ in a Polish space $S$ may be seen as a transformation on $\O=S^\N$. For this it suffices to assume that there is a mapping $\Theta:\O\rar\O$ such that for all $n$: $$ X_n\circ\Theta=X_{n+1}; $$ $\Theta$ is called the shift operator of the Markov chain and its existence usually follows from the construction of the underlying probability space: this operator is just the Bernoulli shift for the standard construction! Now the Markov property admits an important extension: Let $F:\O\rar\R$ be bounded and measurable with respect to the $\s$-algebra $\F_\infty^X$ generated by $X_0,X_1,\ldots$. Then for all $x\in S$ and all $n\geq0$: \begin{equation}\label{mareq2}\tag{MAR2} \E^x(F\circ\Theta_n|\F_n) =\E^{X_n}F\quad\P^x\mbox{-a.s.} \end{equation} where $\E^{X_n}F$ is the mapping $\o\mapsto\E^{X_n(\o)}F$ (which is $\s(X_n)$-measurable) and $\Theta_n$ is a short hand for the $n$-fold product: $\Theta\circ\cdots\circ\Theta$. The proof is very similar to the proof of proposition and can be found in almost all text books on Markov chains, in particular in R. Durrett, Probability: Theory and Examples. For e.g. $F=f(X_m)$ we have $F\circ\Theta_n=f(X_{m+n})$ and $\E^{X_n}f(X_m)=P^mf(X_n)$; hence in this special case \eqref{mareq2} is just exam.
For all $n,m\in\N_0$, all $A\in\F_n$ and all $B\in\B(S)$ (compare exam): $$ \P^x(X_{n+m}\in B,A)=\E^x(P^mI_B(X_n);A) $$
Since $X_{n+m}\in B$ iff $X_m\circ\Theta_n\in B$ we infer from the extended Markov property: $$ \P^x(X_{n+m}\in B,A) =\E^x(\E^x(I_{[X_m\in B]}\circ\Theta_n|\F_n)I_A) =\E^x(\P^{X_n}(X_m\in B)I_A) =\E^x(P^mI_B(X_n);A)~. $$ Finally let $\mu$ be a probability measure on $S$ and put for all $A\in\F$: $$ \P^\mu(A)\colon=\int\P^x(A)\,\mu(dx) $$ Then $\P^\mu$ is a probability measure on $\O$.
For all $F\in L_1(\P^\mu)$: $\E^\mu F=\int\E^x F\,\mu(dx)$.
Actually under $\P^\mu$ the sequence $X_0,X_1,X_2,\ldots$ is a Markov chain in $S$ with $$ \P^\mu(X_0\in B) =\int\P^x(X_0\in B)\,\mu(dx) =\int_B1\,\mu(dx) =\mu(B)~. $$ Therefore the initial distribution is $\mu$. To check the Markov property, i.e. $\E^\mu(f(X_{n+1})|\F_n)=Pf(X_n)$, we notice that for all $x\in S$: $\E^x(f(X_{n+1})|\F_n)=Pf(X_n)$ and thus for all $A\in\F_n$: \begin{eqnarray*} \E^\mu(Pf(X_n);A) &=&\int_S\E^x(Pf(X_n);A)\,d\mu\\ &=&\int_S\E^x(\E^x(f(X_{n+1})|\F_n);A)\,d\mu\\ &=&\int_S\E^x(f(X_{n+1});A)\,d\mu =\E^\mu(f(X_{n+1});A)~. \end{eqnarray*} Hence by the definition of conditional expectations $\P^\mu$ a.e.: $\E^\mu(f(X_{n+1})|\F_n)=Pf(X_n)$.
$\mu$ is said to be an invariant measure for the Markov chain, if \begin{equation}\label{mareq3}\tag{MAR3} \forall B\in\B(S):\quad \P^\mu(X_{n+1}\in B) =\P^\mu(X_n\in B) \end{equation}
This shows that $\mu$ is invariant iff for all $n\in\N_0$: $\P^\mu(X_n\in A)=\mu(A)$. We also say that under $\P^\mu$ the sequence $X_0,X_1,\ldots$ is stationary. Can we check stationarity by just inspecting the Markov operator? Yes! because this holds if and only if \begin{equation}\label{mareq4}\tag{MAR4} \forall f\in B(S):\quad \int Pf\,d\mu=\int f\,d\mu~. \end{equation} This is because the Markov property implies (compare exam or the previous reasoning for $A=\O$ and $f=I_B$): $$ \P^\mu(X_{n+1}\in B) =\E^\mu PI_B(X_n) =\int PI_B\,d\P_{X_n}^\mu~. $$ Thus if $\mu$ is invariant, then for all $n$: $\P_{X_n}^\mu=\mu$ and in particular $\P^\mu(X_{n+1}\in B)=\mu(B)$. Hence for all $B\in\B(S)$: $\int PI_B\,d\mu=\mu(B)$ and this is easily seen to be equivalent to \eqref{mareq4}. Conversely if \eqref{mareq4} holds, then we get for $n=0$: $\P^\mu(X_1\in B)=\int PI_B\,d\mu=\mu(B)=\P^\mu(X_0\in B)$ and the assertion follows by induction on $n$. We notice that \eqref{mareq4} also makes sense for arbitrary measures $\mu$!
Lebesgue measure is an invariant measure both for the Markov chain in exam and in exam.
However we will almost exclusively stick to probability measures!
If $\mu=\P_{Y_1}$ is the distribution of $Y_1$, then $\mu$ is invariant for the Markov chain in exam. This Markov chain is also called a Bernoulli scheme, cf. also exam.
If $\mu$ is invariant for the Markov chain $X_n$, then for all $n\in\N_0$ and all $A,B\in\B(S)$: $$ \P^\mu(X_n\in A,X_{n+m}\in B)=\int_AP^mI_B(y)\,\mu(dy)~. $$
Next we verify that the invariance of $\mu$ implies invariance of $\P^\mu$ by the shift operator $\Theta$, i.e. $\Theta$ preserves the probability measure $\P^\mu$ on $\O$: indeed, by the extended Markov property \eqref{mareq2} we have for an invariant measure $\mu$ and any bounded $\F_\infty^X$-measurable $F:\O\rar\R_0^+$: $$ \E^\mu(F\circ\Theta) =\E^\mu\E^\mu(F\circ\Theta|\F_1) =\E^\mu\E^{X_1}F =\E^\mu F, $$ i.e. $\P^\mu$ is a probability measure on $(\O,\F)$ invariant under $\Theta$. Thus any Markov chain with invariant measure can be regarded as a sort of discrete dynamical system on $(\O,\F)$ discussed in the previous section, provided $\O$ is Polish!
Given a stochastic matrix $p(x,y)$ on a discrete Polish space $S$. The associated Markov operator $P$ is a contraction on $\ell_\infty(S)$. 1. If for each finite subset $K$ of $S$ and each $\e > 0$ there is another finite subset $E$ of $S$ such that for all $x\notin E$: $P(x,K) < \e$ (i.e. $P(.,K)\in c_0(S)$), then $P$ is a contraction on $c_0(S)$ and thus $P^*$ is a contraction on $\ell_1(S)$ (the dual of $c_0(S)$ is isometrically isomorphic to $\ell_1(S)$). 2. A (signed) measure $\mu$ on $S$ is just a vector $\mu\in\ell_1(S)$. Verify that if $P$ is a contraction on $c_0(S)$ then $\mu$ is invariant iff $P^*\mu=\mu$, i.e. $$ \forall x\in S:\quad \sum_{y\in S}\mu(y)p(y,x)=\mu(x)~. $$ Give an example of an infinite stochastic matrix such that $P$ doesn't map $c_0(S)$ into $c_0(S)$. Suggested solution.
A stochastic matrix $P=(p(x,y))$ is called doubly stochastic if for all $x,y$: $\sum_zp(x,z)=\sum_z p(z,y)=1$. 1. The normalized counting measure is an invariant probability measure for $P$. 2. The transition matrix of the random walk on a finite undirected graph $G=(V,E)$ (cf. exam) is doubly stochastic iff for all $y\in V$: $\sum_{z\sim y}\frac1{d(z)}=1$. 3. Permutation matrices are doubly stochastic.
Suppose $P,Q$ are stochastic matrices with invariant measures $\mu$ and $\nu$, respectively. Then their Kronecker product $P\otimes Q$ (cf. exam) has invariant measure $\mu\otimes\nu$.

Ergodicity

Formally the Markov operator $P$ is only defined on the vector space $B(S)$ of bounded, measurable functions on $S$. Now the existence of an invariant measure $\mu$ allows us to define $P$ as a positive, linear contraction on the Hilbert space $L_2(\mu)$:
If $\mu$ is invariant then $P:L_2(\mu)\rar L_2(\mu)$ is a contraction.
$\proof$ 1. By Jensen's inequality we have for all $f\in B(S)$ and all $x\in S$: $$ (Pf)^2(x) =\Big(\int f(y)\,P(x,dy)\Big)^2 \leq\int f(y)^2\,P(x,dy) =P(f^2)(x) $$ and by invariance for $f\in B(S)\cap L_2(\mu)$: $$ \int(Pf)^2\,d\mu\leq\int P(f^2)\,d\mu=\int f^2\,d\mu~. $$ Since $B(S)\cap L_2(\mu)$ is dense in $L_2(\mu)$, there is a unique bounded, linear extension of $P$ to $L_2(\mu)$ and this extension is obviously a positive contraction. $\eofproof$
If $\mu$ is invariant for the Markov operator $P$, then for all $f\geq0$: $\Ent(Pf)\leq\Ent(f)$, where $\Ent(f)\colon=\int f\log f\,d\mu$ is called the entropy of $f$ with respect to $\mu$. Beware, in physics and especially in thermodynamics the entropy is defined by $-\int f\log f\,d\mu$! Hence, to physicists the entropy of a Markov chain increases as time goes on.
As $\vp:x\mapsto x\log x$ is convex on $\R^+$, we infer from Jensen's inequality: $$ \int\vp(Pf(x))\,\mu(dx) =\int\vp\Big(\int f(y)\,P(x,dy)\Big)\,\mu(dx) \leq\int P(\vp(f))(x)\,\mu(dx) =\int\vp(f)\,d\mu~. $$
If $\mu$ is invariant for the Markov operator $P$, then $P$ is a contraction on all $L_p(\mu)$, $1\leq p\leq\infty$.
A Markov operator $P$ on $S$ with invariant probability measure $\mu$ is said to be ergodic if $f\in L_2(\mu)$ and $Pf=f$ imply that $f$ is $\mu$ a.e. constant.
Hence $P:L_2(\mu)\rar L_2(\mu)$ is ergodic iff all eigenfunctions of $P$ for the eigenvalue $1$ are constant functions.
  1. The Markov operator in exam is evidently ergodic.
  2. If $(p(x,y))$ is a finite stochastic matrix with a unique invariant probability measure, then the associated Markov operator is ergodic, cf. proposition.
Suppose $P$ is a stochastic matrix with invariant measure $\mu$. If $P$ is ergodic on $L_2(\mu)$, then the Kronecker product $P\otimes P$ (cf. exam) need not be ergodic on $L_2(\mu\otimes\mu)$. Hint: If $-1$ is an eigenvalue of $P$ with eigenvector $x$, then $(P\otimes P)x\otimes x=x\otimes x$. You may take for $P$ the permutation matrix of the permutation $(2,1)\in S(2)$.
Let us remark that there is another commonly used notion of ergodicity for stochastic matrices (irreducible and aperiodic states), which is a bit stronger than the notion we employ! However, we will never refer to that stronger notion.
Any measure $\mu$ supported on $\pa D$, i.e. $\mu((\pa D)^c)=0$, is an invariant measure of the Markov operator $P$ in exam. 2. If $f:D\rar\R$ is harmonic then $Pf=f$. Hence $P$ is not ergodic, no matter what measure we take.
If $\mu$ is supported on $\pa D$, then for all bounded $f\in B(\cl D)$ $$ \int Pf\,d\mu =\iint f(y+R(y)z)\,\s(dz)\,\mu(dy) =\iint f(y+R(y)z)\,\mu(dy)\,\s(dz) =\iint f(y)\,\mu(dy)\,\s(dz) =\int f(y)\,\mu(dy)~. $$ $f$ is harmonic iff for all $x\in D$ and all $r > 0$ satisfying $B_r(x)\sbe D$: $\int f(x+rz)\,\s(dz)=f(x)$. Hence for harmonic functions $f$ we have $Pf=f$.
Remark: Starting at $x\in D$ the sequence $X_n$ converges 'weakly' to a random variable $X$, whose distribution is the harmonic measure on $\pa D$ with respect to $x$. Hence for all continuous $f:\pa D\rar\R$ the function $u(x)\colon=\E^xf(X)$ is the harmonic extension $u:D\rar\R$ of $f$ into the interior of $D$.

Invariant measures on finite sets

For every finite stochastic matrix $P=(p(x,y))$ we have $\dim\ker(P^*-1)\geq1$, because $1$ is an eigenvalue for $P$ and thus $\bar 1=1$ is an eigenvalue for $P^*$ - the spectrum of $P^*$ is the complex conjugate of the spectrum of $P$! Thus $\dim\ker(P^*-1)\geq1$. Yet we don't know if $\ker(P^*-1)$ contains a measure.
Given a stochastic matrix on a finite set $S$. Then there is an invariant probability measure $\mu$. Moreover if there is a unique invariant probability measure $\mu$, then for any probability measure $\nu$ on $S$ the sequence $$ A_n^*\nu\colon=\frac1n\sum_{j=0}^{n-1}P^{*j}\nu $$ converges to $\mu$ and for all functions $f:S\rar\R$ the sequence $$ A_nf\colon=\frac1n\sum_{j=0}^{n-1}P^{j}f $$ converges to the constant function $\sum_y f(y)\mu(y)$ and thus $P$ is ergodic. For a more general result cf. exam.
$\proof$ 1. Suppose $|S|=n$, then the associated Markov operator $P:\ell_\infty^n\rar\ell_\infty^n$ and its adjoint $P^*:\ell_1^n\rar\ell_1^n$ are given by $$ Pf(x)\colon=\sum_y p(x,y)f(y) \quad\mbox{and}\quad P^*\mu(x)\colon=\sum_y p(y,x)\mu(y)~. $$ Let $M_1\sbe\ell_1^n$ be the set of probability measures on $S$, $M_1$ is compact and convex. Since $P^*(M_1)\sbe M_1$ we infer from Brouwer's fixed point theorem that there is some $\mu\in M_1$ such that $P^*\mu=\mu$.
2. Alternatively, and more elementarily, we may take any $\nu\in M_1$ and define $$ \forall n\in\N:\quad \mu_n\colon=A_n^*\nu~. $$ Then $\norm{P^*\mu_n-\mu_n}\leq2/n$ and thus any accumulation point $\mu$ of the sequence $\mu_n$ is a fixed point of $P^*$. This also shows that $A_n^*\nu$ converges to $\mu$ if $\mu$ is the unique invariant probability measure. Hence for all $f:S\rar\R$ $$ A_nf(x) =\int A_nf\,d\d_x =\int f\,dA_n^*\d_x \to\int f\,d\mu=\sum_y f(y)\mu(y)~. $$ If $Pf=f$, then for all $n$: $f=A_nf\to\int f\,d\mu$ and therefore $f$ must be constant. $\eofproof$
It is noteworthy that the alternative proof also works if the sequence $\mu_n$ has some accumulation point with respect to some metric $d$, say, which is weaker than the metric defined by the norm, i.e. $d(x,y)\leq C\norm{x-y}$ for some constant $C > 0$.
Computationally the presented proofs don't work well because averaging gives a pretty slow algorithm. Usually a certain convex combination $Q$ of the powers of $P$ yields a stochastic matrix with strictly positive entries (cf. exam) and the sequence $Q^{*n}\nu$ for any $\nu\in M_1$ converges (exponentially fast) to the invariant measure $\mu\in M_1$ of $Q$, which is also the invariant measure of $P$, provided it's unique.
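A minimal sketch of this power iteration in C, using the third matrix of the exercise below; that matrix has strictly positive entries and is doubly stochastic, so the iteration should converge (fast) to the normalized counting measure $(1/3,1/3,1/3)$:
\begin{verbatim}
#include <stdio.h>

#define NS 3

int main(void) {
    double p[NS][NS] = {{.6, .1, .3}, {.3, .6, .1}, {.1, .3, .6}};
    double mu[NS] = {1.0, 0.0, 0.0};   /* start at a Dirac measure */
    double nu[NS];
    for (int k = 0; k < 100; k++) {    /* mu <- P* mu */
        for (int y = 0; y < NS; y++) {
            nu[y] = 0.0;
            for (int x = 0; x < NS; x++) nu[y] += mu[x] * p[x][y];
        }
        for (int y = 0; y < NS; y++) mu[y] = nu[y];
    }
    printf("invariant measure: %f %f %f\n", mu[0], mu[1], mu[2]);
    return 0;
}
\end{verbatim}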
Given a finite set $S$ and a transformation $\theta:S\rar S$. Prove that there is a $\theta$ invariant probability measure $\mu$ on $S$. Algebraically that means that there is a measure $\mu\in M_1$ such that for all $x\in S$: $\mu(\theta^{-1}(x))=\mu(x)$.
Stochastic matrices are frequently used in text analysis and text generation, cf. e.g. text generation. Here is a C program generating random text based on Molly's soliloquy in J. Joyce's Ulysses. This can also be used to get a certain graph structure of poems: e.g. Poe's poem Alone.
Suppose $\F_n$ is a filtration and $X_n:\O\rar S$ are measurable with respect to $\F_n$. Assume that there is some $m\in\N$ such that for all bounded measurable $f:S\rar\R$ and all $n$: $$ \E(f(X_{n+1})|\F_n) =\E(f(X_{n+1})|\s(X_n,\ldots,X_{n-m+1}))~. $$ Then $X_n$ is called an $m$-Markov chain. Verify that $Z_n\colon=(X_n,\ldots,X_{n-m+1})$ is a Markov chain with respect to $\F_n$ in $S^m$. Suggested solution.
This C program generates text by means of an $m$-Markov chain. Short texts such as Time will be almost reproduced (for $m=2$), whereas long texts such as Mann get jumbled (for $m\leq2$).
Given a finite stochastic matrix $P=(p(x,y))$. If $\mu$ is an invariant probability measure for $P$ and $S_0=\{x\in S:\mu(x)=0\}$ then for all $y\in S\sm S_0$ and all $x\in S_0$: $p(y,x)=0$. The Markov chain will never jump from a point in $S\sm S_0$ to any point in $S_0$.
A finite stochastic matrix $P=(p(x,y))$ has a unique invariant probability measure if and only if $\dim\ker(P^*-1)=1$. Solution by T. Speckhofer
We will see (cf. theorem) that if for all $x,y\in S$: $p(x,y) > 0$, then $\dim\ker(P^*-1)=1$ and the unique invariant probability measure $\mu$ is strictly positive, i.e. for all $x\in S$: $\mu(x) > 0$.
Suppose $P^*:\ell_1^n\rar\ell_1^n$ is linear (and diagonalizable) such that all eigenvalues $\l$ satisfy: $|\l|\leq1$. 1. Prove that the sequence $$ A_n^*\colon=\frac1n(1+P^*+\cdots+P^{*n-1}) $$ converges to the projection $Q$ onto the kernel of $P^*-1$ ($Q$ is defined by $Q|\ker(P^*-\l)=0$ for all eigenvalues $\l\neq1$ and $Q$ is the identity on $\ker(P^*-1)$). 2. If all eigenvalues $\l$ satisfy $|\l| < 1$ or $\l=1$, then the sequence $P^{*n}$ converges to $Q$. 3. If there is some eigenvalue $\l\neq1$ satisfying $|\l|=1$, then the sequence $P^{*n}$ doesn't converge.
Determine all invariant probability measures of the following stochastic matrices: $$ \left(\begin{array}{ccc} 0&1&0\\ 0&.5&.5\\ .5&0&.5 \end{array}\right),\quad \left(\begin{array}{ccc} .5&.3&.2\\ .2&.8&0\\ .3&.3&.4 \end{array}\right),\quad \left(\begin{array}{ccc} .6&.1&.3\\ .3&.6&.1\\ .1&.3&.6 \end{array}\right)~. $$
Any two state ($S=\{1,2\}$) stochastic matrix is given by $$ P\colon=\left(\begin{array}{ccc} 1-a&a\\ b&1-b \end{array}\right),\quad a,b\in[0,1]~. $$ Suppose $P\neq1$, i.e. $a+b > 0$.
  1. The invariant probability measure is given by $\mu(1)=b/(a+b)$, $\mu(2)=a/(a+b)$.
  2. Compute all eigenvalues and eigen spaces of $P$.
  3. Prove that for all $n\in\N$ (see the numerical check after this list): $$ P^n=\frac1{a+b}\left(\begin{array}{cc} b&a\\ b&a \end{array}\right) +\frac{(1-a-b)^n}{a+b}\left(\begin{array}{cc} a&-a\\ -b&b \end{array}\right) $$
  4. Give an example of a symmetric stochastic matrix $P$ such that $P^n$ does not converge!
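A numerical check of the closed form in item 3 (a sketch; $a,b$ are arbitrary test values):
\begin{verbatim}
#include <stdio.h>
#include <math.h>

int main(void) {
    double a = 0.3, b = 0.5;
    double P[2][2] = {{1 - a, a}, {b, 1 - b}};
    double Q[2][2] = {{1, 0}, {0, 1}}, R[2][2];
    int n = 7;
    for (int k = 0; k < n; k++) {            /* Q = P^n */
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                R[i][j] = Q[i][0] * P[0][j] + Q[i][1] * P[1][j];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++) Q[i][j] = R[i][j];
    }
    double r = pow(1 - a - b, n), s = a + b;
    double C[2][2] = {{(b + r * a) / s, (a - r * a) / s},
                      {(b - r * b) / s, (a + r * b) / s}};
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            printf("P^n[%d][%d] = %.12f, closed form = %.12f\n",
                   i, j, Q[i][j], C[i][j]);
    return 0;
}
\end{verbatim}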
Find the stochastic matrix describing the random walk on the vertices of the $3$-dimensional unit ball of $\ell_1^3$. What about $\ell_1^n$?

Reversible Markov chains

$X_n$ (or $\mu$) is said to be reversible
if for all $A,B\in\B(S)$: \begin{equation}\label{mareq5}\tag{MAR5} \P^\mu(X_n\in A,X_{n+1}\in B) =\P^\mu(X_{n+1}\in A,X_n\in B) \end{equation} A reversible measure is invariant, for $$ \P^\mu(X_n\in A) =\P^\mu(X_n\in A,X_{n+1}\in S) =\P^\mu(X_{n+1}\in A,X_n\in S) =\P^\mu(X_{n+1}\in A)~. $$ Moreover, for all $f,g\in B(S)$: \begin{eqnarray*} \E^{\mu}(f(X_n)g(X_{n+1})) &=&\E^{\mu}(f(X_n)\E^\mu(g(X_{n+1})|\F_n))\\ &=&\E^\mu(f(X_n)Pg(X_n)) =\int f(x)Pg(x)\,\mu(dx) \end{eqnarray*} and analogously $\E^{\mu}(f(X_{n+1})g(X_n))=\int Pf(x)g(x)\,\mu(dx)$. Thus for all $f,g\in B(S)$: $$ \int f\cdot Pg\,d\mu=\int Pf\cdot g\,d\mu $$
If $\mu$ is reversible, then $P:L_2(\mu)\rar L_2(\mu)$ is self-adjoint.
$\proof$ This follows immediately from the relation $\int f\cdot Pg\,d\mu=\int Pf\cdot g\,d\mu$ for all $f,g\in B(S)$, and the fact that $B(S)$ is dense in $L_2(\mu)$. $\eofproof$
Conversely, if the Markov operator $P:L_2(\mu)\rar L_2(\mu)$ is self-adjoint then the corresponding Markov chain is reversible.
A Markov operator $P:L_2(\mu)\rar L_2(\mu)$ is self-adjoint iff for all disjoint subsets $A,B\in\B(S)$: $$ \int_AP(x,B)\,\mu(dx)=\int_BP(x,A)\,\mu(dx)~. $$
$\proof$ For arbitrary $A,B\in\B(S)$ put $D\colon=A\cap B$. Since both $A\sm D$ and $B\sm D$ and $D$ and $B\sm D$ are disjoint we conclude by the additivity of $C\mapsto P(x,C)$: \begin{eqnarray*} \int_AP(x,B)\,\mu(dx) &=&\int_D P(x,B)\,\mu(dx)+\int_{A\sm D}P(x,B)\,\mu(dx)\\ &=&\int_D P(x,D)\,\mu(dx)+\int_D P(x,B\sm D)\,\mu(dx)+\int_{A\sm D}P(x,B)\,\mu(dx)\\ &=&\int_D P(x,D)\,\mu(dx)+\int_{B\sm D}P(x,D)\,\mu(dx)+\int_{B}P(x,A\sm D)\,\mu(dx)\\ &=&\int_B P(x,D)\,\mu(dx)+\int_B P(x,A\sm D)\,\mu(dx) =\int_B P(x,A)\,\mu(dx)~. \end{eqnarray*} $\eofproof$
Put $D=\sum_x d(x)=2|E|$. Then the probability measure $\mu(x)\colon=d(x)/D$ is reversible for the Markov chain in exam. In particular the normalized counting measure is a reversible probability measure for the random walk on a regular undirected graph. Suggested solution.
If $\mu=\P_{Y_1}$ is the distribution of $Y_1$, then the Markov chain in exam is reversible with respect to $\mu$.
Let $B$ be a convex and symmetric body in $\R^d$ (i.e. $B$ is compact, convex, symmetric and $B^\circ\neq\emptyset$), $D$ a bounded domain in $\R^d$ and $Z_n$ an i.i.d. sequence uniformly distributed on $B$. Put $X_{n+1}=X_n+Z_{n+1}$ if $X_n+Z_{n+1}\in D$ and $X_{n+1}=X_n$ otherwise. Then $X_n$ is a Markov chain and the Markov operator is given by: $$ Pf(x)=\frac1{\l(B)}\Big(f(x)\l(D^c\cap(B+x))+\int_{D\cap(B+x)}f(y)\,dy\Big)~. $$ Moreover the normalized Lebesgue measure on $D$ is a reversible probability measure.
Let $\l_B(A)\colon=\l(B\cap A)/\l(B)$ be the normalized Lebesgue measure on $B$. For $x\in D$ the point $x+Z$ is not in $D$ iff $Z\in D^c-x$, and thus for a measurable subset $A$ of $D$ the transition function $P(x,A)$ is given by definition by: \begin{eqnarray*} P(x,A) &=&\P^x(X_1\in A)\\ &=&\d_x(A)\P^x(Z_1\in D^c-x)+\P^x(x+Z_1\in A,Z_1\in D-x)\\ &=&\d_x(A)\l_B(D^c-x)+\l_B(A-x) =\frac{1}{\l(B)}\Big(\d_x(A)\l(D^c\cap(B+x))+\l(A\cap(B+x))\Big)~. \end{eqnarray*} Hence we get for the associated Markov operator: $$ Pf(x) =\int f(z)\,P(x,dz)\\ =\frac1{\l(B)}\Big(f(x)\l(D^c\cap(B+x))+\int_{D\cap(B+x)}f(y)\,dy\Big)~. $$ Finally we obtain for disjoint (cf. lemma) sets $E,F\sbe D$ by symmetry of $B$: \begin{eqnarray*} \int_E P(x,F)\,dx &=&\frac1{\l(B)}\int_E\l(F\cap(B+x))\,dx\\ &=&\frac1{\l(B)}\iint I_B(y-x)I_F(y)I_E(x)\,dy\,dx\\ &=&\frac1{\l(B)}\iint I_B(x-y)I_F(x)I_E(y)\,dy\,dx\\ &=&\frac1{\l(B)}\iint I_B(y-x)I_F(x)I_E(y)\,dy\,dx =\int_{F} P(x,E)\,dx~. \end{eqnarray*} The Markov operator is indeed ergodic, i.e. the only functions satisfying $Pf=f$ are constants, but yet that's not so obvious (cf. theorem). It's way simpler (cf. exam or its discrete version exam) to prove ergodicity for another Markov operator which we are going to describe now:
Perform a random walk $X_n$ on a bounded convex domain $D$ with boundary $\pa D$: given $X_n=x$ in $D$ we choose a random one dimensional subspace $[Z]$ and compute the length $L$ of the segment of the line $t\mapsto x+tZ$ lying in $D$. This line intersects the boundary $\pa D$ in exactly two points $x+t_1Z$, $t_1=t_1(x,Z) < 0$ and $x+t_2Z$, $t_2=t_2(x,Z) > 0$, which terminate the segment. Finally we choose a 'random' point $X_{n+1}=y$ on the segment and consider $L$ as a function of $x$ and $y$, i.e. $L=L(x,y)$. This gives a Markov chain $X_n$. The Markov operator is given by $$ Pf(x)=\frac2{\vol{}(S^{d-1})}\int_D\frac{f(y)}{\norm{x-y}^{d-1}L(x,y)}\,dy $$ and since $(x,y)\mapsto\norm{x-y}^{-d+1}L(x,y)^{-1}$ is symmetric Lebesgue measure on $D$ is reversible.
We first have to clarify what is meant by a random one dimensional subspace. How do we choose such a subspace: We start with a random unit vector $Z$ on $S^{d-1}$, i.e. the distribution of $Z$ is the normalized surface measure on $S^{d-1}$. Then we take the uniform distribution on the segment $[x+t_1Z,x+t_2Z]$ of length $L$ and choose a point on this segment randomly.
[Figure: the Markov chain on a convex domain]
This will give us the Markov operator \begin{eqnarray*} Pf(x) &=&\int_{S^{d-1}}\int_{t_1}^{t_2}\frac{f(x+tz)}{L(x,x+tz)}\,dt\,\s(dz)\\ &=&\int_{S^{d-1}}\Big(\int_0^{t_2}f(x+tz)+\int_0^{-t_1}f(x-tz)\Big) \frac1{L(x,x+tz)}\,dt\,\s(dz)\\ &=&\int_{S^{d-1}}\Big(\int_0^{t_2(x,z)}\frac{f(x+tz)}{L(x,x+tz)} +\int_0^{t_2(x,-z)}\frac{f(x-tz)}{L(x,x-tz)}\Big)\,dt\,\s(dz)\\ &=&2\int_{S^{d-1}}\int_0^{t_2(x,z)}\frac{f(x+tz)}{L(x,x+tz)}\,dt\,\s(dz) =\frac2{\vol{}(S^{d-1})}\int_D\frac{f(y)}{L(x,y)\norm{x-y}^{d-1}}\,dy~. \end{eqnarray*} Here we used two simple facts: 1. for all $t\in[t_1,t_2]$: $L(x,x+tz)=L(x,y)$, hence it doesn't depend on $t$, 2. $t_2(x,-z)=-t_1(x,z)$ and finally integration in 'polar coordinates': if $x\in D^\circ$ and $t_2$ is the (euclidean) distance of $x$ to the intersection point of the half line originating in $x$ and pointing in direction $z$ and the boundary of $D$, then $$ \int_D g(y)\,dy =\vol{}(S^{d-1})\int_{S^{d-1}}\int_0^{t_2} g(x+tz)t^{d-1}\,dt\,\s(dz) $$ applied to $g(y)=f(y)\norm{x-y}^{-d+1}L(x,y)^{-1}$.
Suppose $0$ is an interior point of the convex domain $D\sbe\R^d$. Show that $$ \frac{\vol{}(D)}{\vol{}(B_2^d)}=\int_{S^{d-1}}p_D(z)^{-d}\,\s(dz)~. $$ where $B_2^d$ denotes the euclidean unit ball and $p_D(z)\colon=\inf\{r > 0: z\in rD\}$ is the so called Minkowski functional of $D$, i.e. $D=D^\circ=[p_D < 1]$ and $\pa D=[p_D=1]$. Suggested solution.
Write a subroutine which determines the numbers $t_1$ and $t_2$ given the Minkowski functional of $D$ and two points $x,y\in D^\circ$.
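A sketch of such a subroutine: since $D$ is convex and $x\in D^\circ$, the function $t\mapsto p_D(x+tz)$ crosses the level $1$ exactly once on each side of $0$, so doubling a bracket and then bisecting does the job (the euclidean disc serves as a stand-in for $p_D$; plug in any Minkowski functional):
\begin{verbatim}
#include <stdio.h>
#include <math.h>

static double pD(const double w[2]) {   /* Minkowski functional of the disc */
    return sqrt(w[0] * w[0] + w[1] * w[1]);
}

/* smallest t > 0 with p_D(x + t z) = 1 */
double hit_time(const double x[2], const double z[2]) {
    double lo = 0.0, hi = 1.0, w[2];
    for (;;) {                              /* double until outside of D */
        w[0] = x[0] + hi * z[0]; w[1] = x[1] + hi * z[1];
        if (pD(w) >= 1.0) break;
        lo = hi; hi *= 2.0;
    }
    for (int k = 0; k < 60; k++) {          /* bisect the bracket */
        double m = 0.5 * (lo + hi);
        w[0] = x[0] + m * z[0]; w[1] = x[1] + m * z[1];
        if (pD(w) < 1.0) lo = m; else hi = m;
    }
    return 0.5 * (lo + hi);
}

int main(void) {
    double x[2] = {0.3, 0.1}, z[2] = {1.0, 0.0}, mz[2] = {-1.0, 0.0};
    double t2 = hit_time(x, z), t1 = -hit_time(x, mz);
    printf("t1 = %f, t2 = %f\n", t1, t2);   /* -sqrt(.99)-.3, sqrt(.99)-.3 */
    return 0;
}
\end{verbatim}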
Of course we may take any open and bounded set $D$ with sufficiently 'nice' boundary. The problem is not the Markov operator - this doesn't change anyway - the problem is a possibly huge number of intersection points!
Suppose $w_1,\ldots,w_n:S\rar S$ are continuous maps on the Polish space $S$ and $p_1,\ldots,p_n:S\rar[0,1]$ are continuous such that for all $x\in S$: $\sum p_j(x)=1$. Then $$ Pf(x)\colon=\sum_j p_j(x)f(w_j(x)) $$ is a Markov operator on $C_b(S)$. The corresponding Markov chain $X_n$ is called a random iterated function system (IFS for short): we have $\P(X_{n+1}=w_j(x)|X_n=x)=p_j(x)$. If $P:C_0(S)\rar C_0(S)$ is Feller then the adjoint $P^*:M(S)\rar M(S)$ is given by $$ P^*\mu(A)=\sum_j\int_{w_j\in A}p_j\,d\mu $$ and if all $p_j$ are constant: $P^*\mu=\sum p_j\mu_{w_j}$.
Given a stochastic matrix $p(x,y)$ on a finite or countable set $S$ with invariant probability measure $\mu$. Verify that the adjoint of the Markov operator $P:L_2(\mu)\rar L_2(\mu)$ is given by $$ P^*f(x)=\sum_y f(y)q(x,y), \quad\mbox{where}\quad q(x,y)=\frac{p(y,x)\mu(y)}{\mu(x)} $$ and conclude that $\mu$ is reversible iff $$ \forall x,y\in S:\quad \mu(x)p(x,y)=\mu(y)p(y,x)~. $$ Give an example of a stochastic matrix with multiple reversible strictly positive measures. Suggested solution.
Suppose $\mu$ and $\nu$, respectively, are reversible measures for the stochastic matrices $P$ and $Q$, respectively. Then $\mu\otimes\nu$ is a reversible measure for the Kronecker product $P\otimes Q$. If in addition both $P$ and $Q$ are ergodic, then $P\otimes Q$ need not be ergodic (cf. exam).
Suppose $S$ is finite and $P:L_2(\mu)\rar L_2(\mu)$ is a self-adjoint Markov operator, i.e. $Pf(x)=\sum p(x,y)f(y)$, $\sum_y p(x,y)=1$, $p(x,y)\geq 0$ and $\mu(x)p(x,y)=\mu(y)p(y,x)$. Then for all $f\in L_2(\mu)$: $$ \la(1-P)f,f\ra =\tfrac12\sum_{x,y}p(x,y)(f(x)-f(y))^2\mu(x) $$ 2. If for all $x$: $\mu(x) > 0$ and for $x\neq y$: $p(x,y) > 0$, then $P$ is ergodic.
Put $a(x,y)=p(x,y)-\d_x(y)$, then $\sum_y a(x,y)=0$ and since $\mu$ is reversible: $\mu(x)a(x,y)=\mu(y)a(y,x)$. Thus we get $$ \sum_{x,y} a(x,y)f(x)^2\mu(x)=0 \quad\mbox{and}\quad \sum_{x,y} a(x,y)f(y)^2\mu(x) =\sum_{x,y} a(y,x)f(y)^2\mu(y)=0~. $$ Therefore \begin{eqnarray*} \la(1-P)f,f\ra &=&-\sum_{x,y} a(x,y)f(y)f(x)\mu(x)\\ &=&\sum_{x,y} a(x,y)\Big(-f(y)f(x)+\tfrac12f(x)^2+\tfrac12f(y)^2\Big)\mu(x)\\ &=&\tfrac12\sum_{x,y}a(x,y)(f(x)-f(y))^2\mu(x) =\tfrac12\sum_{x,y}p(x,y)(f(x)-f(y))^2\mu(x)~. \end{eqnarray*} 2. Suppose $f(x_1)\neq f(x_2)$, then $$ \la(1-P)f,f\ra \geq(f(x_1)-f(x_2))^2p(x_1,x_2)\mu(x_1) > 0~. $$
$S=\Z_N$, $p(x,x+1)=q(x+1)$, $p(x,x)=r(x)$ and $p(x,x-1)=p(x-1)$. In this case we have $Pf(x)=p(x-1)f(x-1)+r(x)f(x)+q(x+1)f(x+1)$. A measure $\mu$ is reversible iff $$ \prod_{x}\frac{q(x)}{p(x)}=1, \quad\mbox{in which case}\quad \forall x\in\Z_N:\quad \mu(x)=c\prod_{y=0}^{x-1}\frac{q(y+1)}{p(y)}~. $$
$S=\Z$, $U:S\rar\R_0^+$, $p(x)=q(x)=e^{-U(x)}/2$ and $Z\colon=\sum_{y\in\Z}e^{-U(y)}$, assumed to be finite. A reversible probability measure is given by $$ \mu(x)=e^{-U(x)}/Z~. $$
We have a collection of $N$ molecules in two boxes. Choose one of these molecules randomly and put it into the other box. We say that the system is in state $x\in S\colon=\{0,\ldots,N\}$ if there are exactly $x$ molecules in the first box. For $x < N$ we have: $p(x,x+1)=(N-x)/N$ and for $x > 0$: $p(x,x-1)=x/N$, otherwise $p(x,y)=0$. Verify that the binomial distribution $\mu(x)={N\choose x}2^{-N}$ is reversible. The corresponding Markov chain is called Ehrenfest chain.
We have got two boxes $A$ and $B$, each of which contains exactly $N$ balls. $n$ ($n\leq N$) of these balls are black and $2N-n$ balls are white. We say that this collection of balls is in state $x$, $x=0,\ldots,n$, if box $A$ contains $x$ black balls. Now we choose 'randomly' one ball from each box and put the ball from box $A$ into box $B$ and the ball from box $B$ into box $A$. This gives us a Markov chain on $S=\{0,\ldots,n\}$ - it's called the Bernoulli-Laplace diffusion model. Check that we have the following transition function for $x,y\in S$: $$ p(x,y)=\left\{ \begin{array}{cl} \frac{(N-x)(n-x)}{N^2}&\mbox{if $y=x+1$ and $y\leq n$}\\ \frac{x(n-x)+(N-x)(N-(n-x))}{N^2}&\mbox{if $y=x$}\\ \frac{x(N-(n-x))}{N^2}&\mbox{if $y=x-1$ and $y\geq0$}\\ 0&\mbox{otherwise}\\ \end{array}\right. $$ 2. Verify that the hypergeometric distribution $$ \mu(x)={N\choose x}{N\choose n-x}\Big/{2N\choose n} $$ is reversible.
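Both the Ehrenfest chain and the Bernoulli-Laplace model are finite chains, so detailed balance $\mu(x)p(x,y)=\mu(y)p(y,x)$ can be verified numerically; a Python sketch with arbitrarily chosen parameters:

    import numpy as np
    from math import comb

    def reversible(P, mu):
        M = mu[:, None] * P                  # M[x,y] = mu(x) p(x,y)
        return np.allclose(M, M.T)           # detailed balance

    # Ehrenfest chain on {0,...,N}
    N = 10
    P = np.zeros((N + 1, N + 1))
    for x in range(N + 1):
        if x < N: P[x, x + 1] = (N - x) / N
        if x > 0: P[x, x - 1] = x / N
    mu = np.array([comb(N, x) for x in range(N + 1)], float) / 2**N
    print(reversible(P, mu))                 # True

    # Bernoulli-Laplace model: N balls per box, n <= N black balls
    N, n = 6, 4
    P = np.zeros((n + 1, n + 1))
    for x in range(n + 1):
        if x < n: P[x, x + 1] = (N - x) * (n - x) / N**2
        P[x, x] = (x * (n - x) + (N - x) * (N - (n - x))) / N**2
        if x > 0: P[x, x - 1] = x * (N - (n - x)) / N**2
    mu = np.array([comb(N, x) * comb(N, n - x) for x in range(n + 1)], float)
    print(reversible(P, mu / comb(2 * N, n)))   # True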

Poissonization

Up until now we have been investigating a linear operator $P$ and the semigroup $P^n$, $n\in\N_0$. Our next focus of interest is on so called one parameter semigroups $P_t$, which depend on a continuous parameter $t\in\R_0^+$. There is an easy procedure to 'imbed' a discrete semigroup in a continuous semigroup: the Poissonization: Suppose $E$ is a Banach space and $P:E\rar E$ a bounded linear operator; let $N_t$, $t\geq0$, be a family of Poisson variables with parameter $\l t$, i.e. $$ \P(N_t=n)=\frac{(\l t)^ne^{-\l t}}{n!} $$ then putting $L\colon=P-1$ and $$ Q_t \colon=\E P^{N_t} =\sum_{n=0}^\infty\frac{(\l t)^ne^{-\l t}}{n!}P^n =e^{\l tL} $$ we get a continuous semigroup $Q_t$, $t\geq0$, with generator $\l L$; if $\norm P\leq1$ - e.g. for a Markov operator - it is a contraction semigroup. Actually, for all $x\in E$ the mapping $t\mapsto Q_tx$ is analytic. We remark that in general we do not have $Q_n=P^n$ for $n\in\N$, hence the quotation marks around 'imbed'! Strict imbedding is not always possible.
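Numerically, Poissonization is just a matrix exponential; a sketch comparing $e^{\l t(P-1)}$ with the defining series for an arbitrary $2\times2$ stochastic matrix:

    import numpy as np
    from scipy.linalg import expm

    P = np.array([[0.9, 0.1],
                  [0.4, 0.6]])
    lam, t = 1.0, 0.7
    Q_t = expm(lam * t * (P - np.eye(2)))    # e^{lam t (P - 1)}

    # the defining series: sum_n (lam t)^n e^{-lam t} / n! * P^n
    S, term, Pn = np.zeros_like(P), np.exp(-lam * t), np.eye(2)
    for n in range(50):
        S += term * Pn
        Pn, term = Pn @ P, term * lam * t / (n + 1)
    print(np.max(np.abs(S - Q_t)))           # ~ 1e-16; note Q_t is again stochastic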
If $P$ is a bounded, strictly positive (definite) linear operator on a Hilbert space $E$ (i.e. for some $c > 0$ and all $x\in E$: $\la Px,x\ra\geq c\Vert x\Vert^2$), then there is a bounded, positive (definite) linear operator $L$ such that $P=e^L$. 2. Find a symmetric stochastic matrix $P\in\Ma(2,\R)$ for which there is no matrix $L\in\Ma(2,\R)$ such that $P=e^L$.
$Q_t$ is the family of Markov operators for a Markov process $(Y_t)_{t\geq0}$, where the number of steps of the original Markov chain $X_n$ (with Markov operator $P$) up to time $t$ is $N_t$, i.e. $Y_t=X_{N_t}$.
Suppose $L=(l_{jk})\in\Ma(n,\R)$ has the following properties:
  1. For all $j$: $\sum_k l_{jk}=0$.
  2. For all $j\neq k$: $l_{jk}\geq0$
Then the matrix $e^{tL}$ is stochastic.
$\proof$ For sufficiently large $a > 0$ the entries of the matrix $L+a$ are non-negative. Hence all entries of $$ e^{tL}=e^{t(L+a)-at}=e^{t(L+a)}e^{-at} $$ are non-negative. Moreover, for $x=(1,\ldots,1)^t$ we get $Lx=0$ and thus: $(p_{jk})x\colon=e^{tL}x=x$, i.e. for all $j$: $\sum_k p_{jk}=1$. $\eofproof$
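A quick numerical sanity check (the generator below is an arbitrary example with vanishing row sums and non-negative off-diagonal entries):

    import numpy as np
    from scipy.linalg import expm

    L = np.array([[-2.0, 1.5, 0.5],
                  [0.3, -0.3, 0.0],
                  [1.0, 2.0, -3.0]])
    for t in (0.1, 1.0, 10.0):
        Q = expm(t * L)
        print(np.all(Q >= -1e-12), np.allclose(Q.sum(axis=1), 1.0))   # stochastic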
Let $P$ be the Markov operator in exam. Put $\l=1$ and describe the Markov process with Markov operators $Q_t$.
If $P:B(S)\rar B(S)$ has invariant measure $\mu$, then for all $t\geq0$ $\mu$ is invariant for $Q_t$. 2. If $P:L_2(\mu)\rar L_2(\mu)$ is self-adjoint, then so is $Q_t:L_2(\mu)\rar L_2(\mu)$ for all $t\geq0$. 3. If $Q_t$ is ergodic then $P:L_2(\mu)\rar L_2(\mu)$ is ergodic.
If a finite stochastic matrix $(p(x,y))$, $x,y\in S$ has the property that for each pair $(x,y)\in S\times S$ there is some $n\in\N_0$ such that $p^{(n)}(x,y) > 0$, then $P$ has a unique invariant probability $\mu$ - hence it's ergodic with respect to $\mu$. Here $p^{(n)}(x,y)$ are the matrix entries of the $n$-th power of the stochastic matrix. Hint: Use the remark following exam.
Verify that the Markov chains in exam, exam, exam and exam are ergodic.
For $\l > 0$ let $\g_n$ be an i.i.d. sequence of non-negative random variables with density $\l e^{-\l t}$. Put $\G_0\colon=0$ and $\G_n=\sum_{j=1}^n\g_j$. Then $\G_n$ has density $$ \l\frac{(\l t)^{n-1}}{(n-1)!}e^{-\l t}~. $$ 2. Define $N_t\colon=\max\{n\geq0:\,\G_n\leq t\}=\inf\{n\geq1:\G_n > t\}-1$, then $N_t$ is an increasing $\N_0$-valued process satisfying: $[N_t\leq n]=[\G_{n+1} > t]$, $[N_t\geq n]=[\G_n\leq t]$ and $$ \forall n\in\N_0:\qquad \P(N_t=n)=\frac{(\l t)^n}{n!}e^{-\l t}~. $$ Usually $\g_n$ is interpreted as a waiting time for the arrival of a bus. In this case $\G_n$ is the waiting time for the arrival of the $n$-th bus and $N_t$ is the number of buses that have arrived up to time $t$.
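The construction of $N_t$ from the waiting times $\g_n$ is easily simulated; the sketch below compares the empirical law of $N_t$ with the weights $(\l t)^ne^{-\l t}/n!$:

    import numpy as np
    from math import exp, factorial

    rng = np.random.default_rng(2)
    lam, t, runs = 2.0, 3.0, 100_000

    counts = np.empty(runs, int)
    for i in range(runs):
        total, n = rng.exponential(1 / lam), 0
        while total <= t:                    # Gamma_{n+1} <= t: another bus arrived
            total += rng.exponential(1 / lam)
            n += 1
        counts[i] = n                        # N_t = max{n : Gamma_n <= t}

    for n in range(6):
        print(n, (counts == n).mean(), (lam * t) ** n * exp(-lam * t) / factorial(n))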

Pinsker and Poincaré inequalities

Proving ergodicity of a particular Markov operator might be quite intricate. One method of proof is based on Pinsker's inequality:
Suppose $\mu$ is a probability measure on $S$ and $f:S\rar\R_0^+$ is such that $\int f\,d\mu=1$. Then $$ \int|f-1|\,d\mu\leq\sqrt{-2\Ent(f)}~. $$ This is called Pinsker's inequality. Hint: Use the elementary inequality: $3(x-1)^2\leq(4+2x)(x\log x-x+1)$, which holds for all $x\geq0$. Suggested solution.
Now assume $Pf=f$, $f\geq0$ and $\int f\,d\mu=1$. If $\Ent(P^nf)$ converges to $0$, then $\int|f-1|\,d\mu=0$, i.e. $f$ is $\mu$ a.e. constant. Another related method is based on so called Poincaré inequalities: there is some constant $C\in\R^+$ such that for all $f$ with $\int f\,d\mu=0$: $$ \int f^2\,d\mu\leq\frac1C\int f(1-P)f\,d\mu~. $$ This inequality immediately implies that if $Pf=f$ then $f$ is $\mu$ a.e. constant, i.e. $f=\int f\,d\mu$. The constant $C$ is called the spectral gap of the operator $1-P$, because in case $1-P$ is self-adjoint, it's the distance of $0$ to the rest of the spectrum of $1-P$ - which is a subset of $\R^+$. The following generalizes exam:
Let $Pf(x)=\int f(y)P(x,dy)$ be the Markov operator of a reversible Markov chain with reversible probability measure $\mu$. Verify that for all $f\in L_2(\mu)$: $$ Q(f) \colon=\int f(1-P)f\,d\mu =\frac12\iint(f(x)-f(y))^2\,P(x,dy)\,\mu(dx)~. $$ If for all $A$ with $\mu(A) > 0$ and $\mu$-almost all $x$: $P(x,A) > 0$, then $P$ is ergodic on $L_2(\mu)$. $Q$ is called the Dirichlet form of $P$. 2. Verify that the functional $f\mapsto Q(f)$ is convex, i.e. $Q((1-t)f+tg)\leq(1-t)Q(f)+tQ(g)$.
Let $Q$ be the Dirichlet form of a self-adjoint Markov operator with reversible probability measure $\mu$. Prove that $Q(|f|)\leq Q(f)$, $Q(f^+)\leq Q(f)$ and for all $s\in\R$: $Q(f-s)=Q(f)$. Conclude that $Q(f)\geq\sup_sQ((f-s)^+)$.

Continuous Dynamical Systems

The spaces $C_b(\R^d)$, $C_0(\R^d)$ and $C_c(\R^d)$ are called the space of all bounded and continuous functions, the space of all continuous functions vanishing at infinity and the space of all continuous functions with compact support. The first two spaces are Banach spaces with the norm given by: $$ \norm f\colon=\sup\{|f(x)|:x\in\R^d\}~. $$ and $C_c(\R^d)$ is a dense subspace of $C_0(\R^d)$. By $C^\infty(\R^d)$ and $C_c^\infty(\R^d)$ we denote the space of all smooth functions and the space of all smooth functions with compact support; the latter is a dense subspace of $C_0(\R^d)$.
Suppose $\z:\R^d\rar\R^d$ is smooth, then by the existence theorem of ordinary differential equations there is an open subset ${\cal D}$ of $\R\times\R^d$ such that $\{0\}\times\R^d\sbe{\cal D}$ and a smooth mapping $\theta:{\cal D}\rar\R^d$ such that \begin{equation}\label{cdseq1}\tag{CDS1} \pa_t\theta(t,x)=\z(\theta(t,x)) \quad\mbox{and}\quad \theta(0,x)=x~. \end{equation} By the uniqueness theorem of ordinary differential equations this map has the additional property that for all $x\in\R^d$ and all $t,s > 0$: $$ \theta(t+s,x)=\theta(t,\theta(s,x)), $$ whenever $\theta(s,x)$ and $\theta(t,\theta(s,x))$ are defined. Putting $P_tf(x)\colon=f(\theta(t,x))$ we get a local group on $C_0(\R^d)$ with the property that for all $f\in C^\infty(\R^d)$ and all $(t,x)\in{\cal D}$: \begin{equation}\label{cdseq2}\tag{CDS2} \pa_tP_tf(x) =\sum_{j=1}^d\pa_t\theta_j(t,x)(\pa_jf)(\theta(t,x)) =\sum_{j=1}^d\z_j(\theta(t,x))(\pa_jf)(\theta(t,x))~. \end{equation} For a manifold $M$ the spaces $C_0(M)$, $C^\infty(M)$ and $C_c^\infty(M)$ are defined analogously.

Vector fields

A vector field $X$ on a manifold $M$ is a linear mapping $X:C^\infty(M)\rar C^\infty(M)$ such that for all $f,g\in C^\infty(M)$: $X(fg)=fXg+gXf$
Of course, the space $\G(M)$ of all vector fields on $M$ is a vector space ($(X+Y)f\colon=Xf+Yf$), but it's also a $C^\infty(M)$-module: for all $g\in C^\infty(M)$ and all vector fields $X$ there is a new vector field $gX$: $(gX)f(x)\colon=g(x)Xf(x)$. Moreover, given two vector fields $X$ and $Y$ the commutator $[X,Y]\colon=XY-YX$ is also a vector field.
Prove that the commutator is indeed a vector field.
Straightforward computation shows that the commutator satisfies Jacobi's identity: $$ [X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]=0 $$ and $[X,Y]=-[Y,X]$. Thus $\G(M)$ is a Lie-algebra (cf. e.g. wikipedia).
Prove that the vector space $\R^3$ with the vector product $(x,y)\mapsto x\times y$ is a Lie-algebra, i.e. $x\times y=-y\times x$ and $x\times(y\times z)+y\times(z\times x)+z\times(x\times y)=0$.
Vector fields and smooth mappings $\z:M\rar\R^d$ on open subsets $M$ of $\R^d$ are in one to one correspondence: for any such mapping $\z$ \begin{equation}\label{cdseq3}\tag{CDS3} Xf(x)\colon=\sum_{j=1}^d\z_j(x)\pa_jf(x) \end{equation} is a vector field. The mapping $\theta:{\cal D}\rar\R^d$ defined in \eqref{cdseq1} is called the flow of the vector field $X$ and the curve $c(t)\colon=\theta(t,x)$ is called the integral curve of $X$ through $x$. Instead of $Xf(x)$ we will also write $X_xf$. Conversely, for a vector field $X$ on $M$ there is exactly one smooth mapping $\z:M\rar\R^d$, such that \eqref{cdseq3} holds - this is usually one of the first results in any course on manifolds. If a smooth mapping $\theta:{\cal D}\rar M$ satisfies \eqref{cdseq1}, then: \begin{equation}\label{cdseq4}\tag{CDS4} \forall f\in C^\infty(M):\quad \pa_t(f\circ\theta)(t,x)=Xf(\theta(t,x))=P_tXf(x)~. \end{equation} On the other hand this relation implies \eqref{cdseq1}: just take for $f$ the projections $(x_1,\ldots,x_d)\mapsto x_j$ onto the components. In case $M$ is an arbitrary manifold we take \eqref{cdseq4} just as definition of the flow $\theta$ of a vector field $X$.
A constant vector field $E=\sum e_j\pa_j$ on $\R^d$ commutes with any other constant vector field and the flow of $E$ is given by $\s(t,x)=x+te$, where $e=(e_1,\ldots,e_d)$.
If ${\cal D}(X)\colon={\cal D}=\R\times M$, the vector field $X$ is said to be complete. Completeness of $X$ ensures that $P_tf(x)\colon=f(\theta(t,x))$ is in fact a group on $C(M)$, $C_b(M)$, $C_0(M)$, etc.; moreover, the mappings $\theta_t:M\rar M$ are diffeomorphisms for all $t\in\R$ with inverse $\theta_{-t}$.
The vector field $X_x=x^2\pa_x$ on $\R$ is not complete: the integral curve through $x > 0$ is $t\mapsto x/(1-tx)$, which blows up at $t=1/x$.
Find a solution of the first order PDE $\pa_tu=y\pa_xu-x\pa_yu$ on $M=\R^2$ satisfying $u(0,x,y)=x^2-y^2$. Suggested solution.

From vector fields to one parameter groups

Putting $t=0$ in \eqref{cdseq4} we see that the group $P_t$, $t\in\R$, also determines $X$: $$ Xf(x) =\pa_t|_{t=0} P_tf(x) =\lim_{t\dar0}\tfrac1t(P_tf(x)-f(x))~. $$ The notion of a complete vector field allows us to write \eqref{cdseq1} as \eqref{cdseq4}. By the group property we get: \begin{eqnarray*} XP_tf(x) &=&\lim_{s\dar0}\tfrac1s(P_{t+s}f(x)-P_tf(x))\\ &=&\lim_{s\dar0}\tfrac1s(P_sf(\theta_t(x))-f(\theta_t(x))) =Xf(\theta_t(x)) =P_tXf(x) \quad\mbox{i.e.}\quad P_tX=XP_t \end{eqnarray*} and thus \eqref{cdseq1} is equivalent to: \begin{equation}\label{cdseq5}\tag{CDS5} \forall t\in\R\, \forall f\in C^\infty(M):\quad \pa_tP_tf(x)=XP_tf(x)~. \end{equation}
Suppose $X$ is a vector field on $M$ and $u:M\rar\R^+$ is smooth such that
  1. For all $r > 0$ the set $[u\leq r]$ is compact.
  2. There exists a constant $C > 0$, such that $Xu\leq Cu$.
Then $X$ is complete.
$\proof$ For any $x\in M$ put $c(t)\colon=\theta(t,x)$. Then it follows by 2. and \eqref{cdseq4} that $$ \ttd t(e^{-Ct}u(c(t))) =e^{-Ct}(Xu(c(t))-Cu(c(t)))\leq0, $$ i.e. $u(c(t))\leq u(x)e^{Ct}$, which shows that $c(t)$ is in the compact set $[u\leq u(x)e^{Ct}]$. $\eofproof$
If $\z$ is $L$-Lipschitz on $\R^d$, then $X=\sum\z_j\pa_j$ is complete on $\R^d$.
$\proof$ Choose $u(x)=\Vert x\Vert^2+1$ and put $K\colon=\norm{\z(0)}$; then $$ Xu =\sum_j\z_j\pa_ju =2\sum_j(\z_j-\z_j(0))x_j+2\sum_j\z_j(0)x_j \leq 2L\Vert x\Vert^2+2K\Vert x\Vert \leq Cu(x)~. $$ $\eofproof$
Next we are going to verify that $P_tf=f\circ\theta_t$ is a continuous group on $C_0(M)$, meaning that for all $f\in C_0(M)$ the map $t\mapsto P_tf$ is a continuous curve in $C_0(M)$. Indeed, for $f\in C_0(M)$ and $\e > 0$ there is some $\d > 0$ and a compact subset $K$ of $M$, such that for all $d(x,y) < \d$: $|f(x)-f(y)| < \e$ and $|f|K^c| < \e$. Moreover, we can find some $r > 0$, such that for all $|t| < r$, all $x\in K$: $d(x,\theta_t(x)) < \d$ and $|f|\theta_t(K^c)| < \e$. It follows that $$ \norm{P_tf-f}_\infty \leq\sup_{x\in K}|f(\theta_t(x))-f(x)| +\sup_{x\in K^c}|f(\theta_t(x))| +\sup_{x\in K^c}|f(x)| \leq\e+\e+\e~. $$ In general the group $P_t$ is not continuous on $C_b(M)$: take for example $M=\R$, $P_tf(x)\colon=f(x+t)$ and $f(x)=\sin(x^2)$; for all $t > 0$: $\sup_x|\sin((x+t)^2)-\sin x^2|=2$. However for $f\in C_c^\infty(M)$ we conclude by the mean value theorem of calculus that: $$ \norm{\tfrac1t(P_tf-f)-Xf}_\infty \leq \sup\{|\pa_sf(\theta(s,x))-\pa_sf(\theta(0,x))|:0 < s < t,x\in M\}~. $$ As $f$ has compact support this converges to $0$ as $t$ converges to $0$ and hence $Xf$ is the derivative of the curve $t\mapsto P_tf$ in $C_0(M)$ at $t=0$. In fact equation \eqref{cdseq5} is somewhat weaker than the linear ordinary differential equation in $C_0(M)$: \begin{equation}\label{cdseq6}\tag{CDS6} \ttd tP_tf=XP_tf~. \end{equation} Beware, on the left hand side we mean the derivative of a Banach space valued curve! What do we gain? \eqref{cdseq6} is linear! The price for this 'linearization' of the non-linear differential equation $x^\prime(t)=\z(x(t))$ is the substitution of the infinite dimensional space $C_0(M)$ for the finite dimensional space $M$. On the other hand we may and will examine \eqref{cdseq6} in different Banach spaces such as $L_p(\mu)$ for some measure $\mu$ on $M$ and in these cases the evaluation of \eqref{cdseq6} at a particular point doesn't make sense in general, though it makes sense in e.g. $C_0(M)$. As a final remark we notice that for all $f\in C^\infty(M)$, all $x\in M$ and all $n\in\N$ we have: $$ \frac{d^n}{dt^n}\Big|_{t=0}P_tf(x)=X^nf(x)~. $$ Thus in case $t\mapsto P_tf$ is analytic we get the Taylor expansion: $$ P_tf =\sum_{n=0}^\infty\frac{t^n}{n!}X^nf, $$ which we simply write as $e^{tX}f$. Henceforth we will use $e^{tX}$ just as a synonym for $P_t$. Take for example $M=\R^d$, $h=(h_1,\ldots,h_d)\in\R^d$ and put $X=\sum h_j\pa_j$, then: $\theta(t,x)=x+th$ and $e^{tX}f(x)=f(x+th)$. On the other hand we have in case $t\mapsto f(x+th)$ is analytic (which is obviously weaker than analyticity of $t\mapsto P_tf$): $$ e^{tX}f(x) =\sum_{n=0}^\infty\frac{t^n}{n!}\Big(\sum h_j\pa_j\Big)^nf(x) $$ and thus we are back at the classical Taylor formula: $f(x+th)=e^{tX}f(x)$.
Let $\theta$ be the flow of the complete vector field $X$ and $P_tf=f\circ\theta_t$. A Borel measure $\mu$ on $M$ is said to be invariant under $\theta$ if $$ \forall t\in\R\,\forall f\in C_c^\infty(M):\quad \int P_tf\,d\mu=\int f\,d\mu~. $$
Of course this implies that $\int Xf\,d\mu=0$. Conversely, since $P_t$ maps $C_c^\infty(M)$ into itself, we get for every $f\in C_c^\infty(M)$ by \eqref{cdseq6}: $$ \ftd t\int P_tf\,d\mu=\int XP_tf\,d\mu=0 $$ and thus $\mu$ is invariant iff for all $f\in C_c^\infty(M)$: $\int Xf\,d\mu=0$.
Suppose $X$ is a complete vector field on a Riemannian manifold $M$ and $\r\in C^\infty(M)$ is the density of a Borel measure $\mu$ with respect to the Riemannian volume $v$. Then $\mu$ is invariant under the flow of $X$ if and only if $\divergence(\r X)=0$.
$\proof$ The divergence $\divergence Y$ of a vector field $Y$ on $M$ may be defined by \begin{equation}\label{cdseq7}\tag{CDS7} \forall g\in C_c^\infty(M):\quad \int Yg\,dv=\int g\cdot\divergence(Y)\,dv~. \end{equation} Now take any $f\in C_c^\infty(M)$, then $$ \int Xf\,d\mu =\int(\r X)f\,dv =\int\divergence(\r X)f\,dv~. $$ Since $f\in C_c^\infty(M)$ is arbitrary, the conclusion follows. $\eofproof$
Verify that in the euclidean case \eqref{cdseq7} amounts to $$ -\divergence\Big(\sum\z_j\pa_j\Big)=\sum\pa_j\z_j~. $$
Provided the conditions of the above theorem are satisfied the group $P_tf\colon=f\circ\theta_t$ will turn out to be a continuous group on both $C_0(M)$ and $L_p(\mu)$ (for all $1\leq p < \infty$). As for ergodicity of the flow we have a negative result: If $H:M\rar\R$ is a first integral of the vector field $X$, i.e. $H$ is not constant and $XH=0$, then $\theta_t$ is not ergodic. First integrals are an important means in ODE, for they are obviously constant along integral curves, i.e. for all $x\in M$ the function $t\mapsto H(\theta_t(x))$ is constant! Anyway, if $P_tf\colon=f\circ\theta_t$ is ergodic, then there is no first integral.

Integrating factor

Let $X=-Q(x,y)\pa_x+P(x,y)\pa_y$ be a vector field on an open subset $M$ of $\R^2$, then $$ -\divergence(\r X)=\pa_x(-\r Q)+\pa_y(\r P) $$ and thus $\divergence(\r X)=0$ if and only if $\r$ is an integrating factor of the differential equation: $$ c_1^\prime(t)=-Q(c_1(t),c_2(t)) \quad\mbox{and}\quad c_2^\prime(t)=P(c_1(t),c_2(t)) $$ i.e. $c(t)=(c_1(t),c_2(t))$ is an integral curve of $X$. If $\r$ is an integrating factor and $M$ is simply connected, then there is a first integral $H$ given by $$ \pa_xH=\r P \quad\mbox{and}\quad \pa_yH=\r Q~. $$ Indeed, $XH=-Q\r P+P\r Q=0$.
The predator-prey model is the flow of the vector field $X=(ax-bxy)\pa_x+(cxy-dy)\pa_y$ on $M=\R^+\times\R^+$ for $a,b,c,d > 0$. Verify that the following measure is invariant: $$ \mu(K)=\int_K\frac1{xy}\,dx\,dy~. $$ Here $x$ and $y$ model the 'densities' (the number of individuals per unit area) of the prey and the predators, respectively. 2. Verify that $H(x,y)=cx+by-d\log x-a\log y$ is a first integral and that $H$ is strictly convex. If $a,b > 0$ and $c,d < 0$ this is called the competitor model.
Some trajectories for the predator-prey and the competitor model:
[Figure: sample trajectories of the predator-prey and the competitor model]
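Such pictures are easily produced numerically; the following sketch integrates the predator-prey field with a classical Runge-Kutta step and checks that $H$ is constant along the integral curve (all parameters chosen arbitrarily):

    import numpy as np

    a, b, c, d = 1.0, 0.5, 0.3, 0.8
    zeta = lambda u: np.array([a * u[0] - b * u[0] * u[1],
                               c * u[0] * u[1] - d * u[1]])
    H = lambda u: c * u[0] + b * u[1] - d * np.log(u[0]) - a * np.log(u[1])

    def rk4(u, h):                           # one classical Runge-Kutta step
        k1 = zeta(u); k2 = zeta(u + h / 2 * k1)
        k3 = zeta(u + h / 2 * k2); k4 = zeta(u + h * k3)
        return u + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

    u, h = np.array([2.0, 1.0]), 1e-3
    H0 = H(u)
    for n in range(20_000):
        u = rk4(u, h)
    print(H(u) - H0)                         # ~ 0: H is a first integral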

Geodesic flow

Let $M$ be a Riemannian manifold and ${\cal G}$ the so called geodesic vector field, i.e. for all $X_m\in T_mM$ the tangent vector ${\cal G}_{X_m}\in T_{X_m}TM$ is horizontal and $\pi_*({\cal G}_{X_m})=X_m$. The flow $G(t,X_m)$ of this field is the parallel transport of $X_m$ along the geodesic $t\mapsto\exp_m(tX_m)$. Then the Riemannian volume on $TM$ is invariant under $G_t$. If $M$ is an open subset of $\R^d$ we have $TM=M\times\R^d$ and for the canonical Riemannian metric on $M$ we get $$ {\cal G}_{(x,v)} =\sum_j v_j\pa_{x_j} \quad\mbox{and its flow:}\quad G(t,x,v)=(x+tv,v)~. $$ Every measure of the form $d\mu(x,v)=\r(v)\,dx\,dv$ is invariant under the flow $G$, indeed: $$ -\divergence(\r{\cal G}) =\sum_j(\pa_{x_j}(\r(v)v_j)+\pa_{v_j}0)=0~. $$ A typical choice is $\r(v)=c\exp(-\Vert v\Vert^2/2)$. If the geodesic flow is complete, the Riemannian manifold $M$ is said to be complete. Quite miraculously (cf. Hopf-Rinow Theorem) this happens if and only if the manifold $M$ furnished with the geodesic metric (cf. e.g. \eqref{difeq4}) is a complete metric space.

Hamiltonian flow

Let $(M,\la.,.\ra)$ be a Riemannian manifold and $\pi:TM\rar M$ the canonical projection, i.e. for all $m\in M$ and all $X_m\in T_mM$: $\pi(X_m)=m$. Suppose $\o^1$ is the canonical $1$-form on $TM$, i.e. $$ \o^1({\cal X}_{X_m})=\la\pi_*({\cal X}_{X_m}),X_m\ra $$ and $H:TM\rar\R$ is a smooth function, a so called Hamiltonian. The Hamiltonian vector field ${\cal X}^H$ on $TM$ is defined by $dH=-{\cal X}^H\contract d\o^1$. Then the Riemannian volume on $TM$ is invariant under the flow of the Hamiltonian vector field ${\cal X}^H$. On $T\R^d=\R^d\times\R^d$ this field is given by $$ {\cal X}_{(q,p)}^H \colon=\sum_{j=1}^d\Big(\pa_{p_j}H\pa_{q_j}-\pa_{q_j}H\pa_{p_j}\Big)~. $$ The flow of this vector field is a collection of curves $t\mapsto(q_1(t),\ldots,q_n(t),p_1(t),\ldots,p_n(t))=(q(t),p(t))$ (the integral curves), which satisfy the so called Hamilton equations: $$ q_j^\prime(t)=\pa_{p_j}H(q(t),p(t)),\quad p_j^\prime(t)=-\pa_{q_j}H(q(t),p(t)), $$ simply written as $q_j^\prime=\pa_{p_j}H$ and $p_j^\prime=-\pa_{q_j}H$.
Verify that $H$ is a first integral of the Hamiltonian flow.
Hence ergodicity of the Hamiltonian flow is usually investigated on submanifolds $[H=const]$; the density of an invariant measure on this manifold with respect to the Riemannian measure on $[H=const]$ is given by $1/\norm{\nabla H}$.
Determine the Hamilton equations for $H(q,p)=\frac1{2m}\sum p_j^2+U(q)$ for some smooth function $U:M(\sbe\R^d)\rar\R$. Check that these are the Newtonian equations of a particle of mass $m$ moving in a potential $U$. 2. Verify for $m=1$, $E,\a > 0$ and $U(q)=\norm q^{\a}/\a$ that the set $[H=E]$ is a compact submanifold of $\R^d\times\R^d$ and compute $\norm{\nabla H}$ for $(q,p)\in[H=E]$.
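For this Hamiltonian the flow is easily approximated by the leapfrog scheme, which is itself volume preserving; a sketch with $m=1$ and $U(q)=\norm q^\a/\a$, checking that $H$ stays (nearly) constant:

    import numpy as np

    m, alpha = 1.0, 4.0
    gradU = lambda q: np.linalg.norm(q) ** (alpha - 2) * q   # grad |q|^alpha / alpha
    H = lambda q, p: p @ p / (2 * m) + np.linalg.norm(q) ** alpha / alpha

    def leapfrog(q, p, h, steps):            # symplectic: q' = p/m, p' = -grad U(q)
        for _ in range(steps):
            p = p - h / 2 * gradU(q)
            q = q + h * p / m
            p = p - h / 2 * gradU(q)
        return q, p

    q, p = np.array([1.0, 0.0]), np.array([0.0, 0.7])
    E = H(q, p)
    q, p = leapfrog(q, p, 1e-3, 50_000)
    print(H(q, p) - E)                       # O(h^2): H is (nearly) conserved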
The motion of a particle of mass $m$ and charge $q$ in a stationary electromagnetic field is described by the Lorentz equation: $$ m\ftd tv=q(E+v\times B),\quad \ftd tx=v, $$ where $E$ and $B$ are vector fields on $M=\R^3$ (they are called the electric and the magnetic field respectively). Put $$ X\colon=\sum_{j=1}^3v_j\pa_{x_j}+ \frac qm\sum_{j=1}^3(E_j+(v\times B)_j)\pa_{v_j}, $$ then the Lebesgue measure on $\R^6$ is invariant under the flow of $X$.

Diffusions

In what follows we construct a model describing the distribution of temperature in an infinite rod. The basic assumptions are as follows:
  1. There is no heat transfer into its environment.
  2. The change of temperature in time at any point is proportional to the differences in temperature to its neighbours.
Fix some numbers $\D x,\D t > 0$ and take $x\in\{n\D x:n\in\Z\}$ and $t\in\{n\D t:n\in\N_0\}$. We assume the rod is made of a series of molecules in fixed positions $x$. By $u(x,t)$ we denote the temperature (i.e. the kinetic energy) at time $t$ of the molecule in position $x$. This molecule has two relevant neighbours located in positions $x-\D x$ and $x+\D x$, respectively. The second assumption implies that $$ u(x,t+\D t)-u(x,t) =a(x)((u(x-\D x,t)-u(x,t))+(u(x+\D x,t)-u(x,t)))~. $$ The condition $a(x) > 0$ says that the temperature in $x$ can only increase if $$ u(x,t) < \tfrac12((u(x-\D x,t)+(u(x+\D x,t)), $$ meaning that the temperature in $x$ is smaller than the mean temperature of its neighbours. In the continuous case (i.e. $x\in\R$, $t\in\R^+$) there is only one reasonable analogue: $$ \pa_tu(x,t)=a(x)\pa_x^2u(x,t)~. $$ If $a(x) > 0$ this equation is called a diffusion equation; its general form is given by $$ \pa_tu(x,t) =a(x)\pa_x^2u(x,t)+b(x,t)\pa_xu(x,t)~. $$ The factor $b$ is called the drift of the diffusion. For $a=1$ and $b=0$ this is known as the heat equation on the real line.

A probabilistic interpretation

We construct very simple Markov chains approximating a Markov process known as Brownian motion: For $x\in\{n\D x:n\in\Z\}$ let us call the interval $(x-\D x/2,x+\D x/2)$ the box about $x$. A particle moving on the real line $\R$ obeys the following rules:
  1. At time $t=0$ the particle is located in the box about $y$.
  2. If at time $t=0,\D t,2\D t,\ldots$ the particle is located in the box about $x$ the probability of finding the particle at time $t+\D t$ in the boxes about $x-\D x$ and $x+\D x$ is $1/2$.
Let us denote by $p(x,t)\D x$ the probability of finding the particle at time $t$ in the box about $x$. By assumption the probability of finding the particle at time $t+\D t$ in the box about $x$ is given by $$ p(x,t+\D t)\D x =\tfrac12p(x-\D x,t)\D x +\tfrac12p(x+\D x,t)\D x, $$ this is because with probability $1/2$ the particle has moved from the box about $x-\D x$ or from the box about $x+\D x$ to the box about $x$. Thus we obtain \begin{eqnarray*} p(x,t+\D t)-p(x,t) &=&\tfrac12(p(x-\D x,t)-p(x,t) +p(x+\D x,t)-p(x,t))\\ &=&\tfrac12(p(x-\D x,t)-2p(x,t)+p(x+\D x,t))~. \end{eqnarray*} Putting $\D x=\sqrt{\D t}$ and letting $\D t\to0$ we get \begin{equation}\label{difeq1}\tag{DIF1} \pa_tp(x,t) =\tfrac12\pa_x^2p(x,t)~. \end{equation} For any $y\in\R$ the function $$ p_y(x,t) \colon=\tfrac1{\sqrt{2\pi t}}\exp\left(-\tfrac1{2t}(x-y)^2\right) $$ solves the PDE \eqref{difeq1}; moreover it has the following properties $$ p_y(x,t)\geq0,\quad \int_\R p_y(x,t)\,dx=1\quad\mbox{and}\quad \forall r > 0:\ \lim_{t\dar0}\int_{y-r}^{y+r}p_y(x,t)\,dx=1~. $$ The first two properties express the fact that $x\mapsto p_y(x,t)$ is the density of a probability measure: the probability of finding the particle at time $t$ in the interval $(\a,\b)$ is given by $\int_\a^\b p_y(x,t)\,dx$. The third property reflects the initial condition: at time $t=0$ the particle is located in position $y$, i.e. in the limit $t\dar0$ the probability of finding the particle in any interval of the form $(y-r,y+r)$ about $y$ equals $1$.
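The chain above is easy to simulate; a sketch comparing the position of the particle at time $t=1$ (started at $y=0$, with $\D x=\sqrt{\D t}$) with the Gaussian law $p_0(\cdot,t)$:

    import numpy as np
    from math import erf, sqrt

    rng = np.random.default_rng(3)
    t, dt = 1.0, 0.01
    dx, steps, paths = sqrt(dt), int(t / dt), 100_000

    # paths of the chain: +-dx steps with probability 1/2 each, started at y = 0
    X = dx * rng.choice([-1, 1], size=(paths, steps)).sum(axis=1)

    # P(X_t <= 1) versus the N(0,t) probability int_{-infty}^1 p_0(x,t) dx
    print((X <= 1.0).mean(), 0.5 * (1 + erf(1.0 / sqrt(2 * t))))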

Heat semigroup on $\R^d$

Let $\D\colon=-\sum\pa_j^2$ be the Laplace operator
on $\R^d$. The heat equation on $\R^d$ is $$ \pa_tu =-\D u =\sum\pa_j^2u~. $$ For any $f\in C_0(\R^d)$ there is a unique bounded solution $u(t,x)$ satisfying $u(0,x)=f(x)$: $$ P_tf(x) \colon=u(t,x) =\int f(x-y)p_t(y)\,dy =\int f(y)p_t(x-y)\,dy =f*p_t(x)~. $$ Here $p_t(y)$ denotes the heat kernel of $\R^d$: $$ p_t(y)\colon=(4\pi t)^{-d/2}e^{-\norm y^2/4t}~. $$ The family of operators $P_t$ forms a so called continuous contraction semigroup on the space $C_0(\R^d)$ with generator $-\D$, i.e. $$ \forall f\in C_c^\infty(\R^d):\quad \lim_{t\to0}\frac{P_tf-f}t=-\D f $$ and the limit is in $C_0(\R^d)$. Moreover, it's also a continuous contraction semigroup on the spaces $L_p(\R^d)$ for $1\leq p < \infty$: for $f\in L_p(\R^d)$, $g\in L_q(\R^d)$, $1/p+1/q=1$, we conclude by Hölder's inequality: $$ \Big|\int P_tf(x)g(x)\,dx\Big| \leq\iint|f(x-y)g(x)p_t(y)|\,dx\,dy \leq\norm f_p\norm g_q $$ i.e. $\norm{P_t:L_p(\R^d)\rar L_p(\R^d)}\leq1$. Usually it's not that difficult to verify continuity of these semigroups (cf. lemma). However the following provides an alternative way for $p=2$:
The Fourier transform of the heat kernel $p_t$ is given by $$ \wh p_t(y) \colon=c_d\int p_t(x)e^{-i\la x,y\ra}\,dx =c_de^{-t\norm y^2} \quad\mbox{where}\quad c_d=(2\pi)^{-d/2}~. $$
  1. Conclude from this fact that $p_s*p_t=p_{s+t}$, i.e. $P_t$ is a semigroup.
  2. For all $f\in L_2(\R^d)$ the map $t\mapsto P_tf$ is continuous.
  3. For all $f\in H^4(\R^d)$: $$ \norm{\frac{P_tf-f}t+\D f}_2\leq\tfrac12t\norm f_{H^4}, \quad\mbox{where}\quad \norm f_{H^s}^2\colon=\int|\wh f(y)|^2(1+\norm y^2)^{s}\,dy~. $$ Moreover, for all $f\in H^2(\R^d)$: $(P_tf-f)/t\to-\D f$ in $L_2(\R^d)$. Thus $-\D$ is the generator of the heat semigroup on $L_2(\R^d)$ and $\dom(\D)\spe H^2(\R^d)$.
1. The formula can be checked by straightforward calculation. However another way to derive the formula for the Fourier transform of $p_t$ is as follows: the Fourier transform of $\D f$ is $y\mapsto\norm y^2\wh f(y)$ and thus the Fourier transform of $e^{-t\D}f$ is $y\mapsto e^{-t\norm y^2}\wh f(y)$. Now, since $c_d\wh p_{s+t}=\wh p_s.\wh p_t$ it follows that $p_s*p_t=p_{s+t}$ and thus $P_t$ is a semigroup.
2. We simply utilize the fact that $f\mapsto\wh f$ is an isometry on $L_2(\R^d)$ and it maps the Laplacian to multiplication by $\Vert.\Vert^2$! The Fourier transform of $P_tf$ is $y\mapsto e^{-t\norm y^2}\wh f(y)$ and thus: $$ \norm{f-P_tf}_2^2 =\int|1-e^{-t\norm y^2}|^2|\wh f(y)|^2\,dy $$ and by dominated convergence this converges to $0$ as $t$ converges to $0$. 3. Similarly \begin{eqnarray*} \norm{\frac{P_tf-f}t+\D f}_2^2 &=&\frac1{t^2}\int\big(e^{-t\norm y^2}-1+t\norm y^2\big)^2|\wh f(y)|^2\,dy\\ &=&t^2\int\Big(\frac{e^{-t\norm y^2}-1+t\norm y^2}{t^2\norm y^4}\Big)^2\norm y^8|\wh f(y)|^2\,dy\\ &\leq&t^2\sup_{r > 0}\Big(\frac{e^{-r}-1+r}{r^2}\Big)^2\,\norm f_{H^4}^2 =\tfrac14t^2\norm f_{H^4}^2~. \end{eqnarray*} For $f\in H^2(\R^d)$ the integrand in the first line is dominated by $\norm y^4|\wh f(y)|^2$ (since $e^{-r}-1+r\leq r$) and converges pointwise to $0$, so $(P_tf-f)/t\to-\D f$ by dominated convergence.
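On a grid, item 1 can also be verified directly: discretizing the convolution shows $p_s*p_t=p_{s+t}$ up to discretization error (grid parameters arbitrary, $d=1$):

    import numpy as np

    x = np.linspace(-20, 20, 4001)
    dx = x[1] - x[0]
    p = lambda t: (4 * np.pi * t) ** -0.5 * np.exp(-x**2 / (4 * t))

    s, t = 0.3, 0.5
    conv = np.convolve(p(s), p(t), mode="same") * dx   # p_s * p_t on the grid
    print(np.max(np.abs(conv - p(s + t))))             # ~ 0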
For $f,g\in L_1(\R^d)$ we have: $$ \Big|\int P_tf(x)g(x)\,dx\Big|\leq\norm f_1\norm g_1\norm{p_t}_\infty $$ and thus: $\norm{P_t:L_1(\R^d)\rar L_\infty(\R^d)}\leq(4\pi t)^{-d/2}$ - $P_t$ is said to be ultracontractive. Solution by T. Speckhofer.
Suppose $f\in L_2(\R^d)$ satisfies $P_tf=f$ for some $t > 0$. Prove that $f=0$. Hint: Use the Fourier transform.

The heat semigroup on $\TT^d$

As the heat semigroup on $\R^d$ doesn't admit an invariant probability measure, we give another example based on the heat semigroup on $\R^d$, which has an invariant probability measure. The heat semigroup on $\TT^d$ is exactly the sort of example we have in mind when talking about ergodic semigroups!
Verify that the heat kernel of the $d$-dimensional torus $\TT^d$ is given by $$ \forall t > 0\ \forall x\in\TT^d:\quad q_t(x)\colon=\sum_{n\in\Z^d}p_t(x+2\pi n)~. $$ Just check that $q_t$ is the density of a probability measure on $\TT^d$ and $\pa_tq_t(x)=-\D q_t(x)$.
The associated semigroup on $L_2(\TT^d)$ is self-adjoint and ergodic, i.e. if $P_tf=f$ for all $t > 0$, then $f$ is constant. Indeed, the semigroup $P_t$ on $L_2(\TT^d)$ is $$ P_tf(x) =\int_{\TT^d}q_t(y)f(x-y)\,dy =\int_{\TT^d}q_t(x-y)f(y)\,dy $$ which is self-adjoint, for $(x,y)\mapsto q_t(x-y)$ is symmetric. For $f\in C(\TT^d)$ let $F\in C_b(\R^d)$ denote the periodic extension of $f$, then $$ P_tf(x) =\int_{\TT^d}\sum_{n\in\Z^d}p_t(x-y-2\pi n)f(y)\,dy =\int_{\R^d}p_t(x-y)F(y)\,dy~. $$ Thus the heat semigroup on $\TT^d$ applied to $f\in C(\TT^d)$ is just the heat semigroup on $\R^d$ applied to the periodic extension $F$ of $f$. Now $e_m(x)\colon=e^{i\la x,m\ra}$, $m\in\Z^d$, is an orthonormal basis for the Hilbert space $L_2(\TT^d)$; by the preceding formula and exam we get: $$ P_te_m(x) =\int_{\R^d}p_t(y)e^{i\la x-y,m\ra}\,dy =e^{-t\norm m^2}e^{i\la x,m\ra}~. $$ Assume $P_tf=f$ for $f=\sum c_me_m\in L_2(\TT^d)$, then: $\sum c_me_m=\sum c_me^{-t\norm m^2}e_m$, which can only hold if for all $m\neq0$: $c_m=0$, i.e. $f$ is constant: The heat semigroup on $L_2(\TT^d)$ is ergodic!
For $f\in L_1(\R^d)$ and $x\in\TT^d$ put $F(x)\colon=\sum_{n\in\Z^d}f(x+2\pi n)$. Then the Fourier transform of $F$ in $\TT^d$ at $m\in\Z^d$ coincides with the Fourier transform of $f$ in $\R^d$ at $m\in\Z^d$ up to a constant factor: $\wh F(m)=c_d\wh f(m)$.
Prove that the Fourier transform of $q_t$ (on $\TT^d$) is $\wh q_t(n)=e^{-t\norm n^2}$ ($n\in\Z^d$) and verify that for all $f\in L_2(\TT^d)$: $\wh{P_tf}(n)=e^{-t\norm n^2}\wh f(n)$.
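Numerically, $q_t$ and the relation $P_te_m=e^{-t\norm m^2}e_m$ can be tested with a circular convolution via the FFT; a sketch for $d=1$ (truncating the sum defining $q_t$):

    import numpy as np

    K = 512
    x = 2 * np.pi * np.arange(K) / K
    t = 0.1

    # wrapped heat kernel q_t(x) = sum_n p_t(x + 2 pi n), truncated at |n| <= 20
    q = sum((4 * np.pi * t) ** -0.5 * np.exp(-(x + 2 * np.pi * n) ** 2 / (4 * t))
            for n in range(-20, 21))

    f = np.cos(3 * x)                        # the real part of e_3
    Ptf = np.fft.irfft(np.fft.rfft(f) * np.fft.rfft(q), K) * (2 * np.pi / K)
    print(np.max(np.abs(Ptf - np.exp(-t * 9) * f)))    # ~ 0: eigenvalue e^{-9t}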

Diffusion semigroups

Let $(M,\la.,.\ra)$ be a Riemannian manifold with Riemannian metric $\la.,.\ra$ and Riemannian volume $v$. Suppose $a,\r:M\rar\R^+$ are strictly positive and smooth - actually by adjusting the Riemannian metric we may assume that $a=1$ (cf. subsection). Put $\mu(dx)=\r(x)\,v(dx)$ and let $X$ be a vector field such that $\nabla(a\r)+\r X=0$ - in this case the measure $\mu$ is called the speed measure of $H$. The operator $$ Hf=a\D f+Xf \quad\mbox{where}\quad \D f\colon=\divergence(\nabla f), $$ is a densely defined, symmetric and positive (definite) operator on $L_2(\mu)$, i.e. $Hf$ is defined for all $f$ in the dense subspace $C_c^\infty(M)$ of $L_2(\mu)$ and $$ \forall f,g\in C_c^\infty(M):\quad \int_M f.Hg\,d\mu=\int_M Hf.g\,d\mu \quad\mbox{and}\quad \int_M f.Hf\,d\mu\geq0~. $$ All of this follows from
Under the above conditions for $a,\r$ and $X$ we have for all $f,g\in C_c^\infty(M)$ the following relation \begin{equation}\label{difeq2}\tag{DIF2} \int_M Hf.g\,d\mu=\int_M a\la\nabla g,\nabla f\ra\,d\mu~. \end{equation}
$\proof$ By definition of the divergence (cf. theorem) we have: $$ \int ag\r\D f\,dv =\int\la\nabla(ag\r),\nabla f\ra\,dv $$ and by assumption we infer that: \begin{eqnarray*} \int Hf.g\,d\mu &=&\int ag\r\D f+g\r Xf\,dv\\ &=&\int\la\nabla(ag\r),\nabla f\ra+g\r Xf\,dv\\ &=&\int\la\nabla(a\r),\nabla f\ra g +\la\nabla g,\nabla f\ra a\r +g\r\la X,\nabla f\ra\,dv =\int\la\nabla g,\nabla f\ra a\,d\mu~. \end{eqnarray*} $\eofproof$
If $a=1$ and $X=\nabla U$, then $\r=e^{-U}$. In the particular case $M=\R^d$, $U(x)=\Vert x\Vert^2/2$ we get the diffusion operator $$ Hf(x)=-\sum\pa_j^2f(x)+\sum x_j\pa_jf(x)~. $$ Its speed measure is the standard Gaussian measure $\g$ on $\R^d$, i.e. the density of $\g$ is given by $$ (2\pi)^{-d/2}e^{-\Vert x\Vert^2/2}~. $$
If $a,\r$ and $X$ satisfy the conditions of lemma and if $\mu$ is a probability measure, then the operator $H$ (or $-H$) is called a diffusion operator on $M$ and it generates a diffusion semigroup $P_t=e^{-tH}$. Actually any densely defined, positive, symmetric operator $H_0$ on a Hilbert space $E$ generates a continuous contraction semigroup $P_t$. This is because any such operator can be extended to a self-adjoint (unbounded) linear operator $H$ (the so called Friedrichs extension). The semigroup $P_t\colon=e^{-tH}$ is a continuous contraction semigroup with generator $-H$ - this is more or less a special case of the Hille-Yosida Theorem (cf. theorem). If $H$ is a diffusion operator, then the relation \eqref{difeq2} also holds for $f=g$ in the domain of the extension! In finite dimensions all of this is pretty obvious, nonetheless worth mentioning because it already comprises all the essential results (cf. e.g. theorem):
If $H$ is a positive, self-adjoint operator on a finite dimensional Hilbert space (i.e. a euclidean space), then $P_t\colon=e^{-tH}$ is a self-adjoint, continuous contraction semigroup. If $H$ is a self-adjoint operator on a finite dimensional Hilbert space, then $U_t\colon=e^{-itH}$ is a continuous group of isometries. 2. Prove that (suggested solution) $$ \lim_{T\to\infty}\frac1T\int_0^T P_tx\,dt =\lim_{T\to\infty}\frac1T\int_0^T U_tx\,dt =\Prn_{\ker H}x $$ is the orthogonal projection onto the kernel of $H$ (compare exam). This easily extends to the infinite dimensional case provided we have a spectral decomposition of $H:\dom(H)(\sbe E)\rar E$ of the form (cf. exam): $$ Hx=\sum_{j=1}^\infty \l_j\la x,x_j\ra x_j,\quad \dom(H)\colon=\Big\{x\in E: \sum\l_j^2\la x,x_j\ra^2 < \infty\Big\},\quad \l_j\geq0~. $$
That's the case for a diffusion on a compact Riemannian manifold. In general the diffusion semigroup $P_t$ generated by $H$ sends any $f\in L_1(\mu)$ onto a smooth function. If the Riemannian manifold is complete and satisfies a certain geometric condition then $P_t$ is Feller. The associated Markov process $B_t$, $t\geq0$, is reversible with respect to the speed measure $\mu$ - in case $a=\tfrac12$ the Markov process $B_t$ is called Brownian motion on $M$ with drift $X$.
If $H$ is a diffusion operator on a complete Riemannian manifold with Ricci curvature bounded from below, then the associated diffusion semigroup is ergodic.
$\proof$ Once we know that the corresponding diffusion semigroup $P_t$, $t > 0$, maps any $f\in L_1(\mu)$ onto a smooth function, a solution of $P_tf=f$ must be smooth and thus $Hf=0$. By lemma - and its extension to functions in the domain of $H$ - we infer that $$ 0=\int f.Hf\,d\mu=\int a\norm{\nabla f}^2\,d\mu $$ and since $a,\r$ are strictly positive: $\norm{\nabla f}=0$ and thus $f$ must be constant. $\eofproof$
This in particular applies to compact Riemannian manifolds, to all complete manifolds of constant curvature, e.g. $\R^d$ or the hyperbolic spaces $H^d$.
Suppose $M$ is an open interval, possibly unbounded, $a,\r:M\rar\R^+$ strictly positive and smooth, $b:M\rar\R$ smooth and $\mu(dx)=\r(x)\,dx$ such that $(a\r)^\prime+\r b=0$, i.e. for some $x_0\in M$ and $c\in\R^+$: $$ \r(x)=c\exp\Big(-\int_{x_0}^x\frac{a^\prime(y)+b(y)}{a(y)}\,dy\Big)~. $$ Then $Hu\colon=-au^\dprime+bu^\prime$ is a densely defined positive (definite) symmetric operator on $L_2(\mu)$.
In any case ergodicity of the following examples can be checked directly; as in exam you just need to know all eigenfunctions of the diffusion operator:
The following are classical diffusion operators on open intervals. Prove that all of them generate ergodic semigroups (cf. e.g. exam). Suggested solution.
  1. $M=(-1,1)$, $a(x)=1-x^2$, $b(x)=(\a+\b+2)x+\a-\b$ for some $\a,\b > -1$. Compute the density $\r$. The Jacobi polynomials $P_n^{(\a,\b)}$ are eigenfunctions of $H$ for the eigenvalues $\l_n=n(n+\a+\b+1)$ and they form a complete orthogonal set. For $\a=\b$ these polynomials are called ultraspherical or Gegenbauer polynomials.
  2. $M=\R$, $a(x)=1$, $b(x)=2x$. Compute the density $\r$. The Hermite polynomials $H_n$ are eigenfunctions of $H$ for the eigenvalues $\l_n=2n$ and they form a complete orthogonal set (cf. the sketch following this list).
  3. $M=\R^+$, $a(x)=x$, $b(x)=x-\a-1$ for some $\a > -1$. Compute the density $\r$. The Laguerre polynomials $L_n^\a$ are eigenfunctions of $H$ for the eigenvalues $\l_n=n$ and they form a complete orthogonal set.
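The second example can be checked with numpy's polynomial arithmetic: applying $Hu=-u^\dprime+2xu^\prime$ to the Hermite polynomial $H_n$ indeed returns $2nH_n$:

    import numpy as np
    from numpy.polynomial import Polynomial, Hermite

    n = 5
    Hn = Hermite.basis(n).convert(kind=Polynomial)   # physicists' Hermite H_5
    x = Polynomial([0, 1])
    HHn = -Hn.deriv(2) + 2 * x * Hn.deriv(1)         # the diffusion operator
    print((HHn - 2 * n * Hn).coef)                   # ~ 0: eigenvalue 2n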

Adjusting geometry

On e.g. a domain $M$ of $\R^d$ a diffusion operator $H$ is commonly defined as a second order partial differential operator: \begin{equation}\label{difeq3}\tag{DIF3} Hf\colon=-\sum_{j,k}a_{jk}\pa_j\pa_kf+\sum b_j\pa_j \end{equation} where $a_{jk}$, $b_j$ are smooth functions such that $A(x)\colon=(a_{jk}(x))$ is symmetric and strictly positive definite. However any diffusion operator on a domain can be written as the sum of the Laplacian and a vector field; we just need to define a suitable Riemannian metric:
Put $(g_{jk})\colon=A^{-1}$, then we get for the Laplace operator on the Riemannian manifold $M\colon=(M,(g_{jk}))$: $$ \D f=-\sum_{j,k}a_{jk}\pa_j\pa_kf+Xf~, $$ where the vector field is given by $X=\sum_j\z_j(x)\pa_j$ with $$ \z_j\colon=-\frac{\sum_k\pa_k(a_{jk}\sqrt G)}{\sqrt G} \quad\mbox{and}\quad G\colon=\det(g_{jk})=\frac1{\det A}~. $$
The Riemannian metric $(g_{jk})$ induces the geodesic distance: \begin{equation}\label{difeq4}\tag{DIF4} d_g(x,y) \colon=\inf\{L(c):c:[0,1]\rar M\mbox{ smooth, }c(0)=x,c(1)=y\} \end{equation} where $$ L(c)\colon=\int_0^1\sqrt{\sum g_{jk}(c(t))c_j^\prime(t)c_k^\prime(t)}\,dt $$ is called the length of the curve $c$ in $(M,(g_{jk}))$. This is particularly easy to compute in case the dimension equals $1$, for there is only one curve connecting two points (modulo reparametrization): $c(t)=x+t(y-x)$.
The interval $M\colon=\R^+$ with the Riemannian metric $g(x)=1/x$ produces the geodesic distance $$ \forall x,y\in M:\quad d_g(x,y)=\int_0^1\frac{|c^\prime(t)|}{\sqrt{c(t)}}\,dt=2|\sqrt y-\sqrt x|~. $$ $(M,d_g)$ is not complete!
The interval $M\colon=(-1,1)$ with the Riemannian metric $g(x)=1/(1-x^2)$ produces the geodesic distance $d_g(x,y)=|\arcsin y-\arcsin x|$. Check that $(M,d_g)$ is not complete!
Thus neither the first nor the last example in exam is covered by proposition!
The interval $M\colon=\R^+$ with the Riemannian metric $g(x)=1/x^2$ produces the geodesic distance $d_g(x,y)=|\log(x/y)|$. Verify that $(M,d_g)$ is complete.
Find a Riemannian metric on $M\colon=(-1,1)$ such that $(M,d_g)$ is complete.
Consider the diffusion operator \eqref{difeq3} on a bounded domain $M$ in $\R^d$ with smooth boundary under Neumann boundary conditions (i.e. for all $x\in\pa M$: $\la \nabla f(x),N(x)\ra=0$). 1. Suppose for some smooth function $U$ - here a smooth function will always be defined in a neighborhood of $\cl M$ - we have: $$ \forall j:\quad \sum_k a_{jk}\pa_kU=b_j+\sum_k\pa_ka_{jk}~. $$ Then $\r\colon=e^{-U}$ is the smooth density of a speed measure $\mu$, i.e. for all smooth $f,g$ satisfying the boundary conditions: $$ \int_M f(x)Hg(x)\,\mu(dx) =\int_M\sum a_{jk}(x)\pa_jf(x)\pa_kg(x)\,\mu(dx)~. $$ 2. Conclude that if $\mu$ is a probability measure then any smooth function $f$ satisfying both the boundary conditions and $Hf=0$ must be constant. 3. Alternatively, use Zaremba's principle (or Hopf's lemma) to prove 2. - this way you see that the speed measure doesn't matter at all!

Convolution semigroups

We just talk about convolution semigroups on $\R^+$: suppose $\mu_t$, $t > 0$, is a family of probability measures on $\R^+$ such that for all $s,t > 0$: $\mu_s*\mu_t=\mu_{s+t}$, where $$ \mu_s*\mu_t((0,x]) \colon=\int_{(0,x]}\mu_s((0,x-y])\,\mu_t(dy) $$ is the convolution of $\mu_s$ and $\mu_t$. The $1/2$-stable distribution defined in exam is a prominent example. Here are a few more:
For $t > 0$ let $\mu_t$ be the measure on $\R^+$ with density $$ x\mapsto\frac{e^{-x}x^{t-1}}{\G(t)}~. $$ Show that the Laplace transform of $\mu_t$ is given by $\o_t(y)=(1+y)^{-t}$. 2. Conclude that $\mu_s*\mu_t=\mu_{s+t}$. The measure $\mu_t$ is called the $\G$-distribution with parameter $t$. 3. Compute $\int x\,\mu_t(dx)$. Suggested solution.
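The semigroup property is also visible in simulation: sums of independent $\G(s)$- and $\G(t)$-distributed variables have Laplace transform $(1+y)^{-(s+t)}$; a sketch:

    import numpy as np

    rng = np.random.default_rng(4)
    s, t, n = 0.7, 1.8, 400_000
    X = rng.gamma(s, size=n) + rng.gamma(t, size=n)   # law of mu_s * mu_t
    for y in (0.5, 1.0, 2.0):
        print(np.exp(-y * X).mean(), (1 + y) ** -(s + t))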
For $t > 0$ let $\mu_t$ be the measure on $\N_0$ with density $$ n\mapsto\frac{t^ne^{-t}}{n!}~. $$ Show that the Laplace transform of $\mu_t$ is given by $\o_t(y)=\exp\big(-t(1-e^{-y})\big)$. 2. Conclude that $\mu_s*\mu_t=\mu_{s+t}$. The measure $\mu_t$ is called the Poisson distribution with parameter $t$. 3. Compute $\int x\,\mu_t(dx)$.
Suppose $\mu_t$, $t > 0$, is a family of probability measures on $\R^+$ whose Laplace transform $\vp_t$ is given by $$ \vp_t(y)=\exp\Big(-t\int_0^\infty\frac{1-e^{-xy}}{x}\,\nu(dx)\Big) $$ for some measure $\nu$ on $\R^+$; since $\vp_s\vp_t=\vp_{s+t}$, these measures form a convolution semigroup. For $0 < p < 1$ let $\nu$ be the measure on $\R^+$ with density $$ x\mapsto\frac{p}{\G(1-p)}x^{-p}~. $$ Show that for all $y > 0$: $$ \int_0^\infty\frac{1-e^{-xy}}x\,\nu(dx)=y^p~. $$
All these are particular cases of so called infinitely divisible distributions, cf. e.g. wikipedia.