MATH 403 Homework

Homework 12 due May 8

    1. Let \((\Omega, \mathcal{A}, P)\) be a probability space, and let \((A_1, A_2, \ldots, A_n)\) be a measurable partition of \(\Omega\) such that \(P(A_i) > 0\) for every \(i \in \{1, \ldots, n\}\). Prove that for every \(B \in \mathcal{A}\) with \(P(B) > 0\) and every \(i \in \{1, \ldots, n\}\),

      \(P(A_i | B) = \frac{P(A_i )P(B | A_i )}{\sum_{j=1}^n P(A_j) P(B | A_j)}\)

    2. Suppose that we have \(n\) boxes numbered \(1, 2, \ldots, n\), and that the \(i\)-th box contains \(r_i\) red balls and \(n_i\) black balls, where \(r_i, n_i \geq 1\). Imagine that one chooses a box uniformly at random, and then picks a ball (again at random) from the chosen box. Compute the probability that the \(i\)-th box was chosen given that a red ball was picked.

  1. Let \((\mathcal{A}_n )_{n∈N}\) be a decreasing sequence of sub-\(\sigma\)-fields of \(\mathcal{A}\), with \(\mathcal{A}_1 = \mathcal{A}\), and let \(X ∈ L_2(\Omega, \mathcal{A}, P)\).

    1. Prove that the random variables \(E[X \vert \mathcal{A}_n ] − E[X \vert \mathcal{A}_{n+1} ]\), for \(n ∈ N\), are orthogonal in $L_2$, and that the series

      \(\sum_{n∈N} \left( E[X | \mathcal{A}_n ] − E[X | \mathcal{A}_{n+1} ] \right)\)

      converges in \(L_2\).

    2. Let \(\mathcal{A}_∞ = \cap_{n∈N} \mathcal{A}_n\). Prove that

      \(\lim_{n→∞} E[X | \mathcal{A}_n ] = E[X | \mathcal{A}_∞ ],\)

      where the convergence holds in $L_2$.

  2. Let \(X\) and \(Y\) be two independent \(N(0,1)\) random variables. Compute \(\E[ X \vert X^2 + Y^2 ]\).

  3. Let \((X_n)_{n \in \mathbb{N}}\) be a sequence of iid random variables, and let \(S_n = \sum_{i=1}^n X_i\).

    1. Prove that if the moment generating function of \(X_1\) is finite on some interval \((-\delta,\delta)\), then for any fixed $\theta \in (-\delta,\delta)$, \(Y_n(\theta) = e^{\theta S_n}/\E[ e^{\theta S_n} ]\) is a martingale with respect to the filtration \(\mathcal{F}_n = \sigma(X_1, \ldots, X_n)\).
    2. Differentiate \(\E[ Y_n(\theta) \vert \mathcal{F}_{n-1}] = Y_{n-1}(\theta)\) twice with respect to \(\theta\), and set \(\theta = 0\) to discover a new martingale that is a function of \(S_n\).
    3. Suppose \(X_1 \in \{-1,+1\}\) with \(P(X_1 = 1) = 1/2\), i.e., \(S_n\) is the simple random walk. Prove that \(\phi(S_n,n)\) is a martingale if \(\phi\) satisfies the functional relation

      \(\frac{1}{2} (\phi(s+1,n+1) + \phi(s-1,n+1)) = \phi(s,n)\)

  4. (Ballot Theorem) In an election, candidate A has obtained \(a\) votes and candidate B has obtained \(b\) votes, where \(a > b\). The scrutineer counts the votes by reading the ballot papers one after another in a random order. We will prove that the probability that candidate A has strictly more votes than candidate B at every step of the counting process is \((a − b)/(a + b)\). First, represent the difference between the votes for A and the votes for B during the counting process by a simple random walk \(S_j\), \(j \in \{0, 1, \ldots, a + b\}\), that starts from 0, has jumps of size +1 or −1, and terminates at \(a − b\). Then note that every trajectory of this random walk has the same probability, so that the problem reduces to enumerating those trajectories that stay positive on \(\{1, \ldots, a + b\}\). This can be computed using the “reflection principle”. Let $n = a + b$. In terms of the random walk, we want to compute

    \(P( S_1 > 0, S_2 > 0, \ldots, S_{n-1} > 0 | S_n = a - b)\)

    1. The first step has to be $X_1 = 1$, otherwise the above event will not occur. So restart the walk at time $1$, run it for $n-1$ steps, and condition it to end up at $a - 1 - b$. Show

      \(P( S_1 > 0, S_2 > 0, \ldots, S_{n-1} > 0 | S_n = a - b) = \frac{P(X_1 = 1) P( S_1 \geq 0, S_2 \geq 0, \ldots, S_{n-2} \geq 0 , S_{n-1} = a - 1 - b)}{P(S_n = a - b)}\)

    2. Let $T$ be the time at which the trajectory hits $-1$ for the first time. Prove the relation

      \(P( S_1 \geq 0, S_2 \geq 0, \ldots, S_{n-2} \geq 0 , S_{n-1} = a - 1 - b) = P(S_{n-1} = a - b - 1, T > n - 1)\)

    3. Using the reflection idea we talked about in class, show

      \(P(S_{n-1} = a - b - 1, T \leq n - 1) = P(S_{n-1} = a + 1 - b)\)

    4. Use the law of total probability to show the desired result.
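
Not part of the assignment: a minimal Monte Carlo sketch of the ballot problem above (Python with numpy; the vote counts, trial count, and function name are illustrative choices). It shuffles the ballots directly and estimates the probability that A stays strictly ahead throughout the count, which should be close to \((a - b)/(a + b)\).

```python
import numpy as np

def ballot_lead_probability(a, b, trials=100_000, seed=0):
    """Estimate the probability that A stays strictly ahead during the whole count."""
    rng = np.random.default_rng(seed)
    ballots = np.array([+1] * a + [-1] * b)     # +1 = vote for A, -1 = vote for B
    hits = 0
    for _ in range(trials):
        rng.shuffle(ballots)
        walk = np.cumsum(ballots)               # S_1, ..., S_{a+b}
        if np.all(walk > 0):                    # A strictly ahead at every step
            hits += 1
    return hits / trials

a, b = 6, 4
print("simulated probability:", ballot_lead_probability(a, b))
print("(a - b)/(a + b)      :", (a - b) / (a + b))
```

For \(a = 6\) and \(b = 4\) the target value is \(0.2\).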

Homework 11 due Apr 25

  1. Given two distribution functions \(F\) and \(G\), let \(\rho\) be defined by \(\rho(F,G):=\inf\{\varepsilon: F(x-\varepsilon)-\varepsilon\leq G(x)\leq F(x+\varepsilon)+\varepsilon\text{ for all }x\}.\)

    1. Prove that \(\rho(F,G)\) is equal to the side length of the largest square with sides parallel to the axes that can fit between the graphs of \(F\) and \(G\).

    2. Show that the function \(\rho\) defined in the previous part is a metric on the space of distribution functions, and that \(\rho(F_n,F)\to 0\) if and only if \(F_n\Rightarrow F\). The metric \(\rho\) is called the Lévy metric.

    3. Show that if \(F_n\Rightarrow F_\infty\) then there are random variables \(Y_n\) with distribution functions \(F_n\) such that \(Y_n\to Y_\infty\) almost surely.

    4. Let \(f\geq 0\) be continuous. Show that if \(X_n\Rightarrow X_\infty\) then \(\liminf_{n\to\infty}E f(X_n)\geq E f(X_\infty).\)

    Solution

    1. Suppose \(\varepsilon<\rho(F,G)\). Then \(\exists x\) such that \(G(x)<F(x-\varepsilon)-\varepsilon\) or \(G(x)>F(x+\varepsilon)+\varepsilon\). In the first case the square of side length \(\varepsilon\) with upper left corner at \((x-\varepsilon,F(x-\varepsilon))\) fits between the graphs of \(F\) and \(G\). In the second case the square of side length \(\varepsilon\) with lower right corner at \((x+\varepsilon,F(x+\varepsilon))\) fits between the graphs of \(F\) and \(G\). It follows that the side length of the largest square that fits between the graphs is at least \(\varepsilon\) for all \(\varepsilon<\rho(F,G)\), so it is at least \(\rho(F,G)\).

      Now, suppose a square of side length \(\varepsilon\) fits between the graphs of \(F\) and \(G\). Suppose it is below the graph of \(F\) and above the graph of \(G\). Let \((x,y)\) and \((z,w)\) be the coordinates of respectively the lower right and upper left corners of the square. We have \(G(x)\leq y=w-\varepsilon\leq F(z)-\varepsilon=F(x-\varepsilon)-\varepsilon\). Since \(F\) is non-decreasing, it follows that for any \(0<\delta<\varepsilon\) we have \(G(x)\leq F(x-\delta)-\delta\). This implies \(\varepsilon\leq \rho(F,G)\), and therefore that the side length of the largest square that fits between the graphs is at most \(\rho(F,G)\).

    2. Suppose $\rho(F,G)=0$. Then $\forall\varepsilon>\delta>0$, $F(x-\varepsilon)\leq F(x-\delta)\leq G(x)+\delta$. Hence $F(x-\varepsilon)\leq G(x)$, whence $F(x^-)\leq G(x)$. Similarly $G(x)\leq F(x^+)=F(x)$. Suppose $G(x)<F(x)$. Then it follows from the right continuity of $G$ that $\exists\varepsilon>0$ such that $G(x)\leq G(x+\varepsilon)<F(x)$. But then $F(x)\leq F((x+\varepsilon)^-)\leq G(x+\varepsilon)<F(x)$, which is a contradiction. Thus $\rho(F,G)=0\Rightarrow F=G$.

      The other conditions of a metric are straightforward.

      Suppose \(\rho(F_n,F)\rightarrow 0\), let \(x\) be a point of continuity of \(F\) and let \(\varepsilon>0\). Pick \(\varepsilon/2>\delta>0\) so that if \(\vert y-x\vert <2\delta\) then \(\vert F(y)-F(x)\vert <\varepsilon/2\). If \(n\) is large enough, then \(\rho(F_n,F)<\delta\), so \(F(x-\delta)-\delta\leq F_n(x)\leq F(x+\delta)+\delta\). We have \(F(x-\delta)-\delta>F(x)-\varepsilon/2-\varepsilon/2\) and \(F(x+\delta)+\delta<F(x)+\varepsilon\), so \(\vert F_n(x)-F(x)\vert <\varepsilon\). This is true \(\forall\varepsilon>0\) so \(F_n(x)\rightarrow F(x)\), whence \(F_n\Rightarrow F\).

      Suppose \(F_n\Rightarrow F\) but \(\rho(F_n,F)\nrightarrow 0\). Then \(\exists\varepsilon>0\) and a subsequence \(F_{n(k)}\) such that \(\rho(F_{n(k)},F)>\varepsilon\). WLOG \(n(k)=n\). Equivalently, \(\exists x_n\) such that \(F_n(x_n)\geq F(x_n+\varepsilon)+\varepsilon\) or \(F_n(x_n)\leq F(x_n-\varepsilon)-\varepsilon\) for all \(n\). Either one or the other inequality will happen infinitely often, so WLOG let’s assume the first one happens for all \(n\). \(\exists M:F(M)>1-\varepsilon\), so \(x_n<M\ \forall n\). Thus \(x_n\) has a subsequence converging to a point in \([-\infty,M]\). Again, WLOG \(x_n\) itself converges. If \(x_n\rightarrow -\infty\), pick \(t:F(t)<\varepsilon/2\), \(F\) continuous at \(t\). If \(n\) is large enough then \(x_n<t\) whence \(F_n(t)\geq F_n(x_n)\geq F(x_n+\varepsilon)+\varepsilon\geq \varepsilon>F(t)+\varepsilon/2\). It follows that \(F_n(t)\nrightarrow F(t)\), which is a contradiction.

      If \(x_n+\varepsilon/2\rightarrow t\in(-\infty,M)\), let \(s\in (t-\varepsilon/4,t+\varepsilon/4)\) be such that \(F\) is continuous at \(s\). If \(n\) is large enough, then \(x_n<s<x_n+\varepsilon\). Then \(F_n(s)\geq F_n(x_n)\geq F(x_n+\varepsilon)+\varepsilon\geq F(s)+\varepsilon\), which implies \(F_n(s)\nrightarrow F(s)\), a contradiction.

    3. hw11-1.3-solution.pdf

  2. Riemann-Lebesgue lemma. Prove that \(\lim_{\vert t\vert \to \infty} \E [ \exp( i t \cdot X) ] = 0\) for all $k$-dimensional absolutely continuous random variables $X$. Is it still true if $X$ is not absolutely continuous?

  3. Let $X_j$ be independent $L^2$ random variables in $\R$. Assume there exists a $\delta > 0$ such that

    \(\lim_{n \to \infty} \frac{1}{s_n^{2 + \delta}} \sum_{j=1}^n E\left[ |X_j - \mu_j|^{2+\delta} \right] = 0,\)

    where \(\mu_j\) and \(\sigma_j^2\) are the mean and variance of \(X_j\), and \(s_n^2 = \sum_{j=1}^n \sigma_j^2\).

    Prove

    \(\frac{S_n - \sum_{j=1}^n \mu_j}{s_n} \Rightarrow N(0,1),\)

    where \(S_n = X_1 + \cdots + X_n\).


  4. The Simple Random Walk. Let \(e_1,\ldots,e_d\) denote the usual basis vectors of \(\R^d\). Let \(\{X_i\}_{i=1}^\infty\) be iid random variables such that \(\Prob \left( X_1 = \pm e_j \right) = \frac{1}{2d}\) for \(j = 1, \ldots, d\). Then \(S_n = X_1 + \cdots + X_n\) is the simple random walk on \(\Z^d\). Show that there are vectors $a_n$ and scalars $b_n > 0$ such that $(S_n - a_n)/b_n$ converges weakly to a nontrivial distribution, and compute that distribution. (A simulation sketch appears at the end of this homework.)
  5. On \((\Omega,\Sigma,\Prob)\) let \(X_n \to X \, \almostsurely\). Let \(\mathcal{G}\) be a sub-\(\sigma\)-algebra of \(\Sigma\).
    1. If \(X_n \geq 0\), show that \(\liminf_{n \to \infty} E[X_n \vert \mathcal{G}] \geq \E[X \vert \mathcal{G}]\) \(\almostsurely\).
    2. If \(\vert X_n\vert \leq Y\) and \(Y \in L^1(\Omega,\Sigma,\Prob)\), show \(\lim_{n \to \infty} E[X_n \vert \mathcal{G}] = E[ X \vert \mathcal{G}]\).
  6. Prove the conditional Hölder inequality stated in class.
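
Not part of the assignment: a simulation sketch for problem 4 above (Python with numpy; the dimension, walk length, and sample size are illustrative choices). With the guesses \(a_n = 0\) and \(b_n = \sqrt{n}\), the sample mean and covariance of \(S_n/\sqrt{n}\) should look like those of a centered Gaussian vector with covariance \(\frac{1}{d} I\), which is what the multivariate CLT suggests.

```python
import numpy as np

def srw_endpoints(d=2, n=10_000, samples=2_000, seed=0):
    """Sample independent copies of S_n for the simple random walk on Z^d."""
    rng = np.random.default_rng(seed)
    coords = rng.integers(0, d, size=(samples, n))   # which coordinate each step moves in
    signs = rng.choice([-1, 1], size=(samples, n))   # the sign of each step
    S = np.zeros((samples, d))
    for j in range(d):
        S[:, j] = np.sum(signs * (coords == j), axis=1)
    return S

d, n = 2, 10_000
Z = srw_endpoints(d, n) / np.sqrt(n)       # candidate normalization: a_n = 0, b_n = sqrt(n)
print("sample mean      :", np.round(Z.mean(axis=0), 3))
print("sample covariance:\n", np.round(np.cov(Z.T), 3))   # compare with (1/d) * identity
```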

Homework 10 due Apr 18

    1. Suppose that the family of probability measures \(\{\mu_i,i\in I\}\) is tight, that is, \(\sup_i\mu_i([-M,M]^c)\to 0\) as \(M\to\infty\). Show that their characteristic functions \(\phi_i\) are equicontinuous, i.e., if \(\varepsilon>0\), we can pick \(\delta>0\) such that \(\vert h\vert <\delta\) implies \(\vert \phi_i(t+h)-\phi_i(t)\vert <\varepsilon\) for all \(i\) and all \(t\).
    2. Suppose \(\mu_n\Rightarrow\mu_\infty\). Show that \(\phi_n\to\phi_\infty\) uniformly on compact sets. Hint: Look at the statement and proof of the Arzelà-Ascoli theorem.
    3. Give an example to show that the convergence need not be uniform on the whole real line.

    Solution

    1. Let \(1>\varepsilon>0\) and let \(n\) be large enough that \(\sup_i\mu_i([-n,n]^c)<\varepsilon/8\). Then \(\vert \phi_i(t+h)-\phi_i(t)\vert \leq \E\vert e^{i hX_i}-1\vert \leq \E\min(\vert hX_i\vert ,2)=\vert h\vert \E\vert X_i\vert 1_{\vert X_i\vert <2/\vert h\vert }+2\mu_i(\{\vert X_i\vert \geq 2/\vert h\vert \})\). If \(\vert h\vert <\varepsilon/(2n)\) then the last term is \(<\varepsilon/4\). For the other term we have \(\vert h\vert \E\vert X_i\vert 1_{\vert X_i\vert <2/\vert h\vert }<\vert h\vert \E\vert X_i\vert 1_{\vert X_i\vert \leq n}+\vert h\vert \E\vert X_i\vert 1_{n< \vert X_i\vert <2/\vert h\vert }\leq \vert h\vert n+\vert h\vert 2/\vert h\vert \varepsilon/8<3\varepsilon/4\). Combining with the previous estimate we get \(\vert \phi_i(t+h)-\phi_i(t)\vert <\varepsilon\) for all \(\vert h\vert <\varepsilon/(2n)\). Thus, the characteristic functions are equicontinuous.

    2. It’s enough to show that \(\mu_n\Rightarrow\mu_\infty\) implies that \((\mu_n)\) is tight, since by the previous part the characteristic functions \(\phi_n\) are equicontinuous, and it’s a standard result of analysis that a pointwise convergent, equicontinuous sequence of functions converges uniformly on compact sets.

      Suppose \(\mu_n\) is not tight. Then \(\exists\varepsilon>0\) and a subsequence \(n(k)\to\infty\) such that \(1-F_{n(k)}(k)+F_{n(k)}(-k)\geq \varepsilon\) for all \(k\). Let \(r<0<s\) be continuity points of \(\mu_\infty\). We have

      \(\begin{align*} 1-F(s)+F(r)&=\lim_{j\to\infty}1-F_{n(j)}(s)+F_{n(j)}(r) \\&\geq \liminf_{j\to\infty}1-F_{n(j)}(j)+F_{n(j)}(-j)\geq \varepsilon. \end{align*}\)

      Sending \(s\to\infty\) and \(r\to-\infty\) we see that \(F\) is not a distribution function.

    3. Take \(X_n=1/n\) (deterministic): then \(\phi_n(t)=e^{it/n}\to 1\) pointwise, but \(\sup_{t\in\R}\vert e^{it/n}-1\vert =2\) for every \(n\), so the convergence is not uniform on \(\R\).

  1. Show that \(C_c(\R^d)\) contains a countable, dense subset (in the sup norm). Also show that the set of functions with three continuous, bounded derivatives (like the ones we used in Lindeberg’s proof of the CLT) is dense in \(C_c(\R^d)\).

  2. Let \(g_\sigma\) be the \(d\)-dimensional \(N(0, \sigma^2 I)\) density, and let $f \in C_c(\R^d)$.

    1. Show that \(f \ast g_\sigma(x) = \E[f(x + \sigma Z)]\) where \(Z\) is a standard $d$-dimensional Normal random variable with the identity matrix as its covariance.
    2. Show \(\sup_x \vert f \ast g_\sigma(x) - f(x)\vert \to 0\) as \(\sigma \to 0\). Hint: Use bounded convergence and part 1. (A numerical sketch of parts 1 and 2 appears at the end of this homework.)
    3. Suppose \(f\) is the pdf of a random variable \(X\), and let \(Z\) be an independent \(N(0,\sigma^2)\) random variable. Find the pdf of \(X + Z\) in terms of the pdfs of \(X\) and \(Z\).
  3. Differentiation under the integral sign: Let \(h \colon I \times \Omega \to \R\) be such that \(h(x,\cdot) \in L^1(\Omega,\Sigma,\Prob)\) for every \(x\) and \(h(\cdot,\omega)\) is differentiable at some \(x_0 \in I\), where $I$ is an open interval. Suppose \(\vert h(x,\omega) - h(x_0,\omega)\vert \leq g(\w)\vert x - x_0\vert\), where \(g \in L^1(\Omega,\Sigma,\Prob)\). Let \(F(x) = \E[ h(x,\w) ]\), where the expectation is over $\W$. Doing this problem in general $d \geq 1$ is preferable.

    1. Use the dominated convergence theorem to show that \(F'(x_0) = \E[ \partial_x h(x_0,\w) ]\).
    2. In $d=1$, show that \(g_\sigma\) has bounded \(k\)th derivative for any \(k=0,1,\ldots\). Hint: Use the generalized Leibniz rule and consider a proof by induction.
    3. As before, let \(f \in C_b(\R)\). Show that \(\phi(x) = f \ast g_\sigma(x)\) is infinitely differentiable, and compute its derivatives.
    4. Find \(\widehat{\phi^{(k)}(x)}\), the Fourier transform of the \(k\)th derivative of \(\phi\).
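
Not part of the assignment: a Monte Carlo sketch of parts 1 and 2 of problem 2 above, in dimension \(d = 1\) (Python with numpy; the tent test function and the values of sigma are illustrative choices). It estimates \(f \ast g_\sigma(x)\) as the average of \(f(x + \sigma Z)\) over Gaussian samples and watches the sup-norm gap on a grid shrink as \(\sigma \to 0\).

```python
import numpy as np

def smoothed(f, xs, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of (f * g_sigma)(x) = E[f(x + sigma Z)], Z ~ N(0,1), in d = 1."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    # For each grid point x, average f over the Gaussian-shifted samples.
    return np.array([f(x + sigma * z).mean() for x in xs])

def f(x):
    # A continuous, compactly supported test function (a tent on [-1, 1]).
    return np.maximum(0.0, 1.0 - np.abs(x))

xs = np.linspace(-2.0, 2.0, 201)
for sigma in [1.0, 0.3, 0.1, 0.03]:
    gap = np.max(np.abs(smoothed(f, xs, sigma) - f(xs)))
    print(f"sigma = {sigma:5.2f}   max over grid of |f*g_sigma - f| = {gap:.4f}")
```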

Homework 9 due Apr 11

  1. Let \(X = L^p(\Omega,\mathcal{F},\mu)\) where \(\mu\) is a \(\sigma\)-finite measure. For \(1 < p < \infty\), identify \(X^*\), the space of bounded linear functionals on \(X\). You may look in your favorite analysis book if you cannot produce the proof yourself.

  2. Let \((\Omega, \mathcal{F}, \mu)\) be a \(\sigma\)-finite measure space. Let \(\Omega = \sqcup_n A_n\), where \(0 < \mu(A_n) < \infty\) for every \(n\). Let \(F\) be the linear subspace of \(L^\infty(\Omega,\mathcal{F},\mu)\) spanned by the functions \(1\) and \(\{1_{A_n}\}_{n=1}^\infty\). Show that for every $f \in F$, the limit

    \(\lim_{n \to \infty} \frac{1}{\mu(A_n)} \int 1_{A_n} f d \mu\)

    exists in \(\R\). Using the Hahn-Banach theorem, construct a continuous linear functional \(\Phi\) on \(L^\infty(\Omega,\mathcal{F},\mu)\) which cannot be represented as \(\Phi(f) = \int f g d\mu\) for some \(g \in L^1(\Omega,\mathcal{F},\mu)\).

  3. Show that $C_c(\R^d)$ with the sup norm is not a Banach space. Show that $C_0(\R^d)$ with the sup norm is a Banach space. What is the dual space of $C_0(\R^d)$? It is sufficient to identify the dual space; no proof is necessary.

  4. Find a random variable $X$ whose pdf is continuous on $\R$ and has compact support, but whose characteristic function is not in $L^1$.

  5. Compute the moment generating functions and characteristic functions of the \(\Uniform[0,1]\), \(\Exponential(\lambda)\), and \(\operatorname{Gamma}(\alpha)\) distributions. Here \(\alpha\) is the shape parameter of the gamma distribution.

  6. Let \(\{X_i\}_{i=1}^n\) be iid \(\Exponential(\lambda)\) random variables. Identify the distribution of \(S_n = X_1 + \cdots + X_n\) by computing its characteristic function. Let \(\tilde{S}_n = \frac{S_n - E[S_n]}{\sqrt{\Var(S_n)}}\) be the centered and scaled (standardized) version of \(S_n\). Compute

    \(\lim_{n \to \infty} \phi_{\tilde{S}_n}(t)\)

    and identify the limit of \(\tilde{S}_n\).
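
Not part of the assignment: a quick numerical sketch for the limit in problem 6 (Python with numpy; \(\lambda = 1\) and the values of \(t\) are illustrative choices). Independence gives \(\phi_{S_n}(t) = (1 - it/\lambda)^{-n}\), so \(\phi_{\tilde{S}_n}\) has a closed form that can be evaluated for growing \(n\) and compared with \(e^{-t^2/2}\), the standard normal characteristic function.

```python
import numpy as np

def phi_standardized(t, n, lam=1.0):
    """phi of (S_n - E S_n)/sqrt(Var S_n), where S_n is a sum of n iid Exp(lam) variables.

    Uses phi_{S_n}(u) = (1 - i u / lam)**(-n) and phi_{aX+b}(t) = exp(i t b) phi_X(a t)
    with a = lam/sqrt(n) and b = -sqrt(n)."""
    a = lam / np.sqrt(n)
    return np.exp(-1j * t * np.sqrt(n)) * (1 - 1j * a * t / lam) ** (-n)

t = np.array([0.5, 1.0, 2.0])
for n in [10, 100, 10_000]:
    print(f"n = {n:6d}:", np.round(phi_standardized(t, n), 4))
print("e^(-t^2/2):", np.round(np.exp(-t ** 2 / 2), 4))
```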

Homework 8 due Apr 4

Problem 1

Let \(\{X_{n,k}\}_{1\leq k\leq n}\) be a triangular array of random variables such that \(X_{n,1},\dots,X_{n,n}\) are independent for every \(n\). Let \(b_n>0,\lim_{n\to\infty}b_n=\infty\) and define the truncated variables \(\bar{X}_{n,k}=X_{n,k}1_{\vert X_{n,k}\vert \leq b_n}\). Suppose

  1. \(\sum_{k=1}^nP(\vert X_{n,k}\vert >b_n)\to 0\) as \(n\to \infty\),

  2. \(b_n^{-2}\sum_{k=1}^nE\bar{X}_{n,k}^2\to 0\) as \(n\to\infty\).

Let \(S_n=X_{n,1}+\dots+X_{n,n}\) and \(a_n=\sum_{k=1}^nE\bar{X}_{n,k}\). Show that \(\frac{S_n-a_n}{b_n}\to 0\) in probability as \(n\to\infty\).

Solution

Let \(T_n=\sum_{i=1}^n\bar{X}_{n,i}\). We have \(a_n=ET_n\). Moreover, for every \(\varepsilon>0\) we have

\(P\left(\left\vert \frac{T_n-a_n}{b_n}\right\vert >\varepsilon\right)\leq \frac{\Var (T_n/b_n)}{\varepsilon^2}=\frac{\sum_{i=1}^n \Var(\bar{X}_{n,i})}{\varepsilon^2b_n^2}\leq\frac{\sum_{i=1}^n E(\bar{X}_{n,i}^2)}{\varepsilon^2b_n^2}\to 0\)

as \(n\to\infty\).

Comparing \(T_n\) to \(S_n\) we get

\(\begin{align*} P\left(\left|\frac{S_n-a_n}{b_n}\right|>\varepsilon\right) &=P\left(\left|\frac{S_n-a_n}{b_n}\right|>\varepsilon,T_n=S_n\right) +P\left(\left|\frac{S_n-a_n}{b_n}\right|>\varepsilon,T_n\neq S_n\right) \\&\leq P\left(\left|\frac{T_n-a_n}{b_n}\right|>\varepsilon,T_n=S_n\right) +P\left(T_n\neq S_n\right) \\&\leq P\left(\left|\frac{T_n-a_n}{b_n}\right|>\varepsilon\right) +\sum_{i=1}^n P\left(\bar{X}_{n,i}\neq X_{n,i}\right) \\&= P\left(\left|\frac{T_n-a_n}{b_n}\right|>\varepsilon\right) +\sum_{i=1}^n P\left(|X_{n,i}|> b_{n}\right)\to 0 \end{align*}\)

as \(n\to\infty\).

Problem 2

If \(n\) is an integer, a permutation of \(\{1,2,\dots,n\}\) is a reordering of these integers. E.g. \(\pi=(7,5,8,3,2,1,6,4)\) is a permutation for \(n=8\). There are, of course, \(n!\) permutations of \(\{1,2,\dots,n\}\). There are many ways to represent permutations. For example, the same \(\pi\) as above can be written as

\(\pi=\left(\begin{array}{cccccccc} 1&2&3&4&5&6&7&8\\ 7&5&8&3&2&1&6&4 \end{array}\right),\)

representing the fact that \(7\) is in the first position, \(5\) in the second, etc. A different way to think about \(\pi\) is a function from \(\{1,2,\dots,8\}\) to \(\{1,2,\dots,8\}\), sending \(1\to 7\), \(2\to 5\), etc. Lastly, when thinking of \(\pi\) as a function, we can write it in cycle form: we start with \(1\), and see where it maps to, in this case \(7\), then see where \(7\) maps to, in this case \(6\), and \(6\) maps to \(1\) completing our first cycle \((176)\). We start with the smallest integer we haven’t encountered, \(2\) here, and repeat: \(2\) maps to \(5\), \(5\) maps back to \(2\), so we get another cycle \((25)\). Lastly we get the cycle \((384)\). Thus, the cycle structure of \(\pi\) is \(\pi=(176)(25)(384)\), so it has three cycles.

Let’s ask the following question: if we pick a permutation uniformly at random out of the \(n!\) different ones (i.e. all \(n!\) are equally likely), how does the number of cycles behave as \(n\to\infty\)?

In this problem you will answer this question. In the cycle-structure construction above let \(X_{n,k}\) be \(1\) if the \(k\)-th number encountered finishes a cycle and \(0\) otherwise. In our example \(X_{8,3}=X_{8,5}=X_{8,8}=1\) and all the others are \(0\).

  1. Prove that \(X_{n,1},\dots,X_{n,n}\) are in fact independent, and \(P(X_{n,j}=1)=\frac{1}{n-j+1}\) for all \(j=1,\dots,n\).

  2. Let \(S_n=X_{n,1}+\dots+X_{n,n}\), \(\mu_n=ES_n\) and \(\sigma_n^2=\Var(S_n)\). What does \(S_n\) represent?

  3. Show that $\forall\varepsilon>0$ we have \(\frac{S_n-\sum_{m=1}^n\frac 1m}{(\ln n)^{0.5+\varepsilon}}\to 0\) in probability, as $n\to\infty$.

  4. Conclude that \(\frac{S_n}{\ln n}\to 1\) in probability as \(n\to\infty\).

Solution

  • Pick the random permutation by building the cycle structure. If we already have \((176)(25\), we have \(4\) choices for what comes next, namely \(),3,4,8\). Pick one uniformly at random. After \()\) the next one is deterministic. Under this procedure the probability of each permutation is \(1/n!\), so we get a uniformly random permutation. Now it is easy to see the independence of the \(X_{n,i}\)’s and that \(P(X_{n,i}=1)=\frac{1}{n-i+1}\).

    The above is Sevak’s proof, which I quite like. There are other ways of doing this of course, and one is by induction.

  • $S_n$ represents the number of cycles.
  • This is essentially a special case of the previous problem (the triangular weak law of large numbers).
  • Take $\varepsilon $ to be $0.5$ in the previous part.
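
Not part of the assignment: a short simulation sketch for Problem 2 (Python; the helper name count_cycles and the trial counts are illustrative). It averages the number of cycles of uniformly random permutations and compares the average with the harmonic sum \(\sum_{m=1}^n 1/m\) and with \(\ln n\).

```python
import math
import random

def count_cycles(perm):
    """Number of cycles of a permutation given as a list: perm[i] is the image of i."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return cycles

def average_cycle_count(n, trials=2_000, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)          # uniformly random permutation of {0, ..., n-1}
        total += count_cycles(perm)
    return total / trials

for n in [10, 100, 1000]:
    harmonic = sum(1.0 / m for m in range(1, n + 1))
    print(f"n = {n:5d}   average cycles = {average_cycle_count(n):6.3f}"
          f"   H_n = {harmonic:6.3f}   ln n = {math.log(n):6.3f}")
```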

Problem 3

Suppose \((X_i)_{i=1}^\infty\) is a sequence of iid random variables on \((\Omega,\Sigma,P)\) that take values in some topological space \((E,\mathcal{B}(E))\). Show that if \(A \in \mathcal{B}(E)\) is any set such that \(P(X_1 \in A) > 0\), then \(P(X_i \in A \text{ i.o.}) = 1\). This is the probabilistic iid special case of the Poincaré recurrence theorem, which you will revisit in Ergodic theory.

Suppose we have a monkey typing out letters on a typewriter. Let \(X_i \in \mathcal{A}\) be the \(i\)th letter typed out by the monkey, where \(\mathcal{A}\) is the English alphabet (including spaces and punctuation marks). Let \(n\) be the length of Shakespeare’s Hamlet in this alphabet. Show, using the previous result, that if the monkey types infinitely many letters, then with probability one the monkey will type out Hamlet at some point. (Please prove this statement in an appropriate mathematical model.)

Problem 4

Prove that in any group of six people there are three mutual acquaintances or three mutual strangers. Compute the Ramsey number \(R_3\).

Problem 5

Let \(\pi\) denote a collection of edges in the graph \(\Z^d\). We say that \(\pi\) is a path of length \(n\) if there are \(n\) edges in \(\pi\) and they connect neighboring points in \(\Z^d\), starting at the origin. A path is self-avoiding if no 3 of its edges share a vertex.

  1. Prove that there are at most $\lambda_{n,d} := 2d(2d - 1)^{n-1}$ self-avoiding paths of length $n$.
    2. Recall that in edge-percolation you keep each edge of \(\Z^d\) with probability \(p\) and discard it with probability \(1-p\), independently across edges. Prove that the expected number of self-avoiding paths of length \(n\) that remain after keeping and discarding edges is at most \(p^n \lambda_{n,d}\). Use this to prove that \(p_c(\Z^d) \geq (2d - 1)^{-1}\).

This is called Peierls’ argument in percolation theory.
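
Not part of the assignment: a brute-force sketch related to Problem 5 (Python; feasible only for small \(n\)). It counts paths from the origin in \(\Z^2\) that never revisit a vertex, which is the usual self-avoiding convention and may be slightly stronger than the definition above, and compares the count with the bound \(\lambda_{n,d} = 2d(2d-1)^{n-1}\).

```python
def count_saw(n, d=2):
    """Count self-avoiding (vertex-avoiding) paths of length n in Z^d starting at the origin."""
    steps = []
    for j in range(d):
        e = [0] * d
        e[j] = 1
        steps.append(tuple(e))
        steps.append(tuple(-c for c in e))

    def extend(current, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for s in steps:
            nxt = tuple(c + dc for c, dc in zip(current, s))
            if nxt not in visited:
                total += extend(nxt, visited | {nxt}, remaining - 1)
        return total

    origin = tuple([0] * d)
    return extend(origin, {origin}, n)

d = 2
for n in range(1, 9):
    bound = 2 * d * (2 * d - 1) ** (n - 1)
    print(f"n = {n}:  paths = {count_saw(n, d):6d}   bound 2d(2d-1)^(n-1) = {bound}")
```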

Homework 7 due Mar 28

  1. Suppose \(\W = \{1,\ldots,n\}\) and let \(H(\Prob) = - \sum_{x=1}^n \Prob(x) \log \Prob(x)\) be the entropy of a probability measure \(\Prob\) on \(\W\). Show that \(H\) is maximized by the uniform measure.

  2. Let \((X_n)_{n\in \N}\) be a sequence of uncorrelated real random variables that are uniformly bounded in $L^2$, and let \(S_n = X_1 + \cdots + X_n\). Show that

    1. \(\frac{S_{n^2} - \E[S_{n^2}]}{n^2} \to 0\) as \(n \to \infty\) a.s.
    2. Deduce that \(\frac{S_n - \E[S_n]}{n} \to 0, \almostsurely.\)
  3. Suppose \((X_n)_{n=1}^\infty\) is an iid sequence such that \(X_n \in L_4(\Omega,\Sigma,P)\). Compute \(\operatorname{Var}(S_n/n)\) and use the Borel-Cantelli lemma to conclude that

    \(\frac{S_{n} - \mathbb{E}[S_{n}]}{n} \to 0 \text{ a.s.}\)

  4. Let \(\{X_n:n\geq 1\}\) be independent and exponentially distributed with parameter \(1\).

    1. Show

      \(\mathbb{P}\left(\limsup_{n\rightarrow\infty}\frac{X_n}{\ln n}=1\right) = 1.\)

    2. Let \(Z_n = \max\{ X_1, \ldots, X_n \}\). Verify that

      \(\liminf_{n \to \infty} \frac{Z_n}{\log n } \geq 1, \almostsurely\)

    3. For an appropriately chosen sequence \(n_k \to \infty\), show that

      \(\limsup_{k \to \infty} \frac{Z_{n_k}}{\log n_k } \leq 1, \almostsurely\)

      Conclude that \(\lim_{n \to \infty} (\log n)^{-1} Z_n = 1, \almostsurely\).

  5. Suppose \((X_n)_{n=1}^\infty\) is an iid sequence such that \(\E[X_n^+] = +\infty\) and \(\E[X_n^-] < \infty\), and let \(S_n = X_1 + \cdots + X_n\). Show that \(S_n/n \to +\infty ~\almostsurely\).
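
Not part of the assignment: a quick simulation sketch related to problem 5 (Python with numpy). The illustrative choice \(X_i = 1/(1 - U_i)\) with \(U_i\) uniform on \([0,1)\) satisfies \(P(X_i > x) = 1/x\) for \(x \geq 1\), so \(X_i \geq 0\) and \(\E[X_i^+] = \infty\); the running averages \(S_n/n\) should keep drifting upward rather than settling near a finite value.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
X = 1.0 / (1.0 - rng.random(N))   # P(X > x) = 1/x for x >= 1: nonnegative, infinite mean
S = np.cumsum(X)
for n in [10**k for k in range(2, 7)]:
    print(f"n = {n:8d}   S_n / n = {S[n - 1] / n:8.2f}")
```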

Homework 6 due Mar 7

  1. Let \(\{X_i\}_{i=1}^\infty\) be iid random variables, all taking values \(0\) and \(1\) with probability \(1/2\) each, and define \(Y := \sum_{i=1}^\infty 4^{-i} X_i\). If $\mu_Y$ denotes the distribution of $Y$, then $\mu_Y$ is a probability measure on $([0,1],\mathcal{B}([0,1]))$. Show that

    1. \(\mu_Y(\{x\}) = 0\) for all \(x \in [0,1]\).
    2. Let \(A\) be the range of \(Y\), \(A \subset [0,1]\). Show that \(\mu_Y(A) = 1\) but \(\lambda(A) = 0\), where \(\lambda\) is Lebesgue measure.
    3. Find the Hausdorff dimension of \(A\).
  2. Consider a population of \(n\) individuals and \(r ∈ \{1, \ldots , n − 1\}\).

    1. Give a probability space for the random experiment consisting of choosing at random a sample of \(r\) individuals from the population.
    2. Suppose that the population is composed of individuals of two types, with \(n_i\) individuals of type \(i\) for \(i=1,2\) and \(n_1 + n_2 = n\). Let \(X\) be the number of individuals of type 1 in the sample. Prove that the law of \(X\) is given by

      \(P(X = k) = \binom{n_1}{k} \binom{n_2}{r-k} / \binom{n}{r}\)

      for every \(k ∈ \{0, 1, \ldots, r\}\) (with the convention that \(\binom{k}{j}=0\) if \(j > k\)). This is the so-called hypergeometric distribution.

    3. Show that when \(n\), \(n_1\), \(n_2 → ∞\) in such a way that \(n_1/n\) tends to \(p ∈ (0, 1)\), and \(r\) remains fixed, the law of \(X\) becomes close to the \(\Binomial(r, p)\) distribution. Interpret this result.
  3. Let \(X=(X_1,\ldots,X_n)\) be an absolutely continuous random vector with density \(p(x_1,\ldots,x_n)\). Show that \(X_1,\ldots,X_n\) are independent iff the density factors as a product, \(p(x_1,\ldots,x_n) = \prod_{i=1}^n p_i(x_i)\), in which case each marginal \(P_{X_i}\) has density \(p_i(x)/\int p_i(y)\,dy\).

  4. Let \(Z \geq 0\) be a random variable in \(L^2\) and let \(\theta \in (0,1)\). Show that \(\Prob(Z > \theta \E[Z]) \geq (1-\theta)^2 \E[Z]^2/\E[Z^2]\). This is the Paley-Zygmund inequality.

  5. Let \(\{A_i\}\) be a set of independent events, let \(Z_n = \sum_{i=1}^n 1_{A_i}\), and suppose \(\E[Z_n] \to \infty\).
    1. Show $\E[Z_n^2] \leq 2 \E[Z_n]^2$ for all large enough $n$.
    2. Prove the second Borel-Cantelli lemma using the Paley-Zygmund inequality.
  6. Prove the Kolmogorov one-series theorem: if \(\{X_i\}\) are independent mean-zero random variables and \(\sum_j \E[X_j^2] < \infty\), then \(\sum_i X_i\) converges almost surely. Use this to prove the following: if \(\{X_i\}_{i=1}^\infty\) are iid \(L^2\) mean-zero random variables, then \(\sum_{i=1}^\infty X_i/i\) converges almost surely; i.e., the random harmonic series converges.
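
Not part of the assignment: a simulation sketch for the last claim in problem 6, taking \(X_i = \pm 1\) fair signs (Python with numpy; the number of terms and runs are illustrative choices). Each run prints the partial sums \(\sum_{i \leq n} X_i/i\) at a few values of \(n\); they should settle down to a run-dependent limit.

```python
import numpy as np

rng = np.random.default_rng(1)
n_terms = 1_000_000
for run in range(3):
    signs = rng.choice([-1.0, 1.0], size=n_terms)            # iid fair +/-1 signs
    partial = np.cumsum(signs / np.arange(1, n_terms + 1))    # partial sums of sum_i X_i / i
    values = [round(float(partial[10**k - 1]), 4) for k in range(2, 7)]
    print(f"run {run}: partial sums at n = 1e2, ..., 1e6:", values)
```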

Homework 5 due Feb 25

  1. Let \(C_c(\R^d)\) be the set of continuous functions with compact support. Show that they are dense in \(L^p(\R^d, \mathcal{B}(\R^d), \lambda)\) for \(p \in [1,\infty)\).

  2. Let \(p \in [1,\infty)\) and \(q = p^*\) the conjugate Hölder exponent. Let \(f \in L^p(\R^d, \mathcal{B}(\R^d), \lambda)\) and \(g \in L^q (\R^d, \mathcal{B}(\R^d), \lambda)\), where \(\lambda\) is Lebesgue measure. Show

    1. \(\vert f \ast g(x)\vert \leq \Norm{f}{p} \Norm{g}{q}\), and
    2. \(f \ast g(x)\) is uniformly continuous on \(\R^d\). For this, first show it for \(f\) and \(g\) continuous with compact support.
  3. Let \(X_n \overset{d}{=} N(0,1/n)\) be a sequence of Gaussian random variables. For any bounded continuous function \(f\), show that

    \(\E[ f(X_n) ] \to f(0)\)

  4. Let \(X\) be, in turn, a \(\operatorname{Gamma}(\alpha)\), an \(\Exponential(\lambda)\), and an \(N(\mu,\sigma^2)\) random variable. Compute in each case

    1. \(\E[X]\)
    2. \(\Var(X)\)
    3. \(\phi_X(t) = \E[ \exp(i X t) ]\)
  5. Let $X_i$, $i=1,\ldots,N$ be random variables in $L^2$. Show that

    \(\Var( \sum_{i=1}^N a_i X_i ) = \sum_{i=1}^N a_i^2 \Var(X_i) + \sum_{i \neq j} a_i a_j \Cov(X_i, X_j)\)

    How does the formula simplify if you assume that the variables are independent? (A quick numerical check of this formula appears at the end of this homework.)

  6. Suppose \(\mu\) is a finite, inner regular measure on \((\R^d,\mathcal{B}(\R^d))\). Show that \(\mu\) is also outer regular.
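
Not part of the assignment: a quick numerical check of the variance formula in problem 5 above (Python with numpy; the coefficients and the mixing matrix are arbitrary illustrative choices). It compares the sample variance of \(\sum_i a_i X_i\) with the right-hand side evaluated on the sample covariance matrix; the two agree up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
N, samples = 4, 1_000_000
a = np.array([1.0, -2.0, 0.5, 3.0])            # arbitrary coefficients
M = rng.normal(size=(N, N))                    # mixing matrix, to make the X_i correlated
X = rng.normal(size=(samples, N)) @ M.T        # each row is one sample of (X_1, ..., X_N)

lhs = np.var(X @ a, ddof=1)                    # sample variance of sum_i a_i X_i
C = np.cov(X.T)                                # sample covariance matrix (same ddof)
rhs = np.sum(a**2 * np.diag(C)) + sum(
    a[i] * a[j] * C[i, j] for i in range(N) for j in range(N) if i != j
)
print("Var(sum a_i X_i), direct :", round(float(lhs), 3))
print("variance + covariance sum:", round(float(rhs), 3))
```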

Homework 4 due Feb 18

  1. Find an example where

    1. \(X_n \to X\) in \(L^p\), \(1 \leq p < \infty\), but \(X_n \not\to X \almostsurely\).
    2. \(X_n \to X \almostsurely\) but \(X_n \not\to X\) in \(L^p\), \(1 \leq p \leq \infty\).
  2. Suppose $X \sim \Binomial(n,p)$. Compute $\E[X]$, $\Var(X)$.

  3. Suppose \(p ∈ [1, ∞)\).

    1. If \((E, A, μ)\) is a measure space, show that the set of all integrable simple functions is dense in \(L^p (E, A, μ)\).
    2. If \((E, d)\) is a metric space, and \(μ\) is an outer regular measure on \((E, B(E))\), show that the set of all bounded Lipschitz functions that belong to \(L^p (E, \mathcal{B}(E), μ)\) is dense in $L^p(E, \mathcal{B}(E), μ)$.
    3. Consider \(L^\infty( \N, 2^\N, \mu)\) where \(\mu\) is counting measure. This is the sequence space \(\ell^\infty\). Show that this space is not separable.
  4. Let \(X \sim \Geometric(p)\) have pmf \(\Prob(X = k) = (1-p)^{k-1} p, \quad k=1,2,\ldots\). Suppose you keep flipping a coin until you flip heads, where the probability of heads is \(p\). \(X\) represents the toss at which the first heads occurs.

    1. Show that \(\Prob(X > k) = (1-p)^k\).
    2. Let \(X_n \sim \Geometric(\lambda/n)\). Show that \(X_n/n \overset{d}{\to} X\), and identify the distribution of \(X\).
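
Not part of the assignment: a simulation sketch for part 2 of problem 4 (Python with numpy; \(\lambda = 2\), \(n = 10^4\), and the sample size are illustrative choices). It compares the empirical tail of \(X_n/n\) with \(e^{-\lambda x}\), the tail that part 1 suggests in the limit.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, samples = 2.0, 10_000, 500_000
Xn = rng.geometric(lam / n, size=samples)     # numpy's geometric is supported on {1, 2, ...}
scaled = Xn / n
for x in [0.25, 0.5, 1.0, 2.0]:
    print(f"x = {x:4.2f}   P(X_n/n > x) ~ {np.mean(scaled > x):.4f}"
          f"   exp(-lambda x) = {np.exp(-lam * x):.4f}")
```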

Homework 3 due Feb 11

  1. Suppose we have \(X_n \colon \W \to \R\), \(n=1,2,\ldots\) measurable random variables defined on the same probability space.

    1. Give an example of \(X_n \to X\) in distribution, but \(X_n \not\to X\) in probability.
    2. Show that if \(X_n \to X \quad a.s.\), then $X_n \to X$ in probability (in measure).
    3. Give an example of \(X_n \to X\) in probability, but \(X_n \not\to X \quad a.s.\).

    Solution to part 1: It’s enough to construct two random variables \(Y\) and \(X\) on the same sample space that have the same distribution but are not equal. Then we can simply let \(X_n = Y\). Let our probability space be the usual interval \([0,1)\) with Lebesgue measure. Let \(Y(x) = 1-x\) and \(X(x) = x\). It is easy to check that

    \(\begin{align*} \mu\left( \{ x \colon X(x) \leq t \} \right) & = t \quad \forall t \in [0,1)\\ \mu\left( \{ x \colon Y(x) \leq t \} \right) & = t \quad \forall t \in [0,1) \end{align*}\)

    This shows that the two cdfs are the same, and hence that \(X_n \to X\) in distribution. On the other hand, \(\vert X_n - X\vert = \vert 1 - 2x\vert\), so \(\mu(\vert X_n - X\vert > 1/2) = 1/2\) for every \(n\), and therefore \(X_n \not\to X\) in probability.

  2. Choose and fix $p_1,\ldots,p_n > 0$ such that $\sum_i p_i = 1$. Prove that if $x_1,\ldots,x_n$ are positive, then

    \(\prod x_i^{p_i} \leq \sum_i p_i x_i\)

  3. (Hausdorff Dimension) If \(A\) is a subset of \(R^d\), we denote the diameter of \(A\) by \(Δ(A) = \sup\{\vert x − y\vert : x, y ∈ A\} ∈ [0, ∞]\). For every \(α > 0\) and every \(ε > 0\), we set, for every subset \(A\) of \(R^d\),

    \(μ_{α,ε} (A) = \inf_{(U_k )_{k∈N} ∈ \mathcal{R}_ε (A)} \sum_{k∈N} Δ(U_k)^α\)

    where \(\mathcal{R}_ε (A)\) is the set of all countable coverings of \(A\) by open sets of diameter smaller than $ε$. Observe that \(μ_{α,ε} (A) ≥ μ_{α,ε'}(A)\) if \(ε < ε'\), and thus one can define a mapping \(A \mapsto μ_α (A)\) by

    \(μ_α (A) := \lim_{ε↓0} μ_{α,ε} (A) ∈ [0, ∞].\)

    1. Prove that \(μ_α\) is an outer measure on \(R^d\).
    2. Verify that, for every subset \(A\) of \(R^d\) , there exists a (unique) real number \(\operatorname{dim}(A) ∈ [0, d]\), called the Hausdorff dimension of \(A\), such that, for every \(α > 0\),

      \(μ_α (A) = \begin{cases} 0 & \text{if } α > dim(A) \\ ∞ & \text{if } α < dim(A). \end{cases}\)

    3. Let \(α > 0\) and let \(A\) be a Borel subset of \(R^d\). Assume that there exists a measure \(μ\) on \((R^d , \mathcal{B}(R^d ))\) such that \(μ(A) > 0\) and \(μ(B) ≤ r^α\) for every open ball \(B\) of radius \(r > 0\) centered at a point of \(A\). Prove that \(dim(A) ≥ α\).


  4. Suppose \(X\) is a random variable such that with probability \(1/2\) it takes the value \(1\), and with probability \(1/2\) it is distributed as \(\Uniform([0,2])\). Find the cdf of \(X\). Does it have a density?

    1. Let \(\nu\) be a finite measure on the finite measure space \((\W,\mathcal{F},\mu)\). Show that there is a decomposition \(\nu = \nu_1 + \nu_2\) where \(\nu_1 \ll \mu\) and \(\nu_2 \perp \mu\), where we take the last \(\perp\) to mean there exist measurable sets \(A\), \(B\) such that \(A \cup B = \W\), \(\nu_2(A) = 0\), and \(\mu(B) = 0\). This is called the Lebesgue decomposition theorem. Hint: apply the Radon-Nikodym theorem to \(\mu\) and \(\lambda = \mu + \nu\), or look at Axler, Rudin or Royden.

    2. Give the Lebesgue decomposition of \(\mu_X\) with respect to Lebesgue measure on \([0,2]\).
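
Not part of the assignment: a simulation sketch of the random variable in problem 4 (Python with numpy; the evaluation points are illustrative choices). It prints the empirical cdf at a few points, so the atom at \(1\) and the absolutely continuous part are both visible.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = 1_000_000
coin = rng.random(samples) < 0.5                              # with prob 1/2 take the value 1 ...
X = np.where(coin, 1.0, rng.uniform(0.0, 2.0, size=samples))  # ... otherwise a Uniform([0,2]) draw
for x in [0.0, 0.5, 0.99, 1.0, 1.5, 2.0]:
    print(f"F({x:4.2f}) ~ {np.mean(X <= x):.4f}")
```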

Homework 2 due Feb 4

Solution for Problem 2, Scheffé’s Lemma

  1. Let \(\varphi \in L^1(\R, \mathcal{B}, \lambda)\), where \(\lambda\) is Lebesgue measure. Show that the Fourier transform \(\hat\varphi \colon \R \to \C\) defined by

    \(\hat\varphi(u) = \int \exp(i u x ) \varphi(x) \lambda (d x)\)

    is a continuous function on \(\R\). More generally, if \(\mu\) is a finite measure on \(\R\), show that

    \(\hat\mu(u) = \int \exp(i u x ) \mu (d x)\)

    is a continuous function on \(\R\).

  2. Let \((E, \mathcal{A}, \mu)\) be a measure space, and let \((f_n)_{n \in \N}\) be a sequence of nonnegative measurable functions on \(E\). Suppose \(f_n(x) \to f(x)\) \(\mu\)-a.e. and \(\int f d \mu < \infty\). Show that

    \(\int f_n d\mu \to \int f d \mu \Leftrightarrow \int \vert f_n - f \vert d\mu \to 0\)

  3. Let \(\phi \colon [0,1] \to [0,1]\) be the Cantor-Lebesgue function (Devil’s staircase). Let \(\psi(x) = \phi(x) + x\). Show that \(\psi\) is a bijective, continuous function from \([0,1]\) to \([0,2]\), and show that the image of the Cantor set \(\mathbf{C}\) under \(\psi\) has Lebesgue measure \(1\). We know that any set of positive measure contains a nonmeasurable set (by the Vitali construction). Thus \(\psi(\mathbf{C})\) must contain a nonmeasurable set \(V\), but \(\psi^{-1}(V)\) is Lebesgue measurable, being a subset of the null set \(\mathbf{C}\). Conclude that

    1. There is a continuous function \(g\) and a (Lebesgue measurable) set \(B\) such that \(g^{-1}(B)\) is nonmeasurable.
    2. There is a measurable set that is not Borel.
  4. A function \(f \colon I \to \R\), where \(I = [a,b]\), is called convex if \(f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1-\lambda) f(y)\) for all \(x,y \in I\) and all \(\lambda \in [0,1]\).

    1. Prove that if \(f'' \geq 0\) on \(I\) then \(f\) is convex on \(I\).
    2. Now suppose \(f\) is convex. Prove that \(f\) has left and right derivatives everywhere; i.e., \(\lim_{\e \downarrow 0} (f(x \pm \e) - f(x))/\e\) exists for all \(x \in I\). These are the left and right derivatives \(\pm \partial_\pm f(x)\) respectively. Prove that they are both nondecreasing, and that \(\partial_- f(x) \leq \partial_+ f(x)\).
    3. \(L(x) = a x + b\) is a supporting line of \(f(x)\) at \(x_0\) if \(f(x) \geq L(x)\) and \(f(x_0) = L(x_0)\). Show that \(f\) has a supporting line at every point \(x_0 \in I\).

Homework 1 due Jan 28

  1. Construct a countable family of \(\sigma\)-algebras \(\{\mathcal{F}_i\}_{i=1}^\infty\) such that \(\mathcal{F}_i \subset \mathcal{F}_{i+1}\) and \(\sigma( \cup_{i=1}^\infty \mathcal{F}_i ) \neq \cup_{i=1}^\infty \mathcal{F}_i\). (3.2 from Khoshnevisan)

  2. Prove that Lebesgue measure on \(\R^d\) is regular. That is, for every \(A\) measurable, we have

    \(\begin{align*} \lambda(A) & = \inf \{ \lambda(U) \colon U \text{ is open }, A \subset U \}\\ & = \sup \{ \lambda(F) \colon F \text{ is compact }, F \subset A \}\\ \end{align*}\)

    In other words, Lebesgue measure is outer and inner regular respectively.

    Extend this to all metric spaces \((E,d)\), where \(\mu\) is a finite measure on \((E,\mathcal{B}(E))\) where \(\mathcal{B}(E)\) is the Borel \(\sigma\)-algebra on \(E\). Then for every \(A \in \mathcal{B}(E)\),

    \(\begin{align*} \mu(A) & = \inf \{ \mu(U) \colon U \text{ is open }, A \subset U \}\\ & = \sup \{ \mu(F) \colon F \text{ is closed }, F \subset A \}\\ \end{align*}\)

    (Proposition 3.1 in Le Gall)

  3. A function \(F \colon \R \to [0,1]\) is a cumulative distribution function on \(\R\) if

    1. It is non-decreasing and right-continuous
    2. \(\lim_{x \to -\infty} F(x) = 0\) and \(\lim_{x \to \infty} F(x) = 1\).

    Prove that if \(\mu\) is a probability measure on \((\R,\mathcal{B}(\R))\); i.e., \(\mu(\R) = 1\), then \(F(x) = \mu((-\infty,x])\) defines a distribution function on \(\R\). Conversely, if \(F(x)\) is a distribution function on \(\R\), then there is a unique probability distribution \(\mu\) on \((\R,\mathcal{B}(\R))\) such that \(F(x) = \mu((-\infty,x])\).

    (3.9 from Khoshnevisan)