# [Monte Carlo Methods and Applications](../index.html)
### CMU 21-387 | 15-327 | 15-627 | 15-860 FALL 2023
[Home](../index.html)
## Midterm (Fall 2023)
### Instructions
Your take-home midterm is comprised of the five problems below (four written, one coding). Each question is worth the same amount (20%).
The midterm must be handed in no later than **4:00pm EDT on Monday, October 9th.** Please note that _this deadline is different from the usual homework due date!_ You can hand-in your exam solutions the same way you hand in regular assignments (via Gradescope).
Unlike assignments, **the midterm is NOT collaborative**. You must complete these problems completely on your own, without discussing with your classmates.
### Problem 1: Checking for Bias and Consistency
Suppose you're given a black-box function `estimate(N)` which computes an estimate of the integral _I_ of a function _f_ with finite variance, using `N` samples. However, you do not know ahead of time whether this black box estimator is unbiased or even consistent.
Sketch a strategy for checking, empirically, whether the `estimate` function satisfies these two properties. In other words, for each property describe an algorithm to check whether it holds to reasonable degree of certainty (so, two algorithms total). In particular, you should be able to get more and more confidence about the answers by spending more and more compute time. **Important:** _you may assume that you know the true value of I_. You can also assume that, if the black-box function uses random sampling, it uses a different random seed upon each invocation.
**Note:** your algorithm can be described in words and/or in pseudocode, but it _must be detailed enough that someone could implement it_ without knowing anything about Monte Carlo, probability, etc. In other words, the algorithm you describe should consist of only basic low-level programming constructs such as loops, arithmetic, etc. Answers that give high-level descriptions like _“…then compute the expected value…”_ will receive no points.
### Problem 2: Constantly Thinking About Markov Processes
Let \(X_0, X_1, \dots\) be an irreducible, time homogeneous, Markov chain on a finite state space \(\mathcal X\).
Suppose \(f \colon \mathcal X \to \mathbb{R}\) is a function such that for all \(x \in \mathcal X\) we have
\[ f(x) = \mathbb{E} ( f(X_1) ~|~ X_0 = x )\,. \]
Must \(f\) be a constant function?
If yes, prove it.
If no, find an example of a non-constant function \(f\) and an irreducible Markov chain that satisfies the above property.
**Note:** Here \(\mathbb{E} ( f(X_1) ~|~ X_0 = x ) = \sum_{y \in \mathcal X} P(x, y) f(y)\), where \(P\) is the transition matrix of the Markov chain.
### Problem 3: Sampling in the Dark

Consider the sets described below:
1. The set of all black squares on a \(10 \times 10\) checkerboard, i.e., the union of all squares with index \((i,j) \in \{1, \ldots, 10\}^2\) such that \(i+j \equiv 1\ (\!\!\!\!\mod{} 2)\).
2. The set of configurations in an Ising model on a \(10 \times 10\) grid where exactly 50% of the states are “spin up” and 50% are “spin down.” In other words, the set of functions \(\sigma: \{1, \ldots, 10\}^2 \to \{+1,-1\}\) such that \(\sum_{i=1}^{10} \sum_{j=1}^{10} \sigma(i,j) = 0\).
3. The set of all corners of a \(10 \times 10\) grid, i.e., the set \(\{0, \ldots, 10\}^2 \subset [0,10]^2\).
4. The circle of radius 5, centered on a \(10 \times 10\) grid, i.e., the set \(\{(x,y) \in [0,10]^2 | (x-5)^2 + (y-5)^2 = 5^2\}\).
5. The union of all circular disks of radius 1/2, centered on each cell of a \(10 \times 10\) grid, i.e., the set \(\cup_{i=1}^{10} \cup_{j=1}^{10} D_{ij}\) where \(D_{ij} := \{(x,y) \in [0,10]^2 | (i-\tfrac{1}{2}-x)^2 + (j-\tfrac{1}{2}-y)^2 \leq (1/2)^2\}\).
Suppose your friend is trying to draw uniformly random samples from these sets, but has access to them only via black-box functions `inSetN(p)` (for `N = 1, ..., 5`). Each function takes as input a point `p`, returning `True` if the point is in the set and `False` otherwise. In particular, the function in item (2) takes a point \(p \in \{+1,-1\}^{10 \times 10}\) (i.e., a \(10 \times 10\) array of values \(\pm 1\)), and the functions in items (1), (3), (4,), and (5) take a point \(p \in [0,10]^2\), i.e., a pair of coordinates \((x,y)\) in the \(10 \times 10\) box.
Importantly, your friend _does not have access to the descriptions above_, i.e., they do not know how these functions are defined, nor what the sets look like. All they know is (i) the domain of each function, and (ii) that the function returns true if and only if the given point is inside the region (and false otherwise).
Give a strategy your friend could use to generate samples from these sets, and rank the cost (using this strategy) of sampling the five different sets from “easiest” to “hardest”, possibly with ties. You can ignore the cost of evaluating `inSet` (e.g., assume it's negligible relative to the cost of sample generation), and should also assume that all calculations are performed in fixed-precision floating point.
### Problem 4: Making π with Antithetic Sampling

Consider integrating \(\pi\) using antithetic sampling, i.e., we want to integrate the indicator function \(f(p) := 1 - \text{step}(\|p\| - 1)\) over the square \(p \in [-1,1] \times [-1,1]\), where \(\text{step}(x)\) is the step function equal to zero for \(x \leq 0\) and one for \(x > 0\). Performing a reflection around the origin (\(x \mapsto -x\)) obviously doesn't make any difference, because the integrand \((f(x) + f(-x))/2\) will look just like the original integrand \(f(x)\). Suppose we instead integrate over just the upper-right quadrant \(p \in [0,1] \times [0,1]\), using pairs of samples \(f(p)\), \(f(-(p-c)+c)\), where \(c := (1/2,1/2)\) is the center of this quadrant.
Does this scheme reduce variance relative to taking the same total number of samples with basic Monte Carlo? You must justify your answer in _quantitative_ terms.
### Problem 5: Rejection Sampling from a Multimodal Distribution
Download this Jupyter notebook: [download here](midterm.ipynb). In this coding task, you will implement two strategies, both based on rejection sampling, for sampling uniformly from a multimodal distribution described by a probability density function \(p(x)\). In particular, the two strategies are:
* Rejection sampling, using an upper bound on all of \(p(x)\).
* Rejection sampling, using separate upper bounds on each mode.
For generating an equal number of samples \(N\), your code should be more efficient with the second strategy (meaning it rejects fewer candidates, and thus takes fewer trials to achieve the same number \(N\) of samples.)
Note: You don't have to empirically verify that your second strategy does better, but if you implement some way of checking it, you'll have more confidence that you've done well on the exam! ;-))
All instructions are contained within text cells in the notebook. When you're done implementing all the specified functions in the notebook, submit your completed notebook on Gradescope **with any output left visible in the notebook**.
This page rendered with showdownjs.