A random/stochastic process is a sequence of random variables (Xn)n≥0.
A random process (Xn)n∈N is stationary if for all k,n>0 and all events A1,⋯,An,
Pr{X1∈A1,⋯,Xn∈An}=Pr{Xk+1∈A1,⋯,Xk+n∈An}
(Xn)n≥0 is a Markov Chain if each random variable Xi takes values in a discrete set S (the state space), and,
∀n≥0, i,j∈S, Pr{Xn+1=j∣Xn=i,⋯,X0=x0}=Pr{Xn+1=j∣Xn=i}
A temporally homogenous Markov Chain is one where the transition probabilities Pr{Xn+1=j∣Xn=i}=pij for all i,j∈S and n≥0.
Temporally Homogenous Markov Chains don’t change their transition probabilities over time. Since the pij are conditional probabilities, they must satisfy
∀i,j∈S, pij≥0
∀i∈S, ∑j∈Spij=1
The transition matrix of a Markov Chain is a matrix P where the ijth entry Pij=pij for all i,j∈S.
The n-step transition probabilities (i.e. starting in i and ending in j n steps later) of the Markov Chain are given by pij(n)=(P^n)ij, the ijth entry of the nth matrix power of P.
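As a sketch (assuming numpy is available; the two-state chain below is a hypothetical example), the n-step probabilities are just entries of a matrix power:

```python
import numpy as np

# Hypothetical two-state chain: p01 = 0.3, p10 = 0.6.
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# n-step transition probabilities are the entries of the matrix power P^n.
n = 5
Pn = np.linalg.matrix_power(P, n)
print(Pn[0, 1])  # p_01(5): probability of being in state 1 after 5 steps from state 0
```

Note that P^n is again a stochastic matrix: its rows still sum to 1.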
For A⊂S, the hitting time of A is given by
TA=min{n≥0:Xn∈A}
For i∉A, E[TA∣X0=i]=1+∑jpijE[TA∣X0=j]
For i∈A, E[TA∣X0=i]=0
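These two first-step-analysis equations form a linear system that can be solved directly. A minimal sketch, assuming numpy and a hypothetical 3-state chain with A={2}:

```python
import numpy as np

# First-step analysis for expected hitting times of A = {2}:
#   E[T_A | X0 = i] = 1 + sum_j p_ij E[T_A | X0 = j]   for i not in A,
#   E[T_A | X0 = i] = 0                                 for i in A.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.0, 1.0]])
A = {2}
states = [0, 1, 2]
outside = [i for i in states if i not in A]   # states where the unknowns live

# Rearranged: h_i - sum_{j not in A} p_ij h_j = 1, a linear system over `outside`.
Psub = P[np.ix_(outside, outside)]
h_outside = np.linalg.solve(np.eye(len(outside)) - Psub, np.ones(len(outside)))

h = np.zeros(len(states))
h[outside] = h_outside
print(h)  # expected number of steps to reach state 2 from each state
```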
If ∃n≥1 such that pij(n)>0, then j is accessible from i, and we write i→j.
States i and j communicate with each other when i→j and j→i. We write this as i↔j.
By convention, we say that i↔i. It turns out that ↔ is an equivalence relation on the state space S. An equivalence relation means that
∀i∈S, i↔i
∀i,j∈S, i↔j⇔j↔i
∀i,j,k∈S, i↔k, k↔j ⇒ i↔j
This means that ↔ partitions the state-space S into equivalence classes (i.e classes of communicating states).
A Markov Chain is irreducible if S is the only class.
An irreducible Markov Chain is reversible if and only if there exists a probability vector π that satisfies the Detailed Balance Equations
∀i,j∈S, πipij=πjpji
Markov Chains which satisfy the detailed balance equations are called reversible because if X0∼π, then the random vectors (X0,X1,⋯,Xn) and (Xn,Xn−1,⋯,X0) are equal in distribution.
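As a concrete check, birth-death chains always satisfy the detailed balance equations. A sketch, assuming numpy and a hypothetical 3-state birth-death chain, that solves the equations along the chain and verifies reversibility:

```python
import numpy as np

# Hypothetical birth-death chain on {0,1,2}: only neighboring states communicate
# directly, so solving detailed balance along the chain always succeeds.
P = np.array([[0.5, 0.5, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.6, 0.4]])

# Solve pi_i p_ij = pi_j p_ji along the chain:
# pi_1 = pi_0 * p01 / p10, pi_2 = pi_1 * p12 / p21, then normalize.
pi = np.array([1.0, 0.5 / 0.3, (0.5 / 0.3) * (0.3 / 0.6)])
pi /= pi.sum()

# Check detailed balance for every pair of states.
for i in range(3):
    for j in range(3):
        assert abs(pi[i] * P[i, j] - pi[j] * P[j, i]) < 1e-12

# pi is then automatically stationary: pi = pi P.
print(np.allclose(pi @ P, pi))  # True
```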
A state i∈S is recurrent if, given that X0=i, the process revisits state i with probability 1.
A state i∈S is transient if it is not recurrent.
We can further break recurrence down if we modify the definition of hitting time to be Ti=min{n≥1:Xn=i} (the first time the chain enters state i at or after time 1).
State i is positive recurrent if it is recurrent and E[Ti∣X0=i] is finite.
State i is null recurrent if it is recurrent and E[Ti∣X0=i] is infinite.
For a state i∈S, we define the period of the state to be
period(i)=GCD{n≥1:pii(n)>0}.
If we start in state i, then revisits to i only occur at integer multiples of the period.
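A quick sketch of computing a period (assuming numpy; the deterministic two-state flip chain is a hypothetical example, and the set {n≥1 : pii(n)>0} is truncated to n≤10 for illustration):

```python
import numpy as np
from math import gcd
from functools import reduce

# Hypothetical two-state flip chain: it alternates deterministically,
# so returns to state 0 occur exactly at times 2, 4, 6, ...
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

Pn = np.eye(2)
return_times = []
for n in range(1, 11):          # examine p_00(n) for n = 1..10
    Pn = Pn @ P
    if Pn[0, 0] > 0:
        return_times.append(n)

period = reduce(gcd, return_times)   # GCD of the possible return times
print(return_times, period)  # [2, 4, 6, 8, 10] 2
```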
Since the pij completely characterize the Markov Chain, we can also describe what happens to the chain in the limit.
A probability distribution π over the states is a stationary distribution if π=πP
It is called a stationary distribution because the distribution over states is invariant in time: the chain is at stationarity if and only if it was started from a stationary distribution. Expanding the jth element of π=πP shows that any stationary distribution must satisfy the Global Balance Equations:
πj=∑ipijπi.
Note that if a distribution π satisfies the detailed balance equations from Definition 56, then π also satisfies Definition 63.
Both the global balance equations and the detailed balance equations can be conceptualized as statements of flow. If πj indicates how much mass is placed on state j, then the global balance equations say that the mass entering state j (each state i sends mass πipij) must equal the mass πj sitting at j, which is also the mass leaving j (since ∑kpjk=1). Rather than looking at the flow of the whole chain, the detailed balance equations look at the flow between two states: the mass i gives to j equals the mass j gives to i.
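The global balance equations, together with the normalization ∑iπi=1, form a linear system that can be solved numerically. A minimal numpy sketch on a hypothetical 3-state chain:

```python
import numpy as np

# Solve pi = pi P together with sum(pi) = 1 for a hypothetical 3-state chain.
P = np.array([[0.2, 0.8, 0.0],
              [0.1, 0.0, 0.9],
              [0.5, 0.5, 0.0]])

# (I - P^T) pi = 0, with one redundant equation replaced by normalization.
A = np.eye(3) - P.T
A[-1, :] = 1.0                  # replace last equation with sum_i pi_i = 1
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)

print(pi)
print(np.allclose(pi @ P, pi))  # True: pi satisfies the global balance equations
```

Replacing one balance equation is safe because the rows of I−Pᵀ sum to the zero vector, so the dropped equation is implied by the others.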
Let (Xn)n≥0 be an irreducible Markov Chain. Then one of the following is true.
Either all states are transient or all states are null recurrent; in this case no stationary distribution exists, and limn→∞pij(n)=0.
Or all states are positive recurrent; in this case a unique stationary distribution π exists, and
πj = limn→∞ (1/n)∑k=1..n pij(k) = 1/E[Tj∣X0=j].
If the Markov Chain is aperiodic, then limn→∞pij(n)=πj
One consequence of Theorem 31 is that the stationary distribution π of a reversible Markov Chain is unique. This makes solving the detailed balance equations a good technique for finding the stationary distribution. If a stationary distribution exists, then we can also say when the chain will converge to it.
If a chain is irreducible, positive recurrent, and aperiodic with stationary distribution π, then the distribution at time n, πn, converges to π as n→∞.
A process (Xt)t≥0 taking values in a countable state space S is a temporally homogenous continuous time Markov Chain (CTMC) if it satisfies the Markov Property
Pr{Xt+τ=j∣Xt=i,Xs=is,0≤s<t}=Pr{Xt+τ=j∣Xt=i}=Pr{Xτ=j∣X0=i}
qi is the transition rate of state i
pij is the transition probability between states i and j
Every time a CTMC enters a state i, it will hold in that state for Exp(qi) time before transitioning to the next state j with probability pij.
Note that the jump chain cannot have self-loops (pii=0) because otherwise the amount of time spent in state i would not be exponentially distributed. An alternative interpretation of a CTMC is
Define jump rates qij=qipij
On entering state i, jump to j⋆=argminj Tj, where Tj∼Exp(qij) for all j≠i are independent of each other.
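The competing-clocks interpretation translates directly into simulation code. A sketch, assuming numpy and hypothetical jump rates qij for a 3-state CTMC:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical jump rates q_ij for a 3-state CTMC (no self-loop entries).
Qrates = {(0, 1): 2.0, (0, 2): 1.0,
          (1, 0): 3.0, (1, 2): 1.0,
          (2, 0): 0.5, (2, 1): 0.5}

def step(i):
    """From state i, race independent Exp(q_ij) clocks; the earliest wins."""
    # numpy's exponential takes a scale parameter, i.e. 1/rate.
    clocks = {j: rng.exponential(1.0 / q) for (a, j), q in Qrates.items() if a == i}
    j_star = min(clocks, key=clocks.get)   # argmin_j T_j
    return j_star, clocks[j_star]          # next state and the holding time

state, t = 0, 0.0
for _ in range(5):
    state, hold = step(state)
    t += hold
print(state, t)
```

The minimum of the clocks is Exp(qi)-distributed with qi=∑j qij, and the winner is j with probability qij/qi, which recovers the first interpretation.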
Qij = { qij if i≠j,  −qi if i=j }
Following from the first interpretation, all off-diagonal entries of Q are non-negative, the diagonal entries are non-positive, and each row must sum to 0. One useful quantity which we can define is how long it takes to come back to a particular state.
The time to first re-entry of state j is
Tj=min{t≥0:Xt=j and Xs≠j for some s<t}
Since a CTMC is essentially a DTMC where we hold in each state for an exponential amount of time, we can apply First Step Analysis in essentially the same way that we do for DTMCs. In fact, hitting probabilities will look exactly the same, since we can just use the jump chain to compute the transition probabilities. The only differences arise when we consider time dependent quantities. For hitting times (how long it takes to enter a state in A⊆S),
If i∈A,E[TA∣X0=i]=0
If i∉A, E[TA∣X0=i]=1/qi+∑j∈SpijE[TA∣X0=j]
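As with DTMCs, these equations form a linear system. A sketch with numpy, using a hypothetical 3-state CTMC (holding rates qi and jump-chain probabilities pij) and A={2}:

```python
import numpy as np

# Hypothetical 3-state CTMC: hitting time of A = {2} via first-step analysis,
#   E[T_A | X0 = i] = 1/q_i + sum_j p_ij E[T_A | X0 = j]   for i not in A.
q = np.array([2.0, 4.0, 1.0])           # holding rates q_i
P = np.array([[0.0, 1.0, 0.0],          # jump-chain probabilities (no self-loops)
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

outside = [0, 1]                         # states not in A
Psub = P[np.ix_(outside, outside)]
rhs = 1.0 / q[outside]                   # the 1/q_i terms replace the DTMC's "1"
h = np.linalg.solve(np.eye(2) - Psub, rhs)
print(h)  # [E[T_A | X0=0], E[T_A | X0=1]]
```

The only change from the DTMC system is the right-hand side: each step costs an expected 1/qi of time instead of exactly 1.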
States i and j communicate with each other if i and j communicate in the jump chain.
State j is transient if, given X0=j, the process enters j finitely many times with probability 1. Otherwise, it is recurrent.
A state j is positive recurrent if its expected time to first re-entry E[Tj∣X0=j] is finite, and null recurrent otherwise.
A probability vector π is a stationary distribution for a CTMC with rate matrix Q if
πQ=0 ⇔ ∀j∈S, πjqj=∑i≠jπiqij.
If π is a stationary distribution for a CTMC, then the stationary distribution of the jump chain is given by
π~i = πiqi / (∑jπjqj)
To describe how a CTMC behaves over time, first define pij(t)=Pr{Xt=j∣X0=i} and mj=E[Tj∣X0=j].
Either all states are transient or all states are null recurrent; in this case no stationary distribution exists, and limt→∞pij(t)=0.
Or all states are positive recurrent; in this case a unique stationary distribution exists, and
πj = 1/(mjqj) = limt→∞pij(t).
Let P(t) denote the matrix of transition probabilities at time t>0. By the Markov property, we know that P(s+t)=P(s)P(t). For h≈0, P(h)=I+hQ+o(h). This approximation allows us to compute the derivative of P(t).
∂P(t)/∂t = limh→0 (P(t+h)−P(t))/h = P(t)Q
Theorem 35 tells us that the transition probabilities satisfy P(t)=e^{tQ} for all t≥0. This is why Q is sometimes called the generator matrix: it generates the transition probabilities. However, matrix exponentials are difficult to compute. Instead, we can turn to Uniformization, which allows us to estimate P(t) by simulating it through a DTMC.
Given a CTMC where ∃M such that qi≤M for all i∈S, fix a γ≥M; the uniformized chain is a DTMC with transition probabilities pij=qij/γ for i≠j and pii=1−qi/γ.
Pu = I + (1/γ)Q.
Pu^n = (I + (1/γ)Q)^n ≈ e^{(n/γ)Q}
when 1/γ is small. This means that we can approximate the transition probabilities of the CTMC using the uniformized chain. Observe that uniformization also helps in finding the stationary distribution, since the stationary distribution of the uniformized chain is identical to that of the original chain.
πPu = π + (1/γ)πQ = π ⇔ πQ = 0.
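A numerical sanity check of uniformization (assuming numpy; the 2-state generator below is hypothetical), comparing Pu^n against e^{tQ} computed via a truncated Taylor series:

```python
import numpy as np

# Hypothetical 2-state CTMC generator (rows sum to 0).
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])

t, gamma = 1.0, 50.0                     # gamma >= max_i q_i = 2
Pu = np.eye(2) + Q / gamma               # uniformized DTMC transition matrix
approx = np.linalg.matrix_power(Pu, int(t * gamma))  # Pu^n with n = t*gamma

# Reference: e^{tQ} via a truncated Taylor series (enough terms to converge here).
expQ = np.zeros_like(Q)
term = np.eye(2)
for k in range(1, 30):
    expQ += term
    term = term @ (t * Q) / k

print(np.abs(approx - expQ).max())  # small: the uniformized chain tracks P(t)
```

Larger γ tightens the approximation at the cost of more DTMC steps per unit of time.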
A counting process (Nt)t≥0 is a non-decreasing, continuous time, integer valued random process which has right continuous sample paths.
The ith arrival time Ti is given by
Ti=min{t≥0: Nt≥i}
The ith inter-arrival time Si is given by
Si=Ti−Ti−1,i>0
A rate λ Poisson Process is a counting process with independently and identically distributed inter-arrival times Si∼Exp(λ).
If (Nt)t≥0 is a rate λ Poisson Process, then for each t≥0, Nt∼Poisson(λt)
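This can be checked by simulation: summing Exp(λ) inter-arrival times until time t and counting arrivals should, over many runs, reproduce the Poisson(λt) mean and variance. A sketch assuming numpy, with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a rate-lam Poisson Process by summing Exp(lam) inter-arrival times,
# then count arrivals up to time t; across runs, N_t should be Poisson(lam*t).
lam, t, runs = 2.0, 10.0, 5000
counts = np.empty(runs)
for r in range(runs):
    arrivals, T = 0, 0.0
    while True:
        T += rng.exponential(1.0 / lam)   # S_i ~ Exp(lam); numpy takes scale = 1/rate
        if T > t:
            break
        arrivals += 1
    counts[r] = arrivals

print(counts.mean(), counts.var())  # both approach lam * t = 20
```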
A Poisson Process is a special case of a CTMC where the transition rates qi=λ and the transition probabilities pij are 1 if j=i+1 and 0 otherwise. Since the inter-arrival times are memoryless and i.i.d., Poisson Processes have many useful properties.
If (Nt)t≥0 is a rate λ Poisson Process, then (Nt+s−Ns)t≥0 is also a rate λ Poisson Process for all s≥0 and is independent of the original process.
For t0<t1<…<tk, the increments of a rate λ Poisson Process (Nt1−Nt0),(Nt2−Nt1),…,(Ntk−Ntk−1) are independent, and Nti−Nti−1∼Poisson(λ(ti−ti−1))
Conditioned on Nt=n, the random vector (T1,T2,⋯,Tn) has the same distribution as the order statistics of n i.i.d. random variables Ui∼Uniform(0,t).
What Theorem 39 says is that given n arrivals up to time t occur, the distribution of arrival times is equivalent to taking n i.i.d uniform random variables and sorting them.
If N1,t and N2,t are independent Poisson Processes with rates λ1 and λ2, then N1,t+N2,t is a Poisson Process with rate λ1+λ2.
Let p be a probability distribution over a set of labels, and let Nt be a rate λ Poisson Process. If each arrival is independently marked with label i with probability p(i), then Ni,t, the process counting the number of arrivals labeled i, is a Poisson Process with rate λp(i), and the labeled processes are independent of each other.
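A quick simulation sketch of this splitting property (assuming numpy; two labels with a hypothetical split probability p):

```python
import numpy as np

rng = np.random.default_rng(7)

# Mark each arrival of a rate-lam Poisson Process independently with label 1
# w.p. p (else label 2); each labeled stream has the thinned rate.
lam, p, t = 3.0, 0.4, 1000.0
n_total = rng.poisson(lam * t)              # N_t ~ Poisson(lam * t)
labels = rng.random(n_total) < p            # independent marks per arrival
n1, n2 = labels.sum(), n_total - labels.sum()

print(n1 / t, n2 / t)  # approach lam*p = 1.2 and lam*(1-p) = 1.8
```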