Random Processes


Definition 47

A random/stochastic process is a sequence of random variables $(X_n)_{n\geq 0}$.

The random variables in a stochastic process do not have to be independent and identically distributed. In fact, when they are not, we gain additional modeling power.

Definition 48

A random process $(X_n)_{n\in\mathbb{N}}$ is stationary if for all $k, n > 0$ and all events $A_1,\cdots,A_n$,

$$\text{Pr}\left\{X_1\in A_1,\cdots,X_n\in A_n\right\} = \text{Pr}\left\{X_{k+1}\in A_1,\cdots,X_{k+n}\in A_n\right\}$$

Stationarity is often a good assumption that can simplify the analysis of systems which have been running for a long period of time.

Discrete Time Markov Chains

Definition 49

$(X_n)_{n\geq 0}$ is a Markov Chain if each random variable $X_i$ takes values in a discrete set $S$ (the state space), and

$$\forall n \geq 0,\ i,j\in S,\quad \text{Pr}\left\{X_{n+1}=j|X_n=i,\cdots,X_0=x_0\right\} = \text{Pr}\left\{X_{n+1}=j|X_n=i\right\}$$

In words, a Markov Chain is a sequence of random variables satisfying the Markov Property: the probability of being in a state during the next time step depends only on the current state.

Definition 50

A temporally homogeneous Markov Chain is one where the transition probabilities $\text{Pr}\left\{X_{n+1}=j|X_n=i\right\} = p_{ij}$ are the same for all $i,j\in S$ and $n\geq 0$.

Temporally homogeneous Markov Chains don't change their transition probabilities over time. Since the $p_{ij}$ are conditional probabilities, they must satisfy

$$\forall i,j\in S,\ p_{ij} \geq 0, \qquad \forall i\in S,\ \sum_{j\in S}p_{ij} = 1.$$

Definition 51

The transition matrix of a Markov Chain is the matrix $P$ whose $ij$th entry is $P_{ij} = p_{ij}$ for all $i,j\in S$.

The transition matrix encodes the one-step transition probabilities of the Markov Chain.

Theorem 27 (Chapman-Kolmogorov Equation)

The n-step transition probabilities (i.e., starting in $i$ and ending in $j$ $n$ steps later) of the Markov Chain are given by $p_{ij}^{(n)} = P^n_{ij}$.

One useful thing we can compute with a Markov Chain is when the chain first enters a particular state.
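To make this concrete, here is a minimal NumPy sketch (the 3-state transition matrix is made up purely for illustration) that computes n-step transition probabilities as entries of the matrix power $P^n$:

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows sum to 1); illustrative only.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
])

# Chapman-Kolmogorov: the n-step transition probabilities are the entries of P^n.
n = 10
Pn = np.linalg.matrix_power(P, n)
print(Pn[0, 2])  # Pr{X_n = 2 | X_0 = 0}
```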

Definition 52

For $A \subset S$, the hitting time of $A$ is given by

$$T_A = \min \{ n\geq 0: X_n\in A\}$$

Computing the expected hitting time is an example of a broader type of Markov Chain analysis called First Step Analysis. In First Step Analysis, we rely on the Markov property to set up a system of equations that only looks at the first transition in the chain. For the expected hitting time, these equations are

For $i\not\in A$, $\mathbb{E}\left[T_A|X_0 = i\right] = 1 + \sum_j p_{ij} \mathbb{E}\left[T_A|X_0 = j\right]$

For $i\in A$, $\mathbb{E}\left[T_A|X_0 = i\right] = 0$
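Since the unknowns $\mathbb{E}\left[T_A|X_0=i\right]$ for $i\notin A$ appear linearly, First Step Analysis reduces to a linear solve. A sketch, reusing the illustrative matrix from above with $A = \{2\}$:

```python
import numpy as np

P = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
])

A = {2}                                        # target set (illustrative)
B = [i for i in range(len(P)) if i not in A]   # states outside A

# For i not in A: E[T_A|X_0=i] = 1 + sum_j p_ij E[T_A|X_0=j], with the
# terms for j in A vanishing. Restricted to B this is (I - P_BB) h = 1.
P_BB = P[np.ix_(B, B)]
h = np.linalg.solve(np.eye(len(B)) - P_BB, np.ones(len(B)))
print(dict(zip(B, h)))  # expected hitting times of A from each state in B
```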

Properties of Markov Chains

Definition 53

If $\exists n \geq 1$ such that $p_{ij}^{(n)} \ne 0$, then $j$ is accessible from $i$, and we write $i\rightarrow j$.

Definition 54

States $i$ and $j$ communicate with each other when $i\rightarrow j$ and $j\rightarrow i$. We write this as $i\leftrightarrow j$.

By convention, we say that $i\leftrightarrow i$. It turns out that $\leftrightarrow$ is an equivalence relation on the state space $S$. An equivalence relation means that

$$\forall i\in S,\ i \leftrightarrow i$$

$$\forall i,j\in S,\ i\leftrightarrow j \Leftrightarrow j \leftrightarrow i$$

$$\forall i,j,k \in S,\ i\leftrightarrow k,\ k\leftrightarrow j \Rightarrow i \leftrightarrow j$$

This means that $\leftrightarrow$ partitions the state space $S$ into equivalence classes (i.e., classes of communicating states).

Definition 55

A Markov Chain is irreducible if $S$ is the only class.

Definition 56

An irreducible Markov Chain is reversible if and only if there exists a probability vector $\pi$ that satisfies the Detailed Balance Equations

$$\forall i,j \in S,\ \pi_j p_{ij} = \pi_i p_{ji}$$

Markov Chains which satisfy the detailed balance equations are called reversible because if $X_0\sim \pi$, then the random vectors $(X_0, X_1, \cdots, X_n)$ and $(X_n, X_{n-1}, \cdots, X_0)$ are equal in distribution.

Theorem 28

If the graph of a Markov Chain (obtained from the state transition diagram by making edges undirected, removing self-loops, and removing multiple edges) is a tree, then the Markov Chain is reversible.

Class Properties

A class property is a property where if one element of a class has the property, all elements of the class have the property. Markov Chains have several of these properties which allow us to classify states.

Definition 57

A state $i\in S$ is recurrent if, given that $X_0=i$, the process revisits state $i$ with probability 1.

Definition 58

A state $i\in S$ is transient if it is not recurrent.

Recurrence means that we will visit a state infinitely often in the future if we start in that state, while transience means we will only visit the state finitely many times. Recurrence and transience can be easily identified from the transition diagram using the following rules (see the sketch after this list for a programmatic version).

  1. Any finite communicating class which has no edges leaving the class is recurrent

  2. If a state has an edge leading outside its communicating class, then it is transient

  3. If a state is recurrent, then any state it can reach is recurrent
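These rules can be checked mechanically: communicating classes are the strongly connected components of the directed graph with an edge $i\to j$ whenever $p_{ij} > 0$. A sketch using SciPy (the 4-state matrix is made up for illustration):

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

# Hypothetical chain: states {0,1} form a closed class, states {2,3} leak into it.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.3, 0.0, 0.2, 0.5],
    [0.0, 0.0, 0.6, 0.4],
])

# Communicating classes = strongly connected components of the transition graph.
n_classes, labels = connected_components(P > 0, directed=True, connection='strong')

for c in range(n_classes):
    members = np.where(labels == c)[0]
    # Rules 1 and 2: a finite class is transient iff some edge leaves it.
    leaves = any(P[i, j] > 0 for i in members for j in range(len(P)) if labels[j] != c)
    print(members, 'transient' if leaves else 'recurrent')
```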

We can further break recurrence down if we modify the definition of the hitting time to $T_i = \min \{ n \geq 1 : X_n=i \}$ (the first time the chain enters state $i$ after time 0).

Definition 59

State $i$ is positive recurrent if it is recurrent and $\mathbb{E}\left[T_i|X_0=i\right]$ is finite.

Definition 60

State $i$ is null recurrent if it is recurrent and $\mathbb{E}\left[T_i|X_0=i\right]$ is infinite.

Positive recurrence means we visit a recurrent state so frequently that we spend a positive fraction of time in that state. Null recurrence means we visit a recurrent state so infrequently (but still infinitely many times) that we spend virtually no time in that state.

Theorem 29

Every irreducible finite state Markov Chain is positive recurrent.

Definition 61

For a state $i\in S$, we define the period of the state to be

$$\text{period}(i) = \text{GCD}\{n\geq 1 : p_{ii}^{(n)} > 0 \}.$$

If we start in state $i$, then revisits to $i$ only occur at integer multiples of the period.

Definition 62

An irreducible Markov Chain is aperiodic if any state has period 1.

All of the above properties are class properties.

Long-Term Behavior of Markov Chains

Since the $p_{ij}$ completely characterize the Markov Chain, we can also describe what happens to the chain in the limit.

Definition 63

A probability distribution $\pi$ over the states is a stationary distribution if $\pi = \pi P$.

It is called a stationary distribution because the distribution over states is invariant with time. A Markov Chain is at stationarity if and only if it has been started from the stationary distribution. The relationship $\pi = \pi P$ can be expanded for the $j$th element to show that any stationary distribution must satisfy the Global Balance Equations

$$\pi_j = \sum_i p_{ij}\pi_i.$$

Note that if a distribution $\pi$ satisfies the detailed balance equations from Definition 56, then $\pi$ also satisfies Definition 63.

Both the global balance equations and the detailed balance equations can be conceptualized as statements of flow. If each $\pi_j$ indicates how much mass is placed on state $j$, then the global balance equations tell us the mass leaving node $j$ (going to each neighbor in proportion to the transition probabilities) is equal to the mass entering it (which must sum to $\pi_j$ since $\pi$ is a stationary distribution). Rather than looking at the flow of the whole chain, the detailed balance equations look at the flow between two states: the mass $i$ gives to $j$ is equal to the mass $j$ gives to $i$.
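In the finite-state case, the global balance equations plus normalization form a linear system we can solve directly. A sketch on the illustrative chain from earlier (one redundant balance equation is swapped out for the normalization constraint):

```python
import numpy as np

P = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
])
n = len(P)

# pi = pi P  <=>  (P^T - I) pi^T = 0. Drop one redundant row and replace
# it with the normalization constraint sum_i pi_i = 1.
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi, pi @ P)  # pi and pi P should agree
```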

Theorem 30

If an irreducible Markov Chain is at stationarity, then the flow-in equals flow-out relationship holds for any cut of the Markov Chain, where a cut is a partition of the chain into two disjoint subsets.

Theorem 30 is a useful result that can help solve for stationary distributions.

Theorem 31 (Big Theorem for Markov Chains)

Let $(X_n)_{n\geq 0}$ be an irreducible Markov Chain. Then exactly one of the following is true.

  1. Either all states are transient or all states are null recurrent; no stationary distribution exists, and $\lim_{n\to\infty}p_{ij}^{(n)} = 0$.

  2. All states are positive recurrent and the stationary distribution exists, is unique, and satisfies

$$\pi_j = \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n}p_{ij}^{(k)} = \frac{1}{\mathbb{E}\left[T_j|X_0=j\right]}.$$

If the Markov Chain is aperiodic, then $\lim_{n\to\infty}p_{ij}^{(n)} = \pi_j$.

One consequence of Theorem 31 is that the stationary distribution $\pi$ of a reversible Markov Chain is unique. This makes solving the detailed balance equations a good technique for finding the stationary distribution. If a stationary distribution exists, then we can also say when the chain will converge to it.

Theorem 32 (Convergence Theorem)

If a chain is irreducible, positive recurrent, and aperiodic with stationary distribution $\pi$, then the distribution at time $n$ satisfies $\pi_n \to \pi$.
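A quick way to see the Convergence Theorem in action is power iteration: repeatedly applying $P$ to any initial distribution. A sketch on the same illustrative chain:

```python
import numpy as np

P = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
])

pi_n = np.array([1.0, 0.0, 0.0])  # start deterministically in state 0
for _ in range(100):
    pi_n = pi_n @ P               # pi_{n+1} = pi_n P
print(pi_n)                       # approximately the stationary distribution
```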

Continuous Time Markov Chains

Definition 64

A process $(X_t)_{t\geq 0}$ taking values in a countable state space $S$ is a temporally homogeneous continuous time Markov Chain (CTMC) if it satisfies the Markov Property

$$\text{Pr}\left\{X_{t+\tau}=j|X_t=i,X_s=i_s, 0 \leq s \leq t\right\} = \text{Pr}\left\{X_{t+\tau}=j|X_t=i\right\} = \text{Pr}\left\{X_\tau = j | X_0 = i\right\}$$

To characterize how a CTMC functions, we need to define some additional quantities.

Definition 65

Every time a CTMC enters a state $i$, it will hold in that state for an $\text{Exp}(q_i)$ amount of time before transitioning to the next state $j$ with probability $p_{ij}$.

$q_i$ is the transition rate of state $i$.

$p_{ij}$ is the transition probability between states $i$ and $j$.

The jump chain is a DTMC which describes the transition probabilities between states in the CTMC.

Note that the jump chain cannot have self-loops ($p_{ii}=0$) because otherwise the amount of time spent in state $i$ would not be exponentially distributed. An alternative interpretation of a CTMC is

  1. Define jump rates $q_{ij} = q_i p_{ij}$.

  2. On entering state $i$, jump to $j^\star = \text{argmin}_j T_j$, where $T_j \sim \text{Exp}(q_{ij})$ for all $j\neq i$ are independent of each other.

Essentially, every time we enter a state, we set an alarm clock for all other states, and then jump to the state whose alarm clock goes off first. This equivalent interpretation allows us to summarize a CTMC using the rate matrix.

Definition 66

The rate matrix of a CTMC is the matrix $Q$ with entries

$$Q_{ij} = \begin{cases} -q_i & \text{ if } i=j\\ q_{ij} & \text{ if } i \neq j \end{cases}$$

Following from the first interpretation, all off-diagonal entries of $Q$ are non-negative, and the rows must sum to 0. One useful quantity which we can define is how long it takes to come back to a particular state.

The time to first re-entry of state $j$ is

$$T_j = \min \{t \geq 0: X_t=j \text{ and } X_s \neq j \text{ for some } s < t\}$$

Since a CTMC is essentially a DTMC where we hold in each state for an exponential amount of time, we can apply First Step Analysis in essentially the same way as for DTMCs. In fact, hitting probabilities look exactly the same since we can use the jump chain to compute the transition probabilities. The only differences arise when we consider time-dependent quantities. For hitting times (how long it takes to enter a set $A\subseteq S$),

If $i\in A$, $\mathbb{E}\left[T_A|X_0=i\right] = 0$

If $i \not\in A$, $\mathbb{E}\left[T_A|X_0=i\right] = \frac{1}{q_i} + \sum_{j\in S} p_{ij}\mathbb{E}\left[T_A|X_0=j\right]$
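The hold-time/jump-chain description translates directly into a simulator: hold for an $\text{Exp}(q_i)$ time, then move according to the jump chain. A sketch with a made-up rate matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state rate matrix (rows sum to 0); illustrative only.
Q = np.array([
    [-2.0,  1.5,  0.5],
    [ 1.0, -3.0,  2.0],
    [ 0.5,  0.5, -1.0],
])
q = -np.diag(Q)              # transition rates q_i
jump = Q / q[:, None]        # off-diagonal entries become p_ij = q_ij / q_i
np.fill_diagonal(jump, 0.0)  # the jump chain has no self-loops

t, state, T_end = 0.0, 0, 10.0
while True:
    hold = rng.exponential(1.0 / q[state])     # hold for Exp(q_i) time
    if t + hold > T_end:
        break
    t += hold
    state = rng.choice(len(Q), p=jump[state])  # jump to j with probability p_ij
print('state at time', T_end, 'is', state)
```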

Class Properties

Just like in DTMCs, we can classify states in the CTMC.

Definition 67

States $i$ and $j$ communicate with each other if $i$ and $j$ communicate in the jump chain.

Definition 68

State $j$ is transient if, given $X_0=j$, the process enters $j$ finitely many times with probability 1. Otherwise, it is recurrent.

Definition 69

A state $j$ is positive recurrent if its expected time to first re-entry is finite, and null recurrent otherwise.

Long Term Behavior of CTMCs

CTMCs also have stationary distributions.

Definition 70

A probability vector $\pi$ is a stationary distribution for a CTMC with rate matrix $Q$ if

$$\pi Q = 0 \Leftrightarrow \pi_j q_j = \sum_{i\neq j}\pi_i q_{ij}.$$

The stationary distribution of the CTMC is also related to the jump chain, but we need to normalize for the hold times.

Theorem 33

If $\pi$ is a stationary distribution for a CTMC, then the stationary distribution of the jump chain is given by

$$\tilde{\pi}_i = \frac{\pi_i q_i}{\sum_j \pi_j q_j}$$

To describe how a CTMC behaves over time, first define $p_{ij}^{(t)} = \text{Pr}\left\{X_t=j|X_0=i\right\}$ and $m_j = \mathbb{E}\left[T_j|X_0=j\right]$.

Theorem 34 (Big Theorem for CTMCs)

For an irreducible CTMC, exactly one of the following is true.

  1. All states are transient or null recurrent; no stationary distribution exists, and $\lim_{t\to\infty}p_{ij}^{(t)} = 0$.

  2. All states are positive recurrent, a unique stationary distribution exists, and the stationary distribution satisfies

$$\pi_j = \frac{1}{m_j q_j} = \lim_{t\to\infty}p_{ij}^{(t)}$$

Uniformization

Let $P^{(t)}$ denote the matrix of transition probabilities at time $t>0$. By the Markov property, we know that $P^{(s+t)} = P^{(s)}P^{(t)}$. For $h \approx 0$, $P^{(h)} = I + hQ + o(h)$. This approximation allows us to compute the derivative of $P^{(t)}$.

Theorem 35 (Forward Kolmogorov Equation)

$$\frac{\partial}{\partial t}P^{(t)} = \lim_{h\to 0}\frac{P^{(t+h)} - P^{(t)}}{h} = P^{(t)}Q$$

Theorem 35 tells us that the transition probabilities satisfy $P^{(t)} = e^{tQ}$ for all $t \geq 0$. This is why $Q$ is sometimes called the generator matrix: it generates the transition probabilities. However, matrix exponentials are difficult to compute. Instead, we can turn to Uniformization, which allows us to approximate $P^{(t)}$ by simulating it through a DTMC.

Definition 71

Given a CTMC where $\exists M$ such that $q_i \leq M$ for all $i\in S$, fix a $\gamma \geq M$; the uniformized chain is a DTMC with transition probabilities $p_{ij} = \frac{q_{ij}}{\gamma}$ and $p_{ii} = 1 - \frac{q_i}{\gamma}$, i.e.

$$P_u = I + \frac{1}{\gamma}Q.$$

It turns out that

$$P_u^n = \left( I + \frac{1}{\gamma}Q \right)^n \approx e^{\frac{n}{\gamma}Q}$$

when $\frac{1}{\gamma}$ is small. This means that we can approximate the transition probabilities of the CTMC using the uniformized chain. Observe that uniformization also helps in finding the stationary distribution since the stationary distribution of the uniformized chain is identical to that of the original chain:

$$\pi P_u = \pi + \frac{1}{\gamma}\pi Q = \pi \Leftrightarrow \pi Q = 0.$$
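A sketch comparing the uniformized chain's $n$-step matrix against the exact matrix exponential (the rate matrix is made up; any $\gamma \geq \max_i q_i$ works, and the gap shrinks as $\gamma$ grows):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([
    [-2.0,  1.5,  0.5],
    [ 1.0, -3.0,  2.0],
    [ 0.5,  0.5, -1.0],
])

gamma = 50.0                        # any gamma >= max_i q_i (here 3.0)
Pu = np.eye(len(Q)) + Q / gamma     # uniformized DTMC: P_u = I + Q/gamma

t = 2.0
n = int(t * gamma)                  # n steps of P_u approximate time t = n/gamma
approx = np.linalg.matrix_power(Pu, n)
exact = expm(t * Q)
print(np.max(np.abs(approx - exact)))  # small, and smaller for larger gamma
```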

Poisson Processes

Definition 72

A counting process $(N_t)_{t\geq 0}$ is a non-decreasing, continuous time, integer valued random process which has right continuous sample paths.

There are two important metrics which describe counting processes.

Definition 73

The $i$th arrival time $T_i$ is given by

$$T_i = \min \{ t \geq 0:\ N_t \geq i \}$$

Definition 74

The $i$th inter-arrival time $S_i$ is given by

$$S_i = T_i - T_{i-1},\quad i > 0$$

Definition 75

A rate $\lambda$ Poisson Process is a counting process with independent and identically distributed inter-arrival times $S_i \sim \text{Exp}(\lambda)$.

The name Poisson comes from the distribution of each variable in the process.

Theorem 36

If $(N_t)_{t\geq 0}$ is a rate $\lambda$ Poisson Process, then for each $t\geq 0$, $N_t\sim \text{Poisson}(\lambda t)$.

A Poisson Process is a special case of a CTMC where the transition rates are $q_i = \lambda$ and the transition probabilities $p_{ij}$ are 1 if $j=i+1$ and 0 otherwise. Since the inter-arrival times are memoryless and i.i.d., Poisson Processes have many useful properties.
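A simulation sketch checking Theorem 36: build paths from i.i.d. $\text{Exp}(\lambda)$ inter-arrival times and compare the empirical mean of $N_t$ with $\lambda t$ (the rate, horizon, and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, trials = 2.0, 5.0, 10_000

counts = np.empty(trials, dtype=int)
for k in range(trials):
    # Accumulate Exp(lambda) inter-arrival times until we pass time t.
    total, n = rng.exponential(1.0 / lam), 0
    while total <= t:
        n += 1
        total += rng.exponential(1.0 / lam)
    counts[k] = n

print(counts.mean(), lam * t)  # empirical mean vs. Poisson(lambda t) mean
```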

Theorem 37

If $(N_t)_{t\geq 0}$ is a rate $\lambda$ Poisson Process, then $(N_{t+s} - N_s)_{t\geq0}$ is also a rate $\lambda$ Poisson Process for all $s \geq 0$ and is independent of the original process.

Theorem 38

For $t_0 < t_1 <\ldots< t_k$, the increments $(N_{t_1} - N_{t_0}), (N_{t_2} - N_{t_1}),\ldots,(N_{t_k} - N_{t_{k-1}})$ of a rate $\lambda$ Poisson Process are independent, and $N_{t_i} - N_{t_{i-1}} \sim \text{Poisson}(\lambda(t_i - t_{i-1}))$.

Poisson Processes are the only counting processes with these particular properties.

It turns out that Poisson Processes can be connected with the Order Statistics of Uniform Random Variables.

Theorem 39 (Conditional Distribution of Arrivals)

Conditioned on $N_t = n$, the random vector $(T_1, T_2, \cdots, T_n)$ has the same distribution as the order statistics of $n$ i.i.d. random variables $U\sim \text{Uniform}(0, t)$.

What Theorem 39 says is that given that $n$ arrivals occur up to time $t$, the distribution of the arrival times is equivalent to taking $n$ i.i.d. uniform random variables and sorting them.
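Theorem 39 also gives a convenient two-step recipe for simulating a Poisson process path on $[0, t]$: first draw $N_t \sim \text{Poisson}(\lambda t)$, then sort that many i.i.d. uniforms. A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t = 2.0, 5.0

n = rng.poisson(lam * t)                  # number of arrivals in [0, t]
arrivals = np.sort(rng.uniform(0, t, n))  # arrival times: sorted i.i.d. uniforms
print(arrivals)
```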

Two other useful properties of Poisson Processes involve combining and separating them.

Theorem 40 (Poisson Merging)

If $N_{1,t}$ and $N_{2,t}$ are independent Poisson Processes with rates $\lambda_1$ and $\lambda_2$, then $N_{1,t} + N_{2,t}$ is a Poisson Process with rate $\lambda_1+\lambda_2$.

Theorem 41 (Poisson Splitting)

Let $p(x)$ be a probability distribution and $N_t$ be a rate $\lambda$ Poisson Process. If each arrival is marked with the label $i$ independently with probability $p(x=i)$, then $N_{i,t}$, the process counting the number of arrivals labeled $i$, is an independent Poisson Process with rate $\lambda p_i$.
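A sketch of Poisson splitting: generate one rate-$\lambda$ path on $[0,t]$, mark each arrival independently, and compare each labeled count with its $\text{Poisson}(\lambda p_i t)$ mean (the label distribution is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, t = 3.0, 10.0
p = np.array([0.5, 0.3, 0.2])   # hypothetical label probabilities

# One rate-lambda path on [0, t]: only the number of arrivals matters here.
n = rng.poisson(lam * t)
labels = rng.choice(len(p), size=n, p=p)  # independent marks

for i in range(len(p)):
    print(i, np.sum(labels == i), lam * p[i] * t)  # count vs. expected count
```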
