Duality

Definition 40

A primal optimization problem is given by

p^* = \min_{\mathbf{x}\in\mathbb{R}^n} f_0(\mathbf{x}) : \forall i\in[1,m]\ f_i(\mathbf{x}) \leq 0,\ \forall k\in[1,n]\ h_k(\mathbf{x}) = 0

The primal problem is essentially the standard form of optimization. There are no convexity assumptions on any of the functions involved. We would like to express primal problems as an unconstrained min-max optimization.

Definition 41

The Lagrangian \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) using Lagrange multipliers \boldsymbol{\lambda} and \boldsymbol{\mu} is given by

\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = f_0(\mathbf{x}) + \sum_{i=1}^m \lambda_i f_i(\mathbf{x}) + \sum_{k=1}^n \mu_k h_k(\mathbf{x})

The Lagrangian achieves the goal of removing the constraints: the primal problem can be written as the unconstrained min-max optimization

p^* = \min_{\mathbf{x}\in\mathbb{R}^n} \max_{\boldsymbol{\lambda}\geq \boldsymbol{0},\, \boldsymbol{\mu}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})

This is true because if an inequality constraint is violated, then f_i(\mathbf{x}) > 0, and the maximization can take \lambda_i arbitrarily large to drive the inner maximum to \infty; likewise, if an equality constraint is violated, then h_k(\mathbf{x}) \ne 0, and the maximization can take \mu_k arbitrarily large with the same sign as h_k(\mathbf{x}), again driving the inner maximum to \infty. If \mathbf{x} is feasible, the inner maximum is attained at \boldsymbol{\lambda} = \boldsymbol{0} and equals f_0(\mathbf{x}). Thus the min-max problem is equivalent to the original problem. At this point, it might be easier to solve the problem if the order of min and max were switched.
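This behavior can be checked numerically on a toy problem of our own (not from the notes): minimize f_0(x) = x^2 subject to f_1(x) = 1 - x \leq 0. The sketch below shows that the inner maximum over \lambda \geq 0 recovers f_0(x) at a feasible point and blows up at an infeasible one.

```python
import numpy as np

# Toy problem (our own example): minimize x^2 subject to 1 - x <= 0.
def lagrangian(x, lam):
    return x**2 + lam * (1 - x)

lambdas = np.linspace(0, 100, 1001)

# Feasible point x = 2: the inner max over lambda >= 0 is attained at
# lambda = 0, recovering f0(2) = 4.
assert np.isclose(max(lagrangian(2.0, l) for l in lambdas), 4.0)

# Infeasible point x = 0 (violates 1 - x <= 0): the Lagrangian grows
# without bound as lambda increases, so the inner max is +infinity.
assert lagrangian(0.0, 100.0) > lagrangian(0.0, 10.0) > lagrangian(0.0, 0.0)
```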

Theorem 22 (Minimax Inequality)

For any sets X, Y and any function F : X \times Y \to \mathbb{R},

\min_{\mathbf{x}\in X}\max_{\mathbf{y}\in Y} F(\mathbf{x}, \mathbf{y}) \geq \max_{\mathbf{y}\in Y}\min_{\mathbf{x}\in X} F(\mathbf{x}, \mathbf{y})

Theorem 22 can be interpreted as a game between a minimizing player and a maximizing player, where the player who moves second has the advantage of seeing the other's choice. The left-hand side, in which the minimizer commits first and the maximizer responds, is therefore at least as large as the right-hand side, in which the maximizer commits first. We can now apply Theorem 22 to switch the \min and \max in our optimization of the Lagrangian.

Theorem 23 (Weak Duality)

\min_{\mathbf{x}\in\mathbb{R}^n}\max_{\boldsymbol{\lambda}\geq \boldsymbol{0},\, \boldsymbol{\mu}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) \geq \max_{\boldsymbol{\lambda}\geq \boldsymbol{0},\, \boldsymbol{\mu}} \min_{\mathbf{x}\in\mathbb{R}^n} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})

Weak duality converts our minimization problem into a maximization problem whose optimal value is a lower bound on the primal optimum p^*.
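The lower bound can be seen concretely on a toy problem of our own: minimize x^2 subject to 1 - x \leq 0, whose optimum is p^* = 1 at x = 1. Minimizing the Lagrangian analytically gives the dual function g(\lambda) = \lambda - \lambda^2/4, and every value of g lower-bounds p^*.

```python
import numpy as np

# Toy problem (our own example): minimize x^2 s.t. 1 - x <= 0, p* = 1.
# Inner minimization: min_x x^2 + lam*(1 - x) is attained at x = lam/2,
# giving the dual function g(lam) = lam - lam^2 / 4.
def g(lam):
    return lam - lam**2 / 4

p_star = 1.0
lambdas = np.linspace(0, 10, 1001)

# Weak duality: every dual value lower-bounds the primal optimum.
assert all(g(l) <= p_star + 1e-12 for l in lambdas)

# Here the bound is tight: g attains p* at lam = 2 (strong duality).
assert np.isclose(g(2.0), p_star)
```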

Definition 42

The dual function of the primal problem is given by

g(\boldsymbol{\lambda}, \boldsymbol{\mu}) = \min_{\mathbf{x}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})

Note that g is a concave function because it is the pointwise minimum of functions that are affine in \boldsymbol{\lambda} and \boldsymbol{\mu}. Maximizing a concave function over a convex set is a convex problem, so the dual problem (maximizing g) is convex. Thus duality serves two primary purposes.
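The concavity of g can be verified numerically on the same kind of toy problem (min x^2 s.t. 1 - x \leq 0, our own example): approximating the inner minimum over a grid of x values yields a pointwise minimum of affine functions of \lambda, which satisfies the midpoint concavity inequality.

```python
import numpy as np

# Dual function for the toy problem  min x^2 s.t. 1 - x <= 0:
# g(lam) = min_x [x^2 + lam*(1 - x)], a pointwise minimum over x of
# functions affine in lam -- hence concave in lam.
xs = np.linspace(-5, 5, 2001)

def g(lam):
    return np.min(xs**2 + lam * (1 - xs))

# Midpoint concavity check: g((a+b)/2) >= (g(a) + g(b)) / 2.
lams = np.linspace(0, 8, 81)
for a in lams:
    for b in lams:
        assert g((a + b) / 2) >= (g(a) + g(b)) / 2 - 1e-9
```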

  1. It removes constraints, potentially making the problem easier to solve.

  2. It can turn a non-convex problem into a convex one.

Even when there are no constraints, we can sometimes introduce constraints to leverage duality by adding slack variables that are equal to expressions in the objective.

Strong Duality

In some cases, duality gives not just a lower bound, but an exact value. When this happens, we have Strong Duality.

Theorem 24 (Sion's Minimax Theorem)

Let X \subseteq \mathbb{R}^n be convex, and let Y \subseteq \mathbb{R}^m be bounded and closed (compact). Let F : X \times Y \to \mathbb{R} be a function such that F(\cdot, \mathbf{y}) is convex and continuous for every \mathbf{y}, and F(\mathbf{x}, \cdot) is concave and continuous for every \mathbf{x}. Then

\min_{\mathbf{x}\in X}\max_{\mathbf{y}\in Y} F(\mathbf{x}, \mathbf{y}) = \max_{\mathbf{y}\in Y}\min_{\mathbf{x}\in X} F(\mathbf{x}, \mathbf{y})
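A minimax equality of this kind can be checked numerically for a convex-concave function of our own choosing, F(x, y) = x^2 + xy - y^2 on compact intervals, which has a saddle point at the origin with value 0.

```python
import numpy as np

# F(x, y) = x^2 + x*y - y^2 (our own example): convex in x for each y,
# concave in y for each x, so Sion's theorem predicts minimax = maximin.
xs = np.linspace(-1, 1, 201)  # grids include 0, the saddle point
ys = np.linspace(-1, 1, 201)
F = xs[:, None]**2 + xs[:, None] * ys[None, :] - ys[None, :]**2

minimax = F.max(axis=1).min()  # min over x of max over y
maximin = F.min(axis=0).max()  # max over y of min over x

assert np.isclose(minimax, maximin)
assert np.isclose(minimax, 0.0)  # saddle value at (0, 0)
```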

If we focus on convex problems, then we can find conditions which indicate strong duality holds.

Theorem 25 (Slater's Condition)

If a convex optimization problem is strictly feasible (i.e., there exists a point at which every inequality constraint holds strictly), then strong duality holds.

Once we find a solution to the dual problem, the solution to the primal problem is recovered by minimizing \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*), where \boldsymbol{\lambda}^*, \boldsymbol{\mu}^* are the optimal dual variables; if no such feasible point \mathbf{x} exists, then the primal itself is infeasible. When searching for strong duality and an optimal solution (\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}), it can be useful to consider particular conditions.
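The recovery step can be sketched on a toy problem of our own, min x^2 s.t. 1 - x \leq 0, where the dual optimum is \lambda^* = 2: minimizing \mathcal{L}(x, \lambda^*) over x returns the primal solution x^* = 1.

```python
import numpy as np

# Toy problem (our own example): min x^2 s.t. 1 - x <= 0, with dual
# optimum lam* = 2. Recover the primal solution by minimizing
# L(x, lam*) = x^2 + 2*(1 - x) = (x - 1)^2 + 1 over x.
xs = np.linspace(-5, 5, 10001)
lam_star = 2.0
L = xs**2 + lam_star * (1 - xs)
x_star = xs[np.argmin(L)]

assert np.isclose(x_star, 1.0)  # recovered primal solution
assert 1 - x_star <= 1e-9       # and it is primal feasible
```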

Theorem 26

For a convex primal problem which is feasible and has a feasible dual where strong duality holds, a primal dual pair (x,λ,μ)(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})(x,λ,μ) is optimal if and only if the KKT conditions are satisfied.

  1. Primal Feasibility: \mathbf{x} satisfies f_i(\mathbf{x}) \leq 0 for all i\in[1,m] and h_k(\mathbf{x}) = 0 for all k\in[1,n].

  2. Dual Feasibility: \boldsymbol{\lambda} \geq \boldsymbol{0}.

  3. Complementary Slackness: \lambda_i f_i(\mathbf{x}) = 0 for all i\in[1,m].

  4. Lagrangian Stationarity: if the Lagrangian is differentiable, then

\nabla_{\mathbf{x}} f_0(\mathbf{x}) + \sum_{i=1}^m \lambda_i \nabla_{\mathbf{x}} f_i(\mathbf{x}) + \sum_{k=1}^n \mu_k \nabla_{\mathbf{x}} h_k(\mathbf{x}) = \boldsymbol{0}

The complementary slackness requirement says that if a primal constraint is slack (f_i(\mathbf{x}) < 0), then \lambda_i = 0, and if \lambda_i > 0, then f_i(\mathbf{x}) = 0.
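The four conditions can be checked mechanically at a candidate optimum. For a toy problem of our own, min x^2 s.t. f_1(x) = 1 - x \leq 0, the pair (x^*, \lambda^*) = (1, 2) can be found analytically, and all four KKT conditions hold there.

```python
import numpy as np

# KKT check at the claimed optimum of  min x^2 s.t. 1 - x <= 0
# (our own example): candidate pair (x*, lam*) = (1, 2).
x, lam = 1.0, 2.0
f1 = 1 - x          # inequality constraint value
grad_f0 = 2 * x     # d/dx of x^2
grad_f1 = -1.0      # d/dx of (1 - x)

assert f1 <= 0                                   # 1. primal feasibility
assert lam >= 0                                  # 2. dual feasibility
assert np.isclose(lam * f1, 0.0)                 # 3. complementary slackness
assert np.isclose(grad_f0 + lam * grad_f1, 0.0)  # 4. stationarity
```

Here the constraint is active (f_1(x^*) = 0), so \lambda^* > 0 is consistent with complementary slackness.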
