
Convex Optimization


Convexity

Definition 30

A subset $C \subseteq \mathbb{R}^n$ is convex if it contains the line segment between any two points in the set:

$$\forall \mathbf{x}_1, \mathbf{x}_2 \in C,\ \lambda \in [0,1], \quad \lambda \mathbf{x}_1 + (1-\lambda)\mathbf{x}_2 \in C$$

Convexity can be preserved by some operations.

Theorem 10

If $C_1, \cdots, C_m$ are convex sets, then their intersection $C = \bigcap_{i=1}^{m} C_i$ is also a convex set.

Theorem 11

If a map $f:\mathbb{R}^n\to\mathbb{R}^m$ is affine and $C \subset \mathbb{R}^n$ is convex, then $f(C) = \{ f(\mathbf{x}) : \mathbf{x}\in C \}$ is convex.

Theorem 10 and Theorem 11 are important because they allow us to prove sets are convex by building them from sets we already know are convex. For example, Theorem 11 tells us that a projection of a convex set onto a subspace must also be convex, since projection is a linear (hence affine) operator.
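As a sanity check on Theorem 10, here is a minimal numpy sketch: it builds a polyhedron as an intersection of half-spaces (each convex) and verifies that the segment between two feasible points stays feasible. The matrix `A`, vector `b`, and helper `in_C` are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# C = {x : Ax <= b} is an intersection of half-spaces, so by
# Theorem 10 it is convex.
A = rng.normal(size=(5, 3))
b = np.ones(5)

def in_C(x, tol=1e-9):
    return bool(np.all(A @ x <= b + tol))

x1 = np.zeros(3)        # feasible, since b > 0
x2 = rng.normal(size=3)
while not in_C(x2):     # shrink toward the origin until feasible
    x2 = 0.5 * x2

# Every convex combination of two feasible points is feasible.
for lam in np.linspace(0, 1, 11):
    assert in_C(lam * x1 + (1 - lam) * x2)
```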

Definition 31

A function $f:\mathbb{R}^n\to\mathbb{R}$ is convex if its domain is a convex set and $\forall \mathbf{x}, \mathbf{y}$ in the domain, $\lambda\in[0,1]$,

$$f(\lambda \mathbf{x} + (1-\lambda)\mathbf{y}) \leq \lambda f(\mathbf{x}) + (1-\lambda)f(\mathbf{y})$$

Loosely, convexity means the function is bowl-shaped, since the line segment connecting any two points on the graph lies above the graph itself. A concave function is simply one where $-f$ is convex, and these look like a "hill." Because convex functions are bowl-shaped, they are taken to be $\infty$ outside their domain, which keeps the defining inequality valid everywhere.
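To make Definition 31 concrete, here is a small numpy spot-check of the inequality for the log-sum-exp function, a standard example of a convex function (the dimension, sample count, and tolerance are arbitrary choices for this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.log(np.sum(np.exp(x)))   # log-sum-exp is convex

for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    lam = rng.uniform()
    # Definition 31: f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y)
    assert f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y) + 1e-12
```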

Theorem 12

A function $f$ is convex if and only if its epigraph, the set $\{(\mathbf{x}, t) : t \geq f(\mathbf{x})\}$ of points lying on or above the graph, is a convex set.

Just as with convex sets, some operations preserve the convexity of functions.

Theorem 13

If $f_i:\mathbb{R}^n\to\mathbb{R}$ are convex functions, then $f(\mathbf{x}) = \sum_{i=1}^m \alpha_i f_i(\mathbf{x})$ where $\alpha_i \geq 0$ is also convex.

A similar property to Theorem 11 exists for convex functions.

Theorem 14

If $f:\mathbb{R}^n\to\mathbb{R}$ is convex, then $g(\mathbf{x}) = f(A\mathbf{x}+\mathbf{b})$ is also convex.

We can also look at the first- and second-order derivatives to determine the convexity of a function.

Theorem 15

If $f$ is differentiable, then $f$ is convex if and only if

$$\forall \mathbf{x}, \mathbf{y}, \quad f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^T (\mathbf{y}-\mathbf{x})$$

Theorem 15 can be understood geometrically as saying that the graph of $f$ is bounded below everywhere by its tangent hyperplanes.
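Continuing the log-sum-exp example, a quick numpy check of the first-order condition; its gradient is the softmax function (sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.log(np.sum(np.exp(x)))
grad_f = lambda x: np.exp(x) / np.sum(np.exp(x))   # softmax = gradient of log-sum-exp

for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    # Theorem 15: the tangent plane at x lies below the graph everywhere.
    assert f(y) >= f(x) + grad_f(x) @ (y - x) - 1e-12
```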

Theorem 16

If $f$ is twice differentiable, then $f$ is convex if and only if the Hessian $\nabla^2 f$ is positive semi-definite everywhere. Geometrically, the second-order condition says that $f$ looks bowl-shaped.
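For log-sum-exp, the Hessian at $\mathbf{x}$ has the closed form $\text{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^T$ where $\mathbf{p}$ is the softmax of $\mathbf{x}$; the sketch below confirms numerically that it is positive semi-definite (dimension and sample count are again arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
for _ in range(100):
    x = rng.normal(size=4)
    p = np.exp(x) / np.sum(np.exp(x))   # softmax
    H = np.diag(p) - np.outer(p, p)     # Hessian of log-sum-exp at x
    # Theorem 16: all eigenvalues are non-negative (up to roundoff).
    assert np.linalg.eigvalsh(H).min() >= -1e-12
```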

Theorem 17

A function $f$ is convex if and only if its restriction to any line $g(t) = f(\mathbf{x}_0 + t\mathbf{v})$ is convex.

Theorem 18

If $(f_\alpha)_{\alpha\in\mathcal{A}}$ is a family of convex functions, then the pointwise maximum $f(\mathbf{x}) = \max_{\alpha\in\mathcal{A}} f_\alpha(\mathbf{x})$ is convex.
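A common instance of Theorem 18 is the pointwise maximum of finitely many affine functions; a brief numpy check on made-up random data:

```python
import numpy as np

rng = np.random.default_rng(4)
A, b = rng.normal(size=(5, 3)), rng.normal(size=5)
f = lambda x: np.max(A @ x + b)   # max of affine functions: convex by Theorem 18

for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    lam = rng.uniform()
    assert f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y) + 1e-12
```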

Because of the nice geometry that convexity provides, optimization problems involving convex functions and sets are reliably solvable.

Definition 32

A convex optimization problem in standard form is

$$p^* = \min_{\mathbf{x}} f_0(\mathbf{x}) \quad : \quad \forall i\in[1,m],\ f_i(\mathbf{x}) \leq 0, \quad A\mathbf{x} = \mathbf{b}$$

where $f_0, f_1, \cdots$ are convex functions and the equality constraints are affine. Since the constraints form a convex set, Definition 32 is equivalent to minimizing a convex function over a convex set $\mathcal{X}$.

Theorem 19

A locally optimal solution to a convex problem is also globally optimal, and the set of optimal solutions is convex.

Theorem 19 is why convex problems are nice to solve.
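As an illustration, the sketch below sets up a small instance of Definition 32 in Python using the cvxpy library (assuming it is installed); the particular objective, constraints, and data are made up for this example:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(5)
n = 3
A = rng.normal(size=(2, n))
b = rng.normal(size=2)

x = cp.Variable(n)
objective = cp.Minimize(cp.sum_squares(x))   # convex f0
constraints = [cp.norm(x, 2) <= 10,          # convex inequality constraint
               A @ x == b]                   # affine equality constraint
prob = cp.Problem(objective, constraints)
prob.solve()
# By Theorem 19, the solution the solver returns is globally optimal.
print(prob.value, x.value)
```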

Optimality

When problems are convex, we can define conditions that any optimal solution must satisfy.

Theorem 20

For a convex optimization problem with a differentiable objective function $f_0(\mathbf{x})$ and feasible set $\mathcal{X}$,

$$\mathbf{x} \text{ is optimal} \quad \Leftrightarrow \quad \forall \mathbf{y}\in\mathcal{X},\ \nabla f_0(\mathbf{x})^\top(\mathbf{y}-\mathbf{x}) \geq 0$$

Since the gradient points in the direction of greatest increase, requiring the dot product of the gradient with the difference between any feasible point and the optimal solution to be non-negative means that moving to any other feasible point can only increase the value of $f_0(\mathbf{x})$.
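The condition in Theorem 20 can be checked numerically: minimizing $f_0(\mathbf{x}) = \|\mathbf{x}-\mathbf{z}\|_2^2$ over the unit ball has the projection $\mathbf{z}/\|\mathbf{z}\|_2$ as its optimum when $\|\mathbf{z}\|_2 > 1$. The point `z` and sample count below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
z = np.array([2.0, 1.0, -1.0])       # ||z|| > 1, so the optimum sits on the boundary
x_star = z / np.linalg.norm(z)       # projection of z onto the unit ball
grad = 2 * (x_star - z)              # gradient of ||x - z||^2 at x_star

for _ in range(1000):
    y = rng.normal(size=3)
    y /= max(1.0, np.linalg.norm(y)) # a random feasible point in the unit ball
    # Theorem 20: the gradient makes a non-negative inner product
    # with every feasible direction from the optimum.
    assert grad @ (y - x_star) >= -1e-9
```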

For unconstrained problems, we can make this condition even sharper.

Theorem 21

In a convex unconstrained problem with a differentiable objective function $f_0(\mathbf{x})$, $\mathbf{x}$ is optimal if and only if $\nabla_{\mathbf{x}} f_0(\mathbf{x}) = \boldsymbol{0}$.

Conic Programming

Conic programming covers optimization problems whose variables are constrained to lie in a second-order cone.

Definition 33

An $n$-dimensional second-order cone is the set

$$\mathcal{K}_n = \{(\mathbf{x}, t),\ \mathbf{x}\in\mathbb{R}^n,\ t\in\mathbb{R} :\ \|\mathbf{x}\|_2 \leq t\}$$

By Cauchy-Schwarz, $\|\mathbf{x}\|_2 = \max_{\mathbf{u}:\|\mathbf{u}\|_2\leq 1} \mathbf{u}^T\mathbf{x}$, so $\|\mathbf{x}\|_2 \leq t$ holds exactly when $\mathbf{u}^T\mathbf{x} \leq t$ for every $\mathbf{u}$ in the unit ball. This means that second-order cones are convex sets, since they are intersections of half-spaces. In dimensions three and higher, we can rotate these cones.

Definition 34

A rotated second-order cone in $\mathbb{R}^{n+2}$ is the set

$$\mathcal{K}_n^r = \{(\mathbf{x}, y, z),\ \mathbf{x}\in\mathbb{R}^n,\ y\in\mathbb{R},\ z\in\mathbb{R} :\ \mathbf{x}^T\mathbf{x} \leq yz,\ y\geq 0,\ z \geq 0\}.$$

The rotated second-order cone can be interpreted as a rotation because the hyperbolic constraint $\|\mathbf{x}\|_2^2\leq yz$ can be expressed equivalently as

$$\left\lVert\begin{bmatrix}2\mathbf{x} \\ y - z\end{bmatrix}\right\rVert_2 \leq y+z.$$

Squaring both sides gives $4\|\mathbf{x}\|_2^2 + (y-z)^2 \leq (y+z)^2$, which simplifies to $4\|\mathbf{x}\|_2^2 \leq 4yz$.
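A quick numpy spot-check of this equivalence on random points (dimensions and sampling ranges are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
for _ in range(1000):
    x = rng.normal(size=2)
    y, z = rng.uniform(0, 2, size=2)
    hyperbolic = x @ x <= y * z
    rotated = np.linalg.norm(np.concatenate([2 * x, [y - z]])) <= y + z
    assert hyperbolic == rotated   # the two constraints agree
```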

Definition 35

The standard second-order cone (SOC) constraint is

$$\|A\mathbf{x}+\mathbf{b}\|_2 \leq \mathbf{c}^T\mathbf{x} + d.$$

An SOC constraint confines $\mathbf{x}$ to a second-order cone since if we let $\mathbf{y} = A\mathbf{x}+\mathbf{b} \in \mathbb{R}^m$ and $t = \mathbf{c}^T\mathbf{x}+d$, then $(\mathbf{y}, t)\in\mathcal{K}_m$.

Definition 36

A second-order cone program (SOCP) in standard inequality form is given by

$$\min_{\mathbf{x}} \mathbf{c}^T\mathbf{x} \quad \text{such that} \quad \forall i\in[1,m],\ \|A_i\mathbf{x}+\mathbf{b}_i\|_2 \leq \mathbf{c}_i^T\mathbf{x}+d_i.$$

An SOCP is a convex problem since its objective is linear, and hence convex, and the SOC constraints describe convex sets.
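Below is a minimal SOCP instance in cvxpy (assuming it is installed); the data, the extra norm bound added to keep the problem bounded, and the dimensions are all made up for illustration:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(8)
n = 3
c = rng.normal(size=n)
A1, b1 = rng.normal(size=(4, n)), rng.normal(size=4)
c1, d1 = rng.normal(size=n), 10.0

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(c @ x),
                  [cp.norm(A1 @ x + b1, 2) <= c1 @ x + d1,  # SOC constraint
                   cp.norm(x, 2) <= 1])                     # keeps the problem bounded
prob.solve()
print(prob.status, prob.value)
```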

Quadratic Programming

A special case of SOCPs are Quadratic Programs (QPs). These programs have constraints and an objective function which can be expressed with quadratic functions. In SOCP form, they look like

$$\begin{aligned} \min_{\mathbf{x}, t} &\quad \mathbf{a}_0^T\mathbf{x} + t\\ \text{s.t.} &\quad \left\lVert \begin{bmatrix}2Q_0^{\frac{1}{2}}\mathbf{x}\\ t-1 \end{bmatrix}\right\rVert_2 \leq t+1\\ &\quad \left\lVert \begin{bmatrix}2Q_i^{\frac{1}{2}}\mathbf{x}\\ b_i-\mathbf{a}_i^T\mathbf{x}-1 \end{bmatrix}\right\rVert_2 \leq b_i - \mathbf{a}_i^T\mathbf{x} + 1\end{aligned}$$

Since they are a special case of SOCPs, Quadratic Programs are also convex.

Definition 37

The standard form of a quadratically constrained quadratic program (QCQP) is

$$\min_\mathbf{x}\ \mathbf{x}^TQ_0\mathbf{x} + \mathbf{a}_0^T\mathbf{x} \quad : \quad \forall i\in[1,m],\ \mathbf{x}^TQ_i\mathbf{x} + \mathbf{a}_i^T\mathbf{x} \leq b_i$$

If the $Q_i = 0$ in the constraints, then we get a normal quadratic program. To be a convex quadratic program, the matrix $H$ in the objective must be positive semi-definite.

Definition 38

The standard form of a quadratic program is given by

$$\min_\mathbf{x}\ \frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} \quad : \quad \forall i\in[1,m],\ \mathbf{a}_i^T\mathbf{x} \leq b_i$$

Its SOCP form looks like

$$\begin{aligned} \min_{\mathbf{x}, y} &\quad \mathbf{c}^T\mathbf{x} + y\\ \text{s.t.} &\quad \left\lVert \begin{bmatrix}\sqrt{2}H^{\frac{1}{2}}\mathbf{x} \\ y - 1 \end{bmatrix}\right\rVert_2 \leq y + 1,\\ &\quad \mathbf{a}_i^T\mathbf{x} \leq b_i\end{aligned}$$

where the cone constraint encodes $\frac{1}{2}\mathbf{x}^TH\mathbf{x} \leq y$.
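A small concrete QP in cvxpy (assuming it is installed), following Definition 38; the matrices `H`, `c`, `A`, `b` are made up for the example:

```python
import cvxpy as cp
import numpy as np

H = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # positive definite
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])   # one constraint: x1 + x2 <= 1
b = np.array([1.0])

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(0.5 * cp.quad_form(x, H) + c @ x),
                  [A @ x <= b])
prob.solve()
print(prob.value, x.value)
```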

In the special case where $H$ is positive definite and we have no constraints, we can complete the square:

$$\frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} + d = \frac{1}{2}(\mathbf{x} + H^{-1}\mathbf{c})^TH(\mathbf{x} + H^{-1}\mathbf{c}) + d - \frac{1}{2}(H^{-1}\mathbf{c})^TH(H^{-1}\mathbf{c})$$

Thus

$$\text{argmin}_\mathbf{x}\ \frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} + d = -H^{-1}\mathbf{c}$$
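A numpy check of this closed form, reusing the `H` and `c` from the QP sketch above (the perturbation test is illustrative):

```python
import numpy as np

H = np.array([[2.0, 0.5],
              [0.5, 1.0]])
c = np.array([-1.0, -1.0])

x_star = -np.linalg.solve(H, c)          # -H^{-1} c
f = lambda x: 0.5 * x @ H @ x + c @ x

assert np.allclose(H @ x_star + c, 0)    # gradient vanishes (Theorem 21)
rng = np.random.default_rng(9)
for _ in range(100):
    # nearby points only increase the objective
    assert f(x_star + 0.1 * rng.normal(size=2)) >= f(x_star)
```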

Linear Programming

If the matrix in the objective function of a quadratic program is 0 (and there are no quadratic constraints), then the resulting objective and constraints are affine functions. This is a linear program.

Definition 39

The inequality form of a linear program is given by

$$\min_\mathbf{x}\ \mathbf{c}^T\mathbf{x} + d \quad : \quad \forall i\in[1,m],\ \mathbf{a}_i^T\mathbf{x} \leq b_i$$

Since a linear program is a special case of a quadratic program, it can also be expressed as an SOCP:

$$\begin{aligned} \min_\mathbf{x} &\quad \mathbf{c}^T\mathbf{x}\\ \text{s.t.} &\quad \forall i\in[1,m],\ \|0\mathbf{x} + \mathbf{0}\|_2 \leq b_i - \mathbf{a}_i^T\mathbf{x}\end{aligned}$$

Because the constraints are affine, the feasible set of a linear program is a polyhedron. Thus linear programs are also convex.
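Finally, a small LP in cvxpy (assuming it is installed) matching Definition 39; the data below is made up, chosen so the optimum sits at a vertex of the polyhedron:

```python
import cvxpy as cp
import numpy as np

c = np.array([1.0, 2.0])
A = np.array([[-1.0, 0.0],   # -x1 <= 0  (x1 >= 0)
              [0.0, -1.0],   # -x2 <= 0  (x2 >= 0)
              [1.0, 1.0]])   #  x1 + x2 <= 1
b = np.array([0.0, 0.0, 1.0])

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(c @ x), [A @ x <= b])
prob.solve()
print(prob.value, x.value)   # optimum at the vertex (0, 0)
```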
