# Convex Optimization

## Convexity

{% hint style="info" %}

#### Definition 30

A subset $$C\in\mathbb{R}^n$$ is convex if it contains the line segment between any two points in the set.

$$\forall \mathbf{x}\_1, \mathbf{x}\_2\in C,\ \lambda\in\[0, 1],\quad \lambda \mathbf{x}\_1+(1-\lambda)\mathbf{x}\_2 \in C$$
{% endhint %}

Convexity can be preserved by some operations.

{% hint style="info" %}

#### Theorem 10

If $$C\_1,\cdots,C\_m$$ are convex sets, then their intersection $$C = \bigcap\_{i=1,\cdots,m}C\_i$$is also a convex set.
{% endhint %}

{% hint style="info" %}

#### Theorem 11

If a map $$f:\mathbb{R}^n\to\mathbb{R}^m$$ is affine and $$C \subset \mathbb{R}^n$$ is convex, then $$f(C) = { f(\mathbf{x}): \mathbf{x}\in C }$$is convex.
{% endhint %}

Theorem 10, Theorem 11 are important because they allow us to prove sets are convex using sets that we know are convex. For example, Theorem 11 tells us that a projection of a convex set onto a subspace must also be convex since projection is a linear operator.

{% hint style="info" %}

#### Definition 31

A function $$f:\mathbb{R}^n\to\mathbb{R}$$ is convex if its domain is a convex set and $$\forall \mathbf{x}, \mathbf{y}$$ in the domain, $$\lambda \in\[0, 1]$$,

$$f(\lambda \mathbf{x} + (1-\lambda)\mathbf{y}) \leq \lambda f(\mathbf{x}) + (1-\lambda)f(\mathbf{y})$$
{% endhint %}

Loosely, convexity means that the function is bowl shaped since a line connecting any two points on the function is above the function itself. A concave function is simply one where $$-f$$ is convex, and these appear like a “hill”. Because convex functions are bowl shaped, they must be $$\infty$$ outside their domain.

{% hint style="info" %}

#### Theorem 12

A function $$f$$is convex if and only if its epigraph is a convex set.
{% endhint %}

Just like convex sets, some operations preserve convexity for functions.

{% hint style="info" %}

#### Theorem 13

If $$f\_i:\mathbb{R}^n\to\mathbb{R}$$ are convex functions, then $$f(\mathbf{x}) = \sum\_{i=1}^m\alpha\_if\_i(\mathbf{x})$$ where $$\alpha\_i\geq 0$$is also convex.
{% endhint %}

A similar property to Theorem 11 exists for convex functions.

{% hint style="info" %}

#### Theorem 14

If $$f:\mathbb{R}^n\to\mathbb{R}$$ is convex, then $$g(\mathbf{x}) = f(A\mathbf{x}+b)$$is also convex.
{% endhint %}

We can also look at the first and second order derivatives to determine the convexity of a function.

{% hint style="info" %}

#### Theorem 15

If $$f$$ is differentiable, then $$f$$ is convex if and only if

$$\forall \mathbf{x}, \mathbf{y},\quad f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla\_x^T (\mathbf{y}-\mathbf{x})$$
{% endhint %}

Theorem 15 can be understood geometrically by saying the graph of $$f$$ is bounded below everywhere by its tangent hyperplanes.

{% hint style="info" %}

#### Theorem 16

If $$f$$ is twice differentiable, then $$f$$ is convex if and only if the Hessian $$abla^2$$is positive semi-definite everywhere.
{% endhint %}

Geometrically, the second-order condition says that $$f$$ looks bowl-shaped.

{% hint style="info" %}

#### Theorem 17

A function $$f$$ is convex if and only if its restriction to any line $$g(t)=f(\mathbf{x}\_0+t\mathbf{v})$$is convex.
{% endhint %}

{% hint style="info" %}

#### Theorem 18

If $$(f\_\alpha)*{\alpha\in\mathcal{A}}$$ is a family of convex functions, then the pointwise maximum $$f(\mathbf{x}) = \max*{\alpha\in\mathcal{A}} f\_\alpha(\mathbf{x})$$is convex.
{% endhint %}

Because of the nice geometry that convexity gives, optimization problems which involve convex functions and sets are reliably solveable.

{% hint style="info" %}

#### Definition 32

A convex optimization problem in standard form is

$$p^\* = \min\_{\mathbf{x}}f\_0(\mathbf{x}) : \quad \forall i\in\[1,m], f\_i(\mathbf{x}) \leq 0, A\mathbf{x} = \mathbf{b}$$

where $$f\_0, f\_1, \cdots$$are convex functions and the equality constraints are affine.
{% endhint %}

Since the constraints form a convex set, Definition 32 is equivalent to minimizing a convex function over a convex set $$\mathcal{X}$$.

{% hint style="info" %}

#### Theorem 19

A locally optimal solution to a convex problem is also globally optimal, and this set $$\mathcal{X}$$is convex.
{% endhint %}

Theorem 19 is why convex problems are nice to solve.

## Optimality

When problems are convex, we can define conditions that any optimal solution must satisfy.

{% hint style="info" %}

#### Theorem 20

For a convex optimization problem with a differentiable objective function $$f\_0(\mathbf{x})$$ and feasible set $$\mathcal{X}$$,

$$\mathbf{x} \text{ is optimal } \Leftrightarrow \forall \mathbf{y}\in\mathcal{X}, \nabla\_xf\_0(\mathbf{x})^\top(\mathbf{y}-\mathbf{x}) \geq 0$$
{% endhint %}

Since the gradient points in the direction of greatest increase, the dot product of the gradient with the different between any vector and the optimal solution being positive means other solutions will only increase the value of $$f\_0(\mathbf{x})$$. For unconstrained problems, we can make this condition even sharper.

{% hint style="info" %}

#### Theorem 21

In a convex unconstrained problem with a differentiable objective function $$f\_0(\mathbf{x})$$, $$\mathbf{x}$$ is optimal if an only if $$abla\_xf\_0(\mathbf{x}) = \boldsymbol{0}$$
{% endhint %}

## Conic Programming

Conic programming is the set of optimization problems which deal with variables constrained to a second-order cone.

{% hint style="info" %}

#### Definition 33

A n-dimensional second-order cone is the set

$$\mathcal{K}\_n = {(\mathbf{x}, t),\ \mathbf{x}\in\mathbb{R}^n,\ t\in\mathbb{R}:\ |\mathbf{x}|\_2 \leq t}$$
{% endhint %}

By Cauchy-Schwartz, $$|\mathbf{x}|*2 = \max*{\mathbf{u}:|\mathbf{u}|\leq 1} \mathbf{u}^T\mathbf{x} \leq t$$. This means that second order cones are convex sets since they are the intersection of half-spaces. In spaces 3-dimensions and higher, we can rotate these cones.

{% hint style="info" %}

#### Definition 34

A rotated second order cone in $$\mathbb{R}^{n+2}$$ is the set

$$\mathcal{K}\_n^r = {(\mathbf{x}, y, z),\mathbf{x}\in\mathbb{R}^n, y\in\mathbb{R}, z\in\mathbb{R}:\ \mathbf{x}^T\mathbf{x} \leq yz, y\geq 0, z \geq 0 }.$$
{% endhint %}

The rotated second-order cone can be interpreted as a rotation because the hyperbolic constraint $$|\mathbf{x}|\_2^2\leq yz$$ can be expressed equivalently as

$$\left\lVert\begin{bmatrix}2\mathbf{x} \ y - z\end{bmatrix}\right\rVert\_2 \leq y+z.$$

{% hint style="info" %}

#### Definition 35

The standard Second Order Cone Constraint is

$$|A\mathbf{x}+\mathbf{b}|\_2 \leq \mathbf{c}^T\mathbf{x} +d.$$
{% endhint %}

A SOC constraint will confine $$\mathbf{x}$$ to a second order cone since if we let $$\mathbf{y} = A\mathbf{x}+\mathbf{b} \in \mathbb{R}^m$$ and $$t = \mathbf{c}^T\mathbf{x}+d$$, then $$(\mathbf{y}, t)\in\mathcal{K}\_m$$.

{% hint style="info" %}

#### Definition 36

A second-order cone program in standard inequality form is given by

$$\min \mathbf{c}^T\mathbf{x} \text{ such that } |A\_i\mathbf{x}+\mathbf{b}\_i|\_2 \leq \mathbf{c}\_i^T\mathbf{x}+d\_i.$$
{% endhint %}

An SOC program is a convex problem since its objective is linear, and hence convex, and the SOC constraints are also convex.

### Quadratic Programming

A special case of SOCPs are Quadratic Programs. These programs have constraints and an objective function which can be expressed as a quadratic function. In SOCP form, they look like

$$\begin{aligned} \min\_{\mathbf{x}, t} &\quad \mathbf{a}\_0^T\mathbf{x} + t\ \text{s.t: } & \left\lVert \begin{bmatrix}2Q\_0^{\frac{1}{2}}\mathbf{x}\ t-1 \end{bmatrix}\right\rVert\_2 \leq t+1\ & \left\lVert \begin{bmatrix}2Q\_i^{\frac{1}{2}}\mathbf{x}\ b\_i-\mathbf{a}\_i^T\mathbf{x}-1 \end{bmatrix}\right\rVert\_2 \leq b\_i - \mathbf{a}\_i\mathbf{x} + 1\end{aligned}$$

Since they are a special case of SOCPs, Quadratic Programs are also convex.

{% hint style="info" %}

#### Definition 37

The standard form of a quadratic constrained quadratic program is

$$\min\_\mathbf{x} \mathbf{x}^TQ\_0\mathbf{x} + \mathbf{a}\_0^T\mathbf{x} \quad : \quad \forall i\in\[1,m],\ \mathbf{x}^TQ\_i\mathbf{x} + \mathbf{a}\_i^T\mathbf{x} \leq b\_i$$
{% endhint %}

To be a quadratic program, the matrix $$H$$ must be positive semi-definite. If the $$Q\_i=0$$ in the constraints, then we get a normal quadratic program.

{% hint style="info" %}

#### Definition 38

The standard form of a quadratic program is given by

$$\min\_\mathbf{x}\frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} \quad : \quad \forall i\in\[1,m],\ \mathbf{a}\_i^T\mathbf{x} \leq b\_i$$
{% endhint %}

Its SOCP form looks like

$$\begin{aligned} \min\_{\mathbf{x}, y} &\quad \mathbf{c}^T\mathbf{x} + y\ \text{s.t: } &\left\lVert \begin{bmatrix}2H^{\frac{1}{2}}\mathbf{x} \ y - 1 \end{bmatrix}\right\rVert\_2 \leq y + 1,\ & \mathbf{a}\_i\mathbf{x} \leq b\_i\end{aligned}$$

In the special case where $$H$$ is positive definite and we have no constraints, then

$$\frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} + d = \frac{1}{2}(\mathbf{x} + H^{-1}\mathbf{c})^TH(\mathbf{x} + H^{-1}\mathbf{c}) + d - (H^{-1}\mathbf{c})^TH(H^{-1}\mathbf{c})$$

Thus

$$\text{argmin}\_\mathbf{x} \frac{1}{2}\mathbf{x}^TH\mathbf{x} + \mathbf{c}^T\mathbf{x} + d = -H^{-1}\mathbf{c}$$

### Linear Programming

If the matrix in the objective function of a quadratic program is 0 (and there are no quadratic constraints), then the resulting objective and constraints are affine functions. This is a linear program.

{% hint style="info" %}

#### Definition 39

The inequality form of a linear program is given by

$$\min\_\mathbf{x} \mathbf{c}^T\mathbf{x} + d \quad : \quad \forall i\in\[1,m],\ \mathbf{a}\_i^T\mathbf{x} \leq b\_i$$
{% endhint %}

Since linear program is a special case of a quadratic program, it can also be expressed as an SOCP.

$$\begin{aligned} \min\_\mathbf{x} &\quad \mathbf{c}^T\mathbf{x}\ \text{s.t } &\quad \forall i\in\[1,m],\ |0\mathbf{x} + 0|\_2 \leq b\_i - \mathbf{a}\_i^T\mathbf{x}\end{aligned}$$

Because of the constraints, the feasible set of a linear program is a polyhedron. Thus linear programs are also convex.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://aparande.gitbook.io/berkeley-notes/eecs127-0/eecs127-4.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
