# Duality

{% hint style="info" %}

#### Definition 40

A primal optimization problem is given by

$$p^\* = \min\_{\mathbf{x}\in\mathbb{R}^n} f\_0(\mathbf{x}) : \forall i\in\[1,m]\ f\_i(\mathbf{x}) \leq 0, \forall k\in\[1,n]\ h\_k(\mathbf{x}) = 0$$
{% endhint %}

The primal problem is essentially the standard form of optimization. There are no assumptions of convexity on any of the functions involved. We can would like to express primal problems as a min-max optimization with no constraints.

{% hint style="info" %}

#### Definition 41

The Lagrangian $$\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$ using Lagrange Multipliers $$\boldsymbol{\lambda}$$ and $$\boldsymbol{\mu}$$ is given by

$$\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = f\_0(\mathbf{x}) + \sum\_{i=1}^m\lambda\_i f\_i(\mathbf{x}) + \sum\_{k=1}^n \mu\_i h\_i(\mathbf{x})$$
{% endhint %}

The Lagrangian achieves the goal of removing the constraints in the min-max optimization

$$p^\* = \min\_{\mathbf{x}\in\mathbb{R}^n}\max\_{\boldsymbol{\lambda}\geq \boldsymbol{0}, \boldsymbol{\mu}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$

This is true because if any inequality constraints are violated, then $$f\_i(\mathbf{x}) \geq 0$$, and the maximization could set $$\lambda\_i$$ very large to make the overall problem $$\infty$$, and if any equality constraints are violated, then $$h\_k(\mathbf{x}) \ne 0$$, and the maximization would set $$\mu\_i$$ to a very large number of the same sign as $$h\_k(\mathbf{x})$$ to make the overall problem $$\infty$$. Thus the minimax problem is equivalent to the original problem. At this point, it might be easier to solve the problem if the order of min and max were switched.

{% hint style="info" %}

#### Theorem 22 (Minimax Inequality) <a href="#theorem-22" id="theorem-22"></a>

For any sets $$X, Y$$ and any function $$F:X\times Y\to\mathbb{R}$$

$$\min\_{\mathbf{x}\in X}\max\_{\mathbf{y}\in Y} F(\mathbf{x}, \mathbf{y}) \geq \max\_{\mathbf{y}\in Y}\min\_{\mathbf{x}\in X}F(\mathbf{x}, \mathbf{y})$$
{% endhint %}

Theorem 22 can be interpreted as a game where there is a minimizing player and a maximizing player. If the maximizer goes first, it will always produce a higher score than if the minimizer goes first (unless they are equal). We can now apply Theorem 22 to switch the $$\min$$ and $$\max$$ in our optimization with the Lagrangian.

{% hint style="info" %}

#### Theorem 23 (Weak Duality) <a href="#theorem-23" id="theorem-23"></a>

$$\min\_{\mathbf{x}\in\mathbb{R}^n}\max\_{\boldsymbol{\lambda}\geq \boldsymbol{0}, \boldsymbol{\mu}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu}) \geq \max\_{\boldsymbol{\lambda}\geq \boldsymbol{0}, \boldsymbol{\mu}} \min\_{\mathbf{x}\in\mathbb{R}^n} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$
{% endhint %}

What weak duality does is convert our minimization problem to a maximization problem.

{% hint style="info" %}

#### Definition 42

The dual function of the primal problem is given by

$$g(\boldsymbol{\lambda}, \boldsymbol{\mu}) = \min\_{\mathbf{x}} \mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$
{% endhint %}

Note that $$g$$ is a concave function because it is the pointwise minimum of functions that are affine in $$\boldsymbol{\mu}$$ and $$\boldsymbol{\lambda}$$. A maximization of a concave function over a convex set is a convex problem, so the dual problem (minimizing $$g$$) is convex. Thus duality achieves two primary purposes.

1. It removes constraints, potentially making the problem easier to solve.
2. It can turn a non-convex problems into a convex one.

Even when there are no constraints, we can sometimes introduce constraints to leverage duality by adding slack variables that are equal to expressions in the objective.

## Strong Duality

In some cases, duality gives not just a lower bound, but an exact value. When this happens, we have **Strong Duality**.

{% hint style="info" %}

#### Theorem 24 (Sion's MiniMax Theorem) <a href="#theorem-24" id="theorem-24"></a>

Let $$X\subseteq\mathbb{R}^n$$ be convex, and $$Y\subseteq\mathbb{R}^m$$ be bounded and closed (compact). Let $$F:X \times Y \to \mathbb{R}$$ be a function such that $$\forall y,\ F(\cdot, y)$$ is convex and continuous, and $$\forall x,\ F(x, \cdot)$$ is concave and continuous, then

$$\min\_{\mathbf{x}\in X}\max\_{\mathbf{y}\in Y} F(\mathbf{x}, \mathbf{y}) = \max\_{\mathbf{y}\in Y}\min\_{\mathbf{x}\in X}F(\mathbf{x}, \mathbf{y})$$
{% endhint %}

If we focus on convex problems, then we can find conditions which indicate strong duality holds.

{% hint style="info" %}

#### Theorem 25 (Slater's Condition) <a href="#theorem-25" id="theorem-25"></a>

If a convex optimization problem is strictly feasible, then strong duality holds
{% endhint %}

Once we find a solution to the dual problem, then the solution to the primal problem is recovered by minimized $$\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}^*, \boldsymbol{\mu}^*)$$ where $$\boldsymbol{\lambda}^*,\boldsymbol{\mu}^*$$ are the optimal dual variables, and if no such feasible point $$\mathbf{x}$$ exists, then the primal itself is infeasible. When searching for strong duality and an optimal solution $$(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$, it can be useful to consider particular conditions.

{% hint style="info" %}

#### Theorem 26

For a convex primal problem which is feasible and has a feasible dual where strong duality holds, a primal dual pair $$(\mathbf{x}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$ is optimal if and only if the KKT conditions are satisfied.

1. **Primal Feasibility** $$\mathbf{x}$$ satisfies $$\forall i\in\[1,m],\ f\_i(\mathbf{x}) \leq 0$$ and $$\forall k\in\[1,n],\ h\_i(\mathbf{x}) = 0$$.
2. **Dual Feasibility** $$\boldsymbol{\lambda} \geq \boldsymbol{0}$$.
3. **Complementary Slackness** $$\forall i\in\[1,m],\ \lambda\_if\_i(\mathbf{x}) = 0$$
4. **Lagrangian Stationarity** If the lagrangian is differentiable, then

$$abla\_xf\_0(\mathbf{x}) +\sum\_{i=1}^k\lambda\_i\nabla\_xf\_i(\mathbf{x}) + \sum\_{k=1}^n\mu\_ih\_k(\mathbf{x})=0$$
{% endhint %}

The complementary slackness requirement essentially says that if a primal constraint is slack ($$f\_i(\mathbf{x}) < 0)$$, then $$\lambda\_i=0$$, and if $$\lambda\_i > 0$$, then $$f\_i(\mathbf{x}) = 0$$.
