Optimization

Concept

Optimization - Linear, Nonlinear, and Variational Methods

Table of Contents

  1. Linear Programming
  2. Nonlinear Optimization
  3. Convex Optimization
  4. Constrained Optimization
  5. Duality Theory
  6. Calculus of Variations
  7. Optimal Control

Linear Programming

Standard Form

Minimize: c^T x subject to Ax = b, x \geq 0

where x \in \mathbb{R}^n

Feasible region: a polyhedron

Fundamental Theorem

If an LP has an optimal solution, then some optimal solution occurs at a vertex (extreme point) of the feasible polyhedron.

Simplex Method

  1. Start at a feasible vertex
  2. Move to an adjacent vertex with a better objective value
  3. Continue until no adjacent vertex improves the objective (optimal)

Pivot operation: Replace one basic variable
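As a numerical sketch (SciPy's linprog with the HiGHS backend rather than a hand-rolled simplex; the problem data below are made up for illustration), solving a small standard-form LP shows the optimum landing at a vertex:

```python
import numpy as np
from scipy.optimize import linprog

# Standard-form LP: minimize c^T x  s.t.  Ax = b, x >= 0.
# Assumed toy instance, not from the text.
c = np.array([-1.0, -2.0, 0.0])
A_eq = np.array([[1.0, 1.0, 1.0]])
b_eq = np.array([4.0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 3, method="highs")
print(res.x, res.fun)   # optimum sits at a vertex of the polyhedron
```

Here the minimizer puts all mass on the cheapest coordinate, a vertex of the feasible set, consistent with the fundamental theorem above.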

Duality

Primal: \min\{c^T x : Ax = b, x \geq 0\}

Dual: \max\{b^T y : A^T y \leq c\}

Weak duality: c^T x \geq b^T y for any feasible x, y

Strong duality: If the primal has an optimal solution x^*, then the dual has an optimal y^* and c^T x^* = b^T y^*

Complementary slackness: x_i > 0 \implies (A^T y^*)_i = c_i
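These relations can be checked numerically on a small assumed instance by solving the primal and the dual (posed as a minimization of -b^T y) with SciPy's linprog:

```python
import numpy as np
from scipy.optimize import linprog

# Assumed toy data: primal min{c^T x : Ax = b, x >= 0}.
c = np.array([-1.0, -2.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([4.0])

primal = linprog(c, A_eq=A, b_eq=b,
                 bounds=[(0, None)] * 3, method="highs")
# Dual max{b^T y : A^T y <= c} as a minimization of -b^T y, y free.
dual = linprog(-b, A_ub=A.T, b_ub=c,
               bounds=[(None, None)], method="highs")

print(primal.fun, -dual.fun)  # equal, by strong duality
```

The two optimal values coincide, and the positive primal coordinate corresponds to a tight dual constraint (complementary slackness).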

Interior Point Methods

Karmarkar’s algorithm: Move through interior of feasible region

Polynomial time for general LPs (unlike the simplex method, whose worst case is exponential)

Transportation Problem

Minimize shipping costs

Special structure allows simpler algorithms


Nonlinear Optimization

Unconstrained Problems

\min f(x) where f: \mathbb{R}^n \to \mathbb{R}

Necessary condition (first order): \nabla f(x^*) = 0

Sufficient conditions (second order): \nabla f(x^*) = 0 and \nabla^2 f(x^*) positive definite

Gradient Descent

x_{k+1} = x_k - \alpha_k \nabla f(x_k)

where \alpha_k is the step size

Method: choose a descent direction, then perform a line search for the step size
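A minimal sketch of the update rule with a fixed step size (the quadratic test function and its minimizer are assumed for illustration):

```python
import numpy as np

def gradient_descent(grad, x0, alpha=0.1, tol=1e-8, max_iter=10_000):
    """Iterate x_{k+1} = x_k - alpha * grad(x_k) with a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stop near a stationary point
            break
        x = x - alpha * g
    return x

# f(x) = (x1 - 3)^2 + 2*(x2 + 1)^2, minimizer (3, -1).
grad = lambda x: np.array([2 * (x[0] - 3), 4 * (x[1] + 1)])
x_star = gradient_descent(grad, [0.0, 0.0])
print(x_star)
```

With a fixed step the iterates contract linearly toward the minimizer; in practice \alpha_k is chosen by the line-search methods below.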

Newton’s Method

x_{k+1} = x_k - [\nabla^2 f(x_k)]^{-1} \nabla f(x_k)

Convergence: Quadratic near minimum

Issue: Requires Hessian computation and inversion

Modified Newton: Use approximate Hessian (BFGS, L-BFGS)
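The update above can be sketched directly, solving the Newton linear system instead of forming the inverse (the smooth strictly convex test function is an assumed example):

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton iteration: solve hess(x) d = grad(x), step x - d."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - np.linalg.solve(hess(x), g)   # solve, don't invert
    return x

# f(x) = x1^2 + e^{x1} + x2^2: strictly convex, unique minimizer.
grad = lambda x: np.array([2 * x[0] + np.exp(x[0]), 2 * x[1]])
hess = lambda x: np.array([[2 + np.exp(x[0]), 0.0], [0.0, 2.0]])
x_star = newton(grad, hess, [1.0, 1.0])
print(x_star)
```

A handful of iterations drive the gradient to machine precision, illustrating the quadratic local convergence noted above.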

Quasi-Newton Methods

Approximate \nabla^2 f using gradient information

BFGS update (B_{k+1} from B_k, s_k = x_{k+1} - x_k, y_k = \nabla f(x_{k+1}) - \nabla f(x_k)):

B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}

L-BFGS: Limited memory version
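The update formula can be verified numerically: by construction it satisfies the secant condition B_{k+1} s_k = y_k. A quick check with random (illustrative) data:

```python
import numpy as np

def bfgs_update(B, s, y):
    """Rank-two BFGS update of the Hessian approximation B."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)

rng = np.random.default_rng(0)
B = np.eye(3)
s = rng.normal(size=3)
y = s + 0.1 * rng.normal(size=3)   # keeps y^T s > 0 (curvature condition)

B_new = bfgs_update(B, s, y)
print(np.allclose(B_new @ s, y))   # secant condition holds
```

The curvature condition y_k^T s_k > 0 (guaranteed by a Wolfe-type line search) keeps the update well defined and B_{k+1} positive definite.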

Line Search Methods

Goldstein conditions (with 0 < c_1 < c_2 < 1 and \nabla f(x_k)^T p_k < 0):

f(x_k + \alpha p_k) \leq f(x_k) + c_1 \alpha \nabla f(x_k)^T p_k

f(x_k + \alpha p_k) \geq f(x_k) + c_2 \alpha \nabla f(x_k)^T p_k

Armijo backtracking: start with \alpha = 1 and shrink \alpha until the sufficient-decrease (first) condition holds
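A minimal Armijo backtracking sketch (the constants c1 and the shrink factor rho are conventional choices, assumed here; the test function is illustrative):

```python
import numpy as np

def backtracking(f, grad, x, p, alpha=1.0, c1=1e-4, rho=0.5):
    """Shrink alpha until the sufficient-decrease condition holds."""
    fx, gTp = f(x), grad(x) @ p
    while f(x + alpha * p) > fx + c1 * alpha * gTp:
        alpha *= rho
    return alpha

f = lambda x: x @ x
grad = lambda x: 2 * x
x = np.array([2.0, -1.0])
p = -grad(x)                       # steepest-descent direction
a = backtracking(f, grad, x, p)
print(a, f(x + a * p) < f(x))      # accepted step decreases f
```

Because p is a descent direction (gTp < 0), the loop terminates for any small enough alpha.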


Convex Optimization

Convex Sets

A set C is convex if for all x, y \in C and \theta \in [0,1]: \theta x + (1-\theta)y \in C

Examples:

  • Hyperplane: {x : a^T x = b}
  • Halfspace: {x : a^T x \leq b}
  • Norm balls
  • Intersection of convex sets

Convex Functions

f is convex if for all x, y \in \text{domain } f and \theta \in [0,1]: f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)

Characterization (twice differentiable): \nabla^2 f(x) \succeq 0 (positive semidefinite) for all x

Global Optima

For a convex problem \min f(x) subject to x \in C:

Every local minimum is a global minimum

Examples

Least squares: \min \|Ax - b\|^2
Quadratic programming: \min x^T Q x + c^T x with Q \succeq 0
Lasso: \min \|Ax - b\|^2 + \lambda \|x\|_1 (convex but nonsmooth)
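For the least-squares example, convexity means the stationary point given by the normal equations A^T A x = A^T b is the global minimum. A quick check with random (assumed) data, comparing against NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))   # tall, full-rank with probability one
b = rng.normal(size=20)

# Normal equations: the unique stationary point of ||Ax - b||^2.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)
# Library least-squares solve for comparison.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_ne, x_ls))
```

The two solutions agree, confirming that the first-order condition alone pins down the global optimum of this convex problem.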


Constrained Optimization

Karush-Kuhn-Tucker Conditions

Problem: \min f(x) s.t. g_i(x) \leq 0, h_j(x) = 0

KKT conditions (necessary, under a constraint qualification):

  1. Stationarity: \nabla f + \sum \lambda_i \nabla g_i + \sum \mu_j \nabla h_j = 0
  2. Primal feasibility: g_i \leq 0, h_j = 0
  3. Dual feasibility: \lambda_i \geq 0
  4. Complementary slackness: \lambda_i g_i = 0

If the problem is convex, the KKT conditions are also sufficient for optimality.
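A tiny worked instance (an assumed toy problem whose solution is found analytically) shows all four conditions holding at the optimum:

```python
import numpy as np

# Toy problem: min x1^2 + x2^2  s.t.  h(x) = x1 + x2 - 1 = 0,
#                                      g(x) = -x1 <= 0 (inactive).
# Analytic solution: x* = (0.5, 0.5), lambda* = 0, mu* = -1.
x = np.array([0.5, 0.5])
lam, mu = 0.0, -1.0

grad_f = 2 * x
grad_g = np.array([-1.0, 0.0])
grad_h = np.array([1.0, 1.0])

stationarity = grad_f + lam * grad_g + mu * grad_h
print(stationarity,          # (1) stationarity, should be ~0
      x[0] + x[1] - 1.0,     # (2) primal feasibility, h = 0
      lam >= 0,              # (3) dual feasibility
      lam * (-x[0]))         # (4) complementary slackness, = 0
```

The inequality constraint is inactive at x*, so its multiplier is zero, exactly as complementary slackness requires.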

Lagrange Multipliers

For equality constraints only: \nabla f = \sum \mu_j \nabla h_j

Interpretation: \mu_j measures the sensitivity of the optimal value to perturbing constraint h_j (a shadow price)

Penalty Methods

Unconstrained approximation:

\min f(x) + \mu \sum_i \big([g_i(x)]^+\big)^2 + \mu \sum_j h_j(x)^2

where [t]^+ = \max(0, t)

As \mu \to \infty, the solution approaches the constrained optimum
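A sketch of the penalty strategy on an assumed toy problem, reusing each solution to warm-start the next, larger \mu (the data and the use of SciPy's BFGS solver are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: min (x1-2)^2 + (x2-2)^2  s.t.  x1 + x2 = 1.
# Analytic constrained optimum: (0.5, 0.5).
def penalized(x, mu):
    return (x[0] - 2)**2 + (x[1] - 2)**2 + mu * (x[0] + x[1] - 1)**2

x = np.zeros(2)
for mu in [1.0, 10.0, 100.0, 1000.0]:
    # Each unconstrained solve warm-starts the next.
    x = minimize(penalized, x, args=(mu,), method="BFGS").x
print(x)   # approaches (0.5, 0.5) as mu grows
```

Each finite \mu gives a slightly infeasible solution; the iterates converge to the constrained optimum only in the limit, which is why augmented Lagrangian methods (below) are often preferred.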

Barrier Methods

Interior point: \min f(x) - \mu \sum \log(-g_i(x))

Iterates remain strictly feasible: g_i(x) < 0

As \mu \to 0, the solution approaches the constrained optimum on the boundary

Augmented Lagrangian

Method: Add penalty to Lagrangian

ADMM (Alternating Direction Method of Multipliers): For separable problems


Duality Theory

Lagrangian

L(x, \lambda, \mu) = f(x) + \sum \lambda_i g_i(x) + \sum \mu_j h_j(x)

Dual Function

g(\lambda, \mu) = \inf_x L(x, \lambda, \mu)

Concave in (\lambda, \mu), as a pointwise infimum of affine functions

Dual Problem

\max g(\lambda, \mu) s.t. \lambda \geq 0

Always a convex problem (maximizing a concave function), even if the primal is not convex

Weak and Strong Duality

Weak: d^* \leq p^* (dual optimum \leq primal optimum)

Strong: if the primal is convex and Slater's condition holds, then d^* = p^*

Saddle Point

(x^*, \lambda^*, \mu^*) is a saddle point if:

L(x^*, \lambda, \mu) \leq L(x^*, \lambda^*, \mu^*) \leq L(x, \lambda^*, \mu^*)

for all x, \lambda \geq 0, \mu

Saddle point \iff optimality with zero duality gap


Calculus of Variations

Fundamental Problem

Find function yy that minimizes:

J[y] = \int_a^b F(x, y, y')\, dx

subject to y(a) = y_a, y(b) = y_b

Euler-Lagrange Equation

Necessary condition:

\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0

Beltrami identity: \frac{d}{dx}\left(F - y' \frac{\partial F}{\partial y'}\right) = \frac{\partial F}{\partial x}; if F has no explicit x-dependence, F - y' \partial F/\partial y' is constant along extremals
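As a worked example, take the shortest-path problem: minimizing arc length means F(x, y, y') = \sqrt{1 + y'^2}. Since \partial F/\partial y = 0, the Euler-Lagrange equation reduces to

\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0

so y' is constant and the extremal joining (a, y_a) to (b, y_b) is a straight line.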

Examples

Brachistochrone: minimum-time descent path
Geodesics: shortest path on a surface
Isoperimetric problem: maximum area for a given perimeter

Hamilton’s Principle

Action: S = \int L\, dt where L = T - V

Stationary action (\delta S = 0) \implies Lagrange's equations

Noether’s Theorem

Continuous symmetry \implies conserved quantity

Energy conservation: time-translation symmetry
Momentum conservation: space-translation symmetry


Optimal Control

Control Problem

State: x(t)
Control: u(t)

Minimize: J = \int L(x, u, t)\, dt

subject to \dot{x} = f(x, u, t), x(0) = x_0, x(T) free or fixed

Hamiltonian

H(x, u, p, t) = L(x, u, t) + p^T f(x, u, t)

where p is the adjoint (costate) variable

Pontryagin Minimum Principle

The optimal control u^* minimizes H along the optimal trajectory:

u^*(t) = \arg\min_{u \in U} H(x^*(t), u, p^*(t), t)

State equation: \dot{x} = \partial H/\partial p
Adjoint equation: \dot{p} = -\partial H/\partial x
Transversality conditions: p(T) determined by the terminal constraints

Linear Quadratic Regulator (LQR)

Minimize: J = \int [x^T Q x + u^T R u]\, dt

\dot{x} = Ax + Bu (linear dynamics)

Solution: u^* = -R^{-1} B^T P x

where P solves the algebraic Riccati equation

Closed-form solution possible
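A sketch using SciPy's Riccati solver on an assumed double-integrator model (the specific A, B, Q, R are illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator: xdot = Ax + Bu (assumed example dynamics).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)    # algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)         # u* = -Kx = -R^{-1} B^T P x

# Closed-loop matrix A - BK should have eigenvalues in the left
# half-plane, i.e. the regulated system is stable.
eigs = np.linalg.eigvals(A - B @ K)
print(eigs)
```

The Riccati solve happens once, offline; the resulting constant gain K gives the linear state-feedback law for all time.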


Next: Topology or Abstract Algebra


Comprehensive optimization reference covering linear, nonlinear, and variational problems.