Optimization

Concept

Optimization - Linear, Nonlinear, and Variational Methods

Table of Contents

  1. Linear Programming
  2. Nonlinear Optimization
  3. Convex Optimization
  4. Constrained Optimization
  5. Duality Theory
  6. Calculus of Variations
  7. Optimal Control

Linear Programming

Standard Form

Minimize: $c^T x$

Subject to: $Ax = b,\ x \geq 0$

where $x \in \mathbb{R}^n$

Feasible region: Polyhedron

Fundamental Theorem

If an LP has an optimal solution, one occurs at a vertex (extreme point) of the feasible polyhedron.

Simplex Method

  1. Start at a feasible vertex
  2. Move to an adjacent vertex with a better objective value
  3. Continue until optimal

Pivot operation: Replace one basic variable
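
As a concrete illustration, a small standard-form LP can be handed to SciPy's HiGHS-based solver (the problem data below is made up for the sketch, not from the text):

```python
# Solve min c^T x  s.t.  Ax = b, x >= 0 for a tiny made-up instance.
from scipy.optimize import linprog

c = [1.0, 2.0, 0.0]           # objective coefficients
A_eq = [[1.0, 1.0, 1.0]]      # single equality constraint x1 + x2 + x3 = 1
b_eq = [1.0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 3, method="highs")
print(res.x, res.fun)  # optimum puts all weight on the zero-cost variable
```

Here the optimum is $x = (0, 0, 1)$ with objective $0$, a vertex of the feasible polyhedron, as the fundamental theorem predicts.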

Duality

Primal: $\min\{c^T x : Ax = b,\ x \geq 0\}$

Dual: $\max\{b^T y : A^T y \leq c\}$

Weak duality: $c^T x \geq b^T y$ for feasible $x, y$

Strong duality: If a primal optimum exists, then $c^T x^* = b^T y^*$

Complementary slackness: $x_i > 0 \implies (A^T y)_i = c_i$

Interior Point Methods

Karmarkar’s algorithm: Move through interior of feasible region

Polynomial time for general LPs

Transportation Problem

Minimize shipping costs

Special structure allows simpler algorithms


Nonlinear Optimization

Unconstrained Problems

$\min f(x)$ where $f: \mathbb{R}^n \to \mathbb{R}$

Necessary condition: $\nabla f(x^*) = 0$

Sufficient conditions: $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ positive definite

Gradient Descent

$x_{k+1} = x_k - \alpha_k \nabla f(x_k)$

where $\alpha_k$ is the step size

Method: Choose search direction, line search for step
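
A minimal sketch of the iteration, using a fixed step size on an illustrative quadratic (the function and $\alpha = 0.1$ are made-up choices, not from the text):

```python
# Gradient descent on f(x, y) = (x - 1)^2 + 4(y + 2)^2, minimizer (1, -2).
def grad(p):
    x, y = p
    return (2 * (x - 1), 8 * (y + 2))

p = [0.0, 0.0]
alpha = 0.1
for _ in range(200):
    g = grad(p)
    p = [p[0] - alpha * g[0], p[1] - alpha * g[1]]
print(p)  # approaches the minimizer (1, -2)
```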

Newton’s Method

$x_{k+1} = x_k - [\nabla^2 f(x_k)]^{-1} \nabla f(x_k)$

Convergence: Quadratic near minimum

Issue: Requires Hessian computation and inversion

Modified Newton: Use approximate Hessian (BFGS, L-BFGS)
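
In one dimension the Newton step is just $x - f'(x)/f''(x)$; a toy example (the function is an illustrative choice, not from the text):

```python
import math

# Newton's method on f(x) = exp(x) - 2x, whose unique minimizer is
# x* = ln 2 (f'(x) = exp(x) - 2, f''(x) = exp(x)).
x = 0.0
for _ in range(10):
    x -= (math.exp(x) - 2) / math.exp(x)  # x_{k+1} = x_k - f'(x_k)/f''(x_k)
print(x)  # converges quadratically toward math.log(2)
```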

Quasi-Newton Methods

Approximate $\nabla^2 f$ using gradient information

BFGS update ($B_{k+1}$ from $B_k, s_k, y_k$, where $s_k = x_{k+1} - x_k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$):

$B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}$

L-BFGS: Limited memory version
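
In practice one rarely hand-rolls these updates; SciPy's L-BFGS-B implementation applied to the standard Rosenbrock test function (a conventional benchmark, not from the text) looks like:

```python
# L-BFGS on the 5-D Rosenbrock function; the minimum is at (1, ..., 1).
from scipy.optimize import minimize, rosen, rosen_der

res = minimize(rosen, [0.0] * 5, jac=rosen_der, method="L-BFGS-B")
print(res.x)  # close to the all-ones vector
```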

Line Search Methods

Goldstein conditions:

$f(x_k + \alpha p_k) \leq f(x_k) + c_1 \alpha \nabla f(x_k)^T p_k$

$f(x_k + \alpha p_k) \geq f(x_k) + c_2 \alpha \nabla f(x_k)^T p_k$

Armijo backtracking: Start with $\alpha = 1$, reduce until the sufficient-decrease condition is satisfied
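
The backtracking loop can be sketched in a few lines (constants $c_1 = 10^{-4}$ and shrink factor $0.5$ are conventional defaults, and the 1-D test function is made up):

```python
# Backtracking line search enforcing the Armijo sufficient-decrease condition.
def backtrack(f, grad_f, x, p, c1=1e-4, shrink=0.5):
    alpha = 1.0
    fx = f(x)
    slope = grad_f(x) * p  # directional derivative along p (1-D here)
    while f(x + alpha * p) > fx + c1 * alpha * slope:
        alpha *= shrink
    return alpha

f = lambda t: t * t
g = lambda t: 2 * t
alpha = backtrack(f, g, x=3.0, p=-g(3.0))  # steepest-descent direction
print(alpha)  # 0.5 for this quadratic
```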


Convex Optimization

Convex Sets

A set $C$ is convex if for all $x, y \in C$ and $\theta \in [0,1]$: $\theta x + (1-\theta)y \in C$

Examples:

  • Hyperplane: $\{x : a^T x = b\}$
  • Halfspace: $\{x : a^T x \leq b\}$
  • Norm balls
  • Intersection of convex sets

Convex Functions

$f$ is convex if for all $x, y \in \text{dom}\, f$ and $\theta \in [0,1]$: $f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)$

Characterization (twice differentiable): $\nabla^2 f(x) \succeq 0$ (positive semidefinite) for all $x$

Global Optima

For convex problem minf(x)\min f(x)subject to xCx \in C:

Local minimum $\iff$ global minimum

Examples

Least squares: $\min \|Ax - b\|^2$

Quadratic programming: $\min x^T Q x + c^T x$ ($Q \succeq 0$)

Lasso: $\min \|Ax - b\|^2 + \lambda \|x\|_1$ (convex)
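
For least squares the convex problem even has a closed form via the normal equations $A^T A x = A^T b$; a toy instance (data made up for the sketch):

```python
import numpy as np

# Solve min ||Ax - b||^2 by the normal equations.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])
x = np.linalg.solve(A.T @ A, A.T @ b)  # A^T A x = A^T b
print(x)  # [7/6, 1/2] for this data
```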


Constrained Optimization

Karush-Kuhn-Tucker Conditions

Problem: $\min f(x)$ s.t. $g_i(x) \leq 0,\ h_j(x) = 0$

KKT conditions (necessary, under a constraint qualification):

  1. Stationarity: $\nabla f + \sum \lambda_i \nabla g_i + \sum \mu_j \nabla h_j = 0$
  2. Primal feasibility: $g_i \leq 0,\ h_j = 0$
  3. Dual feasibility: $\lambda_i \geq 0$
  4. Complementary slackness: $\lambda_i g_i = 0$

If the problem is convex, the KKT conditions are also sufficient for optimality.
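
A hand-checkable example: for $\min (x-2)^2$ s.t. $x - 1 \leq 0$, the constraint is active at $x^* = 1$ with multiplier $\lambda^* = 2$ (a toy problem constructed for this sketch). Verifying all four conditions numerically:

```python
# KKT check for min (x - 2)^2  s.t.  g(x) = x - 1 <= 0.
x_star, lam = 1.0, 2.0

stationarity = 2 * (x_star - 2) + lam      # ∇f + λ ∇g  (∇g = 1)
primal_feasible = (x_star - 1) <= 0
dual_feasible = lam >= 0
comp_slack = lam * (x_star - 1)            # λ g(x*)

print(stationarity, primal_feasible, dual_feasible, comp_slack)
```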

Lagrange Multipliers

For equality constraints: $\nabla f = \sum \mu_j \nabla h_j$

Interpretation: Penalty for constraint violation

Penalty Methods

Unconstrained approximation:

$\min f(x) + \mu \sum_i \left([g_i(x)]^+\right)^2 + \mu \sum_j [h_j(x)]^2$

where $[t]^+ = \max(0, t)$

As $\mu \to \infty$, the solution approaches the constrained optimum
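
A one-line illustration (a toy problem built for this sketch): for $\min x^2$ s.t. $x = 1$, the penalized objective $x^2 + \mu(x-1)^2$ has closed-form minimizer $x = \mu/(1+\mu)$, which tends to the constrained optimum $x^* = 1$ as $\mu$ grows:

```python
# Quadratic-penalty minimizers for min x^2 s.t. x = 1, for increasing mu.
xs = []
for mu in (1.0, 10.0, 100.0, 1000.0):
    x = mu / (1 + mu)  # minimizer of x^2 + mu * (x - 1)^2
    xs.append(x)
print(xs)  # monotonically approaching 1
```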

Barrier Methods

Interior point: $\min f(x) - \mu \sum_i \log(-g_i(x))$

Requires strict feasibility: $g_i(x) < 0$

As $\mu \to 0$, the iterates approach the boundary and the constrained optimum

Augmented Lagrangian

Method: Add penalty to Lagrangian

ADMM (Alternating Direction Method of Multipliers): For separable problems


Duality Theory

Lagrangian

$L(x, \lambda, \mu) = f(x) + \sum \lambda_i g_i(x) + \sum \mu_j h_j(x)$

Dual Function

$g(\lambda, \mu) = \inf_x L(x, \lambda, \mu)$

Concave in $(\lambda, \mu)$

Dual Problem

$\max g(\lambda, \mu)$ s.t. $\lambda \geq 0$

Always a convex problem (even if the primal is not)

Weak and Strong Duality

Weak: $d^* \leq p^*$ (dual $\leq$ primal)

Strong: If Slater's condition holds, $d^* = p^*$

Saddle Point

$(x^*, \lambda^*, \mu^*)$ is a saddle point if:

$L(x^*, \lambda, \mu) \leq L(x^*, \lambda^*, \mu^*) \leq L(x, \lambda^*, \mu^*)$

for all $x$, $\lambda \geq 0$, $\mu$

Saddle point $\iff$ optimality


Calculus of Variations

Fundamental Problem

Find the function $y$ that minimizes:

$J[y] = \int_a^b F(x, y, y')\, dx$

subject to $y(a) = y_a$, $y(b) = y_b$

Euler-Lagrange Equation

Necessary condition:

$\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0$

Consequence (Beltrami identity): $\frac{d}{dx}\left(F - y' \frac{\partial F}{\partial y'}\right) = \frac{\partial F}{\partial x}$, so $F - y' \partial F/\partial y'$ is constant when $F$ has no explicit $x$-dependence
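
A standard worked example (not in the text above) is the shortest curve between two points, where $F$ depends only on $y'$:

```latex
F(x, y, y') = \sqrt{1 + y'^2}, \qquad
\frac{\partial F}{\partial y} = 0, \qquad
\frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}
\]
\[
\text{Euler--Lagrange:}\quad
\frac{d}{dx}\!\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0
\;\implies\; y' = \text{const}
\;\implies\; y(x) = c_1 x + c_2
```

so the extremal is the straight line through the two endpoints.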

Examples

Brachistochrone: Minimum-time path

Geodesics: Shortest path on a surface

Isoperimetric problem: Maximum area for given perimeter

Hamilton’s Principle

Action: $S = \int L\, dt$ where $L = T - V$

Stationary $S \implies$ Lagrange's equations

Noether’s Theorem

Continuous symmetry $\implies$ conserved quantity

Energy conservation: time-translation symmetry

Momentum conservation: space-translation symmetry


Optimal Control

Control Problem

State: $x(t)$, Control: $u(t)$

Minimize: $J = \int L(x, u, t)\, dt$

subject to $\dot{x} = f(x, u, t)$, $x(0) = x_0$, $x(T)$ free or fixed

Hamiltonian

$H(x, u, p, t) = L(x, u, t) + p^T f(x, u, t)$

where $p$ is the adjoint (costate) variable

Pontryagin Minimum Principle

Optimal control $u^*$ minimizes $H$:

$u^* = \arg\min_{u \in U} H(x^*, u, p^*, t)$

State equation: $\dot{x} = \partial H/\partial p$

Adjoint equation: $\dot{p} = -\partial H/\partial x$

Transversality conditions: $p(T)$ determined by the end constraints

Linear Quadratic Regulator (LQR)

Minimize: $J = \int [x^T Q x + u^T R u]\, dt$

subject to $\dot{x} = Ax + Bu$ (linear dynamics)

Solution: $u^* = -R^{-1} B^T P x$

where $P$ solves the algebraic Riccati equation

Closed-form solution possible
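
A sketch using SciPy's continuous-time Riccati solver on a double integrator (the $A, B, Q, R$ below are made-up illustrative choices; for this instance the known gain is $K = [1, \sqrt{3}]$):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR for a double integrator: solve the algebraic Riccati equation for P,
# then the optimal feedback is u = -K x with K = R^{-1} B^T P.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(K)  # [[1., 1.732...]] for this system
```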


Next: [[topology|Topology]] or [[abstract-algebra|Abstract Algebra]]


Last updated: Comprehensive optimization reference covering linear, nonlinear, and variational problems.