Constrained optimization

Outline

  • The following topics will be covered in this lecture:
    • Linear programming problems
    • Nonlinear programming problems

Motivation

  • In our previous discussion of optimization, we focused on optimization without constraints.

    • This arises commonly in statistical estimation problems and is actually a simpler case of the constrained optimization problem.
  • Constrained optimization also commonly arises in statistical estimation.

    • For instance, suppose that we need to estimate the variance of some distribution by an optimization routine.
    • A fully unconstrained optimization could (in principle) lead to nonsensical values for the variance.
    • The variance is by definition non-negative, so a solution giving \( \sigma^2<0 \) would cause obvious errors in our analysis.
  • In this final discussion of optimization, we will consider how constraints are introduced to our optimization framework.

  • This will lead us into two classes of constrained optimization problems:

    • Linear Programming (LP) problems; and
    • Nonlinear Programming (NLP) problems.
  • After introducing some general concepts, we will consider several techniques that can be used in the R language to handle such problems.
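
  • As a preview of the simplest case, a bound like \( \sigma^2 \geq 0 \) can already be imposed with the box-constrained method built into R's optim function; the sketch below is purely illustrative (the simulated data, starting value, and small lower bound are our own assumptions):

set.seed(1)
y <- rnorm(100, mean = 0, sd = 2)

# negative log-likelihood of a N(0, sigma2) sample, as a function of sigma2
neg_log_lik <- function(sigma2) {
  0.5 * length(y) * log(2 * pi * sigma2) + sum(y^2) / (2 * sigma2)
}

# method "L-BFGS-B" supports simple box constraints; the lower bound
# keeps the estimate of sigma2 from wandering below zero
optim(par = 1, fn = neg_log_lik, method = "L-BFGS-B", lower = 1e-8)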

Linear Programming problems

  • An LP problem is an optimization problem with:
    1. a linear objective function, and
    2. constraints in the form of linear equalities and linear inequalities.
  • When we say a linear objective function, we are referring to a linear \( f \) such that \[ \begin{align} f:\mathbb{R}^{n} &\rightarrow \mathbb{R}\\ \pmb{x}&\mapsto \pmb{a}^\top \pmb{x} \end{align} \] as this must be represented by a linear map (matrix multiplication) that sends \( \pmb{x}\in\mathbb{R}^n \) to a real value.
  • This is thus given precisely by a vector product as above for some \( \pmb{a}\in \mathbb{R}^n \) as we can treat this product as \[ \begin{align} \underbrace{\pmb{a}^\top}_{1\times n} \underbrace{\pmb{x}}_{n \times 1} = \underbrace{y}_{1\times 1}\in \mathbb{R} \end{align} \]
  • Respectively, when we say that we have linear constraints, these can describe, e.g., \[ \begin{align} x_1 + x_2 \leq 1 & & x_1 - x_2 \leq 1 & & -x_1 + x_2 \leq 1 & & -x_1 - x_2 \leq 1. \end{align} \]
  • Generally, therefore, an LP problem has a region of acceptable solutions defined by a convex polyhedron, which is a set formed by the intersection of finitely many half-spaces.
Figure: the feasible region of an LP problem, a convex polyhedron.

Courtesy of: J. Nocedal and S. Wright. Numerical optimization. Springer Science & Business Media, 2006.

  • This convex polyhedron is denoted \( \Omega \) and called the feasible region of the problem.
  • The objective of linear programming is to find a point in the feasible region where the objective function reaches a minimum or maximum value.
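
  • To make these definitions concrete, the short sketch below evaluates a linear objective and tests points for feasibility, using an illustrative choice of \( \pmb{a} \) together with the four constraints shown above:

a <- c(2, 4)                  # objective coefficients, so f(x) = a^T x
C <- rbind(c( 1,  1),         #  x1 + x2 <= 1
           c( 1, -1),         #  x1 - x2 <= 1
           c(-1,  1),         # -x1 + x2 <= 1
           c(-1, -1))         # -x1 - x2 <= 1
b <- rep(1, 4)

f <- function(x) sum(a * x)               # the linear objective
feasible <- function(x) all(C %*% x <= b) # all constraints satisfied?

f(c(0.5, 0.5))          # objective value at an interior point
feasible(c(0.5, 0.5))   # TRUE: the point lies in the polyhedron
feasible(c(2, 0))       # FALSE: violates x1 + x2 <= 1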

Linear Programming problems

  • A representative LP problem can be expressed as finding \( \pmb{x}^\ast \) such that

    \[ \begin{align} f(\pmb{x}^\ast) &= \max_{\pmb{x}\in \Omega} \pmb{a}^\top \pmb{x},\quad \text{subject to:}\\ \\ &\mathbf{C}\pmb{x} \leq \pmb{b},\\ &\pmb{x} \geq \pmb{0}, \end{align} \] where \( \pmb{b} \) is a vector of known coefficients and \( \mathbf{C} \) is a known matrix of the coefficients in the constraints.

  • Because the objective function is linear, it is both convex and concave simultaneously.

  • Therefore, as long as the constraints are consistent,

    • and provide a bounded feasible region as seen before,
  • we can obtain a global minimum or maximum for such a problem on the boundary of the feasible region (in fact, at a vertex of the polyhedron).

  • Although linearity of the objective function is a strong restriction on the problem,

    • this class of problems can represent a variety of practical optimization scenarios…

Linear Programming problems

  • For example, suppose that a farmer has a piece of farm land, say \( L \) \( \mathrm{km}^2 \), to be planted with either wheat or barley or some combination of the two.

  • The farmer has a limited amount of fertilizer, \( F \) kg, and pesticide, \( P \) kg.

  • Every square kilometer of wheat requires \( F_1 \) kilograms of fertilizer and \( P_1 \) kilograms of pesticide.

  • On the other hand, every square kilometer of barley requires \( F_2 \) kilograms of fertilizer and \( P_2 \) kilograms of pesticide.

  • Let \( S_1 \) be the selling price of wheat per square kilometer, and \( S_2 \) be the selling price of barley.

  • If we denote the area of land planted with wheat and barley by \( x_1 \) and \( x_2 \) respectively, then profit can be maximized by choosing optimal values for \( x_1 \) and \( x_2 \).

Linear Programming problems

  • The example problem can be expressed with the following linear programming problem in the standard form:

    \[ \begin{align} \text{Maximize: } S_{1}\cdot x_{1}+S_{2}\cdot x_{2} & &\text{(maximize the revenue)}\\ \text{Subject to:}\\ x_{1}+x_{2}\leq L & & \text{(limit on total area)}\\ F_{1}\cdot x_{1}+F_{2}\cdot x_{2}\leq F & & \text{(limit on fertilizer)}\\ P_{1}\cdot x_{1}+P_{2}\cdot x_{2}\leq P & & \text{(limit on pesticide)}\\ x_{1}\geq 0,x_{2}\geq 0 & & \text{(cannot plant a negative area).} \end{align} \]

  • Alternatively, in matrix form we have this written equivalently as

    \[ \begin{align} \text{Maximize: }\pmb{S}^\top \pmb{x} \\ \text{Subject to:}\\ \begin{pmatrix} 1 & 1\\ F_1 & F_2 \\ P_1 & P_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \leq \begin{pmatrix}L \\ F \\ P \end{pmatrix} \\ \begin{pmatrix}x_1 \\ x_2 \end{pmatrix} \geq \begin{pmatrix} 0 \\ 0 \end{pmatrix} \end{align} \]
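
  • As a quick illustration, this matrix form maps directly onto the GLPK solver introduced on the next slide; all numbers below (the prices, the land area \( L = 10 \), and the input budgets) are hypothetical values chosen only for the example:

require(Rglpk)

S <- c(170, 120)            # hypothetical selling prices of wheat and barley
mat <- rbind(c( 1,  1),     # land:       x1 + x2 <= L
             c(30, 20),     # fertilizer: F1 * x1 + F2 * x2 <= F
             c(20, 25))     # pesticide:  P1 * x1 + P2 * x2 <= P
Rglpk_solve_LP(obj = S, mat = mat,
               dir = rep("<=", 3), rhs = c(10, 300, 200),
               max = TRUE)  # non-negativity of x is the solver's default bound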

Linear Programming problems

  • Linear programming problems as above can be solved in R by using the R wrapper for the GNU Linear Programming Kit (GLPK).

  • This wrapper is provided by the Rglpk package, loaded below:

require(Rglpk)
  • As an example, we will show how to solve the following problem,

    \[ \begin{align} \text{Maximize: }\begin{pmatrix}2 \\ 4\end{pmatrix}^\top \begin{pmatrix}x_1 \\ x_2\end{pmatrix}\\ \text{Subject to:} & & \begin{pmatrix} 3 \\ 4\end{pmatrix}^\top \begin{pmatrix}x_1 \\ x_2\end{pmatrix} \leq 60 & & \begin{pmatrix}x_1 \\ x_2\end{pmatrix} \geq \pmb{0} \end{align} \]

  • We will use the Rglpk_solve_LP function with the following arguments:

    • obj - a numeric vector representing the objective coefficients.
    • mat - a matrix of constraint coefficients.
    • dir - a character vector with the directions of the constraints, "<", "<=", ">", ">=", or "==".
    • rhs - a numeric vector representing the right hand side of the constraints.
    • max - a logical indicating whether to maximize; the default is to minimize.

Linear Programming problems

Rglpk_solve_LP(obj = c(2, 4), 
               mat = matrix(c(3, 4), nrow = 1),
               dir ="<=",
               rhs = 60,
               max = TRUE)
$optimum
[1] 60

$solution
[1]  0 15

$status
[1] 0

$solution_dual
[1] -1  0

$auxiliary
$auxiliary$primal
[1] 60

$auxiliary$dual
[1] 1


$sensitivity_report
[1] NA
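
  • We can verify the reported solution by hand: at \( \pmb{x}^\ast = (0, 15)^\top \), \[ \begin{align} f(\pmb{x}^\ast) = 2\cdot 0 + 4\cdot 15 = 60, \quad\quad 3\cdot 0 + 4\cdot 15 = 60 \leq 60, \end{align} \] so the constraint is active at the optimum.

  • A status of 0 indicates that the solver found an optimal solution.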

Linear Programming problems

  • The geometry of the LP problem can be understood by viewing the graph of the objective function \( f \) as a plane above the \( x_1,x_2 \) plane.
  • The constraints likewise define the convex polyhedron through the intersection of the corresponding half-planes.
  • Correspondingly, the maximum is attained on the boundary of the polyhedron, where the objective plane reaches its greatest height over the feasible region.
  • This is visualized in the figure below for this problem.
Figure: intersection of the objective plane with the feasible region.

Courtesy of Härdle, W.K. et al. Basic Elements of Computational Statistics. Springer International Publishing, 2017.

Nonlinear Programming problems

  • The NLP problem has a definition analogous to that of the LP problem.

  • The difference is that the objective function or the constraints of an NLP may be nonlinear functions.

  • NLP thus has some similarity to what we saw in our discussion of unconstrained optimization, in that we will often be concerned with finding local minima or local maxima.

  • Techniques such as Newton's method and gradient descent can be adapted to handle the constraints defining the feasible region.

  • These methods tend, however, to become much more complex to develop, and we will only introduce a simple example here.

  • Particularly, we will consider the nonlinear objective function with the linear constraints below:

    \[ \begin{align} f(\pmb{x}^\ast) =& \max_{x_1 , x_2} \sqrt{5x_1} + \sqrt{3x_2} ,\\ & \text{subject to:} \quad 3x_1 + 5x_2 \leq 10,\\ & x_1 \geq 0,\\ & x_2 \geq 0. \end{align} \]

  • This type of linearly constrained, nonlinear objective function can be optimized with the constrOptim function of the stats package in R.

require(stats)

Nonlinear Programming problems

  • The constrOptim function is an extension of the optim function in R that allows for linear inequality constraints, which it enforces with an adaptive (logarithmic) barrier method; this is the origin of the barrier.value component in the output below.

  • This function is given with syntax

constrOptim(theta, f, grad, ui, ci)
  • where

    • theta – is the numerical starting value in the feasible region.
    • f – is the function to minimize.
    • grad – is the gradient of f as a function or NULL.
    • ui – is the constraint matrix (k x p).
    • ci – is the constraint vector of length k.
  • We'll start by defining the negated objective function and the constraints.

    • Note that constrOptim expects the constraints in the form ui %*% theta - ci >= 0, so the constraint \( 3x_1 + 5x_2 \leq 10 \) is rewritten as \( -3x_1 - 5x_2 \geq -10 \):

# negate the objective, since constrOptim performs minimization
f <- function(x){
  -sqrt(5 * x[1]) - sqrt(3 * x[2])
}
# the constraint 3 * x1 + 5 * x2 <= 10, flipped into the >= form above
A <- matrix(c(-3, -5), nrow = 1, ncol = 2, byrow = TRUE)
b <- c(-10)
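
  • Before calling the solver, it is worth confirming that the starting value lies strictly inside the feasible region, since constrOptim requires a feasible theta for its barrier method:

all(A %*% c(1, 1) > b)  # TRUE: -3 - 5 = -8 > -10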

Nonlinear Programming problems

  • Running the optimization:
constrOptim(f = f, theta = c(1, 1), grad = NULL, ui = A, ci = b)
$par
[1] 2.4510595 0.5293643

$value
[1] -4.760952

$counts
function gradient 
     170       NA 

$convergence
[1] 0

$message
NULL

$outer.iterations
[1] 3

$barrier.value
[1] -0.0009999994
  • Because we phrased this as the minimization of the negated function, the maximum value of the original problem is \( 4.760952 \).
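
  • We can also check this numerical solution analytically: at the maximum the budget constraint is active, and setting the gradient of the objective proportional to that of the constraint gives \[ \begin{align} \frac{1}{2}\sqrt{\frac{5}{x_1}} = 3\lambda, \quad \frac{1}{2}\sqrt{\frac{3}{x_2}} = 5\lambda, \quad 3x_1 + 5x_2 = 10, \end{align} \] which solve to \( x_1 \approx 2.45098 \) and \( x_2 \approx 0.52941 \), matching the reported par up to the solver's tolerance.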

Nonlinear Programming problems

  • The geometry of this particular NLP problem can be understood by analogy with the last example.
  • Particularly, we have the same constraints as before, but we now have a curved graph of the objective function above the \( x_1,x_2 \) plane.
  • This is visualized in the figure below for this problem.
Figure: intersection of the curved objective surface with the feasible region.

Courtesy of Härdle, W.K. et al. Basic Elements of Computational Statistics. Springer International Publishing, 2017.