  • The following topics will be covered in this lecture:
    • Linear programming problems
    • Nonlinear programming problems


  • In our previous discussion of optimization, we focused on optimization without constraints.

    • This arises commonly in statistical estimation problems and is actually a simpler case of the constrained optimization problem.
  • Constrained optimization also commonly arises in statistical estimation.

    • For instance, suppose that we need to estimate the variance of some distribution by an optimization routine.
    • A fully unconstrained optimization could (in principle) lead to nonsensical values for the variance.
    • The variance is non-negative by definition, so a solution giving \( \sigma^2<0 \) would invalidate our analysis.
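  • As a minimal sketch of this situation, the code below estimates \( \sigma^2 \) by minimizing a negative log-likelihood subject to a lower bound. The simulated normal data with known mean zero, the bound `1e-8`, and the use of `optim` with `method = "L-BFGS-B"` are assumptions made for illustration and are not part of the lecture material.

```r
# Simulated data (illustrative assumption): N(0, sigma^2) with sigma^2 = 4
set.seed(1)
x <- rnorm(200, mean = 0, sd = 2)

# Negative log-likelihood as a function of sigma^2, up to an additive constant
nll <- function(s2) {
  0.5 * length(x) * log(s2) + sum(x^2) / (2 * s2)
}

# Box-constrained minimization: the lower bound keeps sigma^2 positive
fit <- optim(par = 1, fn = nll, method = "L-BFGS-B", lower = 1e-8)

sigma2_hat    <- fit$par    # constrained numerical estimate
sigma2_closed <- mean(x^2)  # closed-form maximum likelihood estimate, for comparison
```

  • The two estimates should agree closely; the bound only matters when the optimizer would otherwise step outside the admissible range.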
  • In this final discussion of optimization, we will consider how constraints are introduced to our optimization framework.

  • This will lead us into two classes of constrained optimization problems:

    • Linear Programming (LP) problems; and
    • Nonlinear Programming (NLP) problems.
  • After introducing some general concepts, we will consider several techniques that can be used in the R language to handle such problems.

Linear Programming problems

  • An LP optimization is a method to find the solution to an optimization problem with:
    1. a linear objective function, and
    2. constraints in the form of linear equalities and linear inequalities.
  • When we say a linear objective function, we are referring to a linear \( f \) such that \[ \begin{align} f:\mathbb{R}^{n} &\rightarrow \mathbb{R}\\ \pmb{x}&\mapsto \pmb{a}^\top \pmb{x} \end{align} \] as this must be represented by a linear map (matrix multiplication) that takes \( \pmb{x}\in\mathbb{R}^n \) to a real value.
  • This is thus given precisely by a vector product as above for some \( \pmb{a}\in \mathbb{R}^n \) as we can treat this product as \[ \begin{align} \underbrace{\pmb{a}^\top}_{1\times n} \underbrace{\pmb{x}}_{n \times 1} = \underbrace{y}_{1\times 1}\in \mathbb{R} \end{align} \]
  • Likewise, when we say that we have linear constraints, these can take the form, e.g., \[ \begin{align} x_1 + x_2 \leq 1 & & x_1 - x_2 \leq 1 & & -x_1 + x_2 \leq 1 & & -x_1 - x_2 \leq 1. \end{align} \]
  • Generally, therefore, an LP optimization has a region of acceptable solutions defined by a convex polyhedron, which is a set made by the intersection of finitely many half-spaces.
Feasible region.

Courtesy of: J. Nocedal and S. Wright. Numerical optimization. Springer Science & Business Media, 2006.

  • This convex polyhedron is denoted \( \Omega \) and called the feasible region of the problem.
  • The objective of linear programming is to find a point in the feasible region where the objective function reaches a minimum or maximum value.
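  • As a small sketch, the four example inequalities given earlier can be stored as a matrix inequality \( \mathbf{C}\pmb{x} \leq \pmb{b} \) and used to test whether a point lies in \( \Omega \). The helper name `is_feasible`, the test points, and the objective coefficients `a` are hypothetical and chosen only for illustration.

```r
# Constraint matrix and right-hand side for the four inequalities,
# written in the form C %*% x <= b
C <- rbind(c( 1,  1),
           c( 1, -1),
           c(-1,  1),
           c(-1, -1))
b <- rep(1, 4)

# TRUE if x satisfies every inequality, i.e. x lies in the feasible region
is_feasible <- function(x) all(C %*% x <= b)

inside  <- is_feasible(c(0.2, 0.3))  # all four inequalities hold
outside <- is_feasible(c(0.8, 0.5))  # violates x1 + x2 <= 1

# A linear objective evaluated at the feasible point (hypothetical coefficients)
a <- c(2, 1)
f_inside <- sum(a * c(0.2, 0.3))
```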


  • A representative LP problem can be expressed as finding \( \pmb{x}^\ast \) such that

    \[ \begin{align} f(\pmb{x}^\ast) &= \max_{\pmb{x}\in \Omega} \pmb{a}^\top \pmb{x},\quad \text{subject to:}\\ \\ &\mathbf{C}\pmb{x} \leq \pmb{b},\\ &\pmb{x} \geq \pmb{0}, \end{align} \] where \( \pmb{b} \) is a vector of known coefficients and \( \mathbf{C} \) is a known matrix of the coefficients in the constraints.

  • Because the objective function is linear, it is both convex and concave simultaneously.

  • Therefore, as long as the constraints are consistent and provide a bounded feasible region, as seen before, we can obtain a global minimum or maximum for such a problem on the boundary of the feasible region.
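  • The following sketch illustrates this boundary property on the region \( |x_1| + |x_2| \leq 1 \) defined by the four example inequalities from earlier. The objective coefficients `a` are hypothetical, and comparing the vertices against randomly sampled feasible points is only an illustration, not a solution method.

```r
a <- c(2, 1)  # hypothetical objective coefficients

# Vertices of the region |x1| + |x2| <= 1
vertices <- rbind(c(1, 0), c(0, 1), c(-1, 0), c(0, -1))
vertex_values <- as.vector(vertices %*% a)

best_vertex <- vertices[which.max(vertex_values), ]
best_value  <- max(vertex_values)

# Randomly sampled feasible points never exceed the best vertex value
set.seed(1)
pts <- matrix(runif(2000, -1, 1), ncol = 2)
pts <- pts[rowSums(abs(pts)) <= 1, , drop = FALSE]
sampled_max <- max(pts %*% a)
```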

  • Although requiring a linear objective function is a strong restriction on the problem, this type of problem can represent a variety of practical optimization scenarios…


  • For example, suppose that a farmer has a piece of farm land, say \( L \) \( \mathrm{km}^2 \), to be planted with either wheat or barley or some combination of the two.

  • The farmer has a limited amount of fertilizer, \( F \) kg, and pesticide, \( P \) kg.

  • Every square kilometer of wheat requires \( F_1 \) kilograms of fertilizer and \( P_1 \) kilograms of pesticide.

  • On the other hand, every square kilometer of barley requires \( F_2 \) kilograms of fertilizer and \( P_2 \) kilograms of pesticide.

  • Let \( S_1 \) be the selling price of wheat per square kilometer, and \( S_2 \) be the selling price of barley.

  • If we denote the area of land planted with wheat and barley by \( x_1 \) and \( x_2 \) respectively, then the revenue can be maximized by choosing optimal values for \( x_1 \) and \( x_2 \).


  • The example problem can be expressed with the following linear programming problem in the standard form:

    \[ \begin{align} \text{Maximize: } S_{1}\cdot x_{1}+S_{2}\cdot x_{2} & &\text{(maximize the revenue)}\\ \text{Subject to:}\\ x_{1}+x_{2}\leq L & & \text{(limit on total area)}\\ F_{1}\cdot x_{1}+F_{2}\cdot x_{2}\leq F & & \text{(limit on fertilizer)}\\ P_{1}\cdot x_{1}+P_{2}\cdot x_{2}\leq P & & \text{(limit on pesticide)}\\ x_{1}\geq 0,x_{2}\geq 0 & & \text{(cannot plant a negative area).} \end{align} \]

  • Alternatively, in matrix form we have this written equivalently as

    \[ \begin{align} \text{Maximize: }\pmb{S}^\top \pmb{x} \\ \text{Subject to:}\\ \begin{pmatrix} 1 & 1\\ F_1 & F_2 \\ P_1 & P_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \leq \begin{pmatrix}L \\ F \\ P \end{pmatrix} \\ \begin{pmatrix}x_1 \\ x_2 \end{pmatrix} \geq \begin{pmatrix} 0 \\ 0 \end{pmatrix} \end{align} \]
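  • To make the matrix form concrete, the sketch below writes it down in R for hypothetical values of \( S_1, S_2, F_1, F_2, P_1, P_2, L, F, P \) and checks two candidate planting plans. All numbers and the helper names `revenue` and `feasible` are assumptions for illustration.

```r
# Hypothetical data, chosen only for illustration
S <- c(3, 4)          # selling prices S1, S2
C <- rbind(c(1, 1),   # land:       x1 + x2         <= L
           c(2, 4),   # fertilizer: F1*x1 + F2*x2   <= F
           c(3, 1))   # pesticide:  P1*x1 + P2*x2   <= P
b <- c(10, 32, 24)    # L, F, P

revenue  <- function(x) sum(S * x)
feasible <- function(x) all(C %*% x <= b) && all(x >= 0)

plan_a <- c(4, 6)  # uses all land and fertilizer, within the pesticide limit
plan_b <- c(8, 4)  # exceeds the land limit

ok_a  <- feasible(plan_a)
rev_a <- revenue(plan_a)
ok_b  <- feasible(plan_b)
```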


  • Linear programming problems as above can be solved in R by using the R wrapper for the GNU Linear Programming Kit (GLPK).

  • This is provided by the package Rglpk, which must be installed and loaded (e.g., with library(Rglpk)) before use.

  • As an example, we will show how to solve the following problem,

    \[ \begin{align} \text{Maximize: }\begin{pmatrix}2 \\ 4\end{pmatrix}^\top \begin{pmatrix}x_1 \\ x_2\end{pmatrix}\\ \text{Subject to:} & & \begin{pmatrix} 3 \\ 4\end{pmatrix}^\top \begin{pmatrix}x_1 \\ x_2\end{pmatrix} \leq 60 & & \begin{pmatrix}x_1 \\ x_2\end{pmatrix} \geq \pmb{0} \end{align} \]

  • We will use the function Rglpk_solve_LP with the following arguments:

    • obj - a numeric vector representing the objective coefficients.
    • mat - a matrix of constraint coefficients.
    • dir - a character vector with the directions of the constraints, "<", "<=", ">", ">=", or "==".
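    • rhs - a numeric vector representing the right hand side of the constraints.
    • max - a logical value; TRUE requests maximization, while the default FALSE requests minimization.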


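# Load the R interface to GLPK
library(Rglpk)

# Maximize 2*x1 + 4*x2 subject to 3*x1 + 4*x2 <= 60;
# by default the variables are bounded below by 0
Rglpk_solve_LP(obj = c(2, 4),
               mat = matrix(c(3, 4), nrow = 1),
               dir = "<=",
               rhs = 60,
               max = TRUE)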
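$optimum
[1] 60

$solution
[1]  0 15

$status
[1] 0

$solution_dual
[1] -1  0

$auxiliary
$auxiliary$primal
[1] 60

$auxiliary$dual
[1] 1


$sensitivity_report
[1] NA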
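  • The same call extends to several constraints. As a sketch, the farmer problem from earlier is solved below for hypothetical values of the prices and resource limits (all numbers are assumptions for illustration). The call is guarded with `requireNamespace` so that the code does nothing if Rglpk is not installed.

```r
# Hypothetical farmer data (illustration only)
S <- c(3, 4)                            # selling prices S1, S2
C <- rbind(c(1, 1), c(2, 4), c(3, 1))   # land, fertilizer, pesticide rows
b <- c(10, 32, 24)                      # L, F, P

if (requireNamespace("Rglpk", quietly = TRUE)) {
  res <- Rglpk::Rglpk_solve_LP(obj = S, mat = C,
                               dir = rep("<=", 3), rhs = b,
                               max = TRUE)
  res$status    # 0 indicates that an optimal solution was found
  res$solution  # optimal areas x1, x2
  res$optimum   # optimal revenue
}
```

  • For these numbers the feasible region has vertices \( (0,0), (8,0), (7,3), (4,6), (0,8) \), so the solver should return the vertex with the largest revenue among them.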


  • The geometry of the LP problem can be understood by viewing the graph of the objective function \( f \) as a plane above the \( x_1,x_2 \) plane.
  • The constraints likewise define the convex polyhedron through the intersection of the corresponding half-spaces.
  • Correspondingly, the maximum is attained at the point of the polyhedron above which this plane is highest, which lies on the boundary of the feasible region.
  • This is visualized to the right for this problem.

Courtesy of Härdle, W.K. et al. Basic Elements of Computational Statistics. Springer International Publishing, 2017.

Nonlinear Programming problems

  • The NLP problem has a definition analogous to that of the LP problem.

  • The difference between NLP and LP is that the objective function or the constraints in an NLP can be nonlinear functions.

  • NLP is thus similar to what we saw in our discussion of unconstrained optimization, in that we will often be concerned with finding a local minimum or local maximum.

  • Techniques such as Newton's method and gradient descent can be revised to handle the constraints defining the feasible region.

  • However, these methods become much more complex to develop, and we will only introduce a simple example here.

  • Particularly, we will consider the nonlinear objective function with the linear constraints below:

    \[ \begin{align} f(\pmb{x}^\ast) =& \max_{x_1 , x_2} \sqrt{5x_1} + \sqrt{3x_2} ,\\ & \text{subject to:} \quad 3x_1 + 5x_2 \leq 10,\\ & x_1 \geq 0,\\ & x_2 \geq 0. \end{align} \]

  • This type of linearly constrained, nonlinear objective function can be optimized with the constrOptim function of the stats package in R.



  • The constrOptim function is an extension of the optim function in R that allows for a feasible region defined by linear inequality constraints.

  • This function has the syntax

constrOptim(theta, f, grad, ui, ci)
  • where

    • theta – is the numerical starting value in the feasible region.
    • f – is the function to minimize.
    • grad – is the gradient of f as a function or NULL.
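    • ui – is the constraint matrix (k x p).
    • ci – is the constraint vector of length k.
    • The feasible region is defined by ui %*% theta - ci >= 0, so a constraint of the form "less than or equal" must be multiplied by -1 before it is passed in.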
  • We'll start by defining the function and the constraints:

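# Negative of the objective, since constrOptim minimizes
f <- function(x){
  -sqrt(5 * x[1]) - sqrt(3 * x[2])
}
# Constraint 3*x1 + 5*x2 <= 10, rewritten as A %*% x >= b
A <- matrix(c(-3, -5), nrow = 1, ncol = 2, byrow = TRUE)
b <- c(-10)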
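  • Since constrOptim requires a starting value in the interior of the feasible region, it can help to check this before running the optimization. The sketch below does so for the constraint above; the names `theta0`, `slack` and `on_boundary` are hypothetical.

```r
A <- matrix(c(-3, -5), nrow = 1)  # encodes 3*x1 + 5*x2 <= 10 as A %*% x >= b
b <- c(-10)

theta0 <- c(1, 1)                 # candidate starting value

# A strictly feasible start needs A %*% theta - b > 0 in every row
slack <- as.vector(A %*% theta0 - b)
start_ok <- all(slack > 0)

# A point on the boundary, e.g. (0, 2), has zero slack and is not admissible
on_boundary <- as.vector(A %*% c(0, 2) - b)
```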


  • Running the optimization:
constrOptim(f = f, theta = c(1, 1), grad = NULL, ui = A, ci = b)
$par
[1] 2.4510595 0.5293643

$value
[1] -4.760952

$counts
function gradient 
     170       NA 

$convergence
[1] 0

$message
NULL

$outer.iterations
[1] 3

$barrier.value
[1] -0.0009999994
  • Because we phrased this as the minimization of the negative of the objective function, the maximum value of the original objective is \( 4.760952 \).
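  • As a variation, the sketch below supplies an analytical gradient and also passes the non-negativity constraints \( x_1 \geq 0 \), \( x_2 \geq 0 \) explicitly. The gradient function `g` and the three-row constraint matrix are additions for illustration and not part of the example above.

```r
f <- function(x) -sqrt(5 * x[1]) - sqrt(3 * x[2])

# Analytical gradient of f
g <- function(x) c(-sqrt(5) / (2 * sqrt(x[1])),
                   -sqrt(3) / (2 * sqrt(x[2])))

# Rows: 3*x1 + 5*x2 <= 10, x1 >= 0, x2 >= 0, all in the form ui %*% x >= ci
ui <- rbind(c(-3, -5), c(1, 0), c(0, 1))
ci <- c(-10, 0, 0)

fit <- constrOptim(theta = c(1, 1), f = f, grad = g, ui = ui, ci = ci)

x_hat <- fit$par     # approximate maximizer
f_max <- -fit$value  # approximate maximum of the original objective
```

  • Solving the first-order conditions by hand on the active constraint \( 3x_1 + 5x_2 = 10 \) gives \( x_1 = 125/51 \) and \( x_2 = 9/17 \), which the numerical result should approximate.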


  • The geometry of this particular NLP problem can be understood by analogy with the last example.
  • Particularly, the feasible region is again defined by linear constraints, but we now have a curved graph of the objective function above the \( x_1,x_2 \) plane.
  • This is visualized to the right for this problem.

Courtesy of Härdle, W.K. et al. Basic Elements of Computational Statistics. Springer International Publishing, 2017.