Use the left and right arrow keys to navigate the presentation forward and backward respectively. You can also use the arrows at the bottom right of the screen to navigate with a mouse.
FAIR USE ACT DISCLAIMER: This site is for educational purposes only. This website may contain copyrighted material, the use of which has not been specifically authorized by the copyright holders. The material is made available on this website as a way to advance teaching, and copyright-protected materials are used to the extent necessary to make this class function in a distance learning environment. This disclaimer is made under Section 107 of the Copyright Act of 1976, which makes allowance for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research.
Optimization problems, i.e., the maximization or minimization of functions, consist of two components:
E.g., we may wish to optimize factory output \( f(x) \) as a function of hours \( x \) in a week, with a measure of our active machine-hours \( g(x) \) not exceeding a pre-specified limitation \( g(x)\leq C \).
Optimization problems can thus be classified into two categories:
We will focus on unconstrained optimization, as often arises in maximum likelihood estimation (MLE); this is formulated as the following problem,
\[ \begin{align} f: \mathbb{R}^n &\rightarrow \mathbb{R}\\ \mathbf{x} &\mapsto f(\mathbf{x})\\ f(\mathbf{x}^\ast) &= \max_{\mathbf{x} \in \mathcal{D}} f \end{align} \]
We note that the above problem is equivalent to a minimization problem by the substitution \( \tilde{f} = -f \), i.e.,
\[ \begin{align} \tilde{f}: \mathbb{R}^n &\rightarrow \mathbb{R}\\ \mathbf{x} &\mapsto -f(\mathbf{x})\\ \tilde{f}(\mathbf{x}^\ast) &= \max_{\mathbf{x} \in \mathcal{D}} \tilde{f} = -\min_{\mathbf{x}\in \mathcal{D}} f \end{align} \]
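As a quick numerical sketch of this equivalence (using NumPy and SciPy, which are assumed tools here, not part of the slides), we can maximize a simple concave function by minimizing its negation:

```python
import numpy as np
from scipy.optimize import minimize

# Concave objective with its maximum at x = (1, 2)
def f(x):
    return -((x[0] - 1.0)**2 + (x[1] - 2.0)**2)

# Maximize f by minimizing f_tilde = -f
result = minimize(lambda x: -f(x), x0=np.zeros(2))

print(result.x)     # approximately [1., 2.]
```

The minimizer of \( \tilde{f} \) recovers the maximizer of \( f \), as the substitution predicts.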
A point \( \mathbf{x}^\ast \) is a global minimizer if \( f(\mathbf{x}^\ast) \leq f(\mathbf{x}) \) for all \( \mathbf{x} \) in the domain of consideration \( \mathcal{D}\subset \mathbb{R}^n \).
Courtesy of: J. Nocedal and S. Wright. Numerical optimization. Springer Science & Business Media, 2006.
Let \( \mathcal{N}\subset \mathcal{D} \subset\mathbb{R}^n \) be a neighborhood of the point \( \mathbf{x}^\ast \) in the domain of consideration. We say \( \mathbf{x}^\ast \) is a local minimizer in the neighborhood \( \mathcal{N} \) if \[ \begin{align} f(\mathbf{x}^\ast) \leq f(\mathbf{x}) & & \text{ for all }\mathbf{x}\in \mathcal{N} \end{align} \]
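To illustrate the distinction between local and global minimizers, consider a quartic in one variable with two local minimizers, only one of which is global (the specific function and grid search below are illustrative choices, not from the slides):

```python
import numpy as np

# f(x) = x^4 - 3x^2 + x has two local minimizers:
# one near x = +1.1 (local only) and one near x = -1.3 (global)
def f(x):
    return x**4 - 3*x**2 + x

# Brute-force grid search over the domain of consideration
xs = np.linspace(-2.5, 2.5, 100001)
x_global = xs[np.argmin(f(xs))]
print(x_global)   # near -1.30, the global minimizer
```

The point near \( x=+1.1 \) satisfies the local minimizer definition on a small neighborhood \( \mathcal{N} \), but only the point near \( x=-1.30 \) attains the minimum over the whole domain.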
For finding a local minimizer, the main tools will be derived directly from the second-order Taylor approximation of the objective function \( f \) about a point \( \mathbf{x}_0 \), defined by
\[ \begin{align} f(\mathbf{x}_1) \approx f(\mathbf{x}_0) + \left(\nabla f(\mathbf{x}_0)\right)^\mathrm{T} \boldsymbol{\delta}_{x_1}+\frac{1}{2} \boldsymbol{\delta}_{x_1}^\mathrm{T}\mathbf{H}_f (\mathbf{x}_0) \boldsymbol{\delta}_{x_1} \end{align} \]
where \( \boldsymbol{\delta}_{x_1} := \mathbf{x}_1 - \mathbf{x}_0 \) and \( \mathbf{H}_f \) denotes the Hessian of \( f \).
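As a sanity check of the expansion above (the particular quadratic objective is an illustrative choice), we can evaluate the Taylor approximation directly; for a quadratic function the second-order expansion is exact, so the two values agree to machine precision:

```python
import numpy as np

# Quadratic objective: f(x) = x1^2 + 3*x2^2 + x1*x2
def f(x):
    return x[0]**2 + 3*x[1]**2 + x[0]*x[1]

def grad_f(x):
    return np.array([2*x[0] + x[1], 6*x[1] + x[0]])

# Hessian is constant for a quadratic objective
H_f = np.array([[2.0, 1.0],
                [1.0, 6.0]])

x0 = np.array([1.0, 1.0])
delta = np.array([0.1, -0.2])          # delta_{x_1} = x_1 - x_0
x1 = x0 + delta

taylor = f(x0) + grad_f(x0) @ delta + 0.5 * delta @ H_f @ delta
print(taylor, f(x1))   # both equal 4.01
```

For a general smooth \( f \), the approximation error instead shrinks as \( \mathcal{O}(\|\boldsymbol{\delta}_{x_1}\|^3) \).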
We will consider how this relates to the notion of convexity in the following.
Courtesy of: Oleg Alexandrov. Public domain, via Wikimedia Commons.
For a function of one variable \( f(x) \), if \( f'(x^\ast)=0 \) and \( f''(x^\ast)> 0 \), then \( x^\ast \) is a local minimizer.
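These two conditions can be checked numerically with central finite differences (the helper names and step sizes below are illustrative choices, not part of the slides):

```python
# Numerically check f'(x*) = 0 and f''(x*) > 0 at a candidate point.
def f(x):
    return (x - 2.0)**2 + 1.0   # local (and global) minimizer at x* = 2

def d1(g, x, h=1e-5):
    # central first-difference approximation of g'(x)
    return (g(x + h) - g(x - h)) / (2*h)

def d2(g, x, h=1e-4):
    # central second-difference approximation of g''(x)
    return (g(x + h) - 2*g(x) + g(x - h)) / h**2

x_star = 2.0
print(abs(d1(f, x_star)) < 1e-8)   # True: first-order condition holds
print(d2(f, x_star) > 0)           # True: second-order condition holds
```

Note that \( f''(x^\ast)>0 \) together with \( f'(x^\ast)=0 \) is a sufficient condition; a minimizer such as \( f(x)=x^4 \) at \( x^\ast=0 \) has \( f''(x^\ast)=0 \) and is not detected by this test.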