Leah Pope

2022-06-28

Actually, I don't know how to distinguish maximization from minimization of a nonlinear function with the Newton-Raphson method.

What I know is that, to find the optimization points, we have to compute this iteration:

${x}_{i+1}={x}_{i}-[\mathbf{H}f({x}_{i})]^{-1}\nabla f({x}_{i})$

Then, what is actually the difference between maximization and minimization using this method?

Trey Ross

Beginner · 2022-06-29 · Added 30 answers

Newton-Raphson is based on a local quadratic approximation: each iterate moves to the stationary point of that quadratic. Whether you minimize or maximize does not depend on the iteration formula (you cannot modify it to turn minimization into maximization or vice versa) but on the shape of the approximation. The approximation is convex when $Hf({x}_{i})$ is positive semidefinite (psd) and concave when $Hf({x}_{i})$ is negative semidefinite (nsd). When $Hf({x}_{i})$ is psd, you expect to move toward a local minimum, whereas when it is nsd, you expect to move toward a local maximum.

The easiest way to see this is for functions $f:\mathbb{R}\to \mathbb{R}$, so let's take $f(x)={x}^{3}$. At $x=1$ the local quadratic approximation is $g(x)=1+3(x-1)+3(x-1{)}^{2}$, which is convex. So if you perform an iteration of Newton-Raphson, you move to the minimum of $g$, and you hope thereby to find a minimum of $f$.

On the other hand, if you start at $x=-1$, the local quadratic approximation is $g(x)=-1+3(x+1)-3(x+1{)}^{2}$, which is concave. So if you perform an iteration of Newton-Raphson, you move to the maximum of $g$, and you hope thereby to find a maximum of $f$.

If the definiteness of $Hf$ does not agree with your goal (e.g., $Hf$ is nsd but you want to minimize), then the quadratic approximation is not useful, and it may be better to switch to another method such as gradient descent.
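To make this concrete, here is a minimal sketch (not part of the original answer) of one Newton-Raphson step on $f(x)={x}^{3}$. In one dimension the iteration reduces to $x-f^{\prime}(x)/f^{\prime\prime}(x)$, and the sign of $f^{\prime\prime}({x}_{0})$ tells you whether the local quadratic model is convex (the step heads to its minimum) or concave (the step heads to its maximum):

```python
def newton_step(x, df, d2f):
    """One Newton-Raphson optimization step: x - f'(x) / f''(x)."""
    return x - df(x) / d2f(x)

df = lambda x: 3 * x**2   # f'(x) for f(x) = x^3
d2f = lambda x: 6 * x     # f''(x)

for x0 in (1.0, -1.0):
    shape = "convex (heads to a minimum)" if d2f(x0) > 0 else "concave (heads to a maximum)"
    x1 = newton_step(x0, df, d2f)
    print(f"x0 = {x0:+.1f}: quadratic model is {shape}; next iterate x1 = {x1:+.2f}")
```

Note that the same formula is applied at both starting points; only the sign of $f^{\prime\prime}$ changes the interpretation of the step.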

Yahir Crane

Beginner · 2022-06-30 · Added 9 answers

Suppose we want to find the $\hat{x}\in {\mathbb{R}}^{k}$ that maximizes the (twice continuously) differentiable function $f:{\mathbb{R}}^{k}\to \mathbb{R}$.

Well, the second-order Taylor expansion gives

$f(x+h)\approx a+{b}^{\prime}h+\frac{1}{2}{h}^{\prime}Ch$

where $a=f(x),b=\mathrm{\nabla}f(x)$ and $C={D}^{2}f(x)$.

Note that $C$ will be symmetric. Differentiating the approximation with respect to $h$ then gives

$\nabla f(x+h)\approx b+Ch.$

Thus the first-order condition for a maximum is

$0=b+C\hat{h}$

which implies that

$\hat{h}=-{C}^{-1}b$

In other words, the vector that maximizes the second-order Taylor approximation to $f$ at $x$ is

$\begin{array}{rl}x+\hat{h}& =x-{C}^{-1}b\\ & =x-[{D}^{2}f(x){]}^{-1}\nabla f(x)\end{array}$

I am sure you can relate this to your initial formula above.
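As a numerical check of this derivation (with a hypothetical 2x2 example, not taken from the answer): if $C$ is negative definite, the step $\hat{h}=-{C}^{-1}b$ zeroes the gradient $b+Ch$ of the quadratic model.

```python
def solve_2x2(C, rhs):
    """Solve C h = rhs for a 2x2 matrix C via Cramer's rule."""
    (c11, c12), (c21, c22) = C
    det = c11 * c22 - c12 * c21
    return [(rhs[0] * c22 - c12 * rhs[1]) / det,
            (c11 * rhs[1] - rhs[0] * c21) / det]

C = [[-2.0, 0.5], [0.5, -3.0]]   # symmetric, negative definite (trace < 0, det > 0)
b = [1.0, -4.0]

h_hat = solve_2x2(C, [-bi for bi in b])   # h_hat = -C^{-1} b
# gradient of the quadratic model at h_hat: b + C h_hat
grad = [b[i] + sum(C[i][j] * h_hat[j] for j in range(2)) for i in range(2)]
print(grad)  # both components are (numerically) zero
```

A 2x2 solve via Cramer's rule keeps the sketch dependency-free; in practice one would solve the linear system $C\hat{h}=-b$ with a library routine rather than forming ${C}^{-1}$.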
