
3.6 Introduction to Design Optimization

3.6.2 Gradient-Based Optimization

Measurable Outcome 3.17

Most optimization algorithms are iterative. We begin with an initial guess for our design variables, denoted \(x^0\). We then iteratively update this guess until the optimal design is achieved.

\[x^q = x^{q-1} + \alpha^q S^q, \quad q = 1,2,\ldots\] (3.67)

where

\(q\) is the iteration number,

\(x^q\) is our guess for \(x\) at iteration \(q\),

\(S^q \in \mathbb{R}^n\) is the vector search direction at iteration \(q\),

\(\alpha^q\) is the scalar step length at iteration \(q\), and

\(x^0\) is the given initial guess.

At each iteration we have two decisions to make: in which direction to move (i.e., what \(S^q\) to choose) and how far to move along that direction (i.e., how large \(\alpha^q\) should be). Optimization algorithms determine the search direction \(S^q\) according to some criteria. Gradient-based algorithms use gradient information to compute the search direction.
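One common gradient-based choice is the steepest-descent direction, \(S^q = -\nabla J(x^{q-1})\), which points in the direction of locally fastest decrease of \(J\). The following is a minimal sketch of the update rule (3.67) with this choice; the quadratic test function, the fixed step length, the tolerance, and the iteration limit are all hypothetical choices made for illustration.

import numpy as np

def steepest_descent(grad_J, x0, alpha=0.1, tol=1e-6, max_iter=1000):
    """Iterate x^q = x^{q-1} + alpha * S^q with S^q = -grad J(x^{q-1})."""
    x = np.asarray(x0, dtype=float)
    for q in range(1, max_iter + 1):
        S = -grad_J(x)                 # search direction from the gradient
        if np.linalg.norm(S) < tol:    # stop once the gradient is nearly zero
            break
        x = x + alpha * S              # the update (3.67) with fixed step alpha
    return x

# Hypothetical test: J(x) = x_1^2 + 2 x_2^2, whose minimum is at the origin.
grad_J = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
print(steepest_descent(grad_J, x0=[3.0, -2.0]))  # approaches [0, 0]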

For \(J(x)\) a scalar objective function that depends on \(n\) design variables, the gradient of \(J\) with respect to \(x = [x_1 \; x_2 \; \ldots \; x_n]^T\) is a vector of length \(n\). In general, we need the gradient evaluated at some point \(x^k\):

\[\nabla J(x^k) = \begin{bmatrix} \dfrac{\partial J}{\partial x_1}(x^k) & \cdots & \dfrac{\partial J}{\partial x_n}(x^k) \end{bmatrix}\] (3.68)
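When an analytic expression for (3.68) is not available, the gradient is often approximated numerically. The sketch below uses one-sided (forward) finite differences; the step size \(h\) and the test function are hypothetical choices for illustration.

import numpy as np

def fd_gradient(J, x, h=1e-6):
    """Forward-difference approximation of the gradient of J at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    Jx = J(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += h                     # perturb the i-th design variable
        g[i] = (J(xp) - Jx) / h        # dJ/dx_i ~ (J(x + h e_i) - J(x)) / h
    return g

# Hypothetical check: J(x) = x_1^2 + 2 x_2^2 has gradient [2 x_1, 4 x_2].
J = lambda x: x[0]**2 + 2.0 * x[1]**2
print(fd_gradient(J, [3.0, -2.0]))     # approximately [6, -8]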

The second derivative of \(J\) with respect to \(x\) is a matrix, called the Hessian matrix, of dimension \(n \times n\). In general, we need the Hessian evaluated at some point \(x^k\):

\[\nabla^2 J(x^k) = \begin{bmatrix} \dfrac{\partial^2 J}{\partial x_1^2}(x^k) & \cdots & \dfrac{\partial^2 J}{\partial x_1 \partial x_n}(x^k) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 J}{\partial x_n \partial x_1}(x^k) & \cdots & \dfrac{\partial^2 J}{\partial x_n^2}(x^k) \end{bmatrix}\] (3.69)

In other words, the \((i,j)\) entry of the Hessian is given by,

\[\left[\nabla^2 J(x^k)\right]_{i,j} = \frac{\partial^2 J}{\partial x_i \partial x_j}(x^k)\] (3.70)
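For functions given in closed form, the gradient (3.68) and Hessian (3.69) can also be built symbolically. The sketch below uses SymPy; the function here is a hypothetical example chosen for illustration (deliberately different from the exercise function below).

import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
Jt = x1**2 + x1*x2 + sp.sin(x3)        # hypothetical example function
X = [x1, x2, x3]

grad = [sp.diff(Jt, xi) for xi in X]   # gradient: length-n list, per (3.68)
hess = sp.hessian(Jt, X)               # Hessian: n x n matrix, per (3.69)

print(grad)                            # [2*x1 + x2, x1, cos(x3)]
print(hess)

# Evaluate both at the point (1, 1, 1):
point = {x1: 1, x2: 1, x3: 1}
print([g.subs(point) for g in grad])
print(hess.subs(point))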

Consider the function \(J(x)=3x_1 + x_1 x_2 + x_3^{2} + 6x_2^{3} x_3\).

Exercise 1

Enter the gradient evaluated at the point \((x_1,x_2,x_3)=(1,1,1)\):

Exercise 2

Enter the last row of the Hessian evaluated at the point \((x_1,x_2,x_3)=(1,1,1)\):