optimizer_update#
- ivy.optimizer_update(w, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].
- Parameters:
  - w (Union[Array, NativeArray]) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
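Conceptually, the update applied here is a plain gradient step, w_new = w - lr * effective_grad, broadcast element-wise across the weights. A minimal sketch of that arithmetic in plain Python (illustration only, not ivy's actual implementation, which operates on arrays and handles broadcasting natively):

```python
# Sketch of the element-wise update rule behind an optimizer step:
#     w_new[i] = w[i] - lr * effective_grad[i]
def gradient_step(w, effective_grad, lr):
    """Return updated weights after one plain gradient step."""
    return [wi - lr * gi for wi, gi in zip(w, effective_grad)]

w = [1.0, 2.0, 3.0]
effective_grad = [0.0, 0.0, 0.0]
lr = 3e-4

# A zero gradient leaves the weights unchanged, matching the
# first ivy.Array example below.
print(gradient_step(w, effective_grad, lr))  # [1.0, 2.0, 3.0]
```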
Examples
With ivy.Array inputs:

>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               out=None, stop_gradients=True)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([[1., 2.], [4., 5.]])
>>> out = ivy.zeros_like(w)
>>> effective_grad = ivy.array([[4., 5.], [7., 8.]])
>>> lr = ivy.array([3e-4, 1e-2])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr, out=out)
>>> print(out)
ivy.array([[0.999, 1.95],
           [4., 4.92]])
>>> w = ivy.array([1., 2., 3.])
>>> out = ivy.zeros_like(w)
>>> effective_grad = ivy.array([4., 5., 6.])
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               stop_gradients=False, out=out)
>>> print(out)
ivy.array([0.999, 2., 3.])
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.array([0., 0., 0.])
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr, out=w)
>>> print(w)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               stop_gradients=False)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
- Array.optimizer_update(self, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.optimizer_update. This method simply wraps the function, and so the docstring for ivy.optimizer_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
Examples
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
- Container.optimizer_update(self, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].
- Parameters:
  - self (Container) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray, Container]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray, Container]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Container], default: None) – Optional output container, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Container
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
Examples
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.array([0., 0., 0.])
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr, out=w)
>>> print(w)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = ivy.array([3e-4])
>>> ws_new = w.optimizer_update(effective_grad, lr, stop_gradients=False)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
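The Container variant applies the same gradient step to every leaf array under each key. A rough stand-in using a plain dict of lists (a hypothetical helper for illustration only; ivy.Container performs this mapping recursively over real arrays):

```python
# Illustrative only: mimic how a container update maps the step
#     w_new = w - lr * grad
# over each key's values. ivy.Container does this for nested arrays.
def container_update(w, effective_grad, lr):
    """Apply the gradient step to each leaf list in a flat dict."""
    return {
        key: [wi - lr * gi for wi, gi in zip(w[key], effective_grad[key])]
        for key in w
    }

w = {"a": [0.0, 1.0, 2.0], "b": [3.0, 4.0, 5.0]}
grads = {"a": [0.0, 0.0, 0.0], "b": [0.0, 0.0, 0.0]}

# Zero gradients on both keys leave every leaf unchanged, matching
# the Container examples above.
print(container_update(w, grads, 3e-4))
# {'a': [0.0, 1.0, 2.0], 'b': [3.0, 4.0, 5.0]}
```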