optimizer_update#
- ivy.optimizer_update(w, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].
- Parameters:
  - w (Union[Array, NativeArray]) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
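Conceptually, the update applied here is a plain gradient step, w_new = w - lr * effective_grad, broadcast element-wise across the weights. A minimal sketch of that arithmetic in plain Python (illustration only, not ivy's actual implementation, which operates on arrays and handles broadcasting natively):

```python
# Sketch of the element-wise update rule behind an optimizer step:
#     w_new[i] = w[i] - lr * effective_grad[i]
def gradient_step(w, effective_grad, lr):
    """Return updated weights after one plain gradient step."""
    return [wi - lr * gi for wi, gi in zip(w, effective_grad)]

w = [1.0, 2.0, 3.0]
effective_grad = [0.0, 0.0, 0.0]
lr = 3e-4

# A zero gradient leaves the weights unchanged, matching the
# first ivy.Array example below.
print(gradient_step(w, effective_grad, lr))  # [1.0, 2.0, 3.0]
```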
Examples
With ivy.Array inputs:

>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               out=None, stop_gradients=True)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([[1., 2.], [4., 5.]])
>>> out = ivy.zeros_like(w)
>>> effective_grad = ivy.array([[4., 5.], [7., 8.]])
>>> lr = ivy.array([3e-4, 1e-2])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr, out=out)
>>> print(out)
ivy.array([[0.999, 1.95],
           [4., 4.92]])
>>> w = ivy.array([1., 2., 3.])
>>> out = ivy.zeros_like(w)
>>> effective_grad = ivy.array([4., 5., 6.])
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               stop_gradients=False, out=out)
>>> print(out)
ivy.array([0.999, 2., 3.])
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.array([0., 0., 0.])
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr, out=w)
>>> print(w)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w, effective_grad, lr,
...                               stop_gradients=False)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
- Array.optimizer_update(self, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.optimizer_update. This method simply wraps the function, and so the docstring for ivy.optimizer_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
Examples
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
- Container.optimizer_update(self, effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].
- Parameters:
  - self (Container) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray, Container]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray, Container]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Container], default: None) – Optional output container, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Container
- Returns:
ret – The new function weights ws_new, following the optimizer updates.
Examples
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.array([0., 0., 0.])
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr, out=w)
>>> print(w)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]),
...                   b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]),
...                                b=ivy.array([0., 0., 0.]))
>>> lr = ivy.array([3e-4])
>>> ws_new = w.optimizer_update(effective_grad, lr, stop_gradients=False)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
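The Container variant applies the same gradient step to every leaf array under each key. A rough stand-in using a plain dict of lists (a hypothetical helper for illustration only; ivy.Container performs this mapping recursively over real arrays):

```python
# Illustrative only: mimic how a container update maps the step
#     w_new = w - lr * grad
# over each key's values. ivy.Container does this for nested arrays.
def container_update(w, effective_grad, lr):
    """Apply the gradient step to each leaf list in a flat dict."""
    return {
        key: [wi - lr * gi for wi, gi in zip(w[key], effective_grad[key])]
        for key in w
    }

w = {"a": [0.0, 1.0, 2.0], "b": [3.0, 4.0, 5.0]}
grads = {"a": [0.0, 0.0, 0.0], "b": [0.0, 0.0, 0.0]}

# Zero gradients on both keys leave every leaf unchanged, matching
# the Container examples above.
print(container_update(w, grads, 3e-4))
# {'a': [0.0, 1.0, 2.0], 'b': [3.0, 4.0, 5.0]}
```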