gradient_descent_update#
- ivy.gradient_descent_update(w, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#
Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws].
- Parameters:
  - w (Union[Array, NativeArray]) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The new weights, following the gradient descent updates.
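Conceptually, the update applied is the standard gradient descent step, new_w = w - lr * dcdw, with lr broadcast against the weights. The snippet below is only an illustrative sketch of that rule, not the library's internal implementation (which additionally handles stop_gradients):

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = 0.1
>>> w - lr * dcdw  # same values as ivy.gradient_descent_update(w, dcdw, lr)
ivy.array([0.95, 1.98, 2.99])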
Examples

With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3],
...                [4, 6, 1],
...                [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],
...                   [0.3, 0.6, 0.4],
...                   [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.gradient_descent_update(w, dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[ 0.95,  1.98,  2.99],
           [ 3.97,  5.94,  0.96],
           [ 0.96, -0.07,  6.98]])
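Because lr also accepts arrays, a per-element learning rate can be supplied. A hedged sketch, assuming lr broadcasts against w like any other elementwise operation and that the output is displayed with the same rounding as the examples above:

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array([0.1, 0.2, 0.3])
>>> ivy.gradient_descent_update(w, dcdw, lr)
ivy.array([0.95, 1.96, 2.97])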
>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> out = ivy.zeros_like(w)
>>> ivy.gradient_descent_update(w, dcdw, lr, out=out)
>>> print(out)
ivy.array([0.85, 1.94, 2.97])
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> w_new = ivy.gradient_descent_update(w, dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([3.33, 5.66, 1.95])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.array(0.3)
>>> w_new = ivy.gradient_descent_update(w, dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([2.88, 4.69, 1.47])
}
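The functional variant also composes naturally with a hand-written training loop. A minimal sketch, using an analytical gradient for the cost c(w) = sum(w ** 2) so that no autodiff machinery is assumed:

>>> w = ivy.array([1., -2., 3.])
>>> lr = 0.1
>>> for _ in range(3):
...     dcdw = 2 * w  # analytical gradient of sum(w ** 2)
...     w = ivy.gradient_descent_update(w, dcdw, lr)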
- Array.gradient_descent_update(self, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.gradient_descent_update. This method simply wraps the function, and so the docstring for ivy.gradient_descent_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The new weights, following the gradient descent updates.
Examples

With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3],
...                [4, 6, 1],
...                [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],
...                   [0.3, 0.6, 0.4],
...                   [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.gradient_descent_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[ 0.95,  1.98,  2.99],
           [ 3.97,  5.94,  0.96],
           [ 0.96, -0.07,  6.98]])
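As with the functional variant's out= example, the output target can be the calling array itself, so the update is written back in place. A minimal sketch, assuming out may alias the input array:

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> w.gradient_descent_update(dcdw, lr, out=w)
>>> print(w)
ivy.array([0.85, 1.94, 2.97])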
- Container.gradient_descent_update(self, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Container instance method variant of ivy.gradient_descent_update. This method simply wraps the function, and so the docstring for ivy.gradient_descent_update also applies to this method with minimal changes.
- Parameters:
  - self (Container) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray, Container]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray, Container]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - key_chains – The key-chains to apply or not apply the method to. Default is None.
  - to_apply – If True, the method will be applied to key_chains, otherwise key_chains will be skipped. Default is True.
  - prune_unapplied – Whether to prune key_chains for which the function was not applied. Default is False.
  - map_sequences – Whether to also map the method to sequences (lists, tuples). Default is False.
  - stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Container], default: None) – Optional output container, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Container
- Returns:
ret – The new weights, following the gradient descent updates.
Examples

With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> w_new = w.gradient_descent_update(dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([3.33, 5.66, 1.95])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.array(0.3)
>>> w_new = w.gradient_descent_update(dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([2.88, 4.69, 1.47])
}
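Since lr may itself be an ivy.Container, each leaf can be updated with its own learning rate. A hedged sketch, assuming lr is mapped leaf-wise alongside the weights and gradients (output not shown):

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.Container(a=ivy.array(0.1), b=ivy.array(0.3))
>>> w_new = w.gradient_descent_update(dcdw, lr)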