lars_update
- ivy.lars_update(w, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)
Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the Layerwise Adaptive Rate Scaling (LARS) method.
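For intuition, LARS rescales the learning rate layerwise by the ratio of the weight norm to the gradient norm (the "trust ratio") before taking an ordinary gradient-descent step. The sketch below is illustrative only: lars_update_sketch is a hypothetical helper, not the library implementation, and it covers only the decay_lambda=0 case with stop_gradients omitted. It reproduces the one-dimensional out= example shown in the Examples section below.

>>> import ivy
>>> def lars_update_sketch(w, dcdw, lr):
...     # Layerwise trust ratio: ||w|| / ||dc/dw||, computed over all elements.
...     w_norm = ivy.sqrt(ivy.sum(w * w))
...     grad_norm = ivy.sqrt(ivy.sum(dcdw * dcdw))
...     scaled_lr = lr * w_norm / grad_norm
...     # Plain gradient-descent step with the layerwise-scaled learning rate.
...     return w - scaled_lr * dcdw
>>> w = ivy.array([3., 1, 5])
>>> dcdw = ivy.array([0.3, 0.1, 0.2])
>>> new_w = lars_update_sketch(w, dcdw, ivy.array(0.1))
>>> # new_w reproduces (up to float precision) the documented result of
>>> # ivy.lars_update(w, dcdw, ivy.array(0.1)): ivy.array([2.52565837, 0.8418861, 4.68377209])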
- Parameters:
  - w (Union[Array, NativeArray]) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.
  - decay_lambda (float, default: 0) – The factor used for weight decay. Default is zero.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
- Returns:
ret – The new function weights ws_new, following the LARS updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([[3., 1, 5],
...                [7, 2, 9]])
>>> dcdw = ivy.array([[0.3, 0.1, 0.2],
...                   [0.1, 0.2, 0.4]])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
ivy.array([[2.34077978, 0.78025991, 4.56051969],
           [6.78026009, 1.56051981, 8.12103939]])
>>> w = ivy.array([3., 1, 5])
>>> dcdw = ivy.array([0.3, 0.1, 0.2])
>>> lr = ivy.array(0.1)
>>> out = ivy.zeros_like(dcdw)
>>> ivy.lars_update(w, dcdw, lr, out=out)
>>> print(out)
ivy.array([2.52565837, 0.8418861 , 4.68377209])
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.array([0.2, 0.4, 0.1])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([1.1324538, 2.56490755, 4.96622658])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.Container(a=ivy.array([0.2, 0.4, 0.1]),
...                      b=ivy.array([0.3, 0.1, 0.2]))
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([0.90848625, 2.93616199, 4.77232409])
}
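Note on the ivy.Container examples above: the documented outputs are consistent with the trust ratio being computed independently for each leaf of the container, so each key receives its own layerwise scaling. The quick check below is hypothetical (the names w_a, dcdw_a, scaled_lr, and new_a are illustrative, not library API) and uses the same norm-ratio scaling as the sketch near the top of this page.

>>> w_a = ivy.array([3.2, 2.6, 1.3])
>>> dcdw_a = ivy.array([0.2, 0.4, 0.1])
>>> scaled_lr = 0.1 * ivy.sqrt(ivy.sum(w_a * w_a)) / ivy.sqrt(ivy.sum(dcdw_a * dcdw_a))
>>> new_a = w_a - scaled_lr * dcdw_a
>>> # new_a reproduces (up to float precision) key a of new_weights above:
>>> # ivy.array([3.01132035, 2.22264051, 1.2056601])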
- Array.lars_update(self, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)
ivy.Array instance method variant of ivy.lars_update. This method simply wraps the function, and so the docstring for ivy.lars_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.
  - decay_lambda (float, default: 0) – The factor used for weight decay. Default is zero.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The new function weights ws_new, following the LARS updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([[3., 1, 5],
...                [7, 2, 9]])
>>> dcdw = ivy.array([[0.3, 0.1, 0.2],
...                   [0.1, 0.2, 0.4]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[2.34077978, 0.78025991, 4.56051969],
           [6.78026009, 1.56051981, 8.12103939]])
- Container.lars_update(self, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)
Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the Layerwise Adaptive Rate Scaling (LARS) method.
- Parameters:
  - self (Container) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray, Container]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray, Container]) – Learning rate, the rate at which the weights should be updated relative to the gradient.
  - decay_lambda (Union[float, Container], default: 0) – The factor used for weight decay. Default is zero.
  - stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - out (Optional[Container], default: None) – Optional output container, for writing the result to. It must have a shape that the inputs broadcast to.
- Returns:
ret – The new function weights ws_new, following the LARS updates.
Examples
With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.array([0.2, 0.4, 0.1])
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([1.1324538, 2.56490755, 4.96622658])
}
With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.Container(a=ivy.array([0.2, 0.4, 0.1]),
...                      b=ivy.array([0.3, 0.1, 0.2]))
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([0.90848625, 2.93616199, 4.77232409])
}
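As a closing usage sketch (hypothetical: grads_fn below is a stand-in for whatever computes dc/dw in a real training loop, and the step count and decay_lambda value are arbitrary), lars_update is typically applied once per training step, feeding the updated weights back in:

>>> import ivy
>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> lr = ivy.array(0.1)
>>> grads_fn = lambda weights: weights * 0.01  # placeholder gradient function for illustration
>>> for _ in range(3):
...     dcdw = grads_fn(w)
...     w = ivy.lars_update(w, dcdw, lr, decay_lambda=0.01)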