kl_div
- ivy.kl_div(input, target, /, *, reduction='mean', log_target=False, out=None)[source]
Compute the Kullback-Leibler divergence loss between two input tensors (conventionally, probability distributions).
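With input x given in log-space and target y given as probabilities (log_target=False), each element contributes the pointwise term

$$\ell(x, y) = y \cdot (\log y - x)$$

which matches the element-wise values shown in the reduction='none' example below. The reduction argument then takes the mean of these terms, their sum, or their sum divided by the batch size ('batchmean'). When log_target=True, the target is assumed to already be in log-space, in which case (under the usual convention, stated here as an assumption) the term becomes exp(y) · (y − x).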
- Parameters:
input (array_like) – Tensor of arbitrary shape containing log-probabilities.
target (array_like) – Tensor of the same shape as input. See log_target for the target’s interpretation.
reduction ({'mean', 'sum', 'batchmean', 'none'}, optional) – Type of reduction to apply to the output. Default is ‘mean’.
log_target (bool) – A flag indicating whether target is passed in log-space. It is recommended to pass distributions such as softmax outputs in log-space to avoid numerical issues caused by an explicit log. Default: False
- Return type:
array
- Returns:
ret (array) – The Kullback-Leibler divergence loss between the two input tensors.
Examples
>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.array([[0.6, 0.4], [0.3, 0.7]])
>>> ivy.kl_div(input, target)
ivy.array(-0.555969)

>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.array([[0.6, 0.4], [0.3, 0.7]])
>>> ivy.kl_div(input, target, reduction='sum')
ivy.array(-2.223876)

>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.array([[0.6, 0.4], [0.3, 0.7]])
>>> ivy.kl_div(input, target, reduction='batchmean')
ivy.array(-1.111938)

>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.array([[0.6, 0.4], [0.3, 0.7]])
>>> ivy.kl_div(input, target, reduction='none')
ivy.array([[-0.42649534, -0.68651628], [-0.51119184, -0.59967244]])
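The target can also be supplied in log-space via log_target=True. A minimal sketch (assuming the log-target path mirrors the probability-space computation, so this call should reproduce the first example above):

>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.log(ivy.array([[0.6, 0.4], [0.3, 0.7]]))
>>> ivy.kl_div(input, target, log_target=True)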
- Array.kl_div(self, target, /, *, reduction='mean', log_target=False, out=None)[source]
ivy.Array instance method variant of ivy.kl_div. This method simply wraps the function, and so the docstring for ivy.kl_div also applies to this method with minimal changes.
- Parameters:
self (Array) – Array containing the input probability distribution.
target (Union[Array, NativeArray]) – Array containing the target probability distribution.
reduction (Optional[str], default: 'mean') – 'none': no reduction will be applied to the output. 'mean': the output will be averaged. 'batchmean': the output will be divided by the batch size. 'sum': the output will be summed. Default: 'mean'.
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The Kullback-Leibler divergence loss between the two input arrays.
Examples
>>> input = ivy.array([[0.2, 0.8], [0.5, 0.5]])
>>> target = ivy.array([[0.6, 0.4], [0.3, 0.7]])
>>> output_array = input.kl_div(target)
>>> print(output_array)
ivy.array(0.0916)
- Container.kl_div(self, target, /, *, reduction='mean', log_target=False, key_chains=None, to_apply=True, prune_unapplied=False, map_sequences=False, out=None)[source]
ivy.Container instance method variant of ivy.kl_div. This method simply wraps the function, and so the docstring for ivy.kl_div also applies to this method with minimal changes.
- Parameters:
self (Container) – Input container containing the input distribution.
target (Union[Container, Array, NativeArray]) – Input array or container containing the target distribution.
reduction (Optional[Union[str, Container]], default: 'mean') – The reduction method. Default: 'mean'.
key_chains (Optional[Union[List[str], Dict[str, str], Container]], default: None) – The key-chains to apply or not apply the method to. Default is None.
to_apply (Union[bool, Container], default: True) – If True, the method will be applied to key_chains, otherwise key_chains will be skipped. Default is True.
prune_unapplied (Union[bool, Container], default: False) – Whether to prune key_chains for which the function was not applied. Default is False.
map_sequences (Union[bool, Container], default: False) – Whether to also map the method to sequences (lists, tuples). Default is False.
out (Optional[Container], default: None) – Optional output container, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Container
- Returns:
ret – The Kullback-Leibler divergence loss between the given distributions.
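Examples
A minimal usage sketch (the container keys a and b and their values are illustrative, not taken from the Ivy test suite):

>>> x = ivy.Container(a=ivy.array([[0.2, 0.8], [0.5, 0.5]]),
...                   b=ivy.array([[0.1, 0.9], [0.4, 0.6]]))
>>> y = ivy.Container(a=ivy.array([[0.6, 0.4], [0.3, 0.7]]),
...                   b=ivy.array([[0.5, 0.5], [0.2, 0.8]]))
>>> z = x.kl_div(y)  # a Container holding the loss computed for each leaf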