lstm#
- ivy.lstm(input, initial_states, all_weights, num_layers, dropout, train, bidirectional, batch_first=False, batch_sizes=None, weights_transposed=False, has_ih_bias=True, has_hh_bias=True)[source]#
Applies a multi-layer long short-term memory (LSTM) to an input sequence.
- Parameters:
  - input (Array) – input array of shape (seq_len, batch, input_size) when batch_first is False, or (batch, seq_len, input_size) when batch_first is True
  - initial_states (Tuple[Array]) – tuple of two arrays (h_0, c_0), where h_0 is the initial hidden state of shape (num_layers * num_directions, batch, hidden_size) and c_0 is the initial cell state of shape (num_layers * num_directions, batch, hidden_size) (num_directions being 2 when bidirectional, otherwise 1)
  - all_weights (Tuple[Array]) – tuple of arrays representing the learnable weights of the LSTM, with each layer having up to four arrays (w_ih, w_hh, b_ih, b_hh) representing the weights and biases (if biases are used):
    - w_ih: weight of shape (4 * hidden_size, input_size)
    - w_hh: weight of shape (4 * hidden_size, hidden_size)
    - b_ih: bias of shape (4 * hidden_size,)
    - b_hh: bias of shape (4 * hidden_size,)
  - num_layers (int) – number of layers for the LSTM to use
  - dropout (float) – dropout rate
  - train (bool) – whether to run the LSTM in train mode or eval mode
  - bidirectional (bool) – whether the LSTM is bidirectional or unidirectional
  - batch_first (bool, default: False) – defines the data format of the input and output arrays
  - batch_sizes (Optional[Sequence], default: None) – specifies the batch size at each timestep, when the input is a packed sequence
  - weights_transposed (bool, default: False) – whether the weights are transposed relative to the expected format, i.e. (input_size, 4 * hidden_size) rather than (4 * hidden_size, input_size)
  - has_ih_bias (bool, default: True) – whether the all_weights argument includes input-hidden biases
  - has_hh_bias (bool, default: True) – whether the all_weights argument includes hidden-hidden biases
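For concreteness, the per-layer shapes above can be checked with a small bookkeeping sketch (a NumPy stand-in, not `ivy` itself; it assumes the per-layer packing order w_ih, w_hh, b_ih, b_hh, and that layers after the first consume the previous layer's hidden state as their input):

```python
import numpy as np

input_size, hidden_size, num_layers = 8, 16, 2

# Build a flat all_weights-style tuple: four arrays per layer, in order
# (w_ih, w_hh, b_ih, b_hh), matching the shapes listed in the parameters.
all_weights = []
for layer in range(num_layers):
    # Assumption: layers beyond the first take hidden states as input,
    # so their w_ih has input dimension hidden_size rather than input_size.
    in_size = input_size if layer == 0 else hidden_size
    all_weights += [
        np.zeros((4 * hidden_size, in_size)),      # w_ih
        np.zeros((4 * hidden_size, hidden_size)),  # w_hh
        np.zeros(4 * hidden_size),                 # b_ih
        np.zeros(4 * hidden_size),                 # b_hh
    ]

assert len(all_weights) == 4 * num_layers
```

With has_ih_bias or has_hh_bias set to False, the corresponding bias arrays would simply be omitted, leaving two or three arrays per layer instead of four.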
- Returns:
output – output array of shape (seq_len, batch, num_directions * hidden_size) or (batch, seq_len, num_directions * hidden_size), depending on batch_first
h_outs – final hidden state of shape (num_layers * num_directions, batch, hidden_size)
c_outs – final cell state of shape (num_layers * num_directions, batch, hidden_size)
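To make the shape contract concrete, here is a minimal single-layer, unidirectional forward pass written against the shapes documented above (a NumPy sketch, not the `ivy` implementation; the gate ordering i, f, g, o within the 4 * hidden_size dimension is an assumption, as the reference does not specify it):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, h0, c0, w_ih, w_hh, b_ih, b_hh):
    """Single-layer, unidirectional LSTM forward pass.

    x          : (seq_len, batch, input_size)   -- batch_first=False layout
    h0, c0     : (batch, hidden_size)
    w_ih       : (4 * hidden_size, input_size)
    w_hh       : (4 * hidden_size, hidden_size)
    b_ih, b_hh : (4 * hidden_size,)
    Gate order within the leading 4 * hidden_size axis is assumed (i, f, g, o).
    """
    h, c = h0, c0
    outputs = []
    for x_t in x:  # iterate over seq_len
        gates = x_t @ w_ih.T + h @ w_hh.T + b_ih + b_hh
        i, f, g, o = np.split(gates, 4, axis=-1)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g          # new cell state
        h = o * np.tanh(c)         # new hidden state
        outputs.append(h)
    # output: (seq_len, batch, hidden_size); h, c: (batch, hidden_size)
    return np.stack(outputs), h, c

rng = np.random.default_rng(0)
seq_len, batch, input_size, hidden_size = 3, 2, 4, 5
x = rng.standard_normal((seq_len, batch, input_size))
h0 = np.zeros((batch, hidden_size))
c0 = np.zeros((batch, hidden_size))
w_ih = rng.standard_normal((4 * hidden_size, input_size))
w_hh = rng.standard_normal((4 * hidden_size, hidden_size))
b_ih = rng.standard_normal(4 * hidden_size)
b_hh = rng.standard_normal(4 * hidden_size)

out, h_out, c_out = lstm_forward(x, h0, c0, w_ih, w_hh, b_ih, b_hh)
```

Here `out` has shape (seq_len, batch, hidden_size) and `h_out`, `c_out` have shape (batch, hidden_size), matching the documented returns for num_layers = 1 and num_directions = 1; the last timestep of `out` equals `h_out`.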