lstm

ivy.lstm(input, initial_states, all_weights, num_layers, dropout, train, bidirectional, batch_first=False, batch_sizes=None, weights_transposed=False, has_ih_bias=True, has_hh_bias=True)

Applies a multi-layer long short-term memory (LSTM) network to an input sequence.

Parameters:
  • input (Array) – input array of shape (seq_len, batch, input_size) when batch_first is False or (batch, seq_len, input_size) when batch_first is True

  • initial_states (Tuple[Array]) –

    tuple of two arrays (h_0, c_0) where h_0 is the initial hidden state of shape (num_layers * num_directions, batch, hidden_size) and c_0 is the initial cell state of shape (num_layers * num_directions, batch, hidden_size)

    (num_directions being 2 when bidirectional, otherwise 1)

  • all_weights (Tuple[Array]) –

    tuple of arrays representing the learnable weights of the lstm, with each layer having up to four arrays (w_ih, w_hh, b_ih, b_hh) representing the weights and biases (if biases are being used)

    w_ih: weight of shape (4 * hidden_size, input_size)
    w_hh: weight of shape (4 * hidden_size, hidden_size)
    b_ih: bias of shape (4 * hidden_size,)
    b_hh: bias of shape (4 * hidden_size,)

  • num_layers (int) – number of layers for the lstm to use

  • dropout (float) – dropout rate

  • train (bool) – whether to run the lstm in train mode or eval mode

  • bidirectional (bool) – whether the lstm is bidirectional or unidirectional

  • batch_first (bool, default: False) – defines the data format of the input and output arrays

  • batch_sizes (Optional[Sequence], default: None) – specifies the batch size at each timestep, when the input is a packed sequence

  • weights_transposed (bool, default: False) – whether the weights are transposed relative to the expected format, i.e. passed with shape (input_size, 4 * hidden_size) rather than (4 * hidden_size, input_size)

  • has_ih_bias (bool, default: True) – whether the all_weights argument includes an input-hidden bias

  • has_hh_bias (bool, default: True) – whether the all_weights argument includes a hidden-hidden bias
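
The shape rules above can be checked with a small bookkeeping helper before any arrays are built. The sketch below is not part of Ivy: lstm_argument_shapes is a hypothetical name, and it works on plain tuples only. It also makes two assumptions that the parameter list does not state: that a bidirectional LSTM carries one set of weights per direction in each layer, and that layers after the first take an input of size num_directions * hidden_size.

```python
def lstm_argument_shapes(seq_len, batch, input_size, hidden_size, num_layers,
                         bidirectional=False, batch_first=False,
                         weights_transposed=False,
                         has_ih_bias=True, has_hh_bias=True):
    """Hypothetical helper: expected shapes of the arguments to ivy.lstm."""
    num_directions = 2 if bidirectional else 1

    # batch_first only swaps the first two axes of the input.
    if batch_first:
        input_shape = (batch, seq_len, input_size)
    else:
        input_shape = (seq_len, batch, input_size)

    # h_0 and c_0 share the same shape.
    state_shape = (num_layers * num_directions, batch, hidden_size)

    def weight(rows, cols):
        # weights_transposed flips every 2-D weight; biases are unaffected.
        return (cols, rows) if weights_transposed else (rows, cols)

    weight_shapes = []
    for layer in range(num_layers):
        # Assumption: deeper layers consume the previous layer's output.
        layer_in = input_size if layer == 0 else num_directions * hidden_size
        # Assumption: one weight set per direction within each layer.
        for _ in range(num_directions):
            weight_shapes.append(weight(4 * hidden_size, layer_in))     # w_ih
            weight_shapes.append(weight(4 * hidden_size, hidden_size))  # w_hh
            if has_ih_bias:
                weight_shapes.append((4 * hidden_size,))                # b_ih
            if has_hh_bias:
                weight_shapes.append((4 * hidden_size,))                # b_hh

    return {
        "input": input_shape,
        "initial_states": (state_shape, state_shape),
        "all_weights": tuple(weight_shapes),
    }


example = lstm_argument_shapes(seq_len=5, batch=3, input_size=10,
                               hidden_size=8, num_layers=2)
print(example["input"])             # (5, 3, 10)
print(example["initial_states"])    # ((2, 3, 8), (2, 3, 8))
print(len(example["all_weights"]))  # 8
```

Comparing these tuples against the shapes of the real arrays is a cheap way to catch a missing bias or a transposed weight before the call is made.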

Returns:

  • output – output array of shape (seq_len, batch, num_directions * hidden_size) or (batch, seq_len, num_directions * hidden_size), depending on batch_first

  • h_outs – final hidden state of shape (num_layers * num_directions, batch, hidden_size)

  • c_outs – final cell state of shape (num_layers * num_directions, batch, hidden_size)
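
As a quick check on the return shapes listed above, the following sketch computes them from the call parameters. The function name lstm_return_shapes is hypothetical and not part of Ivy. It only restates the shape rules given in this section and does not run an LSTM.

```python
def lstm_return_shapes(seq_len, batch, hidden_size, num_layers,
                       bidirectional=False, batch_first=False):
    """Hypothetical helper: expected shapes of (output, h_outs, c_outs)."""
    num_directions = 2 if bidirectional else 1
    features = num_directions * hidden_size

    # The output follows the same axis order as the input.
    if batch_first:
        output_shape = (batch, seq_len, features)
    else:
        output_shape = (seq_len, batch, features)

    # Final hidden and cell states share one shape, independent of batch_first.
    state_shape = (num_layers * num_directions, batch, hidden_size)
    return output_shape, state_shape, state_shape


output, h_outs, c_outs = lstm_return_shapes(seq_len=5, batch=3, hidden_size=8,
                                            num_layers=2, bidirectional=True,
                                            batch_first=True)
print(output)  # (3, 5, 16)
print(h_outs)  # (4, 3, 8)
print(c_outs)  # (4, 3, 8)
```

Note that batch_first changes only the output layout; the state shapes keep the (num_layers * num_directions, batch, hidden_size) order in both cases.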