Layers

class ivy.data_classes.array.layers._ArrayWithLayers

Bases: ABC

_abc_impl = <_abc._abc_data object>

conv1d(filters, strides, padding, /, *, data_format='NWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)

ivy.Array instance method variant of ivy.conv1d. This method simply wraps the function, and so the docstring for ivy.conv1d also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input image [batch_size,w,d_in] or [batch_size,d_in,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].

  • strides (Union[int, Tuple[int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str, default: 'NWC') – “NWC” or “NCW”. Defaults to “NWC”.

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.

  • x_dilations – The dilation factor for each dimension of input. (Default value = 1)

  • dilations (Union[int, Tuple[int]], default: 1) – The dilation factor for each dimension of filter. (Default value = 1)

  • bias (Optional[Array], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the convolution operation.

Examples

>>> x = ivy.array([[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]])  # NWC
>>> filters = ivy.array([[[0., 1.], [1., 1.]]])  # WIO (I == C)
>>> result = x.conv1d(filters, (1,), 'VALID')
>>> print(result)
ivy.array([[[ 2.,  3.],
...         [ 4.,  7.],
...         [ 7., 13.],
...         [11., 20.]]])
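
As an illustration only, the arithmetic behind the example above can be sketched in plain Python. The helper name `conv1d_valid_nwc` is hypothetical and not part of ivy. It assumes “NWC” data, a “channel_last” (WIO) filter, 'VALID' padding and no dilation.

```python
def conv1d_valid_nwc(x, filters, stride=1):
    """Reference 1D convolution: x is [batch][w][d_in], filters is [fw][d_in][d_out]."""
    fw, d_in, d_out = len(filters), len(filters[0]), len(filters[0][0])
    out = []
    for sample in x:
        out_w = (len(sample) - fw) // stride + 1  # 'VALID': no padding is added
        rows = []
        for ow in range(out_w):
            start = ow * stride
            # One output pixel: sum over the filter window and the input channels.
            row = [
                sum(
                    sample[start + k][c] * filters[k][c][o]
                    for k in range(fw)
                    for c in range(d_in)
                )
                for o in range(d_out)
            ]
            rows.append(row)
        out.append(rows)
    return out


x = [[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]]  # NWC, shape (1, 4, 2)
filters = [[[0., 1.], [1., 1.]]]                 # WIO, shape (1, 2, 2)
result = conv1d_valid_nwc(x, filters)
print(result)  # [[[2.0, 3.0], [4.0, 7.0], [7.0, 13.0], [11.0, 20.0]]]
```

With a filter width of 1, each output channel is a weighted sum of the two input channels at the same position, which is why the output width matches the input width.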

conv1d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NWC', dilations=1, bias=None, out=None)

ivy.Array instance method variant of ivy.conv1d_transpose. This method simply wraps the function, and so the docstring for ivy.conv1d_transpose also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input image [batch_size,w,d_in] or [batch_size,d_in,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_out,d_in].

  • strides (Union[int, Tuple[int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – either the string ‘SAME’ (padding with zeros evenly), the string ‘VALID’ (no padding), or a sequence of n (low, high) integer pairs that give the padding to apply before and after each spatial dimension.

  • output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IOW” filter layout, while “channel_last” corresponds to “WOI”.

  • data_format (str, default: 'NWC') – The ordering of the dimensions in the input, one of “NWC” or “NCW”. “NWC” corresponds to input with shape (batch_size, width, channels), while “NCW” corresponds to input with shape (batch_size, channels, width).

  • dilations (Union[int, Tuple[int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • bias (Optional[Array], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the transpose convolution operation.

Examples

>>> x = ivy.array([[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]])  # NWC
>>> filters = ivy.array([[[0., 1.], [1., 1.]]])  # WOI (I == C)
>>> result = x.conv1d_transpose(filters, (1,), 'VALID')
>>> print(result)
ivy.array([[[ 2.,  3.],
...         [ 4.,  7.],
...         [ 7., 13.],
...         [11., 20.]]])
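
The scatter view of a transpose convolution can be sketched in plain Python as follows. This is a minimal illustration, not ivy's implementation: the name `conv1d_transpose_nwc` is hypothetical, the filter is assumed to be in “WOI” layout, and the 'VALID' output width follows a TensorFlow/Keras-style convention that is assumed here rather than taken from this page.

```python
def conv1d_transpose_nwc(x, filters, stride=1):
    """Transpose convolution sketch: x is [batch][w][d_in], filters is [fw][d_out][d_in]."""
    fw, d_out, d_in = len(filters), len(filters[0]), len(filters[0][0])
    out = []
    for sample in x:
        # Assumed 'VALID' output width (TensorFlow/Keras-style convention).
        out_w = len(sample) * stride + max(fw - stride, 0)
        rows = [[0.0] * d_out for _ in range(out_w)]
        for i, pixel in enumerate(sample):
            for k in range(fw):
                for o in range(d_out):
                    # Each input position scatters into output position i * stride + k.
                    rows[i * stride + k][o] += sum(
                        pixel[c] * filters[k][o][c] for c in range(d_in)
                    )
        out.append(rows)
    return out


x = [[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]]  # NWC, shape (1, 4, 2)
filters = [[[0., 1.], [1., 1.]]]                 # WOI, shape (1, 2, 2)
result = conv1d_transpose_nwc(x, filters)
print(result)  # [[[2.0, 3.0], [4.0, 7.0], [7.0, 13.0], [11.0, 20.0]]]
```

The result matches the forward convolution on the same data only because this particular 2 x 2 channel matrix is symmetric, so swapping the input and output axes leaves it unchanged.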

conv2d(filters, strides, padding, /, *, data_format='NHWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)

ivy.Array instance method variant of ivy.conv2d. This method simply wraps the function, and so the docstring for ivy.conv2d also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input image [batch_size,h,w,d_in] or [batch_size,d_in,h,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Defaults to “NHWC”.

  • dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of filter. (Default value = 1)

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.

  • x_dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • bias (Optional[Container], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the convolution operation.

Examples

>>> x = ivy.array([[[[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]]]]) #NHWC
>>> filters = ivy.array([[[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]]]) #HWIO
>>> result = x.conv2d(filters, 1, 'SAME', data_format='NHWC',
...    dilations= 1)
>>> print(result)
ivy.array([[
          [[2.],[4.],[6.]],
          [[3.],[6.],[9.]],
          [[2.],[4.],[6.]]
          ]])
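
To see where the values in the example come from, here is a plain-Python sketch of a stride-1 'SAME' convolution. The helper `conv2d_same_nhwc` is hypothetical and assumes “NHWC” data, an “HWIO” filter with odd height and width, and zero padding.

```python
def conv2d_same_nhwc(x, filters):
    """Stride-1 'SAME' conv. x: [n][h][w][d_in]; filters: [fh][fw][d_in][d_out]."""
    fh, fw = len(filters), len(filters[0])
    d_in, d_out = len(filters[0][0]), len(filters[0][0][0])
    pad_top, pad_left = (fh - 1) // 2, (fw - 1) // 2
    out = []
    for image in x:
        h, w = len(image), len(image[0])
        rows = []
        for oh in range(h):
            row = []
            for ow in range(w):
                pixel = [0.0] * d_out
                for i in range(fh):
                    for j in range(fw):
                        ih, iw = oh + i - pad_top, ow + j - pad_left
                        # Positions outside the image count as zero padding.
                        if 0 <= ih < h and 0 <= iw < w:
                            for c in range(d_in):
                                for o in range(d_out):
                                    pixel[o] += image[ih][iw][c] * filters[i][j][c][o]
                row.append(pixel)
            rows.append(row)
        out.append(rows)
    return out


image_row = [[1.], [2.], [3.]]
x = [[image_row, image_row, image_row]]     # NHWC, shape (1, 3, 3, 1)
filter_row = [[[0.]], [[1.]], [[0.]]]
filters = [filter_row, filter_row, filter_row]  # HWIO, shape (3, 3, 1, 1)
result = conv2d_same_nhwc(x, filters)
print(result[0][0])  # [[2.0], [4.0], [6.0]]
print(result[0][1])  # [[3.0], [6.0], [9.0]]
```

The filter sums the centre column over three rows. The top and bottom output rows see only two real input rows (the third is padding), so they are 2x the input row, while the middle row is 3x.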

conv2d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NHWC', dilations=1, out=None, bias=None)

ivy.Array instance method variant of ivy.conv2d_transpose. This method simply wraps the function, and so the docstring for ivy.conv2d_transpose also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input image [batch_size,h,w,d_in] or [batch_size,d_in,h,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_out,d_in].

  • strides (Union[int, Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IOHW” filter layout, while “channel_last” corresponds to “HWOI”.

  • data_format (str, default: 'NHWC') – The ordering of the dimensions in the input, one of “NHWC” or “NCHW”. “NHWC” corresponds to inputs with shape (batch_size, height, width, channels), while “NCHW” corresponds to input with shape (batch_size, channels, height, width). Default is "NHWC".

  • dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • bias (Optional[Array], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the transpose convolution operation.

Examples

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 6, 3])
>>> y = x.conv2d_transpose(filters,2,'SAME',)
>>> print(y.shape)
(1, 56, 56, 6)

conv3d(filters, strides, padding, /, *, data_format='NDHWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)

ivy.Array instance method variant of ivy.conv3d. This method simply wraps the function, and so the docstring for ivy.conv3d also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input volume [batch_size,d,h,w,d_in] or [batch_size,d_in,d,h,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str, default: 'NDHWC') – “NDHWC” or “NCDHW”. Defaults to “NDHWC”.

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.

  • x_dilations (Union[int, Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • dilations (Union[int, Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of filter. (Default value = 1)

  • bias (Optional[Array], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the convolution operation.

Examples

>>> x = ivy.ones((1, 3, 3, 3, 1)).astype(ivy.float32)
>>> filters = ivy.ones((1, 3, 3, 1, 1)).astype(ivy.float32)
>>> result = x.conv3d(filters, 2, 'SAME')
>>> print(result)
ivy.array([[[[[4.],[4.]],[[4.],[4.]]],[[[4.],[4.]],[[4.],[4.]]]]])
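
The spatial output size for “SAME” and “VALID” padding can be estimated with a small helper. This sketch assumes the common TensorFlow-style convention. The function names `conv_output_size` and `same_padding` are hypothetical and not part of ivy.

```python
import math


def conv_output_size(in_size, filter_size, stride, padding, dilation=1):
    """Output length along one spatial dimension of a forward convolution."""
    effective = (filter_size - 1) * dilation + 1  # filter extent after dilation
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return (in_size - effective) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


def same_padding(in_size, filter_size, stride, dilation=1):
    """(before, after) zero padding implied by 'SAME' under the assumed convention."""
    effective = (filter_size - 1) * dilation + 1
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + effective - in_size, 0)
    return total // 2, total - total // 2


# The conv3d example above: a 3 x 3 x 3 volume, a 1 x 3 x 3 filter, stride 2, 'SAME'.
shape = [conv_output_size(3, f, 2, "SAME") for f in (1, 3, 3)]
print(shape)                  # [2, 2, 2]
print(same_padding(3, 3, 2))  # (1, 1)
print(same_padding(3, 1, 2))  # (0, 0)
```

Under this convention each 3 x 3 window in the example overlaps only a 2 x 2 patch of real data (the rest is padding), which is consistent with every output value being 4.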

conv3d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NDHWC', dilations=1, bias=None, out=None)

ivy.Array instance method variant of ivy.conv3d_transpose. This method simply wraps the function, and so the docstring for ivy.conv3d_transpose also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input volume [batch_size,d,h,w,d_in] or [batch_size,d_in,d,h,w].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_out,d_in].

  • strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)

  • filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IODHW” filter layout, while “channel_last” corresponds to “DHWOI”.

  • data_format (str, default: 'NDHWC') – The ordering of the dimensions in the input, one of “NDHWC” or “NCDHW”. “NDHWC” corresponds to inputs with shape (batch_size, depth, height, width, channels), while “NCDHW” corresponds to input with shape (batch_size, channels, depth, height, width).

  • dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • bias (Optional[Array], default: None) – Bias array of shape [d_out].

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the transpose convolution operation.

Examples

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 6, 3])
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)
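
When `output_shape` is left as None, the spatial size of a transpose convolution has to be derived from the input. The sketch below uses a TensorFlow/Keras-style rule, which is an assumption rather than something stated on this page. The helper name `conv_transpose_output_size` is hypothetical.

```python
def conv_transpose_output_size(in_size, filter_size, stride, padding, dilation=1):
    """Output length along one spatial dimension of a transpose convolution."""
    effective = (filter_size - 1) * dilation + 1  # filter extent after dilation
    if padding == "SAME":
        return in_size * stride
    if padding == "VALID":
        return in_size * stride + max(effective - stride, 0)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# The conv3d_transpose example above: spatial dims (3, 28, 28), 3 x 3 x 3 filter, stride 2.
spatial = [conv_transpose_output_size(s, 3, 2, "SAME") for s in (3, 28, 28)]
print(spatial)  # [6, 56, 56]
```

The channel dimension of the result comes from d_out in the filter (6 in the examples), so the full shape is (1, 6, 56, 56, 6).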

depthwise_conv2d(filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)

ivy.Array instance method variant of ivy.depthwise_conv2d. This method simply wraps the function, and so the docstring for ivy.depthwise_conv2d also applies to this method with minimal changes.

Parameters:
  • self (Array) – Input image [batch_size,h,w,d].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in]. (d_in must be the same as d from self)

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Defaults to “NHWC”.

  • dilations (Union[int, Tuple[int], Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The result of the convolution operation.

Examples

>>> x = ivy.randint(0, 255, shape=(1, 128, 128, 3)).astype(ivy.float32) / 255.0
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3])
>>> y = x.depthwise_conv2d(filters, 2, 'SAME')
>>> print(y.shape)
(1, 64, 64, 3)
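
A depthwise convolution filters each channel independently, which is why the filter has shape [fh,fw,d_in] and the channel count is unchanged. A minimal plain-Python sketch, assuming “NHWC” data and 'VALID' padding (the helper `depthwise_conv2d_valid_nhwc` is hypothetical):

```python
def depthwise_conv2d_valid_nhwc(x, filters, stride=1):
    """x: [n][h][w][d]; filters: [fh][fw][d]. Channel c only ever sees filter slice c."""
    fh, fw, d = len(filters), len(filters[0]), len(filters[0][0])
    out = []
    for image in x:
        h, w = len(image), len(image[0])
        out_h = (h - fh) // stride + 1
        out_w = (w - fw) // stride + 1
        rows = []
        for oh in range(out_h):
            row = []
            for ow in range(out_w):
                # No sum over channels: each output channel depends on one input channel.
                pixel = [
                    sum(
                        image[oh * stride + i][ow * stride + j][c] * filters[i][j][c]
                        for i in range(fh)
                        for j in range(fw)
                    )
                    for c in range(d)
                ]
                row.append(pixel)
            rows.append(row)
        out.append(rows)
    return out


# 1 x 2 x 2 image with 2 channels: channel 0 holds 1s, channel 1 holds 10s.
x = [[[[1., 10.], [1., 10.]], [[1., 10.], [1., 10.]]]]
# 2 x 2 filter: channel 0 sums its window, channel 1 averages it.
filters = [[[1., 0.25], [1., 0.25]], [[1., 0.25], [1., 0.25]]]
result = depthwise_conv2d_valid_nhwc(x, filters)
print(result)  # [[[[4.0, 10.0]]]]
```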

dropout(prob, /, *, scale=True, dtype=None, training=True, seed=None, noise_shape=None, out=None)

ivy.Array instance method variant of ivy.dropout. This method simply wraps the function, and so the docstring for ivy.dropout also applies to this method with minimal changes.

Parameters:
  • self (Array) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element, float between 0 and 1.

  • scale (bool, default: True) – Whether to scale the output by 1/(1-prob), default is True.

  • dtype (Optional[Union[Dtype, NativeDtype]], default: None) – output array data type. If dtype is None, the output array data type must be inferred from x. Default: None.

  • training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.

  • seed (Optional[int], default: None) – Set a default seed for random number generation (for reproducibility). Default is None.

  • noise_shape (Optional[Sequence[int]], default: None) – a sequence representing the shape of the binary dropout mask that will be multiplied with the input.

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – Result array of the output after dropout is performed.

Examples

With ivy.Array instances:

>>> x = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.],
...                [10., 11., 12.]])
>>> y = x.dropout(0.3)
>>> print(y)
ivy.array([[ 1.42857146,  2.85714293,  4.28571415],
           [ 5.71428585,  7.14285755,  8.5714283 ],
           [ 0.        , 11.4285717 , 12.8571434 ],
           [14.2857151 ,  0.        ,  0.        ]])
>>> x = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.],
...                [10., 11., 12.]])
>>> y = x.dropout(0.3, scale=False)
>>> print(y)
ivy.array([[ 1.,  2., 3.],
           [ 4.,  5., 0.],
           [ 7.,  0., 9.],
           [10., 11., 0.]])
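
The effect of `prob`, `scale` and `training` can be sketched in plain Python. This is an illustration under stated assumptions, not ivy's implementation: the helper works on 2D lists, requires 0 <= prob < 1, and draws its mask from Python's `random` module, so the dropped positions will differ from the outputs shown above.

```python
import random


def dropout(x, prob, scale=True, training=True, seed=None):
    """Element-wise dropout on a 2D list; assumes 0 <= prob < 1."""
    if not training:
        return [row[:] for row in x]  # dropout is a no-op outside training
    rng = random.Random(seed)
    factor = 1.0 / (1.0 - prob) if scale else 1.0
    # Each element is zeroed with probability prob; survivors are optionally rescaled.
    return [
        [0.0 if rng.random() < prob else value * factor for value in row]
        for row in x
    ]


x = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]]
y = dropout(x, 0.3, seed=0)
unscaled = dropout(x, 0.3, scale=False, seed=0)
```

With prob=0.3 and scale=True, surviving values are multiplied by 1/0.7, which is about 1.4286 and matches the scaled values in the first example output (1. becomes 1.42857146).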

dropout1d(prob, /, *, training=True, data_format='NWC', out=None)

ivy.Array instance method variant of ivy.dropout1d. This method simply wraps the function, and so the docstring for ivy.dropout1d also applies to this method with minimal changes.

Parameters:
  • self (Array) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element, float between 0 and 1.

  • training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.

  • data_format (str, default: 'NWC') – “NWC” or “NCW”. Default is "NWC".

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – Result array of the output after dropout is performed.

Examples

>>> x = ivy.array([1, 1, 1]).reshape([1, 1, 3])
>>> y = x.dropout1d(0.5)
>>> print(y)
ivy.array([[[2., 0., 2.]]])
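
The sketch below assumes channel-wise behaviour, as in the common 1D channel-dropout convention: one keep-or-drop decision per (batch, channel) pair, shared across the width axis, with kept channels scaled by 1/(1-prob). That reading is an assumption and is not stated on this page. The helper `dropout1d_nwc` is hypothetical.

```python
import random


def dropout1d_nwc(x, prob, training=True, seed=None):
    """Channel-wise dropout sketch for x of shape [batch][w][c]; assumes 0 <= prob < 1."""
    if not training:
        return [[step[:] for step in sample] for sample in x]
    rng = random.Random(seed)
    factor = 1.0 / (1.0 - prob)
    out = []
    for sample in x:
        channels = len(sample[0])
        # One decision per channel, reused at every position along the width.
        keep = [rng.random() >= prob for _ in range(channels)]
        out.append(
            [[v * factor if keep[c] else 0.0 for c, v in enumerate(step)] for step in sample]
        )
    return out


x = [[[1., 1., 1.], [1., 1., 1.]]]  # NWC, shape (1, 2, 3)
y = dropout1d_nwc(x, 0.5, seed=0)
```

Under this assumption every channel of `y` is either all 0. or all 2. along the width axis.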

dropout2d(prob, /, *, training=True, data_format='NHWC', out=None)

ivy.Array instance method variant of ivy.dropout2d. This method simply wraps the function, and so the docstring for ivy.dropout2d also applies to this method with minimal changes.

Parameters:
  • self (Array) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element, float between 0 and 1.

  • training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.

  • data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Default is "NHWC".

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – Result array of the output after dropout is performed.

Examples

>>> x = ivy.array([[1, 1, 1], [2, 2, 2]])
>>> y = x.dropout2d(0.5)
>>> print(y)
ivy.array([[0., 0., 2.],
           [4., 4., 4.]])

dropout3d(prob, /, *, training=True, data_format='NDHWC', out=None)

ivy.Array instance method variant of ivy.dropout3d. This method simply wraps the function, and so the docstring for ivy.dropout3d also applies to this method with minimal changes.

Parameters:
  • self (Array) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element, float between 0 and 1.

  • training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.

  • data_format (str, default: 'NDHWC') – “NDHWC” or “NCDHW”. Default is "NDHWC".

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – Result array of the output after dropout is performed.

linear(weight, /, *, bias=None, out=None)

ivy.Array instance method variant of ivy.linear. This method simply wraps the function, and so the docstring for ivy.linear also applies to this method with minimal changes.

Parameters:
  • self (Array) – The input array to compute linear transformation on. [outer_batch_shape,inner_batch_shape,in_features]

  • weight (Union[Array, NativeArray]) – The weight matrix. [outer_batch_shape,out_features,in_features]

  • bias (Optional[Union[Array, NativeArray]], default: None) – The bias vector, default is None. [outer_batch_shape,out_features]

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – Result array of the linear transformation. [outer_batch_shape,inner_batch_shape,out_features]

Examples

>>> x = ivy.array([[1.1, 2.2, 3.3],
...                [4.4, 5.5, 6.6],
...                [7.7, 8.8, 9.9]])
>>> w = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.]])
>>> b = ivy.array([1., 0., -1.])
>>> y = x.linear(w, bias=b)
>>> print(y)
ivy.array([[ 16.4,  35.2,  54. ],
           [ 36.2,  84.7, 133. ],
           [ 56. , 134. , 212. ]])
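
Given the shapes listed above (weight is [out_features,in_features]), the transformation amounts to multiplying by the transposed weight and adding the bias. A plain-Python sketch for the unbatched 2D case, with a hypothetical helper name:

```python
def linear(x, weight, bias=None):
    """y = x @ weight^T + bias for x: [n][in_features], weight: [out_features][in_features]."""
    out = []
    for row in x:
        # Each output feature is the dot product of the input row with one weight row.
        out_row = [sum(a * b for a, b in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            out_row = [value + b for value, b in zip(out_row, bias)]
        out.append(out_row)
    return out


x = [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9]]
w = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]
b = [1., 0., -1.]
y = linear(x, w, bias=b)
print([round(value, 1) for value in y[0]])  # [16.4, 35.2, 54.0]
```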

lstm_update(init_h, init_c, kernel, recurrent_kernel, /, *, bias=None, recurrent_bias=None)

ivy.Array instance method variant of ivy.lstm_update. This method simply wraps the function, and so the docstring for ivy.lstm_update also applies to this method with minimal changes.

Parameters:
  • self (Array) – input tensor of LSTM layer [batch_shape, t, in].

  • init_h (Union[Array, NativeArray]) – initial state tensor for the cell output [batch_shape, out].

  • init_c (Union[Array, NativeArray]) – initial state tensor for the cell hidden state [batch_shape, out].

  • kernel (Union[Array, NativeArray]) – weights for cell kernel [in, 4 x out].

  • recurrent_kernel (Union[Array, NativeArray]) – weights for cell recurrent kernel [out, 4 x out].

  • bias (Optional[Union[Array, NativeArray]], default: None) – bias for cell kernel [4 x out]. (Default value = None)

  • recurrent_bias (Optional[Union[Array, NativeArray]], default: None) – bias for cell recurrent kernel [4 x out]. (Default value = None)

Return type:

Tuple[Array, Array]

Returns:

ret – hidden state for all timesteps [batch_shape,t,out] and cell state for last timestep [batch_shape,out]

Examples

>>> x = ivy.randint(0, 20, shape=(6, 20, 3))
>>> h_i = ivy.random_normal(shape=(6, 5))
>>> c_i = ivy.random_normal(shape=(6, 5))
>>> kernel = ivy.random_normal(shape=(3, 4 * 5))
>>> rc = ivy.random_normal(shape=(5, 4 * 5))
>>> result = x.lstm_update(h_i, c_i, kernel, rc)
>>> result[0].shape
(6, 20, 5)
>>> result[1].shape
(6, 5)
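
The shapes above can be traced with a plain-Python LSTM loop. This is a sketch under assumptions: the 4 x out columns are split into gates in the order input, forget, candidate, output, which is one common layout and may not be the order ivy uses, and the function here is a standalone illustration rather than ivy's code.

```python
import math
import random


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_update(x, init_h, init_c, kernel, recurrent_kernel, bias=None, recurrent_bias=None):
    """x: [batch][t][in]; init_h, init_c: [batch][out]; kernels: [in or out][4 * out]."""
    out = len(init_h[0])
    hidden_all, cell_last = [], []
    for sample, h, c in zip(x, init_h, init_c):
        h, c = list(h), list(c)
        hidden_seq = []
        for step in sample:
            # Pre-activations for all four gates: x_t @ kernel + h @ recurrent_kernel (+ biases).
            z = [
                sum(step[i] * kernel[i][j] for i in range(len(step)))
                + sum(h[i] * recurrent_kernel[i][j] for i in range(out))
                + (bias[j] if bias is not None else 0.0)
                + (recurrent_bias[j] if recurrent_bias is not None else 0.0)
                for j in range(4 * out)
            ]
            i_gate = [_sigmoid(v) for v in z[0:out]]              # assumed gate order
            f_gate = [_sigmoid(v) for v in z[out:2 * out]]
            cand = [math.tanh(v) for v in z[2 * out:3 * out]]
            o_gate = [_sigmoid(v) for v in z[3 * out:4 * out]]
            c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
            hidden_seq.append(h)
        hidden_all.append(hidden_seq)  # hidden state at every timestep
        cell_last.append(c)            # cell state after the final timestep
    return hidden_all, cell_last


rng = random.Random(0)
batch, t, n_in, n_out = 2, 4, 3, 5
x = [[[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(t)] for _ in range(batch)]
h_i = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(batch)]
c_i = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(batch)]
kernel = [[rng.gauss(0, 1) for _ in range(4 * n_out)] for _ in range(n_in)]
rc = [[rng.gauss(0, 1) for _ in range(4 * n_out)] for _ in range(n_out)]
hidden, cell = lstm_update(x, h_i, c_i, kernel, rc)
print(len(hidden), len(hidden[0]), len(hidden[0][0]))  # 2 4 5
print(len(cell), len(cell[0]))                         # 2 5
```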

multi_head_attention(*, key=None, value=None, num_heads=8, scale=None, attention_mask=None, in_proj_weights=None, q_proj_weights=None, k_proj_weights=None, v_proj_weights=None, out_proj_weights=None, in_proj_bias=None, out_proj_bias=None, is_causal=False, key_padding_mask=None, bias_k=None, bias_v=None, static_k=None, static_v=None, add_zero_attn=False, return_attention_weights=False, average_attention_weights=True, dropout=0.0, training=False, out=None)
Return type:

Array

scaled_dot_product_attention(key, value, /, *, scale=None, mask=None, dropout_p=0.0, is_causal=False, training=False, out=None)

ivy.Array instance method variant of ivy.scaled_dot_product_attention. This method simply wraps the function, and so the docstring for ivy.scaled_dot_product_attention also applies to this method with minimal changes.

Parameters:
  • self (Array) – The queries input array, of shape [batch_shape,num_queries,feat_dim]. It should have the same size as the keys and values.

  • key (Union[Array, NativeArray]) – The keys input array, of shape [batch_shape,num_keys,feat_dim]. It should have the same size as the queries and values.

  • value (Union[Array, NativeArray]) – The values input array, of shape [batch_shape,num_keys,feat_dim]. It should have the same size as the queries and keys.

  • scale (Optional[float], default: None) – The scale float value. The scale float value is used to scale the query-key pairs before softmax.

  • mask (Optional[Union[Array, NativeArray]], default: None) – The mask input array. The mask to apply to the query-key values. Default is None. The shape of mask input should be in [batch_shape,num_queries,num_keys].

  • dropout_p (Optional[float], default: 0.0) – Specifies the dropout probability. If greater than 0.0, dropout is applied.

  • is_causal (Optional[bool], default: False) – If true, assumes causal attention masking and errors if both mask and is_causal are set.

  • training (Optional[bool], default: False) – If True, dropout is used, otherwise dropout is not activated.

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The output following application of scaled dot-product attention. The output array is the weighted sum produced by the attention score and value. The shape of the output array is [batch_shape,num_queries,feat_dim].

Examples

With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = q.scaled_dot_product_attention(k, v, scale=1, dropout_p=0.1,
...                                         is_causal=True, training=True)
>>> print(result)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = q.scaled_dot_product_attention(k, v, scale=1, mask=mask)
>>> print(result)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> q.scaled_dot_product_attention(k, v, scale=1, dropout_p=0.1,
...                                is_causal=True, training=True, out=out)
>>> print(out)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])
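
The core computation, softmax(scale * q k^T) v with an optional causal mask, can be sketched in plain Python for a single batch element. This is an illustration under assumptions: dropout and the explicit `mask` argument are left out, and the 1/sqrt(feat_dim) default for `scale` is the conventional choice rather than something stated on this page.

```python
import math


def scaled_dot_product_attention(q, k, v, scale=None, is_causal=False):
    """q: [num_queries][feat_dim]; k, v: [num_keys][feat_dim] (one batch element)."""
    if scale is None:
        scale = 1.0 / math.sqrt(len(q[0]))  # assumed conventional default
    out = []
    for qi, query in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in k]
        if is_causal:
            # Query i may only attend to keys 0..i.
            scores = [s if ki <= qi else -math.inf for ki, s in enumerate(scores)]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Output row: attention-weighted sum of the value rows.
        out.append([sum(w * val[d] for w, val in zip(weights, v)) for d in range(len(v[0]))])
    return out


q = [[0.2, 1.], [2.2, 3.], [4.4, 5.6]]
k = [[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]
v = [[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]
result = scaled_dot_product_attention(q, k, v, scale=1, is_causal=True)
print(result[0])  # [0.4, 1.3]
```

With causal masking the first query can only attend to the first key, so the first output row equals the first value row exactly.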