gradient_descent_update#

ivy.gradient_descent_update(w, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws].
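Each weight is updated by a single vanilla gradient-descent step, w_new = w - lr * (dc/dw), applied elementwise (the worked values in the examples below follow this rule).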

Parameters:
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The new weights, following the gradient descent updates.

Examples

With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3],
...                [4, 6, 1],
...                [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],
...                   [0.3, 0.6, 0.4],
...                   [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.gradient_descent_update(w, dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[ 0.95,  1.98,  2.99],
           [ 3.97,  5.94,  0.96],
           [ 0.96, -0.07,  6.98]])
>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> out = ivy.zeros_like(w)
>>> ivy.gradient_descent_update(w, dcdw, lr, out=out)
>>> print(out)
ivy.array([0.85, 1.94, 2.97])
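
The result matches a single manual descent step; a minimal sketch of the equivalent computation (assuming ivy.stop_gradient mirrors the stop_gradients=True behaviour of this function):

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> manual = ivy.stop_gradient(w - lr * dcdw)  # assumed equivalent of stop_gradients=True
>>> print(manual)
ivy.array([0.85, 1.94, 2.97])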

With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> w_new = ivy.gradient_descent_update(w, dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([3.33, 5.66, 1.95])
}

With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.array(0.3)
>>> w_new = ivy.gradient_descent_update(w, dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([2.88, 4.69, 1.47])
}
Array.gradient_descent_update(self, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#

ivy.Array instance method variant of ivy.gradient_descent_update. This method simply wraps the function, and so the docstring for ivy.gradient_descent_update also applies to this method with minimal changes.

Parameters:
  • self (Array) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The new weights, following the gradient descent updates.

Examples

With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3],
...                [4, 6, 1],
...                [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],
...                   [0.3, 0.6, 0.4],
...                   [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.gradient_descent_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[ 0.95,  1.98,  2.99],
           [ 3.97,  5.94,  0.96],
           [ 0.96, -0.07,  6.98]])
Container.gradient_descent_update(self, dcdw, lr, /, *, stop_gradients=True, out=None)[source]#

ivy.Container instance method variant of ivy.gradient_descent_update. This method simply wraps the function, and so the docstring for ivy.gradient_descent_update also applies to this method with minimal changes.

Parameters:
  • self (Container) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray, Container]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray, Container]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • key_chains – The key-chains to apply or not apply the method to. Default is None.

  • to_apply – If True, the method will be applied to key_chains, otherwise key_chains will be skipped. Default is True.

  • prune_unapplied – Whether to prune key_chains for which the function was not applied. Default is False.

  • map_sequences – Whether to also map method to sequences (lists, tuples). Default is False.

  • stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Container], default: None) – optional output container, for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Container

Returns:

ret – The new weights, following the gradient descent updates.

Examples

With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> w_new = w.gradient_descent_update(dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([3.33, 5.66, 1.95])
}

With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.array(0.3)
>>> w_new = w.gradient_descent_update(dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.85, 1.94, 2.97]),
    b: ivy.array([2.88, 4.69, 1.47])
}
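
Since lr also accepts an ivy.Container (see the parameter types above), per-leaf learning rates can be supplied; a sketch under that assumption:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]),
...                   b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]),
...                      b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.Container(a=ivy.array(0.1), b=ivy.array(0.3))  # per-leaf rates (assumed supported via the Container lr type)
>>> w_new = w.gradient_descent_update(dcdw, lr)
>>> print(w_new)
{
    a: ivy.array([0.95, 1.98, 2.99]),
    b: ivy.array([2.88, 4.69, 1.47])
}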