Optimizers#
Collection of Ivy optimizers.
- class ivy.stateful.optimizers.Adam(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]#
Bases: Optimizer
- __init__(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]#
Construct an ADAM optimizer.
- Parameters:
  - lr (float) – Learning rate. Default is 1e-4.
  - beta1 (float) – Gradient forgetting factor. Default is 0.9.
  - beta2 (float) – Second moment of gradient forgetting factor. Default is 0.999.
  - epsilon (float) – Divisor during the Adam update, preventing division by zero. Default is 1e-07.
  - inplace (bool) – Whether to update the variables in-place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is None.
- set_state(state)[source]#
Set state of the optimizer.
- Parameters:
  - state (Container) – Nested state to update.
- property state#
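A minimal usage sketch for Adam follows. The surrounding training utilities (ivy.execute_with_gradients and the optimizer's step method, together with the exact shape of their return values) are not documented in this reference, so treat them as assumptions and this as a sketch rather than a definitive recipe.

```python
import ivy
from ivy.stateful.optimizers import Adam

ivy.set_backend("torch")  # assumption: the torch backend is installed

# Toy parameters held in an ivy.Container; depending on the backend and Ivy
# version, these may first need to be marked as trainable variables.
v = ivy.Container(w=ivy.array([1.0, 2.0, 3.0]), b=ivy.array([0.0]))

optimizer = Adam(lr=1e-3, beta1=0.9, beta2=0.999)

def loss_fn(v):
    # A simple quadratic loss, purely for illustration.
    return ivy.sum(v.w ** 2) + ivy.sum(v.b ** 2)

for _ in range(10):
    # Assumption: execute_with_gradients returns the loss together with a
    # container of gradients with respect to the variables passed in.
    loss, grads = ivy.execute_with_gradients(loss_fn, v)
    # Assumption: step(v, grads) applies one Adam update and returns the
    # updated variable container (in-place where inplace=True is supported).
    v = optimizer.step(v, grads)
```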
- class ivy.stateful.optimizers.LAMB(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]#
Bases: Optimizer
- __init__(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]#
Construct a LAMB optimizer.
- Parameters:
  - lr (float) – Learning rate. Default is 1e-4.
  - beta1 (float) – Gradient forgetting factor. Default is 0.9.
  - beta2 (float) – Second moment of gradient forgetting factor. Default is 0.999.
  - epsilon (float) – Divisor during the Adam update, preventing division by zero. Default is 1e-07.
  - max_trust_ratio (float) – The maximum value of the trust ratio, i.e. the ratio between the norm of the layer weights and the norm of the gradient update. Default is 10.
  - decay_lambda (float) – The factor used for weight decay. Default is 0.
  - inplace (bool) – Whether to update the variables in-place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is None.
- set_state(state)[source]#
Set state of the optimizer.
- Parameters:
  - state (Container) – Nested state to update.
- property state#
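In this reference, constructing LAMB differs from Adam only in the trust-ratio cap and weight-decay factor. A minimal construction sketch follows (the values are illustrative, not recommendations); stepping then follows the same pattern as the Adam sketch above.

```python
from ivy.stateful.optimizers import LAMB

# Illustrative values only.
optimizer = LAMB(
    lr=1e-3,
    beta1=0.9,
    beta2=0.999,
    max_trust_ratio=10,   # cap on the layer-wise trust ratio
    decay_lambda=0.01,    # weight-decay factor
)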
- class ivy.stateful.optimizers.LARS(lr=0.0001, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]#
Bases: Optimizer
- __init__(lr=0.0001, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]#
Construct a Layer-wise Adaptive Rate Scaling (LARS) optimizer.
- Parameters:
  - lr (float) – Learning rate. Default is 1e-4.
  - decay_lambda (float) – The factor used for weight decay. Default is 0.
  - inplace (bool) – Whether to update the variables in-place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
- set_state(state)[source]#
Set state of the optimizer.
- Parameters:
  - state (Container) – Nested state to update.
- property state#
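A minimal construction sketch for LARS follows; the values are illustrative only, and stepping follows the same pattern as the Adam sketch above.

```python
from ivy.stateful.optimizers import LARS

# LARS scales each layer's update by the ratio of the weight norm to the
# gradient norm; illustrative values only.
optimizer = LARS(lr=1e-2, decay_lambda=1e-4)
```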
- class ivy.stateful.optimizers.Optimizer(lr, inplace=True, stop_gradients=True, init_on_first_step=False, compile_on_next_step=False, fallback_to_non_compiled=False, device=None)[source]#
Bases: ABC
- __init__(lr, inplace=True, stop_gradients=True, init_on_first_step=False, compile_on_next_step=False, fallback_to_non_compiled=False, device=None)[source]#
Construct a general Optimizer. This is an abstract class and must be derived from.
- Parameters:
  - lr (Union[float, Callable]) – Learning rate.
  - inplace (bool) – Whether to update the variables in-place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - init_on_first_step (bool) – Whether the optimizer is initialized on the first step. Default is False.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - fallback_to_non_compiled (bool) – Whether to fall back to a non-compiled forward call if an error is raised during the compiled forward pass. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is None.
- abstract set_state(state)[source]#
Set state of the optimizer.
- Parameters:
  - state (Container) – Nested state to update.
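Since Optimizer is abstract, it is intended to be subclassed. The sketch below assumes that, beyond the documented abstract set_state, a subclass supplies its per-step update rule via a private _step(v, grads) hook; that hook name is an assumption about internals and is not documented in this reference.

```python
import ivy
from ivy.stateful.optimizers import Optimizer

class PlainGradientDescent(Optimizer):
    """A hypothetical, stateless gradient-descent optimizer (illustration only)."""

    def __init__(self, lr=1e-4):
        super().__init__(lr, inplace=True, stop_gradients=True)
        # Keep our own copy of the learning rate rather than relying on any
        # undocumented attribute name used by the base class.
        self._plain_lr = lr

    # Assumption: the base class routes each update through a private
    # _step(v, grads) hook, mirroring the built-in optimizers.
    def _step(self, v, grads):
        # ivy.gradient_descent_update computes w - lr * dcdw element-wise.
        return ivy.gradient_descent_update(v, grads, self._plain_lr)

    def set_state(self, state):
        # Plain gradient descent keeps no running state to restore.
        pass

    @property
    def state(self):
        return ivy.Container()
```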
- class ivy.stateful.optimizers.SGD(lr=0.0001, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]#
Bases: Optimizer
- __init__(lr=0.0001, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]#
Construct a Stochastic-Gradient-Descent (SGD) optimizer.
- Parameters:
  - lr (float) – Learning rate. Default is 1e-4.
  - inplace (bool) – Whether to update the variables in-place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
- set_state(state)[source]#
Set state of the optimizer.
- Parameters:
  - state (Container) – Nested state to update.
- property state#
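Each concrete optimizer above exposes the documented state property and set_state method, which can be used for simple checkpointing. A sketch follows; the exact contents of the returned object depend on the optimizer (SGD keeps no running statistics, whereas Adam tracks moment estimates), and the assumption that state returns a Container matches the type accepted by set_state rather than anything stated explicitly in this reference.

```python
from ivy.stateful.optimizers import SGD, Adam

sgd = SGD(lr=1e-2)    # SGD keeps no running statistics
adam = Adam(lr=1e-3)  # Adam tracks per-variable moment estimates

# ... after some training steps, snapshot the nested optimizer state ...
snapshot = adam.state

# ... and later roll the optimizer back to that snapshot.
adam.set_state(snapshot)
```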
This should hopefully have given you an overview of the optimizers submodule. If you have any questions, please feel free to reach out on our Discord in the optimizers channel or in the optimizers forum!