Initializers#
- class ivy.stateful.initializers.Constant(constant)[source]#
Bases: Initializer
- __init__(constant)[source]#
Constant initializer; fills in all values with the value of constant.
- Parameters:
  - constant (float) – Constant value for initialization.
- create_variables(var_shape, device, fan_out=None, fan_in=None, dtype=None)[source]#
Create internal variables for the layer.
- Parameters:
  - var_shape (Tuple[int, int]) – Tuple representing the shape of the desired array. If considering the array as a rectangular matrix, this tuple is represented as ‘(ROWS, COLUMNS)’.
  - device (Union[Device, NativeDevice]) – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is cpu.
  - fan_out (Optional[float]) – The number of nodes in the next layer. (default: None)
  - fan_in (Optional[float]) – The number of nodes in the previous layer. (default: None)
  - dtype (Optional[Union[Dtype, NativeDtype]]) – Desired data type. (default: None)
- Return type:
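As a quick sketch of how this class might be used (the shape and constant value below are illustrative assumptions, not part of the API reference):

```python
import ivy
from ivy.stateful.initializers import Constant

# Fill a 3x4 variable with the constant 0.5.
init = Constant(0.5)
w = init.create_variables(var_shape=(3, 4), device="cpu")
# Every entry of w equals 0.5.
```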
- class ivy.stateful.initializers.Initializer[source]#
Bases: ABC
An initializer for the internal variables of a layer.
A neuron is a function of the form a = g(z), where g is the activation function and z = w_1x_1 + w_2x_2 + … + w_nx_n, where the w_i are the weights and the x_i are the inputs. To prevent z from vanishing (getting too small) or exploding (getting too big), the initial weights must be picked carefully.
- abstract create_variables(var_shape, device, fan_out=None, fan_in=None, dtype=None)[source]#
Create internal variables for the layer.
- Parameters:
  - var_shape (Tuple[int, int]) – Tuple representing the shape of the desired array. If considering the array as a rectangular matrix, this tuple is represented as ‘(ROWS, COLUMNS)’.
  - device (Union[Device, NativeDevice]) – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is cpu.
  - fan_out (Optional[float]) – The number of nodes in the next layer. (default: None)
  - fan_in (Optional[float]) – The number of nodes in the previous layer. (default: None)
  - dtype (Optional[Union[Dtype, NativeDtype]]) – Desired data type. (default: None)
- Return type:
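Since Initializer is an abstract base class, a concrete initializer only needs to implement create_variables. A minimal sketch (the Ones class below is hypothetical, written only to illustrate the contract):

```python
import ivy
from ivy.stateful.initializers import Initializer

class Ones(Initializer):
    # Hypothetical initializer: fills every variable with ones.
    def create_variables(self, var_shape, device, fan_out=None, fan_in=None, dtype=None):
        # fan_out and fan_in are unused here; they are kept to match the abstract signature.
        return ivy.ones(var_shape, device=device, dtype=dtype)

w = Ones().create_variables((2, 3), "cpu")
```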
- class ivy.stateful.initializers.KaimingNormal(mean=0, fan_mode='fan_in')[source]#
Bases: Initializer
- __init__(mean=0, fan_mode='fan_in')[source]#
Initialize Kaiming normal, also known as He initialization.
It is a method for initializing layers that takes into account the non-linearity of activation functions. It uses a normal distribution centered at mean with standard deviation sqrt(2 / ((1 + negative_slope^2) * fan)).
- Parameters:
  - mean – Sets the expected value, average, and center of the normal distribution.
  - fan_mode – Determines how fan is calculated.
    - fan_out sets fan to the number of output features of this neuron. This is useful when training using back-propagation.
    - fan_in sets fan to the number of input features of this neuron. This is useful when training using forward-propagation.
    - fan_sum sets fan to the sum of the number of input features and output features of this neuron.
    - fan_avg sets fan to the average of the number of input features and output features of this neuron.
- create_variables(var_shape, device, fan_out=None, fan_in=None, negative_slope=0.0, dtype=None)[source]#
Create internal variables for the layer.
- Parameters:
  - var_shape – Tuple representing the shape of the desired array. If considering the array as a rectangular matrix, this tuple is represented as ‘(ROWS, COLUMNS)’.
  - device – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is cpu.
  - fan_out – The number of nodes in the next layer.
  - fan_in – The number of nodes in the previous layer.
  - negative_slope – How much a higher fan should lower the standard deviation. A value of 0 gives a relationship proportional to 1/fan.
  - dtype – Desired data type.
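A rough usage sketch tying this to the formula above (the shape and fan values are illustrative assumptions):

```python
import ivy
from ivy.stateful.initializers import KaimingNormal

init = KaimingNormal(mean=0, fan_mode="fan_in")

# With fan_mode="fan_in", fan = fan_in = 128 and negative_slope = 0, so the
# standard deviation is sqrt(2 / ((1 + 0**2) * 128)) = 0.125.
w = init.create_variables(
    var_shape=(64, 128), device="cpu", fan_out=64, fan_in=128, negative_slope=0.0
)
```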
- class ivy.stateful.initializers.Uniform(numerator, fan_mode, power, gain)[source]#
Bases: Initializer
- __init__(numerator, fan_mode, power, gain)[source]#
Initialize based on a uniform distribution; fills in all values with values drawn from a uniform distribution (one in which all values have an equal probability), with range [-wlim, wlim] (endpoints included), where wlim is calculated as gain * (numerator / fan)**power. This distribution helps with issues when trying to optimize and train networks. The expected value of this distribution is 0 and the variance is wlim^2 / 3.
This is intended as a base class for special predefined initializers.
- Parameters:
  - numerator –
  - fan_mode – Determines how fan is calculated.
    - fan_out sets fan to the number of output features of this neuron. This is useful when training using back-propagation.
    - fan_in sets fan to the number of input features of this neuron. This is useful when training using forward-propagation.
    - fan_sum sets fan to the sum of the number of input features and output features of this neuron.
    - fan_avg sets fan to the average of the number of input features and output features of this neuron.
  - power – Sets the drop-off factor for the calculated fan.
  - gain – Scales the output of the distribution.
- create_variables(var_shape, device, fan_out=None, fan_in=None, dtype=None)[source]#
Create internal variables for the layer.
- Parameters:
  - var_shape – Tuple representing the shape of the desired array. If considering the array as a rectangular matrix, this tuple is represented as ‘(ROWS, COLUMNS)’.
  - device – Device on which to create the layer’s variables, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’. Default is cpu.
  - fan_out – The number of nodes in the next layer.
  - fan_in – The number of nodes in the previous layer.
  - dtype – Desired data type.
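For example, choosing numerator=6, fan_mode='fan_sum', power=0.5, and gain=1 reproduces the familiar Glorot/Xavier uniform limit sqrt(6 / (fan_in + fan_out)); this particular combination is shown only as an illustration of the parameters above, not as a predefined class:

```python
import ivy
from ivy.stateful.initializers import Uniform

# wlim = gain * (numerator / fan)**power = (6 / (64 + 128))**0.5 ≈ 0.177,
# so values are drawn uniformly from [-0.177, 0.177].
init = Uniform(numerator=6, fan_mode="fan_sum", power=0.5, gain=1)
w = init.create_variables(var_shape=(64, 128), device="cpu", fan_out=64, fan_in=128)
```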
This should have hopefully given you an overview of the initializers submodule. If you have any questions, please feel free to reach out on our discord in the initializers channel or in the initializers forum!