holocron.trainer¶
holocron.trainer provides some basic objects for training purposes.
Trainer¶

Trainer(model: Module, train_loader: DataLoader, val_loader: DataLoader, criterion: Module, optimizer: Optimizer, gpu: int | None = None, output_file: str = './checkpoint.pth', amp: bool = False, skip_nan_loss: bool = False, nan_tolerance: int = 5, gradient_acc: int = 1, gradient_clip: float | None = None, on_epoch_end: Callable[[dict[str, float]], Any] | None = None)

Baseline trainer class.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `model` | `Module` | model to train |
| `train_loader` | `DataLoader` | training loader |
| `val_loader` | `DataLoader` | validation loader |
| `criterion` | `Module` | loss criterion |
| `optimizer` | `Optimizer` | parameter optimizer |
| `gpu` | `int \| None` | index of the GPU to use |
| `output_file` | `str` | path where checkpoints will be saved |
| `amp` | `bool` | whether to use automatic mixed precision |
| `skip_nan_loss` | `bool` | whether the optimizer step should be skipped when the loss is NaN |
| `nan_tolerance` | `int` | number of consecutive batches with NaN loss before stopping the training |
| `gradient_acc` | `int` | number of batches over which to accumulate gradients before performing the update step |
| `gradient_clip` | `float \| None` | the gradient clip value |
| `on_epoch_end` | `Callable[[dict[str, float]], Any] \| None` | callback triggered at the end of an epoch |
Source code in holocron/trainer/core.py
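The constructor simply wires standard PyTorch objects together. Below is a minimal sketch; the toy model, random data, and loader sizes are illustrative assumptions, not part of the API:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from holocron.trainer import Trainer

# Toy setup: a linear probe over random 10-class image data (illustrative only)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
images, labels = torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))
train_loader = DataLoader(TensorDataset(images[:192], labels[:192]), batch_size=32)
val_loader = DataLoader(TensorDataset(images[192:], labels[192:]), batch_size=32)

trainer = Trainer(
    model,
    train_loader,
    val_loader,
    criterion=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters(), lr=3e-4),
    gpu=0 if torch.cuda.is_available() else None,  # None keeps everything on CPU
    output_file="./checkpoint.pth",
    gradient_acc=2,  # perform the optimizer step every 2 batches
    on_epoch_end=lambda metrics: print(metrics),  # receives a dict[str, float]
)
```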
set_device¶

set_device(gpu: int | None = None) -> None

Move tensor objects to the target GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `gpu` | `int \| None` | index of the target GPU device |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if PyTorch cannot access the GPU |
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
save¶

save(output_file: str) -> None

Save a trainer checkpoint.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `output_file` | `str` | destination file path |
Source code in holocron/trainer/core.py
load¶

Resume from a trainer state.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `state` | `dict` | checkpoint dictionary |
Source code in holocron/trainer/core.py
to_cuda¶

to_cuda(x: Tensor, target: Tensor | list[dict[str, Tensor]]) -> tuple[Tensor, Tensor | list[dict[str, Tensor]]]

Move input and target to GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `x` | `Tensor` | input tensor |
| `target` | `Tensor \| list[dict[str, Tensor]]` | target tensor or list of target dictionaries |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor \| list[dict[str, Tensor]]]` | tuple of input and target tensors |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
fit_n_epochs¶

fit_n_epochs(num_epochs: int, lr: float, freeze_until: str | None = None, sched_type: str = 'onecycle', norm_weight_decay: float | None = None, **kwargs: Any) -> None

Train the model for a given number of epochs.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `num_epochs` | `int` | number of epochs to train |
| `lr` | `float` | learning rate to be used by the scheduler |
| `freeze_until` | `str \| None` | last layer to freeze |
| `sched_type` | `str` | type of scheduler to use |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `**kwargs` | `Any` | keyword args passed to the learning rate scheduler |
Source code in holocron/trainer/core.py
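For instance, with a trainer built as in the sketch above; note that `freeze_until` expects a layer name from your own model, so the name below is a hypothetical placeholder:

```python
# Train for 10 epochs with a one-cycle schedule peaking at lr=3e-4
trainer.fit_n_epochs(num_epochs=10, lr=3e-4, sched_type="onecycle")

# Fine-tuning variant: keep layers up to (and including) a named layer frozen.
# "features" is a hypothetical name; pick one of your model's named modules.
trainer.fit_n_epochs(num_epochs=5, lr=1e-4, freeze_until="features")
```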
find_lr¶

find_lr(freeze_until: str | None = None, start_lr: float = 1e-07, end_lr: float = 1, norm_weight_decay: float | None = None, num_it: int = 100) -> None

Grid-search the optimal learning rate for the training, as described in "Cyclical Learning Rates for Training Neural Networks".

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `start_lr` | `float` | initial learning rate |
| `end_lr` | `float` | final learning rate |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the number of iterations is greater than the number of available batches |
Source code in holocron/trainer/core.py
plot_recorder¶

Display the results of the LR grid search.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `beta` | `float` | smoothing factor |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the learning rate and loss records have different lengths, or if no learning rate has been recorded |
Source code in holocron/trainer/core.py
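A typical LR search runs `find_lr` and then inspects the curve with `plot_recorder`; a sketch assuming the training loader holds at least `num_it` batches (otherwise `find_lr` raises a `ValueError`):

```python
# Sweep learning rates from 1e-7 to 1 over 100 iterations
trainer.find_lr(start_lr=1e-7, end_lr=1, num_it=100)
# Plot the smoothed loss-vs-LR curve; pick a rate just before the loss blows up
trainer.plot_recorder(beta=0.95)
```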
check_setup¶

check_setup(freeze_until: str | None = None, lr: float = 0.0003, norm_weight_decay: float | None = None, num_it: int = 100, **kwargs: Any) -> None

Check whether you can overfit one batch.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `lr` | `float` | learning rate to be used for training |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the loss value is NaN or inf |
Source code in holocron/trainer/core.py
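This is a quick sanity check before a long run: if the model cannot drive the loss down on a single batch, something is wrong with the setup. A sketch:

```python
# Try to overfit a single batch for 100 iterations; a ValueError is raised
# if the loss turns NaN or inf along the way
trainer.check_setup(lr=3e-4, num_it=100)
```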
Image classification¶
ClassificationTrainer¶

ClassificationTrainer(model: Module, train_loader: DataLoader, val_loader: DataLoader, criterion: Module, optimizer: Optimizer, gpu: int | None = None, output_file: str = './checkpoint.pth', amp: bool = False, skip_nan_loss: bool = False, nan_tolerance: int = 5, gradient_acc: int = 1, gradient_clip: float | None = None, on_epoch_end: Callable[[dict[str, float]], Any] | None = None)

Bases: Trainer

Image classification trainer class.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `model` | `Module` | model to train |
| `train_loader` | `DataLoader` | training loader |
| `val_loader` | `DataLoader` | validation loader |
| `criterion` | `Module` | loss criterion |
| `optimizer` | `Optimizer` | parameter optimizer |
| `gpu` | `int \| None` | index of the GPU to use |
| `output_file` | `str` | path where checkpoints will be saved |
| `amp` | `bool` | whether to use automatic mixed precision |
| `skip_nan_loss` | `bool` | whether the optimizer step should be skipped when the loss is NaN |
| `nan_tolerance` | `int` | number of consecutive batches with NaN loss before stopping the training |
| `gradient_acc` | `int` | number of batches over which to accumulate gradients before performing the update step |
| `gradient_clip` | `float \| None` | the gradient clip value |
| `on_epoch_end` | `Callable[[dict[str, float]], Any] \| None` | callback triggered at the end of an epoch |
Source code in holocron/trainer/core.py
set_device¶

set_device(gpu: int | None = None) -> None

Move tensor objects to the target GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `gpu` | `int \| None` | index of the target GPU device |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if PyTorch cannot access the GPU |
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
save¶

save(output_file: str) -> None

Save a trainer checkpoint.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `output_file` | `str` | destination file path |
Source code in holocron/trainer/core.py
load¶

Resume from a trainer state.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `state` | `dict` | checkpoint dictionary |
Source code in holocron/trainer/core.py
to_cuda¶

to_cuda(x: Tensor, target: Tensor | list[dict[str, Tensor]]) -> tuple[Tensor, Tensor | list[dict[str, Tensor]]]

Move input and target to GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `x` | `Tensor` | input tensor |
| `target` | `Tensor \| list[dict[str, Tensor]]` | target tensor or list of target dictionaries |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor \| list[dict[str, Tensor]]]` | tuple of input and target tensors |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
fit_n_epochs¶

fit_n_epochs(num_epochs: int, lr: float, freeze_until: str | None = None, sched_type: str = 'onecycle', norm_weight_decay: float | None = None, **kwargs: Any) -> None

Train the model for a given number of epochs.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `num_epochs` | `int` | number of epochs to train |
| `lr` | `float` | learning rate to be used by the scheduler |
| `freeze_until` | `str \| None` | last layer to freeze |
| `sched_type` | `str` | type of scheduler to use |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `**kwargs` | `Any` | keyword args passed to the learning rate scheduler |
Source code in holocron/trainer/core.py
find_lr¶

find_lr(freeze_until: str | None = None, start_lr: float = 1e-07, end_lr: float = 1, norm_weight_decay: float | None = None, num_it: int = 100) -> None

Grid-search the optimal learning rate for the training, as described in "Cyclical Learning Rates for Training Neural Networks".

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `start_lr` | `float` | initial learning rate |
| `end_lr` | `float` | final learning rate |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the number of iterations is greater than the number of available batches |
Source code in holocron/trainer/core.py
plot_recorder¶

Display the results of the LR grid search.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `beta` | `float` | smoothing factor |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the learning rate and loss records have different lengths, or if no learning rate has been recorded |
Source code in holocron/trainer/core.py
check_setup¶

check_setup(freeze_until: str | None = None, lr: float = 0.0003, norm_weight_decay: float | None = None, num_it: int = 100, **kwargs: Any) -> None

Check whether you can overfit one batch.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `lr` | `float` | learning rate to be used for training |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the loss value is NaN or inf |
Source code in holocron/trainer/core.py
evaluate¶

Evaluate the model on the validation set.

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float]` | evaluation metrics |
Source code in holocron/trainer/classification.py
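ClassificationTrainer drops into the same workflow as the base class; a sketch reusing the model, loaders, and optimizer pattern from the Trainer example above (the exact metric keys returned by `evaluate` are implementation-defined):

```python
import torch
from torch import nn

from holocron.trainer import ClassificationTrainer

trainer = ClassificationTrainer(
    model,
    train_loader,
    val_loader,
    criterion=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters(), lr=3e-4),
)
trainer.fit_n_epochs(num_epochs=2, lr=3e-4)
metrics = trainer.evaluate()  # dict[str, float] of validation metrics
print({name: round(value, 4) for name, value in metrics.items()})
```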
plot_top_losses¶

plot_top_losses(mean: tuple[float, float, float], std: tuple[float, float, float], classes: Sequence[str] | None = None, num_samples: int = 12, **kwargs: Any) -> None

Plot the validation samples with the highest losses.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `mean` | `tuple[float, float, float]` | mean of the dataset |
| `std` | `tuple[float, float, float]` | standard deviation of the dataset |
| `classes` | `Sequence[str] \| None` | list of classes |
| `num_samples` | `int` | number of samples to plot |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the argument 'classes' is not specified for multi-class classification |
Source code in holocron/trainer/classification.py
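`mean` and `std` are needed to de-normalize images for display, so pass whatever statistics your transforms used; the ImageNet values and class names below are illustrative assumptions:

```python
trainer.plot_top_losses(
    mean=(0.485, 0.456, 0.406),  # ImageNet mean, a common normalization choice
    std=(0.229, 0.224, 0.225),   # ImageNet std
    classes=["cat", "dog", "horse"],  # required for multi-class classification
    num_samples=12,
)
```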
BinaryClassificationTrainer¶

BinaryClassificationTrainer(model: Module, train_loader: DataLoader, val_loader: DataLoader, criterion: Module, optimizer: Optimizer, gpu: int | None = None, output_file: str = './checkpoint.pth', amp: bool = False, skip_nan_loss: bool = False, nan_tolerance: int = 5, gradient_acc: int = 1, gradient_clip: float | None = None, on_epoch_end: Callable[[dict[str, float]], Any] | None = None)

Bases: ClassificationTrainer

Image binary classification trainer class.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `model` | `Module` | model to train |
| `train_loader` | `DataLoader` | training loader |
| `val_loader` | `DataLoader` | validation loader |
| `criterion` | `Module` | loss criterion |
| `optimizer` | `Optimizer` | parameter optimizer |
| `gpu` | `int \| None` | index of the GPU to use |
| `output_file` | `str` | path where checkpoints will be saved |
| `amp` | `bool` | whether to use automatic mixed precision |
| `skip_nan_loss` | `bool` | whether the optimizer step should be skipped when the loss is NaN |
| `nan_tolerance` | `int` | number of consecutive batches with NaN loss before stopping the training |
| `gradient_acc` | `int` | number of batches over which to accumulate gradients before performing the update step |
| `gradient_clip` | `float \| None` | the gradient clip value |
| `on_epoch_end` | `Callable[[dict[str, float]], Any] \| None` | callback triggered at the end of an epoch |
Source code in holocron/trainer/core.py
set_device¶

set_device(gpu: int | None = None) -> None

Move tensor objects to the target GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `gpu` | `int \| None` | index of the target GPU device |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if PyTorch cannot access the GPU |
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
save¶

save(output_file: str) -> None

Save a trainer checkpoint.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `output_file` | `str` | destination file path |
Source code in holocron/trainer/core.py
load¶

Resume from a trainer state.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `state` | `dict` | checkpoint dictionary |
Source code in holocron/trainer/core.py
to_cuda¶

to_cuda(x: Tensor, target: Tensor | list[dict[str, Tensor]]) -> tuple[Tensor, Tensor | list[dict[str, Tensor]]]

Move input and target to GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `x` | `Tensor` | input tensor |
| `target` | `Tensor \| list[dict[str, Tensor]]` | target tensor or list of target dictionaries |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor \| list[dict[str, Tensor]]]` | tuple of input and target tensors |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
fit_n_epochs¶

fit_n_epochs(num_epochs: int, lr: float, freeze_until: str | None = None, sched_type: str = 'onecycle', norm_weight_decay: float | None = None, **kwargs: Any) -> None

Train the model for a given number of epochs.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `num_epochs` | `int` | number of epochs to train |
| `lr` | `float` | learning rate to be used by the scheduler |
| `freeze_until` | `str \| None` | last layer to freeze |
| `sched_type` | `str` | type of scheduler to use |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `**kwargs` | `Any` | keyword args passed to the learning rate scheduler |
Source code in holocron/trainer/core.py
find_lr¶

find_lr(freeze_until: str | None = None, start_lr: float = 1e-07, end_lr: float = 1, norm_weight_decay: float | None = None, num_it: int = 100) -> None

Grid-search the optimal learning rate for the training, as described in "Cyclical Learning Rates for Training Neural Networks".

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `start_lr` | `float` | initial learning rate |
| `end_lr` | `float` | final learning rate |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the number of iterations is greater than the number of available batches |
Source code in holocron/trainer/core.py
plot_recorder¶

Display the results of the LR grid search.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `beta` | `float` | smoothing factor |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the learning rate and loss records have different lengths, or if no learning rate has been recorded |
Source code in holocron/trainer/core.py
check_setup¶

check_setup(freeze_until: str | None = None, lr: float = 0.0003, norm_weight_decay: float | None = None, num_it: int = 100, **kwargs: Any) -> None

Check whether you can overfit one batch.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `lr` | `float` | learning rate to be used for training |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the loss value is NaN or inf |
Source code in holocron/trainer/core.py
plot_top_losses¶

plot_top_losses(mean: tuple[float, float, float], std: tuple[float, float, float], classes: Sequence[str] | None = None, num_samples: int = 12, **kwargs: Any) -> None

Plot the validation samples with the highest losses.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `mean` | `tuple[float, float, float]` | mean of the dataset |
| `std` | `tuple[float, float, float]` | standard deviation of the dataset |
| `classes` | `Sequence[str] \| None` | list of classes |
| `num_samples` | `int` | number of samples to plot |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the argument 'classes' is not specified for multi-class classification |
Source code in holocron/trainer/classification.py
evaluate¶

Evaluate the model on the validation set.

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float]` | evaluation metrics |
Source code in holocron/trainer/classification.py
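The binary variant pairs naturally with a logits-based binary loss; a sketch assuming a model with a single output logit per sample and loaders yielding (image, binary target) pairs defined elsewhere:

```python
import torch
from torch import nn

from holocron.trainer import BinaryClassificationTrainer

# Assumed: one logit per sample, matched with BCEWithLogitsLoss
binary_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))

trainer = BinaryClassificationTrainer(
    binary_model,
    train_loader,  # assumed to yield binary (0/1) targets
    val_loader,
    criterion=nn.BCEWithLogitsLoss(),
    optimizer=torch.optim.Adam(binary_model.parameters(), lr=3e-4),
)
trainer.fit_n_epochs(num_epochs=2, lr=3e-4)
print(trainer.evaluate())
```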
Semantic segmentation¶
SegmentationTrainer¶

Bases: Trainer

Semantic segmentation trainer class.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `*args` | `Any` | positional args of `Trainer` |
| `num_classes` | `int` | number of output classes |
| `**kwargs` | `Any` | keyword args of `Trainer` |
Source code in holocron/trainer/segmentation.py
set_device¶

set_device(gpu: int | None = None) -> None

Move tensor objects to the target GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `gpu` | `int \| None` | index of the target GPU device |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if PyTorch cannot access the GPU |
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
save¶

save(output_file: str) -> None

Save a trainer checkpoint.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `output_file` | `str` | destination file path |
Source code in holocron/trainer/core.py
load¶

Resume from a trainer state.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `state` | `dict` | checkpoint dictionary |
Source code in holocron/trainer/core.py
to_cuda¶

to_cuda(x: Tensor, target: Tensor | list[dict[str, Tensor]]) -> tuple[Tensor, Tensor | list[dict[str, Tensor]]]

Move input and target to GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `x` | `Tensor` | input tensor |
| `target` | `Tensor \| list[dict[str, Tensor]]` | target tensor or list of target dictionaries |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor \| list[dict[str, Tensor]]]` | tuple of input and target tensors |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
fit_n_epochs¶

fit_n_epochs(num_epochs: int, lr: float, freeze_until: str | None = None, sched_type: str = 'onecycle', norm_weight_decay: float | None = None, **kwargs: Any) -> None

Train the model for a given number of epochs.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `num_epochs` | `int` | number of epochs to train |
| `lr` | `float` | learning rate to be used by the scheduler |
| `freeze_until` | `str \| None` | last layer to freeze |
| `sched_type` | `str` | type of scheduler to use |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `**kwargs` | `Any` | keyword args passed to the learning rate scheduler |
Source code in holocron/trainer/core.py
find_lr¶

find_lr(freeze_until: str | None = None, start_lr: float = 1e-07, end_lr: float = 1, norm_weight_decay: float | None = None, num_it: int = 100) -> None

Grid-search the optimal learning rate for the training, as described in "Cyclical Learning Rates for Training Neural Networks".

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `start_lr` | `float` | initial learning rate |
| `end_lr` | `float` | final learning rate |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the number of iterations is greater than the number of available batches |
Source code in holocron/trainer/core.py
plot_recorder¶

Display the results of the LR grid search.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `beta` | `float` | smoothing factor |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the learning rate and loss records have different lengths, or if no learning rate has been recorded |
Source code in holocron/trainer/core.py
check_setup¶

check_setup(freeze_until: str | None = None, lr: float = 0.0003, norm_weight_decay: float | None = None, num_it: int = 100, **kwargs: Any) -> None

Check whether you can overfit one batch.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `lr` | `float` | learning rate to be used for training |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the loss value is NaN or inf |
Source code in holocron/trainer/core.py
evaluate¶

Evaluate the model on the validation set.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `ignore_index` | `int` | index of the class to ignore in evaluation |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float]` | evaluation metrics |
Source code in holocron/trainer/segmentation.py
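A minimal sketch with a toy dense-prediction model and random (image, mask) pairs; `num_classes` is the trainer-specific keyword argument, and the `ignore_index` value of 255 is a common mask convention rather than a library default:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from holocron.trainer import SegmentationTrainer

num_classes = 3
# Toy per-pixel classifier and random (image, mask) pairs (illustrative only)
seg_model = nn.Conv2d(3, num_classes, 3, padding=1)
imgs = torch.randn(16, 3, 64, 64)
masks = torch.randint(0, num_classes, (16, 64, 64))
loader = DataLoader(TensorDataset(imgs, masks), batch_size=4)

trainer = SegmentationTrainer(
    seg_model,
    loader,
    loader,  # reusing the toy loader for validation, for brevity
    nn.CrossEntropyLoss(ignore_index=255),
    torch.optim.Adam(seg_model.parameters(), lr=3e-4),
    num_classes=num_classes,
)
trainer.fit_n_epochs(num_epochs=1, lr=3e-4)
print(trainer.evaluate(ignore_index=255))
```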
Object detection¶
DetectionTrainer¶

DetectionTrainer(model: Module, train_loader: DataLoader, val_loader: DataLoader, criterion: Module, optimizer: Optimizer, gpu: int | None = None, output_file: str = './checkpoint.pth', amp: bool = False, skip_nan_loss: bool = False, nan_tolerance: int = 5, gradient_acc: int = 1, gradient_clip: float | None = None, on_epoch_end: Callable[[dict[str, float]], Any] | None = None)

Bases: Trainer

Object detection trainer class.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `model` | `Module` | model to train |
| `train_loader` | `DataLoader` | training loader |
| `val_loader` | `DataLoader` | validation loader |
| `criterion` | `Module` | loss criterion |
| `optimizer` | `Optimizer` | parameter optimizer |
| `gpu` | `int \| None` | index of the GPU to use |
| `output_file` | `str` | path where checkpoints will be saved |
| `amp` | `bool` | whether to use automatic mixed precision |
| `skip_nan_loss` | `bool` | whether the optimizer step should be skipped when the loss is NaN |
| `nan_tolerance` | `int` | number of consecutive batches with NaN loss before stopping the training |
| `gradient_acc` | `int` | number of batches over which to accumulate gradients before performing the update step |
| `gradient_clip` | `float \| None` | the gradient clip value |
| `on_epoch_end` | `Callable[[dict[str, float]], Any] \| None` | callback triggered at the end of an epoch |
Source code in holocron/trainer/core.py
set_device¶

set_device(gpu: int | None = None) -> None

Move tensor objects to the target GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `gpu` | `int \| None` | index of the target GPU device |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if PyTorch cannot access the GPU |
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
save¶

save(output_file: str) -> None

Save a trainer checkpoint.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `output_file` | `str` | destination file path |
Source code in holocron/trainer/core.py
load¶

Resume from a trainer state.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `state` | `dict` | checkpoint dictionary |
Source code in holocron/trainer/core.py
to_cuda¶

to_cuda(x: Tensor, target: Tensor | list[dict[str, Tensor]]) -> tuple[Tensor, Tensor | list[dict[str, Tensor]]]

Move input and target to GPU.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `x` | `Tensor` | input tensor |
| `target` | `Tensor \| list[dict[str, Tensor]]` | target tensor or list of target dictionaries |

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor \| list[dict[str, Tensor]]]` | tuple of input and target tensors |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the device index is invalid |
Source code in holocron/trainer/core.py
fit_n_epochs¶

fit_n_epochs(num_epochs: int, lr: float, freeze_until: str | None = None, sched_type: str = 'onecycle', norm_weight_decay: float | None = None, **kwargs: Any) -> None

Train the model for a given number of epochs.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `num_epochs` | `int` | number of epochs to train |
| `lr` | `float` | learning rate to be used by the scheduler |
| `freeze_until` | `str \| None` | last layer to freeze |
| `sched_type` | `str` | type of scheduler to use |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `**kwargs` | `Any` | keyword args passed to the learning rate scheduler |
Source code in holocron/trainer/core.py
find_lr¶

find_lr(freeze_until: str | None = None, start_lr: float = 1e-07, end_lr: float = 1, norm_weight_decay: float | None = None, num_it: int = 100) -> None

Grid-search the optimal learning rate for the training, as described in "Cyclical Learning Rates for Training Neural Networks".

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `start_lr` | `float` | initial learning rate |
| `end_lr` | `float` | final learning rate |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the number of iterations is greater than the number of available batches |
Source code in holocron/trainer/core.py
plot_recorder¶

Display the results of the LR grid search.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `beta` | `float` | smoothing factor |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `AssertionError` | if the learning rate and loss records have different lengths, or if no learning rate has been recorded |
Source code in holocron/trainer/core.py
check_setup¶

check_setup(freeze_until: str | None = None, lr: float = 0.0003, norm_weight_decay: float | None = None, num_it: int = 100, **kwargs: Any) -> None

Check whether you can overfit one batch.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `freeze_until` | `str \| None` | last layer to freeze |
| `lr` | `float` | learning rate to be used for training |
| `norm_weight_decay` | `float \| None` | weight decay to apply to normalization parameters |
| `num_it` | `int` | number of iterations to perform |
| `**kwargs` | `Any` | additional keyword args |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the loss value is NaN or inf |
Source code in holocron/trainer/core.py
evaluate¶

Evaluate the model on the validation set.

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `iou_threshold` | `float` | IoU threshold for pair assignment |

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, float \| None]` | evaluation metrics |
Source code in holocron/trainer/detection.py
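Detection pipelines are model-specific, so the sketch below leaves the model, loaders, and criterion as placeholders; the batch layout is the `(Tensor, list[dict[str, Tensor]])` pairing documented for `to_cuda` above, and the IoU threshold of 0.5 is a conventional choice rather than a stated default:

```python
import torch

from holocron.trainer import DetectionTrainer

# Placeholders: det_model, det_criterion, and the loaders come from your own
# detection pipeline; batches pair images with lists of target dictionaries
trainer = DetectionTrainer(
    det_model,
    train_loader,
    val_loader,
    det_criterion,
    torch.optim.Adam(det_model.parameters(), lr=1e-4),
)
trainer.fit_n_epochs(num_epochs=2, lr=1e-4)
print(trainer.evaluate(iou_threshold=0.5))
```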
Miscellaneous¶
freeze_bn¶

freeze_bn(mod: Module) -> None

Prevents parameters and running stats from updating in frozen BatchNorm layers.

Examples:
>>> from holocron.models import rexnet1_0x
>>> from holocron.trainer.utils import freeze_bn
>>> model = rexnet1_0x()
>>> freeze_bn(model)

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `mod` | `Module` | model to train |
Source code in holocron/trainer/utils.py
freeze_model¶

freeze_model(model: Module, last_frozen_layer: str | None = None, frozen_bn_stat_update: bool = False) -> None

Freeze a specific range of model layers.

Examples:
>>> from holocron.models import rexnet1_0x
>>> from holocron.trainer.utils import freeze_model
>>> model = rexnet1_0x()
>>> freeze_model(model)

| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `model` | `Module` | model to train |
| `last_frozen_layer` | `str \| None` | last layer to freeze. Assumes layers have been registered in forward order |
| `frozen_bn_stat_update` | `bool` | force stats update in BN layers that are frozen |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | if the last frozen layer is not found |
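To freeze only part of a backbone, pass `last_frozen_layer`; a sketch where the layer name is a hypothetical example (a `ValueError` is raised if it does not exist in the model), combined with `freeze_bn` to also stop running-stat updates:

```python
from holocron.models import rexnet1_0x
from holocron.trainer.utils import freeze_bn, freeze_model

model = rexnet1_0x()
# Freeze every layer up to the named one; "features" is a hypothetical name,
# substitute one of your model's registered module names
freeze_model(model, last_frozen_layer="features")
# Also prevent frozen BatchNorm layers from updating their running statistics
freeze_bn(model)
```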