holocron.models

The models subpackage contains definitions of models for addressing different tasks, including: image classification, pixelwise semantic segmentation, object detection, instance segmentation, person keypoint detection and video classification.

The following models are available:

Classification

Classification models expect a 4D image tensor as input (N x C x H x W) and return a 2D output (N x K). The output holds the classification scores for each of the K output classes.

import holocron.models as models
darknet19 = models.darknet19(num_classes=10)
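A minimal sketch of how an (N x K) score output might be interpreted, assuming the scores are unnormalized logits (the text above does not say so). Plain Python lists stand in for the output tensor, and `softmax` and `predict` are illustrative helper names, not part of holocron.

```python
import math

def softmax(scores):
    """Turn one row of K raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(batch_scores):
    """Return (class index, probability) for each of the N rows."""
    results = []
    for row in batch_scores:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results

# Stand-in for a model output of shape (N=2, K=3)
batch_scores = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
predictions = predict(batch_scores)
```

Each entry of `predictions` pairs the highest-scoring class index with its softmax probability for one image in the batch.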

ResNet

holocron.models.resnet18(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-18 from “Deep Residual Learning for Image Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnet34(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-34 from “Deep Residual Learning for Image Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-50 from “Deep Residual Learning for Image Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnet101(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-101 from “Deep Residual Learning for Image Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnet152(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-152 from “Deep Residual Learning for Image Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnext50_32x4d(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNeXt-50 from “Aggregated Residual Transformations for Deep Neural Networks”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnext101_32x8d(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNeXt-101 from “Aggregated Residual Transformations for Deep Neural Networks”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.resnet50d(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

ResNet-50-D from “Bag of Tricks for Image Classification with Convolutional Neural Networks”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

Res2Net

holocron.models.res2net50_26w_4s(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

Res2Net-50 26wx4s from “Res2Net: A New Multi-scale Backbone Architecture”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

TridentNet

holocron.models.tridentnet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

TridentNet-50 from “Scale-Aware Trident Networks for Object Detection”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

PyConvResNet

holocron.models.pyconv_resnet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

PyConvResNet-50 from “Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.pyconvhg_resnet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

PyConvHGResNet-50 from “Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

ReXNet

holocron.models.rexnet1_0x(pretrained=False, progress=True, **kwargs)

ReXNet-1.0x from “ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.rexnet1_3x(pretrained=False, progress=True, **kwargs)

ReXNet-1.3x from “ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.rexnet1_5x(pretrained=False, progress=True, **kwargs)

ReXNet-1.5x from “ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.rexnet2_0x(pretrained=False, progress=True, **kwargs)

ReXNet-2.0x from “ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.rexnet2_2x(pretrained=False, progress=True, **kwargs)

ReXNet-2.2x from “ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

SKNet

holocron.models.sknet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

SKNet-50 from “Selective Kernel Networks”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.sknet152(pretrained: bool = False, progress: bool = True, **kwargs: Any) → ResNet

SKNet-152 from “Selective Kernel Networks”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

Darknet

holocron.models.darknet24(pretrained: bool = False, progress: bool = True, **kwargs: Any) → DarknetV1

Darknet-24 from “You Only Look Once: Unified, Real-Time Object Detection”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.darknet19(pretrained: bool = False, progress: bool = True, **kwargs: Any) → DarknetV2

Darknet-19 from “YOLO9000: Better, Faster, Stronger”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.darknet53(pretrained: bool = False, progress: bool = True, **kwargs: Any) → DarknetV3

Darknet-53 from “YOLOv3: An Incremental Improvement”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.cspdarknet53(pretrained: bool = False, progress: bool = True, **kwargs: Any) → DarknetV4

CSP-Darknet-53 from “CSPNet: A New Backbone that can Enhance Learning Capability of CNN”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

holocron.models.cspdarknet53_mish(pretrained: bool = False, progress: bool = True, **kwargs: Any) → DarknetV4

Modified version of CSP-Darknet-53 from “CSPNet: A New Backbone that can Enhance Learning Capability of CNN” with Mish as activation layer and DropBlock as regularization layer.

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

classification model

Return type:

torch.nn.Module

Object Detection

Object detection models expect a 4D image tensor as input (N x C x H x W) and return a list of dictionaries, one per image. Each dictionary has 3 keys: box coordinates, classification probability, and classification label.

import holocron.models as models
yolov2 = models.yolov2(num_classes=10)
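A hedged sketch of post-processing such an output by keeping only confident detections. The key names `"boxes"`, `"scores"` and `"labels"` are assumptions made for illustration (the text above only says there are three keys), and `filter_detections` is a hypothetical helper, not a holocron function.

```python
def filter_detections(detections, min_score=0.5):
    """Keep, for each image, only the entries whose score reaches min_score."""
    filtered = []
    for det in detections:
        keep = [i for i, s in enumerate(det["scores"]) if s >= min_score]
        filtered.append({key: [values[i] for i in keep] for key, values in det.items()})
    return filtered

# Stand-in for the output on a batch of 2 images (key names are assumptions)
detections = [
    {
        "boxes": [[0.1, 0.1, 0.4, 0.5], [0.2, 0.3, 0.9, 0.8]],
        "scores": [0.9, 0.3],
        "labels": [1, 4],
    },
    {"boxes": [], "scores": [], "labels": []},  # image without detections
]
confident = filter_detections(detections)
```

The output keeps one dictionary per input image, so an image with no confident detection still yields an entry with empty lists.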

YOLO

holocron.models.yolov1(pretrained: bool = False, progress: bool = True, pretrained_backbone: bool = True, **kwargs: Any) → YOLOv1

YOLO model from “You Only Look Once: Unified, Real-Time Object Detection”.

YOLO’s particularity is to make predictions in a grid (same size as last feature map). For each grid cell, the model predicts classification scores and a fixed number of boxes (default: 2). Each box in the cell gets 5 predictions: an objectness score, and 4 coordinates. The 4 coordinates are composed of: the 2-D coordinates of the predicted box center (relative to the cell), and the width and height of the predicted box (relative to the whole image).
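The grid parameterization described above can be sketched as follows. This is an illustration under stated assumptions, not the library's decoding code: the function names are hypothetical, 20 classes is an arbitrary choice, and the 7 x 7 grid on a 448 x 448 input is the configuration quoted in the loss description.

```python
def predictions_per_cell(num_boxes=2, num_classes=20):
    """Values predicted per grid cell: 5 per box plus one score per class."""
    return num_boxes * 5 + num_classes

def decode_box(row, col, x, y, w, h, grid_size=7, img_size=448):
    """Convert one prediction to pixel coordinates (xmin, ymin, xmax, ymax).

    x, y: box center relative to the cell at (row, col), each in [0, 1].
    w, h: box width and height relative to the whole image, each in [0, 1].
    """
    cell = img_size / grid_size  # side of one grid cell in pixels
    cx = (col + x) * cell
    cy = (row + y) * cell
    bw = w * img_size
    bh = h * img_size
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)

# A box centered in the middle cell, a quarter of the image wide and half as tall
box = decode_box(row=3, col=3, x=0.5, y=0.5, w=0.25, h=0.5)
```

With these assumed defaults, each cell carries 30 values and the example box spans pixels (168, 112) to (280, 336).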

For training, YOLO uses a multi-part loss whose components are computed by:

$$L_{coords} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{obj} \left[ (x_{ij} - \hat{x}_{ij})^2 + (y_{ij} - \hat{y}_{ij})^2 + (\sqrt{w_{ij}} - \sqrt{\hat{w}_{ij}})^2 + (\sqrt{h_{ij}} - \sqrt{\hat{h}_{ij}})^2 \right]$$

where $S$ is the size of the output feature map (7 for an input size of $(448, 448)$), $B$ is the number of anchor boxes per grid cell (default: 2), $1_{ij}^{obj}$ equals 1 if a ground truth (GT) center falls inside the i-th grid cell and, among the anchor boxes of that cell, the j-th box has the highest IoU with that ground truth, else 0, $(x_{ij}, y_{ij}, w_{ij}, h_{ij})$ are the coordinates of the ground truth assigned to the j-th anchor box of the i-th grid cell, and $(\hat{x}_{ij}, \hat{y}_{ij}, \hat{w}_{ij}, \hat{h}_{ij})$ are the coordinate predictions for the j-th anchor box of the i-th grid cell.
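As a rough illustration of the coordinate term above, the sketch below evaluates it on a flat list of (cell, box) pairs in plain Python. The data layout and the name `coords_loss` are assumptions made for readability and are not the library's implementation.

```python
import math

def coords_loss(gt_boxes, pred_boxes, obj_mask):
    """Squared error on (x, y, sqrt(w), sqrt(h)), summed over responsible boxes.

    gt_boxes, pred_boxes: one (x, y, w, h) tuple per (cell, box) pair.
    obj_mask: 1 where the box is responsible for a ground truth, else 0.
    """
    loss = 0.0
    for gt, pred, is_obj in zip(gt_boxes, pred_boxes, obj_mask):
        if not is_obj:
            continue  # boxes without an assigned ground truth do not contribute
        x, y, w, h = gt
        px, py, pw, ph = pred
        loss += (x - px) ** 2 + (y - py) ** 2
        loss += (math.sqrt(w) - math.sqrt(pw)) ** 2
        loss += (math.sqrt(h) - math.sqrt(ph)) ** 2
    return loss

gt_boxes = [(0.5, 0.5, 0.25, 0.25), (0.0, 0.0, 0.0, 0.0)]
pred_boxes = [(0.4, 0.6, 0.16, 0.36), (0.9, 0.9, 0.5, 0.5)]
obj_mask = [1, 0]  # only the first box is matched to a ground truth
loss = coords_loss(gt_boxes, pred_boxes, obj_mask)
```

Taking square roots of the width and height makes a given absolute error weigh more on small boxes than on large ones.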

$$L_{objectness} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \left[ 1_{ij}^{obj} (C_{ij} - \hat{C}_{ij})^2 + \lambda_{noobj} 1_{ij}^{noobj} (C_{ij} - \hat{C}_{ij})^2 \right]$$

where $\lambda_{noobj}$ is a positive coefficient (default: 0.5), $1_{ij}^{noobj} = 1 - 1_{ij}^{obj}$, $C_{ij}$ equals the Intersection over Union (IoU) between the j-th anchor box in the i-th grid cell and its matched ground truth box if that box is matched with a ground truth, else 0, and $\hat{C}_{ij}$ is the objectness score of the j-th anchor box in the i-th grid cell.
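The objectness term can be sketched in the same spirit: the target for a matched box is its IoU with the ground truth, and unmatched boxes are pulled toward 0 with a smaller weight. The helper names and the corner-format boxes `(xmin, ymin, xmax, ymax)` are assumptions for this example only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def objectness_loss(targets, scores, obj_mask, lambda_noobj=0.5):
    """Squared error on objectness, down-weighted for unmatched boxes."""
    loss = 0.0
    for c, c_hat, is_obj in zip(targets, scores, obj_mask):
        weight = 1.0 if is_obj else lambda_noobj
        loss += weight * (c - c_hat) ** 2
    return loss

# Box 0 is matched to a ground truth (target = IoU); box 1 is not (target = 0)
target_0 = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))
loss = objectness_loss([target_0, 0.0], [0.5, 0.2], [1, 0])
```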

$$L_{classification} = \sum_{i=0}^{S^2} 1_{i}^{obj} \sum_{c \in classes} (p_{i}(c) - \hat{p}_{i}(c))^2$$

where $1_{i}^{obj}$ equals 1 if a GT center falls inside the i-th grid cell, else 0, $p_{i}(c)$ equals 1 if the ground truth assigned to the i-th cell is classified as class $c$, else 0, and $\hat{p}_{i}(c)$ is the predicted probability of class $c$ in the i-th cell.
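A small sketch of the per-cell classification term above, using one-hot targets built from a class index. The flat per-cell lists and the name `classification_loss` are illustrative assumptions rather than the library's code.

```python
def classification_loss(gt_labels, pred_probs, cell_has_obj):
    """Squared error between one-hot targets and predicted class probabilities.

    gt_labels: class index of the ground truth assigned to each cell (or None).
    pred_probs: list of per-class probabilities for each cell.
    cell_has_obj: 1 if a ground truth center falls inside the cell, else 0.
    """
    loss = 0.0
    for label, probs, has_obj in zip(gt_labels, pred_probs, cell_has_obj):
        if not has_obj:
            continue  # cells without an object are ignored by this term
        for c, p_hat in enumerate(probs):
            p = 1.0 if c == label else 0.0  # one-hot target
            loss += (p - p_hat) ** 2
    return loss

gt_labels = [2, None]
pred_probs = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
cell_has_obj = [1, 0]
loss = classification_loss(gt_labels, pred_probs, cell_has_obj)
```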

The full loss is given by:

$$L_{YOLOv1} = \lambda_{coords} \cdot L_{coords} + L_{objectness} + L_{classification}$$

where $\lambda_{coords}$ is a positive coefficient (default: 5).

Parameters:

pretrained (bool, optional) – If True, returns a model pre-trained on ImageNet
progress (bool, optional) – If True, displays a progress bar of the download to stderr
pretrained_backbone (bool, optional) – If True, the backbone parameters are pretrained on Imagenette

Returns:

detection module

Return type:

torch.nn.Module

holocron.models.yolov2(pretrained: bool = False, progress: bool = True, pretrained_backbone: bool = True, **kwargs: Any) → YOLOv2

YOLOv2 model from “YOLO9000: Better, Faster, Stronger”.

YOLOv2 improves upon YOLO by raising the number of boxes predicted per grid cell (default: 5), introducing bounding box priors, and predicting class scores for each anchor box in the grid cell.

For training, YOLOv2 uses the same multi-part loss as YOLO apart from its classification loss:

$$L_{classification} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{obj} \sum_{c \in classes} (p_{ij}(c) - \hat{p}_{ij}(c))^2$$

where $S$ is the size of the output feature map (13 for an input size of $(416, 416)$), $B$ is the number of anchor boxes per grid cell (default: 5), $1_{ij}^{obj}$ equals 1 if a GT center falls inside the i-th grid cell and, among the anchor boxes of that cell, the j-th box has the highest IoU with that ground truth, else 0, $p_{ij}(c)$ equals 1 if the ground truth assigned to the j-th anchor box of the i-th cell is classified as class $c$, else 0, and $\hat{p}_{ij}(c)$ is the predicted probability of class $c$ for the j-th anchor box in the i-th cell.
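Because class scores are predicted per anchor box rather than per cell, the number of values per grid cell changes. The sketch below works through that arithmetic. The function names are hypothetical, 20 classes is an arbitrary choice, and the stride of 32 is inferred from the quoted 416 to 13 reduction rather than taken from the library.

```python
def yolov2_values_per_cell(num_boxes=5, num_classes=20):
    """Each anchor box carries 4 coordinates, 1 objectness score and its own class scores."""
    return num_boxes * (5 + num_classes)

def grid_size(input_size, stride=32):
    """Side of the output feature map for a square input (stride is an assumption)."""
    return input_size // stride

values = yolov2_values_per_cell()
side = grid_size(416)
total_boxes = side * side * 5  # boxes predicted over the whole image
```

Under these assumptions, each cell carries 125 values and a 416 x 416 input yields 845 candidate boxes.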

Parameters:

pretrained (bool, optional) – If True, returns a model pre-trained on ImageNet
progress (bool, optional) – If True, displays a progress bar of the download to stderr
pretrained_backbone (bool, optional) – If True, the backbone parameters are pretrained on Imagenette

Returns:

detection module

Return type:

torch.nn.Module

holocron.models.yolov4(pretrained: bool = False, progress: bool = True, pretrained_backbone: bool = True, **kwargs: Any) → YOLOv4

YOLOv4 model from “YOLOv4: Optimal Speed and Accuracy of Object Detection”.

YOLOv4 improves on YOLOv3 with several changes: DropBlock regularization, Mish activation, CSP and SAM in the backbone, and SPP and PAN in the neck.

Parameters:

pretrained (bool, optional) – If True, returns a model pre-trained on ImageNet
progress (bool, optional) – If True, displays a progress bar of the download to stderr
pretrained_backbone (bool, optional) – If True, the backbone parameters are pretrained on Imagenette

Returns:

detection module

Return type:

torch.nn.Module

Semantic Segmentation

Semantic segmentation models expect a 4D image tensor as input (N x C x H x W) and return a classification score tensor of size (N x K x Ho x Wo).

import holocron.models as models
unet = models.unet(num_classes=10)
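To show how a (K x Ho x Wo) score volume for one image maps to a label per pixel, here is a minimal sketch using nested Python lists in place of a tensor. `pixelwise_argmax` is an illustrative helper, not a holocron function.

```python
def pixelwise_argmax(scores):
    """Map one (K x Ho x Wo) score volume to an (Ho x Wo) map of class indices."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda k: scores[k][y][x]) for x in range(width)]
        for y in range(height)
    ]

# Stand-in for one sample of the output: K=2 classes, Ho=2, Wo=2
scores = [
    [[0.9, 0.2], [0.4, 0.1]],  # scores for class 0
    [[0.1, 0.8], [0.6, 0.3]],  # scores for class 1
]
label_map = pixelwise_argmax(scores)
```

With a real tensor output, the same reduction is an argmax over the class dimension.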

U-Net

holocron.models.unet(pretrained: bool = False, progress: bool = True, **kwargs: Any) → UNet

U-Net from “U-Net: Convolutional Networks for Biomedical Image Segmentation”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

semantic segmentation model

Return type:

torch.nn.Module

holocron.models.unetp(pretrained: bool = False, progress: bool = True, **kwargs: Any) → UNetp

UNet+ from “UNet++: A Nested U-Net Architecture for Medical Image Segmentation”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

semantic segmentation model

Return type:

torch.nn.Module

holocron.models.unetpp(pretrained: bool = False, progress: bool = True, **kwargs: Any) → UNetpp

UNet++ from “UNet++: A Nested U-Net Architecture for Medical Image Segmentation”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

semantic segmentation model

Return type:

torch.nn.Module

holocron.models.unet3p(pretrained: bool = False, progress: bool = True, **kwargs: Any) → UNet3p

UNet3+ from “UNet 3+: A Full-Scale Connected UNet For Medical Image Segmentation”

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet
progress (bool) – If True, displays a progress bar of the download to stderr

Returns:

semantic segmentation model

Return type:

torch.nn.Module