ConvNeXt¶
The ConvNeXt model is based on the “A ConvNet for the 2020s” paper.
Architecture overview¶
This architecture brings design choices from transformer-based vision models into a pure convolutional model to improve its performance.
The key takeaways from the paper are the following:
- update the stem convolution to act like the patchify layer of vision transformers
- increase the block kernel size to 7x7
- switch to depth-wise convolutions
- reduce the number of activation and normalization layers
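The takeaways above can be sketched as two small PyTorch modules: a patchify-style stem and a single ConvNeXt-style block. This is a minimal illustration, not holocron's implementation; the class names, the expansion ratio, and the layer ordering are assumptions based on the paper's description.

```python
import torch
from torch import nn


class PatchifyStem(nn.Module):
    """Non-overlapping 4x4 convolution, mirroring a transformer patchify layer."""

    def __init__(self, in_chans: int, dim: int) -> None:
        super().__init__()
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=4, stride=4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class ConvNeXtBlock(nn.Module):
    """Sketch of a ConvNeXt block: 7x7 depth-wise conv, a single LayerNorm,
    then an inverted-bottleneck MLP with a single GELU activation."""

    def __init__(self, dim: int, expansion: int = 4) -> None:
        super().__init__()
        # depth-wise convolution with the enlarged 7x7 kernel
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)
        self.pwconv1 = nn.Linear(dim, expansion * dim)  # point-wise expansion
        self.act = nn.GELU()  # the block's only activation
        self.pwconv2 = nn.Linear(expansion * dim, dim)  # point-wise projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)  # NCHW -> NHWC so LayerNorm/Linear act on channels
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)  # back to NCHW
        return residual + x
```

Note how few activation and normalization layers the block uses compared to a classic ResNet bottleneck: one LayerNorm and one GELU per block.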
Model builders¶
The following model builders can be used to instantiate a ConvNeXt model, with or without pre-trained weights. All the model builders internally rely on the holocron.models.classification.convnext.ConvNeXt base class. Please refer to the source code for more details about this class.
- ConvNeXt-Atto variant of Ross Wightman inspired by "A ConvNet for the 2020s"
- ConvNeXt-Femto variant of Ross Wightman inspired by "A ConvNet for the 2020s"
- ConvNeXt-Pico variant of Ross Wightman inspired by "A ConvNet for the 2020s"
- ConvNeXt-Nano variant of Ross Wightman inspired by "A ConvNet for the 2020s"
- ConvNeXt-T from "A ConvNet for the 2020s"
- ConvNeXt-S from "A ConvNet for the 2020s"
- ConvNeXt-B from "A ConvNet for the 2020s"
- ConvNeXt-L from "A ConvNet for the 2020s"
- ConvNeXt-XL from "A ConvNet for the 2020s"
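The variants above differ only in their stage configurations (blocks per stage and channels per stage). As a rough orientation, the sketch below lists the configurations reported in the paper for T through XL, and the timm definitions for the smaller Atto to Nano variants; the dictionary keys are illustrative and may not match holocron's actual builder names.

```python
# (blocks per stage, channels per stage) for each ConvNeXt variant.
# T/S/B/L/XL values come from "A ConvNet for the 2020s"; Atto..Nano
# follow Ross Wightman's timm definitions. Keys are illustrative only.
CONVNEXT_CONFIGS = {
    "atto": ((2, 2, 6, 2), (40, 80, 160, 320)),
    "femto": ((2, 2, 6, 2), (48, 96, 192, 384)),
    "pico": ((2, 2, 6, 2), (64, 128, 256, 512)),
    "nano": ((2, 2, 8, 2), (80, 160, 320, 640)),
    "tiny": ((3, 3, 9, 3), (96, 192, 384, 768)),
    "small": ((3, 3, 27, 3), (96, 192, 384, 768)),
    "base": ((3, 3, 27, 3), (128, 256, 512, 1024)),
    "large": ((3, 3, 27, 3), (192, 384, 768, 1536)),
    "xl": ((3, 3, 27, 3), (256, 512, 1024, 2048)),
}


def total_blocks(variant: str) -> int:
    """Total number of ConvNeXt blocks across all four stages of a variant."""
    depths, _ = CONVNEXT_CONFIGS[variant]
    return sum(depths)
```

For example, ConvNeXt-T stacks 3 + 3 + 9 + 3 = 18 blocks, while ConvNeXt-S deepens only the third stage.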