holocron.ops
holocron.ops implements operators that are specific to computer vision.
Note
These operators do not currently support TorchScript.
Boxes
- holocron.ops.box_diou(boxes1, boxes2)
Computes the Distance-IoU loss as described in “Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression”.
The loss is defined as follows:
\[\mathcal{L}_{DIoU} = 1 - IoU + \frac{\rho^2(b, b^{GT})}{c^2}\]
where \(IoU\) is the Intersection over Union, \(b\) and \(b^{GT}\) are the centers of the box and the ground truth box respectively, \(c\) is the diagonal length of the smallest enclosing box covering the two boxes, and \(\rho(\cdot)\) is the Euclidean distance.
- Parameters:
boxes1 (torch.Tensor[M, 4]) – bounding boxes
boxes2 (torch.Tensor[N, 4]) – bounding boxes
- Returns:
Distance-IoU loss
- Return type:
torch.Tensor[M, N]
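To make the formula concrete, here is a minimal pure-Python sketch that evaluates the Distance-IoU loss for a single pair of boxes. It is not the library's implementation: the helper name `diou_loss` is hypothetical, and the (x1, y1, x2, y2) corner format is an assumption, since the box layout is not stated in this reference.

```python
def diou_loss(box, gt):
    """Distance-IoU loss for one pair of boxes.

    Assumption: each box is (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    """
    # Intersection area (clamped to zero when the boxes do not overlap)
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h

    # IoU = intersection / union
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_box + area_gt - inter)

    # rho^2: squared Euclidean distance between the two box centers
    dx = (box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2
    rho2 = dx ** 2 + dy ** 2

    # c^2: squared diagonal of the smallest box enclosing both boxes
    enc_w = max(box[2], gt[2]) - min(box[0], gt[0])
    enc_h = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = enc_w ** 2 + enc_h ** 2

    return 1 - iou + rho2 / c2


# IoU = 1/7 and rho^2 / c^2 = 1/9, so the loss is 1 - 1/7 + 1/9 = 61/63
print(round(diou_loss((0, 0, 100, 100), (50, 50, 150, 150)), 4))  # 0.9683
```

For identical boxes both the IoU penalty and the center-distance term vanish, so the sketch returns 0.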
- holocron.ops.box_ciou(boxes1, boxes2)
Computes the Complete IoU loss as described in “Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression”.
The loss is defined as follows:
\[\mathcal{L}_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{GT})}{c^2} + \alpha v\]
where \(IoU\) is the Intersection over Union, \(b\) and \(b^{GT}\) are the centers of the box and the ground truth box respectively, \(c\) is the diagonal length of the smallest enclosing box covering the two boxes, \(\rho(\cdot)\) is the Euclidean distance, \(\alpha\) is a positive trade-off parameter, and \(v\) is the aspect ratio consistency.
More specifically:
\[v = \frac{4}{\pi^2} \Big(\arctan{\frac{w^{GT}}{h^{GT}}} - \arctan{\frac{w}{h}}\Big)^2\]
and
\[\alpha = \frac{v}{(1 - IoU) + v}\]
- Parameters:
boxes1 (torch.Tensor[M, 4]) – bounding boxes
boxes2 (torch.Tensor[N, 4]) – bounding boxes
- Returns:
Complete IoU loss
- Return type:
torch.Tensor[M, N]
Example
>>> import torch
>>> from holocron.ops.boxes import box_ciou
>>> boxes1 = torch.tensor([[0, 0, 100, 100], [100, 100, 200, 200]], dtype=torch.float32)
>>> boxes2 = torch.tensor([[50, 50, 150, 150]], dtype=torch.float32)
>>> box_ciou(boxes1, boxes2)
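As a cross-check on the Complete IoU definition, the following pure-Python sketch evaluates the formula for a single pair of boxes. It is an illustration, not the library's code: the helper name `ciou_loss`, the (x1, y1, x2, y2) corner format, and the small `eps` added to the denominator of \(\alpha\) (to avoid a 0/0 for identical boxes) are all assumptions.

```python
import math


def ciou_loss(box, gt, eps=1e-7):
    """Complete IoU loss for one pair of boxes.

    Assumptions: each box is (x1, y1, x2, y2) with positive width and height;
    eps is a hypothetical stabiliser that is not part of the formula itself.
    """
    # IoU from the intersection and union areas
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h
    w, h = box[2] - box[0], box[3] - box[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + w_gt * h_gt - inter)

    # Center-distance term rho^2 / c^2
    dx = (box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2
    enc_w = max(box[2], gt[2]) - min(box[0], gt[0])
    enc_h = max(box[3], gt[3]) - min(box[1], gt[1])
    distance = (dx ** 2 + dy ** 2) / (enc_w ** 2 + enc_h ** 2)

    # Aspect ratio consistency v and trade-off parameter alpha
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + distance + alpha * v


# Two squares share the same aspect ratio, so v = 0 and only the
# IoU and center-distance terms contribute: 1 - 1/7 + 1/9 = 61/63
print(round(ciou_loss((0, 0, 100, 100), (50, 50, 150, 150)), 4))  # 0.9683
```

When the two boxes have different aspect ratios, \(v\) is positive and the loss exceeds the value given by the IoU and center-distance terms alone.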