
Sign-Language-Recognition

A simple project that recognizes sign language using PyTorch and TorchVision.

Dataset:

Model Summaries:

Two models were trained: a TinyVGG built from scratch, and an EfficientNet-B0 backbone with its features frozen and a new classification head.

========================================================================================================================
Layer (type (var_name))                  Input Shape          Output Shape         Param #              Trainable
========================================================================================================================
TinyVGG (TinyVGG)                        [16, 3, 64, 64]      [16, 10]             --                   True
├─Sequential (conv_layer1)               [16, 3, 64, 64]      [16, 60, 29, 29]     --                   True
│    └─Conv2d (0)                        [16, 3, 64, 64]      [16, 60, 62, 62]     1,680                True
│    └─ReLU (1)                          [16, 60, 62, 62]     [16, 60, 62, 62]     --                   --
│    └─Conv2d (2)                        [16, 60, 62, 62]     [16, 60, 60, 60]     32,460               True
│    └─ReLU (3)                          [16, 60, 60, 60]     [16, 60, 60, 60]     --                   --
│    └─MaxPool2d (4)                     [16, 60, 60, 60]     [16, 60, 29, 29]     --                   --
├─Sequential (conv_layer2)               [16, 60, 29, 29]     [16, 60, 12, 12]     --                   True
│    └─Conv2d (0)                        [16, 60, 29, 29]     [16, 60, 27, 27]     32,460               True
│    └─ReLU (1)                          [16, 60, 27, 27]     [16, 60, 27, 27]     --                   --
│    └─Conv2d (2)                        [16, 60, 27, 27]     [16, 60, 25, 25]     32,460               True
│    └─ReLU (3)                          [16, 60, 25, 25]     [16, 60, 25, 25]     --                   --
│    └─MaxPool2d (4)                     [16, 60, 25, 25]     [16, 60, 12, 12]     --                   --
├─Sequential (classification_layer)      [16, 60, 12, 12]     [16, 10]             --                   True
│    └─Flatten (0)                       [16, 60, 12, 12]     [16, 8640]           --                   --
│    └─Linear (1)                        [16, 8640]           [16, 10]             86,410               True
│    └─Softmax (2)                       [16, 10]             [16, 10]             --                   --
========================================================================================================================
Total params: 185,470
Trainable params: 185,470
Non-trainable params: 0
Total mult-adds (G): 2.68
========================================================================================================================
Input size (MB): 0.79
Forward/backward pass size (MB): 67.57
Params size (MB): 0.74
Estimated Total Size (MB): 69.10
========================================================================================================================
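The layer names and shapes in the summary above imply a model definition along the following lines. This is a hedged reconstruction, not the repository's actual code: the `MaxPool2d` kernel and stride are chosen to match the 60→29 and 25→12 spatial shapes, and the constructor argument names are assumptions.

```python
import torch
from torch import nn

class TinyVGG(nn.Module):
    def __init__(self, in_channels=3, hidden_units=60, num_classes=10):
        super().__init__()
        self.conv_layer1 = nn.Sequential(
            nn.Conv2d(in_channels, hidden_units, kernel_size=3),   # 64 -> 62
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 62 -> 60
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),                 # 60 -> 29
        )
        self.conv_layer2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 29 -> 27
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 27 -> 25
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),                 # 25 -> 12
        )
        self.classification_layer = nn.Sequential(
            nn.Flatten(),                                          # 60*12*12 = 8640
            nn.Linear(hidden_units * 12 * 12, num_classes),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        return self.classification_layer(self.conv_layer2(self.conv_layer1(x)))

model = TinyVGG()
out = model(torch.randn(16, 3, 64, 64))  # same [16, 3, 64, 64] batch as the summary
```

Counting parameters of this reconstruction reproduces the 185,470 total reported by the summary (1,680 + 3 × 32,460 + 86,410).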
============================================================================================================================================
Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
============================================================================================================================================
EfficientNet (EfficientNet)                                  [16, 3, 64, 64]      [16, 10]             --                   Partial
├─Sequential (efficientnet_features)                         [16, 3, 64, 64]      [16, 1280, 2, 2]     --                   False
│    └─Conv2dNormActivation (0)                              [16, 3, 64, 64]      [16, 32, 32, 32]     --                   False
│    │    └─Conv2d (0)                                       [16, 3, 64, 64]      [16, 32, 32, 32]     (864)                False
│    │    └─BatchNorm2d (1)                                  [16, 32, 32, 32]     [16, 32, 32, 32]     (64)                 False
│    │    └─SiLU (2)                                         [16, 32, 32, 32]     [16, 32, 32, 32]     --                   --
│    └─Sequential (1)                                        [16, 32, 32, 32]     [16, 16, 32, 32]     --                   False
│    │    └─MBConv (0)                                       [16, 32, 32, 32]     [16, 16, 32, 32]     (1,448)              False
│    └─Sequential (2)                                        [16, 16, 32, 32]     [16, 24, 16, 16]     --                   False
│    │    └─MBConv (0)                                       [16, 16, 32, 32]     [16, 24, 16, 16]     (6,004)              False
│    │    └─MBConv (1)                                       [16, 24, 16, 16]     [16, 24, 16, 16]     (10,710)             False
│    └─Sequential (3)                                        [16, 24, 16, 16]     [16, 40, 8, 8]       --                   False
│    │    └─MBConv (0)                                       [16, 24, 16, 16]     [16, 40, 8, 8]       (15,350)             False
│    │    └─MBConv (1)                                       [16, 40, 8, 8]       [16, 40, 8, 8]       (31,290)             False
│    └─Sequential (4)                                        [16, 40, 8, 8]       [16, 80, 4, 4]       --                   False
│    │    └─MBConv (0)                                       [16, 40, 8, 8]       [16, 80, 4, 4]       (37,130)             False
│    │    └─MBConv (1)                                       [16, 80, 4, 4]       [16, 80, 4, 4]       (102,900)            False
│    │    └─MBConv (2)                                       [16, 80, 4, 4]       [16, 80, 4, 4]       (102,900)            False
│    └─Sequential (5)                                        [16, 80, 4, 4]       [16, 112, 4, 4]      --                   False
│    │    └─MBConv (0)                                       [16, 80, 4, 4]       [16, 112, 4, 4]      (126,004)            False
│    │    └─MBConv (1)                                       [16, 112, 4, 4]      [16, 112, 4, 4]      (208,572)            False
│    │    └─MBConv (2)                                       [16, 112, 4, 4]      [16, 112, 4, 4]      (208,572)            False
│    └─Sequential (6)                                        [16, 112, 4, 4]      [16, 192, 2, 2]      --                   False
│    │    └─MBConv (0)                                       [16, 112, 4, 4]      [16, 192, 2, 2]      (262,492)            False
│    │    └─MBConv (1)                                       [16, 192, 2, 2]      [16, 192, 2, 2]      (587,952)            False
│    │    └─MBConv (2)                                       [16, 192, 2, 2]      [16, 192, 2, 2]      (587,952)            False
│    │    └─MBConv (3)                                       [16, 192, 2, 2]      [16, 192, 2, 2]      (587,952)            False
│    └─Sequential (7)                                        [16, 192, 2, 2]      [16, 320, 2, 2]      --                   False
│    │    └─MBConv (0)                                       [16, 192, 2, 2]      [16, 320, 2, 2]      (717,232)            False
│    └─Conv2dNormActivation (8)                              [16, 320, 2, 2]      [16, 1280, 2, 2]     --                   False
│    │    └─Conv2d (0)                                       [16, 320, 2, 2]      [16, 1280, 2, 2]     (409,600)            False
│    │    └─BatchNorm2d (1)                                  [16, 1280, 2, 2]     [16, 1280, 2, 2]     (2,560)              False
│    │    └─SiLU (2)                                         [16, 1280, 2, 2]     [16, 1280, 2, 2]     --                   --
├─Sequential (classifier)                                    [16, 1280, 2, 2]     [16, 10]             --                   True
│    └─Flatten (0)                                           [16, 1280, 2, 2]     [16, 5120]           --                   --
│    └─Linear (1)                                            [16, 5120]           [16, 10]             51,210               True
│    └─Softmax (2)                                           [16, 10]             [16, 10]             --                   --
============================================================================================================================================
Total params: 4,058,758
Trainable params: 51,210
Non-trainable params: 4,007,548
Total mult-adds (M): 513.11
============================================================================================================================================
Input size (MB): 0.79
Forward/backward pass size (MB): 142.00
Params size (MB): 16.24
Estimated Total Size (MB): 159.02
============================================================================================================================================

Training Output:

(venv) navin3d@Navins-MacBook-Pro SignLanguage-Recognition % python3 train.py
Using DEVICE     : cpu
Number of Epochs : 10
Batch Size       : 16
Hidden Units     : 60
Learning Rate    : 0.001
Num of classes   : 10
Epoch: 1 | train_loss: 0.1376 | train_acc: 0.2355 | test_loss: 0.1152 | test_acc: 0.5625
Epoch: 2 | train_loss: 0.1198 | train_acc: 0.5436 | test_loss: 0.1003 | test_acc: 0.7500
Epoch: 3 | train_loss: 0.1153 | train_acc: 0.6163 | test_loss: 0.0991 | test_acc: 0.8125
Epoch: 4 | train_loss: 0.1134 | train_acc: 0.6453 | test_loss: 0.1091 | test_acc: 0.6250
Epoch: 5 | train_loss: 0.1141 | train_acc: 0.6366 | test_loss: 0.1074 | test_acc: 0.6250
Epoch: 6 | train_loss: 0.1119 | train_acc: 0.6701 | test_loss: 0.0965 | test_acc: 0.8125
Epoch: 7 | train_loss: 0.1129 | train_acc: 0.6512 | test_loss: 0.1036 | test_acc: 0.6875
Epoch: 8 | train_loss: 0.1101 | train_acc: 0.7006 | test_loss: 0.0958 | test_acc: 0.8125
Epoch: 9 | train_loss: 0.1121 | train_acc: 0.6628 | test_loss: 0.1155 | test_acc: 0.5625
Epoch: 10 | train_loss: 0.1112 | train_acc: 0.6802 | test_loss: 0.0958 | test_acc: 0.8125
[INFO] Saving model to: models/TinyVGG_Epoch_10_Batch_16.pth
(venv) navin3d@Navins-MacBook-Pro SignLanguage-Recognition % python3 train.py
Using DEVICE     : cpu
Number of Epochs : 10
Batch Size       : 16
Hidden Units     : 60
Learning Rate    : 0.001
Num of classes   : 10
Epoch: 1 | train_loss: 0.1313 | train_acc: 0.4157 | test_loss: 0.1195 | test_acc: 0.5625
Epoch: 2 | train_loss: 0.1182 | train_acc: 0.6279 | test_loss: 0.1146 | test_acc: 0.5625
Epoch: 3 | train_loss: 0.1138 | train_acc: 0.6817 | test_loss: 0.1167 | test_acc: 0.4375
Epoch: 4 | train_loss: 0.1129 | train_acc: 0.6860 | test_loss: 0.1110 | test_acc: 0.6250
Epoch: 5 | train_loss: 0.1122 | train_acc: 0.6860 | test_loss: 0.1055 | test_acc: 0.7500
Epoch: 6 | train_loss: 0.1105 | train_acc: 0.7180 | test_loss: 0.1165 | test_acc: 0.5000
Epoch: 7 | train_loss: 0.1104 | train_acc: 0.7180 | test_loss: 0.1063 | test_acc: 0.6250
Epoch: 8 | train_loss: 0.1102 | train_acc: 0.7195 | test_loss: 0.1022 | test_acc: 0.8125
Epoch: 9 | train_loss: 0.1088 | train_acc: 0.7384 | test_loss: 0.1050 | test_acc: 0.7500
Epoch: 10 | train_loss: 0.1091 | train_acc: 0.7413 | test_loss: 0.1125 | test_acc: 0.5625
[INFO] Saving model to: models/EfficientNet_Epoch_10_Batch_16_LR_0.001.pth
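The per-epoch lines above can be produced by a standard PyTorch training step. The following is a minimal single-batch sketch of that loop, not the repository's `train.py`: the stand-in model, data, and save path are all hypothetical. Note that `nn.CrossEntropyLoss` expects raw logits, which is why this sketch omits a trailing Softmax.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Stand-in model; nn.CrossEntropyLoss expects raw logits, so no Softmax here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Learning Rate: 0.001

X = torch.randn(16, 3, 64, 64)        # one batch (Batch Size: 16)
y = torch.randint(0, 10, (16,))       # labels for 10 classes

for epoch in range(3):
    model.train()
    logits = model(X)
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    acc = (logits.argmax(dim=1) == y).float().mean().item()
    print(f"Epoch: {epoch + 1} | train_loss: {loss.item():.4f} | train_acc: {acc:.4f}")

# torch.save(model.state_dict(), save_path)  # hypothetical path, as in the [INFO] lines
```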
