Evaluating Logistic Regression and SVM for Image Analysis Using VGG-16, VGG-19, and Inception V3 Features

Wildan Habibi
Airlangga University
Indonesia
Imam Yuadi
Airlangga University
Indonesia

Abstract
This paper compares the classification accuracy of Logistic Regression (LR) and Support Vector Machine (SVM) classifiers for facial expression classification, using image embeddings obtained from the pre-trained models VGG-16, VGG-19, and Inception V3. Facial expression classification is relevant to emotion analysis, human-computer interaction, and security. The dataset consisted of five expressions: Angry, Fear, Happy, Neutral, and Sad. Feature embeddings were extracted with the pre-trained CNN models, which learn spatial features, and were then classified using LR and SVM. Performance was evaluated using accuracy, precision, recall, and F1-score. Inception V3 embeddings achieved the highest accuracy, 89.3% with SVM, followed by VGG-19 (87.6%) and VGG-16 (85.4%). Inception V3 was also the best at discriminating fine-grained expressions, as confirmed by confusion matrix analysis and by visualization techniques such as MDS and t-SNE. In contrast to earlier works that focus on individual models or conventional approaches, this work emphasizes the merits of combining powerful CNN feature extractors with strong classifiers. Limitations include a small dataset and only five expression classes, so future research should use larger, more varied datasets and address real-time performance to improve system robustness.
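As an illustration of how the reported metrics relate to a confusion matrix, the following minimal sketch computes accuracy and per-class precision, recall, and F1-score for the five expression classes. The confusion matrix counts and the helper names (per_class_metrics, accuracy) are hypothetical and chosen only for this example; they are not the paper's results or code.

```python
# Illustrative sketch only: the counts below are made up for demonstration
# and are NOT the results reported in the paper.
LABELS = ["Angry", "Fear", "Happy", "Neutral", "Sad"]

# CONFUSION[i][j] = number of images with true class i predicted as class j
CONFUSION = [
    [8, 1, 0, 0, 1],
    [1, 7, 0, 1, 1],
    [0, 0, 10, 0, 0],
    [0, 1, 0, 8, 1],
    [1, 1, 0, 1, 7],
]


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_metrics(confusion):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, true other
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        results.append((precision, recall, f1))
    return results


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.3f}")
    for label, (p, r, f1) in zip(LABELS, per_class_metrics(CONFUSION)):
        print(f"{label:8s} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Averaging the per-class values with equal weight gives macro-averaged scores; whether the paper reports macro or weighted averages is not stated in the abstract.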
Keywords
Facial Expression Recognition; Deep Learning; Image Embeddings; Logistic Regression (LR); Support Vector Machine (SVM)

Department of Information and Library Science, Faculty of Social and Political Sciences, Airlangga University
