BIDeepLab: An Improved Lightweight Multi-scale Feature Fusion Deeplab Algorithm for Facial Recognition on Mobile Devices

Authors

    Jinming Li, Yutong Zhou
    Jiangxi Agricultural University, Nanchang 330000, Jiangxi, China
    School of Computer Science, Carnegie Mellon University, Pittsburgh 15204, PA, United States

DOI:

https://doi.org/10.18063/csa.v3i1.917

Keywords:

Attention, multi-scale, segmentation

Abstract

This study presents BIDeepLab, a lightweight multi-scale feature fusion algorithm based on DeepLab and designed for facial segmentation tasks in face recognition applications on mobile devices. Mobile applications such as smart security, access control, and mobile identity verification increasingly require high-precision, low-latency face recognition, and BIDeepLab addresses this need with two key innovations. First, to improve multi-scale feature fusion during downsampling, we propose a Multi-Scale Attention (MSA) module that learns and integrates facial features at multiple scales more efficiently. Second, inspired by the BiFPN architecture, we enhance the fusion of high-level and low-level features so that boundary and semantic information is preserved more accurately during upsampling. These enhancements significantly improve segmentation quality while maintaining computational efficiency. Experiments were conducted on the Labeled Faces in the Wild (LFW) dataset, which contains over 13,000 real-world face images that were detected with the Viola-Jones face detector and labeled with identities. BIDeepLab achieved an mIoU of 90.2%, outperforming the original DeepLab in facial edge segmentation accuracy while substantially reducing model parameters and computational cost. These results validate BIDeepLab as a practical and efficient framework for real-time facial segmentation on mobile and embedded systems.
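The mIoU figure reported above is the mean intersection-over-union across segmentation classes. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `mean_iou`, the two-class toy masks, and the choice to average only over classes that appear in the prediction or the ground truth are all assumptions made for illustration.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target (an assumed convention)."""
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it counts once toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1

    # Skip classes that never appear, to avoid dividing by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no labeled pixels to evaluate")
    return sum(ious) / len(ious)


# Hypothetical toy masks: 0 = background, 1 = face.
target_mask = [[0, 0, 1, 1],
               [0, 1, 1, 1]]
pred_mask = [[0, 0, 0, 1],
             [0, 1, 1, 1]]

score = mean_iou(pred_mask, target_mask, num_classes=2)
print(f"mIoU = {score:.3f}")  # class IoUs are 3/4 and 4/5, so mIoU = 0.775
```

In this toy case one face pixel is predicted as background, which lowers the IoU of both classes. Boundary errors of this kind are what the edge-accuracy comparison in the abstract is concerned with.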

Published

2025-03-26