Overview of Large Language Models
DOI:
https://doi.org/10.18063/csa.v3i1.911
Keywords:
Large language model, Development history, Application scenarios, Technical challenges
Abstract
This article examines typical large language models, analyzing their definition, representative models, and the current state of the technology. Large language models are an advanced artificial intelligence technology: they have very large numbers of parameters, are trained on massive amounts of data, and perform natural language processing with the Transformer architecture at their core. The article traces the development history and features of various large language models. It also points out that the technology faces problems and challenges such as factually unreliable output and security risks, and that future development is expected to move toward lightweight, multimodal, and vertically specialized models. The research aims to provide a reference for the further study and application of large language models and to help promote the healthy development of this technology in various fields.