ABC (AI-Big Data Convergence) Forum
****** Participation in the ABC forum is free for all participants. To get a certificate, please register here ******
- The slides of the tutorial on LLM and GPT can be downloaded here. The author requests that anyone who uses or reposts the material give proper acknowledgment as follows:
"Demystifying Large Language Models and GPT" by Prof. Won Kim, Gachon University, South Korea
AI-BigData Convergence (ABC) Tutorial
The 25th International Conference on Information Integration and Web Intelligence (iiWAS2023)
Tutorial
Demystifying Large Language Models and Generative Pretrained Transformer
Prof. Won Kim, AI Vice President, Gachon University, South Korea
The release of ChatGPT from OpenAI has generated huge excitement and interest in Large Language Models (LLMs). Many companies have been busy developing their own LLMs and applications of the LLMs.
An LLM is a computer algorithm that processes natural language input and predicts the next word based on what it has already seen. LLMs use the transformer, which is a type of neural network architecture. GPT (generative pretrained transformer) is the name OpenAI has given to its LLM, and it has become perhaps the best-known LLM.
In this tutorial, I will explore what an LLM is and how it works, with emphasis on GPT. In particular, I will examine how input text is transformed into vectors and matrices of numbers, and how they flow through the key layers of the transformer architecture. I will also discuss the opportunities and issues of LLMs.
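As a rough illustration of the pipeline described above (text to token ids, token ids to vectors, vectors through an attention step), the following is a minimal sketch and not material from the tutorial itself. It makes several simplifying assumptions: a whitespace tokenizer instead of a subword tokenizer, a random embedding table instead of learned weights, and a single attention head with no learned query, key, or value projections. All function names are hypothetical.

```python
import math
import random


def tokenize(text, vocab):
    """Toy whitespace tokenizer (assumption): map each lowercased word to an integer id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.lower().split()]


def embed(token_ids, table, d_model):
    """Look up one vector per token and add a sinusoidal position encoding."""
    rows = []
    for pos, tid in enumerate(token_ids):
        vec = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            pe = math.sin(angle) if i % 2 == 0 else math.cos(angle)
            vec.append(table[tid][i] + pe)
        rows.append(vec)
    return rows


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    """Single-head scaled dot-product attention with queries = keys = values = x.

    Each position attends only to itself and to earlier positions, which is
    what lets a model of this kind predict the next token from what it has seen.
    """
    d = len(x[0])
    outputs, weights = [], []
    for i, q in enumerate(x):
        scores = [
            sum(a * b for a, b in zip(q, x[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * x[j][k] for j in range(i + 1)) for k in range(d)])
    return outputs, weights


# Usage: a six-word sentence becomes a 6 x 4 matrix before and after attention.
random.seed(0)
vocab = {}
ids = tokenize("the cat sat on the mat", vocab)
d_model = 4
table = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(len(vocab))]
x = embed(ids, table, d_model)
out, weights = causal_self_attention(x)
```

In a real transformer the embedding table and the query, key, and value projections are learned during training, and many attention heads and layers are stacked; the sketch only shows the shape of the data as it moves through one step.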
The tutorial is organized as follows:
- Introduction to LLM
- LLM training and inference
- GPT performance
- Input text tokenization
- Input token embedding and position encoding
- Attention concept
- Transformer architecture
- How Attention works
- Evolution of the GPT models
- Opportunities and issues of the LLMs
Note: Although this tutorial is aimed at “beginners” (i.e., computer science majors who do not have intimate knowledge of LLM and NLP), I am assuming at least a superficial knowledge of artificial neural networks.
Biography
Won Kim is currently a distinguished professor with the School of Computing in Gachon University, near Seoul, Korea. He is also AI Vice President of Gachon University and Managing Director of the National Program of Excellence in Software Education at Gachon University, sponsored by the Korean Ministry of Science & Technology and Information Communication Technology.
Before joining the faculty of Gachon University, he was a professor with SKKU (SungKyunKwan University), and a senior vice president of Samsung Electronics, both in Korea.
He was the founder and CEO of UniSQL (where he led the development of the world’s first object-relational database system) and also of Cyber Database Solutions in the US. Before founding UniSQL, he worked as a researcher at IBM Almaden Research Center and as a research director at MCC (Microelectronics and Computer Technology Corporation), where he led a team that developed the ORION object-oriented database system (one of the first OODBs).
He received a Ph.D. in computer science from the University of Illinois at Urbana-Champaign.
In 2017, he received an Order of Service Merit Medal from the Government of Korea for his services to the IT industry in both the US and Korea. In 2018, he received an Alumni Outstanding Educator Award from the Computer Science Department of the University of Illinois at Urbana-Champaign.
In the US, he served as Chair of ACM (Association for Computing Machinery) SIGMOD (special interest group on management of data) and founding Chair of ACM SIGKDD (special interest group on knowledge discovery and data mining). He also served as Editor-in-Chief of ACM Transactions on Database Systems, and founding Editor-in-Chief of ACM Transactions on Internet Technology.
Panel
The Future of LLMs
Large language models (LLMs) are a type of generative artificial intelligence trained on massive datasets of text and code. They can generate text, translate languages, write different kinds of creative content, and answer questions in an informative way. LLMs are still under development, but they have already had a significant impact on a wide range of industries and applications.
This panel will discuss the future of LLMs, including the following topics:
- New and emerging applications of LLMs. How will LLMs be used in the future? What new possibilities will they open up?
- The challenges of developing and deploying LLMs. LLMs are complex and computationally expensive to train and deploy. What are the challenges that need to be overcome in order to make LLMs more accessible and affordable?
- The ethical and social implications of LLMs. LLMs have the potential to be used for good or for bad. What are the ethical and social implications of this technology? How can we ensure that LLMs are used responsibly?
The panel will feature experts from a variety of fields, including academia, industry, and government. They will share their insights and perspectives on the future of LLMs and discuss the challenges and opportunities that lie ahead.
Audience: This panel is free and open to anyone who is interested in the future of artificial intelligence and large language models. It will be of particular interest to researchers, developers, and policymakers who are working in this field.
This panel will provide a valuable opportunity to learn about the future of LLMs and the challenges and opportunities that lie ahead. It will also be a forum for discussion and debate about the ethical and social implications of this technology.
Stéphane Bressan, National University of Singapore, Singapore (Moderator)
Stéphane Bressan is Associate Professor in the Department of Computer Science of the School of Computing (SoC) of the National University of Singapore (NUS). Stéphane is Track Leader for Maritime Information Technologies at NUS Centre for Maritime Studies (CMS), Affiliate Professor at NUS Business Analytics Centre, Faculty Affiliate at NUS Institute of Data Science, and a member of the Image & Pervasive Access Lab (IPAL) (Singapore-France CNRS UMI 29255).
In 1990, Stéphane joined the European Computer-industry Research Centre (ECRC) of Bull, ICL, and Siemens in Munich (Germany). From 1996 to 1998, he was Research Associate at the Sloan School of Management of the Massachusetts Institute of Technology (MIT) (United States of America).
Stéphane's research interest is the integration, management, and analysis of data from heterogeneous, disparate, and distributed sources. Stéphane has developed expertise in data- and physics-driven modelling, simulation, and optimisation with data mining and machine learning algorithms.
Ngurah Agus Sanjaya ER, Udayana University, Indonesia (Panelist)
Ngurah Agus Sanjaya ER is an Associate Professor in the Informatics Department at Udayana University. His research mainly focuses on automatic extraction of information from semi-structured or unstructured data. He received his PhD from Télécom ParisTech, Paris, France, where he defended his doctoral thesis entitled "Advanced Information Extraction by Example". In this thesis he addresses an alternative method of searching for information, i.e. by giving examples of the information in question. He first tries to improve the accuracy of search-by-example systems by expanding the given examples syntactically. Later, he uses the truth discovery paradigm to rank the returned query results. Finally, he investigates the possibility of expanding the examples semantically by labelling each group of elements of the examples. He currently continues his work on text mining, focusing on preserving the Balinese language through machine learning. He has published articles on a stemmer and a Part-of-Speech (POS) tagger for the Balinese language. He has developed a digital portal and a mobile application for Balinese folklore stories, in which he implemented all of his research. His research interests also include, but are not limited to, truth discovery, natural language processing, data mining and big data.
Gabriele Kotsis, Johannes Kepler University Linz, Austria (Panelist)
Gabriele Kotsis is Full Professor in computer science at Johannes Kepler University, Linz, Austria, and Past President of the ACM. Her work received recognition from the very beginning: her master's thesis, submitted at the University of Vienna in 1991, was honored with the student sponsorship award of the Austrian Computer Society, and her PhD in 1995 was honored with the highly prestigious Heinz Zemanek Award. This recognition was doubtless a motivating factor in her decision to dedicate her career to research in academia and to the scientific community. In 2002, she was one of the co-founding chairs of the working group for professors in computer science within the Austrian Computer Society (OCG). From 2003 to 2007 she was President of the Austrian Computer Society, the first woman to hold this position. In addition to her two-term presidency at OCG, Gabriele takes an active part in the Editorial Board of the OCG Book Series, in the working group Fem-IT (Association of Female University Professors in IT) and in the OCG award committee.
From 2007 to 2015 she served as Vice-Rector for Research at Johannes Kepler University (JKU). Her responsibilities included the development of R&D strategies and policies within the university, coordination and interaction with national and international governmental organisations and funding bodies, and the establishment of collaborations with other research organisations and business partners. From 2016 to 2020, Gabriele Kotsis was JKU's representative in the ASEA-UNINET academic research network, which promotes cooperation among European and South-East Asian public universities. Her active involvement in this network led to her nomination and election as President for the period February 2019 to July 2022, and to her election as Austria's National Coordinator from 2020 on.
A Min Tjoa, Vienna University of Technology, Austria (Panelist)
A Min Tjoa received his Ph.D. in Computer Science from the Johannes Kepler University in Linz, Austria, and has been a Professor of Software Technology at the Vienna University of Technology since 1994, with research expertise in data warehousing, business intelligence, cybersecurity, and environmental informatics. He is the author of more than 200 peer-reviewed articles.
Currently, he is also an Adjunct Professor at ITB (Bandung Institute of Technology).
He is the Executive Chairman of the Austrian National Competence Center for Excellent Technologies in the field of IT Security (SBA). He was the President of the United Nations Commission on Science and Technology for Development (UN-CSTD) in 2019/2020 and its Vice President from 2020 to 2022.
He was Honorary Secretary of IFIP (International Federation for Information Processing) from 2013 to 2019 and Vice President of Infoterm (International Information Center for Terminology).
He is also President of the Austrian-Indonesian Society (since 2017).
He is a member of the Executive Board of the ASEAN-European Academic University Network (ASEA-UNINET). He has received an honorary doctorate from the Czech Technical University in Prague and an honorary professorship from Hue University in Vietnam.
Dirk Draheim, Tallinn University of Technology, Estonia (Panelist)
Dirk Draheim received his PhD from Freie Universität Berlin and his habilitation from Universität Mannheim, Germany. Currently, he is full professor of information society technology at Tallinn University of Technology (Taltech), Estonia, head of the Taltech Information Systems Group, and holder of the Taltech National Professorship in Information Society Technology. The Information Systems Group conducts research in large and ultra-large-scale IT systems. Dirk has (co-)authored over 200 publications and is author of the Springer books "Business Process Technology", "Semantics of the Probabilistic Typed Lambda Calculus" and "Generalized Jeffrey Conditionalization", and co-author of the Springer book "Form-Oriented Analysis". He is also an initiator and leader of numerous digital transformation initiatives.