- GPT models have revolutionised AI by handling human language with unprecedented fluency and contextual awareness. They go beyond chatbots, with a growing impact on healthcare, finance, and education.
- Rooted in NLP, GPT bridges human language with structured data, powering tasks from translation to sentiment analysis.
- With the AI market projected to soar to roughly $2.6 trillion by 2032, the current boom is profound, much like blockchain's early days.
In the ever-evolving landscape of artificial intelligence, one innovation has made waves like no other: the Generative Pre-trained Transformer, or GPT model. Imagine a technology that can understand, generate, and even predict human language with unprecedented fluency and contextual awareness. GPT models have revolutionised the way we interact with AI, from chatbots that sound almost human to content generation that’s eerily convincing. But the story doesn’t end there. These models are not just about chatbots and text generation; they’re unlocking new frontiers in healthcare, finance, education, and more.
Now, let me step in. After all, what better way to start talking about artificial intelligence than letting one of these models, specifically ChatGPT, write the introduction? I find it fascinating that they can speak about themselves. Of course, they can do far more than that.
If I were asked for the most overused term of the post-pandemic world, it would very likely be “artificial intelligence”. Today’s world is already saturated with self-proclaimed experts, companies hopping on the bandwagon with questionable applications, web courses and so on. It all makes sense: the global artificial intelligence (AI) market was valued at USD 454.12 billion in 2022 and is expected to hit around USD 2,575.16 billion by 2032, a compound annual growth rate (CAGR) of 19% from 2023 to 2032. More specifically, North America alone accounted for more than 36.84% of the market share in 2022, with deep learning the most important AI application overall and the services segment holding the largest market share among solutions. But what is deep learning? Which services are we talking about? Which companies? How to invest? These are all great questions, to which I don’t have an answer yet. For this reason, the objective of this article, and of the ones to come, is to dig into the complicated and geeky world surrounding this technology, to help me understand how GPT models work, from the most basic principles to the latest trends. Then, since I have finance and speculation in my blood, I will look into the current and future applications of AI to finance, once I have the appropriate tools to understand them. To some extent, this trend reminds me of the early days of blockchain, around 2016-2017. Back then, too, I remember trying to understand the validation processes and everything around them, before selling my house to buy Bitcoin, just before the bubble burst. Marvellous.
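As a quick sanity check on those figures (my own back-of-the-envelope arithmetic, not Precedence Research’s), the implied growth rate can be recomputed in a couple of lines of Python:

```python
# Back-of-the-envelope check of the quoted CAGR (my arithmetic, not the report's):
# USD 454.12bn in 2022 growing to USD 2,575.16bn in 2032.
start, end, years = 454.12, 2575.16, 10
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.2%}")  # ~18.95%, i.e. roughly the 19% quoted
```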
To begin, it is worth acknowledging that computer science is a very structured field: for this reason, a couple of definitions won’t hurt. Firstly, Artificial Intelligence is, according to Britannica, “the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings”. It is therefore no surprise that its most famous subfields are machine learning, computer vision and robotics. Of these, machine learning can be defined as the use of self-learning algorithms that, based on data, predict outcomes. Under this generic umbrella we find a wide range of ML types, such as supervised and unsupervised learning, the field of neural networks (NN), and many others (like anomaly detection or online learning). A minimal sketch of the supervised flavour follows below.
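To make “self-learning algorithms that, based on data, predict outcomes” concrete, here is a tiny, hypothetical example of supervised learning; scikit-learn and the toy data are my assumptions, not something the sources above prescribe:

```python
# A minimal sketch of supervised ML: the algorithm learns from labelled
# examples and predicts outcomes for new inputs. scikit-learn is assumed
# to be installed (pip install scikit-learn); the data is invented.
from sklearn.linear_model import LogisticRegression

X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]  # hours studied
y = [0, 0, 0, 1, 1, 1]                          # exam failed (0) / passed (1)

model = LogisticRegression()
model.fit(X, y)                  # "self-learning" from the labelled data
print(model.predict([[5.0]]))    # predicted outcome for an unseen input -> [1]
```

Unsupervised learning, by contrast, would receive only the inputs X, with no labels, and look for structure in the data on its own.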
Now, the first step to understanding AI, GPT and everything around them is Natural Language Processing. This field, as Stanford’s NLP overview defines it, began in the 1940s, after World War II, with the intent of creating a machine that could automatically translate from one language to another. Still, it was only in the ‘90s that NLP shifted its focus towards information extraction and generation, due to the vast amounts of information scattered across the Internet; the mass adoption of personal computers then drove demand for consumer-level NLP applications. Today, it is the essential starting point for connecting human language and computer language. In fact, as M. Keen from IBM puts it, NLP is a “box of tools” used to translate between unstructured and structured data. Unstructured data is ordinary written human language, as in me writing “Remember to buy ice cream” just like that. Structured data, on the other hand, is the logical translation of the sentence into terms a computer can understand and work with, like “remember” being a verb meaning not to forget, or “ice cream” being a noun meaning a cold dessert. Once the computer understands the meaning of the words, it can reconstruct what they mean in the given sentence and context, and act on them, for example by adding a note to my calendar reminding me to go to the supermarket. Within NLP, the passage from unstructured to structured data is called Natural Language Understanding, while the reverse is called Natural Language Generation. This is what provides the input for a wide range of applications, from machine translation (which is the geeky way to say language translation) to chatbots, virtual assistants and sentiment analysis, just to name a few.
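To see what that translation into structured data can look like in practice, here is a small sketch; spaCy and its pre-trained English model are my choice of tool, not something the sources prescribe (install with `pip install spacy` and `python -m spacy download en_core_web_sm`):

```python
# A toy NLU example: turn the unstructured sentence from the text into
# structured data an application could act on (e.g. a calendar reminder).
import spacy

nlp = spacy.load("en_core_web_sm")     # pre-trained English pipeline
doc = nlp("Remember to buy ice cream")

# Structured view: each word with its grammatical role and base form.
for token in doc:
    print(token.text, token.pos_, token.lemma_)  # e.g. Remember VERB remember

# If the sentence opens with the verb "remember", treat the rest as a task:
# the kind of structured record a calendar app could store.
if doc[0].lemma_.lower() == "remember":
    print({"action": "add_reminder", "task": doc[2:].text})  # 'buy ice cream'
```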
The NLP process starts with an input, either written text or speech converted to writing by a speech-to-text algorithm: in any case, a form of unstructured text. The message is then broken down into single pieces, or tokens, for example a sentence into single words, to be processed one by one. After this tokenization phase, individual tokens can be reduced to their root (for example, “going”, “gone” or “goes” to the stem “go”) or, for tokens that bear no real relationship to their stem (“university” cannot be understood via its root “universe”), to their lemma, so that “better” can be traced back to its base meaning “good”. In this last case, the reduction is done by an algorithm trained on the necessary data. Then, since a word’s meaning depends not only on the word itself but also on its role in the sentence, every token is analysed against the context (for example, the logical structure) using a tool called part-of-speech tagging. Lastly, at least at a superficial level, a Named Entity Recognition (NER) algorithm can be applied, which links specific tokens to specific entities, like “California” to a place in the United States, or “Lorenzo” to a person’s name.
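The sketch below runs through these steps end to end; NLTK and spaCy are my choices for illustration (the article names no specific tools), and the one-off data downloads are included:

```python
# Tokenization, stemming, lemmatization, POS tagging and NER, step by step.
# NLTK and spaCy are assumptions; any comparable NLP toolkit would do.
import nltk
import spacy
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("punkt", quiet=True)      # tokenizer data (plus punkt_tab
nltk.download("punkt_tab", quiet=True)  # on newer NLTK versions)
nltk.download("wordnet", quiet=True)    # lemmatizer dictionary

sentence = "Lorenzo is going to a university in California"
tokens = nltk.word_tokenize(sentence)              # 1. tokenization
print(tokens)

stemmer = PorterStemmer()                          # 2a. stemming
print([stemmer.stem(t) for t in tokens])
# "going" -> "go", but "university" -> "univers": the stem loses the meaning

lemmatizer = WordNetLemmatizer()                   # 2b. lemmatization
print(lemmatizer.lemmatize("better", pos="a"))     # -> "good"

nlp = spacy.load("en_core_web_sm")                 # pre-trained pipeline
doc = nlp(sentence)
print([(t.text, t.pos_) for t in doc])             # 3. POS tagging
print([(e.text, e.label_) for e in doc.ents])      # 4. NER, e.g.
# ("Lorenzo", "PERSON"), ("California", "GPE" — a geopolitical entity)
```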
Now, why is Natural Language Processing relevant to GPT models? Well, because it is not merely relevant: it is the very essence of what most GPT models are for. The various NLP tasks, like text generation, language translation or sentiment analysis, all start from the need to understand and generate human language text. Again, going from unstructured to structured data and back. Unfortunately, there are a lot of fields in between. But don’t you worry, I will lead you down this tumultuous road, eventually.
SOURCES
- Precedence Research (2023). Artificial Intelligence Market Size, Growth, Report 2022-2030. [online] Available at: https://www.precedenceresearch.com/artificial-intelligence-market. [Accessed 29 April 2024]
- cs.stanford.edu. (n.d.). NLP – overview. [online] Available at: https://cs.stanford.edu/people/eroberts/courses/soco/projects/2004-05/nlp/overview_history.html. [Accessed 29 April 2024]
- IBM (2023). What is Natural Language Processing? [online] IBM. Available at: https://www.ibm.com/topics/natural-language-processing. [Accessed 29 April 2024]
- YouTube (n.d.). What is NLP (Natural Language Processing)? [online] Available at: https://www.youtube.com/watch?v=fLvJ8VdHLA0. [Accessed 29 April 2024]
- Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D. and Mann, G. (2023). BloombergGPT: A Large Language Model for Finance. arXiv:2303.17564 [cs, q-fin]. [online] Available at: https://arxiv.org/abs/2303.17564. [Accessed 29 April 2024]
- Image courtesy of Pexels