What is this key to artificial intelligence?
That would be GPT-3, short for Generative Pre-trained Transformer 3. It is the third model in its series, developed by OpenAI, the start-up co-founded by Elon Musk (among others). It is a Natural Language Processing (NLP) model based on the transformer architecture introduced in 2017. It uses neural networks and deep learning algorithms to write coherent, sensible human text in multiple languages. And it goes beyond just text: it is capable of more generalised, second-order applications such as writing programs that create websites, writing blog articles, generating images from text descriptions, and so on. Given the right conditions, its output quality is good enough to fool most human beings into believing that the text was human-written. If you don’t believe this, have a look at this blog article by The Guardian. Spoiler alert: the article was written by none other than GPT-3.
Why is GPT-3 key to artificial intelligence?
For a transformer model to be task-agnostic and require only a handful of examples to pick up a new task (known as few-shot learning) at this scale is unheard of. I’ve been researching the scientific paper published by the development team. GPT-3 achieved this not by reinventing the base technology, but by scaling up the model and the data-sets used to train it. Put in simple terms, it uses the current state-of-the-art methods, but goes bananas-brute-force with scale. To compare, the latest and biggest Microsoft NLP model, Turing-NLG, released in February 2020, uses about 17 billion parameters (these can be imagined as weighted connections in a neural network). GPT-3, trained on both open internet data and specifically chosen data-sets, packs 175 billion parameters. The training run is estimated to have cost around 5 million US dollars (USD), and running the model is estimated to cost about 87,000 USD per year.
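To make “few-shot” concrete, here is a minimal sketch of what such a prompt looks like. The task and examples are my own illustration, not taken from the paper; the point is that the model receives a few solved examples as plain text and continues the pattern, with no retraining involved.

```python
# A few-shot prompt: a handful of solved examples, then an unsolved one.
# The model "learns" the task purely from this context at inference time;
# none of its 175 billion parameters are updated.
prompt = (
    "Translate English to German.\n\n"
    "English: Good morning.\n"
    "German: Guten Morgen.\n\n"
    "English: Where is the train station?\n"
    "German: Wo ist der Bahnhof?\n\n"
    "English: I would like a coffee, please.\n"
    "German:"
)
# GPT-3 would be expected to complete this with something like
# "Ich hätte gerne einen Kaffee, bitte."
```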
The Good
GPT-3 simply has a remarkably wide array of applications. It is the most holistic statistical model of the human language we have yet. If something has been recorded in human language on the internet, GPT-3 has a good chance of knowing it, and the more frequently it has occurred, the better the chance of GPT-3 having retained it. We have the possibility to benefit directly from this, i.e., from thousands of cumulative years of human text. I would define this as augmented intelligence, not artificial intelligence. Regardless of what we choose to call it, it means that we may be able to generate program code without being programmers, or generate images from descriptions without being forensic sketch artists, and so on.
The Bad
From my foraging into the scientific paper, what I understand is that GPT-3 functions like a grandiose auto-complete algorithm. From the vastness of its data-sets, or more precisely, from the neural network it has distilled out of those data-sets, it computes the conditional probability of the text that accomplishes the task requested of it. This means that GPT-3’s world is just one big bunch of strings that have no inherent meaning to the model. All it does is pattern-match, and it gets continually better at it. Before we bash this approach, I’d like to share that I learnt a foreign language (German) in recent years, and I intuitively followed a similar approach. So, as rude as it sounds, GPT-3’s mimic-and-pattern-match approach is remarkably similar to how humans operate. However, beyond pattern matching, GPT-3 has no causal logic or reasoning skills, which is precisely why I refuse to call it artificial intelligence. Again, this is based on my understanding of the scientific paper; I have no access to the source code yet (that’s apparently worth a billion dollars). Since it does not do any reasoning, it can produce strange results, like the examples that made the rounds on Twitter. And without reasoning logic, by tuning the weights of parameters or groups of parameters, it is possible to generate completely authentic-looking, coherent text that is manipulative and potentially false. In this sense, GPT-3 is outright dangerous.
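To illustrate what “computing the conditional probability of text” means, here is a toy auto-complete built from word-pair counts. This is my own simplification, not GPT-3’s actual mechanism: GPT-3 works over sub-word tokens with a 175-billion-parameter neural network rather than a count table, but the spirit, predicting the next token from the statistics of everything it has seen, is the same.

```python
import random
from collections import Counter, defaultdict

# A toy corpus standing in for "everything GPT-3 has read".
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count how often each word follows each preceding word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word(context):
    """Sample a next word from the conditional distribution P(next | context)."""
    counts = follow_counts[context]
    words = list(counts)
    return random.choices(words, weights=[counts[w] for w in words])[0]

# "Auto-complete" a sentence one word at a time.
sentence = ["the"]
for _ in range(5):
    if not follow_counts[sentence[-1]]:  # dead end: no observed continuation
        break
    sentence.append(next_word(sentence[-1]))

print(" ".join(sentence))  # e.g. "the cat sat on the mat"
```

Notice that the strings carry no meaning for the program; it only ever matches patterns in the counts. That is the essence of the criticism.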
The Ugly
Since OpenAI know that GPT-3 is dangerous, they chose not to release the source code. Their argument was that since GPT-3 requires a lot of resources to run, the group most likely to benefit from the code would be big organisations, which have an incentive to manipulate their target audiences. OpenAI still plans to make an API available to normal users like you and me, for a small fee. Using the API, one can request GPT-3 to perform a task and get the output back. I have been on the waiting list for the API for months now. If anyone from the dev team is reading this, I’m right here, guys!
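For what it’s worth, using the API is documented to look roughly like the sketch below. I haven’t been able to verify this first-hand (see: waiting list), and the engine name and parameters are taken from OpenAI’s public examples, so treat this as illustrative.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # issued once you get off the waiting list

# Ask GPT-3 ("davinci" is the largest publicly listed engine) to
# complete a prompt; the API returns the generated continuation.
response = openai.Completion.create(
    engine="davinci",
    prompt="Write a one-sentence summary of the transformer architecture:",
    max_tokens=60,
    temperature=0.7,  # higher values make the output more varied
)

print(response.choices[0].text.strip())
```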
This sounds like a genuinely ethical move from OpenAI, right? Except, OpenAI announced last month that Microsoft (also part of the original supporting group, having invested a meagre 1 billion USD) would get exclusive rights to license the source code. As the name suggests, OpenAI was originally meant to be about open-source code, and they have published the source code for all of their work until this point. For them to take a U-turn now and function as a for-profit organisation seems suspect, to say the least. Besides, let’s give exclusive access to such a powerful model to one single organisation; imagine, for instance, a biased causal-reasoning model being paired with GPT-3. What could possibly go wrong?
I’d love to hear your thoughts on this in the comments section below.
I hope you found this article interesting and useful. If you’d like to get notified when interesting content gets published here, consider subscribing.