2022 is the end of hypothetical AI - what this means for policy and us
Dan Ingvarson, Senior Technical Policy Consultant
This is the first in a series of four posts intended to give policy leaders a set of key considerations as we begin this next phase of managing AI rather than imagining it.
“Gradually, then suddenly.” - Ernest Hemingway
2022 is the year when hypothetical discussions about the impact of AI rapidly turned into reality. A year when advances in computing power, software engineering, human guidance and a large enough set of parameters crossed the long-anticipated event horizon: an AI tool built on a Large Language Model (LLM) “mostly making sense” in response to a diverse array of queries.
The most prominent example of these language models is ChatGPT which, built on the GPT-3 large language model, has reportedly scored 147 on a verbal IQ test, passed sections of the SAT, and rapidly become a major topic of discussion in the field of education, putting the stark realities that the EdSafe AI Alliance has been pointing to into context.
The Journey to an LLM
Enough human and computational effort, access to large content resources, and time came together to create the environment that enabled the 2022 breakthrough. It was scale (and human guidance) that made the emerging set of textual relationships so responsive, showing the mass market what insiders had long known was coming.
Four important developments needed to occur for LLMs to reach their current state:
1. The AI collected a specific set of open data and broke it into words and sentences.
2. A previous version of the model was asked to guess the next word in each sentence, updating a “parameter set” when it was right or wrong, so the system could measure how probable it was that its guesses were correct.
3. It then answered a set of questions which humans corrected, repeatedly “teaching” it the “right” answer and creating a human feedback loop.
4. The whole process was scaled until the model was large enough that the connections it made could reliably produce appropriate responses. For GPT-3, that means 175 billion parameters.
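The next-word-prediction step described above can be sketched in miniature. The toy model below is a hypothetical illustration only, not how GPT-3 is actually implemented: it keeps a tiny “parameter set” of word-following counts, updates those parameters from training sentences, and then guesses the most probable next word.

```python
from collections import defaultdict, Counter

def train_bigram(sentences):
    """Build a toy 'parameter set': counts of which word follows which."""
    params = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            params[current][nxt] += 1  # update parameters for each observed pair
    return params

def predict_next(params, word):
    """Guess the most probable next word, based on training counts."""
    if word not in params:
        return None  # never seen this word during training
    return params[word].most_common(1)[0][0]

# Tiny illustrative corpus (hypothetical example data)
corpus = ["the cat sat on the mat", "the cat ran"]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat" (follows "the" twice in training)
```

Real LLMs replace these simple counts with billions of learned parameters and predict over whole contexts rather than single words, but the underlying loop, guess the next word and adjust, is the same idea.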
Critically for policy, the foundational work on LLMs is already finished.
Having accessed and been trained on much of the world’s available data, the current language models are completing an establishment phase in which, like us, they have read and learned. These systems do not retain a copy of the original data, only the language relationships that shape their models, and any one piece of content contributes an almost undetectably small amount of learning. This has some important policy considerations:
The results of these systems are, in some cases, as good as they are going to get, having already learned from much of the open, high-quality data on the internet. New sources and tactics for training data now in development may be more intrusive, requiring a policy tradeoff and the opening of currently closed sources.
Because the training of the large language models has, in many cases, already taken place or is currently under way, any proposed regulations to control the data used to train AIs will only affect subsequent models, and may have no effect at all.
It is vital to critically evaluate what policy or regulatory work is possible and necessary to ensure safety, efficacy and equity in the use of these models in education, especially as “jailbreak” hacks that circumvent existing safety measures are already being found.
The cost and complexity of creation are dropping.
The first AIs took 15 years or more to reach maturity and surpass human performance, whereas recent versions achieve this in a year. It now takes AI systems considerably less time to master certain benchmarks and reach baselines of functionality (see image). There are now even open-source toolkits for some of the complex pieces, which will increase this speed further, and we can expect more to come.
Development momentum like this is nothing new: the iPhone, for example, took many years and iterations to create, and many pieces had to come together, but the day it launched was the day others could create a copy, and systems like Android were born soon after. Now almost everyone carries their primary device for staying connected in their pocket, no matter the operating system. We are accustomed to seeing design innovations and technological developments quickly build upon previous knowledge.
Policymakers must operate assuming many AIs will come from diverse providers, all with capabilities to impact society.
Google’s Bard and Chinese search giant Baidu’s own ChatGPT rival, both announced on February 6th 2023, demonstrate that the era of functionally capable AI is not in the future; it is here. These large language models are already sufficiently large, equipped to be repurposed, and capable of linking the different concepts found in the relationships in language.
ChatGPT is just one of many
The combination of know-how and cost means there are now many more, and bigger, LLMs, many of them not released to the public. The graphic below shows only what is announced and publicly available at this time. We believe many more are being developed by governments and other private organizations and are thus not publicly visible. With the cost barrier still in the many millions, however, there will be a finite set of contenders who could potentially dominate the language model space.
With many different language models and AI tools emerging, any rules put in place to control them in education, for example, will need constant updating to address future developments. Furthermore, it will be imperative to explore the point at which regulations can be applied, and what effects this can have, given the rapidly evolving state of AI development. Traditional policy levers such as controls-based regulation must now contend with AI performing at a complex, human level. It is therefore important to address the human inputs, the related topic of bias, and what we accept as being correct or true.
While we believe policymakers must currently focus on LLMs, it is important to remember that ChatGPT uses an LLM, which is just one part of the range of AIs being developed. The startling increase in the competency of LLMs seen in 2022 has had the most immediate impact; however, there are equally astonishing improvements in other AIs which will also require our focus moving forward.
In hindsight, ChatGPT may well be the LLM poster child for AI’s entry into mainstream dialogue, but it will most likely pale in comparison to the next generation of tools. This is the first class of human-developed technologies with the capacity to learn faster than any person. The onus is therefore on us to stay engaged, so that we are sufficiently informed and can craft the necessary controls for LLMs to operate transparently, ethically, safely and optimally in our educational context.
In part two of this discussion, we will delve into ideas of why context is so important to LLMs, how this is important to safety and policy issues, and what safeguarding against abuse of this must entail.