The AI Crunch
Usually my posts here share something I've made or done. This one is a bit different. Here, I want to begin a discussion on a potential alternative take on the near future of generative AI development. I will propose that we might encounter a Great Filter for AI models, an evolutionary obstacle that prevents further progress in the current direction. Rather than a big bang of super artificial intelligence, what if we face a big crunch of ever dumber models? What if they are already as good as they'll ever be?
I'm speaking as a graduate student doing a research project related to large language models, with a mindset that tends toward the cynic (or, being charitable to myself, the realist). Think of this post as a thought experiment, not as fortune-telling. I don't even think this is the most probable outcome, but my goal here is to get you to consider the problem too and, hopefully, to spark discussion about what it would all mean.
Background
First, some very brief background on this topic. In broad terms, AI models can be generative or predictive. Predictive models take an input and give you some prediction about it; what that prediction is depends on the task. For example, it might tell you whether a picture is of a cat or a dog, or whether an email is spam or not. In contrast, generative models learn to make more of the thing they were trained on. In statistical terms, a generative model produces samples from the learned training distribution. These can be unconditional, such as Nvidia's StyleGAN, which produces new faces that look like real people, or conditional, such as Stable Diffusion, which draws pictures conditioned on a textual prompt.
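To make the distinction concrete, here's a toy sketch of my own in Python, using a one-dimensional Gaussian instead of images or text: the predictive model answers a question about its input, while the generative model fits the training distribution and then samples brand-new data points from it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training set": one-dimensional measurements of some real-world quantity.
data = rng.normal(loc=5.0, scale=1.5, size=1000)

# Predictive model: takes an input and returns a prediction about it
# (here, a trivial threshold classifier).
def predict_is_large(x, threshold=5.0):
    return x > threshold

# Generative model: fit the training distribution, then sample new points
# that look like the originals (here, a single Gaussian).
mu, sigma = data.mean(), data.std()
def generate(n):
    return rng.normal(loc=mu, scale=sigma, size=n)

print(predict_is_large(6.2))  # a prediction about a given input
print(generate(3))            # brand-new samples from the learned distribution
```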
Singularity?
The exciting, or scary, part of generative models is that they are written in computer code, and computer code is simply another training distribution, just like images. Thus a generative model trained on programming languages could feasibly produce new lines of code. It could take its own original programming and improve it beyond its creators' human abilities. The improved version would be able to further improve itself, and so on, much faster than humans could ever manage on their own. That's the crux of the AI Singularity theory, the subject of abundant science fiction.
From where I stand today, witnessing the jaw-dropping trend of generative AI development over just the last few years, this outcome almost seems inevitable. Maybe we're already past the event horizon, and some AI is already capable enough that we as a society are trapped, helpless as it consumes more and more data and escapes our grasp completely. OpenAI, a billion-dollar company well known for creating ChatGPT, is already positioning itself as humanity's steward into this new age, so clearly the idea is no longer merely science fiction.
Only 10 years ago, the state of the art in text generation was Markov modeling, where a model predicts the next word based on the last few words. If you gave the model enough physics textbooks, you got amusing things like this:
It was Maxwell who first thought of using the railway lines to transmit signals by the name of molecules, were not, in any given solution, all the effects there is a combination of the relation of the phenomena were studied without much trouble being taken to obtain numerical relations between the power and dimension of the greatest effects.
source
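For a sense of how simple that technique is, here's a rough sketch of an order-2 word-level Markov generator; the corpus filename is a placeholder, not a real dataset.

```python
import random
from collections import defaultdict

def train_markov(text, order=2):
    """Map each run of `order` words to the words observed to follow it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        table[tuple(words[i:i + order])].append(words[i + order])
    return table

def generate(table, order=2, length=40):
    """Start from a random context and repeatedly sample a plausible next word."""
    context = random.choice(list(table.keys()))
    output = list(context)
    for _ in range(length):
        followers = table.get(tuple(output[-order:]))
        if not followers:
            break
        output.append(random.choice(followers))
    return " ".join(output)

# "physics_textbooks.txt" is a placeholder corpus, not a real file.
corpus = open("physics_textbooks.txt").read()
print(generate(train_markov(corpus)))
```

Each next word is drawn only from words that actually followed that exact two-word context in the training text, which is why the output sounds locally plausible but globally incoherent.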
But now I don't find computer-powered text generation as amusing anymore. For example, what if I told you I didn't write the entire last paragraph, and that it was all ChatGPT? I didn't, but there's no way to prove it either way. Only two years ago, the very notion would have been ludicrous. Sure, by then AI could already generate convincing pictures of faces, but producing natural language meant it needed to understand linguistic nuance, maintain a consistent voice, and track long-range semantic connections that were completely out of scope for any existing approach. Yet here we are. Given this kind of incredible momentum, what could possibly knock us off course to super-intelligence?
While it's no longer a leading theory of cosmology, for some time people hypothesized that just as the universe began with the Big Bang, it could end with a Big Crunch, in which everything collapses back to a single point. Thus, in opposition to the AI Singularity, I'm calling the following theory the AI Crunch.
Generative Chaos
My fun idea is that, at some point, current methods of generative AI training will run into a dead end imposed by the quality of the training set. Consider that today's best text models still have difficulty distinguishing fact from fiction, sometimes leading to hilarious if unsettlingly vicious gaslighting expeditions. In the literature, the term to search for to learn more about this phenomenon is "hallucination". For example, let's ask ChatGPT about sock stores in Copenhagen.
Recall that the entire point of a generative model is to create samples from the training input distribution. Now consider that the training input for these models is, in effect, the whole internet, including whatever possibly hallucinatory model outputs have already surfaced online. Thus, over time, new datasets will be increasingly polluted with non-factual but factual-sounding text. The training objective of these models is not to be factual (how would they know?), but to say what someone who had read their entire training set would say.
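To see how that pollution could compound, here's a toy back-of-the-envelope simulation. Every number in it is an assumption I made up for illustration: the human error rate, the extra hallucination a model adds on top of whatever it was trained on, and the growing share of newly published text that is model-generated.

```python
# Toy feedback-loop simulation. All numbers are made-up assumptions.
human_error = 0.02          # assumed share of human-written text that is non-factual
hallucination_rate = 0.10   # assumed extra non-factual content a model adds on top
                            # of the error rate already present in its training pool

corpus_size = 1000.0                      # the pre-AI training pool
corpus_errors = corpus_size * human_error
synthetic_share = 0.10                    # share of newly published text that is model output

for generation in range(1, 11):
    # A model trained on the current pool roughly reproduces the pool's error
    # rate, then hallucinates a bit more on top.
    model_error = corpus_errors / corpus_size + hallucination_rate

    new_text = 500.0  # text published this generation
    new_errors = (new_text * (1 - synthetic_share) * human_error
                  + new_text * synthetic_share * model_error)

    corpus_size += new_text
    corpus_errors += new_errors
    synthetic_share = min(0.9, synthetic_share + 0.08)  # synthetic text keeps getting cheaper

    print(f"gen {generation:2d}: non-factual share of the pool = {corpus_errors / corpus_size:.1%}")
```

Under these made-up numbers, the non-factual share of the pool creeps upward every generation, because each new model inherits the previous generation's pollution and then adds its own.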
Of course, humans produce non-factual text too, but our underlying assumption as a research community when creating these models is that the sheer volume of factual text will drown out most of the inaccuracies that people publish. I would argue that, from now on, we can no longer make that assumption; just extrapolate the forcing function of a capitalist economy. In our consumption-based society, the limiting factor on revenue generation is the cost of producing new things, and in any digital content industry (writing, photography, coding, etc.), that bottleneck is the human creating the content.
For example, suppose we are in the shoes of someone who wants to earn money from online ads on a blog that answers common questions. It does not matter whether we are correct; it only matters whether we capture people's attention. So we would publish the above text about socks in Copenhagen. The tone is helpful and charming, so why wouldn't people like it and read it? Afterwards they might not appreciate that nothing there is real, but they would not realize it until they had already spent their attention on our blog's advertising space.
What if this someone decided to extend the approach? They could do the following with relatively little investment:
- Scrape all the question pages on Quora
- Feed them all to ChatGPT and publish the answers online
- Sprinkle some light search engine optimization on top
- Watch the clicks flow in
Naturally, the reader of this page would be too discerning to be susceptible to this ploy, but I have little doubt that it'll happen at some point. (is that self-fulfilling? maybe a reader of mine will make this happen)
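For what it's worth, the whole scheme is barely more than a loop. Here's a deliberately abstract sketch; every function in it is a hypothetical placeholder rather than a real scraper or API.

```python
# Hypothetical sketch only: each function below is a placeholder, not a real API.
def scrape_question_pages():
    """Placeholder: collect question titles from a Q&A site somehow."""
    return ["Where can I buy nice socks in Copenhagen?"]

def ask_chatbot(question):
    """Placeholder: send the question to whatever LLM API you have access to."""
    return "A confident, charming, and possibly entirely made-up answer."

def publish(question, answer):
    """Placeholder: render an SEO-friendly page with ad slots around the text."""
    print(f"<h1>{question}</h1>\n<p>{answer}</p>")

for q in scrape_question_pages():
    publish(q, ask_chatbot(q))
```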
So what?
The goal of this post is simply to share a view on the discussion about the future of AI. I see a few broad possibilities from where we stand.
- Singularity: Progress continues unabated along the current direction, models become more and more capable, soon surpassing human ability in terms of quality
- Crunch: Pollution of future training sets by past model outputs cannot be fixed
- Emergence: AI becomes AGI anyway, through some mechanism of emergent behavior in ever-larger models, and becomes capable of fixing itself where humans cannot
- Freeze: AI progress is locked at 2023 or some near-future point; models cannot learn current events from updated datasets for fear of uncontrollable non-factual bias in their outputs
Conclusion
I'll end this article with some excerpts from a conversation I had with ChatGPT about the idea. It wouldn't be fair to pooh-pooh AI without giving it a chance to respond, after all.
Me: I am writing a blog post on the AI crunch, an alternative take on the future of AI. Unlike singularity theory, I instead imagine that current techniques will lead to an AI dead end because all future training inputs will include the generated outputs of past AI. Thus AI will be trained on flawed inputs, making future outputs worse and worse. What do you think of this idea?
The idea of the AI crunch, or AI dead end, is an interesting alternative perspective on the future of AI, and it is worth exploring. The idea is that as AI systems become more advanced, they will generate their own data inputs and training data, leading to a feedback loop that could potentially lead to a decrease in the quality of outputs.
While the AI crunch is an interesting idea, it is still largely hypothetical, and there is no strong evidence to support the idea that AI will inevitably hit a dead end. In fact, there are many researchers and experts working to develop new techniques and approaches to address potential issues like biased data and overfitting.
Overall, the AI crunch is an interesting and thought-provoking idea, but it remains to be seen whether it will ultimately be a limiting factor for the development of AI systems.
Me: The problem now is that generated output such as the text you are writing is virtually indistinguishable from what a human would say. Then how can you ensure the input is free of AI generation?
You raise a valid point that as AI-generated output becomes more sophisticated, it can be difficult to distinguish between inputs that are generated by humans versus inputs that are generated by AI. This can create a potential challenge for ensuring that inputs are free from AI generation, especially as AI becomes more prevalent in our daily lives.
One potential approach to addressing this challenge is to focus on developing techniques and approaches for detecting and mitigating the effects of AI-generated inputs. This could involve developing algorithms and tools for detecting and filtering out AI-generated content, or developing approaches for quantifying the potential impact of AI-generated content on downstream AI systems.
Another approach is to focus on developing best practices and standards for training and testing AI systems, which can help to ensure that AI is being trained on high-quality, diverse inputs. This can involve developing guidelines and standards for collecting and curating training data, as well as developing approaches for testing the robustness and generalization of AI systems.
Ultimately, the development and deployment of AI is a complex and multifaceted process, and it will require ongoing engagement and collaboration from a wide range of stakeholders in order to ensure that it is being used in ways that are beneficial for society as a whole.
Funny how I can't tell who wrote this: an advanced bot trained on the whole internet or an extremely diplomatic college student trying very hard to fill a word count for their essay.