Why AI?
Spoiler alert: that's not a rhetorical question, and I don't have an answer.
I work in the research org at my job, which means we need to understand, if not use, cutting-edge technology. That's great, because I'm a technologist who loves shiny new things. Unfortunately, many technologies only appear to be cutting edge (see: blockchain). Right now, the hot new thing is "AI" (typically meaning large language models, or LLMs).
I'm reading a book, AI Engineering, and surveying some papers on the subject to make sure I have a decent foundation. One of the papers (Villalobos et al., 2022) suggests that we might run out of training data (though I recall that, as of late 2024, this was less of a concern) and that we must do something about it.
What I don't understand is: why?
It's taken as a given in many parts of my industry that LLMs must keep developing. The "large" must get ever larger. It reminds me of The Hitchhiker's Guide to the Galaxy when, upon being questioned why a specific bypass must be built that will require demolishing Arthur Dent's home, a functionary responds, "You've got to build bypasses." (Original 1980s BBC rendition.)
The urge to embiggen must be, I think, something deeply human. If we see something we like, we want it to be bigger. The bigger it is, the bigger it must get. Goodness knows it's how we've gotten to where we are with cars. Someone with more time on their hands than I have should probably write about that.
LLMs are not inherently bad—though it's worth noting they threw the book at Aaron Swartz for doing far less than OpenAI has, and for much better reasons (more on that in the future). Clearly, the concept can yield a lot of fruit. However, the outputs and the use cases are often questionable. Sometimes a chatbot can do well, but other times it can do very poorly indeed (how many r's in strawberry?) and have absolutely no idea. Even though they evolved from some very complicated mathematics, LLMs are usually bad at math—because they're language models.
I'm not providing citations for all of the above, because I don't really feel like arguing about whether LLMs are unpredictable in quality. That should be accepted as true at this point. Which brings us to the question posed in the title, and reiterated once or twice: why are we doing all this?
Some technology needs proving-out. You have to try it again and again before you can see the value. I had a discussion with my boss about that this week on a totally different subject, where he reminded me that we don't have to know if "there's a there there" before we try to find out. That's the nature of research sometimes.
The problem is that LLMs/AI/chatbots have advanced enough to be shoehorned into a million places they don't have any business being, and they're generally a) bad at what they're doing, b) unpopular, and c) unnecessary. Throwing more and more trees into the wood chipper will produce more wood chips, but it won't make them edible.
I'm sure there are use cases where it makes sense to use LLMs. For instance, I want to be able to tell my computer to rename all my photo files to match my existing ones, in those words, and just let it go. An LLM might be able to string together the code to do that, given enough input data. But I don't think we need ever-larger models to get there.
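As a minimal sketch of what that kind of LLM-generated glue code might look like (the directory, the date-prefix naming convention, and the file extension here are my hypothetical assumptions, not output from any real assistant):

```python
# Hypothetical sketch: rename new photos to match an assumed existing
# "YYYY-MM-DD_originalname.jpg" convention, using each file's
# modification time. The directory and pattern are made up for illustration.
import datetime
import pathlib

PHOTO_DIR = pathlib.Path("~/Pictures/incoming").expanduser()  # hypothetical folder

for photo in sorted(PHOTO_DIR.glob("*.jpg")):
    mtime = datetime.datetime.fromtimestamp(photo.stat().st_mtime)
    new_name = f"{mtime:%Y-%m-%d}_{photo.name}"
    target = photo.with_name(new_name)
    if not target.exists():  # don't clobber anything already renamed
        print(f"{photo.name} -> {new_name}")
        photo.rename(target)
```

Nothing in that script needs a frontier-scale model; it's the sort of boring glue work a modest model (or a person with ten spare minutes) can already handle.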
The Harvard Business Review wrote about how gen AI is affecting the labor market. The conclusion: writing and programming are being affected first and most, and much more than manufacturing jobs were affected by automation. I don't think I can argue with their methodology, but I think it makes my point for me: if writing jobs are being affected most, then AI isn't solving a problem worth the environmental and monetary cost.
Sure, it's getting cheaper and cheaper to train and use models, but they're being heavily subsidized so that we get used to them and think we can't do without them, and then the prices will go up. It's how Uber hamstrings the local taxi industry or public transit, becomes the easiest or only game in town, and then jacks up prices while lowering pay.[1] It works out great for most people at first, but in the long run only Uber is profiting. Trains and buses worked great before Uber, but it's almost like we've forgotten.
It's time to wrap up. I don't have a pithy conclusion. I am not a never-LLMer, but my skepticism hasn't been satisfied, and I leave the question up to you: why should we pour more data, energy, and money into training bigger and bigger LLMs? What's out there that we might be able to address in a truly transformative way? Cutting freelancers out of a job doesn't count.
If you've got an opinion, write about it on your own blog and then let me know on Mastodon. I really would like to know what you think.