<quote>
The field of AI and Large Language Models is fertile ground for those who wish to write about technology. The technology is inherently disconcerting, especially for those who have spent decades learning skills that now seem to be superseded. The unease comes from the unknowns that seem to swirl around this new technology. As somebody facetiously pointed out on news.ycombinator.com, you don't actually need a PhD in mathematics to understand this stuff. This post is about a few certainties that I believe we can have about large language models.
Firstly, the quality of the answers from a model is directly related to the size and quality of the training data, and the quality of the training or sample data is directly related to the knowledge and skill of the human beings who created that data. This knowledge was not come by easily; it involved thousands of years of thought, study, education, transmission and intellectual “sweat and tears”: scientists who spent years carefully collecting and analysing information, authors who typed out masterpieces on clunky old typewriters, academics who wrote papers for USENIX conferences about new languages, monks who copied manuscripts with care and patience, scholars who meditated on deep problems, self-taught mathematicians who spent decades achieving a calculation of Pi to 12 decimal places, and so on.
The LLMs may effortlessly regurgitate this knowledge, format it in clever tables, or reconfigure it into seemingly original computer code, but everything comes back to the laboriously obtained human knowledge which was used to train the models.
Another thing that we know about the LLMs is that they have almost no “meta-cognisance”. They don’t know how they know, nor do they know how much they know or why they know it. They can't (yet) really analyse how reliable their own answers are. Humans have all sorts of “risk assessments” relating to their own knowledge or to other people's, which are heuristic epistemologies encapsulated in phrases like “he's talking through his hat!”
My personal hunch is that the ideas above support the assertion that the models will not get vastly better in the near future, because the big AI companies have probably already hoovered up the great majority of available (human-generated) data, and without more of that, the models can't really become much more expert than they already are. It may be possible, in some circumstances, for the models to train themselves, which is to say, to use results which they have generated in order to improve themselves, but that seems to be possible only in fields where they can verify their own results.
One of those fields is probably computer programming.
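To make "verify their own results" a little more concrete, here is a toy sketch in Python of what such a filter might look like for code. It is only an illustration under my own assumptions, not a description of how any AI company actually trains its models: the `CANDIDATES` list stands in for solutions a model might generate, and `passes_tests` and `collect_verified` are hypothetical helper names. The point is simply that a program can be run against tests, so wrong answers can be thrown away without a human in the loop.

```python
# Hypothetical stand-ins for model-generated candidate solutions.
# Assumption: a real system would sample these from the model itself.
CANDIDATES = [
    "def add(a, b):\n    return a - b\n",  # runs, but gives wrong answers
    "def add(a, b):\n    return a + b\n",  # correct
    "def add(a, b)\n    return a + b\n",   # does not even parse
]

# Each test case is (arguments, expected result).
TEST_CASES = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def passes_tests(source, func_name, cases):
    """Return True only if the candidate defines func_name and passes every case."""
    namespace = {}
    try:
        # Caution: exec on untrusted code would need a sandbox in practice.
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing functions and runtime errors all count as failures.
        return False


def collect_verified(candidates, func_name, cases):
    """Keep only the candidates that survive verification."""
    return [src for src in candidates if passes_tests(src, func_name, cases)]


verified = collect_verified(CANDIDATES, "add", TEST_CASES)
print(f"{len(verified)} of {len(CANDIDATES)} candidates verified")
```

Only the surviving candidates would be worth feeding back as new training data. Nothing comparable exists for, say, a historical claim or a medical opinion, which is why this kind of loop seems limited to fields with a mechanical check.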