> History students using AI should be much more productive than they were in my day! I wonder whether essays are far longer now than they used to be.
We should not aim for endless productivity. In this world of surplus information, "click-bait" titles, SEO content, etc., we should aim to produce less. This includes learning: learning should be done as a meditative process to understand the human condition, not simply to output the most comprehensive essay.
While the end result might be interesting, the most worrisome part about this is the mentality: the general attitude of becoming too machine-like moves us away from quiet solitude that is so integral to humanity.
Amen. Since ChatGPT took off, I've continued to believe its biggest contribution to society will be that it greatly improves the ability of individuals to generate massive amounts of noise to add to the sea of unnavigable garbage the internet has already become. The power to produce more in less time is a net negative that ultimately lends itself to entropy. It is not a positive thing at all unless you completely ignore social and holistic effects in favor of a Machiavellian, hyper-individualist perspective which claims that any means and any consequence is acceptable to satisfy the isolated, local needs of single individuals at any given point in time.
There will be a layering effect where the "winds" howl and suffocate on one "low consciousness" level, while at another, effective teams use it for high-value targets; etc. Someone once told me that "the Internet is so big that there are certainly groups of tens of thousands of individuals somewhere, organized and doing some activity that you have never heard of." So yes to the dark-futures interpretation, and also yes to constructive purposes, and yes to predatory militaristic purposes, and more.
PS: people who are self-obsessed and off balance will dive in and try to communicate with "everyone", enabling new personal hell realms.
Google is in such a hard place. I unwillingly came across an SEO conference while I was vacationing, and every single SEO practitioner is using AI tools to fill the web with articles and low-quality rehashes, exploiting Google's inability to punish them while it declines to punish Forbes and the like (which also publish low-quality articles at times). I honestly don't know how Google is going to solve this one, or what things will look like in another decade.
Do things that don't scale. Manually curate the good content somehow and find a way to profit from it. If curating legit content can somehow be more profitable than spamming low-quality content, then everyone will start curating and sharing the good content.
It is a valid thought (which I suspect was raised with a wink) that is worth following. If students are more productive (in terms of output volume), how are teachers going to cope with that? Do we need to delegate grading of AI output to AIs? Will this contribute to the heating of the planet, or does it lead to a self-improving and aware AI? And another concern: Writing is sorting thoughts and putting them in order. Writing is thinking. If students delegate writing, will they learn to think?
There is no such thing as endless productivity. We should always aim for higher productivity, because that means getting more for less. What we should not do is reward solely based on productivity.
In pure math, the challenge was always to express your ideas as concisely as possible. The problem was that sometimes it takes five or ten pages to explain things properly!
Writing is just very ineffective in general. We could convey important ideas using far fewer words than what is considered standard in academia.
Producing too much content to be absorbed and processed is detrimental to society. In the past, it was reasonably easy to filter out duplicate and redundant information. With AI, this is less and less so. Information overload is a real thing. People will have less and less in common, as their opinions and world views are shaped differently by external stimuli.
What makes me sad, but also 100% unsurprised, is that this is exactly the opposite of the killer use case for LLMs. They should be used to summarize bloated, ad-infested clickbait articles down to one paragraph or a series of bullet points. I don't want to read about the author's childhood, the best place to get the stuff the article is about, or some random anecdotes about that stuff and the author's unresolved childhood trauma that somehow has something to do with that stuff and something that happened with the author's relative(s). I just want to know what I searched for in the first place: how to do what I want to do with that stuff and nothing more.
Seems like an open-and-shut killer use case for sub-AGI LLMs to me, but it's also a nice thing, so that's why we can't have it.
> I didn’t go to lectures. They were way too early in the afternoon for me to get out of bed for.
...
> This appreciation of which texts were important, and what I was being invited to write about happened much faster than would have been the case pre-Internet. I really do wonder how students do this today (get in touch if you can tell me!).
Maybe listening to people who are well versed in the subject and the current state of the academic conversation? You can find such people lecturing, sometimes ;)
My flippancy aside: lectures weren't directly related to the tutorial system at Oxford, so it was unclear what the benefit of going to centralised lectures was over just studying the source material. It was pretty universal to do just that and ignore the lectures.
Can anyone comment on the issue of hallucinations? The author only mentions them briefly and I cannot gather how big of a problem this is. Apart from the literal quote the LLM hallucinated, wouldn’t all the other information have to be double-checked as well?
IMO, hallucinations make it basically unusable for things it should be very good at. For example, I have asked two different AIs what the option is for changing the block size with HashBackup (I'm the author). This is clearly documented in many places on the HashBackup site.
The first time, the AI said to use the -b option to the backup program, with examples etc. But there is no -b option, and never has been.
The second time, the AI said to set the blocksize ("or something similar"; WTF good is it to say that?) parameter in hashbackup.conf. But there never has been a hashbackup.conf file.
From examples I've seen, AI tends to do a passable job of spewing a long-winded response where asking several different humans would give similar long-winded responses containing a lot of judgement and opinions, some valid and some not.
I'll echo this and say that I've run into very similar issues when evaluating local LLMs as the author of a popular-ish .NET package for Shopify's API. They almost always spit out things that look correct but don't actually work, either because they're using incorrect parameters or they've just made up classes/API calls out of whole cloth.
But if I set aside my own hubris and assume that my documentation just sucks (it does) or that the LLM just gets confused with the documentation for Shopify's official JS package, my favorite method for testing LLMs is to ask them something about F#. They fall flat on their faces with this language and will fabricate the most grandiose code you've ever seen if you give them half a chance.
Even ChatGPT using GPT-4 gets things wrong here, such as when I asked it about covariant types in F# a couple of days ago. It made up an entire spiel about covariance, complete with example code and a "+" operator that could supposedly enable covariance. It was a flat-out hallucination as far as I can tell.
Yes, this. If the form of a plausible answer is known, it is likely to be invented: API method names, fields in structures, options to command lines; plausible inventions that have a known form.
Similarly, references of any kind that have a known form: case law, literature, science, even URLs.
I’ve had very similar problems asking technical things. I wish it would do what humans do and say “not sure, have you tried an option that might be called Foo?”. A good human tech rep doesn’t always have the precise answer, and knows it. Unfortunately, LLMs have mostly been trained on text which isn’t as likely to have these kinds of clues that the info might not be as accurate as you’d like.
I’ve found that for technical things, I’m happier with the results if I’m using it for clues to the right answer, and not looking for an exact string to copy and paste.
There are many recently published or preprint research papers around that, not necessarily that hard to read I think.
As a consultant this totally prevents me from making any professional use of LLMs at the moment (edit: aside from actual creative work, but then you may hit the copyright issues).
But even without hallucination, using non-scholarly sources for training is also a problem. Wikipedia is great for common knowledge but becomes harmful past a certain point, where you need nuanced and precise expertise.
The other problem with Wikipedia is its being a target of hostile, politically motivated attacks that attempt to rewrite history. It will normally self-correct, but from time to time there are pieces of information that are maliciously incorrect.
RAGs are great for glossary-like structured information. Arbitrarily chunking prose text feels like making garbage out of otherwise good-quality text. Ideally, prose would be used to generate glossary-like documents (LLM-aided). Performance plays a role here (you'd need to process a lot of text up front), so maybe smaller models could be used?
This is something that has always bothered me about RAG. It seems that it’s fine for first-order relevance, like a search engine, but for knowledge there needs to be some kind of rumination stage where it revisits the entire corpus to find information that has second-order relevance to round out its ‘understanding’.
You might be able to approximate by chunking and globbing the chunks and searching for those, as well as having the LLM summarize and extract data and search for those items as well.
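For what it's worth, here is a minimal sketch of that idea: chunk the corpus, have an LLM distill each chunk into a glossary-style summary, and index both the raw chunks and the summaries so a query can match either the original wording or the condensed concept. Everything here is a made-up illustration, not anyone's actual pipeline - the bag-of-words cosine stands in for a real embedding model and vector index, and summarize() stands in for an actual LLM call.

    # Toy sketch: index both raw chunks and their summaries, retrieve by similarity.
    # Runs with the standard library only.
    from collections import Counter
    from math import sqrt

    def tokenize(text):
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def chunk(text, size=40):
        words = text.split()
        return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

    def summarize(chunk_text):
        # Placeholder: a real pipeline would ask an LLM for a glossary-style
        # "term: definition" summary instead of just truncating the chunk.
        return " ".join(chunk_text.split()[:10])

    def build_index(corpus):
        entries = []
        for c in chunk(corpus):
            entries.append((c, tokenize(c)))             # raw wording
            entries.append((c, tokenize(summarize(c))))  # condensed concept
        return entries

    def retrieve(entries, query, k=3):
        q = tokenize(query)
        ranked = sorted(entries, key=lambda e: cosine(q, e[1]), reverse=True)
        results = []
        for text, _ in ranked:
            if text not in results:
                results.append(text)
            if len(results) == k:
                break
        return results

The point of indexing the summaries alongside the chunks is that a vague, conceptual query can still land on the right passage even when it shares few words with the original prose, which is roughly the "glossary-like documents" idea above.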
Sometimes the terminology is around but not well distributed. I'm not sure this is the paper that coined it, but here's an example from 2020: https://arxiv.org/abs/2005.11401
These days it’s enough to completely mindbreak a model via DPO. Seeing how modern open source LLMs are willing to do anything was quite a surprise given how locked down ChatGPT is.
I am not sure how democracy is relevant to this post, but yes. See the UK - democratically making one poor choice after another due to a steady but obvious cultural decline. Garbage in, garbage out.
I’m sorry but this is stupid. The point of writing an essay is not to produce a text, but to learn and expand one’s understanding. There are no shortcuts. Thank god this person didn’t become a journalist.
Can you please make your substantive points within the site guidelines? You broke several of them here:
"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."
"Edit out swipes."
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
You understand that this article wasn't meant as a serious attempt to broaden understanding, but instead as a "game" to see what LLMs can do? The author explicitly mentions that in the article and highlights that they would have read the source material if it had not been for the self-imposed constraints of the game...
Yet that was NOT your initial comment. The author was testing a theory to see what the results would be. There isn’t anything “stupid” about that. Exposing the issues of LLM usage & the steps to do so is valuable.
In an ideal world where the incentive of a student is solely to broaden their understanding perhaps that's true (although I'd still be inclined to use an LLM for getting a broad brush overview of a new topic, or ideas of where to start investigating).
However, the education system, as formulated in many countries, isn't solely focused on broadening understanding and instead on students getting a degree with a given grade to aid them in getting a job in their preferred career.
With that goal in mind, their incentives may not be solely focused on learning and more on passing essay questions by creating texts. In that case, it's possible to see how their incentives might line up with the use of LLMs to partially or wholly create their essays.
This is, of course, nothing new. There's a reason why universities have plagiarism detection software and systems. Students often do not write essays purely themselves, and LLMs will be just one technique to allow them to do so more easily...
In the 1980s at super-competitive UC Berkeley, where non-native English speakers faced high-stakes outcomes, many organized groups passed pre-written material, essay answers, multiple-choice answers, etc. among each other in multi-year organized cheating. Then, with a prestigious and difficult degree in hand, they proceeded to financial success. At the same time, UC Berkeley is a great research university with specialists in hundreds of fields doing post-doc expansion of human knowledge - on a competitive basis.
Both of these are true at the same time, integrated via finance. Fanciful ideas of pure education with its motives and methods are almost like fairy tales in this environment.
Just because you didn’t comprehend the content doesn’t mean others didn’t. I also fundamentally disagree with your point. Producing good LLM output is an interactive intellectual process requiring expository writing skills.
I would say that the primary benefit of having journalists is not to produce articles for papers or magazines, but to have people who understand what’s happening and why. Communicating that understanding is also important, but if you don’t have the understanding, you can’t communicate it. I don’t see LLMs as playing any significant role in building understanding. Humans need to process (read and write notes about) information to understand it. Also, the problem of LLM hallucination makes them unsuitable for any serious use.
Most journalists hallucinate at least as much information as LLMs. There is a small number of publications that try to do reasonably well, but most have engaged in such a race to the bottom that I'd expect a current-generation LLM to produce more accurate articles if given the same sources.
In an ideal world, I'm sure what you say is true, but in the real world of publications and incentives for journalists, I doubt that's what happens.
Anyone reading news media recently will have noticed the vast amount of badly written/researched articles, often with click-bait headlines. Recently I've even noticed that sources considered "high quality" like the Times in the UK are heavily using click-bait techniques.
I think there's a small chance that LLMs will actually increase understanding in media (for a while at least). The press release copypaste and verbatim lobbyist statement printing is at least in part just due to hurry and the journalists not understanding the subject matter very well.
With LLMs you could get even some background or context to the stories with perhaps even a hint of critical "thinking".
Although more probably the press releases will be more and more just passed through LLMs to match the paper's "style".
Most publishing is not in the business of increasing understanding. It is difficult to get a business to understand, when its revenue depends on it not understanding.
I'd say that "people who understand what’s happening and why" would not be unique to journalists, but also be true for all intelligence analysts feeding into the government (and that this is why dictatorships don't all instantly fall over due to a lack of a free press).
I think the USP of a free press is to shine a light on the issues missed by the formal government — be that large-scale corruption, or hyper-local potholes, or systemic racism or sexism in some (public or private) institution.
"Quis custodiet ipsos custodes?" Ideally, if not in reality, it's the press.
> I don’t see LLMs as playing any significant role in building understanding.
Wrong. LLMs play a significant role in building understanding in this day and age. Combining Google (Internet) search with an LLM to do one's research and knowledge accumulation is bread and butter.
> The point of writing an essay is not to produce a text, but to learn and expand one’s understanding.
Indeed. What I thought most interesting about this exercise were the ways in which this author's process pushed him towards learning and understanding - even to the extent that he found himself tempted to read the original sources for himself!
I'm a former humanities lecturer, in both English and History, so it's a thought exercise for me to consider how LLMs would change my pedagogy. It'd be naive to assume that students won't want to use this technology, and foolish to think that they can be stopped. The only sensible approach, it seems to me, is to guide students towards productive use - ie, methods which will expand their understanding.
This author's most productive use, it seemed to me, was his literature review, which guided him towards useful sources, and an understanding of the contours of debate around the topic. Those, in traditional pedagogy, are the primary purposes of the instructor. It's notable that to do this he had to provide his own corpus of texts (of dubious provenance!), and self-train the model. Both of those functions require skills which are beyond most students and nearly all humanities professors.
The most obvious use was the structuring and writing of the essay. He pointed out one hallucination he caught, but there were probably more that he did not. In the humanities we have traditionally used a well-structured and elegantly-written essay as the gauge of learning and understanding. Will that have to change? Based on this example, and others I've seen, I'd want to review students' working process, and help them refine their use of the LLM. If the prose it produces can be expected to be of reasonable quality - and it's already far better than most undergraduates (unfortunately) are able to produce - then the truer test of their understanding is the guidance they have given it.
Or maybe essays are now the wrong "proof of work"? If so, then what replaces them?
I have no firm conclusions, besides a depressing expectation that educators in the humanities will remain entirely naive to these technologies, to their and their subjects' detriments. I appreciate the author's project, and welcome further discussion.
You’re getting towards something that actually caused me to drop out of college 2 decades ago. The essay then (and possibly still now) was the primary means by which any of my non-math classes seemed oriented. I’m not an essay writer. I can sometimes give 500-1000 words on a topic because I tend to be terse, and found I was always being given poor grades for “not following the assignment” which would usually be 2000-2500 words. I just couldn’t do it.
In the end I dropped out because the point of college wasn’t “learning” (which I was) but rather “how good are you at playing college” (which I wasn’t).
Yeah, the way I always put it to my students was that the word-count was a rough guide to how deeply I expected them to dive into a particular topic. Any substantive question could generate anything between a paragraph and a life's work, but in this essay, you only need to do this much.
I creeped your profile real quick (I'd up-voted you a time or two, lol), and I'd have loved to have had you as a student! Your writing style is admirably concise, which mainly means you have your thoughts in order. (Avoiding "terse", which is sometimes accurate, is a matter of learning a few rhetorical tricks to better engage a reader. That's, like, the easiest writing "problem" to solve.) Clear thinking is 90% of what we (should) want undergraduates to demonstrate, which would put you orders of magnitude ahead of the typical student who hasn't any ideas of their own and desperately tries to eke out 2k words of pure waffle. From me, 1k words of closely-written reasoning would have been a 'B' from the jump, and provoked a conversation about where else you could take your argument, should you have the time and interest to pursue the subject further.
I'm sorry you had such bad experiences. They're sadly not rare.
> In the humanities we have traditionally used a well-structured and elegantly-written essay as the gauge of learning and understanding. Will that have to change?
No! If the student produces a poorly written essay they should receive a bad grade.
This "engaging with the LLM" is nothing more than Googling the answers. It's incredibly detrimental to the student's understanding. One way to prevent this kind of cheating would be to require essays to be hand-written, or typed on a mechanical typewriter.
> One way to prevent this kind of cheating would be to require essays to be hand-written, or typed on a mechanical typewriter.
That's useless, I'm afraid. Students will generate an LLM text, and then copy it out. If you lock down their machines, or university networks, then they'll access the LLMs another way. It's cat and mouse games all the way down, and we'll never win.
I mean, I get where you're coming from. I'm a humanities guy, through and through. I love writing essays - or, well, really dig having written a good one; the writing process is invariably a slog - and I'm good at it. The writing process catalyzes my learning and crystallizes my thoughts, etc etc. I believe all that stuff, preached it without irony, and spent countless hours in tutorials coaching students.
The trouble is, it doesn't really work. Writing a lot of essays makes only marginal improvements to students' writing, no matter how many tutorials and "writing labs" they go to. Reading complex texts, and learning and imitating good writing, teaches people how to write. And, you know, that only works when they want to a) read complex texts, and b) learn how to write well.
For the students who intrinsically want neither a) nor b) - which is the vast, vast, vast majority - we could force them a bit, with grades. Now, however, there are LLMs, which break both halves of that method. Our whole approach to a system of study will have to change. I would rather find something new and useful than cling to a useless paradigm because I was once comfortable and successful within it.
> Students will generate an LLM text, and then copy it out. If you lock down their machines, or university networks, then they'll access the LLMs another way. It's cat and mouse games all the way down, and we'll never win.
I think this is more a product of low standards than anything else. Let's bring the mean down to 2.0GPA. Not everyone should pass. Not everyone should graduate. Failure delivers valuable lessons, too. But I'm sorry, throwing up our hands and saying "oh well, I guess cheating is the norm now" is fatalist BS.
> Reading complex texts, and learning and imitating good writing, teaches people how to write. And, you know, that only works when they want to a) read complex texts, and b) learn how to write well.
I agree with this. Inspiration is super important, and inculcating a love of learning and the life of the mind is the whole ballgame. If the vast majority of students aren't in this boat, and instead they're just trying to check a box to graduate, the whole thing is super broken. But that doesn't mean it can't be fixed, and it doesn't mean we should lower standards in the face of a threat like LLMs. Instead we should raise them.
I don't disagree with anything you wrote. Especially
> inculcating a love of learning and the life of the mind is the whole ballgame. If the vast majority of students aren't in this boat, and instead they're just trying to check a box to graduate, the whole thing is super broken.
Which... Yup. The whole thing is super broken. It has been for a generation. (It's maybe always been at least a little broken? Complaints about students not caring about the life of the mind and only craving the credential were common in the middle ages, too!)
Here's a counterpoint for discussion, which I'm not sure I fully believe in: LLMs can support life-of-the-mind learning. (Even if you think they aren't there yet, their trajectory is clear.) Even apart from that, they will be used everywhere, outside of the classroom. Don't educators have a responsibility to train students in their responsible use?
Thanks. I'm not teaching anymore (it don't pay nowhere near enough to support a family, alas), so I'm looking on from the sidelines. I'm also not spending any time getting my hands dirty with LLMs (aforesaid family, dontcha know?), so I'm on the sidelines there, too.
I do think we're heading for a pedagogical crisis, and I don't see much beyond hand-wringing coming from people in education. This is mostly because their technology skills are (by and large) very nearly nil - "cliometrics" in History, and "digital humanities" in English, are regarded as niche - so nearly everyone with influence within the profession has been blind-sided. I have one former colleague who retired last year, a few years ahead of her plan, rather than deal with LLMs.
Yours is, frankly, the first "practical" investigation of what's possible that I've seen, anywhere - I've passed it along to several people I know. Thank you for doing it, and please post anything else you may do in this space. There might even be a business opportunity in it? Education consultants can make good money, which is usually regrettable, but this specific topic is essential to address right now.
This was the same take from the competitive debate community when I showed them my body of work around automating the process of summarizing documents. The vitriol and hate I got from the “educators” about me depriving the students of their oh-so-necessary education time by summarizing hundreds of documents was extremely strong. Needless to say, I hold those who believe in work for “education's” sake in contempt, and I’m gleeful to see widespread proliferation of the tools for young folks to subvert the intentions of authoritarian “teachers”.
The "point" of writing an essay appears to me to have been one of those proof-of-work challenges that are relatively easy to assign and grade, but very time consuming to write; I can't recall any cases when the student actually had anything original to say, which the lecturer was really interested in discussing.
So seeing that essays aren't a good proof of work anymore, lecturers would need to come up with a better one. Or, heaven forbid, allow students to actually do something productive with their time.
Well you can still learn and expand your understanding by writing bullet point lists, and then use an LLM in the final stage where you turn these points into a text.
If what you have to say fits in 10 bullet points, why would you make me read a full essay? Do you think that the LLM will add a lot of information that is not in those bullet points, or do you think that I am not clever enough to understand your bullet points and therefore I need an LLM to paraphrase them?
Or do you think I enjoy wasting my time reading stuff that you could not even be arsed to write yourself?