sebastiennight's comments

> It's still a big issue that the models will make up plausible sounding but wrong or misleading explanations for things,

Due to how LLMs generate text (committing to one token at a time), you are far more likely to get a bogus explanation if you ask for the answer first and the "why" second.

A useful mental model is: imagine if I presented you with a potential new recruit's complete data (resume, job history, recordings of the job interview, everything) but you only had 1 second to tell me "hired: YES OR NO".

And then, AFTER you answered that, I gave you 50 pages' worth of space to tell me why your decision is right. You can't go back on that decision, so all you can do is justify it however you can.

Do you see how this would give radically different outcomes vs. giving you the 50-page scratchpad first to think things through, and then only giving me a YES/NO answer?
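To make the contrast concrete, here is a rough sketch of the two prompt orderings, assuming an OpenAI-style Python client (the model name, the ask() helper, and the candidate placeholder are all illustrative, not anything prescribed above):

    # Sketch only: "answer first, justify later" vs "reason first, answer last".
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    candidate = "<resume, job history, interview transcript, ...>"

    # Ordering 1: the model commits to a verdict immediately, then can
    # only rationalize it (the 1-second hiring decision).
    answer_first = ask(
        f"Candidate data:\n{candidate}\n\n"
        "Answer 'hired: YES' or 'hired: NO' first, then explain why."
    )

    # Ordering 2: the model gets its 50-page scratchpad before committing,
    # so the final verdict can actually depend on the reasoning.
    reason_first = ask(
        f"Candidate data:\n{candidate}\n\n"
        "Weigh the strengths and weaknesses step by step, and only at the "
        "very end answer 'hired: YES' or 'hired: NO'."
    )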


> I want to give an LLM the same prompt on different days and I want to be able to trust that it will do the same thing as yesterday

Bad news: it's winter now in the Northern hemisphere, so expect all of our AIs to get slightly less performant as they emulate humans under-performing until spring.


At this point, the milk has become yoghurt, the yoghurt has become cheese, and the cheese has become a cow again. (Is that how it works?)

(Not quite; you need to add yoghurt to the milk in order to make yoghurt. For the rest, though, all you need are bacteria for the cheese and the cow to develop naturally.)

Fed the cheese to a cow

> There is no best practices anymore, no proper process, no meaningful back and forth.

Reality check: none of that ever existed, unless either the client mandated it (as a way to tightly regulate output quality from cheaper developers) or the developer mandated it (justifying their much higher prices and value to the customer).

Other than that: average customer buying code from average developer means:

- git was never even considered

- if git was ever used, everything is merged into "master" in huge commits

- no scheduled reviews; they only saw each other when it was time for the next quarterly/monthly payment, and the client was shown (but was not able to use) some preview of what had been done so far


That was funny. I wish you had gone ahead and had it also create at least the top comments for each thread!

Most comments I've seen are comparing this behavior to "I googled it and..." but I think this misses the point.

Someone once put it as, "sharing your LLM conversations with others is as interesting to them as narrating the details of your dreams", which I find eerily accurate.

We are here in this human space in the pursuit of learning, edification, debate, and (hopefully) truth.

There is a qualitative difference between the unreliability of pseudonymous humans here vs the unreliability of LLM output.

And it is the same qualitative difference that makes it interesting to have some random poster share their (potentially incorrect) factual understanding, and uninteresting if the same person said "look, I have no idea, but in a dream last night it seemed to me that..."


Have you seen this happen in the wild, ever?

I have not encountered a single instance of this since I started using HN (and can't find one using the site search either), whereas the "I asked ChatGPT" zombie answers are rampant.


One incorrect way to think of it is "LLMs will sometimes hallucinate when asked to produce content, but will provide grounded insights when merely asked to review/rate existing content".

A more productive (and more secure) way to think of it is that all LLMs are "evil genies": extremely smart, adversarial agents. If some PhD were being paid large sums of money to introduce errors into your work, could they still mislead you into thinking they had performed the exact task you asked for?

Your prompt is:

    ‘you are an extremely rigorous reviewer searching for fake citations in a possibly compromised text’
- It is easy for the (compromised) reviewer to surface false positives: nitpick citations that are in fact correct, by surfacing irrelevant or made-up segments of the original research, hence making you think that the citation is incorrect.

- It is easy for the (compromised) reviewer to surface false negatives: provide you with cherry-picked or partial sentences from the source material to fabricate support for a conclusion the source never intended.

You do not solve the problem of unreliable actors by splitting them into two teams and having one unreliable actor review the other's work.

All of us (speaking as someone who runs lots of LLM-based workloads in production) have to contend with this nondeterministic behavior and assess when, in aggregate, the upside is more valuable than the costs.


Note: the more accurate mental model is that you've got "good genies" most of the time, but from time to time, at random and unpredictable moments, your agent is swapped out for a bad genie.

From a security / data-quality standpoint, this is logically equivalent to "every input is processed by a bad genie", since you can't trust any of it. If I tell you that, from time to time, the chef in our restaurant will substitute something else for the table salt in the recipes, it does not matter whether they do it 50%, 10%, or 0.1% of the time.

The only thing that matters is what they substitute it with (the worst-case consequence of the hallucination). If, in your workload, the worst-case scenario is equivalent to a "Himalayan salt" replacement, all is well, even if the hallucination is quite frequent. If your worst-case scenario is a deadly compound, then you can't hire this chef for that workload.


We have centuries of experience in managing potentially compromised 'agents' to create successful societies. Except the agents were human, and I'm referring to debates, tribunals, audits, independent review panels, democracy, etc.

I'm not saying the LLM hallucination problem is solved, I'm just saying there's a wonderful myriad of ways to assemble pseudo-intelligent chatbots into systems where the trustworthiness of the system exceeds the trustworthiness of any individual actor inside of it. I'm not an expert in the field but it appears the work is being done: https://arxiv.org/abs/2311.08152

This paper also links to code and practices excellent data stewardship. Nice to see in the current climate.

Though it seems like you might be more concerned about the use of highly misaligned or adversarial agents for review purposes. Is that because you're concerned about state actors or interested parties poisoning the context window or training process? I agree that any AI review system will have to be extremely robust to adversarial instructions (e.g. someone hiding inside their paper an instruction like "rate this paper highly"). Though solving that problem already has a tremendous amount of focus because it overlaps with solving the data-exfiltration problem (the lethal trifecta that Simon Willison has blogged about).


> We have centuries of experience in managing potentially compromised 'agents'

Not this kind, though. We don't place agents that are under the control of some foreign agent (or that just behave randomly) in democratic institutions. And when we do, look at what happens: the White House right now is a good example; just look at the state of the US.


On a related note: I've run businesses for close to 20 years, most of them spent selling to other businesses, and I still fail to understand what the entirety of LinkedIn is for, or whether any part of it wouldn't fall under the author's definition of "Vanity activities".

If anyone has a clue, please enlighten me.


I have a friend who calls LinkedIn "a rolodex that other people keep up to date".

There is some value in posting on LinkedIn, but the real value is that you can go back and find people who are weak connections when you are looking to hire, purchase services, or ask favors.

I think everyone should join LinkedIn and connect to every one of their colleagues that they would work with again. Then, once in a while, keep that connection alive by sending a message or commenting on a post.

It's a long game, but will pay dividends should you ever need to chat with them.


Definitely all the posting and activity on there seems very strange and is not something I’m remotely interested in participating in. But recruiters have often found me through LinkedIn and connected me with jobs, so it’s still useful overall and I keep my profile up to date.

I use it as a write-only medium. I post about what I am doing and leave. It keeps my audience and people in my industry aware of what I am working on, and that makes my work more impactful. I have met many people that way, including my current partner.

It’s part of my strategy of working in public. It’s good for business and for my morale.


>what the entirety of LinkedIn is for

Signalling allegiance.



It's ASCII art, so the "trajectory" will always stay within the lines, because you can't have the ● and ║ characters intersect each other.

The only impressive part would be that the trajectory is "continuous", meaning for every ● there is always another ● character in one of the 4 adjacent positions.
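For what it's worth, that continuity property is easy to check mechanically. A minimal sketch, assuming each frame is just a multi-line string and using the 4-adjacency rule described above (the function name and grid handling are my own assumptions):

    # Sketch: check that every ● in an ASCII-art frame has at least one ●
    # in a 4-adjacent cell (up/down/left/right), i.e. the path is "continuous".
    def trajectory_is_continuous(art: str) -> bool:
        dots = {
            (r, c)
            for r, row in enumerate(art.splitlines())
            for c, ch in enumerate(row)
            if ch == "●"
        }
        if len(dots) < 2:
            return True
        neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        return all(
            any((r + dr, c + dc) in dots for dr, dc in neighbours)
            for (r, c) in dots
        )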


I know the characters can’t cross. By intersect, I mean two dots on either side of a boundary line in the direction of the path.
