Hacker News | santadays's comments

One definition of analysis is: The process of separating something into its constituent elements.

I think when someone designs a software system, this is the root process, to break a problem into parts that can be manipulated. Humans do this well, and some humans do this surprisingly well. I suspect there is some sort of neurotransmitter reward when parsimony meets function.

Once we can manipulate those parts we tend to reframe the problem as the definition of those parts, the problem ceases to exist and what is left is only the solution.

With coding agents we end up in a weird place: either we just give them the problem, or we give them the solution. Giving them the solution means we have to supply more and more detail until they arrive at what we want. Giving an agent the problem means we never really get the satisfaction of the problem dissolving into the solution.

At some level we have to understand what we want. If we don't we are completely lost.

When the problem changes we need to understand it, orient ourselves to it, and figure out which parts still apply, which need to change, and what needs to be added. If we had no part in the solution, we are that much further behind in understanding it.

I think this, at an emotional level, is what developers are responding to.

Assumptions baked into the article are:

You can keep adding features and Claude will just figure it out. Sure, but for whom, and will they understand it?

Performance won't demand you prioritize feature A over feature B.

Security (that you don't understand) will be implemented over feature C, because Claude knows better.

Claude will keep getting more intelligent.

The only assumption I think is right is that Claude will keep getting better. All the other assumptions require that you know WTF you are doing (which we do, but for how long will we know what we are doing?).


Maybe one day our knee-jerk reactionary outrage will be quelled not by any enlightenment but because we are forced to grow weary of falling prey to phishing attacks.

I'd feel pretty stupid getting worked up about something only to realize that getting worked up about it was used against me.

I'm writing this because for a moment I did get worked up and then had the slow realization it was a phishing attack, slightly before the article got to the point.

Anyways, I think the clickbait is kind of appropriate here because it rather poignantly captures what is going on.


I agree. It can demonstrate the knee-jerk effect in real time for the reader. Someone who reacts strongly to the title of this thread would have experienced a similar reaction if they had received the SendGrid phish email. Never seen clickbait wording actually be appropriate before.

When I see stories that make me want to click, I read HN comments first, and eight times in ten that saves me from a "won't get fooled again" moment.

There's got to be a way to generalize this for anyone who still cares about the difference between real facts and manipulation.


The effectiveness of these techniques will die off over time as young people are increasingly inoculated against them, in the same way our generations are generally immune to traditional advertising. The memetic filters get better over time as we geezers are replaced by new models.

I've seen the following quote:

"The energy consumed per text prompt for Gemini Apps has been reduced by 33x over the past 12 months."

My thinking is that if Google can give away LLM usage (which is obviously subsidized), it can't be astronomically expensive; it's probably in the realm of what we are paying for ChatGPT. Google has their own TPUs and a company culture oriented towards optimizing energy usage and hardware costs.

I tend to agree with the grandparent on this: LLMs will get cheaper for the level of intelligence we have now, and will get more expensive for SOTA models.


Google is a special case - ever since LLMs came out I've been pointing out that Google owns the entire vertical.

OpenAI, Anthropic, etc are in a race to the bottom, but because they don't own the vertical they are beholden to Nvidia (for chips), they obviously have less training data, they need a constant influx of cash just to stay in that race to the bottom, etc.

Google owns the entire stack - they don't need nvidia, they already have the data, they own the very important user-info via tracking, they have millions, if not billions, of emails on which to train, etc.

Google needs no one, not even VCs. Their costs must be a fraction of the costs of pure-LLM companies.


> OpenAI, Anthropic, etc are in a race to the bottom

There's a bit of nuance hiding in the "etc". Openai and anthropic are still in a race for the top results. Minimax and GLM are in the race to the bottom while chasing good results - M2.1 is 10x cheaper than Sonnet for example, but practically fairly close in capabilities.


> There's a bit of nuance hiding in the "etc". Openai and anthropic are still in a race for the top results.

That's not what is usually meant by "race to the bottom", is it?

To clarify, in this context I mean that they are all in a race to be the lowest margin provider.

They're at the bottom of the value chain - they sell tokens.

It's like being an electricity provider: if you buy $100 of electricity and produce 100 widgets, which you sell for $1k each, that margin isn't captured by the provider.

That's what being at the bottom of the value chain means.


I get what it means, but it doesn't look to me like they're trying that yet. They don't even care that people buy multiple highest-tier plans and rotate them every week, because they don't provide a high enough tier for existing customers. I don't see any price war happening. We don't know what their real margins are, but I don't see the race there. What signs do you see that Anthropic and OpenAI are in the race to the bottom?

> I don't see any price war happening. What signs do you see that Anthropic and OpenAI are in the race to the bottom?

There don't need to be signs of a race (or a price war), only signs of commodification; all you need is a lack of differentiation between providers for something to turn into a commodity.

When you're buying a commodity, there's no big difference between getting your commodity delivered by $PROVIDER_1 and getting your commodity delivered by $PROVIDER_2.

The models are all converging quality-wise. Right now the number of people who swear by OpenAI models is about the same as the number who swear by Anthropic models, which is about the same as the number who swear by Google's models, etc.

When you're selling a commodity, the only differentiation is in the customer experience.

Right now, sure, there's no price war, but right now almost everyone who is interested is playing with multiple models anyway. IOW, the target consumers are already treating LLMs as a commodity.


Gmail has 1.8b active users, each with thousands of emails in their inbox. The number of emails they can train on is probably in the trillions.

Email seems like not only a pretty terrible training data set, since most of it is marketing spam with dubious value, but also an invasion of privacy, since information could possibly leak about individuals via the model.

> Email seems like not only a pretty terrible training data set, since most of it is marketing spam with dubious value

Google probably even has an advantage there: filter out everything except messages sent from one valid Gmail account to another. If you do that, you drop most of the spam and marketing and are left with mostly human-to-human interactions. Then they have their spam filters on top of that.


I'd upgrade that "probably" leak to "will absolutely" leak, albeit with some loss of fidelity.

Imagine industrial espionage where someone is asking the model to roleplay a fictional email exchange between named corporate figures in a particular company.


> Google has ... company culture oriented towards optimizing the energy usage/hardware costs.

Google has a company culture of luring you in with freebies and then mining your data to sell ads.


> if Google can give away LLM usage (which is obviously subsidized) it can't be astronomically expensive

There is a recent article by Linus Sebastian (LTT) talking about YouTube: it is almost impossible to support the cost of building a competitor because it is astronomically expensive (vs. potential revenue).


I do not disagree that they will get cheaper, but I am pointing out that none of this is being reflected in hardware pricing. You state LLMs are becoming more optimized (less expensive). I agree. This should have a knock-on effect on hardware prices, but it is not happening. Where is the disconnect? Are hardware prices a lagging indicator? Is Nvidia still a 5 trillion dollar company if we see another 33x improvement in "energy consumed per text prompt"?

Jevons paradox. As AI gets more efficient, its potential scope expands further and the hardware it runs on becomes even more valuable.

BTW, the absolute lowest "energy consumed per logical operation" is achieved with so-called 'neuromorphic' hardware that's dog slow in latency terms but more than compensates with extreme throughput. (A bit like an even more extreme version of current NPU/TPUs.) That's the kind of hardware we should be using for AI training once power use for that workload is measured in gigawatts. Gaming-focused GPUs are better than your average CPU, but they're absolutely not the optimum.


GraalVM supports running JavaScript in a sandbox, with a bunch of convenient options for untrusted code.

https://www.graalvm.org/latest/security-guide/sandboxing/


Oh that looks neat! It appears to have the memory limits I want (engine.MaxIsolateMemory) and a robust CPU limit: sandbox.MaxCPUTime

One catch: the sandboxing feature isn't in the "community edition", so it's only available under the non-open-source (but still sometimes free, I think?) Oracle GraalVM.


I get this take, but given the state of the world (the US anyways), I find it hard to trust anyone with any kind of profit motive. I feel like any information can’t be taken as fact, it can just be rolled into your world view and discarded if useful or not. If you need to make a decision that can’t be backed out of and that has real-world consequences, I think/hope most people are learning to do as much due diligence as is reasonable. LLMs seem at this moment to be trying to give reliable information. When they’ve been fine-tuned to avoid certain topics it’s obvious. This could change, but I suspect it will be hard to fine-tune them too far in a direction without losing capability.

That said, it definitely feels as though keeping a coherent picture of what is actually happening is getting harder, which is scary.


> I feel like any information can’t be taken as fact, it can just be rolled into your world view and discarded if useful or not.

The concern, I think, is that for many, that “discard function” is not “Is this information useful?” but rather “Does this information reinforce my existing world view?”

That feedback loop and where it leads is potentially catastrophic at societal scale.


This was happening well before LLMs, though. If anything, I have hope that LLMs might break some people out of their echo chambers if they ask things like "do vaccines cause autism?"


> I have hope that LLMs might break some people out of their echo chambers

Are LLMs "democratized" yet, though? If not, then it's just as likely that LLMs will be steered by their owners to reinforce an echo chamber of their own.

For example, what if RFK Jr launched an "HHS LLM" - what then?


... nobody would take it seriously? I don't understand the question.


> I find it hard to trust anyone with any kind of profit motive.

As much as this is true, and e.g. doctors for sure can profit (here in my country they don't get any kind of sponsor money AFAIK, other than charging very high rates), there is still accountability.

We have built a society based on rules and laws; if someone does something that can harm you, you can follow a path to at least hold someone accountable (or try).

The same cannot be said about LLMs.


> there is still accountability

I mean there is some if they go wildly off the rails, but in general, if the doctor gives a prognosis based on a tiny amount of the total corpus of evidence, they are covered. That works well if you have the common issue, but can quickly go wrong if you have the uncommon one.


Comparing anything real professionals do to the endless, unaccountable, unchangeable stream of bullshit from AI is downright dishonest.

This is not the same scale of problem.


I can’t imagine this is not happening. There exists the will, the means and the motivation, with not a small dose of what pg might call naughtiness.


Don't know about Excel, but for Google Sheets you can ask ChatGPT to write you an Apps Script custom function, e.g. CALL_OPENAI, and then pass variables into it: =CALL_OPENAI("Classify this survey response as positive, negative, or off-topic: "&A1)
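
For anyone who wants to try this, here is a rough sketch of what such a custom function could look like, assuming you store the API key in Script Properties under 'OPENAI_API_KEY' and have access to the model named below (both are my assumptions, not something the original setup requires):

    /**
     * Sketch of a CALL_OPENAI custom function for Google Sheets.
     * The model name and the 'OPENAI_API_KEY' script property are assumptions.
     * @customfunction
     */
    function CALL_OPENAI(prompt) {
      var apiKey = PropertiesService.getScriptProperties().getProperty('OPENAI_API_KEY');
      var response = UrlFetchApp.fetch('https://api.openai.com/v1/chat/completions', {
        method: 'post',
        contentType: 'application/json',
        headers: { Authorization: 'Bearer ' + apiKey },
        payload: JSON.stringify({
          model: 'gpt-5-mini', // assumed model; use whichever one you have access to
          messages: [{ role: 'user', content: prompt }]
        })
      });
      // Return the first completion's text so it lands directly in the cell.
      var data = JSON.parse(response.getContentText());
      return data.choices[0].message.content.trim();
    }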


Sheets also has an `AI` formula now that you can use to invoke Gemini models directly.


When I tried the Gemini AI formula it didn't work very well. gpt-5-mini or nano are cheap and generally do what you want if you are asking something straightforward about a piece of content you give them. You can also give a JSON schema to make the results more deterministic.
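
For what it's worth, here is a hedged sketch of the same custom-function idea with a JSON schema attached via the API's structured-output option; the function name, model, and schema below are illustrative assumptions:

    /**
     * Illustrative variant of CALL_OPENAI that constrains the reply with a JSON
     * schema. Function name, model, and schema contents are assumptions.
     * @customfunction
     */
    function CLASSIFY_RESPONSE(text) {
      var apiKey = PropertiesService.getScriptProperties().getProperty('OPENAI_API_KEY');
      var payload = {
        model: 'gpt-5-mini', // assumed model name
        messages: [{ role: 'user', content: 'Classify this survey response as positive, negative, or off-topic: ' + text }],
        response_format: {
          type: 'json_schema',
          json_schema: {
            name: 'classification',
            strict: true,
            schema: {
              type: 'object',
              properties: { label: { type: 'string', enum: ['positive', 'negative', 'off-topic'] } },
              required: ['label'],
              additionalProperties: false
            }
          }
        }
      };
      var response = UrlFetchApp.fetch('https://api.openai.com/v1/chat/completions', {
        method: 'post',
        contentType: 'application/json',
        headers: { Authorization: 'Bearer ' + apiKey },
        payload: JSON.stringify(payload)
      });
      // The reply content is JSON text like {"label":"positive"}; parse it and return the label.
      var message = JSON.parse(response.getContentText()).choices[0].message;
      return JSON.parse(message.content).label;
    }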


It seems like there are a bunch of research results and working implementations that allow efficient fine-tuning of models. Additionally, there are ways to tune a model toward outcomes rather than training examples.

Right now the state of the world with LLMs is that they try to predict a script in which they are a happy assistant as guided by their alignment phase.

I'm not sure what happens when they start getting trained in simulations to be goal-oriented, i.e. their token generation is based not on what they think should come next, but on what should come next in order to accomplish a goal. Not sure how far away that is, but it is worrying.


That's already happening. It started happening when they incorporated reinforcement learning into the training process.

It's been some time since LLMs were purely stochastic average-token predictors; their later RL fine tuning stages make them quite goal-directed, and this is what has given some big leaps in verifiable domains like math and programming. It doesn't work that well with nonverifiable domains, though, since verifiability is what gives us the reward function.


That makes sense for why they are so much better at writing code than actually following the steps the same code specifies.

Curious, is anyone training in adversarial simulations? In open world simulations?

I think what humans do is align their own survival instinct with surrogate activities and then rewrite their internal schema to be successful in said activities.


I think this is the wrong take. I don't agree that people are good or bad; I think actions are, and there are lots of reasons and motivations that can lead a person to end up enabling a bad situation. Some of those motivations can, at the time, even be justified.

I do believe Meta is very bad for the world and has way too much power. Anything that can get people to open their eyes to this is important. Dividing those who are trying isn't helping.


The other common pattern is shill reviews:

   5 stars: ||||||||||||
   4 stars: ||
   3 stars: ||||
   2 stars: |||||
   1 star:  ||||||||

