> For code modifications in a large codebase the problem with multi-shot is that it doesn't take too many iterations before I've spent more time on it.
I've found voice input to completely change the balance there.
For stuff that isn't urgent, I can just fire off a hosted codex job by saying what I want done out loud. It's not super often that it completely nails it, but it almost always helps give me some info on where the relevant files might be and a first pass on the change.
Plus it has the nice side effect of being a todo list of quick stuff that I didn't want to get distracted by while working on something else, and often helps me gather my thoughts on a topic.
It's turned out to be a shockingly good workflow for me.
Should we not teach kids math because calculators can handle it?
Practically, though, how would someone become good at just the skills LLMs don't do well? Much of this discussion is about how that's difficult to predict, but even if you were a reliable judge of what sort of coding tasks LLMs would fail at, I'm not sure it's possible to only be good at that without being competent at it all.
> Should we not teach kids math because calculators can handle it?
We don't teach kids how to use an abacus or a slide rule. But we teach positional representations and logarithms.
The goal is to teach the theoretical concepts so you can pick up the practical skills if necessary. The same will happen with code.
You don't need to memorize the syntax to write a for loop or for each loop, but you should understand when you might use either and be able to look up how to write one in a given language.
Should you never use a calculator because you want to keep your math skills high?
There are a growing set of problems which feel like using a calculator for basic math to me.
But also school is a whole other thing which I'm much more worried about with LLMs. Because there's no doubt in my mind I would have abused AI every chance I got if it were around when I was a kid, and I wouldn't have learned a damn thing.
I don't use calculators for most math because punching it in is slower than doing it in my head -- especially for Fermi calculations. I will reach for a calculator when it makes sense, but because I don't use a calculator for everything, the number of places where I'm faster than a calculator grows over time. It's not particularly intentional, it just shook out that way.
I do not trust myself, so even if I know how to do mental math, I still use my computer or a calculator just to be sure I got it correct. OCD? Lack of self-trust? No clue.
I've found most models don't do well with negatives like that. This is me personifying them, but it feels like they fixate on the thing you told them not to do, and they just end up doing it more.
I've had much better experiences with rephrasing things in the affirmative.
The closest I've gotten to avoiding the emoji plague is to instruct the model that responses will be viewed on an older terminal that only supports extended ASCII characters, so it should use only those for accessibility.
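When the prompt alone doesn't hold, a blunt fallback is to filter the output after the fact. A minimal sketch (the function name and the Latin-1 cutoff are my own choices here, not tied to any particular model API):

```python
def to_extended_ascii(text: str) -> str:
    """Keep only characters in the Latin-1 range (code point <= 255),
    which drops emoji and most other decorative symbols while keeping
    accented "extended ASCII" characters intact."""
    return "".join(ch for ch in text if ord(ch) <= 0xFF)

# The emoji is stripped; the accented character survives.
print(to_extended_ascii("Done \U0001F389 Café"))
```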
A lot of these issues must be baked in deep with models like Claude. It's almost impossible to get rid of them with rules/custom prompts alone.
Because it's a stupid autocomplete, it doesn't fully understand negation: it statistically judges the weight of your words to find the next one, and the next one, and the next one.
That's not how YOU work, so it makes no sense to you. You're thinking "but when I said NOT, a huge red flag with a red cross popped up in my brain, so why does the LLM still do it?" Because it has no concept of anything.
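The point can be made concrete with a toy next-token sampler. This is only an illustration of the statistical-autocomplete idea (real LLMs use learned neural distributions, not bigram counts), with a made-up corpus:

```python
import random

# Tiny "corpus": the model only learns which word tends to follow which.
corpus = "do not use emoji . please use plain text . do use short words .".split()

# Bigram table: observed continuations for each word.
follows: dict = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, []).append(nxt)

def generate(start: str, n_words: int, seed: int = 0) -> list:
    """Sample each next word from the observed continuations of the last word.
    "not" is just another token that shifts the statistics; nothing in this
    process represents the *meaning* of negation."""
    random.seed(seed)
    out = [start]
    for _ in range(n_words):
        out.append(random.choice(follows.get(out[-1], ["."])))
    return out

print(" ".join(generate("do", 5)))
```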
PHP's approach is simple though, and in my experience that simplicity pays off when you do start scaling the systems.
In other systems, once you get beyond a single machine you need that external communication mechanism anyway, and now you have multiple classes of comms, which introduces bugs, complexity, and performance cliffs.
In PHP you just throw another server at it, and it acts the same as if you'd added another process. Nice linear scaling and simple-to-understand architectures.
Our group once badgered our DM at the time into allowing the party's pet goat to deal some minimal amount of damage in combat. Then we backtracked and bought a hundred of them from the local shepherd and had a small goat army for a bit.
Unfortunately there was a flood shortly after and our goat army was lost.
My experience with a few fun DMs is that you have to be really careful with the shenanigans. I'm not surprised at all about the flood that took out your goats. I'm impressed with the restraint demonstrated by your DM in fact... one of my old DMs would have almost certainly done something more damaging first; off the top of my head, good chance we would have woken up to discover that the goats had eaten all of our clothing in the middle of the night.
Or, you know, having to deal with all the goat excrement; or either stopping for several hours three times a day to "pasture" (and having to find a place where the locals will allow you to do that), or carrying grain for 100 goats around with you.
> We propose adopting diffusion language models for text embeddings, motivated by their inherent bidirectional architecture and recent success in matching or surpassing LLMs especially on reasoning task.
I didn't realize diffusion language models were at this point yet. But what's the catch? Why aren't diffusion models (or some kind of hybrid) taking over?
But that's unlikely to be the case here because they've been up there isolated for over 6 months now