lukeplato · on Dec 13, 2023

models trained on gpt output might be more distilled and specialized but it wouldn't be improving generalization

lukeplato · on Dec 13, 2023

https://twitter.com/pfau/status/1674766269113937920

eightysixfour · on Dec 13, 2023

I disagree with this. If you give GPT information that was not part of its dataset and ask it to make question and answer pairs off of that information, you are adding higher quality breadth to the training corpus.

Phi-2 seems like pretty good proof of that.

verdverm · on Dec 13, 2023

that's the point, they get less good at everything, but really good at one or a few things

The real benefit here is

1. It's much cheaper and faster to train a bunch of specialized models once you have a single good LLM

2. You probably can't get the same capabilities from a specialized model by training it directly.

lukeplato · on Nov 18, 2023

My impression of Ilya is that it would be far more likely to be a safety related issue than a business/profits related issue

lukeplato · on Oct 25, 2023

Would be cool if this shared the user-found specs to create a database of API specs for the web

lukeplato · on Oct 13, 2023

Not sure if it's clickbait, but there is good reason to believe in a deep connection between 4d gauge field theories (i.e. of the standard model) and number theory through representation theory. This involves other work than mentioned in this article though it's still inscrutable for most at this point (motivic cohomology/Hilbert–Pólya in RH, supersymmetry in YM mass gap, cosmic galois in renormalization).

From Peter Woit's blog:

> It’s worth noting that while there are many connections to the ideas originating with Langlands, this new work shows that the “Langlands program” has expanded into a striking vision relating different areas of mathematics, with a strong connection to deep ideas about quantization and quantum field theory. The way in which these ideas bring together number theory and quantum field theory provide new evidence for the deep unity of fundamental ideas about mathematics and physics.

https://www.math.columbia.edu/~woit/wordpress/?p=13578

danbruc · on Oct 13, 2023

> [...] provide new evidence for the deep unity of fundamental ideas about mathematics and physics.

This makes no sense to me at all. Mathematics is a modeling language and when we use it to model fundamental aspects of the real world, then we call that physics. So there is obviously some relation between mathematics and physics but I do not understand why you would expect some kind of unity, mathematics is much richer than physics.

Koshkin · on Oct 13, 2023

This makes no sense to you precisely because you think that mathematics is (only) a language. It's not. A language is invented; mathematics is discovered. In the US it's not called a science, but for all intents and purposes it indeed is.

danbruc · on Oct 13, 2023

> A language is invented; mathematics is discovered.

I beg to differ. Mathematics is invented when you define the axioms of some structure, after that you discover the consequences of the axioms you picked. First you define the natural numbers and the operations on them, then you discover the primes. So actually invention and discovery are not mutually exclusive here, but invention is more fundamental as it defines what is there to be discovered.

Koshkin · on Oct 13, 2023

You define, but the choice is in fact not quite arbitrary; rather, it is guided by something objective, i.e. by something that exists outside of your own mind.

danbruc · on Oct 13, 2023

What are the constraints? I can make up any set of axioms I like and see what they imply, there is no need that they are in any way related to the physical world. Sure, there is a lot of mathematics that was specifically invented in order to deal with the real world, but that is not a general requirement.

Koshkin · on Oct 13, 2023

> I can make up any set of axioms

Yeah, but that wouldn't be mathematics.

danbruc · on Oct 13, 2023

Why not? If I decide I want to study the properties of the space of total function from vectors of octonions with prime dimension to the surreal numbers, who is to say that this is not mathematics?

Koshkin · on Oct 13, 2023

What you are describing is not "any set of axioms." (Rather, it is a pre-existing concept.)

danbruc · on Oct 13, 2023

Yes, but that is a technical detail, it was just easier for me to come up with that than making up a set of axioms. How is an obscure combination of existing axioms still mathematics but not any set of axioms I make up? And how would we ever extend mathematics if coming up with new axioms is not mathematics?

But let us just take the integers with the common definitions for addition, subtraction and multiplication, but then redefine them so that every operation first performs the usual operation and then increments the result by one.

  1 + 1 = 3
  1 * 1 = 2
  (1 + 2) * 3 = 13
  (a + b) * c = a * c + b * c + c - 4

Still not quite what I had in mind, nothing completely new, but maybe at least different enough from the normal integers to have some weird properties. Not the numbers themselves, they are still just the integers, but the algebraic expressions involving the redefined operations.
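The redefined arithmetic above can be checked mechanically. Below is a minimal sketch, assuming each redefined operation is the usual one plus one; the function names `add`, `sub`, `mul` and `identity_holds` are made up for illustration and do not appear in the thread. It checks the three numeric examples and the modified distributive law, reading every `+`, `-` and `*` on the right-hand side as a redefined operation applied left to right.

```python
# Hypothetical helpers: the usual integer operation, then increment by one.
def add(a, b):
    return a + b + 1


def sub(a, b):
    return a - b + 1


def mul(a, b):
    return a * b + 1


# The three numeric examples from the comment.
assert add(1, 1) == 3
assert mul(1, 1) == 2
assert mul(add(1, 2), 3) == 13


def identity_holds(a, b, c):
    """Check (a + b) * c = a * c + b * c + c - 4 using only redefined ops."""
    lhs = mul(add(a, b), c)
    # Right-hand side, evaluated left to right with the redefined operations.
    rhs = sub(add(add(mul(a, c), mul(b, c)), c), 4)
    return lhs == rhs


# Spot-check the identity on a small cube of integers.
values = range(-5, 6)
assert all(identity_holds(a, b, c) for a in values for b in values for c in values)

# The ordinary distributive law fails, e.g. for a=2, b=3, c=4.
assert mul(add(2, 3), 4) != add(mul(2, 4), mul(3, 4))
```

In ordinary arithmetic both sides reduce to a*c + b*c + c + 1. The identity therefore holds for all integers, not just the sampled range, under this reading of the rule.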

Koshkin · on Oct 13, 2023

The problem is that the word "any" stands for "random," or "arbitrary." It is not inconceivable that a scientist, say, would mix "random" substances together just to see what happens. But almost all such experiments would result in exactly nothing, nothing interesting anyway. Usually, a scientist first notices something, puts forward a concrete hypothesis, and then conducts an experiment. In mathematics, it is the same: for example, one notices something that certain things have in common and then tries to generalize this commonality into "axioms." Modern mathematics is a highly evolved field full of such examples. Nobody starts by scribbling random doodles on a blank sheet of paper in the hope that something interesting will come out of it. While not impossible, it is extremely unlikely. In the end, one might say that what is mathematics and what is not is merely a matter of definition, but that, as well as calling the "language of doodles" mathematics, would make the word devoid of any useful meaning, IMHO.

danbruc · on Oct 13, 2023

Sure, randomly generating axioms would probably not be very productive; most of the time you will continue from known territory, generalize something, add additional constraints, whatever. And if that is what you mean by discovery, identifying things that might be interesting to change in one way or another, fine, that is not an unreasonable description of the process. But you are still inventing a new structure when you decide what to change and write down the new rules. If you have the naturals, you might wonder what would happen if there was an inverse of addition or of multiplication for each natural; if you pick the former you will invent the integers, if you pick the latter the non-negative rationals. You can also call that a discovery if you like, but it is not a discovery in the sense of, say, discovering the electron: you did not discover something that has always been there, you actually brought it into existence by the choices you made.

Koshkin · on Oct 13, 2023

> you will invent the integers

You will discover them, not invent. Albeit subtle, there is a difference between these notions, and you will indeed discover the integers in (almost) exactly the same sense as the electron was discovered - what you do is, you put forward an idea, play with it, test it, correct mistakes, and in the end you find what you have been looking for. What you can invent in mathematics is a proof; but even in this case the path may lead to a series of discoveries. Whether it does depends on the attitude towards it (Grothendieck's analogy between building a proof and opening a nut).

danbruc · on Oct 13, 2023

I would say it is exactly the opposite - you invent the integers, you discover a proof. Discovery for me means finding something that already exists, inventing means bringing something into existence. You discover the electron, you invent the transistor.

Before you write down the axioms for integers, they do not exist, you invent them. That there are infinitely many primes is a consequence of the axioms, you just have to notice those special numbers and give them a name. Here I find it actually quite tempting to say you invent the primes, or maybe better the idea of primes, but I still think it is more correct to say you discover them, given the axioms they were always there even if you have not yet noticed, named and described them. The same for the proof that there are infinitely many of them, given the axioms and the laws of logic, the truth of that statement and the possible proofs for it are fixed, you just have to find one.

Maybe back to the integers. Yes, you may play around with different ideas, tweak definitions and so on. But what you are actually doing is inventing a whole bunch of similar structures and then you pick the one that you like the most, that works the best or whatever. Some of your attempts might be inconsistent, some might not do what you want, but you invented them all. And among them you might discover one that works just like you wanted it to work.

Koshkin · on Oct 14, 2023

But unlike the transistor, and like the electron, the natural integers do exist in nature (i.e. have "natural existence"). To quote my other comment here, [number 3] exists as the thing that is common between three apples and the three horses you might want to feed these three apples to; and the fact of the (almost) physical existence of that commonality (i.e., the number) can be proven by you being bitten by the third horse if you only happen to have two apples instead of three. This commonality exists, objectively (no sane person can say that it does not), and so does the number 3 (because that's what it is called). Now, you can say that we invented infinity, but that's like saying "we invented the idea that jumping off a cliff is dangerous" - because in both cases the "invention" is unavoidable, as it is dictated by the objective reality which tends to harshly punish those who fail to "invent" the facts.

danbruc · on Oct 15, 2023

What then about mathematical structures that do not exist in the real world? Do you discover some and invent others? Or do you think all mathematical structures are realized?

You say that no sane person will deny the existence of three, I am actually willing to go even further, I am willing to deny the natural existence of your apples and horses. The idea of cutting the universe into pieces is arguably an invention. It is of course an extremely useful idea for comprehending the universe, but if you think about it, you are somewhat arbitrarily drawing borders around collections of atoms.

There is one forest. And a hundred trees. And millions of cells. Is the water in a tree part of the tree? What about the water inside of the cells? When does it become part of the tree? What about a water molecule just evaporating from a leaf, when does it become part of the atmosphere?

But even if there are things that can be counted, or if quarks and gluons behave in a way that can be described using SU(3), does that really imply the existence of the mathematical structure used to describe those things? Where do you get multiplication from? Or tetration? Are they invented while the basic structure of the integers is discovered? What exactly does existence even imply or require in this case?

And another thing just came to my mind. In which way do the apples and horses tell us anything about three? I can see that you can probably demonstrate that you can pair your apples and your horses, but that is quite far from telling us anything about three as you can do the same with any number of objects.

You also say the common thing is three, but that does not really help to pick out what three is, there are many common things. Also note that you introduced the idea of a set, where does that come from?

Three is not something that applies to one of the apples or horses, it applies to the sets of them. Does the set of three horses exist or is that not something you made up: this, this and that, these are my three horses?

And now that we are talking about sets, because that is where three comes into play, why did we talk about apples and horses to begin with, it totally doesn't matter what you put into your sets, just how many things. So did we just lose the relationship to the real world or at least conclude that it never mattered?

I would at least say that it is certainly not as easy as you want it to be, there is much more nuance to this than three apples, three horses, therefore natural numbers. It actually sounds circular. Let me pick three apples and three horses, now look, here are two sets of three things, so three exists.

Maybe it would be a nice challenge to tell me how to find three in nature. You obviously can not say take three horses. It seems tempting to say start without a horse, then add one, then another and yet another. But did you then not just invent that, did you not just come up with the axioms of the natural numbers, just expressed with horses instead of symbols?

Koshkin · on Oct 15, 2023

> arguably an invention

A blind person would definitely disagree, having “invented” things he keeps bumping into, one after another… See, knowledge is never invented, it’s based on discovery; and mathematics is a form of knowledge.

> anything about three

Well, the discussion wasn’t about any particular number or concept, but if you want to define 3, you should look no further than the special property of a three-legged stool, the smallest polygon, or, if you look close enough, the proton. The general notion of the number, then, may come from a comparison (trying to see if there is a one-to-one correspondence) between the legs of a stool that is safe to sit on and those of one on which you risk breaking your neck. If that’s full of “nuance,” I don’t know what isn’t.

danbruc · on Oct 16, 2023

> A blind person would definitely disagree, having “invented” things he keeps bumping into, one after another… See, knowledge is never invented, it’s based on discovery; and mathematics is a form of knowledge.

I was not talking about inventing the things themselves but logically dividing them up. For the blind person it makes no difference whether he bumps into a tree because you decided to divide the forest into several individual trees or whether he bumps into the entire forest. But if you want to count things, then it of course makes a huge difference whether there is just the forest or whether there is a collection of trees. This subdivision of the universe - or the forest - into several individual objects - or trees - is arguably invented, not the universe or the forest itself.

Knowledge is a thing you have, your awareness of some fact. Mathematics is not that, it is some form of fact you can be aware of. You can have knowledge about mathematics but it is not knowledge itself. If I name my dog Beethoven, that establishes the fact that the name of my dog is Beethoven, that is a kind of invention. If I tell you about this, you gain knowledge about the name of my dog. No discovery involved, neither when establishing the fact nor when you learn about it.

> [...] look no further than the special property of a three-legged stool, the smallest polygon [...]

Okay, I found a stool that does not wobble on uneven ground and a triangular rock. Nothing about them on their own is related to three. You can of course explain to me what a leg is or the corner of a triangle and then ask me to form the sets of legs and corners and point out to me that the sets have the same number of elements, but there is a lot of stuff going on here. You made me find things related to three, and I never doubted that there are such things, the number of dimensions of space would be another good candidate. But you mostly gloss over the hard part of actually extracting the natural numbers in general or three in particular.

It is of course trivial in everyday language as we learn about pairing and counting things relatively early in our life and humanity has made use of those ideas for a long time. Look, the number of legs equals the number of corners. And there are as many of them as I have horses and apples. But notice that those sentences are full of ideas and words related to numbers - number of legs, equals, as many.

But just as with the horses and apples, nothing about the non-wobbly stool or the smallest polygon - actually you mean the polygon with the fewest number of sides and note that there the idea of numbers already sneaked in again - is intrinsically related to three. It is the set of legs and the set of sides that are related to three, it is the cardinality of the sets that behaves like the naturals. The process of forming those sets does a lot of heavy lifting to get you towards finding numbers in nature. And I do not think you can just brush that under the rug, you will have to justify that forming sets and looking at their cardinality is not something that humans invented.

Koshkin · on Oct 16, 2023

> it makes no difference whether he bumps into a tree because you decided

Exactly. It makes no difference for him that you want to see a proof of individual trees' objective existence - because he already knows this for a fact! That's what that pesky objective reality does, sometimes forcing knowledge about itself upon us, whether we like it or not, or whether we would prefer some other "proof." The proof is in the pudding, as they say.

> Mathematics is not that.

Sure it is. One who knows about numbers knows more about the objective reality than those who don't. One who knows about Lie groups knows even more.

> you gain knowledge about the name of my dog. No discovery involved

That's not quite true: I discover that you have a dog (as long as you did not "invent" it).

> Nothing about them on their own is related to three.

It does, if you look at it from the right angle. There's a different thing at play here. While the number of legs could be easily matched with the corresponding number of apples, by itself this correspondence does not necessarily make any particular number stand out (although in some cases, like, say, in the case of a non-wobbly stool, a triangle, or the number of eyes and hands, it would - simply because there are many pairs of eyes, etc.); what's also at play here is different ways to look at numbers, which includes seeing them not only as "cardinals" (which is what you are still limiting yourself to) but also as "ordinals": the number 1 "stands out" as the smallest ordinal (greater than "nothing"), the number 2 is what follows it, etc. It is all these aspects combined that form the true content of the notion of the number.

danbruc · on Oct 17, 2023

> It makes no difference for him that you want to see a proof of individual trees' objective existence - because he already knows this for a fact!

You are missing my point here. The blind guy knows he bumped into something, so something exists. He could just say he bumped into a part of the universe, he is not forced to say he bumped into a forest or a tree. He could even consider himself part of the universe and say one part of the universe bumped into another part of the universe, just as one part of you bumps into another part of you when you clap your hands. The consequence of that is that there are no distinct objects to count, it is just one really complex object, the universe, interacting with itself.

> Sure it is. One who knows about numbers knows more about the objective reality than those who don't. One who knows about Lie groups knows even more.

No, that is a kind of map-territory thing. You can have knowledge of mathematics but mathematics is not knowledge. You can have knowledge of my dog's name but my dog's name is not knowledge.

> It does, if you look at it from the right angle.

Switching from the cardinals to the ordinals will not really make a difference, you are glossing over a lot of heavy lifting. I have to repeat myself, the non-wobbly stool is not related to three, it is the set of its legs that is related to three. To get there, you have to single out parts of the stool, combine the parts into a set, and then talk about its cardinality. There are a lot of steps and concepts on that way, which makes it at least very non-obvious how the number three was always there and is not just the result of that process.

I am open to an example of how I would find the ordinals in nature, but I am not sure it will be any easier than with the cardinals. None of the trees in the forest is the first one, you will have to impose an order on them. Maybe something with time, sunrises or days, they are at least already ordered.

Koshkin · on Oct 17, 2023

> You are missing my point here.

Not at all! Talk about losing the trees for the forest... Even the blind guy knows, viscerally, that what he is hugging is not the "entire universe," or that stepping off a cliff would result in a dramatic experience; that a sighted person (and a philosopher) can see "a bigger picture" does not change the facts; indeed, a focus on "a bigger picture" can be deceiving (think of quantum vs. classical mechanics).

> my dog's name is not knowledge

We are going in circles. Mathematics is not just "names."

> the non-wobbly stool is not related to three

It is. You have a class of non-wobbly stools with a matching number of legs. You have a class of girls with a matching number of eyes that boys like more than others. Examples abound.

> how I would find the ordinals in nature

The early bird may get the worm, but it's the second mouse who gets the cheese.

See also https://byjus.com/maths/ordinal-numbers/

lukeplato · on Sept 25, 2023

it's not rolled out yet

lukeplato · on Sept 19, 2023

I recommend watching the Hinton talk on dark knowledge on YouTube

killjoywashere · on Sept 19, 2023

tl;dw?

woadwarrior01 · on Sept 19, 2023

Model distillation.

lukeplato · on Aug 8, 2023

I imagine this is for training a multimodal transformer to retrieve previously viewed content and personalize generative conversational bots

kderbyma · on Aug 8, 2023

I believe it is for heinous purposes and intentions far beyond beneficial uses......it's meant to be cutthroat (which means death is involved)

lukeplato · on June 8, 2023

> optimal programs are very inhuman

sounds revolutionary

lukeplato · on June 8, 2023

check out metarationality: https://metarationality.com/stem-fluidity-bridge

lukeplato · on May 23, 2023

Are there any potential improvements over transformers for interpretability or alignment?

pico_creator · on May 23, 2023

For anything past 8k context size

We are talking about over 10x reduction in GPU time for inferencing tokens and for training too

Aka it’s cheaper and faster

Alignment is frankly IMO purely a dataset design and training issue. And has nothing to do with the model