I'm in between. 10K line files are usually extremely messy, but they can be writt...

sbergot · on Jan 3, 2023

I disagree with this stance. Creating a file and naming it gives it a purpose. It creates a unit of change that tools like git can report on.

nonethewiser · on Jan 3, 2023

A line is a unit of change that git can report on.

If it's a separate file that is scoped to some specific concern, sure. But it's that grouping by concern that is key. Not separation into another file. Extracting random bits of code into separate files would be _worse_.

swsieber · on Jan 3, 2023

> A line is a unit of change that git can report on.

Yes and no.

Git doesn't store lines, it stores files. Git diff knows how to spit out line changes by comparing files.

So to run git blame on a 10k line file you're reading multiple versions of that 10k file and comparing. It's slow. Worse still is that trying to split said file up while trying to preserve history won't make the git blame any faster.
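To illustrate the "stores files, derives lines" point, here is a minimal sketch in Python. It assumes two full snapshots of a file held as strings and uses the standard library's `difflib`, which is not the algorithm git itself uses; the function name `line_changes` is made up for this example. The takeaway is only that line-level changes are computed on demand by comparing whole snapshots.

```python
import difflib


def line_changes(old_text, new_text):
    """Derive removed/added lines by comparing two whole-file snapshots.

    Hypothetical helper for illustration; git's own diff machinery differs.
    """
    old = old_text.splitlines()
    new = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes both removed and added lines.
        if tag in ("replace", "delete"):
            changes += [("-", line) for line in old[i1:i2]]
        if tag in ("replace", "insert"):
            changes += [("+", line) for line in new[j1:j2]]
    return changes


snapshot_v1 = "a\nb\nc\n"
snapshot_v2 = "a\nB\nc\n"
changes = line_changes(snapshot_v1, snapshot_v2)
```

Both snapshots have to be read in full before any line change can be reported, which is why the cost grows with file size and with the number of versions compared.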

karamanolev · on Jan 3, 2023

Yes and yes. While I agree with the general points, note that they didn't say "unit that git stores", but "unit git can report on". Git can totally report on lines as a unit of change.

beagle3 · on Jan 3, 2023

git diff understands function boundaries, and for many languages will “report” equally well on a single file.

It’s a good idea to break things down to files along logical boundaries. But git reporting isn’t a reason.

edit: "got diff" -> "git diff". DYAC and responding from mobile!

joshuamorton · on Jan 3, 2023

Git diff absolutely does not understand function boundaries. Its diff algorithms are routinely confused by things like adding a single new function, thinking that the diff should begin with a "}" instead of a function definition.
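A minimal sketch of why a line-based diff can legitimately begin with a "}": when the line just before an inserted block equals the block's last line, the insertion can be shifted up by one line and still produce the same result. The C-like lines and the `insert_at` helper below are made up for illustration and are not git's code.

```python
def insert_at(lines, index, block):
    """Return a copy of `lines` with `block` inserted before position `index`."""
    return lines[:index] + block + lines[index:]


old = ["void a() {", "    one();", "}"]

# The insertion a human would expect: a blank line, then the whole new function.
natural = ["", "void b() {", "    two();", "}"]

# The same change shifted up one line: it starts with the "}" that closes a().
shifted = ["}", "", "void b() {", "    two();"]

new_from_natural = insert_at(old, 3, natural)
new_from_shifted = insert_at(old, 2, shifted)
```

Both insertions yield an identical new file, so a tool that only compares lines has no textual basis for preferring one over the other; it has to rely on heuristics to pick the variant that lines up with the function.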

marcosdumay · on Jan 3, 2023

It varies a bit with language and tooling, but 10k lines is around the point where the size of your file by itself becomes a major impediment to finding anything and understanding what is important.

A 10k-line file is not something that will completely destroy your productivity, but it will have an impact and you'd better look out for it growing further, because completely destroying your productivity is not too far away. It is almost always good to organize your code when it reaches a size like this, and the exceptions are in contexts where you can't, never in contexts where it's worthless.

karamanolev · on Jan 3, 2023

I generally agree. My argument is that 10K lines written one way can certainly be more readable than 10 files x 1K lines written in a different way, so the real differentiator is the encapsulation and code style, not KLOC/file per se.

regularfry · on Jan 3, 2023

What muddies the waters here is languages like Java, where "10k lines" means "you've got a 10kLOC class there", and ecosystems like PHP's where, while there's nothing in the language to require it, people and teams will insist on one class per file because hard rules are easier to understand and enforce than "let's let this area evolve as we increase our understanding".

As long as what's there is comprehensible, being able to evolve it over time is a very useful lever.