I'm in between. 10K-line files are usually extremely messy, but they don't have to be: a large number of well-organized, well-encapsulated classes of under 100 LOC each can be very readable even when smashed together in one file. It just so happens that people who tend to write readable, self-contained classes don't put them in 10 KLOC files; they split them, and vice versa. That creates the association "10 KLOC files are unreadable", when the cause is not the length of the file but the organization itself.
Same for business logic - very clear separation can be cumbersome sometimes, but otherwise it becomes messy if you're not careful. And careful people just tend to separate it.
A line is a unit of change that git can report on.
If it's a separate file that is scoped to some specific concern, sure. But it's that grouping by concern that is key, not separation into another file. Extracting random bits of code into separate files would be _worse_.
> A line is a unit of change that git can report on.
Yes and no.
Git doesn't store lines, it stores files. Git diff knows how to spit out line changes by comparing files.
So to run git blame on a 10k-line file, git reads multiple versions of that 10k-line file and compares them. It's slow. Worse still, splitting the file up while trying to preserve history won't make git blame any faster.
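To make the "stores files, reports lines" point concrete, here is a minimal sketch in Python. It is not git's actual diff implementation; it uses the standard library's `difflib` as a stand-in, and `line_changes` is a hypothetical helper name. The point is only that line-level changes are derived by comparing two whole-file snapshots, so the cost scales with the size of the files being compared.

```python
import difflib


def line_changes(old_text, new_text):
    """Derive line-level changes from two whole-file snapshots.

    Hypothetical helper for illustration; difflib stands in for git's
    own diff machinery, which works differently internally.
    """
    old = old_text.splitlines()
    new = new_text.splitlines()
    changes = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes both removed and added lines.
        if tag in ("delete", "replace"):
            changes.extend(("-", line) for line in old[i1:i2])
        if tag in ("insert", "replace"):
            changes.extend(("+", line) for line in new[j1:j2])
    return changes


# Two full snapshots go in; per-line changes come out.
snapshot_v1 = "a\nb\nc\n"
snapshot_v2 = "a\nB\nc\nd\n"
for sign, line in line_changes(snapshot_v1, snapshot_v2):
    print(sign, line)
```

Nothing line-shaped is stored anywhere in this sketch: both inputs are complete file contents, and the lines only appear as the result of the comparison.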
Yes and yes. While I agree with the general points, note that they didn't say "unit that git stores", but "unit git can report on". Git can totally report on lines as a unit of change.
Git diff absolutely does not understand function boundaries. Its diff algorithms are routinely confused by things like the addition of a single new function, deciding that the diff should begin with a "}" instead of a function definition.
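A small sketch of why that happens, in Python. This is not git's algorithm, and `insert_block` plus the toy C-like file are made up for illustration. It only shows that when a new function is added between two existing ones, several different line-level insertions produce exactly the same file, so a tool that sees only lines has no inherent reason to prefer the one that starts at the function definition.

```python
def insert_block(lines, index, block):
    """Return a copy of lines with block inserted before position index."""
    return lines[:index] + block + lines[index:]


# Toy file: two functions separated by a blank line.
old = [
    "int f1() {",
    "  return 1;",
    "}",
    "",
    "int f3() {",
    "  return 3;",
    "}",
]

# Same file after adding f2 between f1 and f3.
new = [
    "int f1() {",
    "  return 1;",
    "}",
    "",
    "int f2() {",
    "  return 2;",
    "}",
    "",
    "int f3() {",
    "  return 3;",
    "}",
]

# The hunk a human would write: the whole new function plus a blank line.
natural = insert_block(old, 4, ["int f2() {", "  return 2;", "}", ""])

# An equally valid hunk, shifted up two lines: it starts with a "}" and
# ends partway through the new function body.
shifted = insert_block(old, 2, ["}", "", "int f2() {", "  return 2;"])

print(natural == new, shifted == new)
```

Both insertions are correct descriptions of the change at the line level; picking the readable one takes extra heuristics or knowledge of the language.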
It varies a bit with language and tooling, but 10k lines is around the point where the size of your file by itself becomes a major impediment to finding anything and understanding what is important.
A 10k-line file is not something that will completely destroy your productivity, but it will have an impact, and you'd better watch out for it growing further, because completely destroying your productivity is not too far away. It is almost always good to organize your code when it reaches a size like this; the exceptions are contexts where you can't, never contexts where it's worthless.
I generally agree. My argument is that 10K lines written one way can certainly be more readable than 10 files x 1K lines written in a different way, so the real differentiator is the encapsulation and code style, not KLOC/file per se.
What muddies the waters here is languages like Java, where "10k lines" means "you've got a 10 KLOC class there", and ecosystems like PHP's, where nothing in the language requires it but people and teams will insist on one class per file, because hard rules are easier to understand and enforce than "let's let this area evolve as we increase our understanding".
As long as what's there is comprehensible, being able to evolve it over time is a very useful lever.