It's not something they teach in asymptotic-analysis lectures, right? Which reminds me of a professor who famously said that the constant everyone keeps overlooking can, in engineering terms, very much eat your head.
Unrelated, but I see this sentiment a lot. Algorithms (and related courses) are supposed to be largely mathematical/theory-based, so they sort of have to be hardware-agnostic.
Imo an algorithms course doesn't really need to teach you about lowering the constant factor (unless the optimizations for lowering said factor are reasonably abstracted away from the physical HW); that's the purpose of a Computer Architecture or Operating Systems course.
And realistically, in the real world you'd be benchmarking these things anyway if a supposedly "optimal" algorithm really is the bottleneck, and you'd apply context-specific optimizations to speed it up.
When I did tutoring for an algorithms class, we'd generally gesture at the fact that the big-O cost doesn't tell you everything. But I can't think of any way to handle that at a systemic level. Just plot n vs. t and observe that the big-O prediction comes true for large n, while it gets complicated for small n (something like the sketch below).
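A minimal sketch of that exercise, assuming Python with matplotlib; the insertion_sort baseline, the size range, and the repeat count are arbitrary choices for illustration, not anything from an actual course:

    # Rough n-vs-t experiment: time a quadratic sort against Python's built-in
    # sort and watch where the asymptotic prediction starts to dominate.
    # (Sizes and repeat counts are arbitrary; bump them if your machine is fast.)
    import random
    import time

    import matplotlib.pyplot as plt


    def insertion_sort(xs):
        """Textbook O(n^2) insertion sort on a copy of the input list."""
        xs = list(xs)
        for i in range(1, len(xs)):
            key = xs[i]
            j = i - 1
            while j >= 0 and xs[j] > key:
                xs[j + 1] = xs[j]
                j -= 1
            xs[j + 1] = key
        return xs


    def time_sort(sort_fn, n, repeats=3):
        """Return the best-of-`repeats` wall-clock time to sort n random ints."""
        best = float("inf")
        for _ in range(repeats):
            data = [random.randint(0, n) for _ in range(n)]
            start = time.perf_counter()
            sort_fn(data)
            best = min(best, time.perf_counter() - start)
        return best


    sizes = [2 ** k for k in range(4, 12)]  # 16 .. 2048
    for label, fn in [("insertion sort, O(n^2)", insertion_sort),
                      ("built-in sorted, O(n log n)", sorted)]:
        times = [time_sort(fn, n) for n in sizes]
        plt.plot(sizes, times, marker="o", label=label)

    plt.xscale("log")
    plt.yscale("log")
    plt.xlabel("n")
    plt.ylabel("time (s)")
    plt.legend()
    plt.show()

On a log-log plot the two curves show up as roughly straight lines with slopes ~2 and ~1, and the crossover at small n is exactly the part big-O doesn't predict.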
After all, nobody would have invented big-O if it wasn’t much easier than the alternative!
ACM papers on sorting algorithms tend to plot run time versus n to show that their novel optimization to sorting pulled ahead of more general algorithms at certain ranges of n. In the early 2000s I saw a handful of papers from different teams that came up with algorithms with much flatter time curves than mergesort for n > 10000.