Frankly, any web app I develop has configurable in-memory caching built into it, so I would rather increase its size than add an external cache. By keeping my cache internal to my application, it's also easier for me to invalidate keys accurately.
> If you have 100 instances you really want them to share the cache
I think that assumes decoupled compute and storage. If I instead couple compute and storage, I can shard the input so that each key is always handled by the same instance, and then there is no need to share the cache across instances. I don't think there is one approach that wins every time.
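A minimal sketch of the sharded setup, assuming string keys and a fixed number of instances. The names `LocalCache`, `shard_for`, and `handle` are hypothetical, and the in-process list of caches only stands in for separate instances behind a router that shards by key:

```python
import hashlib
from collections import OrderedDict


class LocalCache:
    """Bounded in-memory LRU cache owned by a single instance."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # the configurable size
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used

    def invalidate(self, key):
        self._data.pop(key, None)


def shard_for(key, num_instances):
    """Map a key to an instance index with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_instances


# Assumption: 4 instances, each with its own private cache.
NUM_INSTANCES = 4
caches = [LocalCache(max_entries=2) for _ in range(NUM_INSTANCES)]


def handle(key, load):
    """Serve a key from the cache of the instance that owns its shard."""
    cache = caches[shard_for(key, NUM_INSTANCES)]
    value = cache.get(key)
    if value is None:
        value = load(key)  # cache miss: fetch from the backing store
        cache.put(key, value)
    return value
```

Because a given key always lands on the same instance, only that instance ever caches it, and invalidating it is a local operation. One caveat of this sketch: plain modulo sharding remaps most keys when the instance count changes, so a real deployment would likely want something like consistent hashing.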
As for egress fees, that is an orthogonal concern.