It’s really straightforward: storage efficiency wasn’t a priority when we designed the offline system. We wanted something that was simple, reliable, and fast.
Why didn’t we optimize for this particular kind of storage efficiency? Because:
- The majority of scenarios don’t have much, if any, duplication.
- Where there is overlap, the storage wasted is almost always too small to be an issue.
There you have it.