
That's true. Parquet went through the weirdest changes between its various revisions and because it was used for Hadoop data lakes, there's a whole bunch of data that is being stored in legacy formats. Off the top of my head:

- different physical types to store timestamps: INT96 vs INT64

- different ways to interpret timestamps that predate tzdb records (apply the current rule vs the earliest tzdb record)

- different ways to handle proleptic Gregorian dates and timestamps

- different ways to handle time zones (Parquet only has the equivalents of LocalDateTime and Instant, with no OffsetDateTime or ZonedDateTime, and earlier versions of Hive 3 were terribly confused about which was which)

- the decimal type was written differently: as a fixed-length byte array in older versions, and as int32/int64/byte array/binary in newer ones

- the Hadoop ecosystem doesn't support decimals wider than 38 digits, but the file format itself does
