You seem to be fixated on a point whose relevance/importance is not clear to anyone else.
The A15 and earlier had the RAM mounted as PoP (package-on-package). The DRAM sits on a small organic substrate and is wire-bonded to that substrate. This scheme has the advantage that it is cheap (it has been done for years, and the machinery is all old and fully depreciated). But it's not optimal in terms of energy, because those bond wires add capacitance.
The A12X, M1 and later mount the DRAM directly on the package. No wire bonding, slightly better for energy, also slightly more expensive. Vertical vs horizontal placement is just a question of convenience and whether you care more about z-height or area; it is not fundamental.
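To put the capacitance point in back-of-envelope terms: the energy dissipated per signal transition goes roughly as 1/2 * C * V^2, so any extra capacitance on the DRAM lines shows up directly as extra energy per bit. Here is a minimal sketch of that scaling. The voltage and both capacitance values are placeholders I made up purely for illustration, not measurements of any Apple package.

```python
def switching_energy_pj(capacitance_pf, voltage_v):
    """Energy per signal transition in picojoules: E = 1/2 * C * V^2.

    With C in picofarads and V in volts, the result is in picojoules.
    """
    return 0.5 * capacitance_pf * voltage_v ** 2


# All three numbers below are hypothetical, chosen only to show the scaling.
IO_SWING_V = 0.5         # assumed I/O voltage swing
DIRECT_MOUNT_PF = 1.0    # assumed load per line, DRAM mounted without bond wires
WIRE_BOND_PF = 1.5       # assumed load per line with bond wires added

direct = switching_energy_pj(DIRECT_MOUNT_PF, IO_SWING_V)
bonded = switching_energy_pj(WIRE_BOND_PF, IO_SWING_V)

# At a fixed voltage the energy ratio is just the capacitance ratio.
ratio = bonded / direct
print(f"wire-bonded / direct energy per transition: {ratio:.2f}x")
```

The only thing worth taking from this is that, at a fixed voltage, energy per transition is linear in capacitance. The real values depend on package details that are not public.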
With the A16 we see the A12X/M1 style brought to the iPhone. Details remain unclear (the sites that used to tear apart Apple chips and publish photos have mostly stopped doing this for free...). The one public article I know of is
https://eetimes.itmedia.co.jp/ee/articles/2210/25/news048.html#utm_term=share_sp (Japanese)
which claims (and shows some terrible-quality photos which seem to confirm) that the A16 is no longer using traditional PoP DRAM. Rather it is using something like a vertical version of the A12X/M1 packaging. There's an epoxy-glass substrate with DRAM mounted on one side, the A16 SoC on the other side, and presumably vias going through the epoxy glass that connect the two. No more PoP wires, so slightly reduced energy per DRAM read/write transaction.
Point is, the choice of vertical vs horizontal is not "essential"; it's based on what's more convenient given the target product. Even the M1 Max and Pro are somewhat different from the M1. The metal stiffener is substantially more robust, and there are a lot more capacitors mounted on the package (for better control of rapid changes in current). The M1 Ultra is different again, with the DRAM rather further from the SoC than on the Max. (That slightly increases energy, but allows an inner stiffener ring to surround the pair of SoCs, with an outer stiffener ring then around the entire package.) Conceivably in future Apple may move all the DRAM for M class, and some of the DRAM for Pro/Max/Ultra class, under the package, like the A16, to reduce area. Ultimately the choice is likely to be primarily about cost: is side-by-side mounting slightly cheaper than vertical mounting, or slightly more expensive?