I believe that you are confusing PAE (Physical Address Extension) with PSE (Page Size Extension) and AWE (Address Windowing Extensions).
You are right in that remapping addresses to give the process more than 4 GiB of memory is expensive.
But that's not what we're talking about - nobody is giving a single process more than 4 GiB.
We're letting the system manage more than 4 GiB of total RAM, while each process still gets at most 4 GiB of address space.
This is not inefficient, since the operating system tracks physical pages by page number, not by byte address. With 4 KiB pages, a 32-bit page number can describe up to 16 TiB (2^32 pages of 4 KiB each), which is far more than 4 GiB.
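To put rough numbers on that, here is a minimal C sketch of the page-number arithmetic. The names `PAGE_SHIFT`, `addressable_bytes` and `frame_to_phys` are made up for illustration and are not taken from any real kernel. The 24-bit case corresponds to the 36-bit physical addresses of classic x86 PAE.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: 4 KiB pages means the low 12 bits are the in-page offset. */
#define PAGE_SHIFT 12u

/* Bytes of RAM that can be described when pages are named by a
 * page (frame) number of the given width. */
uint64_t addressable_bytes(unsigned frame_number_bits)
{
    assert(frame_number_bits + PAGE_SHIFT < 64); /* keep the shift inside 64 bits */
    return (uint64_t)1 << (frame_number_bits + PAGE_SHIFT);
}

/* Physical byte address for an offset inside a numbered page frame.
 * The result needs more than 32 bits once frame_number reaches 2^20. */
uint64_t frame_to_phys(uint32_t frame_number, uint32_t offset)
{
    assert(offset < (1u << PAGE_SHIFT)); /* the offset must stay inside one page */
    return ((uint64_t)frame_number << PAGE_SHIFT) | offset;
}
```

With 4 KiB pages, `addressable_bytes(20)` is 4 GiB, `addressable_bytes(24)` is 64 GiB, and `addressable_bytes(32)` is 16 TiB. The page tables only ever store the page number, so the wide byte address never has to fit in a 32-bit register.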
There is no appreciable performance loss in running large-memory Windows or Linux systems using PAE. The hit comes if you use PSE and/or AWE to force remapping in order to "cheat" and use more than 4 GiB in a single application.
The real issue here doesn't have much to do with the extension technology. There are two main programming models for 64-bit systems - ILP64 and LP64.
In the first, the C datatypes "int", "long" and "pointer" are all 64 bits. In the second, "ints" remain 32 bits while "longs" and "pointers" become 64 bits.
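If you want to see which model a given compiler uses, a small C sketch like the following would do it. The enum and function names here are hypothetical, chosen just for this example, and sizes are in bytes as reported by `sizeof` (this assumes 8-bit bytes).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the models discussed in this thread. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_ILP64, MODEL_OTHER };

/* Classify a data model from the byte sizes of int, long and pointer. */
enum data_model classify_data_model(size_t int_size, size_t long_size,
                                    size_t ptr_size)
{
    assert(int_size <= long_size); /* a long is never narrower than an int */

    if (int_size == 4 && long_size == 4 && ptr_size == 4)
        return MODEL_ILP32;        /* ordinary 32-bit system */
    if (int_size == 4 && long_size == 8 && ptr_size == 8)
        return MODEL_LP64;         /* int stays 32 bits */
    if (int_size == 8 && long_size == 8 && ptr_size == 8)
        return MODEL_ILP64;        /* everything is 64 bits */
    return MODEL_OTHER;            /* anything not covered here */
}

/* Model used by whatever compiler and target built this file. */
enum data_model current_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

On an LP64 system `current_data_model()` should come back as `MODEL_LP64`; a 32-bit build of the same file should give `MODEL_ILP32`.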
ILP64 has the advantage that casting an "int" to a "pointer" (or vice versa) doesn't lose any bits - but it has the disadvantage of using 64 bits for the default integer.
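The casting problem is easy to demonstrate. This is only a sketch with made-up function names: it contrasts squeezing a pointer-sized value through an int-sized variable with going through `intptr_t` from `<stdint.h>`, which is the type intended for holding a pointer as an integer.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a pointer-sized value survives being stored in an
 * int-sized variable. Under LP64 any value above 2^32 - 1 does not. */
int survives_int(uintptr_t value)
{
    unsigned int narrowed = (unsigned int)value; /* what an int-sized slot keeps */
    return (uintptr_t)narrowed == value;
}

/* Returns 1 if a pointer comes back unchanged after a round trip
 * through intptr_t, which is wide enough under either model. */
int roundtrip_via_intptr(const void *p)
{
    intptr_t as_integer = (intptr_t)p;
    assert(sizeof(as_integer) >= sizeof(p)); /* intptr_t must hold a pointer */
    return (const void *)as_integer == p;
}
```

Under ILP64 (and on 32-bit systems) `survives_int` returns 1 for every input, which is exactly the convenience described above. Under LP64 it returns 0 for any value with bits set above the low 32, so old code that stashes pointers in ints has to be fixed when it is ported.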
As long as the chip supports both 32-bit and 64-bit integers, it's an OS design choice whether to use ILP64 or LP64. Most Unix-like systems choose LP64.
"Tiger supports the industry standard LP64 programming model supported by other 64-bit Unix systems. This means developers can easily port 64-bit code to Tiger. LP64 support in Tiger provides for 64-bit pointer, long, and long long but preserves 32-bit integer data types."
http://www.apple.com/macosx/tiger/64bit.html
As for the "crazy math" comment, note that all of your machines already have 64-bit or larger filesystems (if you can have files bigger than 4 GiB, you need 64-bit file sizes and offsets).
The system has to use 64-bit integer arithmetic to work with files - but fortunately this does not fall into the "performance critical" arena. There's so much else going on with file access that a couple of extra "add" instructions aren't important.
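For a sense of what that arithmetic looks like, here is a minimal C sketch of computing a byte offset inside a large file. The function names and the record layout are hypothetical, not any particular filesystem's code.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a fixed-size record, computed in 64 bits so it can
 * point past the 4 GiB mark. Two 32-bit inputs cannot overflow 64 bits. */
uint64_t record_offset(uint32_t record_index, uint32_t record_size)
{
    assert(record_size != 0); /* a zero-sized record makes no sense here */
    return (uint64_t)record_index * (uint64_t)record_size;
}

/* The same calculation kept in 32 bits: the result wraps modulo 2^32,
 * so it is wrong for any offset of 4 GiB or more. */
uint32_t record_offset_32(uint32_t record_index, uint32_t record_size)
{
    return (uint32_t)((uint64_t)record_index * record_size);
}
```

For record 1048576 with 4096-byte records, `record_offset` gives 4294967296 (exactly 4 GiB), while `record_offset_32` wraps around to 0. On a 32-bit CPU the 64-bit version costs a few extra instructions, which is negligible next to the cost of the I/O itself.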