GreatDrok:
While I understand the point you're trying to advance, you're simply mistaken in your concept of what factors specifically contribute to improving code. First off, different processor architectures work differently.
Depends on just what you mean by programming. If you are simply writing a C program, there is little reason to concern yourself with the architecture beyond making sure you don't allocate a huge array and leap around all over it, rendering the cache useless. GCC handles all the platform optimisation in the case of Xcode anyway, so writing C that will compile and run cleanly on multiple architectures isn't an issue as such. However, as different architectures can react differently to mistakes, it is a good idea to compile and run on a number of platforms as a final sanity check.
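To make the array point concrete, here is a minimal C sketch (the matrix layout and function names are mine, purely for illustration): walking a big matrix in the order it sits in memory keeps the cache doing useful work, while striding down columns pulls in a fresh cache line for nearly every element.

#include <stddef.h>

/* Illustrative only: summing a large row-major matrix in memory order
 * versus striding down its columns. Both do identical arithmetic. */
double sum_row_major(const double *m, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += m[r * cols + c];   /* sequential access: cache friendly */
    return sum;
}

double sum_column_major(const double *m, size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += m[r * cols + c];   /* strided access: thrashes the cache */
    return sum;
}

On a matrix that doesn't fit in cache, the second version can easily end up several times slower, and that holds on pretty much every architecture you'll compile for.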
On the other hand, if you are hand-optimising in assembler for some weird-ass platform, then you are investing a great deal of effort in a particular platform. Really, you should only be doing this if you are looking for performance that wouldn't be achievable using standard programming languages. Back in the early '90s I was writing code on a 16,384-processor supercomputer, and I would sit with a sheet of the timings and latencies for the CPU instructions so I could reorder the calls to memory, since the CPU didn't support on-the-fly instruction reordering like modern CPUs do. The performance benefit of doing this rather than writing simple C was on the order of an 8-fold increase, simply because there was no way in C to express much of what I was doing, which was far more than simple instruction reordering, by the way.
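The real work was done in assembler against the published instruction timings, but the basic idea can be roughly sketched in C (this is entirely my own illustration, not the original code): start the load for the next element before computing on the current one, so an in-order CPU isn't left stalled waiting on memory.

#include <stddef.h>

/* Rough, C-level illustration of manual load scheduling on an in-order CPU.
 * Assumes n >= 1.  A compiler may reorder this anyway; in assembler you
 * control the schedule directly. */
double sum_of_squares_pipelined(const double *a, size_t n)
{
    double sum = 0.0;
    double cur = a[0];               /* prime the pipeline */
    for (size_t i = 0; i + 1 < n; i++) {
        double next = a[i + 1];      /* issue the next load early...        */
        sum += cur * cur;            /* ...while computing on the current value */
        cur = next;
    }
    sum += cur * cur;
    return sum;
}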
Secondly, different processor architectures (or even branches of different architectures) have different optimization technologies. Thirdly, if you go and look at each branch of any architecture, you will note that the chips themselves are all different iterations, each with their own unique set of anomalies, quirks, and chip-designer-made assumptions.
For standard C code, this is all the domain of the compiler writers. Developers should be aware of this stuff, but they should be writing portable code and include platform-specific optimisations only where there is a real benefit. For example, I worked on some code a few years back which used a chunk of SSE assembly to speed up string comparisons. It was developed on an Intel P3, and for some reason the performance on AMD was much worse than expected. It turned out that, due to a peculiarity of the Alpha EV6 bus that Athlons used, a section of the code was horribly inefficient on AMD, so I replaced it with something more suitable and the AMD implementation then went substantially quicker than the P3. This is one of the reasons I tend not to believe all these benchmarks that show AMD being so much slower than Intel for SSE stuff. As you say, they are very different architectures under the skin.
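As a hypothetical illustration of keeping the platform-specific bit isolated (this is not the actual string-comparison code, just a sketch assuming SSE2 intrinsics from <emmintrin.h>):

#include <stddef.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Compare two buffers 16 bytes at a time with SSE2 where available,
 * falling back to plain memcmp elsewhere. */
int buffers_equal(const unsigned char *a, const unsigned char *b, size_t n)
{
#if defined(__SSE2__)
    while (n >= 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)a);
        __m128i vb = _mm_loadu_si128((const __m128i *)b);
        __m128i eq = _mm_cmpeq_epi8(va, vb);
        if (_mm_movemask_epi8(eq) != 0xFFFF)   /* some byte differs */
            return 0;
        a += 16; b += 16; n -= 16;
    }
#endif
    return memcmp(a, b, n) == 0;               /* portable tail / fallback */
}

Because the vectorised path is confined to one small block, swapping in a variant that behaves better on a particular chip, as happened with the Athlon above, doesn't disturb the portable code around it.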
However, Apple uses their vector libraries to call SSE-type instructions, so it is up to them to implement each operation as efficiently as possible on each architecture, but once that is done, that is that. Developers shouldn't have to worry.
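A minimal sketch of what that looks like from the developer's side, assuming the Accelerate/vDSP interface (the wrapper function name is my own):

/* Compile with -framework Accelerate.  The same call sits on top of
 * AltiVec on PowerPC and SSE on Intel; the caller never needs to know. */
#include <Accelerate/Accelerate.h>

void add_arrays(const float *a, const float *b, float *out, vDSP_Length n)
{
    /* out[i] = a[i] + b[i], vectorised however the platform sees fit */
    vDSP_vadd(a, 1, b, 1, out, 1, n);
}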
Fortunately, they support fall-back architectures, which is why you see most software in the Intel world coded for some weird-ass mix of i386 + SSE2/3, or whatever. But to make the leap that learning how to code properly for a completely different architecture teaches you how to code better for your own, well...
Well what? You generally shouldn't need to know the architecture, and if you do, then focus on the bottlenecks and fix them. It depends on what you consider being a better programmer, I guess. Personally, I think being able to work on many platforms makes for better code. Others think that being able to churn out VB quickly makes them better programmers. It depends on the domain. In my field, cross-platform code is a must.
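For the "i386 + SSE2/3" style of fallback you mention, a rough sketch of how run-time dispatch tends to look, assuming GCC's CPU-detection builtins (the work_* functions are placeholders, not anything from a real codebase):

#include <stdio.h>

/* Ship a plain i386 code path plus SSE2/SSE3 variants and pick one at run time. */
void work_scalar(void) { puts("plain i386 path"); }
void work_sse2(void)   { puts("SSE2 path"); }
void work_sse3(void)   { puts("SSE3 path"); }

int main(void)
{
    __builtin_cpu_init();                      /* populate the feature flags */
    if (__builtin_cpu_supports("sse3"))
        work_sse3();
    else if (__builtin_cpu_supports("sse2"))
        work_sse2();
    else
        work_scalar();
    return 0;
}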
I mean, if you studied German (to use a completely different kind of example), how does that teach you to speak better English? It doesn't.
Actually, multilingual people tend to have much better language skills (i.e. they are better able to express themselves) than monolinguals.
Supporting multiple architectures (which in any responsible sense means giving each of them somewhat equal priority and status) means having to cater to lower and lower common denominators across all of them, until eventually the disparity between the top and bottom of the scale is so vast that you can only cater to the "lowest common denominator". Besides that, when you want to develop new software to take advantage of new technologies, why would you deliberately muff it by forcing it to run on old, by-definition not-up-to-date hardware? That just doesn't make any sense.
Do you remember project Marklar? Apple compiled all versions of OS X on Intel because it was a fallback position. OS X is very portable and it wouldn't make sense to drop that portability like MS did with NT. At some point in the future, they may decide to move from Intel CPUs to some other new killer platform. Never say never.