It'll be interesting to see whether the extra feature set is actually meaningful to users rather than a marketing checkbox. In the tablet/mobile space, DirectX is only available on Windows Phone and Windows RT, neither of which has great market penetration. Neither Android nor iOS supports full OpenGL, only OpenGL ES. It makes for great demos, but unless something big changes, those features won't be accessible to most users or developers.
The desktop-GL/GLES split is mainly a historical curiosity. If game developers can port PS3 game engines to iOS or Android, full OpenGL support will come with them. DirectX is going to stay Windows-only, of course, although full API compatibility could make devices like the Surface more appealing.
The API compatibility is just a side effect of using the same core. There are other tangible improvements, such as to physics and lighting simulation.
They also compete in spaces that Apple has no interest in. For example, their line of dedicated ray-tracing hardware has only professional applications at this point. They also just acquired MIPS, which is found in set-top boxes and other closed-box hardware that Apple isn't in.
Absolutely, they aren't just an Apple subsidiary. I don't expect them to go bankrupt, but I don't think they can compete with Nvidia given the latter's resources and experience developing GPUs for gaming (including industry relationships).
The Mali design has appeared in a few Samsung and Chinese SoCs, but it has no consistent history of big design wins. Since adopting it, Samsung has also used ImgTec IP, or simply shipped a Qualcomm SoC (with its Adreno graphics) in some major territories such as the US.
ARM purchased Mali, then totally redesigned the hardware for the T6xx series. I know that they're very committed to taking it forward, and they have some IP in development that Apple have expressed tentative interest in (NOT the T6xx cores themselves; some tangential stuff).
There are a lot of reasons (especially around licensing) why it would make sense for Apple to use an ARM-developed GPU core in the future. However, Apple care more about making great products than saving licensing costs; they would only switch if ARM could develop competitive GPU IP.
Samsung going with the Adreno 330 in the Note 3 is likely down to the Mali T6xx being a brand-new architecture; it will take a few generations to really show what it can do. The Exynos variant also includes Samsung's octa-core big.LITTLE CPU (four A15 and four A7 cores, IIRC). I would guess that they wanted to try the new design out on the lower-volume model, and there are likely yield/production reasons for doing that, too. Don't think ARM's IP can't compete with Qualcomm's.
Since Nvidia doesn't license out their GPU IP in practice (they do offer licenses, but have no design wins), they compete at the whole-SoC level. As an overall SoC package, they have only a mixed history of success. They're particularly weak on the modem/radio side, which is why Qualcomm gets so many wins.
The actual GPU used in an SoC has pretty low visibility with most smartphone buyers because they simply don't care. The graphics baseline keeps moving forward, their phone can play the latest game they want, and that's all most people care about.
Furthermore, the K1 will show up in the latter half of this year with a 32-bit ARMv7 design; by then, Samsung and Qualcomm will have moved on to 64-bit. Their overall SoC offerings still aren't that compelling when treated as a complete package.
32-bit/64-bit doesn't matter for gaming performance. Firstly, high-end games are going to be written in C, so advantages to the Obj-C runtime or Android's Java VM won't matter. Secondly, even the extra registers aren't going to make a difference against a Kepler GPU that handles your physics and lighting.
Nvidia's historical SoC success isn't that important IMO - they're clearly doubling down on mobile, so I expect it to improve significantly. As someone else mentioned, they announced about six months ago that they would license their GPU IP, and then at CES they announced that, rather than maintaining a parallel Tegra GPU core, they will use the same GPU core as their desktop products.
As for application performance, I don't expect 64-bit to have the same impact on Android as it does on iOS. iOS uses the spare bits in every pointer to store information that was previously kept in a side table. The performance improvements in the Obj-C runtime come from not having to look up this data in a table every time, because it's encoded right into the pointer itself. I'm not sure Android's garbage-collected Java VM will be able to make such effective use of it, or that it will solve their performance issues; certainly not right away.