Sources:
- Vijay Anand's HardwareZone article "Intel's CPU Roadmap: To Nehalem and Beyond" from March 2008, which covers a lot of interesting areas.
- Cnet's "CPU Roadmap: 2008 & beyond" article (which also covers AMD)
How Intel's plans for platforms & processors in the near term are guided by the multi-year plans extending past 2009, with Sandy Bridge & Larrabee.
In the 2nd post, "future versions" are just touched on -
- Aliceton
- Dunnington (the last of the Penryn generation, a single-die 6-core CPU),
- Gainestown - Based on Nehalem microarchitecture
- Beckton - 8 or more core Nehalem based CPU
It was also mentioned that Mac Pros are more likely to take on Gainestown CPUs in a dual-socket format than a 4-socket Beckton version, which would need FB-DIMM2 memory and not overclock well in comparison.
Dunnington
- last of the Penryn generation
- 6-core processor 45nm
- Works on the Caneland multi-processor (MP) platform (Intel 7300 chipset)
- Thus a big upgrade from the Tigerton (Xeon 7300 series) processors, which also work on the Caneland MP platform (i.e. pin compatible - drop 'em in)
- 4 can be used in a quad socket board for server level computing.
- Ready 2H 2008
Swapping out Tigerton processors with Dunnington would thus give a move potentially from 16 processing cores to 24.
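The 16-to-24 jump is just sockets times cores. A minimal sketch, assuming a fully populated quad-socket Caneland board and quad-core Tigerton parts (the quad-core figure is inferred from the 16-core total above; `total_cores` is a made-up helper name):

```python
def total_cores(sockets: int, cores_per_cpu: int) -> int:
    """Total processing cores on a board with identical CPUs in every socket."""
    return sockets * cores_per_cpu

# Assumption: 4 sockets populated, Tigerton = 4 cores, Dunnington = 6 cores.
tigerton = total_cores(sockets=4, cores_per_cpu=4)
dunnington = total_cores(sockets=4, cores_per_cpu=6)

print(f"Tigerton: {tigerton} cores, Dunnington: {dunnington} cores")
```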
As said, Dunnington seems to be a stopgap until a Nehalem-based MP product comes out, for those wanting to buy new, rather than upgrade Caneland platform based servers. Not really most MacRumors users then!
With full scale availability early 2009, Nehalem deserves a decent look.
Nehalem
QPI - QuickPath Interconnect
- Intel's version of AMD's HyperTransport
- Gives high speed inter-component, inter-processor communication.
- If you've got multiple CPUs, QPI will connect between them.
- Currently it's 1x QPI link per CPU socket, but this may well change.
- Each link can deliver a total bandwidth of ~25GB/s.
- Can have hot-plug capability, e.g. a processor card. Might not appear at the start.
Integrated DDR3 Memory Controller:
- Improves memory bus bandwidth (via the tri-channel controller)
- Improves the memory bandwidth handling capacity
- Supports registered & unregistered memory DIMMs
- Supports current DDR3 800/1066/1333 Standards. No doubt it'll have to scale, as DDR3 is getting overclocked to 1800 already.
Tri-channel means 3 memory channels per processor, with each channel supporting up to 3 DIMMs, so
1 CPU = 3-9 memory slots.
And here's the rub: Depending on the board used, it'll have 3, 6 or 9 slots per CPU. A dual socket server board could have up to 18 DIMMs.
A quad socket server board could have up to 36 DIMMs.
4GB a DIMM. $2,399.00 per 8GB. Want to max it? 36 x 4GB is 144GB, i.e. 18 of those 8GB kits, so >$43,000 if you don't have a discount, and use Apple RAM (which isn't competitive)...
Even if you went down the dual socket mainstream DDR3 route, you could get 72GB, which isn't anything to sniff at... I'd sure love to see a memory specialist pimp a machine out with that much.
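The slot and capacity numbers fall straight out of the tri-channel layout. A rough sketch using only the figures quoted above (3 channels per CPU, up to 3 DIMMs per channel, 4GB DIMMs); the function names are made up for illustration:

```python
# Figures from the post: 3 memory channels per CPU, up to 3 DIMMs per channel.
CHANNELS_PER_CPU = 3
MAX_DIMMS_PER_CHANNEL = 3

def max_dimm_slots(sockets: int) -> int:
    """Upper bound on DIMM slots for a board with the given socket count."""
    return sockets * CHANNELS_PER_CPU * MAX_DIMMS_PER_CHANNEL

def max_capacity_gb(sockets: int, dimm_gb: int = 4) -> int:
    """Maximum memory if every slot holds a DIMM of dimm_gb gigabytes."""
    return max_dimm_slots(sockets) * dimm_gb

for sockets in (1, 2, 4):
    print(f"{sockets} socket(s): {max_dimm_slots(sockets)} slots, "
          f"{max_capacity_gb(sockets)}GB with 4GB DIMMs")
```

Actual boards may well ship with fewer slots per channel, so treat these as ceilings.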
Integration of a graphics core into Nehalem could occur, as hinted at on page 4 of the HardwareZone article. You'd imagine more will be heard from Intel at IDF. Nehalem multi-core chips prior to Snow Leopard, and then ones with integrated graphics core(s?) at some point after? We'll see.
The architecture has lots more
- Increased Parallelism
- Better Algorithms
- Enhanced Branch prediction
- Simultaneous Multithreading (SMT): SMT doubles the potential number of threads that can run simultaneously on each core. Intel reports SMT can deliver 20-30% more performance, depending on the app, at just a slight increase in power consumption. So the more threaded the workload or application, the better the gains.
- Intel SSE 4.2
- Improved Virtualization Performance
Beyond Nehalem? Westmere is the die-shrink to 32nm, and Sandy Bridge is the 32nm change of microarchitecture, bringing new extensions to the instruction set: Advanced Vector Extensions (AVX), 256-bit vectors that will increase peak floating-point performance (FLOPs) by up to 2x.
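Where does "up to 2x" come from? A back-of-envelope sketch, assuming the comparison is against today's 128-bit SSE registers (that baseline is my assumption, not something the roadmap spells out) and that clock speed and instructions issued per cycle stay the same:

```python
def lanes(vector_bits: int, element_bits: int) -> int:
    """How many elements of a given width fit in one vector register."""
    return vector_bits // element_bits

# Assumed baseline: 128-bit SSE. AVX width (256-bit) is from the roadmap.
sse_floats = lanes(128, 32)   # single-precision floats per SSE register
avx_floats = lanes(256, 32)   # single-precision floats per AVX register

# Peak throughput scales with lane count if everything else is held equal.
peak_ratio = avx_floats / sse_floats
print(f"SSE: {sse_floats} floats/op, AVX: {avx_floats} floats/op, "
      f"peak ratio: {peak_ratio}x")
```

Real code only gets near that peak if it's vectorised and not waiting on memory, hence the "up to".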
AMD
Put down $4.2 billion and some stock to acquire ATI Technologies. Like Intel, AMD's got the visual computing buzzword too
AMD's CE: "Visual computing is playing a larger role in what we are doing, going forward."
In the pipeline: Socket AM3 desktop chipset, PUMA mobile platform, 45nm Opteron server/workstation, and the Shrike mobile platform (a unified CPU, chipset & GPU, to create one APU - Accelerated Processing Unit).
However, Intel has in that same timeframe 45nm Nehalem with QPI, then 32nm Westmere coming out, & Intel's mobile plans inc. Nehalem C2D - Auburndale, Nehalem C2Q - Clarksfield, both of which incorporate an on-die GPU. Maybe Calpella, the successor to the newly introduced Centrino 2 (aka Montevina, which isn't a world mover), will actually be a bit more of a crowd pleaser, and QPI user.
AMD has 45nm plans, desktop chips to go dual, quad, octo core, with Bulldozer apparently up to hexadeca (16) core for 2010 potentially. But as Cnet points out, by 2010, a year on from Snow Leopard, dual/quad core chips will be ubiquitous, and thus Mac Pros will likely be 8 core and above, with 16 or more threads on Intel, and Intel will have bragging rights to the first native octocore. And an octocore Mac is like a unicorn burger. Mighty tasty, but a long time in coming.
2005 - Dual-core Pentium Extreme Edition 840, 90nm, 3.2GHz? Yours for over $1000
2008 - C2D E8500, the fastest of the Wolfdales, 3.16GHz, yours for ~£400
2005? "Unfortunately, not all applications are multithreaded, and many won't be for months or even years into the future."
Larrabee: Visual Computing
Visual computing means computation of visual information - rendering, HD audio/video processing, physics model processing.
They plan to do this by:
utilizing a programmable and readily available architecture such as several simpler Intel Architecture (IA) cores. Intel plans to add a vector computational unit to each of the cores as well as introduce a vector handling instruction set. They believe their leadership in the total computing architecture of the various platforms and a vast software engineering department will help them achieve their goal of creating Larrabee.
Intel can then scale it up as required for different market areas. Expected in the 2010 timeframe (just in time to match AMD's Fusion), so within about a year from Snow Leopard. Is it a discrete GPU? Is it more along a graphics card? Could it use QPI and drop into a socket on new Nehalem boards?
Who's to say Larrabee couldn't complement a rival graphics card, and be a slave object?
It'll support Direct3D, OpenGL, and I'd be surprised if it wasn't happy using OpenCL either.
IDF 2007 gave some information from Intel about a system-on-a-chip (SoC) design. Since it's been quiet since, I'd imagine there's something in the skunkworks. The fetchingly titled EP80579 Integrated Processor family just sounds soooo.... intriguingly boring.
It's a CPU core (Pentium M - w00t! the predecessor to the predecessor of Core 2, but in all fairness, what my Dell runs on), memory controller, integrated GPU, input controller (ethernet, USB etc) and other various gubbins depending on the chip flavour.
Kinda cool you can fit all that into a chip. TDP from 11W up to 21W, so made for MIDs primarily.
Update for Larrabee
It looks like a GPU and acts like a GPU but actually what it's doing is introducing a large number of x86 cores into your PC
As kinda thought of
SIGGRAPH will have some interesting information, and Intel has opened up some more info, ahead of presenting a paper called "A Many-Core x86 Architecture for Visual Computing."
The front-page MacRumors article comes from an extremetech.com article, as they got a preview.
It's a stand-alone chip, based on the universal Intel x86 architecture. It's aimed at the PC market - i.e. Gaming, and being a competitor to Nvidia and AMD-ATI's discrete (stand-alone) GPU products.
Extremetech points:
- Intel is saying/hinting that "the first Larrabee-based products will be graphics cards targeted at the performance or high-end graphics segment. Those cards will also be able to perform other stream computing tasks."
- Larrabee won't be showing up integrated onto motherboards or aimed at the high performance computing mainframe market right out of the gate.
- No actual figures of performance
- No feature sets for products
- No word on core numbers for products (Core count shown on slides from the briefing went from 8 to 48)
- Extremetech has the release date as late 2009/2010
Larrabee is aiming to be a many-core CPU that's also a programmable GPU chip, with those many cores essentially based on the Pentium architecture, with multi-threading & 64-bit instructions thrown in.
The key takeaway here is that almost the entire graphics pipeline is being rendered in software, albeit software running on specialized, high performance x86 CPUs with specialized vector units, not on the host processor in a PC.
What else does it do? Support for "full context switching and preemptive multitasking, virtual memory and page swapping, and full cache coherency. These are features developers have come to expect in modern x86 CPUs, but don't yet exist in modern GPUs."
The arrangement of the processing cores on the chip means
performing almost all parts of the graphics pipeline in the same bank of general purpose processors allows for perfectly efficient load balancing: You're never "wasting silicon" in Gears of War if you have a bunch of render back-ends that sit largely idle while they would improve performance greatly in F.E.A.R.
Intel says all this means getting near to a linear scaling of power. More cores that can multi-task means more power to do that range of functions, you don't get that much of a tail off. Intel isn't giving actual figures yet, just relative ones, but check out the speed increase, as more cores are used.
Memory bandwidth is an issue, so Intel uses "binned rendering" (aka tile rendering - splitting down a frame into chunks - tiles - and then efficiently sorting these out, then rendering them). Their techniques could help reduce the memory bandwidth per frame by >2x. The tile size is altered so that one processor in Larrabee can process one tile. The more the cores, the smaller the tile &/or the faster the frame rate I'd imagine.
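To make the binning idea a bit more concrete, here's a generic sketch of the "sort into tiles" step. This is not Intel's actual algorithm (no details of that are given here); it just assigns each triangle to every screen tile its bounding box touches, so each tile's list could then be handed to one core. The function name, tile size and sample triangles are all made up for illustration.

```python
from collections import defaultdict

def bin_triangles(triangles, width, height, tile_size):
    """Map each tile (tx, ty) to the indices of triangles whose
    screen-space bounding box overlaps that tile."""
    bins = defaultdict(list)
    max_tx = (width - 1) // tile_size
    max_ty = (height - 1) // tile_size
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Convert the bounding box to tile coordinates, clamped to the screen.
        tx0 = max(0, int(min(xs)) // tile_size)
        tx1 = min(max_tx, int(max(xs)) // tile_size)
        ty0 = max(0, int(min(ys)) // tile_size)
        ty1 = min(max_ty, int(max(ys)) // tile_size)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return bins

# Hypothetical 128x128 frame split into 64x64 tiles (a 2x2 grid).
triangles = [
    ((10, 10), (50, 10), (10, 50)),   # sits entirely inside tile (0, 0)
    ((60, 60), (70, 60), (60, 70)),   # straddles the corner of all four tiles
]
bins = bin_triangles(triangles, width=128, height=128, tile_size=64)
for tile in sorted(bins):
    print(tile, bins[tile])
```

The bounding-box test is conservative (a triangle can land in a tile it doesn't actually cover), but the point stands: once the work is sorted per tile, each tile's pixels can stay in a core's local cache instead of bouncing out to memory.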
We also don't know what Nvidia and ATI/AMD will have in that time frame. That's well into the next major architectural overhaul, and it's certainly possible that those companies are working on many of the same "neat on paper" ideas Intel has been with Larrabee. Certainly, a number of them make a lot of sense when you consider the increasing generalization and programmability of GPUs. So the story on Larrabee is hardly beginning, and where it fits into the competitive landscape is still entirely unclear.
Did Apple get a heads up about Intel's plans, prior to starting work on Snow Leopard?
And how small will these go? Seeing as the iPhone is only 480x320, it's a fraction of a desktop screen. Could they just make a smaller one, or a low TDP version? It'd be interesting to get the stats vs the current successor chip to the iPhone.
SIGGRAPH
Some other interesting things:
"Advances in Real-Time Rendering in 3D Graphics and Games"
"EDT-IPT 2008 Emerging Display Technologies and Immersive Projection Technologies"
Zcam being back
Mocap for the masses with iPi Soft - Desktop Motion Capture (aka Shoot 3D) using a digital camera/web cam to do home mocap
Intel's paper of "Why 3D Application Development is Driving Graphics-Industry Convergence" is on August 12th.
image-metrics.com's photo-real animation
RapidMind, Inc
AMD has a few bits and bobs, including "A Unified Programming Model for Multi-Core CPUs and Many-Core Accelerators by AMD",
GPU-Accelerated Video Encoding: State of the Art
Thursday, 14 August, 1 - 2:30 pm
Hall G, Room 1
NVIDIA has several presentations, including "CUDA: The Democratization of Parallel Computing"
Interesting haptic and tactile progress also
Butterfly Haptics' Maglev is showing up too, a personal fave.
Airborne Ultrasound Tactile Display - a kind of theremin (3D force fields)
Stop motion goggles, a flat sheet communication device.