But that is why these newer video cards have a molex connector on them: the PCI, AGP, or PCIe bus can't supply the proper amount of power to the card. If you look at a GeForce2 card, it runs entirely off bus power; look at a GeForce 6800 and you are going to see one or two molex connectors on the card so it will function. Without them it would not run, because the bus can't supply that much power to the card.
If you overtax the bus (PCI, AGP, or PCIe) you will run into a lot of issues; there is a limit to how much power a given bus can supply to the card in the slot (or combined slots). That is why they put a molex connector on the card: it can pull power directly from the PSU and avoid overtaxing the bus. There is only so much power you can draw from a PCI, AGP, or PCIe slot.
That still has nothing to do with the bus itself. Power is brought in through a connection to the 12V, 5V, or 3.3V rail in the power supply, depending on the standard (e.g. PCIe, MXM, Mini PCIe, and ExpressCard use the same bus, but their power delivery capabilities are different). This just has to do with how the standards are written. PCIe 2.0, for example, doubles the maximum wattage provided, but that change did nothing to the bus itself (architectural changes have all been for the purpose of increasing speed, like 33MHz PCI -> 66MHz PCI). All that entails is increasing the width of the traces going to the power pins. They did the same with AGP Pro. The bus itself is agnostic to power input.
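To put rough numbers on it (a back-of-the-envelope sketch; the per-rail current limits below are the commonly cited figures, so treat them as assumptions rather than quotes from the specs):

[code]
# Rough slot power budgets: watts = volts x maximum allowed amps per rail.
# The amperage limits here are the commonly cited figures, not authoritative.
def slot_power(rails):
    # rails: list of (volts, max_amps) pairs for the slot's power pins
    return sum(v * a for v, a in rails)

pci      = slot_power([(5.0, 5.0)])               # ~25 W for a standard PCI slot
pcie_x16 = slot_power([(3.3, 3.0), (12.0, 5.5)])  # ~75.9 W for a PCIe x16 slot

aux_feed = 75.0  # each auxiliary connector (6-pin PCIe, for example) is good for roughly this much

print(pci, pcie_x16)            # ~25 and ~75.9
print(pcie_x16 + 2 * aux_feed)  # ~226 W available to a card with two extra plugs
[/code]

Either way, the extra wattage comes straight from the PSU rails; the data lanes of the bus never see it.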
OK - the interfaces, names, etc. have changed dramatically, but cards in slots have been around for a long time. Some things are a natural fit for them, but high-power circuits that are included in every (high-end) computer are not. Slots should be used for things that most computers DON'T need, like fancy RAID controllers, four-port Ethernet cards, etc.
You can tell when a technology is on its last legs by looking at how complicated it is. These new cards get power from two places, have massive cooling problems, are expensive, etc.
PCI isn't 20 years old; it came out in the early-to-mid 90s. The type of slot it uses is similar to VLB and MCA, yes, but not identical, and that has nothing to do with PCI or how much power it can deliver. Also, it really isn't that complicated to plug a molex connector into a card. By that logic, socketed motherboards are on their last legs since they need not only an extended ATX connector but an additional 12V line just to power the CPU.
The massive cooling problems and need for all of that power would still exist if you socketed the GPUs. Most of the power is consumed by the cores.
Also, if you want to talk about complicated, the latest PMU designs for Intel CPUs are 4-8 phase, completely digital designs. Graphics cards still use simple MOSFETs and power transistors. Power regulation itself is not very difficult; the hard part is simply keeping tolerances in check, and that's universal as IC speeds increase. The latest ATX spec will tighten the restrictions on voltage ripple for that reason. There is simply no way to avoid increasingly complex power circuitry no matter how you package it.
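For a sense of why multi-phase regulation is used at all (a minimal sketch with illustrative numbers, not figures from any particular board):

[code]
# Interleaved phases split the load current and multiply the effective
# ripple frequency, which makes the output easier to filter.
def per_phase_current(total_amps, phases):
    return total_amps / phases          # assumes the phases share the load evenly

def effective_ripple_khz(switch_khz, phases):
    return switch_khz * phases

print(per_phase_current(100, 4))        # 25 A per phase for a 100 A CPU load
print(effective_ripple_khz(300, 4))     # 1200 kHz effective ripple frequency
[/code]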
The bus speed of a card is a statement about how fast the graphics card can copy memory from system memory to the graphics card. If there is no transfer, then the speed is effectively infinite, and 'infinite x' IS really the only way to go forward with design. Here is the present path for graphics: disk -> system memory -> PCI bus -> card memory -> PCI bus -> back to system memory when the card is done with the texture -> back to card memory. With a true GPU on the main bus, you have disk -> system memory. That's it.
Memory sharing is a GOOD thing. Just buy memory once and put it all in one place so that anything can use it. A 2GB graphics card is really expensive right now, but with a motherboard GPU, the graphics chip can use 2GB of RAM if it needs it, though usually it will not. Heck, on a 16GB machine, a 3D walkthrough could easily have a 10GB set of textures. With a 500MB graphics card on a bus, the textures have to load and unload from the card many hundreds of times, eating memory bandwidth, etc. It just does not make sense. GPUs have access to the main memory on the computer, but it still chokes up the memory controller, plus you have two copies of textures sitting around.
It would be nice if it could work that way. The problem with your "infinite x" idea is that without dedicated memory on the card, every access the GPU makes has to go out to system memory. The entire point of having on-board memory is to avoid this; it's the same reason they put caches on CPUs. You're not saving anything. You also don't need two copies of the texture, because it's cached in the memory of the card.
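Same arithmetic as any cache (the latency figures here are purely illustrative, chosen only to show the shape of the trade-off):

[code]
# Average access time with and without fast local memory in front of the bus.
def avg_access_ns(local_hit_rate, local_ns, over_bus_ns):
    return local_hit_rate * local_ns + (1 - local_hit_rate) * over_bus_ns

# Texture resident in on-card memory most of the time:
print(avg_access_ns(0.95, 50, 500))   # ~72.5 ns on average
# No on-card memory -- every access pays the trip across the bus:
print(avg_access_ns(0.0, 50, 500))    # 500 ns, every single time
[/code]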
The card's memory is massively faster than the connection between the FSB and main memory. On average, a P35 (the current fastest chipset, as the X38 lags behind by a few percent) can gross ~7GB/s of bandwidth with quality memory on a 1333MHz FSB at maximum. The 8800 Ultra is rated at 103GB/s of bandwidth thanks to a faster, wider bus. The maximum throughput of a 1600MHz FSB is 12.8GB/s. Mind you, the FSB is considered a bottleneck, which is why Intel is moving to an integrated memory controller soon.
In addition, when you cut the width of the memory bus (generally halving the bandwidth), performance is simply destroyed (not as critical with the G92, since it's still on a fast 256-bit bus, but there is a noticeable difference in some games). This has been noted since the early Radeon days, when companies would save money by marketing 64- and 128-bit versions of 128- and 256-bit (respectively) graphics cards.
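To make those numbers concrete (a rough sketch; peak bandwidth is just bus width times effective transfer rate, and the clocks below are the published figures for those parts, so double-check them before leaning on them):

[code]
# Peak bandwidth in GB/s = (bus width in bytes) x (effective MT/s) / 1000
def bandwidth_gbs(bus_width_bits, effective_mts):
    return bus_width_bits / 8 * effective_mts / 1000.0

print(bandwidth_gbs(64, 1600))    # 12.8 GB/s   -- 64-bit, 1600MHz FSB
print(bandwidth_gbs(384, 2160))   # ~103.7 GB/s -- 8800 Ultra, 384-bit GDDR3

# Same memory clock, half the bus width -> half the bandwidth,
# which is why the cut-down 64/128-bit versions suffered so badly:
print(bandwidth_gbs(256, 2000))   # 64 GB/s
print(bandwidth_gbs(128, 2000))   # 32 GB/s
[/code]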
Also, why would you further bottleneck the CPU by using the system's memory? That is half the reason why Intel integrated graphics have always been lamented.
Also, you're assuming that you get a huge increase in performance from massive amounts of RAM. While it's true that textures are getting larger, you're not seeing a 1:1 relationship between memory size and performance. Unfortunately, the improvement is never that good.
http://www.pcstats.com/articleview.cfm?articleid=2159&page=1
Pretty much sums it up.
In fact, until you get to an utterly expansive game like Oblivion and turn on every graphics option, HDR, and full AA & AF at 1600x1200 and above, 256MB video cards are still a viable option.
http://www.firingsquad.com/hardware/xfx_geforce_8800_gt_256mb_xxx_review/page9.asp
Also the memory controller still comes into play no matter how you send data to the memory, PCIe bus or otherwise. There is no second interface to the memory that a socketed GPU could go through, not to mention that you simply would not want to do this.
[quote]As an example of this, look at OS X. Each window is put onto the screen by the GPU, which allows all sorts of nice effects like shadows to happen with 'no' CPU load. But - each of the many windows on your machine has to pass through every system on the computer for each change that happens to it. For each keystroke you make, the window has to be drawn by the CPU and then reloaded across the PCI bus to the GPU as a new texture. So even though the CPU load is low, there is lots of action on the memory - with at least two copies going on, etc. With the right driver and system software, all of this evaporates - the GPU can just show the window.
[/quote]
You're confusing the two display modes OS X implements. Under Quartz Extreme, the GPU handles the compositing of the UI; this is distinct from standard Quartz, where the CPU tells the GPU what to draw. That's why Quartz Extreme systems feel zippier than systems with it disabled. Also, the windows are always processed, not just when things are changing. Once again, there will not be two copies under Quartz Extreme; data is sent directly to the video card. And yes, you're right, with the right software this does all disappear - it's called Quartz Extreme.
If the bus on the computer is too slow, then widen it and speed it up. That way the whole machine benefits from having blazing fast memory. The money spent on making that fast bus on the graphics card can be used on the system bus.
Easier said than done, and not quite. If Intel could just snap their fingers and give us a 512-bit bus, I'm sure they would, but the cost and complexity of the components would be ridiculous. Current video card designs treat the memory chips as several memory buses and patch them together a la dual-channel memory, which is very expensive in terms of transistor count. Also, memory bandwidth matters most to GPUs because they are constantly streaming data from their own fast memory. CPUs, on the other hand, spend a considerable amount of time running logic and doing other tasks that don't necessarily require large amounts of bandwidth. How much bandwidth can MS Word possibly require? Applications that do require massive amounts of bandwidth are going the exact opposite way of what you're suggesting and are being moved onto graphics cards. GPGPUs are being thought of as the next wave in technology because they offer large amounts of floating-point power. Sharing a memory bus with a CPU would simply cripple a GPU.
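As a rough sketch of what "patching channels together" means (the channel counts below match how current parts are commonly described - e.g. a 384-bit card as six 64-bit controllers - but treat them as illustrative):

[code]
# Aggregate bandwidth from ganging narrow memory channels together.
def aggregate_gbs(channels, channel_width_bits, effective_mts):
    return channels * channel_width_bits / 8 * effective_mts / 1000.0

print(aggregate_gbs(2, 64, 800))    # 12.8 GB/s   -- dual-channel DDR2-800 system memory
print(aggregate_gbs(6, 64, 2160))   # ~103.7 GB/s -- a 384-bit card, roughly 6 x 64-bit channels
print(aggregate_gbs(8, 64, 2160))   # ~138 GB/s   -- a hypothetical 512-bit bus: eight controllers'
                                    #                worth of pins, traces, and transistors
[/code]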
When you widen the bus to an extreme degree and make it very fast, you can share memory, but the poor performance of the PS3 (which uses such an architecture) in some cases has been documented. Granted, that's not the entire issue (the complexity of programming for 8 SPEs is another), but the "technically inferior" Xbox 360 (which shares the same basic instruction set as the Cell's PPE), as well as PCs of comparable spec that use a traditional architecture, have no such problems.
Yes, they will have to redesign chips to make this work. The 'redesign' is only on the mundane I/O parts of the design. The benefits are numerous: power problems go away, the computer maker knows where the heat is coming from, and there are no cards to rattle around.
I assume the last part is just a dig; your system may be missing some screws if you're being literal here. The mundane I/O parts you're talking about happen to be the FSB, not something that's easy to design. Also, the power problems won't go away just by moving a very hot chip from a card into a socket. That's like moving a 7L V8 engine from the front of the car to the back: you still need to cool it just the same. You also still need power circuitry. Nothing goes away except the dedicated memory, which doesn't get all that hot to begin with. Actually, if you decide to increase your system memory so more can be allocated to the GPU, you may end up with more memory chips and more heat.
Want more memory? Add more, and the GPU and CPU both get it. GPUs have impressive performance on some non-graphics problems, like simulations (e.g. weather), but now, with the GPU tucked away on some exceedingly slow bus, you kind of have to do your whole job either on the card or on the CPU. With everything handy, you can program using both. The list goes on. Plus it's cheaper!
You're right, one advantage would be more memory for the GPU, but the bottlenecks, as explained above, simply make it not worth it. Also (as mentioned before), in this situation the memory controller and FSB would have to deal with increased overhead from having another device to feed directly.
A radical redesign would not JUST be for Apple, but rather eventually for everyone. It is just that Apple can ship computers with a new technology more easily than the Windows world can. They have done that before more than a few times.
Something big in terms of performance would definitely never be an Apple exclusive. Apple is a customer of Intel, NVidia, and ATI, just like everyone else, and mind you they're a small one at that. Also big gaming performance is definitely something better shown off through companies like Voodoo PC and Alienware. Apple's game selection is a bit slim to be considered a serious gaming platform.
Nvidia's motherboard GPU is really a step towards this goal. The reason that Intel integrated graphics are so bad has everything to do with Intel's graphics team: all the good people there work on CPUs. 'Everyone knows' that integrated graphics are bad, so Nvidia needed a new marketing name.
A new name won't save it from the problems inherent in integrated graphics. Honestly, they already know how to use system memory for graphics; if it were the best way forward, they would have been there already. Integrated graphics isn't anything new - Nvidia has been doing it since the nForce2 IGP. Intel's solution isn't bad because they put "bad workers" on it; it's designed for business and home uses that exclude high-performance 3D. You don't enter an economy car in the Daytona 500, and likewise, an 800HP stock car might make for a fun ride to the store, but a 4MPG car that costs hundreds of thousands of dollars probably isn't the best choice.
--------
The real future for GPUs and CPUs sharing the same turf is integration onto the die. AMD and Intel are both planning CPUs with integrated GPUs, but it's not going to be for the highest-end GPUs, just the lowest-end ones. Right now the markets are fuzzy, and no one knows if they'll be for Joe Six-pack and his Dell (to, as you said, decrease costs) or if they'll make their way into things like PDAs, cell phones, and multimedia devices like the iPod Touch/iPhone. It would definitely improve the BOM to replace the three chips they're using now with one that featured the GPU, CPU, IMC, and the local interconnect (Intel and AMD have both shown roadmap slides with CPUs integrating PCIe, GPU, CPU, and IMC).
Really, none of this is new apart from the integrated GPU. Freescale has had it planned since the Motorola days (the e800), and I believe their chips integrating the IMC, CPU, PCI, and I/O have already been in production for a while now. A lot of your enthusiasm for these concepts is good, just pointed in the wrong direction and at the wrong markets.
