What can you give that does not affect your privacy? Anything is acceptable.

This paper (http://ieeexplore.ieee.org/Xplore/l...0641683.pdf?arnumber=641683&authDecision=-203), as summarized by google:

A 533-MHz BiCMOS Superscalar RISC Microprocessor - Solid-State ...
by CA Maier - 1997 - Cited by 17 - Related articles
Cliff A. Maier, Member, IEEE, James A. Markevitch, Member, IEEE, Cheryl Senter Brashears, ...... He is currently at AMD, Sunnyvale, CA, where ...
ieeexplore.ieee.org/iel3/4/13972/00641683.pdf?arnumber=641683

Does that work? (you could also buy the paper and read the bio, of course :) If you google "cliff maier AMD" you'll see various references, though some of them are merely me saying I work at AMD in interviews and stuff.
 
I used to build quite a few computer systems some years ago with AMD processors. This was back in the days of the Athlon/Athlon XP and Intel's Pentium 4.
AMD was in many ways on top in those days; however, IMO they haven't had anything as good as Intel in a long time.
And for the premium prices you pay for an Apple computer, you better be getting the best CPU available.
 
Would Apple really risk their relationship with Intel?
Could they get the same price deals through AMD?

There's little risk. Multiple vendors (big ones like HP and Dell) put AMD processors in their lineup years ago. The problem with the current relationship is that Intel can be lulled into a false sense of security when they are the sole supplier. When you have AMD, even at the low end, it gives you negotiating power that you wouldn't otherwise have.
 
I used to build quite a few computer systems some years ago with AMD processors. This was back in the days of the Athlon/Athlon XP and Intel's Pentium 4.
AMD was in many ways on top in those days; however, IMO they haven't had anything as good as Intel in a long time.
And for the premium prices you pay for an Apple computer, you better be getting the best CPU available.
Except for the server market, Intel's fastest CPU in a market segment clearly outperforms AMD's fastest (they also happen to have equivalent core counts, and Intel core > AMD core). The thing with Apple is that their Macs already aren't using the fastest CPUs in a market segment. The CPUs in the high-end Macs (MBP, iMac) seem to be TDP-limited, while those in the low-end Macs are probably price-limited (except for the 13" MBP and the GPU issue, which Llano should solve). So next year there will be situations where a 4-core AMD CPU has a similar TDP and probably a similar price to a 2-core Intel CPU. In those cases Intel's CPU advantage is reduced and might even be reversed.
 
The team that did all the great design work for Athlon 64/Opteron is gone. Forced out or quit in disgust. They now work at Apple, Oracle, misc. startups, and, in my case, changed careers.

It seems you can apply for this job:

https://www.amd.apply2jobs.com/index.cfm?fuseaction=mExternal.showJob&RID=9426&CurrentPage=1

It's suitable if you want to get back to work for AMD. I believe you would be the best candidate to replace Tom McCoy. ;)

Requisition Number: 9426
Posting Date: 26/Jan/10
Area of Interest: Legal
Job Title: Senior Corporate Counsel, Patents
Intern/Coop/Student Term:
City/Town: Sunnyvale
State/ Province/ Region: California
Country: United States
Job Description: Senior Corporate Counsel, Patents will be responsible for the management and direction of a significant portion of AMD's world wide patent portfolio. Reporting to the Director, Patents & Licensing the candidate will have:
* an EE degree with specific semiconductor manufacturing and CPU experience
* 5+ years experience in private practice (and preferably some in-house/corporate experience) with expertise in drafting and prosecution including the preparation and prosecution of semiconductor related patent applications before the USPTO and preparation and prosecution management experience in major jurisdictions
* USPTO qualified
* Called/Qualified to a US state bar
* Previous management of a significant size patent portfolio
* IP transactional experience
* Litigation/Contentious issue experience

At AMD, we are committed to equal employment opportunity.

AMD does not accept resumes from headhunters, placement agencies or other suppliers that have not signed a formal agreement with us. Our supplier base is restricted to specified hiring needs. Therefore, any resume received from an unapproved supplier will be considered unsolicited, and AMD will not be obligated to pay a referral or placement fee.

But it seems these rumors, linked with your posts, have been floating around as a topic on some Internet forums.
 
Anandtech has the Thuban review up.

Intel dominates in IPC but Thuban does well in threaded applications with the extra cores and it's cheap.

AMD is probably losing their shirts selling a 346mm^2 CPU for under $300, but I may have to build a system next year to tinker around with, depending on what that amount buys me next spring.
 
Hi cmaier,

as you seem to have a lot of insight into the development of Bulldozer, you might find the following quote interesting. It contains many bits of information, and I received it from someone (who’s not a native English speaker, it seems) whom I don’t want to disclose. Maybe you can confirm it. He also sent me a slide, which has to be kept secret, so I can’t post it here. I also heard of an unchanged instruction cache, 16k 4-way L1 data caches per core and up to 2M L2 per module.

“So as you can see, there are 4 full integer pipelines per core, capable of doing up to 4 instructions per cycle or in
the case of less utilization can run two branches eliminating branch misprediction.
It can fetch from several threads (program pointers) alternately, including possible branch targets. For that to
work the branch prediction unit (BPU) tries to identify branches and their targets and controls the working of the
IFU. If the instruction queues of the units to be fed are already filled at high levels, the IFU/BPU pair tries to
prefetch code to avoid idle cycles. Having prefetched the right code bytes in 50% of all fetches is still better than
having no code ready at all. In reality this number is even better.
After a block of 32 code bytes is fetched and queued in an instruction fetch queue, the decode unit receives such
a packet each cycle for decoding. To decode it quickly, it has four dedicated decode subunits, where each of
them can decode most x86 and SIMD instructions on its own and quickly (1 per cycle per subunit). Rarely
used or complex instructions are decoded using microcode storage (ROM and SRAM). This can
happen in parallel with the decoding of the ‘simple’ instructions. There are no inefficiencies like in K10. XOP and
AVX instructions are either decoded one per subunit (if operand width is <= 128 bit) or one per two subunits (256
bit, similar to the double decode of SSE and SSE2 instructions in K8). The results are ‘double mops’ (pairs of
micro ops, similar to the former MacroOps). After finishing decoding, the double mops (which can have one
unused slot) are sent to the dispatch unit, which prepares packets of up to four double mops (dispatch packet)
and dispatches them to the cores or the FPU depending on their scheduler fill status. Already decoded mops are
also written to the corresponding trace cache to be used later if the code has to be executed again (e.g. in loops).
Thanks to these caches, the actual decoding units are free and can be used to decode code bytes further down
the program path. If a needed dispatch packet is already in the cache, the dispatcher can dispatch that packet to
the core needing it and in parallel dispatch another packet (from the decoders or the other trace cache) to the
second core. So there won't be any bottleneck here.
The schedulers in the cores or FPU select the mops ready for execution by the four pairs of ALUs and AGUs per
core, depending on available execution resources and operand dependencies. While doing that, there is more
flexibility than there was in the K10 with its separate lanes and the inability of OOps to switch lanes to find a free
execution resource. To save power, the execution units are only activated if mops needing them become ready
for execution. This is called wakeup.
The integer execution units - arithmetic logic units (ALUs) and address generation units (AGUs) - are organized in
four pairs - one per instruction pipeline. They can execute both x86 integer code, memory ops (also for FP/SIMD
ops) and, which is the biggest change, can be combined to execute SSE or AVX integer code. This increases
throughput significantly and frees the FP units somewhat. The general purpose register file (GPRF) has been
widened to 128 bit to allow for such a feature. The registers will be copied between GPRF and the floating point
register file (FPRF) if an architectural SIMD register (the registers specified by the ISA) is used for integer first
and floating point later or vice versa. Since this doesn't happen often, it has practically no impact on performance.
With the option to use the integer units for integer SIMD code (SSE, XOP and AVX), the overall throughput of
SIMD code increases dramatically.
The FPU contains the already known two 128 bit wide FMAC units. These are able to execute either one of the
new fused multiply add (FMA) instructions or alternatively a floating point add and mul operation (or other types
of operations covered by the internal fpadd and fpmul units). This ability provides both lower energy
consumption and higher throughput for the simpler operations. As AMD already stated, the two 128 bit units will
either be used in parallel by the two threads running on the integer cores or, in cycles where one core
doesn't need the FPU, both be used by only one thread, increasing its FP throughput. This happens on a per
cycle basis and resembles some form of SMT. The FPU scheduler communicates with the cores, so that they can
track the state of each instruction belonging to the threads running on them.
Both the integer and the floating point units need data to work with. This is provided by the two 16k L1 data
caches. Each core has its own data cache and load store unit (LSU). The load store unit handles all memory
requests (loads and stores) of the thread running on the same core and the shared FPU. It is able to serve two
loads and one store per cycle, each of them up to 128 bit wide. This results in a load bandwidth of 32B/cycle and
a store bandwidth of 16B/cycle - per core. A big change compared to the LSU of the K10 is the ability to do data
and address speculation. So even without knowing the exact address of a memory operation (which isn't known
until the mop has executed in an AGU), the unit uses access patterns and other hints to speculate whether
some data is the same as some other data whose address is already known. And finally, the LSU is also able
to execute all memory operations out of order, not only loads. To make all this possible without too much effort,
the engineers at AMD added the ability to create checkpoints at any point in time and go back to this point and
replay the instruction stream in case of a misspeculation.
To reduce the number of mispredicted branches and the latency of the resulting fetch operations, the branch
predictors have been improved. They are able to predict multiple branches per cycle and can issue prefetches of
code bytes that might be needed soon. Together with the trace caches, it is often possible that even after a
branch misprediction (which is only known after executing the branch instruction), the correct dispatch packets
are already in the trace cache and can be dispatched from there with low latency.
One big feature, which improves performance a lot, is the ability to clock units at different clock frequencies
(provided by flexible and efficient clock generators), to power off any idle subunit and to adapt sizes of caches,
TLBs and some buffers and queues according to the needs of the executed code. A power controller keeps track
of load and power consumption of each of the many subunits and adapts clocks and units as needed. Further it
increases throughput and power consumption of heavily loaded units as long as the processor doesn't exceed its
power consumption and temperature limits. For example, if the queues and buffers of core 0 are filled and the
FPU is idle, then the power controller will switch off the FPU (until it is woken up to execute FP code) and
increase the clock frequency of core 0. If core 0 does not have that many memory operations (less pressure on cache),
the cache might be downsized to 8kB, 2-way by switching off 2 of the 4 ways it has. This way the power the
processor is allowed to use will be directed to where it is needed and not to driving idle units. This is called
Application Power Management, as you might have heard in some rumors on the net.”

At least the details don’t sound like the architecture would be a miss. The guy also told me that first samples (I don’t know if they are already 32nm) run very well, with really good performance and power characteristics, already outperforming their fastest desktop chips.
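
For anyone who wants to check the numbers in the quote, here is a minimal sketch of the arithmetic behind the LSU bandwidth figures (two 128-bit loads and one 128-bit store per cycle) and the downsized cache (16k 4-way with 2 of 4 ways switched off). The function names are made up for illustration, and the inputs are simply the rumored values from the quote, not confirmed specifications.

```python
def bytes_per_cycle(ops_per_cycle: int, width_bits: int) -> int:
    """Peak bandwidth in bytes per cycle for a group of identical ports."""
    return ops_per_cycle * width_bits // 8


def active_cache_kib(total_kib: int, total_ways: int, active_ways: int) -> int:
    """Capacity left when only some ways of a set-associative cache stay powered."""
    return total_kib * active_ways // total_ways


# Rumored values taken from the quoted description (unconfirmed).
load_bw = bytes_per_cycle(ops_per_cycle=2, width_bits=128)
store_bw = bytes_per_cycle(ops_per_cycle=1, width_bits=128)
downsized = active_cache_kib(total_kib=16, total_ways=4, active_ways=2)

print(load_bw, store_bw, downsized)  # prints: 32 16 8
```

So the 32B/cycle load, 16B/cycle store and 8kB 2-way figures are at least internally consistent with the rest of the description.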
 
Anandtech has the Thuban review up.

Intel dominates in IPC but Thuban does well in threaded applications with the extra cores and it's cheap.

AMD is probably losing their shirts selling a 346mm^2 CPU for under $300, but I may have to build a system next year to tinker around with, depending on what that amount buys me next spring.

Some people did the calculations and came up with $60 per die. Might be a bit more, but not much. Also AMD has to pay for unused capacity which it reserved at GF.
http://www.semiaccurate.com/forums/showthread.php?p=45135&$5000#post45135
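
A rough sketch of how a per-die estimate like that can be put together, using a common dies-per-wafer approximation. Only the 346mm^2 die area comes from this thread. The 300mm wafer diameter, the wafer cost and the yield below are placeholder assumptions for illustration, not figures from the linked calculation.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate gross dies per wafer: wafer area over die area, minus an edge-loss term."""
    gross = math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


def cost_per_good_die(wafer_cost_usd: float, dies: int, yield_fraction: float) -> float:
    """Spread the wafer cost over the dies that actually work."""
    return wafer_cost_usd / (dies * yield_fraction)


# 346 mm^2 is from the thread; everything else is a hypothetical placeholder.
dies = dies_per_wafer(wafer_diameter_mm=300, die_area_mm2=346)
cost = cost_per_good_die(wafer_cost_usd=5000.0, dies=dies, yield_fraction=0.5)
print(dies, round(cost, 2))
```

The result is very sensitive to the assumed wafer cost and yield, so treat any single dollar figure as a ballpark. It also leaves out packaging, testing and the unused reserved capacity mentioned above.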
 
Nope, you missed your other two. ;)

And like cmaier, "If you'd like to limit your discussion to the topic at hand i would be delighted to participate." So don't bother replying, you're just wasting your time. Subsequent readers can draw their own conclusions. ;)

You will not dictate to me what I limit the discussion to. Get it?
 
I think they might switch since Mac gaming is taking off.

I'm just putting it out there.

When it comes to gaming, the CPU is not as important as the graphics card, and, well, Apple has to start putting better GPUs in their computers first.

Also, as has been pointed out before, AMD mobile chips have always sucked compared to Intel's. This includes the Athlon heyday. AMD mobile chips still sucked back then.
 
When it comes to gaming, the CPU is not as important as the graphics card, and, well, Apple has to start putting better GPUs in their computers first.

Also, as has been pointed out before, AMD mobile chips have always sucked compared to Intel's. This includes the Athlon heyday. AMD mobile chips still sucked back then.

Exactly. With a big fab to pay for and not enough people in R&D, it was not possible to create dedicated mobile designs. Before K7 it was already difficult to pay the people who were there, let alone add more teams. Those one-size-fits-all designs (with some tweaking of the transistors) were not the most sophisticated. But this seems to have changed, and the next designs look good. If you design for mobile, you simply get better results than if you design for desktop or server.
 
Hi cmaier,

as you seem to have a lot of insight into the development of Bulldozer, you might find the following quote interesting. It contains many bits of information, and I received it from someone (who’s not a native English speaker, it seems) whom I don’t want to disclose. Maybe you can confirm it. He also sent me a slide, which has to be kept secret, so I can’t post it here. I also heard of an unchanged instruction cache, 16k 4-way L1 data caches per core and up to 2M L2 per module.

“So as you can see, there are 4 full integer pipelines per core, capable of doing up to 4 instructions per cycle or in
the case of less utilization can run two branches eliminating branch misprediction....
Nice post, very informative, thx. Bulldozer sounds exciting and spot on. I like the offloading of integer SIMD onto the integer units as well as the simultaneous branch execution. Throw in OpenCL on the integrated chipset and this could be perfect for Apple, considering Intel's efforts to cripple GPGPU.

I hear that bulldozer will hit 5GHz+, anyone else hear anything?
 
Looks like AMD is scoring some major wins in the mobile space.

I read through that and it looks like those chips are yet again all low-end CPUs in the mobile space. AMD is not offering anything in the higher end of the market, so Intel will still be making higher quality stuff.

I would be the first to cheer if AMD really forced Intel to move again. AMD blew its chance when Intel was a sleeping giant and AMD had a kick-ass product that beat the crap out of anything Intel had.

Instead of taking the lead they had built and running with it, they sat on it.
 
I read through that and it looks like those chips are yet again all low-end CPUs in the mobile space. AMD is not offering anything in the higher end of the market, so Intel will still be making higher quality stuff.

I would be the first to cheer if AMD really forced Intel to move again. AMD blew its chance when Intel was a sleeping giant and AMD had a kick-ass product that beat the crap out of anything Intel had.

Instead of taking the lead they had built and running with it, they sat on it.
It looks like AMD is starting to win the volume areas and the price/performance space. Considering they are one generation behind in process size, they seem to have closed the gap considerably. If Fusion, their APU, lives up to the promise, Apple's notebook market could be a big win for AMD, especially with Intel slowing down Apple's GPGPU progress on the MacBook Air and effectively stopping it. It feels like there is a lot of love lost between Intel and Apple lately.

PS did you mean higher performance/w as opposed to "higher quality"?
 
It looks like AMD is starting to win the volume areas and the price/performance space. Considering they are one generation behind in process size, they seem to have closed the gap considerably. If Fusion, their APU, lives up to the promise, Apple's notebook market could be a big win for AMD, especially with Intel slowing down Apple's GPGPU progress on the MacBook Air and effectively stopping it. It feels like there is a lot of love lost between Intel and Apple lately.

PS did you mean higher performance/w as opposed to "higher quality"?

Intel's customers have known about the integrated graphics in the i-series CPUs for years now. It wasn't some big shock.
 
I would be the first to be cheering to see AMD really force intel to move again.
Really? Because it sounds like you're of the exact opposite mind. Everything they do sucks and is too little too late according to you. You seem to blame everything on AMD without acknowledging that they're up against a behemoth that's several orders of magnitude larger than they are. Even just staying a few steps behind Intel is quite an accomplishment. Who else has managed this feat in the x86 market? Many have tried, few have succeeded.
 
Really? Because it sounds like you're of the exact opposite mind. Everything they do sucks and is too little too late according to you. You seem to blame everything on AMD without acknowledging that they're up against a behemoth that's several orders of magnitude larger than they are. Even just staying a few steps behind Intel is quite an accomplishment. Who else has managed this feat in the x86 market? Many have tried, few have succeeded.

Do you know what I have running in my 6-year-old desktop?
It is an Athlon 64 3000+ CPU. 6 years ago AMD was kicking Intel's rear in desktop chips. Their CPUs were faster and cheaper. AMD figured out x86-64. I paid $150 for my Athlon, and it was beating out $300+ Intel chips. Shortly afterwards AMD dropped the price to $75 for the same chip, and it was still beating the $300+ Intel CPUs.

As cmaier pointed out, AMD blew its lead by not expanding on it. They sat on that edge while Intel was a sleeping giant. The problem was that the Athlon chips woke Intel up, and they took back their performance edge. AMD has never caught back up.

I like AMD, I really do. It's just that even back in their Athlon days their mobile CPUs sucked compared to Intel's. Intel made the Pentium M back then, and for its time the Pentium M was an amazing mobile CPU.
 
I always thought it was strange how AMD went back on their original plans. Originally they stated they did not plan to build an integrated memory controller for DDR2. They were just going to skip it and jump to DDR3.
 