GoMac wasn't even close to right on this one.

so in a sense it is more a marketing trick than anything else?
Not at all! Some software can really benefit from SMT. The nice thing is that Intel's latest spin on SMT is seldom the performance negative it could be in the first iterations seen years ago.

In any event SMT allows for better utilization of the resources in the CPU chip. Under optimal conditions the speed-up can approach that of two CPUs, though in practice it is often less. If you are concerned about the viability of SMT then look up benchmarks that highlight where it is a success. Some apps can really give an i7 a workout.
It's good to know. I thought the whole point was that you would get the extra power from the logical core, but as it looks it's just another term to confuse more people :)))
You are paying attention to the wrong people here. SMT has nothing to do with confusing people; it is genuinely useful tech. How effective it is for the workload you apply to your computer is an open question. This however is not unlike the usage of a GPU in a computer. Some people hardly ever fully use their GPU's capability; others using the same GPU can put it into thermal overload on a daily basis.

Likewise an i7 and its SMT capability can hardly get a workout from some users, while other users can swamp every thread to full capacity.

As a side note the OS is what actually manages all those threads. Many of those processes and threads managed by the OS don't really need the full performance of the CPU anyways. So even if the main thread on a core leaves only 20% of the baseline performance for the alternate thread, you still win with SMT. This is especially true on modern-day UNIX OSes where many things happen in the background with the user unaware of them.

One last thing: Arrandale! This is supposed to be a two-core SMT processor that has the potential to end up in a Mini-like computer. If it does, that SMT facility could be very significant in making the Mini a very robust machine for Snow Leopard. Honestly I can't recommend to anybody a computer that only supports two hardware threads anymore. An Arrandale Mini could drop it off my do-not-buy list. The rationale is clear here: Snow Leopard is an OS designed to harness threads, and the more that can be routed to hardware the better. Even if those threads don't always give you 100% of a core you still win on average.

In any event this brief message can't cover all the technical details of SMT. What I can say is that the current implementations are a vast improvement over the original concept. Many bits of software can leverage those threads, but even when a specific app doesn't, the OS can. It is not all about marketing; it is actually good tech.


Dave
 
Hopefully it will support more than the pitiful 32 GB of memory like the current model. Life begins at 128 GB, so I sure hope the new model supports that. With 12 cores / 24 threads, the machine will be a complete joke if the memory ceiling is still the same 32 GB.
 
As a side note the OS is what actually manages all those threads.

The OS scheduler needs to be smart and adaptive when running on an SMT system - otherwise you can easily get less performance than without SMT.

For example, consider the case where you have two computable threads. If these are scheduled on separate cores, you get 200% of the performance. If, however, the OS schedules those two threads on different logical CPUs on the same physical core, you get (maybe) 120%. It's pretty easy to demonstrate this scheduling effect with a simulation or a long encoding run.
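To put numbers on that placement effect, here is a toy model (a sketch of my own; the 1.2x shared-core figure is just the "maybe 120%" from the example, not a measured constant):

```python
# Toy model: a physical core delivers 1.0 unit of work per unit time when
# running one thread, and 1.2 units combined when two threads share it
# via SMT (the "maybe 120%" figure above).

def placement_throughput(threads_per_core):
    """Aggregate throughput given how many threads sit on each core."""
    per_core = {0: 0.0, 1: 1.0, 2: 1.2}
    return sum(per_core[n] for n in threads_per_core)

good = placement_throughput([1, 1])  # two threads on separate cores
bad = placement_throughput([2, 0])   # both packed onto one core

print(good, bad)  # the "200% vs 120%" case
```

The gap between the two placements is exactly what a scheduler that is not SMT-aware can cost you.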

In real life, threads wake up and sleep constantly, so what's "perfect" now might be "worst case" in a millisecond. And, since it costs CPU to move a thread between logical CPUs, the scheduler shouldn't rebalance millisecond by millisecond.


Many of those processes and threads managed by the OS don't really need the full performance of the CPU anyways.

If they don't, they're spending a lot of time in the idle state so the number of CPUs/cores/threads isn't that important. ;)

My rule of thumb with hyperthreading has been to turn it off unless you often have more computable threads than physical CPUs (cores). You help the OS scheduler by eliminating the possibility of having two busy threads on the same physical core when there are idle physical cores.

(Typing on a Core i7-940 with HT disabled.)
 
Hopefully it will support more than the pitiful 32 GB of memory like the current model. Life begins at 128 GB, so I sure hope the new model supports that. With 12 cores / 24 threads, the machine will be a complete joke if the memory ceiling is still the same 32 GB.
Well, there was a rumor some time ago that the new Mac Pro would have Gulftown and up to 128 GB of RAM.
 
What made sense back then doesn't today.

Looking at my tricked-out, one-year-old 8-core Xeon Mac Pro, I can only lament the passing of the days we could buy a CPU upgrade card for such a Mac from DayStar and their like. One of the beauties of a Mac with slots was exactly that. I'd gladly pay the $1,200 or so to upgrade the CPUs on a machine that set me back over $6K. Can the more technically inclined explain what happened that ended such upgrades?
  1. Not to be rude, but many of those upgrades were pretty stupid and frankly poor deals.
  2. The bus interfaces are much faster today, requiring great care in routing. Generally your board will be more reliable if those high-speed signals avoid sockets.
  3. Related to the above is the time it takes for the electrical signals to move across the board. Fast clocks require that parts be close together to ensure data arrives in time. If that can't happen then things have to be slowed down in some manner.
  4. A riser card for the processors is a poor packaging solution, as it changes a bumpy two-dimensional item into a massive three-dimensional one. Not only does this impact packaging but it is a negative for serviceability too.
  5. Technology has changed a lot over the years and modern chipsets do not maintain backwards compatibility with the old. Thus by the time you might like a CPU upgrade there is nothing to plug into the main board's chipset. Of course you could put the chipset on the processor daughter card, but that has its own problems.

How are those off the top of my head? The reality is that what was questionable practice in the past has been made highly unlikely by modern technology. As we move into the future and much higher integration the prospects for such hardware upgrades go to zero.

The real gotcha here is the advent of System on Chip (SoC) technology. As more "stuff" gets put into an SoC, the prospects of upgrades from third parties slip, especially if some of the IP on the SoC is vendor specific. On the other hand, the whole motherboard could be swapped if the third-party vendor can overcome all the obstacles.

The coming tablet could be very interesting in this regard. Especially if they push forward to chip on board technology.

Dave
 
It is getting harder and harder to achieve smaller and smaller line widths. TSMC struggled with 40nm, and Intel is now working on 32nm. It will be a problem for the chip makers due to physical limitations. 32nm is going to be tough; I don't see too many 32nm parts out in January.
As for i5 and i7, yes it is very confusing. Intel wants you to think that way.

You really might want to stick to a subject you understand something about. The big difference between i5 and i7 is that the i7 offers 2 logical cores per core. The i7 in the iMac does that, so it's a real i7.

As for the price, you'd think people would learn to stop speculating about prices on systems that have not even been announced, much less released. Historically, the Mac Pro has fallen within a specific price range - even when it was one of the first ones using a given processor. The first Nehalem Macs were within the same price range even though the chips were fairly scarce at the time, so there's no reason to think the first Gulftowns will be any different.
 
With this topic about the new core processors, I take it the 13.3'' MBP would not see any change to this new hardware since it was introduced not too long ago?
I can see processor upgrades for the 15'' and 17'' or iMac, but the 13.3'' can't handle the heat or isn't ready to see an update until later in 2010.
I could see many users being rattled if the MBP 13.3'' had an extreme hardware update while many people are buying the model for Christmas.
 
Apple had better update the graphics cards available. I don't care if it's a separate purchase; I just want something better than a 4870 or GTX 285. Also they need to update the power supply so it can support the newer cards that eat watts.
 
Interesting comments! I don't totally agree.

The OS scheduler needs to be smart and adaptive when running on an SMT system - otherwise you can easily get less performance than without SMT.
It depends - the greatest words ever spoken about computing.

There are many factors here, but you will almost always get better throughput with SMT on modern processors.
For example, consider the case where you have two computable threads. If these are scheduled on separate cores, you get 200% of the performance. If, however, the OS schedules those two threads on different logical CPUs on the same physical core, you get (maybe) 120%. It's pretty easy to demonstrate this scheduling effect with a simulation or a long encoding run.
That can sometimes happen. On the other hand some apps can make use of both threads very well indeed. However it is almost never a negative on the modern SMT implementations.

What you need to look at is what happens when you have a plethora of threads, say spawned by GCD. For example, let's say you have an app that requests that GCD process 32 threads that are compute intensive, but your PC only has four cores and eight hardware threads. If GCD issues eight threads to the OS (a contrived number) then all your hardware threads become active. If you get 150% from each core you get 600% as opposed to 400% on non-SMT cores. In the end you get done much faster than if you only used four cores.
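The arithmetic in that example can be sketched in a few lines (the 150%-per-core figure is the assumption from the paragraph above, not a measurement):

```python
cores = 4
work_units = 32            # compute-bound tasks handed to the queue

# 150% per core with both hardware threads busy -> the "600%" case;
# 100% per core without SMT -> the "400%" case.
smt_total = cores * 1.5
plain_total = cores * 1.0

time_smt = work_units / smt_total      # ~5.33 time units
time_plain = work_units / plain_total  # 8.0 time units
print(time_smt, time_plain)
```

Even at well under 2x per core, keeping all eight hardware threads fed finishes the batch in two-thirds of the time.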
In real life, threads wake up and sleep constantly, so what's "perfect" now might be "worst case" in a millisecond. And, since it costs CPU to move a thread between logical CPUs, the scheduler shouldn't rebalance millisecond by millisecond.
This is very true on a single-core computer, not so much on a multicore model where it is not impossible for a thread to run to completion. I'm not prepared to get into the specifics of how the threads get dispatched as frankly I don't know the details, but I would suspect that Apple is generating more threads for the OS to manage than there are hardware threads to process.
If they don't, they're spending a lot of time in the idle state so the number of CPUs/cores/threads isn't that important. ;)
That is one way to look at it. The other way is that a hardware thread running at ten or twenty percent capacity can still easily handle these minor loads on the system. Sometimes that little extra capacity goes a long way.
My rule of thumb with hyperthreading has been to turn it off unless you often have more computable threads than physical CPUs (cores).
Where did that idea come from? Frankly I think you are screwing yourself here, in part because you are assuming you are smarter than the scheduler about what hardware is available to use. More to the point, you have no way of knowing when or if your software will need multiple threads.

The only time you might gain is on hard single-threaded apps that can benefit from clock speed. That could be a justification if this is a proven reality. Even so, when you do that for one or two apps you hog-tie your machine for other apps and OS flexibility in general.
You help the OS scheduler by eliminating the possibility of having two busy threads on the same physical core when there are idle physical cores.
Where did this idea come from? A proper CPU scheduler should not leave you with free cores. Plus this can be a huge negative if the threads are from the same process and can benefit from the local caches.
(Typing on a Core i7-940 with HT disabled.)

Honestly I'd re-enable that HT support and look at the apps you run a little closer. Unless you have a very specific usage pattern that justifies it, you will be better off with HT on. I don't want to completely dismiss what you think you are seeing; rather I'm not sure if your perception is based on a faulty interpretation of what you are seeing.

The key here is throughput and your apps' interactions with SMT in Intel's processors. If rendering a video or encoding something, the question is which gets you done first: a quad with SMT on or off? Of course you might not be doing those highly parallel tasks, but that just causes me to wonder whether you are instead seeing a clock-rate boost.

In any event the stuff I've seen tested seldom shows SMT, on modern Intel processors, actually slowing things down. What interests me here is what actually has you thinking you see a real advantage in turning SMT off? Do you actually see a positive difference in work done at the end of the day with SMT off? If so, what is the app you are most focused on?

I could actually see something like Photoshop liking a setup where maximum clock rate can be achieved, due to its largely single-threaded nature. In any event let's see some numbers; curiosity is piqued here.


Dave
 
In any event SMT allows for better utilization of the resources in the CPU chip. Under optimal conditions the speed-up can approach that of two CPUs, though in practice it is often less. If you are concerned about the viability of SMT then look up benchmarks that highlight where it is a success. Some apps can really give an i7 a workout.

Ummmmm no. Even two CPUs can't double performance... The most Intel will ever claim for SMT under ideal conditions is a 30% boost.

There is no way SMT will get anywhere close to the performance of two CPUs.
 
But the problem/complaint is that there are no apps (OK, maybe we can all put our heads together and list 3) that truly take advantage of multi-core technology. Multi-core technology has been around for almost 5 years (I've owned a few quad-core PC systems since early 2007) and even today practically nothing takes advantage of it.
That is false information. EVERYTHING takes advantage of multiple cores; it's the nature of how OS X handles applications.

Of course it all has to start with the OS, too, to allow the apps to be written in that manner.
Which has been the case since 2000.
 
The Portables will likely get upgraded on the same schedule as always.

Which means the January to February time frame.
With this topic about new core processor, I take it the 13.3'' MBP would not take any change to this new hardware since its recently been introduced not to long ago?
Don't bet on it!

Let's face it, Apple is hitting on all cylinders with its portables; they will keep the lineup fresh and the momentum moving forward. I expect new laptops soon after suitable chips arrive from Intel.

I can see changes in upgraded processors for the 15'' and 17'' or iMac, but the 13.3'' can't handle the heat or are not ready to see an update until later in 2010.
The rumored 32nm processors are supposedly very low power. The new chips could go into a 13" MBP and give it a significant boost in performance and battery lifetime.
I could see many users being raddled if the MBP 13.3'' had an extreme update to hardware and many ppl are buying the model for Christmas time.

Oh come on now; everybody on this planet knows how Apple releases products. No one should be shocked that new stuff comes out next year, like it has for years now.

I can't pinpoint a date, but let's face it, everything is coming together for a very significant update to the portables. It is simply a matter of looking at what Intel and the GPU manufacturers have coming. The 32nm processors will be very fast at very low power, and some of the new mobile GPUs are also low power.

I mention the GPUs in the context of the 13" machine because I think Apple will avoid stepping backwards to GMA tech. I could be wrong, as it will depend upon Intel's implementation.


Dave
 
so in a sense it is more a marketing trick than anything else?

No, not at all. Having two virtual cores instead of one physical core gives you maybe a 20 percent performance gain for 5 percent more complexity in the chip. That's an excellent trade-off: twenty percent gain for 5 percent cost. So this is a good benefit for Intel's customers.

And since Intel doesn't market 4 Core + Hyperthreading or 6 Core + Hyperthreading as 8 core or 12 core, it's not a marketing trick anyway.
 
It depends upon the app.

Ummmmm no. Even two CPUs can't double performance...
I disagree, as some highly parallel apps do come very close to doing that and scale well across even more cores. Of course these are best-case apps, but that doesn't dismiss the reality.
The most Intel will ever claim for SMT under ideal conditions is a 30% boost.
Yes, and some apps do a lot worse too.

There is no way SMT will get anywhere close to the performance of two CPUs.

That depends upon what you call close. In any event look at some of the benchmarks out there with highly parallel code. SMT might not be perfect but it is not that bad.

Dave
 
It depends - the greatest words ever spoken about computing.

Certainly true words.... ;)


There are many factors here, but you will almost always get better throughput with SMT on modern processors.

If you never have more active (schedulable) threads than physical cores, you cannot possibly have better throughput with SMT. You might possibly have worse throughput, if active threads get scheduled on the same core when you have idle cores.


That can sometimes happen. On the other hand some apps can make use of both threads very well indeed. However it is almost never a negative on the modern SMT implementations.

Do you have any statistics that say that OSX is good about thread scheduling on SMT systems? If you search for "hyperthreading performance worse" you see things like http://www.csl.cornell.edu/~vince/writeups/case_for_ht.html You might also want to look at the Linux developer discussions about trying to improve the Linux scheduler for SMT. It's not a simple "one size fits all" problem.


What you need to look at is what happens when you have a plethora of threads, say spawned by GCD....If you get 150% from each core you get 600% as opposed to 400% on non-SMT cores. In the end you get done much faster than if you only used four cores.

This is exactly why I said "My rule of thumb with hyperthreading has been to turn it off unless you often have more computable threads than physical CPUs" - on a server with a typical load with many computable threads, SMT is good - you'll typically win.

On a workstation/desktop with random loads, sometimes you'll have enough computable threads that SMT is a win. Sometimes you'll have fewer threads than physical CPUs, and SMT can hurt.


Where did that idea come from? Frankly I think you are screwing yourself here, in part because you are assuming you are smarter than the scheduler about what hardware is available to use. More to the point, you have no way of knowing when or if your software will need multiple threads.

It comes from experience and testing. Right now my quad core system has 649 running threads, is averaging 25% activity (out of 100% - there's a dual CPU VM churning on some things, usually it's much less), and is averaging 2.4GHz (out of 2.93GHz).

So, by limiting the system to 4 real threads, I never have the situation where scheduling artifacts lead to some physical cores with 2 threads and some physical cores with 0 threads.

Also note that "knowing when or if your software will need multiple threads" is not the issue, it's "knowing when or if your software will have multiple computable threads". My system has 657 threads now, so I *know* that it will always need multiple threads. I also know that my system seldom has more computable threads than physical cores.


The only time you might gain is on hard single-threaded apps that can benefit from clock speed. That could be a justification if this is a proven reality. Even so, when you do that for one or two apps you hog-tie your machine for other apps and OS flexibility in general.

Where did this idea come from? A proper CPU scheduler should not leave you with free cores. Plus this can be a huge negative if the threads are from the same process and can benefit from the local caches.

I know from the way that I use my machine that I don't have long-running multi-threaded CPU-bound apps.

Also, it's very nebulous that "if the threads are from the same process and can benefit from the local caches" is meaningful. Some threads from the same process might be sharing common data, and L1/L2 cache sharing would be a plus. In other cases, threads from the same process might be independent, and running on separate physical cores would be a huge advantage.

How does the scheduler know which is the case? It doesn't.


Honestly I'd re-enable that HT support and look at the apps you run a little closer. Unless you have a very specific usage pattern that justifies it, you will be better off with HT on.

My usage pattern averages less than 1 CPU busy, and I have 4 CPUs. Adding 4 more logical CPUs to my mix will almost never speed things up, and risks slowing things down.

If I were going to spend some time stealing videos with Handbrake, I'd simply turn HT back on while I was ripping them.

Note that HT will sacrifice response time for throughput in the best of cases - think about that.
 
Doesn't hyperthreading really act as a "queueing" system to make sure a core is busy if a process is available, rather than reporting its availability for work ONLY after finishing execution of the prior process?

No, that is not how it works. Each core has a certain number of processing units that do the actual work. On a current Intel processor that would be two integer units, one floating-point adder, one floating-point multiplier, one load/store unit or something like that. It is very hard for one thread to use all these units. If two threads share these resources, then these two threads will together be able to use more of the available resources than a single thread.

The biggest gain happens when a thread has to wait for something and can't continue. When that happens, the other thread running on the same core can continue running at full speed. Without the second virtual core, the real core would stand still, instead it does useful work.
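That stall-covering effect can be sketched with a toy cycle model (my own simplification: one shared issue slot per cycle with alternating priority; real cores issue several micro-ops per cycle and share many more resources):

```python
def retired(ready_a, ready_b):
    """Count instructions retired by two SMT threads sharing one issue
    slot per cycle. ready_x[c] is True when thread x has work ready in
    cycle c; a stalled thread yields its slot to its sibling."""
    counts = [0, 0]
    for c in range(len(ready_a)):
        order = (0, 1) if c % 2 == 0 else (1, 0)  # alternate priority
        ready = (ready_a[c], ready_b[c])
        for t in order:
            if ready[t]:
                counts[t] += 1
                break
    return counts

always = [True] * 10                      # thread that never stalls
bursty = [c % 2 == 1 for c in range(10)]  # stalled every other cycle

print(retired([False] * 10, bursty))  # alone: half the slots go idle
print(retired(always, bursty))        # paired: sibling soaks up the stalls
```

Run alone, the bursty thread wastes half the core's cycles; with a sibling thread present, every cycle does useful work.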
 
Solitaire and Minesweeper? :p hmm..

Please someone enlighten me. Just wondering about the article. What did it mean by a dual processor setup bringing 12 physical cores and 24(!) logical cores? Dual 6-core i9s is 12 cores. And how can it be doubled to 24 logical cores? :confused:

So 1 core contains 2 "logical" cores?
If you get a MP with one i9, it will have 6 physical and 12 logical cores. If you get it with two i9s, it will have 12 physical cores and 24 logical cores.
Breaking news.

Lots of people already have nice monitors and don't want to look at a mirror when they are using a computer.
Breaking news.

If that's their case then all in ones aren't for them.
 
Some interesting info.

If you look here http://www.phoronix.com/scan.php?page=article&item=intel_lynnfield&num=1 you see a long article about modern CPU performance with lots of benchmarks for Intel i5, i7 & AMD processors. It is very interesting to see the sometimes massive differences between the i7 & i5 processors. Other times the i5 wins. What can't be denied is that SMT can lead to some very real advantages for the i7.

Now there are all sorts of issues here such as clock rate, caches and the like, but when given the opportunity the i7 can really clobber the i5 on highly parallel code. A 20 to 50% difference is nothing to sneeze at. It of course depends upon the code and problem. The regressions are few relative to the i5, which is why I see the i7 upgrade to the iMac as being very advisable. It is $200 well spent.

Does SMT solve everybody's performance requirements? Nope, it doesn't even come close. It is however a sound economical choice over going dual processor as seen in a Mac Pro.

It should also be noted that the benchmarks are not on a Mac OS system. That is OK because the Mac is still maturing with respect to multithreaded software; it does give you an idea of how optimal software could benefit from an i7.


Dave
 
The rumored 32nm processors are supposedly very low power. The new chips could go into a 13" MBP and give it a significant boost in performance and battery lifetime.
They are low power but it appears that the 2.53/2.67 GHz variants won't arrive until Q3 2010.
 
More considerations.

Certainly true words.... ;)

If you never have more active (schedulable) threads than physical cores, you cannot possibly have better throughput with SMT. You might possibly have worse throughput, if active threads get scheduled on the same core when you have idle cores.
Very true.

My point is: how can you as a user possibly know how many threads and/or processes are active at all times? The OS can effectively generate new processes & threads at any time. Those can be unrelated to your app or generated in service of your current app.

Do you have any statistics that say that OSX is good about thread scheduling on SMT systems? If you search for "hyperthreading performance worse" you see things like http://www.csl.cornell.edu/~vince/writeups/case_for_ht.html You might also want to look at the Linux developer discussions about trying to improve the Linux scheduler for SMT. It's not a simple "one size fits all" problem.
Actually I've been following Linux for a long time and have seen schedulers come and go, so I understand the complexity. What I'm saying is: does it make sense for the average user to second-guess the scheduler? In most cases I'd say it is not worthwhile.
This is exactly why I said "My rule of thumb with hyperthreading has been to turn it off unless you often have more computable threads than physical CPUs" - on a server with a typical load with many computable threads, SMT is good - you'll typically win.
That is all well and good, but can you say off the top of your head how many threads and processes are running when Safari is using part of a screen to run Flash movies while in another window you are running a word processor or IDE? If it is 64-bit Snow Leopard that is two processes for Safari plus an unknown number of threads, along with your word processor and at least a couple of threads there. So quickly you have three user processes and a few threads to deal with. In any event why would you want to burden yourself with turning SMT on and off to get what you think is better performance? Especially when your mix of apps can change at any moment.
On a workstation/desktop with random loads, sometimes you'll have enough computable threads that SMT is a win. Sometimes you'll have fewer threads than physical CPUs, and SMT can hurt.
Exactly, so why saddle yourself with being a CPU scheduler when the OS can do it for you? Is it perfect? Certainly not, but over time it ought to do better.
It comes from experience and testing. Right now my quad core system has 649 running threads, is averaging 25% activity (out of 100% - there's a dual CPU VM churning on some things, usually it's much less), and is averaging 2.4GHz (out of 2.93GHz).
Would you agree that any of those 600-some-odd threads can become active at any time? If so, let's say 8 of them became active all at once; wouldn't you want to have SMT available then?
So, by limiting the system to 4 real threads, I never have the situation where scheduling artifacts lead to some physical cores with 2 threads and some physical cores with 0 threads.
If you say so. It may very well be the case that Mac OS has a crappy scheduler. I do wonder though if this testing of yours is with the latest Snow Leopard release, as that section of the OS was totally reworked with the intent to support lots of cores.
Also note that "knowing when or if your software will need multiple threads" is not the issue, it's "knowing when or if your software will have multiple computable threads".
It really doesn't matter, as you can't answer that question either.
My system has 657 threads now, so I *know* that it will always need multiple threads. I also know that my system seldom has more computable threads than physical cores.
I have to call BS on this one, because you'd have to be superhuman to know what is running on your PC from millisecond to millisecond 24/7.
I know from the way that I use my machine that I don't have long-running multi-threaded CPU-bound apps.
OK, I can buy that in general. But have you ever downloaded a large file while watching a movie on screen? It doesn't take much to end up using a lot of resources. This might be a bad example if you have a machine doing decode on the GPU.
Also, it's very nebulous that "if the threads are from the same process and can benefit from the local caches" is meaningful. Some threads from the same process might be sharing common data, and L1/L2 cache sharing would be a plus. In other cases, threads from the same process might be independent, and running on separate physical cores would be a huge advantage.
That can certainly happen. Still, if the OS wants it could schedule those two threads to run on a single core. You are not guaranteed that the OS will put the thread on another CPU.
How does the scheduler know which is the case? It doesn't.


My usage pattern averages less than 1 CPU busy, and I have 4 CPUs. Adding 4 more logical CPUs to my mix will almost never speed things up, and risks slowing things down.
Risks, possibly, but consistently I'm not too sure about. The problem that I see is that you say you only have one CPU busy, but that doesn't mean the OS isn't feeding work to the others. The impact could be so minor you might never notice.
If I were going to spend some time stealing videos with Handbrake, I'd simply turn HT back on while I was ripping them.
I don't recommend stealing. In any event you still haven't convinced me that being a cyborg scheduler is worthwhile.
Note that HT will sacrifice response time for throughput in the best of cases - think about that.
Is this another off-the-cuff remark? Because it is not valid the way I'm parsing it.

In any event we can go on arguing about this, but realize I'm an old guy here with experience going back to DEC's PDP computers. I used to work on a system where I had to alter priorities on processes to get the right work done at the right time. Not much fun to be had with that.

In any event your issues have me curious, because what you describe as usage should never cause you a problem on the type of hardware you are running. I was under the impression that the SL scheduler was vastly improved. Combined with all the improvements in the i7, you should be golden.


Dave
 
This is exactly why I postponed my purchase of the 8-core Mac Pro. The only thing I'm worried about is a price rise! I hope not!! :eek: Please God!!! :(

Indeed! You can rest assured that Apple will manufacture and sell these at a cost:profit ratio of 1:2 (or 2x profit for 1x the cost).

Either way I know I'll never be able to afford this currently or within 12 months, so I'll be looking to acquire a USED 8-core Mac Pro (2008) from a used eBay listing mid-year.
 
In any event the stuff I've seen tested seldom shows SMT, on modern Intel processors, actually slowing things down.

What interests me here is what actually has you thinking you see a real advantage turning SMT off?

Just found this on Arstechnica...

Cinebench scores scale predictably, save for a certain oddity surrounding quad-threaded performance. The Core i7 965 without HT blows the doors off the Core i7 with HT.

[Cinebench scaling chart: nehalem-6c.png]


http://arstechnica.com/hardware/reviews/2008/11/nehalem-launch-review.ars/5

I'm sure there are many benchmarks that show that HT is useful for some apps, and I do turn it on for my production servers and render-farm type systems - but I'll keep HT disabled on my desktop machines unless I'm going to spend the weekend stealing DVD content with Handbrake.
 
My point is how can you as a user possibly know how many threads and/or processes are active at all times?

Very easy - ask the performance monitor to tell me how many threads are queued and waiting for CPU.

The number of threads and processes on the system is irrelevant (right now 50 processes and 669 threads). It's the number of threads that are computable that is important. Having 50 threads with 10 computable is a completely different situation from having 1000 threads with 0.137 computable (on average). The latter is closer to my case.

If it is 64 bit Snow Leopard that is two processes for Safari plus an unknown number of threads along with your word processor and at least a couple of threads there. So quickly you have three user processes and a few threads to deal with.

And they are all idle most of the time....


Would you agree that any of those 600-some-odd threads can become active at any time? If so, let's say 8 of them became active all at once; wouldn't you want to have SMT available then?

The chances of 8 active at once is less than the chance of 4 being active and being sub-optimally scheduled. (Again, I'm talking about a system with an interactive load, not a server or render system with a long-running multi-threaded task.)


But have you ever downloaded a large file while watching a movie on screen? It doesn't take much to end up using a lot of resources. This might be a bad example if you have a machine doing decode on the GPU.

It's also a bad example because the download is mostly idle waiting for network transfers.


The problem that I see is that you say you only have one CPU busy but that doesn't mean the OS isn't feeding work to the others. The impact could be so minor you might never notice.

Having thousands of idle threads consumes memory resources, but not CPU resources.

I don't recommend stealing. In any event you still haven't convinced me that being a cyborg scheduler is worthwhile.

Originally posted by AidenShaw:

Note that HT will sacrifice response time for throughput in the best of cases - think about that.

Is this another off the cuff remark? Because it is not valid the way I'm parsing it.

Not off the cuff at all.

Consider an overly simplified case of a single core HT CPU, and a series of tasks (threads) that take 60 CPU seconds per task (no IO or other variables).

Without HT, a task takes 60 seconds, and you do 1 task per minute.

With HT, and assuming 120% "efficiency" for HT, the two tasks share the core (0.6 of a core each), so a task takes 100 seconds - and you do 1.2 tasks per minute.

That's better throughput, but worse response time.
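The numbers behind that trade-off work out like this (a sketch; the 120% combined-efficiency figure is the assumption stated in the example):

```python
task_cpu_seconds = 60.0
ht_efficiency = 1.2        # combined throughput of two threads on one core

# Without HT: one task at a time at full speed.
latency_no_ht = task_cpu_seconds                     # 60 s per task
rate_no_ht = 60.0 / latency_no_ht                    # 1.0 task per minute

# With HT: two tasks share the core, each getting 1.2 / 2 = 0.6 of it.
latency_ht = task_cpu_seconds / (ht_efficiency / 2)  # 100 s per task
rate_ht = 2 * 60.0 / latency_ht                      # 1.2 tasks per minute

print(latency_ht, rate_ht)  # higher throughput, worse response time
```

Every individual task waits longer even though more tasks complete per minute, which is exactly the throughput-versus-latency trade-off described.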


In any event we can go on arguing about this...

Let's not. I've spent quite a bit of time benchmarking systems with and without hyperthreading. I'm comfortable with my "rule of thumb" that if the number of computable threads on your system is almost always less than the number of physical cores, then I'll turn HT off. On busy servers or special purpose systems like render systems, I'll turn it on.
 