Small speedup from hyper-threading?

Discussion in 'iMac' started by 5iMacs, Oct 25, 2014.

  1. 5iMacs macrumors regular

    Joined:
    Oct 25, 2014
    #1
    I've been using, upgrading, and benchmarking Macs for long enough to be convinced that the Geekbench scores are quite accurate and useful, especially for operations that really max out the machine, like rendering and video encoding, or the text processing and machine learning computations I do for research. The relative differences in processing time line right up with the Geekbench numbers.

    You could look at it as a really strong base config or a really moderate upgrade, but I think this is the smallest delta in the Multi-Core score that I've seen between the top and bottom CPU configs in an iMac, just 21%.

    http://browser.primatelabs.com/geekbench3/1090987
    http://browser.primatelabs.com/geekbench3/1094679


    Given that 10% of that is just the clock speed difference, I feel like only a 10% boost from employing hyper-threading is small compared to what I've seen in the past, although I don't have exact numbers at my fingertips. Is this the way you guys remember it?
     
  2. yjchua95 macrumors 604

    Joined:
    Apr 23, 2011
    Location:
    GVA, KUL, MEL (current), ZQN
    #2
    These scores aren't really accurate as they're 32-bit variants.

    Here's a more accurate comparison: http://browser.primatelabs.com/geekbench3/compare/1088621?baseline=1107697

    It's a 35% increase in performance for a 14.2% increase in clock speed.
     
  3. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #3
    Why are you using the lowest i5 score and the highest i7 score if you are trying to be accurate?
     
  4. yjchua95 macrumors 604

    Joined:
    Apr 23, 2011
    Location:
    GVA, KUL, MEL (current), ZQN
    #4
    Hmm...if you want to be nitpicky, so be it.

    Here's the highest i5-4690 score and highest i7-4790K score compared:

    http://browser.primatelabs.com/geekbench3/compare/1115892?baseline=1107697

    Still a 33.5% increase in performance.

    So your original argument is still invalid.
     
  5. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #5
    Thanks yjchua95, that is indeed the discrepancy. A 17% increase due to HT after clock speed adjustments is about what we've seen in the past.
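
    For anyone checking that figure, here is a minimal sketch of the arithmetic. It assumes performance scales linearly with clock speed and that everything left over is due to HT, which ignores any other differences between the chips. The function name is made up for illustration.

```python
def ht_gain_after_clock(perf_gain_pct, clock_gain_pct):
    """Percent gain left after dividing out the clock-speed ratio.

    Assumes performance scales linearly with clock speed (a simplification).
    """
    perf_ratio = 1 + perf_gain_pct / 100
    clock_ratio = 1 + clock_gain_pct / 100
    return (perf_ratio / clock_ratio - 1) * 100


# 64-bit comparison: 33.5% faster (post #4), 14.2% higher clock (post #2)
print(round(ht_gain_after_clock(33.5, 14.2)))  # prints 17
# 32-bit scores from post #1: 21% faster, roughly 10% higher clock
print(round(ht_gain_after_clock(21, 10)))      # prints 10
```

    Dividing the ratios, rather than subtracting the percentages, is the design choice here: the two effects are treated as multiplying each other.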
     
  6. joema2 macrumors 65816

    Joined:
    Sep 3, 2013
    #6
    Note there is no fixed speedup for hyperthreading -- it varies with the application. You can get 40% in an optimal case or a performance decrease in a pessimistic case.

    The only way to know for sure is run your intended application on an i7 iMac with hyperthreading on and off, and note the difference.

    There is a terminal command which supposedly does this, but on my 2013 iMac 27 with 3.5GHz i7, running 10.9.5, it doesn't seem to work. There is no performance difference, and iStat Menus shows all 8 logical cores active. So without an effective way to turn HT on/off, it's impossible to evaluate the performance difference.

    Former command to disable HT: sudo nvram SMT=0, and reboot.

    Former command to re-enable HT: sudo nvram -d SMT, and reboot.
     
  7. yjchua95 macrumors 604

    Joined:
    Apr 23, 2011
    Location:
    GVA, KUL, MEL (current), ZQN
    #7
    On a side note...

    http://browser.primatelabs.com/geekbench3/compare/611544?baseline=1115892

    This is the highest i5-4690 score vs the 3.1GHz i7-4770S in my 21.5" late-2013 iMac.

    Although my i7 is clocked 11.4% lower than that i5, in single-threaded tasks it's identical to the 3.5GHz i5, and in multi-threaded tasks it outperforms the i5 by 17.1%.
     
  8. Roman2K~ macrumors 6502a

    Joined:
    Mar 11, 2011
    #8
    That's impressive. Though I'm not sure Hyperthreading is the only factor in favor of the i7. It's also got 8 MB of L3 cache vs 6 MB for the i5. Do you know how/if that could make a difference?
     
  9. Chippy99, Oct 28, 2014
    Last edited: Oct 28, 2014

    Chippy99 macrumors 6502a

    Joined:
    Apr 28, 2012
    #9
    There's not much in the Intel architecture that should make hyper-threading significantly faster.

    If you compare an i5 to an i7, both with 4 cores (at the same clock speeds), they both have essentially similar raw compute power. It is true that two parallel tasks can be handled more efficiently by two i7 threads than 1 i5 core, due to faster context switching. But this is a marginal, not staggering benefit. A possibly larger benefit - depending on the work load - is reduced processor stalls, since 1 thread can continue whilst another is stalled, where alternatively an entire i5 core might potentially be stalled.

    But these are minor benefits really and unless there's a specific use-case where the above are disproportionately important, then I don't expect big performance gains. Increased clock-speed and cache is usually more significant.

    And @yjchua95, your stats demonstrate a 35% increase in performance for a 14% increase in clock speed, whilst performing the workload chosen, which is specific to the benchmark. A different (especially non-synthetic) use case might show much different results. It is not a fair conclusion to extrapolate from a single benchmark how much more powerful an i7 is compared to an i5. It really depends on the workload.
     
  10. tillsbury macrumors 65816

    Joined:
    Dec 24, 2007
    #10
    Agreed. It does make a difference when running VMs, though. If you give each VM a couple of cores they are a lot more responsive, and you still have some left for OSX.
     
  11. Chippy99 macrumors 6502a

    Joined:
    Apr 28, 2012
    #11
    Interesting. So if you give a VM 2 i7 cores (threads) on an i7 setup it performs significantly better than giving the same VM 1 core of an i5? I had not realised that and if true, it's presumably down to the superior context-switching.

    I have to say I am surprised. If you imagine an OS might have say 100 processes running, then with the i5 you'd have 100 context switches being handled in software. But with the i7 you'd have 50 of the switches happening on chip, but the other 50 would still be happening in software.

    Therefore I am surprised that speeding up only half of the context switches would have much, if indeed noticeable, effect.
     
  12. tillsbury macrumors 65816

    Joined:
    Dec 24, 2007
    #12
    Ahhh, I don't know about comparing with an i5, I've never owned one. All I know is that the difference between one and two is significant, and you can give two VMs two each on an i7 without running out.
     
  13. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #13
    I have some experience running VMs and I've seen that effect that tillsbury is referring to.

    What happens is that if you give the VM 2 cores, you're really giving it 2 cores and for the moments that it has a lot of work to do it can spread it out and be done with it twice as fast.

    An i7 has 8 cores+threads, which might only be 33.5% more total CPU capacity, but it has more logical things it can allocate to VMs. So you could give one VM 3 "cores", another one 2 "cores", and still not starve the host OS of cores. It would not be as smooth to do this on an i5, which is not intuitive. I think your scenario of the 100 vs 50/50 makes more sense, but for some reason VM software seems to work like this.
     
  14. Chippy99 macrumors 6502a

    Joined:
    Apr 28, 2012
    #14
    Yes, after I posted, it dawned on me that if you give a VM 2 processors in an i7 setup, perhaps both threads are not coming from the same physical core.

    That would mean that the VM could potentially consume half of the i7's physical resources, i.e. you could dynamically reach 50% CPU utilisation if monitored from the host OS. Whereas a single-CPU VM on an i5 setup is hard-limited to 25%.

    I'll test this now and report back!
     
  15. Chippy99 macrumors 6502a

    Joined:
    Apr 28, 2012
    #15
    OK, tested on an i7 setup:

    It does not work as I thought. A 2 CPU VM (VirtualBox) running 100% workload in the client, consumes maximum 200% CPU (one quarter of the available 800%) in the host OS.

    However, this load is spread across 4 processors in the host OS. 4 remain unused completely and 4 are roughly 50% used. I am not sure what this means. Perhaps it means the whole physical processor is being used (all four physical cores) but each is only loaded by 25%?

    But anyway, being limited to a quarter of the resources available, I am not sure why this would be much faster than a single CPU VM on an i5. Slightly faster only, I would have thought.
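
    A quick way to sanity-check those host-side numbers, as a rough sketch. It assumes each logical CPU counts as 100% in the host's accounting, and that the per-CPU loads are the approximate figures from the test (four logical CPUs at about 50%, the other four taken as idle).

```python
LOGICAL_CPUS = 8  # 4 physical cores x 2 hardware threads on the i7

# Approximate load per logical CPU in percent (assumed from the test above).
per_cpu_load = [50, 50, 50, 50, 0, 0, 0, 0]

total_load = sum(per_cpu_load)   # what the host attributes to the VM
capacity = LOGICAL_CPUS * 100    # 800% in the host's accounting
share = total_load / capacity

print(f"{total_load}% of {capacity}% = {share:.0%} of the machine")
# prints "200% of 800% = 25% of the machine"
```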
     
  16. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #16
    That's a good test, I think the representation of CPU utilization is just confusing.

    Think of it as 4 "plain" cores on the i5 that go simply to 100%, and 4 "enhanced" cores on the i7 that can spike to 105-130% when stuff is multi-threaded enough.

    So yes, giving two cores on the i7 is much like giving 2 cores to the i5, in terms of utilization of the CPU overall.

    It's possible that modern virtualization software overcomes this notion of "dedicating" cores to VMs. My anecdotal experience (and general "recommended practices") goes way back and hopefully will be irrelevant soon, if not already.
     
  17. joema2 macrumors 65816

    Joined:
    Sep 3, 2013
    #17
    Hyper-threading can result in significant performance gains for common tasks. E.g., I just tested a FCP X H.264 render which was 30% faster with HT on vs off. There's a utility called CPUSetter that enables/disables HT (use at your own risk): http://www.whatroute.net/cpusetter.html This was on a 2013 iMac 27, i7@3.5GHz, 32GB, GTX-780m.

    The HT performance increase comes from utilizing previously-unused CPU resources, not from context switching per se. Each CPU core has many functional elements (e.g. instruction fetch, register fetch, decode, address generation, execute, writeback). As instructions move through the pipeline they do not keep all these execution resources busy. The idle slots are pipeline stalls or "bubbles". HT (aka SMT) allows another thread to use those resources during the stall.

    Modern microprocessors overview: http://www.lighterra.com/papers/modernmicroprocessors/

    Instruction pipeline: https://en.wikipedia.org/wiki/Instruction_pipeline
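
    The "bubble filling" idea above can be sketched with a toy model. This is not how a real Haswell core schedules work: it assumes a core with a single issue slot per cycle, and threads described only by how many cycles they stall after each instruction. The function and the stall patterns are invented for illustration.

```python
def simulate(threads):
    """Cycles needed to issue every instruction of the given threads
    on one toy core with a single issue slot per cycle.

    Each thread is a list of stall lengths: after issuing instruction i,
    that thread cannot issue again for threads[t][i] extra cycles.
    """
    next_idx = [0] * len(threads)   # next instruction to issue, per thread
    ready_at = [0] * len(threads)   # first cycle each thread may issue again
    cycle = 0
    while any(next_idx[i] < len(t) for i, t in enumerate(threads)):
        for i, t in enumerate(threads):
            if next_idx[i] < len(t) and ready_at[i] <= cycle:
                ready_at[i] = cycle + 1 + t[next_idx[i]]
                next_idx[i] += 1
                break               # only one instruction issues per cycle
        cycle += 1
    return cycle


stally = [1, 1, 1, 1]   # every instruction is followed by a 1-cycle bubble
dense = [0, 0, 0, 0]    # no bubbles at all

# "HT off": the two threads run one after the other on the core.
# "HT on": both threads share the core and fill each other's bubbles.
print(simulate([stally]) + simulate([stally]), simulate([stally, stally]))  # 14 8
print(simulate([dense]) + simulate([dense]), simulate([dense, dense]))      # 8 8
```

    In this toy, threads full of bubbles finish much sooner when they share the core, while threads with no bubbles gain nothing, which matches the point that the benefit depends on the workload.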
     
  18. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #18
    And as far as general utility, Chippy99, I completely agree that the benefits of HT are unpredictable. You only have to look at the Geekbench tests themselves to see that some tests get a 50% benefit and some get 0%.

    Actually if you really want to measure overall impression of speed for a computer system, it's worse than that. First you have to set aside most of the operations which are fast on any 2GHz+ system. Then you have to exclude almost everything else because it is I/O (disk and network) limited. Then finally you get to long-running CPU-intensive operations that can be sped up a few tens of percent :D

    But it has great psychological value, which also matters.
     
  19. joema2 macrumors 65816

    Joined:
    Sep 3, 2013
    #19
    A little more info: 64-bit Geekbench 3.1.2 gave the below results on my 2013 i7 iMac with HT on and off

    HT off: Single-core: 3923
    HT on: Single-core: 4005

    HT off: Multi-core: 12,470
    HT on: Multi-core: 14,888

    So the multi-core benchmark improved by about 19.4%. However, as I previously posted, an actual FCP X video render test was improved by about 30%.
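
    The percentages can be recomputed from the four scores above. A small sketch (the helper name is just for illustration):

```python
def pct_change(before, after):
    """Percent change going from `before` to `after`."""
    return (after - before) / before * 100


scores = {
    "single-core": (3923, 4005),    # (HT off, HT on)
    "multi-core": (12470, 14888),
}

for name, (ht_off, ht_on) in scores.items():
    print(f"{name}: {pct_change(ht_off, ht_on):+.1f}%")
```

    The single-core change comes out at about 2%, small enough that it could simply be run-to-run variation.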
     
  20. Chippy99 macrumors 6502a

    Joined:
    Apr 28, 2012
    #20
    I might be having momentary brain failure, but how do you turn HT off?

    EDIT: Sorry - missed your post above!

    ----------

    That's what I said in the first place!
     
  21. 5iMacs thread starter macrumors regular

    Joined:
    Oct 25, 2014
    #21
    That's a really cool test! This is a Haswell CPU, I'm assuming?
     
  22. joema2 macrumors 65816

    Joined:
    Sep 3, 2013
    #22
    Yes the 2013 iMac 27 has a Haswell i7. However I'd expect roughly similar differences in hyperthreading with the Ivy Bridge i7 used in the 2012 iMac and the Sandy Bridge i7 used in the 2011 model. But you can't be sure without actual tests.
     
