[C++] MacPro slower than MacBook

Discussion in 'Mac Programming' started by topcomer, Jul 28, 2009.

  1. macrumors newbie


    I develop C++ code for scientific computing, usually on my MacBook, but my research group has now bought a MacPro, and I tried to build and run my code there. It turned out that the speed is comparable for extremely small memory usage, while for larger data (if 30 MB can be considered larger) the MacBook totally outperforms the MacPro. For reasonable data (200 MB) the MacPro is almost stuck.

    I then tried to profile using Shark, and found that the following call takes 23% of the runtime of my process:


    Can someone please help me to understand what is going on?
    Thank you!
  2. macrumors G5


    Google is your friend. Why don't you type "ml_set_interrupts_enabled" into Google and see what happens? Maybe someone had the same question in 2004, posted on the Mac developers site, and got a reply from Eric Schlegel?

    (Note: If you don't know who Eric Schlegel is, then I can guarantee he knows 1000 times more about MacOS X than you do. If you know who he is, he very very very likely knows more about it).
  3. macrumors 65816

    gnasher729, I did what you said and found something completely irrelevant to OP's question. Sure, the person asked why so much time was being used there, but he wasn't seeing tremendously slow code on a MacPro and fast code on a MacBook. Also, OP clearly states his code is slow on the MacPro while processing data, and the Google result was about CPU usage during idle.

    OP, OS X is tremendously slow and inefficient. That aside, is your code written with parallel processing in mind? The fact that the MacPro has more cores than your MacBook, together with the call you are seeing heavy usage in, probably indicates you will want to look at how you are handling data concurrently. Are you locking a lot of semaphores?

    Without more information as to what your code is doing I'm grasping at straws here.
  4. macrumors 6502a


    Without a lot more detail it's impossible to know where the culprit is. Two different computers may have two totally different installs of the dev tools.

    First make sure both environments are identical, e.g. that Xcode is the same version on both and configured the same way (Xcode installs several versions of gcc, g++ and now the LLVM compilers, and also sets the default version to use). Which version of Mac OS X is on each machine?

    Is this a command line tool? Is it built with a makefile?

    More details needed...
  5. macrumors G5


    Eric Schlegel's response gives a very clear hint what the OP is doing, and what he needs to change. If you don't see it, tough.
  6. macrumors newbie


    Ask this question (with the version numbers mentioned as necessary above) on stackoverflow.com - they have many more solid programmers there who can help without being snarky.
  7. macrumors 603


    I hardly ever see anyone being snarky on this forum. If you are referring to a comment I made which apparently has now been deleted, I was attempting to get the poster to provide an argument to substantiate their point which I feel was necessary given the rather ridiculous nature of the claim.
  8. macrumors 65816


    If you are finding that most of the (wall) time is spent in ml_set_interrupts_enabled - it means your program's threads are blocked on something - IO/Condition Var/something else.

    Could your program have a scalability glitch somewhere? The Mac Pro has more CPUs than the MacBook, and if your program creates more threads on the Mac Pro and gets stuck most of the time on synchronization or something else, that may explain the problem you are seeing.

    That's just one possibility though - without actually seeing what your program does it is hard to tell what the problem is.
  9. macrumors G5


    A good way to slow down an application is to call select() with a zero timeout as often as you can. Or check the event loop as often as possible, etc. Both are also good ways to call ml_set_interrupts_enabled a lot.
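    To make the select() point concrete, here is a minimal sketch, assuming a POSIX system. The pipe, the writer thread, and the names wait_by_polling, wait_by_blocking and compare_waits are hypothetical and only for illustration; nothing here is taken from the OP's code. It waits for the same byte twice, once by calling select() with a zero timeout in a loop and once by passing a null timeout so the kernel puts the thread to sleep.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <thread>

// Calls select() on a single descriptor. A null timeout blocks until the
// descriptor is readable; a zero timeval makes select() return immediately.
static int select_readable(int fd, timeval* timeout) {
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    return select(fd + 1, &readable, nullptr, nullptr, timeout);
}

// Busy-polls with a zero timeout; returns the number of select() calls made.
long wait_by_polling(int fd) {
    long calls = 0;
    for (;;) {
        timeval zero = {0, 0};
        ++calls;
        if (select_readable(fd, &zero) > 0) return calls;
    }
}

// Sleeps in the kernel until data arrives; returns the number of select() calls.
long wait_by_blocking(int fd) {
    long calls = 0;
    for (;;) {
        ++calls;
        if (select_readable(fd, nullptr) > 0) return calls;  // loops again only on error/EINTR
    }
}

struct WaitCounts { long polling; long blocking; };

// Hypothetical demo: a writer thread sends one byte after delay_ms, and the
// caller waits for it first by polling, then by blocking.
WaitCounts compare_waits(int delay_ms) {
    WaitCounts counts{};
    for (int mode = 0; mode < 2; ++mode) {
        int fds[2];
        int rc = pipe(fds);
        assert(rc == 0);
        (void)rc;
        std::thread writer([&] {
            std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
            char byte = 'x';
            ssize_t written = write(fds[1], &byte, 1);
            (void)written;
        });
        if (mode == 0) counts.polling = wait_by_polling(fds[0]);
        else           counts.blocking = wait_by_blocking(fds[0]);
        writer.join();
        close(fds[0]);
        close(fds[1]);
    }
    return counts;
}
```

    Calling compare_waits(50) should report a single select() call for the blocking wait and a very large number for the polling wait, all of them system calls that do no useful work.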
  10. macrumors newbie

    Thanks all for the replies. I'll try to summarize the answers.

    I did that before posting and found that specific discussion, but it wasn't helpful for my case.

    No. However, my code is built on PETSc, which is a library designed for parallel computing. But nowhere in my code is there any use of parallel computation (at least not intentionally).

    They are both on OS X 10.5.7; the MacPro has Xcode 3.1, the MacBook Xcode 3.0. I use Eclipse as my IDE with Makefile building.

    Why am I claiming something ridiculous? I'm sorry, but we aren't born professors, so I must learn from my mistakes.

    Is there any specific information I can provide?

    So is it Shark that causes the slowdown while profiling? Anyway, the code is also slow when I do not profile.
  11. macrumors 603


    I was referring to Darkwing, not you. Sorry I should have made that clearer.
  12. macrumors 65816


    Can you use activity monitor and tell us the overall CPU usage on both machines for your program? Also can you tell us what the memory usage of the program looks like? Is swap usage increasing on the MacPro while your program runs?

  13. macrumors G5


    No, Shark isn't slowing down anything. Shark tries to sample about 1000 times per second which instruction is executing. When interrupts are disabled, it can't sample: It will wait until interrupts are enabled again and sample the next instruction after that. And that will be in 'ml_set_interrupts_enabled'. So your code doesn't spend 23% of its time in 'ml_set_interrupts_enabled'. It spends 23% of its time with interrupts disabled, and when they get enabled it attributes the time to 'ml_set_interrupts_enabled'.

    Your code is calling something like crazy that eventually enables and disables interrupts, which is likely an enormous waste of time. Try using Shark to find out what calls 'ml_set_interrupts_enabled' so often and find out how to avoid that.
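    The attribution effect described above can be illustrated with a toy model. This is only a sketch of the explanation in this post, not of how Shark is actually implemented: the Slice type, the one-slice-per-sampling-tick timeline, and the function name kernel_work are all hypothetical. Samples that fall due while interrupts are disabled are deferred and charged to whatever runs when interrupts come back on.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One sampling tick of a made-up timeline.
struct Slice {
    std::string function;     // what is executing during this tick
    bool interrupts_enabled;  // whether the sampling interrupt can be delivered
};

// Charges each tick to a function. Ticks that fall while interrupts are
// disabled are deferred and charged to the next slice with interrupts enabled.
// Ticks still deferred at the end of the timeline are dropped.
std::map<std::string, int> attribute_samples(const std::vector<Slice>& timeline) {
    std::map<std::string, int> hits;
    int deferred = 0;
    for (const Slice& slice : timeline) {
        if (!slice.interrupts_enabled) {
            ++deferred;  // the sampler cannot fire here
            continue;
        }
        hits[slice.function] += 1 + deferred;  // own tick plus the backlog
        deferred = 0;
    }
    return hits;
}

// Fraction of all delivered samples that were charged to `function`.
double sample_share(const std::map<std::string, int>& hits, const std::string& function) {
    int total = 0;
    for (const auto& entry : hits) total += entry.second;
    assert(total > 0);
    auto it = hits.find(function);
    return it == hits.end() ? 0.0 : static_cast<double>(it->second) / total;
}
```

    In this model a timeline with two ticks of kernel_work under disabled interrupts, followed by one tick of ml_set_interrupts_enabled, charges all three samples to ml_set_interrupts_enabled and none to kernel_work, which is why the profile points at the wrong function.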
  14. macrumors newbie

    Thanks for the explanation about interrupts. It would be difficult to locate who calls them though since I never heard about "interrupts" before in my life. I'll try my best.


    CPU ~97% (on a single core I suppose)
    2 Threads
    No swap increase


    CPU ~160%
    2 Threads
    No swap increase
  15. macrumors newbie

    I attached the Sampler (in Instruments) to my process and got that 58% is spent in the following callstack:

  16. macrumors G5


    I guess the idea is that you want to spend 99.9% in glvmDoWork, but not in pthread_cond_wait.

    Check in your code how often glvmDoWork calls pthread_cond_wait. It probably does it much, much, much too often. Set a breakpoint on the call to pthread_cond_wait, then step over it, then have a look at how much useful work your code does before it calls pthread_cond_wait the next time.

    Basically it looks like your code is spending all its time having one or more threads talking to each other, instead of doing anything useful.
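    One way to put a number on "how much useful work per wait" is to count both. The sketch below is a hypothetical producer/consumer pair written with std::condition_variable (which is typically built on pthread_cond_wait on POSIX systems); run_pipeline, WaitStats and the batch_size parameter are made-up names, and this is not the code behind glvmDoWork. The consumer drains everything available each time it wakes, and the producer signals once per batch rather than once per item.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

struct WaitStats {
    long items_processed = 0;
    long waits = 0;  // how many times the consumer blocked on the condition variable
};

// Hypothetical pipeline: a producer thread queues total_items integers in
// batches, and the calling thread consumes them while counting its waits.
WaitStats run_pipeline(int total_items, int batch_size) {
    assert(total_items >= 0 && batch_size > 0);
    std::mutex m;
    std::condition_variable cv;
    std::deque<int> queue;
    bool done = false;
    WaitStats stats;

    std::thread producer([&] {
        for (int next = 0; next < total_items;) {
            {
                std::lock_guard<std::mutex> lock(m);
                for (int i = 0; i < batch_size && next < total_items; ++i)
                    queue.push_back(next++);
            }
            cv.notify_one();  // one wake-up per batch, not per item
        }
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        cv.notify_one();
    });

    std::unique_lock<std::mutex> lock(m);
    for (;;) {
        while (queue.empty() && !done) {
            ++stats.waits;
            cv.wait(lock);
        }
        if (queue.empty()) break;  // producer finished and queue drained
        std::deque<int> batch;
        batch.swap(queue);         // take everything that is available
        lock.unlock();
        stats.items_processed += static_cast<long>(batch.size());  // stand-in for real work
        lock.lock();
    }
    lock.unlock();
    producer.join();
    return stats;
}
```

    Dividing items_processed by waits gives a rough "work per wait" figure. If that ratio is close to one in a real program, the threads are mostly talking to each other.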
  17. macrumors 68040


    I haven't dealt with threading too much, so I don't have too much to add to this... but what has struck me as strange here is that the MacBook has multiple cores, so it should be able to have both threads running just as well as the Mac Pro. Maybe it's that the Mac Pro is going to have plenty of cores available for running your code, while the MacBook might be less likely to have both cores free. Anyhow, just an observation, since there were no single-core MacBooks produced.

  18. macrumors G5


    Lee, consider a situation where one thread per core is created, but only one task. On the dual core MacBook two threads are talking to each other, one saying "I am busy", the other saying "I've got nothing to do". On the eight core MacPro eight threads are talking to each other, one saying "I am busy", and seven saying "I've got nothing to do". The seven idle threads will, if things are programmed badly enough, keep the one thread from doing any actual work, so the MacPro will end up slower.

    It is hard to say what is actually happening, but your CPU time spent in functions like pthread_cond_wait should be very close to zero, not > 50%. So that is a sign that there is much too much communication between threads, and nothing else.
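    For the "one thread per core, but only one task" situation, a common remedy is to never start more workers than there is work. The following is a minimal sketch under that assumption; choose_worker_count and run_tasks are hypothetical helpers, not part of PETSc or of any Apple library, and we still don't know whether this is what the OP's program actually does.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Never more workers than tasks, never more than cores, and at least one.
std::size_t choose_worker_count(std::size_t task_count, std::size_t core_count) {
    if (core_count == 0) core_count = 1;  // hardware_concurrency() may return 0 if unknown
    return std::max<std::size_t>(1, std::min(task_count, core_count));
}

// Runs every task exactly once and returns how many worker threads were started.
std::size_t run_tasks(const std::vector<std::function<void()>>& tasks) {
    const std::size_t workers_wanted =
        choose_worker_count(tasks.size(), std::thread::hardware_concurrency());
    assert(workers_wanted >= 1);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < workers_wanted; ++w) {
        workers.emplace_back([&tasks, &next] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= tasks.size()) return;  // nothing left: exit instead of idling
                tasks[i]();
            }
        });
    }
    for (std::thread& t : workers) t.join();
    return workers_wanted;
}
```

    With one task, choose_worker_count(1, 8) gives a single worker on the eight-core machine, so there are no idle threads left to get in the way of the busy one.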
