MYSTERY: OS X shell processes hanging, CPU load 100%+!

Discussion in 'macOS' started by byronic, Mar 29, 2017.

  1. byronic macrumors newbie

    Joined:
    Mar 29, 2017
    Location:
    Kansas City, MO
    #1
    Hello everybody! This is my first post. Sorry it has to be about a difficult, no-good issue with my MacBook Pro running OS X 10.10.5 (Yosemite), although I believe it applies to other versions too, as a friend of mine has virtually the same issue.

    For a few months now I've had a problem where, anywhere from 6 to 36 hours after a system reboot, my mid-2012 13" MacBook Pro (OS X 10.10.5) starts to heat up, the fans become loud, and the system becomes unresponsive. Activity Monitor and htop display loads well above 200% (that is, more than both cores of its Core i5 processor), and the battery drains very quickly until I kill the applications that are hanging and eating all the CPU. The applications are exclusively shell applications - that is, applications you would normally run from the Terminal or that would be launched in the background by crontab - and are often the same few suspects: bash, cat, sed, find, groff (when trying to open a man page with 'man'), etc. Once the problem starts, these applications hang every time they're run UNTIL a system reboot, at which point things work fine again. This is a very serious problem because many critical system features, including installations and daily maintenance tasks launched from crontab, depend on these programs.

    I did some digging and DTrace'ing when I found that not even a shell would launch under Terminal unless I hit CTRL+C during its load (in which case it didn't finish loading its ".profile" or "rc" files), and discovered that /usr/libexec/path_helper was hanging at the launch of each shell process. I replaced path_helper with a program that simply exits. Now the shell launches fine, but the same programs still hang: 'man', program installations, VMware Fusion virtual machine loads (because of what's going on in the background), etc. I can kill the processes causing the problem and get VMware to load its VMs from time to time, but other things just break.

    I am at an absolute loss for WHY this is happening. It doesn't seem to have any rhyme or reason. At first, it seemed to only happen when Dropbox would run with the presence of a symlink to an encrypted volume mounted in /Volumes/. I fully uninstalled Dropbox, removed symlinks and rebooted and things seemed fine for a few days. There, I thought - it was Dropbox being buggy and corrupting something. But alas my fortune was not to last long for the same processes started hanging yet again. I have no idea what causes this. Again, a reboot will alleviate things for several hours but invariably after no longer than 36 hours or so things return to their home in Malfunctioningville.

    The odd thing is that my friend experiences similar behavior. His laptop is a much newer 2015 MacBook Pro and he runs either El Capitan (10.11) or macOS Sierra (10.12). He too thought it might have been Dropbox, but after he uninstalled the suspected culprit the problems persisted, just as in my experience.

    Anyone have any idea what could be causing this? Could it be some library that becomes corrupt over a period of.. wait, that doesn't make any sense. Seriously I'm at a loss. I've DTraced things down to the system call level to see what libraries were being loaded, what was being execve()'d, and whatever else was happening just prior to the hang. I can't find much that's consistent, except that path_helper would hang (which, again, has been replaced with a binary that just calls exit(0)) and that a select set of shell tools/programs hang.

    As for other solutions, I've tried running software like "Applejack" to check and fix permissions, corrupted caches/plists, defunct/non-existent links, system service / plist / launchd / library entries, etc. While that took care of some other slight nagging issues I had with my Mac, it didn't fix this. (I ran Applejack both with the system running as normal AND in single-user mode as instructed.)

    Has anyone had a similar experience? Anyone know what in the hell could be going on? I'm at a complete loss here.

    Thank you in advance for any ideas / support!
     
  2. BrianBaughn, Mar 29, 2017
    Last edited: Mar 30, 2017

    BrianBaughn macrumors 603

    Joined:
    Feb 13, 2011
    Location:
    Baltimore, Maryland
    #2
    My understanding is that Applejack isn't supported past Mac OS X 10.6.8 and the developer hasn't updated it since then. You should probably check this sort of thing before you start running third-party stuff on your computer.

    What else have you run on that MBP that could possibly be unsupported in Yosemite? One thing you can run is EtreCheck and post the report here for members to peruse. There might be a clue in there.

    Also, you could try creating and logging into a new User Account to see if the problem is limited to your regular one.
     
  3. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    Location:
    Sailing beyond the sunset
    #3
    If you've replaced /usr/libexec/path_helper with an executable that does nothing, then when processes try to set their PATH and MANPATH environment variables, nothing will happen. That could leave those env vars with unexpected values.

    Here's example output from running path_helper directly:
    Code:
    PATH="/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin"; export PATH;
    MANPATH="/usr/share/man:/usr/local/share/man:/usr/X11/share/man"; export MANPATH;

    The man page for path_helper says this:
    The path_helper utility reads the contents of the files in the
    directories /etc/paths.d and /etc/manpaths.d and appends their contents
    to the PATH and MANPATH environment variables respectively. (The MANPATH
    environment variable will not be modified unless it is already set in
    the environment.)

    Files in these directories should contain one path element per line.

    Prior to reading these directories, default PATH and MANPATH values are
    obtained from the files /etc/paths and /etc/manpaths respectively.

    It seems to me that if you somehow have damaged or corrupted files in either of the /etc/*.d directories, then path_helper could well go crazy.
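
    One way to look for that kind of damage is sketched below. It is only a rough heuristic, not something path_helper itself does: the function names check_path_files and check_one_path_file are made up for illustration, and it assumes that every line in these files should be an absolute path to a directory that exists. It uses only shell builtins, so it does not depend on cat, sed or the other tools that have been hanging.

```shell
#!/bin/sh
# Hypothetical helpers (not part of macOS): flag suspicious lines in the
# files that path_helper reads. Assumption: each line should be an absolute
# path to an existing directory.

check_one_path_file() {
    file=$1
    lineno=0
    # Read line by line; the "|| [ -n ... ]" part keeps a final line that
    # lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        case $line in
            "") echo "$file:$lineno: empty line" ;;
            /*) if [ ! -d "$line" ]; then
                    echo "$file:$lineno: no such directory: $line"
                fi ;;
            *)  echo "$file:$lineno: not an absolute path: $line" ;;
        esac
    done < "$file"
}

check_path_files() {
    for target in "$@"; do
        if [ -d "$target" ]; then
            # A directory such as /etc/paths.d: check every regular file in it.
            for f in "$target"/*; do
                if [ -f "$f" ]; then check_one_path_file "$f"; fi
            done
        elif [ -f "$target" ]; then
            check_one_path_file "$target"
        fi
    done
}

# Example invocation on the affected Mac:
# check_path_files /etc/paths /etc/paths.d /etc/manpaths /etc/manpaths.d
```

    A missing directory is not necessarily a fault, so the output is a list of leads rather than a diagnosis.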


    If path_helper is completely taken out, and PATH is a safe value, then my next guess is that some library has been damaged or corrupted. Yes, this can happen if you have some file-system damage that went unrepaired, or if you just happen to have the bad luck of a bad block in a system library. It's even possible the bad block is intermittently bad.

    I don't have an easy way to determine which library it might be, short of comparing to a known-good version on another file-system.

    You should definitely run a hardware test:
    https://support.apple.com/en-us/HT202731

    I had a problem that I thought was a bad disk, that turned out to be a bad RAM stick. The hardware test found it.

    The fact that your problems are intermittent suggests running the RAM test for an extended period of time. RAM defects can be thermally triggered.


    If you have a bootable external drive of some kind, now would be the time to break it out and plug it in. Confirm that it boots correctly and doesn't exhibit the problem. If it does, then something else might be wrong, or it could be the external drive just has a defective OS on it.

    If the external drive works reliably, then you could use the 'cmp' command to compare every library on the known-good external drive with the same lib on the defective drive. The 'find' cmd will come in handy here.
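
    A minimal sketch of that comparison, assuming both volumes carry the same OS version and build (otherwise nearly every library will differ). The function name compare_trees and the volume name in the usage line are hypothetical, and file names containing newlines are not handled.

```shell
#!/bin/sh
# compare_trees GOOD_ROOT SUSPECT_ROOT   (hypothetical helper)
# Walks every regular file under GOOD_ROOT with 'find' and compares it
# byte for byte, using 'cmp -s', against the file at the same relative
# path under SUSPECT_ROOT.
compare_trees() {
    good=$1
    suspect=$2
    # The cd happens in a subshell, so relative arguments keep working
    # in the loop below.
    ( cd "$good" && find . -type f -print ) | while IFS= read -r rel; do
        if [ ! -f "$suspect/$rel" ]; then
            echo "MISSING: $rel"
        elif ! cmp -s "$good/$rel" "$suspect/$rel"; then
            echo "DIFFERS: $rel"
        fi
    done
}

# Example (volume name is an assumption), known-good tree first:
# compare_trees /Volumes/KnownGood/usr/lib /usr/lib
```

    Only files that are missing or differ are printed, so an empty result means the two trees matched.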

    Or if you're not that interested in finding the exact cause, you could just restore the OS from the last backup or from the known-good version on the external drive.


    Regarding Dropbox, or any other addition, it's entirely possible that installing it modifies a system file (e.g. one of the files in the /etc/*.d directories used by path_helper), and that the modification is not removed when Dropbox (or whatever) is uninstalled.

    IIRC, files in /etc are not under the protection of SIP, since there's a lot of reconfigurable stuff that lives there, and SIP has to accept even misconfigured files, as it has no semantic context on which to judge the correctness of reconfigs.
     
  4. byronic thread starter macrumors newbie

    Joined:
    Mar 29, 2017
    Location:
    Kansas City, MO
    #4
    Applejack seemed to run fine, and the web site says the latest version supports versions past 10.4. Several "mysterious" problems including random application crashes at load were fixed. So, no problems there.

    But I'm not sure what some third-party application running unsupported on Yosemite has to do with the problem I described - can you be more clear? I will check out Etrecheck.

    Thanks
    - byronic
    --- Post Merged, Mar 29, 2017 ---
    I understand what path_helper does - I "disabled" it because it hangs at start up during the period of time that my system is malfunctioning. path_helper refuses to ever exit and hangs forever, and tracking down every "dot-file" for every shell that executes it was proving to be a pain (although I did it for bash & zsh), so I just disabled it.

    The path to the man pages is fine on my system even without path_helper being run, as I copied everything from /private/etc/paths.d/ into my zshrc and bashrc / profiles. This is not why the processes are hanging.
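
    For anyone wanting to do the same without copying entries by hand, here is a rough sketch of building the value in the rc file itself. The function name build_path is made up, and it does not try to reproduce everything path_helper does (it ignores whatever is already in PATH and does not remove duplicates). It is written for sh and bash; zsh reports an error on an unmatched glob by default, so an empty paths.d directory would need handling there.

```shell
#!/bin/sh
# build_path DEFAULTS_FILE DROPIN_DIR   (hypothetical helper)
# Joins the non-empty lines of DEFAULTS_FILE, then of each regular file in
# DROPIN_DIR, into one colon-separated string. Only shell builtins are
# used, so no external process is spawned.
build_path() {
    joined=
    for src in "$1" "$2"/*; do
        [ -f "$src" ] || continue
        while IFS= read -r entry || [ -n "$entry" ]; do
            [ -n "$entry" ] || continue
            # Prefix a colon only when something is already in the list.
            joined="${joined:+$joined:}$entry"
        done < "$src"
    done
    printf '%s\n' "$joined"
}

# Example use in ~/.bashrc or ~/.profile:
# PATH=$(build_path /etc/paths /etc/paths.d); export PATH
# MANPATH=$(build_path /etc/manpaths /etc/manpaths.d); export MANPATH
```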

    I will run more exhaustive dtrace "sensors" and see what pops up when /usr/libexec/path_helper is run.

    I think there's a misunderstanding though - it's NOT path_helper in particular that's the problem. It's just one of many applications that spike CPU load to 50%+ each (consuming well over 200% when several are running, such as cat, find, sh/bash, groff, etc.), and because it runs at shell initialization it was causing all user shells (not shell scripts as far as I could tell) to hang indefinitely.
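
    To catch the runaway processes as a group rather than one at a time, something like the following sketch could be used. The function names filter_hot and hot_processes are hypothetical, the default threshold of 50 comes from the per-process figure above, and it assumes ps prints the CPU percentage with a dot as the decimal separator.

```shell
#!/bin/sh
# filter_hot THRESHOLD   (hypothetical helper)
# Reads "PID %CPU COMMAND" lines on stdin and prints those whose %CPU is
# at or above THRESHOLD.
filter_hot() {
    awk -v min="$1" '$2 + 0 >= min + 0 { print }'
}

# hot_processes [THRESHOLD]   (hypothetical helper)
# Lists every process at or above THRESHOLD percent CPU (default 50).
# The trailing "=" after each ps keyword suppresses the header line.
hot_processes() {
    ps -A -o pid=,pcpu=,comm= | filter_hot "${1:-50}"
}

# Example:
# hot_processes 50
```

    Logging this output every minute or so after a reboot would show which process goes wrong first, which might narrow down the trigger.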

    It COULD be a RAM problem indeed. A few months back I disassembled my MacBook Pro entirely to replace the "upper" because of a malfunctioning keyboard and trackpad, and during disassembly it's possible that my RAM was knocked out of its seating, that the contact pins got even dirtier (the inside of the computer was pretty filthy), or that it was damaged in some way. These models have replaceable SODIMMs, and the RAM modules are like 1/6th the size of the entire logic board. They're easy to accidentally screw with when disassembling.

    I will try memtest86 or something similar when I get a chance.

    But, alas, my friend has a very similar problem to mine, and his laptop is much newer, has not been disassembled, and has the latest macOS install. I do want to figure out what in the world the problem is because I've never seen an issue like this - the RAM being the culprit would be quite the expected outcome, but I looked at other causes first. I will report back after I test the RAM. In the meantime, if anyone has any ideas, please let me know.

    Thank you!
     
  5. r.munz, Apr 24, 2017
    Last edited: Apr 24, 2017

    r.munz macrumors newbie

    Joined:
    Mar 19, 2014
    #5
    I am at exactly the same point as you. I also had the feeling that Dropbox had something to do with it, but although I am not using it anymore, the problem still persists. The symptoms are exactly the same: path_helper hanging at terminal launch and a few others running wild from time to time: cat, sh, man, awk, troff, grotty and so on.
    How did you perform a full uninstall of Dropbox?
    Did you memtest your system? What was the outcome of the test?
    I am running a MBP 13" Mid 2012 with 10.10.5. So far I have not been able to figure out the cause of my issue, and this is the first relevant post I have found on the internet. Any help is much appreciated.
     
  6. r.munz, Apr 25, 2017
    Last edited: Apr 29, 2017

    r.munz macrumors newbie

    Joined:
    Mar 19, 2014
    #6
    UPDATE: After applying the following fix I am pretty sure that I have solved the issue! My MBP has now been running for more than two days and the tests are all good.

    I simply rebooted the machine from the recovery partition and ran a disk permission repair on the boot disk. Diskutil found some corrupted preferences that I suspect were the ones messing with the shell applications. Here is a brief list of what was fixed:

    private/var/root
    private/var/root/Library
    private/var/db/GPURestartReporter (kernel panics due to the GPU were also not being reported)
    private/var/db/displaypolicyd
    private/var/db/lockdown

    After the repair the system looks much more stable and generally faster. No application hangs so far and no extreme CPU usage. I think I might have solved it. Try it out and let me know.

    UPDATE: The system fell back into the same state as before. One process suspected of triggering the hang of the shell processes is Spotify, which was blocked by Little Snitch. I will do some more research on it and post my results here.

    UPDATE: After an SMC reset and a couple of days of testing, all seems to be working fine!! I think I have finally solved the issue - WRONG: the problem still persists.
     
  7. r.munz, May 18, 2017
    Last edited: May 18, 2017

    r.munz macrumors newbie

    Joined:
    Mar 19, 2014
    #7
    After endless tries and tests I can finally say that I have put an end to this horrible nightmare of suffering and sorrow. My experiments lasted almost two weeks and they were really painful to carry out because, as is clear from the main post, the problem would arise an unpredictable amount of time after boot. My tests included all of the following:
    - Apple Service Diagnostic for the specific laptop in use: MacBook Pro 13" mid 2012. All the hardware tests passed successfully.
    - Memtest. All passed without errors.
    - Disk checks and repairs. Tools used: Disk Utility, DriveGenius 3, Disk Warrior 5.
    - SMC firmware update.

    Having established that the machine was healthy hardware-wise, I began my ride into the software side. Although reluctant to go for a fresh install, I decided that this was the best (and cleanest) way to move on. Please note that from the first clean install on, the problem could no longer be detected by opening a terminal window and observing the path_helper and bash processes hanging. Although the Terminal was still operational, other shell processes (called by other applications) were still misbehaving badly. Instead of the Terminal, I used the following free apps as probes to detect the problem: Tunnelblick (an OpenVPN client for OS X) and OnyX (utility software which requires admin privileges to run). At their start-up the following processes hung with 50% CPU usage each: sh, sed, cat, lsof, groff. They are often processes owned by root. A reboot or logout/login solved the issue, temporarily. The tests included (but were not limited to) all of the following:

    - Yosemite clean install on a test external hard disk. Boot from external hd. The problem still persisted.
    - Removed the internal drives from the laptop (a system boot SSD and a user data HD) and booted from the external test partition. The problem still persisted.
    - Installed a new SSD in the MacBook Pro and performed a clean install of Yosemite on the new drive. Booted from the newly installed SSD with no external HD mounted. The problem still persisted.
    - Borrowed a second MacBook Pro from a friend (15" 2011 model) and booted it from the first external hard disk (Yosemite), the one that had reproduced the problem on my laptop (13" mid 2012). I kept both my laptop and the borrowed one running for the same amount of time since boot (basically I turned them on at the same time). While after some time my machine showed the usual problem, the borrowed machine did not!!

    At this point I started to believe that the problem was hardware-specific and could be attributed to the combination of a specific OS version (10.10.5 updated with the latest security update) and the laptop model (again: mid 2012 13"). In other words: I think I have found a deep, hidden bug which Apple is not even aware of.

    All these tests were video-logged because I couldn't believe what was happening and thought my mind was playing tricks on me (including being afraid of viruses and rootkits).

    So I have decided to move on and make a clean install of El Capitan to my laptop. Well, guess what?

    The problem has disappeared!

    Wizardry? Black sorcery? No!! It is a f***** BUG!

    I used to think that Apple products were very reliable because the software was specifically designed for proprietary hardware and that Apple things used to WORK no matter what. But this horrible experience taught me otherwise.

    I hope that by sharing this I can make other people's lives much easier.

    A side note: I have observed that El Capitan seems to be much faster and more responsive. This, at least, until the next update which will brick stuff for specific models....

    Peace to you all.
     
  8. r.munz macrumors newbie

    Joined:
    Mar 19, 2014
    #8
    Here are the full specs of my MBP:
    i5 2.5 GHz
    16 GB DDR3 1600 MHz
    128 GB Samsung 850 SSD
    1TB WD Red
     
