Help: How to debug complete UI lockup




micsaund
Feb 11, 2007, 12:54 PM
ARRGGH!! Well, it just happened to me again (MB 2.0GHz, 2GB).

I was watching a video in iTunes and surfing the web (Firefox) and the entire OS X GUI locked up as I clicked the Activity Monitor. I had Mail, Adium, and Skype open in the background.

Open program summary: iTunes, Firefox 2.0.0.1, Activity Monitor, Adium, iCal, Mail, Skype.

The mouse still moved, but everything else was frozen. I have the Activity Monitor in the Dock, and the "Activity Monitor" text that appears when you hover over the icon was frozen in place. The iTunes video stopped updating, although the audio continued to play.

When this happens, it appears that the OS is still chugging, as the audio was playing and the podcast got marked as played when it was complete, but I cannot do *anything* with the GUI and eventually, I have to hold the power button to force a reset.

The machine was rebooted yesterday (about 24 hours ago) and has been "undocked" from my external monitor/keyboard/mouse and then re-docked this morning before the lockup happened.

Is there some way I can debug this when the GUI is frozen? Maybe I could telnet in from another machine, but then what?

Has anyone else seen this on the MB? Think maybe it's Skype (these problems are fairly recent and my Skype install is recent, but it seems an unlikely candidate to cause this problem)? Maybe a Flash issue from Firefox, as I had just been viewing a bunch of Sunday newspaper ads which are in Flash? Any other ideas about the source of the problem or how I can go about digging into this?

Edit: At this time, I cannot force the issue to happen and I don't have an Apple Store nearby (70+ miles away) that I can swing by and chat with (even if there wasn't a line of 20 people at any given time ;) )

Thanks,
Mike



mkrishnan
Feb 11, 2007, 01:03 PM
Generally, it's very rare for a normal user application that runs in a window on top of OS X to bring the user interface down like that.

Are you using any plugins that modify basic system behavior, such as alternate mouse / trackpad drivers, things that change the visual appearance of OS X, etc?

If not, I would say a better candidate is actually the Dock acting up. You may have a corrupt Dock plist file, although I'm not sure which one(s) to delete. When the Dock goes down, it does cause the symptoms you describe. The next time it happens, also try cmd-opt-esc to bring up the Force Quit window and see if anything is listed as non-responsive in there.

micsaund
Feb 11, 2007, 01:06 PM
Yes, in the System Prefs->Other I have:

Default Apps
Fan Control
OpenBase (?)
SteerMouse

I think I've tried the Force Quit key sequence, although I probably had it wrong since I'm on a Winbloze style keyboard when docked. I will re-try the Force Quit next time for sure (just tried it and it's Win-Alt-Esc for anyone else who needs it).

mkrishnan
Feb 11, 2007, 02:13 PM
Mmmm, I wouldn't really expect OpenBase to be doing this. SteerMouse is a possibility -- in the distant past, other input drivers like Sidetrack occasionally caused problems like this (the Sidetrack issue has long since been resolved). I'd suggest getting rid of it temporarily to debug, except that the problem doesn't happen consistently, which will make debugging it hard....

micsaund
Feb 11, 2007, 04:52 PM
This may be unrelated, but I've had some app crashes recently also, and I just found the Log Browser thingy and see this in the crash reports:

Firefox:
Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0x05aaf817


Mail:
Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0x0296a000

These crashes (which happened at different times) came after I had been using LOTS of RAM. I blew them off as something going wonky with swap/paging (an address calculation problem between the swap-out and swap-in, or something), but maybe it's all part of the same problem.

Also, around the crash, windowserver.log says:
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXOrderWindowList: Empty window list
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowDepth: Invalid window -1
Feb 11 10:29:37 [63] kCGErrorIllegalArgument: CGXOrderWindowList: Empty window list
Feb 11 11:29:36 [63] kCGErrorIllegalArgument: CGXOrderWindowList: Empty window list
Feb 11 11:41:08 [60] Server is starting up

11:41 is when I did the forced reboot.
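
For anyone wading through a windowserver.log like the one above, here is a minimal Python sketch that collapses runs of identical messages so the repeated CGXGetWindowShape lines don't bury everything else. The sample lines are copied from the excerpt above; the function name `collapse_repeats` and the regex are made up for illustration, and the layout (timestamp, a bracketed number, then the message) is only assumed from these few lines.

```python
import re
from itertools import groupby

# Sample lines copied from the windowserver.log excerpt in this post.
SAMPLE = """\
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXOrderWindowList: Empty window list
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowShape: Invalid window -1
Feb 11 10:19:14 [63] kCGErrorIllegalArgument: CGXGetWindowDepth: Invalid window -1
Feb 11 11:41:08 [60] Server is starting up
"""

# Assumed layout: timestamp, a bracketed number, then the message text.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \[(\d+)\] (.*)$")


def collapse_repeats(text):
    """Return (first_timestamp, message, count) for each run of identical messages."""
    parsed = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            parsed.append((m.group(1), m.group(3)))
    runs = []
    # groupby only merges consecutive equal messages, which is what we want here.
    for message, group in groupby(parsed, key=lambda p: p[1]):
        items = list(group)
        runs.append((items[0][0], message, len(items)))
    return runs


if __name__ == "__main__":
    for ts, message, count in collapse_repeats(SAMPLE):
        print(f"{ts}  x{count}  {message}")
```

Only consecutive duplicates are merged, so the order of events around the lockup is preserved.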

system.log:
Feb 11 10:19:55 sobek KernelEventAgent[34]: tid 00000000 received VQ_NOTRESP event (1)
Feb 11 11:41:06 localhost kernel[0]: hi mem tramps at 0xffe00000
Feb 11 11:41:06 localhost kernel[0]: PAE enabled
Feb 11 11:41:06 localhost kernel[0]: 64 bit mode enabled
...more rebooting hardware info from the restart.


Thanks for the help so far, mkrishnan! I will uninstall SteerMouse if I keep seeing this and cannot find anything else to point a finger at. If only Apple would include multi-button 3rd party mouse support in the OS mouse driver, I wouldn't need such things :mad:

Mike

mkrishnan
Feb 11, 2007, 06:35 PM
That is definitely strange...

I found this:

http://forums.macosxhints.com/showthread.php?t=60932

I'm not sure how well it applies, though. Are you saying that an application crashes in these cases without bringing the system down, or that you get a kernel panic ultimately resulting in the text you provided? I'm assuming it's just the app... if multiple apps are triggering a manageable kernel-level problem...hmmm, I'm actually not sure what that might mean. :(

micsaund
Feb 11, 2007, 07:40 PM
I may have just stumbled onto something.

In one of the console or system.log files, I noticed an entry about an AFP share that I have in the basement not responding right before the lockup.

Well, I just got the lockup again a few minutes ago, and this time I yanked the ethernet cord (I had wireless disabled already) and after a few seconds, the UI came back! I plugged the cord back in, my rsync backup started, and then things locked up again until I pulled the plug (which yielded these windowserver.log entries):
Feb 11 18:27:03 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.
Feb 11 18:27:51 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.
Feb 11 18:28:24 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.

Here is some info from system.log about the AFP stuff:
Feb 11 18:28:03 sobek kernel[0]: AFP_VFS afpfs_Reconnect: Restoring session /Volumes/NASRAID1
Feb 11 18:28:04 sobek kernel[0]: AFP_VFS afpfs_Reconnect: primary reconnect failed 5, trying secondary /Volumes/NASRAID1


So, it might be something with the AFP share. I'd think that OS X would handle a wonky AFP share better than locking everything up, but maybe not. Ever seen this kind of behavior from a bad/crashed share?

To test, I will operate without that AFP share mounted and see if it happens again. I think it's looking better, though, after seeing that I could bring the UI back to life by killing the network connection.
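
A rough way to check whether the AFP reconnects really line up with the Dock stalls is to pair entries from the two logs by timestamp. This is just a sketch in Python: the log lines are copied from above, the helper names `parse_ts` and `nearby` and the 30-second window are arbitrary choices, and the year 2007 is assumed from the post dates because the logs don't record one.

```python
from datetime import datetime

# Lines copied from the windowserver.log and system.log excerpts in this post.
WINDOWSERVER = [
    'Feb 11 18:27:03 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.',
    'Feb 11 18:27:51 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.',
    'Feb 11 18:28:24 [60] kCGErrorFailure: CGXDisableUpdate: UI updates were forcibly disabled by application "Dock" for over 1 second. Server has re-enabled them.',
]
SYSTEM = [
    "Feb 11 18:28:03 sobek kernel[0]: AFP_VFS afpfs_Reconnect: Restoring session /Volumes/NASRAID1",
    "Feb 11 18:28:04 sobek kernel[0]: AFP_VFS afpfs_Reconnect: primary reconnect failed 5, trying secondary /Volumes/NASRAID1",
]


def parse_ts(line, year=2007):
    """Parse the 15-character syslog-style timestamp at the start of a line.

    The logs omit the year, so one is supplied (2007 is an assumption
    taken from the post dates).
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def nearby(events_a, events_b, window_seconds=30):
    """Return (ts_a, ts_b, delta_seconds) for every pair within the window."""
    pairs = []
    for a in events_a:
        for b in events_b:
            delta = (parse_ts(b) - parse_ts(a)).total_seconds()
            if abs(delta) <= window_seconds:
                pairs.append((a[:15], b[:15], delta))
    return pairs


if __name__ == "__main__":
    for ts_a, ts_b, delta in nearby(WINDOWSERVER, SYSTEM):
        print(f"Dock stall {ts_a}  <->  AFP event {ts_b}  ({delta:+.0f}s)")
```

Entries that land close together don't prove the share is the cause, but they make it easier to see which stalls have an AFP event next to them and which don't.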

EDIT a bit later: I just un-docked my external monitor and keyboard/mouse, and the machine woke back up during the unplugging and stayed on (lid shut), which locked it up as well, with some of the same messages minus the AFP stuff. ARGH! I suppose I can live with shutting down to dock/un-dock if that's (hopefully) the only problem, but I still shouldn't have to. I hope I don't have to format and reinstall, as that would be rather Windows-ish after only 1.5 months of having the machine.

Just as an aside to show how badly things lock up when this happens, here are some system.log entries from the lockup period. They show that the kernel is still alive and logging, but the Dock or something else is badly hosing IPC or something equally major:
Feb 11 18:27:37 sobek diskarbitrationd[41]: firefox-bin [1179]:11815 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: mdimportserver [1095]:33371 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: mdimportserver [1095]:24879 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: coreaudiod [40]:28675 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: iCal [206]:25603 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SteerMouse Manager [202]:24579 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Activity Monitor [203]:19519 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Adium [200]:23555 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Google Notifier [198]:22019 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Dock [193]:21507 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Finder [196]:15955 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SystemUIServer [194]:19971 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SystemUIServer [194]:17923 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: loginwindow [68]:13315 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: ATSServer [67]:7739 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: coreservicesd [64]:9731 not responding.
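
To get a quick list of which processes diskarbitrationd flagged, the entries above can be boiled down with a short Python sketch. The sample lines are copied from the log above; the function name `stalled_processes` and the regex are hypothetical, and the bracketed number is only assumed to be the process ID (I don't know what the number after the colon means, so it is ignored).

```python
import re

# A few lines copied from the system.log excerpt in this post.
SAMPLE = """\
Feb 11 18:27:37 sobek diskarbitrationd[41]: firefox-bin [1179]:11815 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: mdimportserver [1095]:33371 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: mdimportserver [1095]:24879 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SteerMouse Manager [202]:24579 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: Dock [193]:21507 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SystemUIServer [194]:19971 not responding.
Feb 11 18:27:37 sobek diskarbitrationd[41]: SystemUIServer [194]:17923 not responding.
"""

# Assumption: the bracketed number after the name is the PID.
# Names may contain spaces ("SteerMouse Manager"), hence the lazy match.
NOT_RESPONDING = re.compile(
    r"diskarbitrationd\[\d+\]: (?P<name>.+?) \[(?P<pid>\d+)\]:\d+ not responding\."
)


def stalled_processes(text):
    """Map PID -> process name for every 'not responding' entry, first seen first."""
    found = {}
    for line in text.splitlines():
        m = NOT_RESPONDING.search(line)
        if m:
            # setdefault keeps one entry per PID even when it is listed twice.
            found.setdefault(int(m.group("pid")), m.group("name"))
    return found


if __name__ == "__main__":
    for pid, name in stalled_processes(SAMPLE).items():
        print(f"{pid:>6}  {name}")
```

Run against the full excerpt, this would show everything from firefox-bin down to loginwindow and coreservicesd stalled at the same second, which fits the idea that they are all blocked on one shared thing rather than each failing on its own.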