10.6 Server: My ACLs seem to be broken

Dec 18, 2012, 08:13 PM
I've got a small office file server (a Mini server running 10.6.8 Server, configuration imported from a 10.5 Xserve) that I'm having some really, really weird permissions issues with.

One of the directories on a share on the server is supposed to be read/write-able by a bookkeeping user group, but not readable by the broader general staff user group (this is a custom staff group, not the system default one). I did this by creating an ACL for the folder (via the browser in Server Admin) with Full Control permission for the desired group, and then below it the staff group with deny Full Control, then set inherit to everything below.

That worked fine for literally years.

Then, suddenly, a few days ago, people could no longer modify or delete folders that they created within that folder. When I checked the permissions on created folders, they were somehow getting created without "delete" allowed, which made no sense, but I assumed that something had gone wonky and tried every combination of reboots, re-setting permissions, re-propagating them, etc. that I could think of.

Finally I re-created a fresh user group for the Bookkeepers (new GID, short and long name), deleted the old one entirely, used the command line and sudo to purge the ACL from the top-level folder entirely, and re-added the desired permissions.

Still no luck--now I can create new folders, but cannot rename or move a folder I have just created, although I can delete it. The "Effective Permissions" browser in Server Admin shows my user as having full permissions for the folder in question to do everything, I've logged out and back on to make sure it's not a cache issue, and I've run out of ideas short of an OS reinstall.

The command line says I have the following permissions, which as far as I can tell are identical to directories I can edit the name of and move:
inherited allow list,add_file,search,add_subdirectory,delete_child,readattr,writeattr,readextattr,writeextattr,readsecurity,writesecurity,chown,file_inherit,directory_inherit

versus this for a folder I CAN edit:
inherited allow list,add_file,search,delete,add_subdirectory,delete_child,readattr,writeattr,readextattr,writeextattr,readsecurity,writesecurity,chown,file_inherit,directory_inherit

...the notable difference in there being lack of "delete" permissions on the problem directories. Which is bizarre, because that group is set to "full control", and I CAN delete it--just not move or rename. (Perhaps that's the "delete_child" of the parent directory allowing me to do that?)
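
For anyone wanting to double-check a comparison like this, the two permission lists can be diffed mechanically instead of by eye. This is only a minimal Python sketch, assuming the two ACE lines are pasted in as strings with any stray spaces from the paste removed; the function name ace_permissions is made up for illustration.

```python
def ace_permissions(ace):
    """Return the set of permission names from one ACE line (as shown by `ls -le`)."""
    # The comma-separated permission list is the last whitespace-delimited field.
    return set(ace.split()[-1].split(","))


# The two ACE lines quoted above.
problem = ("inherited allow list,add_file,search,add_subdirectory,delete_child,"
           "readattr,writeattr,readextattr,writeextattr,readsecurity,"
           "writesecurity,chown,file_inherit,directory_inherit")
working = ("inherited allow list,add_file,search,delete,add_subdirectory,"
           "delete_child,readattr,writeattr,readextattr,writeextattr,"
           "readsecurity,writesecurity,chown,file_inherit,directory_inherit")

missing = ace_permissions(working) - ace_permissions(problem)
extra = ace_permissions(problem) - ace_permissions(working)
print(sorted(missing))  # ['delete']
print(sorted(extra))    # []
```

So "delete" really is the only thing that differs between the two entries.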

Is there something I'm missing here? What the heck is going on?

Les Kern
Dec 20, 2012, 08:53 AM
In my 15 years never seen this... looks like reverting to POSIX... did you recently add new users?.... try this...
Delete the original group and leave that new group you made, make a new group with a different name or even better use the ORIGINAL name, then nest the new group in there. Add one of the users to POSIX as well. See if the number of users affects it. Should just take a few minutes to test.

Dec 20, 2012, 04:18 PM
Thanks for the suggestions, Les.

I'll try the exact procedure you suggest, but I can already partially confirm that it's not simply reverting to POSIX permissions.

The POSIX permissions on the folder in question (and its children, having been propagated and confirmed via Server Admin and the command line) are "rwxrwx--- admin admin Other". And my personal user is in the system admin group (GID 80), yet I'm having exactly the same "can't move/rename folders" issues as everybody else. If it were pure POSIX, I should have full rwx permissions (and most of the other users shouldn't even be able to do a directory listing, since they're not server admins).
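
To spell out that reasoning, here is a rough Python sketch of which POSIX permission class would apply if only the mode bits counted (no ACLs). The owner UID and the user UIDs/GIDs other than admin's GID 80 are made-up values for illustration, and posix_class_bits is a hypothetical helper, not anything from the OS.

```python
import stat


def posix_class_bits(mode, owner_uid, owner_gid, uid, gids):
    """Return the rwx string of the one POSIX class (owner/group/other) that applies."""
    perms = stat.filemode(mode)[1:]  # strip the file-type character, e.g. 'rwxrwx---'
    if uid == owner_uid:
        return perms[0:3]   # owner class
    if owner_gid in gids:
        return perms[3:6]   # group class
    return perms[6:9]       # everyone else


folder_mode = stat.S_IFDIR | 0o770   # drwxrwx---
ADMIN_GID = 80                       # system admin group, as mentioned above
OWNER_UID = 501                      # assumed UID of the owning admin account

# A user in the admin group (like me) vs. a user who is not (assumed IDs).
print(posix_class_bits(folder_mode, OWNER_UID, ADMIN_GID, uid=1025, gids={20, 80}))  # rwx
print(posix_class_bits(folder_mode, OWNER_UID, ADMIN_GID, uid=1026, gids={20}))      # ---
```

Under pure POSIX the admin-group user gets rwx and the non-admin gets nothing, which is not what is happening here.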

I also forgot to mention that the Finder is getting the same permissions weirdness; if I connect directly to the server and use its Finder to move a folder within that folder, I'm prompted for an Admin password in order to be able to do so, as you'd expect for trying to do something your user doesn't have permission to do (and if I give one, it does move). So the sudo override works.

As for getting rid of the old groups:

I had already deleted the original user groups--I did that immediately after creating the new ones. However, if I try to re-create a group with the same name or same shortname, I get an error that a group with that name already exists. So while the groups have been deleted (they do not appear in Workgroup Manager, and ACLs that included the groups showed a garbage name until deleted), they are apparently still lingering in the internal database, at least to the point that they cannot be recreated by Workgroup Manager.

Les Kern
Dec 20, 2012, 07:54 PM
Maybe clear it all out by selecting "Show 'All Records' tab and inspector" in the Workgroup Manager prefs, which gives you the target icon tab.

Perhaps it's almost time to save the docs and re-format? I usually spend a few hours on an issue like this, and if it's a heavy production server go back to a previous good clone or just start from scratch.

Dec 21, 2012, 05:31 PM
So that's interesting.

Even within the "all records" tab, the phantom groups are not visible--only the ones I expect to be there are.

If I use dscl to show the full list of groups on the local node (.), as opposed to the local LDAP node, I do see one of the two phantom groups there (it doesn't appear in the local LDAP according to dscl). So apparently it got added as a local group somehow. The other phantom group, however, I can't find in either -list shown by dscl, which is even stranger.
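
A rough Python sketch of that check, for anyone repeating it: list the groups on each directory node with dscl and see which nodes still contain a supposedly deleted name. The LDAP node path /LDAPv3/127.0.0.1 and all group names below are assumptions for illustration; list_groups only works on a Mac with dscl, so the example data is hard-coded.

```python
import subprocess


def list_groups(node):
    """Return group record names from `dscl <node> -list /Groups` (macOS only)."""
    out = subprocess.run(
        ["dscl", node, "-list", "/Groups"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}


def find_lingering(deleted_names, listings):
    """Map each supposedly deleted group name to the nodes that still list it."""
    return {
        name: sorted(node for node, groups in listings.items() if name in groups)
        for name in deleted_names
    }


# Hard-coded stand-ins for list_groups(".") and list_groups("/LDAPv3/127.0.0.1").
listings = {
    ".": {"admin", "staff", "bookkeeping"},
    "/LDAPv3/127.0.0.1": {"officestaff", "bookkeepers2"},
}
print(find_lingering(["bookkeeping", "oldstaff"], listings))
# {'bookkeeping': ['.'], 'oldstaff': []}
```

On the server itself, the listings dict would be built as {node: list_groups(node) for node in (".", "/LDAPv3/127.0.0.1")}. An empty list for a name means no node reports it, which matches the second phantom group described above.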

I also noticed a phantom user (apparently imported from a much older server install but long since deleted) in the local system admin (80) group, which is a little odd.

Clearly something got hosed in the directory itself, and given how far back some of the oddities go, I'm skeptical that even reverting to a recent backup would do any good--something tells me that the issues would just reoccur. If I can get a clean export out of the LDAP directory and get a clean install of the OS, then reimport just that, it seems like I'd be in good shape, but that will be annoying in terms of reconfiguring other services (particularly since the server has a few hard-to-install odds and ends, like a license server with USB dongle, running on it). It wouldn't be as bad if I hadn't just gotten done reinstalling the OS a couple of months ago due to some completely different oddities.

At least it doesn't look like the problem is with the data volume; having to reformat and re-clone that would be even more annoying than a server OS reinstall.

My plan is to try and migrate to 10.8 Server over the vacation and see what that does for me. I don't think we need any of the services that it doesn't have, and it's cheap enough to at least be worth trying as an experiment.

Les Kern
Dec 23, 2012, 11:24 AM
Good luck! We know what you're going through!

Jan 7, 2013, 12:33 PM
Just a bit of a follow-up in case anybody runs across this trying to troubleshoot something similar.

I ended up installing a completely fresh copy of 10.8 (+Server) on another partition and NOT importing any settings from the old 10.6 Server install. I then exported the list of users from Workgroup Manager, imported them into the 10.8 Workgroup Manager, re-set passwords, and recreated the groups I wanted.

At that point I could set the ACLs and everything worked properly. Interestingly, on the "orphaned" group ACLs on the volume (which show as a hash instead of group name under 10.8), the permissions indeed lacked "delete" (and only delete) which is what was causing the biggest issues for me; it was apparently just that 10.6 absolutely refused to set this correctly (or show it in the inspector). It was almost like the default umask had gotten screwed up somehow (although other things had obviously also gone wrong).