Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Would the following objective-C class be thread safe?

Code:
@interface Foo : NSObject
{
	long _bar;
}
- (long)bar;
- (void)setBar:(long)newBar;
@end

@implementation Foo
- (long)bar;
{
	return _bar;
}
- (void)setBar:(long)newBar;
{
	_bar = newBar;
}
@end

Now, in practice, I think that the line _bar = newBar; would compile to a single instruction. (Could somebody confirm that?) In that case, there's no danger of the thread blocking during the reassignment of _bar. In other words, the class is thread safe despite not using locks, because the critical code, when compiled, is within the level of granularity of the thread switching.

However, in theory, isn't it bad practice to make assumptions about the compiler? The code would be broken if a compiler came along that compiled _bar = newBar; into two instructions, however unlikely that may be.

Does anyone have any thoughts on this? Also, does the same apply to all basic C data types?

Thanks,

Milo
 

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
Your code is thread safe as you only have one instruction in each method. A store and a load.

If for instance your code was written as such

- (long)bar;
{
printf("Returning _bar");
return _bar;
}
- (void)setBar:(long)newBar;
{
printf("Setting bar to %ld\n", newBar);
_bar = newBar;
printf("bar has been set\n");
}

This is no longer thread-safe. Between the start of each of the methods and each instruction, the method could be suspended and another thread executed. If the setBar and bar methods were being run by different NSThread or pThread invocations, the instance variable bar would be in an unknown state. To get around this, you would need to add a global mutex object, and then put a lock/unlock around critical points in the code. This stops the scheduler from attempting to suspend a thread when in a critical block.

And yes, this applies to all basic data types, actually C only has two fundamental datatypes. Integer and Floating-Point. Everything else is just an extension or reduction of one them.
 

gnasher729

Suspended
Nov 25, 2005
17,980
5,565
Would the following objective-C class be thread safe?

Code:
@interface Foo : NSObject
{
	long _bar;
}
- (long)bar;
- (void)setBar:(long)newBar;
@end

@implementation Foo
- (long)bar;
{
	return _bar;
}
- (void)setBar:(long)newBar;
{
	_bar = newBar;
}
@end

Now, in practice, I think that the line _bar = newBar; would compile to a single instruction. (Could somebody confirm that?) In that case, there's no danger of the thread blocking during the reassignment of _bar. In other words, the class is thread safe despite not using locks, because the critical code, when compiled, is within the level of granularity of the thread switching.

However, in theory, isn't it bad practice to make assumptions about the compiler? The code would be broken if a compiler came along that compiled _bar = newBar; into two instructions, however unlikely that may be.

Does anyone have any thoughts on this? Also, does the same apply to all basic C data types?

Thanks,

Milo

It isn't threadsafe at all. You didn't make _bar volatile. That means the compiler is allowed to assume (incorrectly) that nobody else reads or writes _bar.

Lets say that initially _bar = 1, one thread sets it to 2, another thread sets it to 3.

First problem: Another thread reads _bar in a loop to see if it changes. Because _bar is not volatile, the compiler can assume that it doesn't change unless the running thread changes it. And as the running thread doesn't change it, the compiler can assume that it never changes and even remove the test. Optimising compilers do that kind of thing regularly.

Second problem: Assume a thread reads _bar, stores it in a local variable x, and uses x later. Meanwhile another thread changes _bar. Because _bar is not volatile, the compiler can assume that it doesn't change. So when the thread accesses x again, the compiler might not even have bothered copying _bar to x and instead might read _bar again since it is the same value. Optimising compilers do that.

And it goes on...

Once you make it volatile, there's the question: What exactly do you mean by treadsafe? If thread 1 sets _bar to 2, and thread 2 sets _bar to 3 almost simultaneously, what will the value of _bar be?
 

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
No C compiler today takes any notice of the keyword volatile. The compiler makes the decision about whether it would put this variable in a register or in memory.
 

gnasher729

Suspended
Nov 25, 2005
17,980
5,565
And yes, this applies to all basic data types, actually C only has two fundamental datatypes. Integer and Floating-Point. Everything else is just an extension or reduction of one them.

There are for example 64 bit integers (no atomic operations on 32 bit PowerPC or Intel chips), there are bitfields (have fun modifying one bit field while another thread changes a nearby bitfield), there are structs and unions (any idea what happens if two threads copy to the same struct simultaneously ?) And would you want to make any bet what happens if two threads try to modify a 64 bit floating-point number that is stored straddling two VM pages, with each thread causing a page fault for a different page?

"volatile" is your friend. Mutexes are your friend. Atomic operations are your friend. Go to developer.apple.com and do a search for "atomic operations" which will show you things like OSAtomicAdd32 etc.
 

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
Oh well, it's good to see that his program is attempting to write to a device, like a card reader isn't. ;)
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Once you make it volatile, there's the question: What exactly do you mean by treadsafe? If thread 1 sets _bar to 2, and thread 2 sets _bar to 3 almost simultaneously, what will the value of _bar be?

By thread safe, I just mean that the method -bar can't return a corrupt value due to be called during the assignment _bar = newBar;. The kind of problem you're describing is down to good design, and no amount of locking will help.

No C compiler today takes any notice of the keyword volatile. The compiler makes the decision about whether it would put this variable in a register or in memory.

Ok ... so does that mean that compilers now always assume that variables are volatile, and never presumes to cache them?

Could someone suggest a good link or book where I can read about safe multithreaded design? I understand the basics of thread safety in relation to objects but I'm not confident that I know all the ins and outs of thread safety in general. I'm a self-taught programmer with no proper CS training, so I'm really quite ignorant about these things.

In fact, in case someone feels like critiquing my code directly (I'd be very grateful), I'll attach it to this post. Basically, I've developed a light-weight AppleScript bridge. I've designed my two main classes, MBAppleScriptObject and MBAppleScriptApplication, to be thread safe, but I'm not sure I've thought of everything. Any comments about the code would be appreciated, this is the first serious coding I've done.
 

Attachments

  • MBAppleScriptBridge.zip
    33 KB · Views: 53

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
The way I described will protect you from corrupt data if stored in memory. The lock guarantees that only one thread is reading or writing that data. If this could be subverted, then no OS running today would stay up for a few minutes before crashing horribly.

For volatile, I made a mistake. Yes compilers will take in account volatile as long as you don't tell the compiler to use optimistation, say -O2, this flag attempts to stuff everything into registers and only return data from an external device, etc without trying to read the device. Since Apple use -Os volatile should be used. You won't need to use volatile unless you are reading from a device that is being updated externally, a sensor for example. Volatile forces the program to read the devices actual result into memory every time the variable is accessed. This hoses the performance of an app, if you are using volatile everywhere.

Coer Mac Programming is a good book, it does assume a lot of knowledge from the reader though. There are some good pthread tutorials online as well, and NSThread uses pthreads internally.
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
The way I described will protect you from corrupt data if stored in memory. The lock guarantees that only one thread is reading or writing that data.

I'm afraid I don't see why I'd need a lock in the example you posted. Surely, with or without the printf comments, the only critical section of code is _bar = newBar;, which doesn't need a lock because of the atomic nature of the assignment? Surely the number of instructions in the methods themselves doesn't matter, only the number of instructions in the critical code blocks?

Also, where did all this talk of volatile variables come from if they're only necessary for accessing variables that can also be modified externally? (I realise you didn't start it Gareth!) I'm getting confused...
 

HiRez

macrumors 603
Jan 6, 2004
6,250
2,576
Western US
If a long is 64 bits when this is run on a 32-bit system, isn't that a potential problem? I always understood that reading and writing longer than 32-bit primitives (like a double) was not thread-safe because the thread could be pre-empted between modifying the high and low portions, but maybe that's antiquated?
 

csubear

macrumors 6502a
Aug 22, 2003
613
0
the class is thread safe despite not using locks, because the critical code, when compiled, is within the level of granularity of the thread switching.

All other things put aside, you're still not safe if this code is ever run on a machine with more than one processor.

Thread Safe == Locks for Shared data.
 

gnasher729

Suspended
Nov 25, 2005
17,980
5,565
I think that long is always 32-bits. Long long is always 64-bits. I may be wrong.

"long" is always at least 32 bits. When compiling in XCode for PPC64 or Intel64, "long" is 64 bits. That will break a lot of code, but not this one here, because a 64 bit long will be a single unit, not two units as a 64 bit number on a 32 bit processor would be.
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Ok, so the consensus is that my original code is not thread safe, neither in practice nor in theory.

"volatile" is your friend. Mutexes are your friend. Atomic operations are your friend. Go to developer.apple.com and do a search for "atomic operations" which will show you things like OSAtomicAdd32 etc.

Which of these would you use to correct my code? I know how to use a mutex lock to protect the variable, but is there a less expensive way, using "volatile" and/or atomic operations? Which atomic operation would I even use for this - I've found the definitions in OSAtomic.h, but I can't find any decent documentation.

Can anyone recommend a source that will explain all these things?? Apple's Multithreading Guide is pretty brief.
 

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
gnasher I already amended my post about volatile. And no I didn't confuse any keywords in 'C'.

Your code will be thread-safe on a multiprocessor machine if you only have one execution path through it. As in a single threaded client process. But if the client or server is multi-threaded, then OS X cannot guarantee the order in which the threads actually execute and without a pmutex lock, the critical code may be pre-emtped. In one server bar is executed to retrieve that variable, and before the return statement, that thread is suspended. The next thread is executed and it runs setBar, which gives bar a new value, the first thread is then put into a run state and returns bar. Depending on your program type, say a counter function to show much process has been performed on a task, you might just want to ignore that problem. On the other hand, if it was a mission critical task, where you should always assume the variables are in a valid state then you need to use mutex locks, and maybe the volatile keyword, if you still have problems with inconsistent values.

You will get this problem even with a single core machine, but it less likely to happen, since two threads can't execute at the same time on current PPC and Intel processors for the Mac I believe.
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Thanks for the explanation. However, I don't see how a lock would help - the value of bar could still be changed after I release the lock and before the return value from -bar; is used in my code. If I needed to be sure that the return value was up to date I'd have to put a big lock around the code that calls -bar; and uses the result, which is another matter.

Your point seems to apply equally to two threads running on one processor as it does to two threads running on two processors (if I understand it correctly!).

By the way, according to apple's docs you don't need to use the volatile keyword if you're using a lock.

All other things put aside, you're still not safe if this code is ever run on a machine with more than one processor.

I think I follow this now - two threads on two processors could be accessing the variable simultaneously, which is not good. So I do need a lock (or atomic functions?) after all. Huh. It sounds obvious now. :eek:

I've seen the use of atomic functions discouraged, so I think I'll be using NSLock.
 

gnasher729

Suspended
Nov 25, 2005
17,980
5,565
Ok, so the consensus is that my original code is not thread safe, neither in practice nor in theory.



Which of these would you use to correct my code? I know how to use a mutex lock to protect the variable, but is there a less expensive way, using "volatile" and/or atomic operations? Which atomic operation would I even use for this - I've found the definitions in OSAtomic.h, but I can't find any decent documentation.

Can anyone recommend a source that will explain all these things?? Apple's Multithreading Guide is pretty brief.

First, your example was a bit unusual; it just cannot work right in a multithreaded environment. Lets say you start two threads, one will try to set a variable x to 1 at exactly midnight tonight, and the other will try to set the same variable x to 2 at exactly the same time. In some cases (for example if x is 64 bit on a 32 bit system) you might end up with a mixture of both numbers, but even in the best case one thread will change x before the other, so you don't know whether x will be 1 or 2 one second after midnight.

Things like "atomic add" are quite useful. Lets say you want to increase x by one. Your processor reads x, adds 1 to the number, writes the result back to x. Say x = 100, and two threads try to do this at exactly the same time. If you are lucky, one thread finishes before the other starts, so x is correctly increased twice. If you are unlucky, the first thread reads x = 100. The second thread reads x = 100. The first thread calculates the new value 101, so does the second thread, then both write 101 to x. x is increased only once. An atomic add (I think it is OSAtomicAdd32) will increase x, and will not let anyone else interfere with it. So if both threads call OSAtomicAdd32 to increase x, it is guaranteed that x is increased twice.

One operation that is often used is "change x from a to b". You could for example read x, which might be 100, then call a function "change x from 100 to 101". There are two possibilities: If x was still 100 when you called that function, it will get changed to 101. If someone else changed it between reading 100 and calling the change function, it will not be changed again, and the function reports an error. In that case you just try again: Read x again (which might be 99 or 101 or something else at this point), add 1, try to change it again, until you succeed.

On a single processor system, things can go wrong, but they usually go wrong less often (which means any bugs just so much harder to find). Lets say you still have two threads running, but they cannot run simultaneously. First thread reads x = 100 and wants to change it to 101. JUST BEFORE it writes x back, the operating system could switch to another thread. And that thread could be running for many milliseconds, and could change x thousands of times! Then when the OS switches back to the first thread, that thread writes the number 101 - which is likely completely wrong!

About VOLATILE: The volatile keyword in C tells the compiler that a value can change unexpectedly (for example because it is modified by another thread), or that something could read the value (for example another thread reading it). If you change x, and x is volatile, then the compiler will use an instruction to change x _exactly_ where it should according to your code. And when you read x, the compiler will read x from memory _exactly_ where it should according to your code.

As an example: If you have int x; int y; y = x; then obviously x and y have the same value, and your compiler will assume that they are the same. If you then write z = y; the compiler might actually store x into z, because it knows x and y are the same. If another thread changes x or y in between, your code will not notice. If you make x and y volatile, then the compiler _must_ store the value of x into y when you write y = x; and not any later, and it _must_ read y again when you write z = y; . So volatile is essential to make multithreaded code work (and it makes code slower that doesn't need it).

(Hope this doesn't come up twice, MacRumors forgot my login while I was typing).
 

lazydog

macrumors 6502a
Sep 3, 2005
709
6
Cramlington, UK
With regard to your original code, I would say yes it is technically thread safe so long as the read/writes of _bar are atomic on your particular machine (32bit v 64bit etc). Without seeing the context that the class is used from I think that's the most that can be said about it probably.

For example if you have one thread which can read and write to _bar and another thread which only ever reads from _bar then you don't need any locks so long as the the read/writes are atomic for your machine.

If on the other hand for you're using _bar as a counter in a client/server situation, you might have multiple threads accessing the counter by reading bar(), incrementing the value, then writing it back with setBar(). In this situation you will need a lock around the bar()/setBar() pair.

Hope this helps

b e n
 

MarkCollette

macrumors 68000
Mar 6, 2003
1,559
36
Toronto, Canada
No C compiler today takes any notice of the keyword volatile. The compiler makes the decision about whether it would put this variable in a register or in memory.

Question: Which two C keywords did you confuse just now?

"register" and "volatile"


The very premise of this thread breaks my programming rules of thumb. Let's say that the variable can be accessed atomically, and that the algorithm for accessing it is correct, and that the interactions between readers and writers to this field can be guarranteed to be correct, and there's no bugs in GCC, and no wierd side-effect from whatever combination of multiple cores and hyperthreading or whatever the next hardware optimisation there is. Then the code is still brittle. Any future changes to any of the readers or writers could break this implicit contract of behaviour. Really, if you have state that needs to be shared between threads, just simply do the locking.
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Thanks for the detailed help gnasher, and for your input lazydog. I suppose my original example was stated too minimally to make any clear judgements about thread safety. Certainly, if I were to use _foo as a counter, it would be unsafe by design.

Really, if you have state that needs to be shared between threads, just simply do the locking.

This seems like good sense. The only thing I'm still confused about is the volatile keyword ... do I need it if I'm using a lock?

For instance:

Code:
@interface ThreadSafeFoo : NSObject
{
	BOOL _bar;
	NSLock *_barLock;
}
// ... declarations ...
@end

@implementation ThreadSafeFoo

- (id)init
{
	if ([super init] == nil) return nil;
	_barLock = [[NSLock alloc] init];
	return self;
}

- (BOOL)bar
{
	BOOL localBar;
	[_barLock lock];
	localBar = _bar;
	[_barLock unlock];
	return localBar;	// Or do something else with it, potentially
}

- (void)setBar:(BOOL)newBar
{
	[_barLock lock];
	_bar = newBar;
	[_barLock unlock];
}

// override dealloc like a good boy...
@end

Take a look at - (BOOL)bar;. My instance variable _bar could be changed by another thread as soon as I unlock _barLock. Is it conceivable that the compiler could optimise such that if _bar changes, localBar effectively changes too? Should I therefore declare _bar as a volatile variable to stop this happening?

Gnasher, you seem to be saying that I should. I'm just baffled as to why this isn't mentioned in apple's documentation, or on cocoadev, etcetera. I can find plenty of other sources advocating the use of volatile, but nothing specifically mac, cocoa, or objective-c related.

PS. For those who looked at my MBAppleScriptBridge source code, I realise now that I was using @synchronized in completely the wrong way. Oops. Fixed.
 

gnasher729

Suspended
Nov 25, 2005
17,980
5,565
This seems like good sense. The only thing I'm still confused about is the volatile keyword ... do I need it if I'm using a lock?

...
Take a look at - (BOOL)bar;. My instance variable _bar could be changed by another thread as soon as I unlock _barLock. Is it conceivable that the compiler could optimise such that if _bar changes, localBar effectively changes too? Should I therefore declare _bar as a volatile variable to stop this happening?

It's a bit complicated (as usual). The C language itself doesn't know anything about threads. It assumes that only one thread is running, and it assumes that usually nothing can change a variable or notice that a variable is changed, and that is what "volatile" is for: It tells the compiler that a variable could be written or read by something other than the currently running thread (that could be another thread, it could also be some hardware).

C and Objective C don't really know what "Lock" and "Unlock" mean. "Lock" and "Unlock" are just function calls to the compiler, and the compiler has no idea what the function call does. _bar could be changed by another thread as soon as you call Unlock, and the compiler doesn't care. On the other hand, the compiler doesn't have the source code for "Unlock" and as far as it knows, "Unlock" itself _might_ change _bar. You know it doesn't, and I know it doesn't, but the compiler doesn't know that. Therefore, the compiler can _not_ assume that _bar is unchanged, and _must_ return localBar.

One word of caution with locking in general: Even if you use a mutex, another thread can still change _bar - if it doesn't use the same mutex! That is of course a bug in that code. You should _always_ minimise the number of places in your code that can change a variable like _bar (it will just make program maintenance easier in the long run), but with multiple threads it is essential. If you only every change _bar in one function, and that function uses a mutex, you are safe.
 

lazydog

macrumors 6502a
Sep 3, 2005
709
6
Cramlington, UK
Take a look at - (BOOL)bar;. My instance variable _bar could be changed by another thread as soon as I unlock _barLock.

Yes this could happen. The locks as you have them ensure that the value read or written to _bar is not corrupted by another thread also writing to _bar. That's all. Until you know how your class is being used the placements of these locks may or may not be sufficient to ensure your application is thread safe. You need to look at the wider picture to answer that question and also to come up with a solution.

Here is an example of what I mean:-

value = [foo bar]; // Get the current value of_bar
… // do some calculations using value, eg value += value ;
[foo setBar: value] ; // Write the value of _bar

Here we have the possibility that another thread could write a value to _bar inbetween the [foo bar] and [foo setBar:value] calls. If this happens it would invalidate the result of the intermediate calculation. So the locks within the methods 'bar' and 'setBar' are not sufficient - you need locks around the above code. If the above code is the only point of access to your class then you could do away with your original locks.

For the moment I would forget about using 'volatile', I think it's just confusing the real issues you have with multithreading.

By the way there is C type called sig_atomic_t which is guaranteed to read or write in a singal processor instruction.

Hope this has helped


b e n
 

Nutter

macrumors 6502
Original poster
Mar 31, 2005
432
0
London, England
Sorry for dragging up this thread from its watery grave, but I'm still a little unsure about this. Perhaps now that I know a little more, I can better define my question.

Let's say I have the following code:

Code:
@interface Foo : NSObject
{
	id _object;
}

- (id)object;
- (void)setObject:(id)object;
@end

@implementation Foo

- (id)object;
{
	return _object;
}

- (void)setObject:(id)object;
{
	@synchronized(self)		// Any kind of mutex will do.
	{
		if (_object == nil)
			_object = [object retain];
	}
}

@end

My question: Is the -object method safe, to the extent that it will always return a valid and well-defined value, either nil or the value of _object once it has been set?

In other words, is the assignment of _object atomic?

And, is the answer any different if we are in 64-bit mode?
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.