Many (most?) Mac OS X apps will not work properly when multiple instances are run at the same time. There are various technical reasons for this, which I won't go into. Some apps work and others don't, and there's no way to predict which it will be.
The last time I did any serious EQ, I used a sweep, which ToneGen will do. I did not spend the extra effort to do separate A/B tone testing.
ToneGen can write audio files, so generate a series of single tones and save each one to a separate file. Then open the files in QuickTime Player, where you can play them back individually or together.
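If you'd rather script the tone files than export them one at a time, here is a rough sketch in Python using only the standard library. To be clear, this is not how ToneGen does it; it's a stand-alone alternative. The function names, the 44.1 kHz sample rate, the 0.5 amplitude, and the file naming are all my own assumptions, so adjust to taste.

```python
import math
import os
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed; pick whatever your setup uses)

def sine_samples(freq_hz, seconds, amplitude=0.5):
    """Return a list of 16-bit integer samples for one sine tone."""
    count = int(SAMPLE_RATE * seconds)
    peak = int(amplitude * 32767)  # scale to the signed 16-bit range
    return [int(peak * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(count)]

def write_wav(path, samples):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

def write_tone_files(freqs, seconds=2.0, directory="."):
    """Write one WAV file per frequency; file names are hypothetical."""
    paths = []
    for freq in freqs:
        path = os.path.join(directory, "tone_%dHz.wav" % freq)
        write_wav(path, sine_samples(freq, seconds))
        paths.append(path)
    return paths
```

Calling something like write_tone_files([100, 1000, 10000]) should leave three WAV files in the current directory, which QuickTime Player can open in separate windows.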
Or you could use Audacity, which is a full-featured audio editor. It's definitely overkill for this task, and because it's a cross-platform program it has a "not like a Real Mac App" look, but it's not hard to figure out how to do the basics. For example, you could use its built-in signal generator to make alternating A/B tones.
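For what it's worth, an alternating A/B tone file can also be built without any audio editor. The following is only a sketch in Python with the standard library, not anything Audacity does internally; the function names, sample rate, segment length, and repeat count are assumptions I made up for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed)

def tone(freq_hz, seconds, amplitude=0.5):
    """Return 16-bit integer samples for one sine tone."""
    count = int(SAMPLE_RATE * seconds)
    peak = int(amplitude * 32767)
    return [int(peak * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(count)]

def ab_sequence(freq_a, freq_b, segment_seconds=1.0, repeats=4):
    """Concatenate tone A then tone B, repeated the given number of times."""
    samples = []
    for _ in range(repeats):
        samples.extend(tone(freq_a, segment_seconds))
        samples.extend(tone(freq_b, segment_seconds))
    return samples

def write_wav(path, samples):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

For example, write_wav("ab_test.wav", ab_sequence(440, 880)) would give eight seconds of alternating tones. The segments are butted together with no fade, so you may hear a small click at each switch; a short fade in and out on each segment would smooth that over.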
I also suggest filing a feature request with the makers of ToneGen. Generating A/B tones, or at least being able to easily mute and unmute individual tones, seems like it would be a good feature.
Good luck.