
mbabauer

macrumors regular
Original poster
Feb 14, 2006
I am a Java developer starting Cocoa. I have worked through a couple of books, and would like to start working on some applications, starting with some audio applications. I have an idea for an application that mixes various audio feeds (mic, MP3 audio files, line in, AC3 files, etc.) into a single MP3 feed. The problem is: where to start?

I tried reading some on Core Audio, but it doesn't seem to cover what I was hoping it would. In essence, there are a few sample programs, and only one that handles MP3, and it shells out to play the actual file.

Can anyone tell me where I can get started? I am used to the JavaDoc format, and I am finding it rather tough to find info in the Xcode documentation.
 
mbabauer said:
I am a Java developer starting Cocoa. I have worked through a couple of books, and would like to start working on some applications, starting with some audio applications. I have an idea for an application that mixes various audio feeds (mic, MP3 audio files, line in, AC3 files, etc.) into a single MP3 feed. The problem is: where to start?

I tried reading some on Core Audio, but it doesn't seem to cover what I was hoping it would. In essence, there are a few sample programs, and only one that handles MP3, and it shells out to play the actual file.

Can anyone tell me where I can get started? I am used to the JavaDoc format, and I am finding it rather tough to find info in the Xcode documentation.
CoreAudio is what you want, but the documentation for it is extremely poor and scattered. One of the main files, AudioUnitProperties.h, is not even indexed by the Xcode help system, so most things in it don't show up in Xcode. This includes AUAudioFilePlayer, which is what you probably want for playing MP3 files. I have to use Spotlight to search for bits in this and other files (Xcode will still open those files). If you change the Xcode Help search method to "Full-Text Search" instead of "API Search", you can find some more information, but still not all of it.

The best source for CoreAudio information is probably the CoreAudio Mailing List, which is searchable here ("coreaudio-api" is the name you want in the "Listname matches" field). Even here though, it can be tough to find answers, and you have to weed through a lot of stuff to find what you're looking for. You can also sign up for the list and receive emails.

Finally, be prepared to put in a lot of time to get anything done. The CoreAudio API is C and C++ only; there is no Cocoa (or, AFAIK, Java) interface to it. I would recommend you use the AUGraph API to make it a bit easier. The basic method is this:

1. Specify the audio units you will be using with component descriptions and create an AUNode for each.
2. Connect the nodes together how you want (for example, effects node to mixer node to output node).
3. Grab references to each node that you need to manipulate. This will give you an AudioUnit instance for each that you can operate on directly.
4. Set up the stream formats of your units. When playing a file, one typical thing to do is to set the file player unit's output format to match the sample rate of your output device (you can only have one output device per graph).
5. Initialize the graph and start it running.

A sample of some of my old code:
Code:
/* ---- These are defined in the .h file -- I want access to them other places in the class. */
    AUGraph                 myGraph;
    AudioUnit               filePlayerUnit;
    AudioUnit               *currentFilePlayerUnit;
    AudioUnit               mixUnit;
    AudioUnit               outUnit;
    AudioFileID             anAudioFile;

////////////////////////////////////////////////////////////////////////

- (void)initAUGraph {
    OSErr                   err = noErr;
    AUNode                  filePlayerNode;
    AUNode                  mixNode;
    AUNode                  outNode;
    ComponentDescription    cd;
    UInt32                  propSize;
    
    err = NewAUGraph(&myGraph);
    NSAssert(err == noErr, @"NewAUGraph failed.");
    
    /* ---- Specify the audio unit types. */
    cd.componentType = kAudioUnitType_Output;
    cd.componentSubType = kAudioUnitSubType_HALOutput;
    /* ---- All AUHALs use Apple as componentManufacturer, regardless of who makes them. */
    cd.componentManufacturer = kAudioUnitManufacturer_Apple;
    cd.componentFlags = 0;
    cd.componentFlagsMask = 0;  
    err = AUGraphNewNode(myGraph, &cd, 0, NULL, &outNode);
    NSAssert(err == noErr, @"AUGraphNewNode for output failed.");
    
    cd.componentType = kAudioUnitType_Generator;
    cd.componentSubType = kAudioUnitSubType_AudioFilePlayer;
    err = AUGraphNewNode(myGraph, &cd, 0, NULL, &filePlayerNode);
    NSAssert(err == noErr, @"AUGraphNewNode for file player failed.");
    
    cd.componentType = kAudioUnitType_Mixer;
    cd.componentSubType = kAudioUnitSubType_StereoMixer;
    err = AUGraphNewNode(myGraph, &cd, 0, NULL, &mixNode);
    NSAssert(err == noErr, @"AUGraphNewNode for mixer unit failed.");
    
    /* ---- Connect nodes to each other in the graph. */
    err = AUGraphConnectNodeInput(myGraph, filePlayerNode, 0, mixNode, 0);
    NSAssert(err == noErr, @"AUGraphConnectNode fp -> mix failed.");
    err = AUGraphConnectNodeInput(myGraph, mixNode, 0, outNode, 0);
    NSAssert(err == noErr, @"AUGraphConnectNode mix -> out failed.");
    err = AUGraphOpen(myGraph);
    NSAssert(err == noErr, @"AUGraphOpen failed.");
    
    /* ---- Get references to the audio units each node relates to. */
    err = AUGraphGetNodeInfo(myGraph, outNode, NULL, NULL, NULL, &outUnit);
    NSAssert(err == noErr, @"AUGraphGetNodeInfo for outUnit failed.");
    err = AUGraphGetNodeInfo(myGraph, mixNode, NULL, NULL, NULL, &mixUnit);
    NSAssert(err == noErr, @"AUGraphGetNodeInfo for mixNode failed.");
    err = AUGraphGetNodeInfo(myGraph, filePlayerNode, NULL, NULL, NULL, &filePlayerUnit);
    NSAssert(err == noErr, @"AUGraphGetNodeInfo for filePlayerNode failed.");
    
    /* ---- Set initial pointer to the fp unit. */
    currentFilePlayerUnit = &filePlayerUnit;
    
    /* ---- Get the format of the input bus of the output device. */
    AudioStreamBasicDescription outputDeviceInputFormat;
    propSize = sizeof(AudioStreamBasicDescription);
    memset(&outputDeviceInputFormat, 0, propSize);
    err = AudioUnitGetProperty(outUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &outputDeviceInputFormat, &propSize);
    NSAssert(err == noErr, @"Reading output device input bus format failed.");
    
    /* ---- Create a common stream format from the canonical format, with the output device's sample rate and channel count. */
    AudioStreamBasicDescription commonFormat;
    memset(&commonFormat, 0, sizeof(AudioStreamBasicDescription));
    [self setCanonicalStreamFormat:&commonFormat];
    commonFormat.mSampleRate = outputDeviceInputFormat.mSampleRate;
    commonFormat.mChannelsPerFrame = outputDeviceInputFormat.mChannelsPerFrame;
    
    /* ---- Set the output bus of the file player unit to match the output device's input bus format. */
    propSize = sizeof(AudioStreamBasicDescription);
    err = AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &commonFormat, propSize);
    NSAssert(err == noErr, @"Setting file player unit A stream format property failed.");
    
    err = AUGraphInitialize(myGraph);
    NSAssert(err == noErr, @"AUGraphInitialize() failed.");
    
    err = AUGraphStart(myGraph);
    NSAssert(err == noErr, @"AUGraphStart() failed.");
    
    [self setVolume:preFadeTriggerVolume];
}

////////////////////////////////////////////////////////////////////////

- (void)setCanonicalStreamFormat:(AudioStreamBasicDescription*)asbd {
    /* ---- 32-bit floating-point linear PCM, deinterleaved. NOTE: leaves sample rate and channel count untouched. */
    asbd->mFormatID = kAudioFormatLinearPCM;
    asbd->mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
    asbd->mBitsPerChannel = 32;
    asbd->mFramesPerPacket = 1;
    asbd->mBytesPerPacket = asbd->mBytesPerFrame = sizeof(Float32);
}
 
(continued; it looks like MacRumors cuts posts off at a character limit)

Now you can get and set properties on the units. For example, to play a file:
1. Open the file using AudioFileOpen(), which gives you an AudioFileID reference to it (a sketch of this step follows the code below).
2. Set up a ScheduledAudioFileRegion (this is a C structure) to specify which part of the file to play. Set this as the kAudioUnitProperty_ScheduledFileRegion property on your file player unit.
3. Prime frames on the unit using the kAudioUnitProperty_ScheduledFilePrime property. This preloads frames from disk so playback is able to start immediately when you tell it to.
4. Set a start time on the unit by assigning an AudioTimeStamp structure to the kAudioUnitProperty_ScheduleStartTimeStamp property. This will start the unit producing audio (converted from MP3 into the 32-bit float linear PCM format that CoreAudio prefers to deal with internally).

Code:
////////////////////////////////////////////////////////////////////////

- (void)schedulePlayRegionForUnit:(AudioUnit*)aUnit file:(AudioFileID)aFileRef {
    ScheduledAudioFileRegion playRegion;
    UInt32 propSize = sizeof(playRegion);
    memset(&playRegion.mTimeStamp, 0, sizeof(playRegion.mTimeStamp));
    playRegion.mTimeStamp.mFlags = kAudioTimeStampSampleTimeValid;
    playRegion.mTimeStamp.mSampleTime = (Float64)0;
    playRegion.mCompletionProc = NULL;
    playRegion.mCompletionProcUserData = NULL;
    playRegion.mAudioFile = (AudioFileID)aFileRef;
    playRegion.mLoopCount = (UInt32)0;
    playRegion.mStartFrame = (SInt64)0;
    playRegion.mFramesToPlay = (UInt32)-1; // play all frames
    
    OSErr err = AudioUnitSetProperty(*aUnit, kAudioUnitProperty_ScheduledFileRegion, kAudioUnitScope_Global, 0, &playRegion, propSize);
    NSAssert(err == noErr, @"Setting scheduled play region property failed.");
}

////////////////////////////////////////////////////////////////////////

- (void)primeFilePlayerUnit:(AudioUnit*)aUnit {
    OSErr err = noErr;
    UInt32 primeFrames = (UInt32)0; // 0 = default number of priming frames
    UInt32 propSize = sizeof(primeFrames);
    err = AudioUnitSetProperty(*aUnit, kAudioUnitProperty_ScheduledFilePrime, kAudioUnitScope_Global, 0, &primeFrames, propSize);
    NSAssert(err == noErr, @"Priming file player AU failed.");
}

////////////////////////////////////////////////////////////////////////

- (void)schedulePlaybackForUnit:(AudioUnit*)aUnit {
    OSErr err = noErr;
    AudioTimeStamp stamp;
    memset (&stamp, 0, sizeof(stamp));
    stamp.mFlags = kAudioTimeStampSampleTimeValid;
    stamp.mSampleTime = (Float64)-1; // -1 means start at the next available render cycle (immediately)

    err = AudioUnitSetProperty(*aUnit, kAudioUnitProperty_ScheduleStartTimeStamp, kAudioUnitScope_Global, 0, &stamp, sizeof(stamp));
    NSAssert(err == noErr, @"Scheduling audio unit rendering start time failed.");
}
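
Step 1 (opening the file to get an AudioFileID) isn't in my snippets above. Here's a minimal sketch of that piece using AudioFileOpenURL() from AudioToolbox; the path handling is just an example, and on older systems you'd use AudioFileOpen() with an FSRef instead:
Code:
////////////////////////////////////////////////////////////////////////

- (void)openAudioFileAtPath:(NSString *)path {
    /* ---- Sketch only: AudioFileOpenURL() is declared in <AudioToolbox/AudioFile.h>. */
    CFURLRef fileURL = (CFURLRef)[NSURL fileURLWithPath:path];
    OSStatus err = AudioFileOpenURL(fileURL,
                                    kAudioFileReadPermission,   /* read-only access */
                                    0,                          /* no file type hint */
                                    &anAudioFile);              /* AudioFileID ivar from the .h */
    NSAssert(err == noErr, @"AudioFileOpenURL failed.");
}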

When you're done, make sure you clean up by closing the file and taking down the graph:
Code:
////////////////////////////////////////////////////////////////////////

- (void)dealloc {
    /* ---- No error-checking here since there's nothing we can do about it anyway; app is closing. */
    AUGraphStop(myGraph);
    AUGraphUninitialize(myGraph);
    AUGraphClose(myGraph);  
    if (anAudioFile) {
        AudioFileClose(anAudioFile);
    }
    
    // other stuff here...
    
    [super dealloc];
}

And all that's just for very simple file playback; things can get much more complex. I'm not trying to discourage you, but just know it's going to be a tough slog to even get something running. Eventually, though, it will start to make some sense. There are no books I know of for CoreAudio; one would probably be a big help. I'd try it myself, but I don't know nearly enough about it.
 
Wow! Thanks!

Wow! This is actually more than I expected! I was expecting anything from 'RTFM' to 'Look *here*', but wasn't expecting code fragments...wow!

I am a little rusty on my C/C++. I haven't done ANY of it since my college days, some 8 years ago, but I figured this was not going to be easy. Even in standard Java, audio is one of those things best left out if possible. But, as they say, nothing ventured, nothing gained.

That sucks about a single output. I was hoping to multiplex the stream to both the default speaker out and a file. I read something in the scattered Xcode docs about some sort of multiplexing AudioUnit. I don't know how this works in a "pull" architecture, but I do remember reading it somewhere.

How do you handle the actual stream alongside your GUI code? I have only skimmed your code, but do you start a separate thread so that processing of the audio stream is as uninterrupted as possible by GUI actions? Most of the Cocoa books I have read seem to advise against a multi-threaded application.
 
mbabauer said:
That sucks about a single output. I was hoping to multiplex the stream to both the default speaker out and a file. I read something in the scattered Xcode docs about some sort of multiplexing AudioUnit. I don't know how this works in a "pull" architecture, but I do remember reading it somewhere.
Yeah, it does kind of suck. There is a SplitterAU that will turn one stream into two but since you're still limited to one output per graph it doesn't do you any good for what you want. You would use a splitter if you wanted to send a signal through separate effects chains (perhaps one goes through a reverb and then they both come back into a MixerAU where you could mix the dry with the wet).
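
Just to show the wiring (not that it helps with the one-output limit), a splitter slots into a graph like any other node. This is an untested sketch: it assumes the kAudioUnitType_FormatConverter / kAudioUnitSubType_Splitter component constants and reuses the cd, err, myGraph, filePlayerNode, and mixNode variables from my -initAUGraph code above, replacing the direct file-player-to-mixer connection:
Code:
    /* ---- Untested sketch: fan the file player out to two mixer inputs via a splitter. */
    AUNode splitterNode;
    cd.componentType = kAudioUnitType_FormatConverter;
    cd.componentSubType = kAudioUnitSubType_Splitter;   /* assumption: current header constant */
    cd.componentManufacturer = kAudioUnitManufacturer_Apple;
    err = AUGraphNewNode(myGraph, &cd, 0, NULL, &splitterNode);
    NSAssert(err == noErr, @"AUGraphNewNode for splitter failed.");
    
    /* ---- The splitter's two output buses could each go through their own effects
       ---- chain before the mixer; here they go straight to mixer inputs 0 and 1. */
    err = AUGraphConnectNodeInput(myGraph, filePlayerNode, 0, splitterNode, 0);
    NSAssert(err == noErr, @"AUGraphConnectNode fp -> splitter failed.");
    err = AUGraphConnectNodeInput(myGraph, splitterNode, 0, mixNode, 0);
    NSAssert(err == noErr, @"AUGraphConnectNode splitter bus 0 -> mix failed.");
    err = AUGraphConnectNodeInput(myGraph, splitterNode, 1, mixNode, 1);
    NSAssert(err == noErr, @"AUGraphConnectNode splitter bus 1 -> mix failed.");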

As far as getting an output and also recording it to a file, I think the general method would be to place an input render callback on your output unit. In the callback you would copy the buffers it's passing to some other location (because you always want to minimize processing inside a render callback). Then, probably in a separate thread, you would write those buffers into your audio file using ExtAudioFileWrite() (using the new ExtAudioFile API, you can have it do conversions as well on the way to the file).
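
A rough sketch of that idea: here I'm using a render-notify tap (AudioUnitAddRenderNotify()) rather than a true input render callback, since it leaves the graph connections intact, plus ExtAudioFileWriteAsync(), which buffers internally so it is meant to be callable from the render thread. The anExtAudioFile reference is hypothetical; you'd create it elsewhere with the ExtAudioFile API and do a priming call before starting the graph:
Code:
/* ---- Sketch only: tap the output unit's rendered buffers and hand them to
   ---- ExtAudioFileWriteAsync(). Needs <AudioToolbox/ExtendedAudioFile.h>. */
static OSStatus recordingTap(void *inRefCon,
                             AudioUnitRenderActionFlags *ioActionFlags,
                             const AudioTimeStamp *inTimeStamp,
                             UInt32 inBusNumber,
                             UInt32 inNumberFrames,
                             AudioBufferList *ioData)
{
    /* ---- The notify proc fires both before and after the unit renders;
       ---- only grab the buffers once they've been filled. */
    if (*ioActionFlags & kAudioUnitRenderAction_PostRender) {
        ExtAudioFileRef destFile = (ExtAudioFileRef)inRefCon;
        ExtAudioFileWriteAsync(destFile, inNumberFrames, ioData);
    }
    return noErr;
}

/* ---- Installed once, e.g. after AUGraphOpen(). The initial WriteAsync call with
   ---- 0 frames and NULL lets it allocate its internal buffers off the render thread. */
// ExtAudioFileWriteAsync(anExtAudioFile, 0, NULL);
// err = AudioUnitAddRenderNotify(outUnit, recordingTap, anExtAudioFile);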

How do you handle the actual stream alongside your GUI code? I have only skimmed your code, but do you start a separate thread so that processing of the audio stream is as uninterrupted as possible by GUI actions? Most of the Cocoa books I have read seem to advise against a multi-threaded application.
Yes, in general you always want to avoid threading if there's another way to do it, as it can make debugging much, much more difficult. If you're using the ScheduledAudioFilePlayerAU, the reads it does off of disk happen in its own separate thread, so you don't really have to worry about it. I think the disk bottleneck is more of a problem than GUI interaction. In my own experience, the problem here is that the file player's built-in buffers are rather small (it has some 8k buffers and some 32k buffers for larger chunks). This works fine for playing a single file; however, I've had problems with dropouts when trying to play multiple files simultaneously (even when the CPU is not saturated). I've also had a number of random crashes that appear to happen in the CoreAudio disk-read thread that the file player unit uses.

I think larger buffers would help, and indeed the file player unit has a property to set the buffer size. Unfortunately, although it's been in the documentation for years, Apple has failed to implement it, so you're stuck. Right now I'm trying to implement my own buffering off of disk, using the ExtAudioFile API and a ring buffer and letting a render callback pull data from the ring buffer. I don't have it working yet, but if I can get it to work, it should be more robust and flexible than the file player AU (albeit more complicated to implement and tune for performance). It also uses notifications and callbacks to avoid multithreading.
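
For what it's worth, the shape of what I'm attempting is a single-reader/single-writer ring buffer: the disk thread pushes decoded frames in, the render callback pulls them out. A bare-bones sketch of the concept (mono Float32, power-of-two capacity; real code would need proper memory barriers rather than volatile, and the names here are made up):
Code:
/* ---- Concept sketch only. Float32/UInt32 come from CoreAudio/CoreAudioTypes.h. */
typedef struct {
    Float32             *data;         /* capacity frames of storage */
    UInt32              capacity;      /* must be a power of two */
    volatile UInt32     writeIndex;    /* advanced only by the disk thread */
    volatile UInt32     readIndex;     /* advanced only by the render callback */
} SimpleRingBuffer;

/* ---- Disk thread: store up to 'frames' frames, return how many actually fit. */
static UInt32 RingBufferWrite(SimpleRingBuffer *rb, const Float32 *src, UInt32 frames) {
    UInt32 freeFrames = rb->capacity - (rb->writeIndex - rb->readIndex);
    if (frames > freeFrames) frames = freeFrames;
    for (UInt32 i = 0; i < frames; i++)
        rb->data[(rb->writeIndex + i) & (rb->capacity - 1)] = src[i];
    rb->writeIndex += frames;
    return frames;
}

/* ---- Render callback: read up to 'frames' frames, return how many were available.
   ---- If it comes up short, fill the rest of the output with silence. */
static UInt32 RingBufferRead(SimpleRingBuffer *rb, Float32 *dst, UInt32 frames) {
    UInt32 availFrames = rb->writeIndex - rb->readIndex;
    if (frames > availFrames) frames = availFrames;
    for (UInt32 i = 0; i < frames; i++)
        dst[i] = rb->data[(rb->readIndex + i) & (rb->capacity - 1)];
    rb->readIndex += frames;
    return frames;
}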
 
Reply to Audio and Cocoa

Hi,

I have been learning about CoreAudio for a few months now, trying to figure out various general concepts in audio programming. This thread, especially the last post by HiRez, is interesting to me, and I think it's partly related to how I want to use CoreAudio. I would like to buffer my own audio files and be able to play multiple ones simultaneously in a highly controlled way so as to mix, splice, and arrange them, similar to how systems like Pro Tools and Cubase can manipulate audio. I'm not trying to create such advanced general tools as those at this point, but I would like to know more about how they manage their audio streaming off of disk.

I'm pretty sure the implementation of single/multiple ring buffers and sending audio through my own registered callback is the way to go. Ardour, among others, does this, although diving into its source code has been painful in the past because I haven't grasped the main concepts yet.

HiRez, could you expound a bit on the ring buffer concept? Have you made any progress yet? What resources/knowledge are you basing your implementation on?

Any info would be greatly appreciated,
-gtcan
 