How to decode PCM from microphone?

Discussion in 'iOS Programming' started by HARDWARRIOR, Dec 11, 2008.

  1. HARDWARRIOR macrumors member

    Nov 17, 2008
    I need to implement clap recognition on the iPhone.
    I know an algorithm that does this by analyzing the audio signal as amplitude over time, but it's not clear to me how to get that representation of the signal from the microphone.

    So far I only know how to write audio from the microphone to a file:

    // recorder.h
    #import <Foundation/Foundation.h>
    #import <AudioToolbox/AudioQueue.h>
    #import <AudioToolbox/AudioFile.h>

    #define NUM_BUFFERS 10

    // State shared between the recorder object and the audio queue callback.
    typedef struct {
        AudioFileID                 audioFile;
        AudioStreamBasicDescription dataFormat;
        AudioQueueRef               queue;
        AudioQueueBufferRef         buffers[NUM_BUFFERS];
        UInt32                      bufferByteSize;
        SInt64                      currentPacket;
        BOOL                        recording;
    } RecordState;

    @interface recorder : NSObject {
        BOOL recording;
        RecordState recordState;
    }
    @property(nonatomic, assign) BOOL recording;
    - (void)start;
    - (void)stop;
    @end

    // recorder.m
    #import "recorder.h"

    @implementation recorder

    // Setting the property starts or stops recording.
    - (void)setRecording:(BOOL)val {
        if (val)
            [self start];
        else
            [self stop];
    }

    - (BOOL)recording {
        return self->recording;
    }

    - (id)init {
        if (self = [super init]) {
            recording = NO;
            // 8 kHz, mono, 16-bit signed big-endian linear PCM.
            recordState.dataFormat.mSampleRate = 8000.0;
            recordState.dataFormat.mFormatID = kAudioFormatLinearPCM;
            recordState.dataFormat.mFramesPerPacket = 1;
            recordState.dataFormat.mChannelsPerFrame = 1;
            recordState.dataFormat.mBytesPerFrame = 2;
            recordState.dataFormat.mBytesPerPacket = 2;
            recordState.dataFormat.mBitsPerChannel = 16;
            recordState.dataFormat.mReserved = 0;
            recordState.dataFormat.mFormatFlags = kLinearPCMFormatFlagIsBigEndian | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
        }
        return self;
    }

    - (void)dealloc {
        [super dealloc];
    }

    // Called by the audio queue each time a buffer has been filled with input.
    void AudioInputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime, UInt32 inNumberPacketDescriptions, const AudioStreamPacketDescription *inPacketDescs) {
        RecordState *recordState = (RecordState *)inUserData;
        // Append the buffer's raw PCM bytes to the file.
        OSStatus status = AudioFileWritePackets(recordState->audioFile, false, inBuffer->mAudioDataByteSize, inPacketDescs, recordState->currentPacket, &inNumberPacketDescriptions, inBuffer->mAudioData);
        if (status == 0)
            recordState->currentPacket += inNumberPacketDescriptions;
        // Hand the buffer back to the queue so it can be refilled.
        AudioQueueEnqueueBuffer(recordState->queue, inBuffer, 0, NULL);
    }

    - (void)start {
        if (!recording) {
            self->recording = YES;
            recordState.currentPacket = 0;
            OSStatus status = AudioQueueNewInput(&recordState.dataFormat, AudioInputCallback, &recordState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &recordState.queue);
            if (status == 0) {
                for (int i = 0; i < NUM_BUFFERS; i++) {
                    AudioQueueAllocateBuffer(recordState.queue, 16000, &recordState.buffers[i]);
                    AudioQueueEnqueueBuffer(recordState.queue, recordState.buffers[i], 0, NULL);
                }
                NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
                NSString *documentsDirectory = [paths objectAtIndex:0];
                NSString *writablePath = [documentsDirectory stringByAppendingPathComponent:@"audio.aiff"];
                CFURLRef fileURL = CFURLCreateFromFileSystemRepresentation(NULL, (const UInt8 *) [writablePath UTF8String], [writablePath length], NO);
                status = AudioFileCreateWithURL(fileURL, kAudioFileAIFFType, &recordState.dataFormat, kAudioFileFlags_EraseFile, &recordState.audioFile);
                if (status == 0)
                    status = AudioQueueStart(recordState.queue, NULL);
            }
            if (status != 0)
                NSLog(@"Recording Failed!");
        }
    }

    - (void)stop {
        if (recording) {
            self->recording = NO;
            NSLog(@"Stopping queue...");
            AudioQueueStop(recordState.queue, true);
            NSLog(@"Queue stopped");
            for (int i = 0; i < NUM_BUFFERS; i++)
                AudioQueueFreeBuffer(recordState.queue, recordState.buffers[i]);
            AudioQueueDispose(recordState.queue, true);
            AudioFileClose(recordState.audioFile);
            NSLog(@"Audio file closed");
        }
    }

    @end

    My idea is to replace AudioFileWritePackets in the AudioInputCallback callback, and to treat the inBuffer->mAudioDataByteSize bytes at the pointer inBuffer->mAudioData as the amplitude-over-time representation of the signal. But I do not know how to decode that memory area: it looks like I need to work with pairs of bytes (mBytesPerPacket = 2). Can I represent each pair as one of the "standard" types (like int)? Is it signed or not?
  2. firewood macrumors 604

    Jul 29, 2003
    Silicon Valley
    Arrays of signed big-endian short integers using the format you described.

    I suggest books on digital signal processing, and maybe research papers on speech/sound recognition. You might also find some good suggestions in books on computer music.
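
    As a minimal sketch of that layout: each sample is two bytes, high byte first, forming one signed 16-bit value. The helper name decode_be16 and its parameters are hypothetical, not part of Core Audio; in the callback you would pass it inBuffer->mAudioData and inBuffer->mAudioDataByteSize.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode big-endian signed 16-bit PCM bytes into native int16_t samples.
 * Returns the number of samples written; a trailing odd byte is ignored
 * and at most out_capacity samples are produced. */
size_t decode_be16(const uint8_t *bytes, size_t num_bytes,
                   int16_t *out, size_t out_capacity)
{
    size_t n = num_bytes / 2;
    if (n > out_capacity)
        n = out_capacity;
    for (size_t i = 0; i < n; i++) {
        /* High byte first (big-endian): assemble as unsigned, then
         * reinterpret the 16-bit pattern as a signed value. */
        uint16_t u = (uint16_t)((bytes[2 * i] << 8) | bytes[2 * i + 1]);
        out[i] = (int16_t)u;
    }
    return n;
}
```

    Each decoded value lies between -32768 and 32767 and is the signal amplitude at one sample instant, so at the 8000 Hz rate used above, out[k] is the amplitude at time k/8000 s.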

  3. cpatch macrumors member

    Sep 17, 2007
    San Diego, CA
    Look at the sample SpeakHere app from Apple. Among other things it shows you how to monitor the amplitude of the incoming signal. To recognize a clap (or short burst of sound) you just need to take this information and look for a rapid increase and decrease in amplitude.
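
    One way that idea could look in code, assuming you already have one amplitude reading per buffer (for example a peak level scaled to 0..1). The function name and the quiet, loud and max_loud_frames thresholds are hypothetical tuning values, not anything taken from SpeakHere.

```c
#include <stddef.h>

/* Count clap-like bursts in a sequence of per-buffer amplitude levels.
 * A burst starts when a reading >= loud directly follows a reading
 * <= quiet (rapid rise), and counts as a clap only if the level drops
 * back to <= quiet within max_loud_frames readings (rapid fall). */
int count_claps(const float *levels, size_t count,
                float quiet, float loud, size_t max_loud_frames)
{
    int claps = 0;
    int prev_quiet = 0;   /* was the previous reading at or below quiet? */
    int in_burst = 0;     /* are we currently inside a loud burst? */
    size_t duration = 0;  /* readings seen since the burst started */

    for (size_t i = 0; i < count; i++) {
        float v = levels[i];
        if (in_burst) {
            if (v <= quiet) {
                if (duration <= max_loud_frames)
                    claps++;          /* short burst: treat as a clap */
                in_burst = 0;
                prev_quiet = 1;
            } else {
                duration++;           /* still loud, burst keeps going */
            }
        } else if (v >= loud && prev_quiet) {
            in_burst = 1;
            duration = 1;
            prev_quiet = 0;
        } else {
            prev_quiet = (v <= quiet);
        }
    }
    return claps;
}
```

    Sustained noise (speech, music) stays loud for longer than max_loud_frames and is ignored, and a slow fade-in never triggers because the loud reading does not directly follow a quiet one.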

  4. adamk77 macrumors 6502

    Jan 6, 2008
    Have you had any progress on this?

    I am also trying to decode the data to pass into a fast fourier transformation function.

    Any info would be greatly appreciated.
  5. firewood macrumors 604

    Jul 29, 2003
    Silicon Valley
    There are already too many apps in the store where developers don't know enough about the data going into or coming out of an FFT.

    I suggest an intense study, not only of DSP theory, but of data types and how to avoid problems with computer arithmetic.

  6. adamk77 macrumors 6502

    Jan 6, 2008
    Thanks for the tip.

    I've figured it out.
  7. HARDWARRIOR thread starter macrumors member

    Nov 17, 2008
    No luck. Actually, Apple's aurioTouch example does what I need: it visualizes amplitude and can do an FFT. But it is not documented at the point where the raw audio data is processed. You are given a void pointer and the number of bytes of data. aurioTouch then performs some magic in a loop, parsing this data as SInt8s and stepping 4 * sizeof(SInt8) bytes ahead each time. I just don't understand why 4.
    Also, this example uses Audio Units. I don't see the point of using them on the iPhone, or even why Apple made an AudioUnit framework for the iPhone.

    For clap recognition I checked the audio levels as in the SpeakHere example. What you get is actually an average power, which is the integral of the wave over a period: a kind of "smoothed" amplitude. So if you only need to watch amplitude peaks and their decay over time, it's OK. But for an FFT you can't use average power.
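
    To make the difference concrete, here is a sketch that computes both numbers for one block of decoded 16-bit samples: the mean of the squared samples (a discrete stand-in for the average power described above) and the peak magnitude. The names BlockLevel and measure_block are made up for illustration, and SpeakHere's own metering may scale its values differently.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double mean_square;  /* average power of the block, in squared sample units */
    int    peak;         /* largest absolute sample value in the block */
} BlockLevel;

/* Measure one block of signed 16-bit samples. */
BlockLevel measure_block(const int16_t *samples, size_t count)
{
    BlockLevel level = {0.0, 0};
    if (count == 0)
        return level;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        int v = samples[i];
        int mag = v < 0 ? -v : v;   /* int is wide enough for -(-32768) */
        if (mag > level.peak)
            level.peak = mag;
        sum += (double)v * (double)v;
    }
    level.mean_square = sum / (double)count;
    return level;
}
```

    A single 1000-unit spike in 100 otherwise silent samples gives a peak of 1000 but a mean square of only 10000 (an RMS of 100), which is the smoothing effect in question. Neither number keeps the individual samples, and those are what an FFT needs as input.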

    So if you understand every line of aurioTouch's code, you'll solve your problem :) Please post any progress on FFT in this topic ;)
  8. adamk77 macrumors 6502

    Jan 6, 2008
    I had no idea that such sample code existed from Apple. I've actually succeeded in what I was trying to accomplish (I did what firewood said: signed shorts, big endian), but the use of AudioUnit in aurioTouch is interesting.
  9. HARDWARRIOR thread starter macrumors member

    Nov 17, 2008
    Can you post the code that parses the data into signed shorts? How do you set dataFormat? I'm especially interested in dataFormat.mFormatFlags.
  10. adamk77 macrumors 6502

    Jan 6, 2008
    I have no clue about the dataFormat; I just use the default one from the sample code. My goal was simply to pass in the wav data to get some semblance of frequency. Something simple: if I whistle, it would register a higher number, and if I talk low, it would register a lower number. I didn't really care whether it actually was the right frequency. So I'm not sure how much this will help, but what I did was:

    // Decode big-endian signed 16-bit samples from the queue buffer.
    // len is the buffer size in bytes; in[] is the array of doubles
    // handed to the FFT. in[i] holds the sample and in[i+1] is zeroed
    // (presumably the imaginary part of the FFT input).
    unsigned char *buf = (unsigned char *) inBuffer->mAudioData;
    for (i = 0; i < len; i += 2) {
        SInt16 s = (SInt16) ((buf[i] << 8) | buf[i+1]);
        in[i] = (double) s;
        in[i+1] = 0;
    }
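
    If a rough higher-or-lower number is all that is needed, counting zero crossings is a simpler alternative to an FFT (it is a different technique, not what the code above does). A sketch, assuming the samples are already decoded to signed 16-bit values; the function name zero_crossing_hz is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Rough pitch estimate from the zero-crossing rate.
 * A clean tone changes sign twice per cycle, so the estimated frequency
 * is (sign changes / 2) divided by the duration of the block. */
double zero_crossing_hz(const int16_t *samples, size_t count, double sample_rate)
{
    if (count < 2)
        return 0.0;

    size_t crossings = 0;
    for (size_t i = 1; i < count; i++) {
        int prev_neg = samples[i - 1] < 0;
        int cur_neg  = samples[i] < 0;
        if (prev_neg != cur_neg)
            crossings++;
    }
    /* count samples span (count - 1) sample intervals. */
    double seconds = (double)(count - 1) / sample_rate;
    return (double)crossings / (2.0 * seconds);
}
```

    This works only for reasonably clean, single-pitch sounds such as a whistle; noise adds extra crossings and inflates the result.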
  11. luka4e macrumors newbie

    Feb 2, 2009

    Hi, I can't understand what your code does...

    How can I get the instant audio frequency to do a successive FFT?
    What is the "in" array?


  12. rahulvyas macrumors newbie

    May 6, 2009

    adamk77, can you tell me a way to detect a heartbeat by changing some code in the aurioTouch sample, please? I have been stuck on this for the last two weeks.
