iOS How to decode PCM from microphone?

HARDWARRIOR · Dec 11, 2008

I need to implement clap recognition on the iPhone.
I know an algorithm that does this by analyzing the audio signal as amplitude over time, but it is not clear to me how to get that representation of the signal from the microphone.

So far I only know how to write audio from the microphone to a file:

recorder.h

Code:

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioQueue.h>
#import <AudioToolbox/AudioFile.h>

#define NUM_BUFFERS 10

typedef struct 	{
	AudioFileID                 audioFile;
	AudioStreamBasicDescription dataFormat;
	AudioQueueRef               queue;
	AudioQueueBufferRef         buffers[NUM_BUFFERS];
	UInt32                      bufferByteSize; 
	SInt64                      currentPacket;
	BOOL                        recording;
} RecordState;

@interface recorder : NSObject {
@private
	BOOL recording;
	RecordState recordState;	
}

@property(nonatomic, assign) BOOL recording;

- (void)start;
- (void)stop;

@end

recorder.m

Code:

#import "recorder.h"


@implementation recorder

-(void)setRecording:(BOOL)val {
	if(val)
		[self start];
	else
		[self stop];
}

-(BOOL)recording {
	return self->recording;
}

- (id)init {
	if (self = [super init]) {
		recording = NO;
		
		recordState.dataFormat.mSampleRate = 8000.0;
		recordState.dataFormat.mFormatID = kAudioFormatLinearPCM;
		recordState.dataFormat.mFramesPerPacket = 1;
		recordState.dataFormat.mChannelsPerFrame = 1;
		recordState.dataFormat.mBytesPerFrame = 2;
		recordState.dataFormat.mBytesPerPacket = 2;
		recordState.dataFormat.mBitsPerChannel = 16;
		recordState.dataFormat.mReserved = 0;
		recordState.dataFormat.mFormatFlags = kLinearPCMFormatFlagIsBigEndian | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
	}
	return self;
}

- (void)dealloc {
	
	
	[super dealloc];
}

void AudioInputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime, UInt32 inNumberPacketDescriptions, const AudioStreamPacketDescription *inPacketDescs) {
    RecordState *recordState = (RecordState *)inUserData;
	
    OSStatus status = AudioFileWritePackets(recordState->audioFile, false, inBuffer->mAudioDataByteSize, inPacketDescs, recordState->currentPacket, &inNumberPacketDescriptions, inBuffer->mAudioData);
    if(status == 0)
        recordState->currentPacket += inNumberPacketDescriptions;
	
    AudioQueueEnqueueBuffer(recordState->queue, inBuffer, 0, NULL);
}

- (void)start {
	if(!recording) {
		self->recording = YES;
		
		recordState.currentPacket = 0;
		
		OSStatus status = AudioQueueNewInput(&recordState.dataFormat, AudioInputCallback, &recordState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &recordState.queue);
		
		if(status == 0) {
			for(int i = 0; i < NUM_BUFFERS; i++) {
				AudioQueueAllocateBuffer(recordState.queue, 16000, &recordState.buffers[i]);
				AudioQueueEnqueueBuffer(recordState.queue, recordState.buffers[i], 0, NULL);
			}
			
			NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
			NSString *documentsDirectory = [paths objectAtIndex:0];
			NSString *writablePath = [documentsDirectory stringByAppendingPathComponent:@"audio.aiff"];
			
			NSLog(@"%@", writablePath);
			
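			// Pass the byte length of the C string, not the NSString character count.
			const char *pathCString = [writablePath fileSystemRepresentation];
			CFURLRef fileURL = CFURLCreateFromFileSystemRepresentation(NULL, (const UInt8 *) pathCString, strlen(pathCString), NO);
			
			status = AudioFileCreateWithURL(fileURL, kAudioFileAIFFType, &recordState.dataFormat, kAudioFileFlags_EraseFile, &recordState.audioFile);
			CFRelease(fileURL); // the URL is no longer needed once the file has been created
			if(status == 0)
				status = AudioQueueStart(recordState.queue, NULL);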
		}
		
		if(status != 0)
			NSLog(@"Recording Failed!");
	}	
}

- (void)stop {
	if(recording) {
		self->recording = NO;
		
		NSLog(@"Stopping queue...");
		AudioQueueStop(recordState.queue, true);
		NSLog(@"Queue stopped");
		
		for(int i = 0; i < NUM_BUFFERS; i++)
			AudioQueueFreeBuffer(recordState.queue, recordState.buffers[i]);
		
		AudioQueueDispose(recordState.queue, true);
		AudioFileClose(recordState.audioFile);
		NSLog(@"Audio file closed");		
	}
}

@end

I have an idea: replace AudioFileWritePackets in the AudioInputCallback callback, so that the inBuffer->mAudioDataByteSize bytes at the pointer inBuffer->mAudioData serve as the amplitude-over-time representation of the signal. But I do not know how to decode that memory area: it looks like I need to work with pairs of bytes (mBytesPerPacket = 2). Can I represent them as one of the "standard" types (like int)? Are they signed or not?

firewood · Dec 11, 2008

Arrays of signed big-endian short integers using the format you described.
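
As an illustration of that answer, here is a minimal sketch in plain C of turning such a buffer into ordinary signed 16-bit samples. The function name decode_be16 and its parameters are hypothetical, not part of any Apple API; the sketch assumes the format from the first post (16-bit, mono, packed, big-endian, signed).

```c
#include <stddef.h>
#include <stdint.h>

/* Decode big-endian signed 16-bit PCM into host-order samples.
   Hypothetical helper: `data` would be inBuffer->mAudioData and
   `byteCount` would be inBuffer->mAudioDataByteSize.
   Returns the number of samples written to `out`. */
size_t decode_be16(const void *data, size_t byteCount,
                   int16_t *out, size_t outCapacity)
{
    const uint8_t *bytes = (const uint8_t *)data;
    size_t sampleCount = byteCount / 2;   /* two bytes per sample; a trailing odd byte is ignored */

    if (sampleCount > outCapacity)
        sampleCount = outCapacity;

    for (size_t i = 0; i < sampleCount; i++) {
        /* High byte comes first in big-endian data. */
        uint16_t raw = (uint16_t)((bytes[2 * i] << 8) | bytes[2 * i + 1]);
        /* Reinterpret the 16 bits as a two's-complement signed value. */
        out[i] = (int16_t)raw;
    }
    return sampleCount;
}
```

Each decoded value is the amplitude at one sample instant, in the range -32768 to 32767, so the output array is the amplitude-over-time representation asked about above.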

I suggest books on digital signal processing, and maybe research papers on speech/sound recognition. You might also find some good suggestions in books on computer music.

.

cpatch · Dec 11, 2008

Look at the sample SpeakHere app from Apple. Among other things it shows you how to monitor the amplitude of the incoming signal. To recognize a clap (or short burst of sound) you just need to take this information and look for a rapid increase and decrease in amplitude.

Craig
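
The "rapid increase and decrease in amplitude" idea could look roughly like the sketch below. It is not SpeakHere's code: the function names, the frame size, the threshold and the maximum burst length are all assumptions chosen for illustration, and real input would need tuning against background noise.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value in one frame. */
static int frame_peak(const int16_t *samples, size_t count)
{
    int peak = 0;
    for (size_t i = 0; i < count; i++) {
        int v = samples[i] < 0 ? -(int)samples[i] : (int)samples[i];
        if (v > peak)
            peak = v;
    }
    return peak;
}

/* Count short loud bursts. The signal is split into frames of `frameSize`
   samples. A burst is a run of consecutive frames whose peak exceeds
   `threshold`; it counts as a clap only if the run ends within
   `maxLoudFrames` frames. A burst still in progress at the end of the
   data is not counted. */
int count_claps(const int16_t *samples, size_t sampleCount,
                size_t frameSize, int threshold, size_t maxLoudFrames)
{
    int claps = 0;
    size_t loudFrames = 0;

    if (frameSize == 0)
        return 0;

    for (size_t start = 0; start + frameSize <= sampleCount; start += frameSize) {
        if (frame_peak(samples + start, frameSize) > threshold) {
            loudFrames++;                 /* still inside a loud run */
        } else {
            if (loudFrames > 0 && loudFrames <= maxLoudFrames)
                claps++;                  /* loud run was short: treat it as a clap */
            loudFrames = 0;
        }
    }
    return claps;
}
```

At 8000 Hz, a frame of 80 samples is 10 ms, so a maxLoudFrames of 5 would accept bursts up to about 50 ms and reject sustained sounds such as speech.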

adamk77 · Dec 25, 2008

Have you had any progress on this?

I am also trying to decode the data to pass into a fast fourier transformation function.

Any info would be greatly appreciated.

firewood · Dec 25, 2008

There are already too many apps in the store where developers don't know enough about the data going into or coming out of an FFT.

I suggest an intense study, not only of DSP theory, but of data types and how to avoid problems with computer arithmetic.

.

adamk77 · Dec 25, 2008

firewood said:
There are already too many apps in the store where developers don't know enough about the data going into or coming out of an FFT.

I suggest an intense study, not only of DSP theory, but of data types and how to avoid problems with computer arithmetic.

.

Thanks for the tip.

I've figured it out.

HARDWARRIOR · Dec 26, 2008

adamk77 said:
Have you had any progress on this?

I am also trying to decode the data to pass into a fast fourier transformation function.

Any info would be greatly appreciated.

No luck. Actually Apple's aurioTouch example does what I need: it visualizes amplitude and can do an FFT. But it is not documented at the point where the raw audio data is processed (aurioTouchAppDelegate.mm:PerformThru). You are given a void pointer and the number of bytes of data. aurioTouch then performs some magic in a loop, parsing this data as SInt8s and stepping 4 * sizeof(SInt8) bytes ahead. I just don't understand: why 4?
Also, this example uses Audio Units. I don't get the point of using them on the iPhone, or even why Apple made an AudioUnit framework for the iPhone.

For clap recognition I checked the audio levels as in the SpeakHere example. That is actually an average power, that is, the squared wave integrated over a period. Kind of a "smoothed" amplitude. So if you only need to watch amplitude peaks and their decay over time, it's OK. But for an FFT you can't use average power.

So if you understand every line of aurioTouch's code, then you'll solve your problem :) Please post any progress on FFT in this topic 😉
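
To make the difference between average power and raw amplitude concrete, here is a small sketch in plain C. It is not taken from SpeakHere; LevelStats and measure_levels are hypothetical names, and the input is assumed to be already decoded signed 16-bit samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Two summaries of the same window of samples. */
typedef struct {
    double meanSquare;  /* average power: mean of the squared samples */
    int    peak;        /* largest absolute sample value */
} LevelStats;

LevelStats measure_levels(const int16_t *samples, size_t count)
{
    LevelStats stats = {0.0, 0};
    double sum = 0.0;

    if (count == 0)
        return stats;

    for (size_t i = 0; i < count; i++) {
        int v = samples[i];
        int mag = v < 0 ? -v : v;
        sum += (double)v * (double)v;   /* accumulate power */
        if (mag > stats.peak)
            stats.peak = mag;
    }
    stats.meanSquare = sum / (double)count;
    return stats;
}
```

A window containing a single spike of 1000 among zeros and a window alternating between +500 and -500 can have the same mean square while their peaks differ by a factor of two. The averaging discards the shape of the waveform, which is why an FFT needs the individual samples and not the averaged level.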

adamk77 · Dec 26, 2008

HARDWARRIOR said:
No luck. Actually Apple's aurioTouch example does what I need: it visualizes amplitude and can do an FFT. But it is not documented at the point where the raw audio data is processed (aurioTouchAppDelegate.mm:PerformThru). You are given a void pointer and the number of bytes of data. aurioTouch then performs some magic in a loop, parsing this data as SInt8s and stepping 4 * sizeof(SInt8) bytes ahead. I just don't understand: why 4?
Also, this example uses Audio Units. I don't get the point of using them on the iPhone, or even why Apple made an AudioUnit framework for the iPhone.

For clap recognition I checked the audio levels as in the SpeakHere example. That is actually an average power, that is, the squared wave integrated over a period. Kind of a "smoothed" amplitude. So if you only need to watch amplitude peaks and their decay over time, it's OK. But for an FFT you can't use average power.

So if you understand every line of aurioTouch's code, then you'll solve your problem :) Please post any progress on FFT in this topic 😉

I had no idea that Apple had sample code like that. I've actually succeeded in what I was trying to accomplish (I did what firewood said: signed shorts, big endian), but the use of AudioUnit in aurioTouch is interesting.

HARDWARRIOR · Dec 26, 2008

adamk77 said:
I had no idea that Apple had sample code like that. I've actually succeeded in what I was trying to accomplish (I did what firewood said: signed shorts, big endian), but the use of AudioUnit in aurioTouch is interesting.

Can you post the code that parses the data into signed shorts? How do you set dataFormat? I'm especially interested in dataFormat.mFormatFlags.

adamk77 · Dec 26, 2008

HARDWARRIOR said:
Can you post the code that parses the data into signed shorts? How do you set dataFormat? I'm especially interested in dataFormat.mFormatFlags.

I have no clue about the dataFormat; I just use the default one from the sample code. My goal was simply to pass in the wav data to get some semblance of frequency. Something simple: if I whistle, it would register a higher number, and if I talk low, it would register a lower number. I didn't really care whether it actually was the right frequency. So I'm not sure how much this will help, but what I did was:

// Samples are big-endian: high byte first, then low byte.
unsigned char *buf = (unsigned char *) inBuffer->mAudioData;
for (i = 0; i < len; i+=2)
{
SInt16 s = (SInt16) ((buf[i] << 8) | buf[i+1]);
in[i] = (double) s;
in[i+1] = 0;
}
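
For the rough "whistling registers higher than low talking" measure described above, a cruder alternative to an FFT is to count zero crossings. The sketch below shows that different technique, not the FFT approach from this post; estimate_frequency_hz is a hypothetical name, and it assumes decoded signed 16-bit samples and a known sample rate (8000 Hz in the first post's format).

```c
#include <stddef.h>
#include <stdint.h>

/* Rough frequency estimate from the number of sign changes.
   Works tolerably for a single dominant tone; noise and harmonics
   inflate the count, so treat the result as a relative measure. */
double estimate_frequency_hz(const int16_t *samples, size_t count,
                             double sampleRate)
{
    size_t crossings = 0;

    if (count < 2 || sampleRate <= 0.0)
        return 0.0;

    for (size_t i = 1; i < count; i++) {
        int prevNegative = samples[i - 1] < 0;
        int currNegative = samples[i] < 0;
        if (prevNegative != currNegative)
            crossings++;
    }

    /* One full cycle crosses zero twice. */
    double seconds = (double)count / sampleRate;
    return (double)crossings / (2.0 * seconds);
}
```

A whistle produces many more crossings per second than low speech, so the returned number rises with pitch even though it is not an accurate frequency.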

luka4e · Feb 2, 2009

// Samples are big-endian: high byte first, then low byte.
unsigned char *buf = (unsigned char *) inBuffer->mAudioData;
for (i = 0; i < len; i+=2)
{
SInt16 s = (SInt16) ((buf[i] << 8) | buf[i+1]);
in[i] = (double) s;
in[i+1] = 0;
}

Hi, I can't understand what your code does...

How can I get the instant audio frequency to do a successive FFT?
What is the "in" array?

thanks,

regards

rahulvyas · Dec 19, 2009

adamk77 said:
I have no clue about the dataFormat; I just use the default one from the sample code. My goal was simply to pass in the wav data to get some semblance of frequency. Something simple: if I whistle, it would register a higher number, and if I talk low, it would register a lower number. I didn't really care whether it actually was the right frequency. So I'm not sure how much this will help, but what I did was:

// Samples are big-endian: high byte first, then low byte.
unsigned char *buf = (unsigned char *) inBuffer->mAudioData;
for (i = 0; i < len; i+=2)
{
SInt16 s = (SInt16) ((buf[i] << 8) | buf[i+1]);
in[i] = (double) s;
in[i+1] = 0;
}

adamk77, can you tell me a way to detect a heartbeat by changing some code in the aurioTouch sample, please? I have been stuck on this for the last two weeks.
