View Full Version : How to decode PCM from microphone?




HARDWARRIOR
Dec 11, 2008, 08:35 AM
I need to implement clap recognition on the iPhone.
I know an algorithm that does it by analyzing the audio signal as amplitude over time, but it's not clear to me how to get that representation of the signal from the microphone.

So far I only know how to write audio from the microphone to a file:

recorder.h

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioQueue.h>
#import <AudioToolbox/AudioFile.h>

#define NUM_BUFFERS 10

typedef struct {
    AudioFileID audioFile;
    AudioStreamBasicDescription dataFormat;
    AudioQueueRef queue;
    AudioQueueBufferRef buffers[NUM_BUFFERS];
    UInt32 bufferByteSize;
    SInt64 currentPacket;
    BOOL recording;
} RecordState;

@interface recorder : NSObject {
@private
    BOOL recording;
    RecordState recordState;
}

@property(nonatomic, assign) BOOL recording;

- (void)start;
- (void)stop;

@end


recorder.m

#import "recorder.h"


@implementation recorder

-(void)setRecording:(BOOL)val {
    if(val)
        [self start];
    else
        [self stop];
}

-(BOOL)recording {
    return self->recording;
}

- (id)init {
    if (self = [super init]) {
        recording = NO;

        recordState.dataFormat.mSampleRate = 8000.0;
        recordState.dataFormat.mFormatID = kAudioFormatLinearPCM;
        recordState.dataFormat.mFramesPerPacket = 1;
        recordState.dataFormat.mChannelsPerFrame = 1;
        recordState.dataFormat.mBytesPerFrame = 2;
        recordState.dataFormat.mBytesPerPacket = 2;
        recordState.dataFormat.mBitsPerChannel = 16;
        recordState.dataFormat.mReserved = 0;
        recordState.dataFormat.mFormatFlags = kLinearPCMFormatFlagIsBigEndian | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    }
    return self;
}

- (void)dealloc {
    [super dealloc];
}

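void AudioInputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime, UInt32 inNumberPacketDescriptions, const AudioStreamPacketDescription *inPacketDescs) {
    RecordState *recordState = (RecordState *)inUserData;

    // For constant-bit-rate formats such as linear PCM the queue may report 0 packets,
    // so derive the packet count from the number of bytes in the buffer.
    if(inNumberPacketDescriptions == 0 && recordState->dataFormat.mBytesPerPacket != 0)
        inNumberPacketDescriptions = inBuffer->mAudioDataByteSize / recordState->dataFormat.mBytesPerPacket;

    // Append the captured packets to the file and advance the write position.
    OSStatus status = AudioFileWritePackets(recordState->audioFile, false, inBuffer->mAudioDataByteSize, inPacketDescs, recordState->currentPacket, &inNumberPacketDescriptions, inBuffer->mAudioData);
    if(status == 0)
        recordState->currentPacket += inNumberPacketDescriptions;

    // Hand the buffer back to the queue so it can be refilled.
    AudioQueueEnqueueBuffer(recordState->queue, inBuffer, 0, NULL);
}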

- (void)start {
    if(!recording) {
        self->recording = YES;

        recordState.currentPacket = 0;

        OSStatus status = AudioQueueNewInput(&recordState.dataFormat, AudioInputCallback, &recordState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &recordState.queue);

        if(status == 0) {
            for(int i = 0; i < NUM_BUFFERS; i++) {
                AudioQueueAllocateBuffer(recordState.queue, 16000, &recordState.buffers[i]);
                AudioQueueEnqueueBuffer(recordState.queue, recordState.buffers[i], 0, NULL);
            }

            NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
            NSString *documentsDirectory = [paths objectAtIndex:0];
            NSString *writablePath = [documentsDirectory stringByAppendingPathComponent:@"audio.aiff"];

            // Pass the path as an argument, never as the format string itself.
            NSLog(@"%@", writablePath);

            // NSURL is toll-free bridged to CFURLRef. This avoids passing a character count
            // where a byte count is expected, and avoids leaking a created CFURL.
            NSURL *fileURL = [NSURL fileURLWithPath:writablePath];

            status = AudioFileCreateWithURL((CFURLRef)fileURL, kAudioFileAIFFType, &recordState.dataFormat, kAudioFileFlags_EraseFile, &recordState.audioFile);
            if(status == 0)
                status = AudioQueueStart(recordState.queue, NULL);
        }

        if(status != 0)
            NSLog(@"Recording Failed!");
    }
}

- (void)stop {
    if(recording) {
        self->recording = NO;

        NSLog(@"Stopping queue...");
        AudioQueueStop(recordState.queue, true);
        NSLog(@"Queue stopped");

        for(int i = 0; i < NUM_BUFFERS; i++)
            AudioQueueFreeBuffer(recordState.queue, recordState.buffers[i]);

        AudioQueueDispose(recordState.queue, true);
        AudioFileClose(recordState.audioFile);
        NSLog(@"Audio file closed");
    }
}

@end


My idea is to replace AudioFileWritePackets in the AudioInputCallback callback, so that the inBuffer->mAudioDataByteSize bytes at the pointer inBuffer->mAudioData serve as the amplitude-over-time representation of the signal. But I don't know how to decode that memory area: it looks like I need to work with pairs of bytes (mBytesPerPacket = 2). Can I represent each pair as one of the "standard" types (like int)? Is it signed or not?



firewood
Dec 11, 2008, 10:01 AM
With the format you described, the buffer holds an array of signed, big-endian 16-bit (short) integers.
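
A minimal sketch of that decoding in plain C, assuming the buffer really contains packed, signed, big-endian 16-bit mono samples as set up in recorder.m. The function name decode_be16 and its parameters are made up for illustration and are not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode big-endian signed 16-bit PCM bytes into native int16_t samples.
   Returns the number of samples written to `out`. (Hypothetical helper.) */
size_t decode_be16(const uint8_t *bytes, size_t byte_count,
                   int16_t *out, size_t out_capacity)
{
    assert(bytes != NULL && out != NULL);

    size_t n = byte_count / 2;          /* two bytes per sample */
    if (n > out_capacity)
        n = out_capacity;

    for (size_t i = 0; i < n; i++) {
        /* Big-endian: high byte first, low byte second. */
        uint16_t raw = (uint16_t)((bytes[2 * i] << 8) | bytes[2 * i + 1]);

        /* Map the unsigned bit pattern to its two's-complement value
           without relying on implementation-defined conversions. */
        if (raw >= 0x8000)
            out[i] = (int16_t)((int32_t)raw - 65536);
        else
            out[i] = (int16_t)raw;
    }
    return n;
}
```

Inside the queue callback this would be called with inBuffer->mAudioData cast to const uint8_t * and inBuffer->mAudioDataByteSize as the byte count. If the format flags are changed to little-endian, the two bytes swap roles.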

I suggest books on digital signal processing, and maybe research papers on speech/sound recognition. You might also find some good suggestions in books on computer music.

.

cpatch
Dec 11, 2008, 01:14 PM
Look at the sample SpeakHere app from Apple. Among other things it shows you how to monitor the amplitude of the incoming signal. To recognize a clap (or short burst of sound) you just need to take this information and look for a rapid increase and decrease in amplitude.

Craig
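
One way to read "a rapid increase and decrease in amplitude", sketched in plain C. This is not how SpeakHere does it; it is an assumption-laden illustration that splits already decoded 16-bit samples into short blocks, takes the peak of each block, and reports a clap when a quiet block is followed by a loud burst that falls back to quiet within a few blocks. The function name, thresholds and block size are all hypothetical and would need tuning on real recordings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the samples contain a short loud burst that starts right
   after a quiet block and ends in a quiet block within max_burst_blocks
   blocks; returns 0 otherwise. (Hypothetical helper.) */
int contains_clap(const int16_t *samples, size_t count, size_t block_size,
                  int loud, int quiet, size_t max_burst_blocks)
{
    assert(samples != NULL && block_size > 0 && loud > quiet);

    int prev_quiet = 0;        /* require an observed quiet block first */
    size_t burst_blocks = 0;   /* length of the current burst, 0 = none */

    for (size_t start = 0; start + block_size <= count; start += block_size) {
        /* Peak absolute amplitude of this block. */
        int peak = 0;
        for (size_t i = start; i < start + block_size; i++) {
            int a = samples[i] < 0 ? -(int)samples[i] : (int)samples[i];
            if (a > peak)
                peak = a;
        }

        if (peak <= quiet) {
            /* Back to quiet: a short enough burst counts as a clap. */
            if (burst_blocks > 0 && burst_blocks <= max_burst_blocks)
                return 1;
            burst_blocks = 0;
            prev_quiet = 1;
        } else {
            if (burst_blocks > 0)
                burst_blocks++;              /* burst is still going on */
            else if (prev_quiet && peak >= loud)
                burst_blocks = 1;            /* sudden jump from quiet */
            prev_quiet = 0;
        }
    }
    return 0;
}
```

At 8000 Hz a block of 80 samples is 10 ms, so max_burst_blocks = 10 would accept bursts of up to about 100 ms. A sustained loud sound is rejected because its burst never ends in time.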

adamk77
Dec 25, 2008, 10:58 AM
Have you had any progress on this?

I am also trying to decode the data to pass into a fast Fourier transform function.

Any info would be greatly appreciated.

firewood
Dec 25, 2008, 02:37 PM
There are already too many apps in the store where developers don't know enough about the data going into or coming out of an FFT.

I suggest an intense study, not only of DSP theory, but of data types and how to avoid problems with computer arithmetic.

.

adamk77
Dec 26, 2008, 12:41 AM
Thanks for the tip.

I've figured it out.

HARDWARRIOR
Dec 26, 2008, 10:09 AM
No luck. Actually, Apple's aurioTouch example does what I need: it visualizes amplitude and can do an FFT. But the point where the raw audio data is processed (aurioTouchAppDelegate.mm:PerformThru) is not documented. You are given a void pointer and the number of bytes of data. aurioTouch then performs some magic in a loop, parsing this data as SInt8s and stepping 4 * sizeof(SInt8) bytes ahead each time. I just don't understand why 4.
Also, this example uses Audio Units. I don't see the point of using them on the iPhone, or even why Apple made an AudioUnit framework for the iPhone.

For clap recognition I checked the audio levels as in the SpeakHere example. The level is actually an average power, which is the wave's power integrated over a period: a kind of "smoothed" amplitude. So if you only need to watch for amplitude peaks and their decay over time, it's OK. But for an FFT you can't use average power.

So if you understand every line of aurioTouch's code, you'll solve your problem :) Please post any progress on the FFT in this thread ;)
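
To make the "smoothed amplitude" point concrete, here is a small C sketch. It assumes "average power" means the mean of the squared samples over a window, with samples scaled to the range -1..1. The level metering in Audio Queue Services may be computed differently, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of the squared, normalized samples over one window.
   (Hypothetical helper; one number per window, no waveform detail.) */
double average_power(const int16_t *samples, size_t count)
{
    assert(samples != NULL);
    if (count == 0)
        return 0.0;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        double s = samples[i] / 32768.0;   /* scale to -1.0 .. 1.0 */
        sum += s * s;
    }
    return sum / (double)count;
}
```

A steady half-scale wave {16384, -16384, 16384, -16384} and a single full-scale spike {0, 0, 0, -32768} both give 0.25. The window is reduced to one number, so the shape of the wave is lost, which is why this value is fine for watching level changes but cannot serve as FFT input.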

adamk77
Dec 27, 2008, 12:45 AM
I had no idea that such sample code existed from Apple. I've actually succeeded in what I was trying to accomplish (I did what firewood said: signed shorts, big endian), but the use of AudioUnit in aurioTouch is interesting.

HARDWARRIOR
Dec 27, 2008, 01:09 AM
Can you post the code that parses the data into signed shorts? How do you set dataFormat? I'm especially interested in dataFormat.mFormatFlags.

adamk77
Dec 27, 2008, 01:19 AM
I have no clue about the dataFormat; I just use the default one from the sample code. My goal was simply to pass in the wav data and get some semblance of frequency. Something simple: if I whistle, it registers a higher number, and if I talk low, it registers a lower number. I didn't really care whether it was actually the right frequency. So I'm not sure how much this will help, but what I did was:


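// `in` is presumably the FFT input as interleaved (real, imaginary) doubles;
// `len` is the number of bytes in the buffer.
unsigned char *buf = (unsigned char *) inBuffer->mAudioData;
for (i = 0; i + 1 < len; i += 2)
{
    // Big-endian: the first byte is the high byte, the second the low byte.
    SInt16 s = (SInt16) ((buf[i] << 8) | buf[i+1]);
    in[i] = (double) s;   // real part
    in[i+1] = 0;          // imaginary part
}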
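
If only a rough "higher or lower" number is wanted, counting zero crossings is a simpler alternative to an FFT. The sketch below uses that different technique, not the code above: a pure tone crosses zero twice per cycle, so the crossing count over a known duration gives an approximate frequency. It assumes already decoded 16-bit samples. The function name is hypothetical, and noise or harmonics will push the estimate up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough frequency estimate in Hz from the number of sign changes.
   (Hypothetical helper; only meaningful for fairly clean tones.) */
double estimate_frequency_hz(const int16_t *samples, size_t count,
                             double sample_rate)
{
    assert(samples != NULL);
    if (count < 2 || sample_rate <= 0.0)
        return 0.0;

    size_t crossings = 0;
    for (size_t i = 1; i < count; i++) {
        int prev_negative = samples[i - 1] < 0;
        int curr_negative = samples[i] < 0;
        if (prev_negative != curr_negative)
            crossings++;
    }

    /* count samples span (count - 1) sample intervals. */
    double duration = (double)(count - 1) / sample_rate;

    /* Two crossings per cycle. */
    return (double)crossings / (2.0 * duration);
}
```

With the 8000 Hz sample rate from recorder.m, a whistle should give a clearly larger number than low speech, which is all that was asked for here. For actual spectral content an FFT is still needed.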

luka4e
Feb 2, 2009, 08:50 AM
Hi, I can't understand what your code does...

How can I get the instantaneous audio frequency so that I can then run an FFT?
What is the "in" array?

thanks,

regards

rahulvyas
Dec 19, 2009, 02:45 AM
adamk77, can you tell me a way to detect a heartbeat by changing some code in the aurioTouch sample, please? I have been stuck on this for the last two weeks.