View Full Version : Creating a file format for dictations, wanting critique




MorphingDragon
Oct 3, 2010, 07:05 AM
Basically, I've been commissioned to make a replacement for Olympus' rather bad dictation/transcription software. I need this to run on Mac OS X, Linux and Windows.

Because .dss is a proprietary (and expensive) file format, I need to create one with similar features that doesn't cost extravagant amounts.

I thought about just rebranding a bzip2 file and having it contain a Vorbis file for the sound, an XML file for the metadata (Vorbis' built-in metadata doesn't meet the requirements, and a plain XML file will be easier to work with), a .SHA checksum file and a verification certificate, then encrypting the bzip2 file with a 256-bit key.

Anything wrong with that plan?

The reason for bzip2 and Vorbis is that the libraries are cross-platform and under the BSD license.
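
For illustration only, here is a rough sketch of what such a container could look like. It assumes a tar archive compressed with bzip2 (bzip2 on its own compresses a single stream, so something like tar is needed to hold several files), SHA-256 for the checksum, and hypothetical member names (`audio.ogg`, `metadata.xml`, `checksum.sha256`). The encryption step and the verification certificate are left out, and none of this is a settled design.

```python
import hashlib
import io
import tarfile


def build_container(audio: bytes, metadata_xml: bytes) -> bytes:
    """Pack audio, metadata and a SHA-256 checksum into a tar.bz2 blob."""
    # Assumption: the checksum covers the audio bytes followed by the metadata.
    digest = hashlib.sha256(audio + metadata_xml).hexdigest().encode("ascii")
    members = {
        "audio.ogg": audio,            # hypothetical member names
        "metadata.xml": metadata_xml,
        "checksum.sha256": digest,
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(container: bytes, name: str) -> bytes:
    """Return the bytes of one named member of the container."""
    with tarfile.open(fileobj=io.BytesIO(container), mode="r:bz2") as tar:
        return tar.extractfile(name).read()


def verify(container: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    audio = read_member(container, "audio.ogg")
    meta = read_member(container, "metadata.xml")
    expected = read_member(container, "checksum.sha256").decode("ascii")
    return hashlib.sha256(audio + meta).hexdigest() == expected
```

Something like `verify(build_container(ogg_bytes, xml_bytes))` would then confirm that the payload survived a round trip. Whether the checksum should cover the metadata as well as the audio is a design choice made here only for the sake of the example.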



robbieduncan
Oct 3, 2010, 07:24 AM
What are the usage profiles? Do you need to be able to jump quickly to any point in the dictation file? If so, does Vorbis support this well? In particular, do you need to unzip and load the entire file into memory to jump to 45 minutes into the dictation? This might not be ideal on limited-memory devices (say, if you ever port this to phones).

MorphingDragon
Oct 3, 2010, 07:49 AM
It's just a basic "small law" law firm; the lawyers just use dictation to give instructions and relay large amounts of information to the support staff. Olympus' DS5000 software crashes continuously, but they would rather pay me to make some in-house software than replace all of the Dictators and foot pedals.

Vorbis' seek ability through libogg is about the same as that of LAME and of VLC's Vorbis implementation (if that means anything to you).

The median dictation length is about 13.43 minutes, with a standard deviation (σ) of 53 seconds. I took the largest recorded dictation I could find (20 minutes) and converted it to a 64 kbps VBR .ogg; the resulting bzip2 file, with dummy files, was about 5 MB. Bzip2's library supports adding and extracting single files programmatically, so if done right I don't think memory will be an issue on smaller devices.
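
As a back-of-envelope check on those numbers, the sketch below compares the size a constant 64 kbps stream would have over 20 minutes with the average bitrate implied by a file of about 5 MB. It assumes 1 kbps = 1000 bits per second and takes 5 MB as 5,000,000 bytes. It also ignores the dummy files and container overhead, so the real audio bitrate would be somewhat lower still.

```python
def nominal_size_bytes(minutes: float, kbps: float) -> float:
    """Size of a constant-bitrate stream of the given length."""
    return minutes * 60 * kbps * 1000 / 8


def average_kbps(size_bytes: float, minutes: float) -> float:
    """Average bitrate implied by a file size and a duration."""
    return size_bytes * 8 / (minutes * 60) / 1000


print(nominal_size_bytes(20, 64))             # 9600000.0 (about 9.6 MB)
print(round(average_kbps(5_000_000, 20), 1))  # 33.3
```

So the VBR encode is averaging roughly half the nominal rate, which is plausible for speech with pauses.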