That is wrong. You only need an assembler, and if you are an operating system manufacturer, you can even do it in raw machine code if you haven't had time to hack your assembler.
That is the whole point of the Accelerate framework.
However, if none of the existing Accelerate APIs can make any practical use of the SSE4 instructions, SSE4 support won't be useful until new APIs are published.
The Accelerate framework sounds pretty neat. So the answer to the original question rests on several further questions:
- Do the Accelerate APIs do anything that could take advantage of SSE4 instructions?
- Did Apple update code within the Accelerate framework to use SSE4 instructions?
- Does the software you're interested in spend significant time in Accelerate framework library code that is likely to be faster with SSE4 support?
And:
- When you install Mac OS X 10.5.2, does it tailor the binaries to the specific CPU you're using?
libvDSP looks like this:
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A
office:A kelly$ file libvDSP.dylib
libvDSP.dylib: Mach-O universal binary with 4 architectures
libvDSP.dylib (for architecture i386): Mach-O dynamically linked shared library i386
libvDSP.dylib (for architecture ppc7400): Mach-O dynamically linked shared library ppc
libvDSP.dylib (for architecture ppc64): Mach-O 64-bit dynamically linked shared library ppc64
libvDSP.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64
It seems like we need another couple of entries: (for architecture i386 supporting SSE4) and (for architecture x86_64 supporting SSE4). Without them, how does a binary containing SSE4 code work on an Intel chip that doesn't support those instructions?
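For what it's worth, one common way a single i386 or x86_64 slice can carry SSE4 code safely is runtime dispatch: the library ships both a generic and an SSE4 version of a routine and picks one after asking the CPU what it supports. Whether Accelerate actually works this way is not established in this thread. The sketch below is only an illustration, assuming GCC or Clang on x86; `sum_generic`, `sum_sse4`, `cpu_has_sse41` and `pick_sum` are hypothetical names, and the "SSE4" routine is a placeholder rather than real SSE4 code.

```c
/* Sketch of runtime CPU dispatch. All function names are hypothetical. */

/* Baseline routine: safe on any x86 CPU. */
int sum_generic(const int *v, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += v[i];
    }
    return s;
}

/* Placeholder for an SSE4.1-optimized routine. A real library would use
   SSE4.1 instructions here; this stand-in just calls the baseline so the
   example stays portable. */
int sum_sse4(const int *v, int n) {
    return sum_generic(v, n);
}

/* Returns 1 if the running CPU reports SSE4.1 support, 0 otherwise.
   Assumes the GCC/Clang x86 builtins; on any other compiler or
   architecture it conservatively reports 0. */
int cpu_has_sse41(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
#else
    return 0;
#endif
}

typedef int (*sum_fn)(const int *, int);

/* Choose the implementation once; callers then go through the pointer,
   so the SSE4 path is never executed on a CPU that lacks it. */
sum_fn pick_sum(void) {
    return cpu_has_sse41() ? sum_sse4 : sum_generic;
}
```

A caller would do something like `sum_fn f = pick_sum();` once at startup and then call `f(v, n)`. With this pattern the `file` output would look exactly the same as above, because both code paths live inside the one i386 (or x86_64) slice.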