AME - transcoding 1080i DNxHD 120 footage to 1080p H.264 - how long?

Discussion in 'Digital Video' started by OneAnswer, Dec 24, 2014.

  1. OneAnswer macrumors member

    Sep 20, 2014

    I have to transcode around 150 hours of DNxHD 120 footage (25 fps, 1080i) via Adobe Media Encoder to H.264 MP4 files.
    I use the "HD 1080i 25" preset (Level 4.1, VBR, 1 pass, target bitrate 32 Mbps) and it takes two i7 CPUs (3770k and 4790k in two machines) roughly 1 minute to render 40 to 50 seconds of footage.
    AME is only using 50 to 60 percent of the CPU.
    HandBrake produces similar transcoding durations with lower quality settings.

    The source clips are QT Reference files (MOV and WAV, MOVs linking to MXF files in Avid MediaFiles folder).

    Is DNxHD the reason for the "delay"?

    PS: I use AME CC 2014 on OS X 10.8.5 (3770k) and OS X 10.10.1 (4790k).
  2. handsome pete macrumors 68000

    Aug 15, 2008
    Have you encoded any other types of files to get a comparison on times? Encoding 150 hours of footage is going to take an enormous amount of time regardless. However, if there is an issue, I'm thinking it might have to do with encoding via the reference files. See if you can encode a DNxHD file directly and see what kind of result that gets you.
  3. OneAnswer thread starter macrumors member

    Sep 20, 2014
    Thanks for the reply.

    Initial testing showed that exporting Same as Source QT .mov files took the same amount of time, even a bit longer, since those .mov files had to be written, and 25 GB takes longer to write than a 300 MB .wav plus a 300 KB .mov file, times 400 or so.
    But I am almost finished; it took my two computers almost a week.
  4. kohlson macrumors 68000

    Apr 23, 2010
    Seems about right. 7 days is 168 hours, and you were doing about 1:1 in your initial test run. The contention could be in several places that you might affect: Read/input, memory, CPU, output/write. But you could have spent days trying to figure it out, and it sounds like most of this was pretty automated. If it produced the desired result then it seems like it worked out. So, roughly, what was the size of input, and the resulting output?
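As a rough check on the "about a week" figure, here is a minimal Python sketch of the arithmetic. It uses only the numbers from the first post (150 hours of footage, 40 to 50 seconds rendered per minute, two machines). The post does not say whether that rate is per machine or for both machines combined, so both readings are computed; the function name is made up for illustration.

```python
# Rough wall-clock estimate for the 150-hour job described in the thread.
FOOTAGE_HOURS = 150   # "around 150 hours" from the first post
MACHINES = 2          # the 3770K and the 4790K


def wall_clock_hours(footage_hours, speed, machines, speed_is_per_machine):
    """Hours needed, where speed is seconds of footage rendered per second."""
    # Assumption: the machines split the work evenly and run in parallel.
    total_speed = speed * machines if speed_is_per_machine else speed
    return footage_hours / total_speed


for rendered_seconds in (40, 50):
    speed = rendered_seconds / 60  # fraction of realtime
    per_machine = wall_clock_hours(FOOTAGE_HOURS, speed, MACHINES, True)
    combined = wall_clock_hours(FOOTAGE_HOURS, speed, MACHINES, False)
    print(f"{rendered_seconds} s per minute: "
          f"{per_machine / 24:.1f} days if the rate is per machine, "
          f"{combined / 24:.1f} days if it is the combined rate")
```

Under these assumptions the job lands somewhere between roughly four and nine days, so "almost a week" is within the expected range either way.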
  5. OneAnswer thread starter macrumors member

    Sep 20, 2014
    I have two quad-core i7 desktop CPUs (3770K and 4790K). Both ran at 50 to 60 percent utilization, the 32 GB of RAM was hardly used (4 GB for AME), and read and write speeds were normal (all HDDs are capable of 120 MB/s and much more; they hardly reached 60 MB/s, and then only after the transcoding process).

    The size of the input was around 10 TB, the size of the output was around 2 TB or less.
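Those sizes can be cross-checked against the bitrates with a small Python sketch. It assumes DNxHD 120 runs at its nominal 120 Mbit/s, that the output averages the preset's 32 Mbit/s target, and that the footage is exactly 150 hours. Audio and container overhead are ignored, and `size_tb` is just an illustrative helper.

```python
def size_tb(bitrate_mbps, hours):
    """Size in decimal terabytes of a stream at a constant average bitrate."""
    bits = bitrate_mbps * 1e6 * hours * 3600
    return bits / 8 / 1e12


HOURS = 150                       # "around 150 hours" from the first post
source_tb = size_tb(120, HOURS)   # assumed nominal DNxHD 120 rate, video only
output_tb = size_tb(32, HOURS)    # target bitrate of the AME preset

print(f"source video: about {source_tb:.1f} TB")
print(f"H.264 output: about {output_tb:.1f} TB")
```

This gives roughly 8 TB of source video and a little over 2 TB of output, which is in line with the reported figures once the WAV audio and the "around" in both numbers are taken into account.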
  6. joema2 macrumors 65816


    Sep 3, 2013
    Since you're doing single-pass H.264 and have a 4790K available, in theory transcoding software that uses the chip's on-board Quick Sync transcoder would be the fastest. FCP X supports that, and so does Handbrake (Windows version only). I don't think AME does, but I am not sure.

    I did a quick test comparing FCP X 10.1.4 to Handbrake 0.10.0, and FCP X was modestly faster, but with less improvement than I expected. This was on a 2013 top-spec iMac 27. Producing a 1.15 GB 1080p/30 file took 01:42 on FCP X vs 02:48 on Handbrake using H.264 (x264). That's 102 seconds vs 168 seconds.

    Usually Quick Sync will make a 4x or 5x difference, so if OS X Handbrake isn't using that, they are doing something else to pick up the speed. That said, FCP X export was significantly faster than Handbrake, just not by as much as I expected.
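To put a number on that gap, here is a minimal Python sketch that converts the two export times quoted above into seconds and computes the ratio. The helper name `to_seconds` is made up for illustration.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' string such as '01:42' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


fcp = to_seconds("01:42")         # FCP X export time from the test above
handbrake = to_seconds("02:48")   # Handbrake (x264) export time
speedup = handbrake / fcp

print(f"FCP X: {fcp} s, Handbrake: {handbrake} s, speedup: {speedup:.2f}x")
```

The ratio works out to about 1.6x, well short of the 4x to 5x usually attributed to Quick Sync.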

    I think you're done now, but if you have additional huge transcode jobs it might be worth using the Windows version of Handbrake which supports Quick Sync, or just getting a copy of FCP X or Compressor and using that. Compressor can be installed stand-alone and it also supports Quick Sync.
  7. ColdCase, Jan 4, 2015
    Last edited: Jan 4, 2015

    ColdCase macrumors 68030

    Feb 10, 2008
    Dunno about you, but it looks to me like HB uses all 8 cores full tilt where FCPX uses most of 4 cores and a bit of the 4 others. At least that's what Activity Monitor tells me. If FCPX is 4 times faster but only using 1/3 to 1/2 of the CPU, it could explain your observations.

    Personally, FCP doesn't give me the control I want and, to my eye on a 60 inch screen, HB produces better "looking" video for the Apple TV and with smaller file sizes. ... but that's apples and oranges :)
  8. kohlson macrumors 68000

    Apr 23, 2010
    Thanks for the response. AME can take advantage of GPU capabilities, but requires a strict set of circumstances. Depending on what you have on each system, this may help. Also, are these Apple Macs, or custom configured? Diagnosing throughput is a challenge even with direct access to the systems, so I don't have much to offer beyond this. Glad it worked out for you, though.
  9. OneAnswer thread starter macrumors member

    Sep 20, 2014
    I tried Compressor too; it was also "slow" and not using all cores.

    I will keep Quick Sync in mind though, thanks, but I dislike using Windows (for its interface and colours) so much that I do not like the idea.


    Custom configured, and they worked quite well most of the time.
  10. joema2 macrumors 65816


    Sep 3, 2013
    A few minor versions ago I checked Compressor export speed on 1080p/30 H.264, and using single-pass it was as fast as FCP X. Now I just spent an hour trying many different Compressor export options, and for some reason it's no longer that fast. I don't know if FCP got faster or Compressor got slower, or if I'm doing something different.

    This is very important for anyone transcoding a lot of material because the performance differences can be huge depending on the options chosen. For a 2 min. test video, I got the export times below. Note Handbrake to MPEG-4 was fastest of all, despite it supposedly not using Quick Sync.

    Input file: 2 min 1080p/30 .mp4 file produced by FCP X, original material: Canon 5D3 1080p/30 IPB codec.

    Handbrake, codec=MPEG-4, CBR, QP=30: 0:21
    FCP X, master file, 1080p, H.264 "faster encode": 0:33
    Handbrake, codec=H.264 (x264): 0:46
    Compressor H.264 1080p custom preset, single pass H.264: 1:13
    FCP X, master file, 1080p, H.264 "better quality": 5:38
    Compressor H.264 1080p custom preset, multi-pass: 6:20
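For comparison, the timings above can be turned into realtime factors (clip length divided by encode time). This is a small Python sketch that assumes the test clip is exactly 120 seconds long; the labels are shortened versions of the entries in the list.

```python
CLIP_SECONDS = 120  # "2 min" test clip, assumed to be exactly 120 s

# Encode times from the list above, converted to seconds.
results = {
    "Handbrake, MPEG-4, CBR, QP=30": 21,
    'FCP X, H.264 "faster encode"': 33,
    "Handbrake, H.264 (x264)": 46,
    "Compressor, single-pass H.264": 73,
    'FCP X, H.264 "better quality"': 338,
    "Compressor, multi-pass H.264": 380,
}


def realtime_factor(encode_seconds, clip_seconds=CLIP_SECONDS):
    """Seconds of video produced per second of encoding time."""
    return clip_seconds / encode_seconds


for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:32s} {seconds:4d} s  {realtime_factor(seconds):5.2f}x realtime")

spread = max(results.values()) / min(results.values())
print(f"slowest / fastest: {spread:.1f}x")
```

Only the two slowest settings run below realtime, and the slowest takes about 18 times as long as the fastest.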

    Re not using all cores, you mean not all 8 virtual cores on a hyperthreaded 4-core CPU?

    There are cases where running 8 threads on such a CPU will cause substandard performance because two CPU-bound threads are competing for each real core and causing "cache thrashing". Whether this happens depends on the exact characteristics of those threads.

    I think the OS X thread scheduler is hyperthread-aware and may be intentionally scheduling threads on alternate virtual cores in this specific case.

    In *each* of my above tests on FCP/Compressor, iStat Menus showed alternate virtual cores were used. So CPU activity was about the same, but render time varied greatly. Handbrake showed all 8 threads heavily used.
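The "alternate virtual cores" pattern also fits the 50 to 60 percent overall CPU figure reported earlier in the thread. Here is a minimal Python sketch with made-up per-core loads (these are illustrative values, not measurements from any of the machines discussed) showing how an overall figure is just the average across logical cores.

```python
def overall_utilization(per_core_percent):
    """Average load across all logical cores, as a percentage."""
    return sum(per_core_percent) / len(per_core_percent)


# Hypothetical loads on a 4-core, 8-thread CPU:
# one busy thread per physical core, the sibling logical core nearly idle.
alternate_cores = [100, 5, 100, 5, 100, 5, 100, 5]
# All 8 logical cores heavily used, as described for Handbrake.
all_cores = [95] * 8

print(f"alternate cores busy: {overall_utilization(alternate_cores):.1f}%")
print(f"all cores busy:       {overall_utilization(all_cores):.1f}%")
```

With these assumed loads, four fully busy physical cores show up as only about half of the total, so a reading near 50 percent does not by itself mean the encoder is leaving half the processor idle.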

    In general GPU-accelerated rendering should not be faster than Quick Sync. While Quick Sync requires the on-chip HD graphics, it's not really GPU-assisted. Rather, the hardware resources QS needs are tied into the on-chip GPU. Quick Sync is essentially an on-chip custom ASIC designed specifically for transcoding. It only works for single-pass MPEG-2 and H.264, but for those cases it's generally faster than software or GPU-assisted methods. The problem is that software usually doesn't indicate whether QS is being used, and if incompatible parameters are chosen, the task can silently revert to software-based rendering. I can't explain why Handbrake is so fast to MPEG-4, unless maybe it's using QS. Windows Handbrake supposedly does, OS X Handbrake supposedly does not, but who knows?

    Larry Jordan article on using Quick Sync in Compressor:
