Regarding your articles, that would be very kind of you.
Warning, they are pretty long ;-) I start with the Subler one - if you also need the SubRip one (which is only a bit shorter), let me know. Note that the info in the original article is updated via several "update" sections in the next post.
The article:
After yesterday's article on Subler, let me present some additional tips and tricks for this excellent remuxer tool. Today, I'll cover a fairly new and really excellent feature of Subler: optical character recognition (OCR for short), which quickly recognizes the text in bitmap subtitles, that is, the default subtitle formats of DVD's, Blu-ray discs and DVB broadcasts. With this feature, you can very easily convert even the subtitle tracks of your DVD's and Blu-ray discs for playback / rendering on iOS devices, something impossible with the original, bitmap (non-textual) subtitles when using the stock
Videos player.
Subler's approach is vastly different from that of
SubRip, the traditionally used app to OCR bitmap subtitles. Subler doesn't require any kind of manual character training: it does everything itself, taking its language-specific data from standard, language-specific dictionaries.
To enable OCR for non-English languages, just download the file
http://code.google.com/p/tesseract-ocr/source/browse/trunk/tessdata/<language code>.traineddata and copy it to
~/Library/Application Support/Subler/tessdata (after creating the directory). For example, for Finnish, you'll need the
fin.traineddata file. You can copy several language files there.
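From Terminal, the copy step can be sketched roughly as follows. This is only a minimal sketch: `install_traineddata` is a hypothetical helper name, and the `~/Downloads` location in the usage line is an assumption about where the language file was saved. The default destination is the directory named above.

```shell
# Copy a Tesseract language file into Subler's tessdata directory.
# install_traineddata is a hypothetical helper, not part of Subler.
install_traineddata() {
  src="$1"
  # Default destination: the directory named in the article.
  dest="${2:-$HOME/Library/Application Support/Subler/tessdata}"
  if [ ! -f "$src" ]; then
    echo "Language file not found: $src" >&2
    return 1
  fi
  # Create the directory if it doesn't exist yet, then copy the file.
  mkdir -p "$dest" && cp "$src" "$dest/"
}

# Example (assumes fin.traineddata was downloaded to ~/Downloads):
# install_traineddata "$HOME/Downloads/fin.traineddata"
```

Calling the helper once per language file should be enough if you need several languages.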
After this, if you open / import MKV files containing bitmap subtitles, the subtitle tracks will be OCR'ed and exported as text (unless you manually override the default action).
Note that, in this article, I pay special attention to including
both the
textual (OCR'ed) subtrack and the original
graphical bitmap-based subtitle track. While Subler's OCR support is excellent, there still might be cases where it misrecognizes something. In such cases, it's better to be safe: you can always switch on the embedded graphical subtrack in both desktop players like VLC and some third-party iOS ones like
AVPlayerHD. Then, you'll easily and quickly find out what has been recognized wrong and can avoid misunderstandings.
DVD subtitles (subs for short)
The workflow with DVD's is much simpler than both DVB (on which I'll publish a separate article) and Blu-ray sub imports: you won't need to use any additional software at all.
1, Open (
File > Open) the MP4 / M4V file created by HandBrake (already having bitmap VobSub subs) from the original MKV created by
MakeMKV (here,
lupaus-title04-noburntinsubs.m4v; both this and the original MKV file can be downloaded from
THIS article):
2, click the "+" button (annotated above) and select the original MKV file (also mentioned in Bullet 1). Deselect all the
non-subtitle tracks (unless you also want to include, for example, additional audio tracks). The subtitle tracks' action will be 3GPP Text, meaning they will be OCR'ed - exactly what we need.
Click
Add.
3, you'll see this:
You can safely save your file now. Note that you don't need to enable the checkboxes of all the subtracks you want to save: while Subler only selects the first of them in the list, it'll nevertheless save them all.
After saving (during which Subler OCR's the just-added subtracks), the
Format column of the just-imported subtitle tracks will change from
VobSub to
3GPP Text, showing they're now textual (annotated on the right; see below for the Text annotation on the left):
Here, you can also modify the name of the track so that you can easily see which track is bitmap and which is textual. For example, in the screenshot above, I've changed Subtitle Track to Text for all the textual subtracks (also annotated).
Now, VLC displays the new sub list the following way, making it easy to select the right subtrack based on its type (textual vs. bitmap):
Blu-ray subs
Unfortunately, as opposed to DVD subs, Subler doesn't support
S_HDMV/PGS subtracks, the native sub format of Blu-ray discs. If you try to pass them through, Subler won't create a usable file; if you set
Action to
3GPP Text while opening the MKV file, no subtracks will be written to the target file.
Basically, you'll need to extract these subtracks, convert them to the Subler-friendly DVD-based IDX / SUB-format and re-add them to the MKV. Then, you'll already be able to add them, both in their original (bitmap) and OCR'ed form, to the target MP4's.
Let's start with extraction. Unfortunately, my long-time favorite,
iMkvExtract, doesn't support extracting these subtracks: it just doesn't export anything if you select one or more
S_HDMV/PGS subtracks.
For this tutorial, I've selected a part of the Blu-ray version of the excellent Iron Sky movie in which German is spoken, so that I can provide you with a test video with three subtitle languages to play with. (There are no English subtitles for English speech, and the Behind the Scenes section of the disc only contains Finnish subs, not English / Swedish ones; the Blu-ray is only sold in Finland, which is why there are not even Swedish subs there.) The video chunk is
HERE; feel free to download it and play with its subtracks.
1, get and install
MKVtoolnix (fortunately, it's a simple DMG file). Start it.
2, click
Add (annotated below) and load the MKV file:
3, in the
Tracks, chapters and tags list, look for entries starting with
S_HDMV/PGS. Immediately following this type, in the parentheses, there will be some (track) ID's: in the above screenshot (also annotated), these are 5, 6 and 7.
4, for the next part, you'll need to switch to the
Terminal to access the command-line interface of the
mkvextract program directly. Fortunately, it's part of
MKVtoolnix so you don't need to install it separately.
If you've dragged
MKVtoolnix to
Applications/Video, just issue the following command in Terminal (assuming you're in the same directory as your source MKV file; if you aren't, use the absolute / relative path to the MKV file):
/Applications/Video/Mkvtoolnix.app/Contents/MacOS/mkvextract tracks MKVfilename trackID1:outputSUPfilename1 [trackID2:outputSUPfilename2 [trackIDN:outputSUPfilenameN]]
For example, in our case with three subtracks with ID's 5, 6, 7 and with a source MKV file named
IronSkyMAIN-rip.mkv, the command will look as follows:
/Applications/Video/Mkvtoolnix.app/Contents/MacOS/mkvextract tracks IronSkyMAIN-rip.mkv 5:sup1.sup 6:sup2.sup 7:sup3.sup
An example screenshot with the results:
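If you have many subtracks, typing the ID:filename pairs by hand gets tedious. A minimal sketch of building them in the shell follows; `build_extract_args` is a hypothetical helper name, the `supN.sup` naming simply follows the example above, and the mkvextract path assumes the install location used in this article. The sketch only prints the command so you can check it before running it.

```shell
# Path taken from the article; adjust it if MKVtoolnix lives elsewhere.
MKVEXTRACT="/Applications/Video/Mkvtoolnix.app/Contents/MacOS/mkvextract"

# Print "ID:supN.sup" pairs for the given track IDs (hypothetical helper).
build_extract_args() {
  n=1
  args=""
  for id in "$@"; do
    # Append a space separator only when args is already non-empty.
    args="$args${args:+ }$id:sup$n.sup"
    n=$((n + 1))
  done
  printf '%s\n' "$args"
}

# Print the full command for track IDs 5, 6 and 7 instead of running it.
echo "$MKVEXTRACT tracks IronSkyMAIN-rip.mkv $(build_extract_args 5 6 7)"
```

For the IDs above, the printed command is the same as the one typed manually.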
Now, you'll need to convert these BD-specific sup files to traditional IDX / SUB pairs. Unfortunately, most of the traditional tools like
SubtitleCreator 2.3rc1 (which I used in a previous article for DVB TS SUP -> IDX + SUB conversion) don't recognize the format; neither does
SubMagic (which doesn't handle DVB TS SUP's either, BTW). The tool I recommend is, fortunately, fully OS X-compliant as it's written in Java:
BDSup2Sub (
dedicated thread). Just download
BDSup2Sub.jar (the current, stable 4.0.1 version will be just fine) and double-click it.
When the GUI is displayed, select
File > Load and load the SUP files, one by one. Just click OK on the first two dialogs to dismiss them; after that, select
File > Save/Export and, there, after setting the export language,
Save:
Now, to add the new, converted subtracks back to the MKV file, go back to
MKVtoolnix and click the same
Add button as above. Add the IDX files only (there's no need to manually add the .sub files). You can mass-add them if you hold the Cmd key while clicking for multiple selection. After adding the three of them, MKVtoolnix will show the following:
Now, just click Start Muxing at the bottom left. The MKV file will be muxed and will now also contain the DVD-format VobSub tracks, which are compatible with Subler.
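The same muxing step can presumably also be done from Terminal with mkvmerge, which takes the output file after -o, followed by the input files. The sketch below is hedged accordingly: it assumes mkvmerge sits next to mkvextract in the app bundle, `build_mux_command` is a hypothetical helper, and the output name `IronSkyMAIN-vobsub.mkv` is made up for illustration. It only prints the command.

```shell
# Assumption: mkvmerge sits next to mkvextract inside the app bundle.
MKVMERGE="/Applications/Video/Mkvtoolnix.app/Contents/MacOS/mkvmerge"

# Print an mkvmerge command line (hypothetical helper):
# first the output file, then the source MKV, then the IDX files.
build_mux_command() {
  out="$1"
  src="$2"
  shift 2
  printf '%s -o %s %s' "$MKVMERGE" "$out" "$src"
  for idx in "$@"; do
    printf ' %s' "$idx"
  done
  printf '\n'
}

# IronSkyMAIN-vobsub.mkv is a made-up output name for illustration.
build_mux_command IronSkyMAIN-vobsub.mkv IronSkyMAIN-rip.mkv sup1.idx sup2.idx sup3.idx
```

As in the GUI, only the .idx files are listed; the helper does not quote filenames, so it is only suitable for names without spaces.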
Now, what you will need to do is straightforward.
1, Open the MKV file in Subler. Don't touch anything in the open dialog: do NOT try enabling the
S_HDMV/PGS subtracks!
2, Click
Add and, then, you can save your video right away (Cmd + S): it'll have the OCR'ed subtitle tracks.
If you
also want to save the DVD-compliant VobSub bitmap subtracks in the same target MP4 file, in addition to the just-created OCR'ed versions of them, you'll need to do exactly the same as with DVD's. While still having the just-remuxed (target) MP4 file in Subler, click + in the upper left corner, select the MKV file (again) and set every single VobSub track action to
Passthru from the default
3GPP Text; also, don't forget to disable all the non-VobSub tracks (all audio / video etc. tracks) so that they aren't duplicated in the target file:
To avoid the bitmap subtitles being shown with extra large, blown-up characters, you'll also want to decrease their size after(!!!) saving (Cmd + S). (Changes made before exporting VobSub tracks won't be visible.) To do this, click each of the just-added VobSub subtitle tracks (not the older textual ones!) and enter 1920 in the first field after
Scaled Size (and press Tab) and 540 in the second, instead of the original 640 and 480, respectively (if it shows 0, make sure you save the file first!):
Now, you can just save the file. (Again, here, you can also change the subtrack names to reflect their being bitmaps.)
Why just 540, you may ask? I've found it to work best. If you keep the default height (after you enter 1920 in the first text field, it'll be computed to be 1440, as can also be seen in
THIS screenshot), the bitmap subs will be in the center of the screen as can be seen in the following screenshot (click it for the original-sized one):
![]()
After changing the default 1440 to 540, the subtitle will be a bit distorted (vertically scaled) but, at least, displayed at the bottom of the screen:
![]()
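The 1440 default mentioned above is consistent with Subler keeping the 4:3 ratio of the original 640 x 480 scaled size when the width changes. That is my reading of the numbers rather than something confirmed from Subler itself; the arithmetic can be checked with a small sketch, where `scaled_height` is a hypothetical helper:

```shell
# Height that preserves the aspect ratio of the original scaled size
# (640x480 by default). scaled_height is a hypothetical helper.
scaled_height() {
  new_width="$1"
  orig_width="${2:-640}"
  orig_height="${3:-480}"
  echo $(( new_width * orig_height / orig_width ))
}

scaled_height 1920   # prints 1440
```

Entering 540 instead overrides this computed value, which is why the subs end up slightly squashed vertically but sit at the bottom of the picture.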