
amoergosum

macrumors 6502
Original poster
Oct 20, 2008
377
43
I'd like to extract subtitles from a DVD. I guess the subtitles should be extracted to a .srt file.
Does anyone know how to do it? What software do you recommend?
 
I just downloaded subler.
When I click on File/Open and then on the DVD two folders appear >>>

"Audio_TS"

and

"Video_TS"

When I click on "Video_TS" all files inside are greyed out.

Thought that might happen. It's probably DRM-locked. Did you try MP4Box? HandBrake might work.
 

I just tried handbrake. The files inside the "Video_TS" folder are not greyed out so that's a good thing.
But there's no .srt output. All I can find is >>>

Output Settings
Format: MKV File
or
Format: MP4 File
 

DVD subtitles are not stored as text, but as pictures overlaid on the video. It is not strictly extraction, but this (ridiculous jumping through hoops) may be sufficient for your needs:

Rip your DVD to an unencrypted VIDEO_TS folder using your favorite DVD ripper (RipIt, Fairmount, etc.).
Open the VIDEO_TS folder with Handbrake. Select the subtitles you want but uncheck the "burned in" option. Let Handbrake process the file. You will end up with an M4V file with DVD (VOBSUB) subtitles. Duplicate this file so you have two copies.

Open the first M4V file with Subler. You should see your video, audio, and subtitle tracks. Delete the subtitle track. Click the "+" button to add a track, navigate to the copy and open it. Uncheck all the tracks except for the subtitle, and verify the "action" is "Tx3g". Click "add" and then save the file. Subler will translate the VOBSUB subtitles to text using an OCR process. In the end, you'll have an M4V file with text subtitles. You can extract those with any number of tools, e.g. MP4Tools.

Edit: upon reflection, you can avoid making two copies by using the 'New' menu item in Subler - my workflow has artifacts.


A.
(if anyone has a more straightforward method, I would love to hear it)
 
I was able to extract the subtitles using HandBrake (DVD to MKV) and iMkvExtract (extraction of the subtitles).
 

Well, I did this, but I don't know how to make use of the extracted files, which have the suffixes ".sub" and ".idx". Is there any way to convert these to .srt files, or at least view their contents? (When I double-click them, they open in VLC, but nothing is displayed.)

Thanks in advance for any guidance anyone can offer.
 
I find that a combination of HandBrake, Subler and Aegisub is the best way to extract subtitles. The subtitles on a DVD are basically bitmaps (a format known as VobSub) and do not look very good.

Here is an example of the difference:
vlcsnap-VobSub_zpsyxbpt1w6.png

vlcsnap-SRT_zpspjb4zr89.png


First, use HandBrake to convert the DVD into an MP4 file. In this process, pay attention to the "Subtitles" tab:
Handbrake-subtitles_zpsv4qq7kde.png

This will include the bitmap subtitles in the video file. If your player can work with these and you can live with this format, you don't need to do any more. The "Burned in" option means that the subtitle will be rendered into the video frames, so it can't be turned off during playback.

If you want the subtitle to be converted into text (SRT) format, Subler works really well for transforming the images through an OCR process.

For OCR recognition of subtitles that are not in English, I found that it works best if you download the latest training data for that language from the tesseract-ocr project on GitHub.

First open this folder: "~/Library/Application Support/Subler/"
Then create a new folder inside it, called: "tessdata" (if it doesn't already exist).
Now download your trained language file from GitHub and put it inside the "~/Library/Application Support/Subler/tessdata/" folder. For Danish, the file is called: "dan.traineddata"
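
If you'd rather script the folder setup than click through Finder, here is a minimal Python sketch of the same steps. The function name `install_traineddata` and the `support_dir` parameter are made up for illustration (they are not part of Subler), and it assumes you have already downloaded the .traineddata file yourself.

```python
import shutil
from pathlib import Path


def install_traineddata(downloaded_file, support_dir=None):
    """Copy a Tesseract .traineddata file into Subler's tessdata folder.

    support_dir is a hypothetical override, handy for trying this out
    without touching your real Library folder.
    """
    if support_dir is None:
        # The folder named in the steps above.
        support_dir = Path.home() / "Library" / "Application Support" / "Subler"
    tessdata = Path(support_dir) / "tessdata"
    # Create "tessdata" only if it doesn't already exist.
    tessdata.mkdir(parents=True, exist_ok=True)
    target = tessdata / Path(downloaded_file).name
    shutil.copy2(downloaded_file, target)
    return target
```

For the Danish file saved to your Downloads folder, the call would look something like `install_traineddata(Path.home() / "Downloads" / "dan.traineddata")`.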

To convert your subtitles that are contained (in VobSub format) inside an MP4 file, the process goes like this:

Open Subler
Select File -> New
Drag your MP4 file into the top window (a dialog will come up and ask which tracks you want to include). Select all or select just the subtitle you want to convert. Then click Add.
Now you are back in the main window, and here it is important to set the correct language of the subtitle. Here is how I did it for a Danish subtitle:
Subler_zpsv0elu56s.png

Now select File -> Save as
Choose a different name, so that you don't overwrite your original, and let it do the saving. In this process, which takes a few seconds, the OCR will do its magic and convert the VobSub into text.
Once done, select your subtitle track in the upper window and select File -> Export. Type a name for your subtitle (e.g. "Jurassic Park.srt") and click Save.
You can delete the temporary MP4 that Subler created as it isn't really needed.

Subler does more than just the OCR of the subtitle. This is why the process seems a little strange. But I find that in almost all cases, you will want to run a spell-checker against your subtitle. So I always export it to SRT.

To check for spelling errors (or to make other adjustments), I found that Aegisub works well.
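
Before spell-checking, it can be worth confirming that the OCR actually produced text for every cue. Below is a rough Python sketch that reads the contents of an exported .srt file and lists cues whose text is empty. It assumes the usual SRT layout (a counter line, a timing line such as `00:00:01,000 --> 00:00:02,500`, then the text, with a blank line between cues); `parse_srt` and `empty_cues` are names I picked for the example, not part of any of the tools above.

```python
import re

# Timing line of an SRT cue, e.g. "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(text):
    """Return a list of (start, end, body) tuples, one per cue."""
    cues = []
    text = text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is the subtitle text.
                body = "\n".join(lines[i + 1:]).strip()
                cues.append((match.group(1), match.group(2), body))
                break
    return cues


def empty_cues(cues):
    """Return the (start, end) times of cues that have no text."""
    return [(start, end) for start, end, body in cues if not body]
```

You would read the file yourself (for example with `Path("Jurassic Park.srt").read_text()`) and pass the text in. If `empty_cues` returns most of the cues, the OCR step probably needs another run with the right language selected.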

I hope this guide will help people, as it took me quite a while to discover how these different tools work.
 
I have found it much easier to just download movie subtitles and use Subler to embed the file, using its timing function to adjust, if needed, to sync with the video. Before doing it that way, I also tried the OCR method. There were just too many spelling corrections, and it was taking hours. But I appreciate your tutorial, as I suppose there are some movie subtitles that are not available, or at least not available in the needed language.
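
For anyone who wants to fix the sync on a standalone .srt file instead of inside Subler, here is a small Python sketch that shifts every timestamp by a fixed number of milliseconds. This is not how Subler's timing function works internally (I don't know that); it is only an illustration of the idea, and `shift_srt` is a made-up name.

```python
import re

# One SRT timestamp: hours:minutes:seconds,milliseconds
STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text, offset_ms):
    """Shift all cue times in SRT text by offset_ms (negative = earlier)."""

    def shift(match):
        hh, mm, ss, ms = (int(g) for g in match.groups())
        total = ((hh * 60 + mm) * 60 + ss) * 1000 + ms + offset_ms
        total = max(total, 0)  # never go before the start of the video
        hh, rest = divmod(total, 3_600_000)
        mm, rest = divmod(rest, 60_000)
        ss, ms = divmod(rest, 1000)
        return f"{hh:02d}:{mm:02d}:{ss:02d},{ms:03d}"

    out = []
    for line in text.splitlines(keepends=True):
        # Only touch timing lines, so numbers in the dialogue are left alone.
        if "-->" in line:
            line = STAMP.sub(shift, line)
        out.append(line)
    return "".join(out)
```

A positive offset delays the subtitles and a negative one makes them appear earlier. This only handles a constant offset; if the subtitles drift further out of sync over time, a simple shift won't fix it.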
 
I completely agree that if your DVD is a mainstream movie that you can find on opensubtitles.org or a similar site, this is much easier than trying to do the same work yourself. But in some cases it just doesn't exist.
 
I wanted to sign in and say thank you for this guide.

I had to redo the process a few times, since the SRT files would sometimes come up blank, but I eventually got all of them from this obscure Belgian show so I can translate them.

Thank you again :D
 