DavidCar

macrumors 6502a
Original poster
Jan 19, 2004
525
0
Is there a way to extract closed captions into a textfile from an EyeTV recording?

It would be nice if they could build this into EyeTV, then add the capability of searching a recording based on words in the closed captions.

Thanks,
DC
 
In 1992 It Was A Piece of Cake. Now Not So Much.

Not that I know of. We used to be able to do that with the old ATI All-In-Wonder card in the early 1990s from analog broadcasts. I used to capture Bill Clinton speeches from his 1992 campaign all the time with it. So if this were 15 years ago it would be a piece of cake. Now, not so much. I recommend you ask Elgato to add that functionality through their feedback mechanism. Under the Help menu is online support where they will take your input.
 
Will QuickTime Pro Extract Text?

I just read that QuickTime Pro can do a "Text to Text" export, but I don't know if that would work with a recording exported from EyeTV. Does anyone know?

http://www.apple.com/quicktime/tutorials/texttracks.html

I just posted the request with Elgato.
 
Those seem more like CC creator tools to me. I thought the OP wanted to extract them?

Might this be of some use?

http://sourceforge.net/projects/ccextractor

B

This seems to be what I want, if I could figure out how to compile it, or get an OSX version.

CCExtractor uses MPEG2 files, which I can generate from EyeTV. I don't believe the other references above could use MPEG2. I hope the captions are preserved when EyeTV does the export to MPEG2. I tried exporting from EyeTV to a .mov file, and the captions went missing.
 
Playing with captions has been on my list of projects for a while now. If I get anywhere with it I'll check back in on this thread.

EDIT: It built really easily. PM me with an e-mail address I can send the binary to and I'll send it along. You'll need to run it from Terminal.

B
 
I just managed to build it also. So somewhere I have a binary. I don't know where it went, but I've got an MPEG2 file I'm now going to try it with. I'm assuming it went to a place where Terminal can find it.

I had to edit the build file to make it work, because I don't speak enough Unix to tell the Terminal where to find the files.
 
I created an extract file, but the file is unreadable by TextEdit. It took maybe 30 seconds to process a 30-minute HDTV MPEG2 file.
 
Yeah, the output format seems to be non-text. Lots of good info & utilities for Windows here:

http://www.geocities.com/mcpoodle43/SCC_TOOLS/DOCS/SCC_TOOLS.HTML

EDIT: Especially Converting SCC to a Readable Format: CCASDI :p

B

I added the "-srt" option and created a file that can be read by TextEdit, but now I need to find a way to remove the time codes. It took 67 seconds, not the 30 noted above.

I'm trying to also use the "-cf" option for "clean file" hoping that removes the time codes, but I'm apparently not getting my Unix syntax correct for a double option.

Later: the "-cf" option didn't do what I wanted it to do.
 
I emailed Carlos Fernandez, the author of CCExtractor, and he thought it was a good idea for him to add an option for creating text transcripts without the time codes. If I hear that he has done it, I'll post a message in this thread.
 
FWIW, this is what a -srt file looks like when read by TextEdit. This is from a PBS Nova program:

1
00:02:12,565 --> 00:02:15,067
(men yelling, grunting)

2
00:02:19,839 --> 00:02:23,175
NARRATOR:
The samurai were the heroes
of ancient Japan.

3
00:02:25,211 --> 00:02:26,845
(yelling)

4
00:02:26,913 --> 00:02:29,281
Still today,
their legend continues.

5
00:02:32,152 --> 00:02:33,352
The samurai's sword
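
For anyone who wants a transcript without the time codes before CCExtractor grows an option for it, here is a rough Python sketch, assuming the -srt output always looks like the sample above (a counter line, a time-code line, then one or more caption lines, with a blank line between entries). The function names and the idea of running it as a script are my own, not part of CCExtractor. It also includes a simple word search that reports the start time of each matching caption, which is roughly the search feature asked for at the top of the thread.

```python
import re
import sys

# Matches an SRT time-code line such as "00:02:12,565 --> 00:02:15,067".
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def parse_srt(text):
    """Return a list of (start_time, caption_text) tuples from SRT text."""
    cues = []
    # Entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Everything after the time-code line is caption text.
        for i, line in enumerate(lines):
            if TIMECODE.match(line):
                start = line.split(" --> ")[0]
                cues.append((start, " ".join(lines[i + 1:])))
                break
    return cues


def to_transcript(text):
    """Drop counters and time codes, keeping one caption per line."""
    return "\n".join(caption for _, caption in parse_srt(text) if caption)


def search(text, word):
    """Return (start_time, caption) for captions containing the word."""
    word = word.lower()
    return [(start, caption) for start, caption in parse_srt(text)
            if word in caption.lower()]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 this_script.py captions.srt > transcript.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        print(to_transcript(f.read()))
```

Multi-line captions are joined into one line here, which suits a transcript but loses the original line breaks. If the real files use a different layout than the sample, the time-code pattern would need adjusting.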
 
I'm almost there but need some help

I've been following this thread and am now stuck due to my lack of Unix skills and familiarity with macs. If anyone could help me I would really appreciate it.

I was able to build a Unix executable file called ccextractor. When I open this file it shows me the documentation for the program and ends with the following:

Error: (This help screen was shown because there were no input files).

The beginning of the documentation shows the following syntax:

ccextractor [options] inputfile1 [inputfile2...] [-o outputfilename] [-o1 outputfilename1] [-o2 outputfilename2]

PLEASE HELP!!

My question is: where do I enter this syntax? When I type it into Terminal, I get:

ccextractor: command not found
 
Instead of typing the command, drag and drop the executable file into the Terminal window. You need the absolute path of the executable, which Terminal will fill in for you. Then you can finish entering the arguments as the documentation shows.
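
If you would rather not drag and drop each time, the same idea can be sketched in Python: build the command from the executable's absolute path so that Terminal's search path never comes into it. This is only an illustration. The file locations in the comment are hypothetical, the helper names are mine, and the arguments follow the syntax line quoted from the documentation plus the "-srt" option mentioned earlier in the thread.

```python
import os
import subprocess


def build_command(executable, input_file, output_file):
    """Build the argument list: ccextractor -srt inputfile -o outputfile."""
    # Expand "~" and make the path absolute, so "command not found"
    # cannot occur just because the binary is not on the search path.
    exe = os.path.abspath(os.path.expanduser(executable))
    return [exe, "-srt", input_file, "-o", output_file]


def run_ccextractor(executable, input_file, output_file):
    """Run the extractor and raise an error if it exits with a failure."""
    subprocess.run(build_command(executable, input_file, output_file),
                   check=True)


# Example call (hypothetical locations, adjust to wherever your files are):
# run_ccextractor("~/Desktop/ccextractor", "nova.mpg", "nova.srt")
```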
 
Got it! Thanks so much!