Delete SRT subtitles markup

Discussion in 'Mac Programming' started by tcwdoggy, Jul 15, 2017.

  1. tcwdoggy macrumors newbie

    Joined:
    Jun 13, 2012
    #1
    Hi,

    I want to write a simple AppleScript that can delete the italics markup "<i>" in SRT files, but the "<" and ">" are causing problems. What is the correct format to enter in this case?

    This is what I have.
    Thanks.

    tell application "TextEdit"
    set every word of front document where it is "<i>" to ""
    end tell
     
  2. cruisin macrumors 6502a

    Joined:
    Apr 1, 2014
    Location:
    Canada
    #2
    < and > have special meanings in many computing contexts, so you might have to "escape" them so that they are treated as plain text, usually by putting a "\" character in front of them.

    But why not use the find and replace option in the text editor of your choice?
     
  3. tcwdoggy thread starter macrumors newbie

    Joined:
    Jun 13, 2012
    #3
    Because this is just the beginning of something more. I did try your suggestion previously, though, and I don't think it worked at all.

    Thanks anyway.
     
  4. TheTruth101 macrumors regular

    Joined:
    Mar 15, 2017
    #4
    There is this software that runs on PC, is free, and does wonders. We use it for closed captioning. The best. It may work for you: http://www.nikse.dk/subtitleedit/
     
  5. Childs, Sep 28, 2017
    Last edited: Sep 29, 2017

    Childs macrumors member

    Joined:
    May 28, 2010
    #5
    This is kinda late, and I don't really know AppleScript, but using sed (note that this strips every tag, not just the italics ones):

    Code:
    sed -e 's/<[^>]*>//g' tmp.srt
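    So you can probably do something like this in AppleScript (you will likely need full paths to the files, since do shell script does not start in your script's folder):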

    Code:
    do shell script "sed -e 's/<[^>]*>//g' tmp.srt > tmp.notags.srt"
    
    Or in Python:

    Code:
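    import re, os, shutil

    srt_file = os.path.expanduser('~/tmp/sample.srt')
    # Keep a backup copy of the original before editing it in place.
    shutil.copy(srt_file, srt_file + '.orig')

    # The file is opened in binary mode, so under Python 3 the pattern
    # and the replacement must be bytes as well.
    with open(srt_file, 'r+b') as fp:
        data = fp.read()
        fp.seek(0)
        fp.write(re.sub(rb'<[^>]*>', b'', data))
        fp.truncate()  # drop the leftover bytes of the longer original

    print('Finished.')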
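
    The pattern above removes every tag. If only the italics markup from the original question should go, a narrower pattern might look like the sketch below. The helper name strip_italics and the sample cue are made up for illustration, and it assumes the tags are written exactly as <i> and </i> (in any letter case) with no attributes.

```python
import re

# Hypothetical helper: matches only <i> and </i>, ignoring letter case.
ITALIC_TAG = re.compile(r"</?i>", re.IGNORECASE)


def strip_italics(text: str) -> str:
    """Return text with <i> and </i> removed; any other tag is left alone."""
    return ITALIC_TAG.sub("", text)


# A made-up SRT cue: index line, timing line, then the subtitle text.
sample = "1\n00:00:01,000 --> 00:00:02,500\n<i>Hello</i> <b>World</b>\n"
print(strip_italics(sample), end="")
```

    The timing line is not affected, because its "-->" arrow does not form an <i> or </i> tag.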
    
     
  6. Mark FX, Sep 29, 2017
    Last edited: Sep 29, 2017

    Mark FX macrumors regular

    Joined:
    Nov 18, 2011
    Location:
    West Sussex, UK
    #6
    You can use AppleScript to separate out the "<i>" and "</i>" tags found in SRT files, but it would involve using AppleScript repeat loops on every paragraph or word in the TextEdit document's text, which would be very slow.

    So the above shell techniques would be a better option.

    But if you're determined to use AppleScript, then the better option would be to use the Cocoa framework and its classes for such a task.

    So use this example text below to practice with, in a new TextEdit document, before letting this example code loose on your desired files, to make sure it's acting as desired.

    original text
    Code:
    <i>Hello</i> <i>World</i> <i>!</i>
    <i>Hello</i> <i>Again</i> <i>World</i> <i>!</i>
    <i>Hello</i> <i>Yet</i> <i>Again</i> <i>World</i> <i>!</i>
    
    AppleScript code
    Code:
    use AppleScript version "2.4" -- AppleScript 2.4 shipped with OS X 10.10 (Yosemite)
    use scripting additions
    
    use framework "Foundation"
    
    tell application "TextEdit"
        set documentText to (text of front document) as Unicode text
    end tell
    
    set documentString to current application's NSString's stringWithString:(documentText)
    
    set documentString to documentString's stringByReplacingOccurrencesOfString:("<i>") withString:("")
    
    set documentText to (documentString's stringByReplacingOccurrencesOfString:("</i>") withString:("")) as Unicode text
    
    tell application "TextEdit"
        set text of front document to documentText
    end tell
    
    You haven't stated which version of Mac OS you're using, so the above example assumes you're using OS X 10.10 or later.

    And you should end up with this text below in your TextEdit document.
    Code:
    Hello World !
    Hello Again World !
    Hello Yet Again World !
    
    I have NOT included any error checking in the above AppleScript code, which I would encourage you to add in any code you write.
    You should of course also check that the TextEdit application is running, and has a document open, before trying to instruct it with AppleScript tell blocks; otherwise you will get AppleScript runtime errors.
    So please see this code only as a quick and dirty proof of concept; it does NOT represent a finished or complete AppleScript application.
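
    For comparison with the Python approach earlier in the thread, the same two-step literal replacement can be sketched in Python. This is only an illustration run against the practice text above, not a translation of the Cocoa calls; the function name remove_italic_tags is made up here.

```python
# The practice text from the post above, one subtitle line per row.
original = (
    "<i>Hello</i> <i>World</i> <i>!</i>\n"
    "<i>Hello</i> <i>Again</i> <i>World</i> <i>!</i>\n"
    "<i>Hello</i> <i>Yet</i> <i>Again</i> <i>World</i> <i>!</i>\n"
)


def remove_italic_tags(text: str) -> str:
    """Hypothetical helper: two plain substring replacements, no regex."""
    # "<" and ">" are ordinary characters in a literal replacement,
    # so no escaping is needed.
    return text.replace("<i>", "").replace("</i>", "")


cleaned = remove_italic_tags(original)
print(cleaned, end="")
```

    Because the replacement is literal rather than pattern based, only the exact strings "<i>" and "</i>" are removed.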

    Hope this helps

    Regards Mark
     
