Text Manipulation

Discussion in 'Mac Programming' started by Waragainstsleep, Oct 8, 2014.

  1. Waragainstsleep macrumors regular

    Joined:
    Oct 15, 2003
    Location:
    UK
    #1
    This isn't necessarily a programming issue, but the folks here are perhaps more likely to have an answer.

    I have a plist file which consists of entries like this one:

    <dict>
    <key></key>
    <string>http://www.url.com</string>
    <key>title</key>
    <string>9ISyAl7.jpg 720×960 pixels</string>
    <key>lastVisitedDate</key>
    <string>433770848.7</string>
    <key>D</key>
    <array>
    <integer>1</integer>
    </array>
    <key>visitCount</key>
    <integer>1</integer>
    </dict>



    There are over 5000 of these in the plist.

    I'd like to get them into a spreadsheet with the URL in one column, the title in the next column, the last visited date in the next column and the visit count in the last column.

    Does anyone know a quick way to accomplish this please?
     
  2. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #2
    Show the complete structure of the plist file. For example, is the dict you showed an array member or a member of another dict?

    The command-line tool PlistBuddy might be usable, but you need to show the exact plist structure. Best of all would be a complete XML plist file with only a few items, say 3, that you want extracted. In short, provide small but well-formed sample data.

    PlistBuddy man page:
    https://developer.apple.com/library...win/Reference/ManPages/man8/PlistBuddy.8.html
     
  3. Freez, Oct 8, 2014
    Last edited: Oct 8, 2014

    Freez macrumors newbie

    Joined:
    Feb 9, 2011
    #3
    use find/replace

    Use find/replace in TextWrangler.
    1. Copy the text you want removed (the XML tags around each value) into the Find field.
    2. Replace it with a tab between columns, and with a return after the last column.

    Note: in the Replace field, a tab is \t and a return is \n.

    3. Select all and copy.
    4. Paste into an Excel document.
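
    The find/replace steps above could also be scripted. A minimal sketch in Python, assuming every entry looks exactly like the sample dict in post #1 (the helper name dicts_to_tsv and the example.com URL are hypothetical). Regular expressions are fragile against real plist XML: entries without a title are skipped here, and XML entities such as &amp; are not unescaped.

```python
import re

# Matches one history entry laid out like the sample in post #1.
# Assumption: the keys always appear in this order.
ENTRY = re.compile(
    r"<key></key>\s*<string>(?P<url>.*?)</string>\s*"
    r"<key>title</key>\s*<string>(?P<title>.*?)</string>\s*"
    r"<key>lastVisitedDate</key>\s*<string>(?P<date>.*?)</string>"
    r".*?<key>visitCount</key>\s*<integer>(?P<count>\d+)</integer>",
    re.DOTALL,
)

def dicts_to_tsv(xml_text):
    """Return one tab-separated line per matched entry (hypothetical helper)."""
    rows = []
    for m in ENTRY.finditer(xml_text):
        rows.append("\t".join(m.group("url", "title", "date", "count")))
    return "\n".join(rows)

# Placeholder data modelled on the sample entry; the URL is made up.
SAMPLE = """<dict>
<key></key>
<string>http://www.example.com/</string>
<key>title</key>
<string>9ISyAl7.jpg 720×960 pixels</string>
<key>lastVisitedDate</key>
<string>433770848.7</string>
<key>D</key>
<array>
<integer>1</integer>
</array>
<key>visitCount</key>
<integer>1</integer>
</dict>"""

if __name__ == "__main__":
    print(dicts_to_tsv(SAMPLE))
```

    The tab-separated output can be pasted straight into a spreadsheet, the same as the result of the manual steps.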
     
  4. Waragainstsleep thread starter macrumors regular

    Joined:
    Oct 15, 2003
    Location:
    UK
    #4
    It's inside an array, inside a key, inside another dict. It's from a Safari browser history plist: ~/Library/Safari/History.plist.
    Mine has three keys; the middle one has the info I need in it. I'm sure yours will look very similar.

    I'll look into PlistBuddy and TextWrangler, thanks.
     
  5. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #5
    The thing is, PlistBuddy needs to know the specific key. If you don't know it, or can't provide it, then no one can tell PlistBuddy what to retrieve. "The middle one" isn't something PlistBuddy will understand.

    Which version of Safari?
     
  6. cqexbesd macrumors regular

    Joined:
    Jun 4, 2009
    #6
    If it's well-formed XML, you might also be able to use the xpath tool from the command line, or even an XSLT.
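
    For illustration, the same XPath-style idea can be sketched with Python's xml.etree.ElementTree, which supports a limited XPath subset. This assumes the file is an XML plist (a binary plist would need converting first, for example with plutil -convert xml1). The function names and the sample key "entries" are hypothetical.

```python
import xml.etree.ElementTree as ET

def entry_to_map(dict_elem):
    """Pair each <key> child with the value element that follows it."""
    children = list(dict_elem)
    return {k.text or "": v for k, v in zip(children[0::2], children[1::2])}

def field_text(fields, name):
    """Return the text of the named value element, or "" if it is absent."""
    elem = fields.get(name)
    return elem.text if elem is not None and elem.text else ""

def extract_rows(plist_xml):
    """Return (url, title, lastVisitedDate, visitCount) for each history dict."""
    root = ET.fromstring(plist_xml)
    rows = []
    # ".//dict" selects every <dict> at any depth below the root element.
    for d in root.findall(".//dict"):
        fields = entry_to_map(d)
        if "visitCount" not in fields:
            continue  # skip dicts that are not history entries
        rows.append(tuple(
            field_text(fields, name)
            for name in ("", "title", "lastVisitedDate", "visitCount")
        ))
    return rows

# Made-up sample: an outer dict holding an array of entries under a
# hypothetical key name.
SAMPLE = """<plist version="1.0"><dict>
<key>entries</key>
<array>
<dict>
<key></key><string>http://www.example.com/</string>
<key>title</key><string>Example</string>
<key>lastVisitedDate</key><string>433770848.7</string>
<key>visitCount</key><integer>1</integer>
</dict>
</array>
</dict></plist>"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print("\t".join(row))
```

    Because it selects dicts by the presence of visitCount rather than by path, this sketch does not need to know the name of the enclosing key.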
     
  7. Waragainstsleep thread starter macrumors regular

    Joined:
    Oct 15, 2003
    Location:
    UK
    #7
    It should be the current version or pretty close.

    I can delete the other keys easily enough using Xcode if you think that would help?
     
  8. briloronmacrumo, Oct 9, 2014
    Last edited: Oct 9, 2014

    briloronmacrumo macrumors 6502

    Joined:
    Jan 25, 2008
    Location:
    USA
    #8
    But it could be solved reasonably easily with standard Foundation (Objective-C) calls to retrieve standard plist data. In Foundation it is possible to retrieve all the data for a specific key and store it in an array. From there the data could be formatted as needed (maybe as a CSV file). Assuming a URL to the plist file is available:

    Code:
    NSArray *array = [[NSArray alloc] initWithContentsOfURL:url];
    This reads the entire plist into an array (it assumes the plist's root object is an array).

    Code:
    NSArray *values = [array valueForKey:@"visitCount"];
    This retrieves all the values for the key "visitCount".

    Possibly not the solution sought, but off-the-shelf tools are sometimes inflexible, while code you write yourself is not.
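
    For readers without Xcode to hand, the two calls above have a rough analogue in Python's standard plistlib module. This is only a sketch of the same idea, not the Foundation API: values_for_key is a hypothetical helper, and the sample data is made up and round-tripped through plistlib so that the root object is an array.

```python
import plistlib

def values_for_key(entries, key):
    """Collect entry[key] from every dict, loosely like -[NSArray valueForKey:].

    Entries that lack the key contribute None.
    """
    return [entry.get(key) for entry in entries]

# Made-up plist whose root object is an array of dicts.
SAMPLE_PLIST = plistlib.dumps([
    {"title": "First page", "visitCount": 1},
    {"title": "Second page", "visitCount": 3},
])

if __name__ == "__main__":
    entries = plistlib.loads(SAMPLE_PLIST)
    print(values_for_key(entries, "visitCount"))
```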
     
  9. subsonix macrumors 68040

    Joined:
    Feb 2, 2008
    #9
    The point is that the structure of this file may differ between versions of Safari. The entries you are interested in, with the keys title and lastVisitedDate plus the URL, should be possible to retrieve with PlistBuddy; however, the URL has an empty key. I'm not sure how to deal with that in PlistBuddy, that is, whether there are special metacharacters for it. It's possible to just print the entire dict and then use the usual suspects (grep, awk, sed, etc.) to get the URL out, but...

    An alternative is to parse the file with a library. You can do that in AppleScript, which does have support for reading and writing property lists afaik, or in the worst case use a regular XML parsing library for, say, Python.
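
    As a sketch of the library route in Python, the standard plistlib module can load the file and hand back ordinary dicts, so the empty URL key is not a problem. Several things here are assumptions rather than facts from this thread: the top-level key name WebHistoryDates, the reading of lastVisitedDate as seconds since 2001-01-01 UTC (the NSDate reference date), and the function names.

```python
import csv
import io
import plistlib
from datetime import datetime, timedelta, timezone

# Assumption: the entries sit in an array under this top-level key.
# The thread never names the key; change it to match your file.
ENTRIES_KEY = "WebHistoryDates"

# Assumption: lastVisitedDate counts seconds from 2001-01-01 UTC.
REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)

def history_rows(plist_bytes, entries_key=ENTRIES_KEY):
    """Return (url, title, last visited, visit count) for each entry."""
    root = plistlib.loads(plist_bytes)
    rows = []
    for entry in root[entries_key]:
        seconds = float(entry.get("lastVisitedDate", 0))
        visited = REFERENCE_DATE + timedelta(seconds=seconds)
        rows.append((
            entry.get("", ""),       # the URL is stored under an empty key
            entry.get("title", ""),  # treated as optional
            visited.strftime("%Y-%m-%d %H:%M:%S"),
            entry.get("visitCount", 0),
        ))
    return rows

def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["url", "title", "lastVisited", "visitCount"])
    writer.writerows(rows)
    return out.getvalue()
```

    To use it, read the history file in binary mode, pass the bytes to history_rows, and write the text from rows_to_csv to a .csv file that a spreadsheet can open. plistlib.loads accepts both XML and binary plists.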


    I noticed that this history plist has a version number, which you can get with:

    Code:
    /usr/libexec/PlistBuddy -c "Print :WebHistoryFileVersion" ~/Library/Safari/History.plist
     
