Question to the pros about deconstructing NSStrings

Discussion in 'Mac Programming' started by neptunet, Dec 29, 2011.

  1. neptunet, Dec 29, 2011
    Last edited: Dec 30, 2011

    neptunet macrumors newbie

    Sep 12, 2005

    I'm new to the Mac Programming forum but I've been reading MacRumors for years.

    My question is about how to take apart an NSString. I'm reading the Apple reference docs and there's some confusing stuff about NSRange, Scanners, blocks, and not to mention that all of the "substring" methods start with "range".

    So what I'd like to do is look at my NSString and count the number of non-numerical characters. Then, I'd like to look at the first character of my NSString, determine if it's a number or not, make note of that, and move on to the next and so on, to the end of the string.

    What NSObjects/methods/scanners/ranges should I be looking at to do that?

    Thank you guys :)
  2. lee1210 macrumors 68040


    Jan 10, 2005
    Dallas, TX
    characterAtIndex: is probably what you want.
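
    A minimal sketch of how characterAtIndex: could be used for the counting part of the question. The helper name CountNonDigits is made up for illustration, and it assumes "number" means anything in NSCharacterSet's decimalDigitCharacterSet:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): counts the UTF-16 units in
// `string` that are not decimal digits.
static NSUInteger CountNonDigits(NSString *string) {
    // Note: this set also contains non-ASCII digits such as Arabic-Indic ones.
    NSCharacterSet *digits = [NSCharacterSet decimalDigitCharacterSet];
    NSUInteger count = 0;
    for (NSUInteger i = 0; i < [string length]; i++) {
        unichar c = [string characterAtIndex:i];
        if (![digits characterIsMember:c]) {
            count++;  // a colon, letter, space, etc.
        }
    }
    return count;
}
```

    The same loop is also the place to "make note" of each character as it goes by, since c is tested one index at a time.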

  3. JoshDC macrumors regular

    Apr 8, 2009
    characterAtIndex: should be fine for your case, but Apple recommends (see WWDC 2011's "Advanced Text Processing" session) using the string enumeration method enumerateSubstringsInRange:options:usingBlock: with NSStringEnumerationByComposedCharacterSequences as one option. The main reason is that using characterAtIndex: requires extra effort to correctly handle composed character sequences, which I guess is what you mean by character.
  4. kainjow Moderator emeritus


    Jun 15, 2000
    If you need backwards compatibility, use substringWithRange: in a loop, with the range's length set to 1.
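
    A rough sketch of that loop, with a made-up helper name (SplitIntoUnits). An NSRange is just a location (start index) plus a length (how many units), so NSMakeRange(i, 1) means "one unit starting at index i":

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): returns an array of
// one-unit-long substrings, one per UTF-16 unit of `string`.
static NSArray *SplitIntoUnits(NSString *string) {
    NSMutableArray *pieces = [NSMutableArray array];
    for (NSUInteger i = 0; i < [string length]; i++) {
        // Location i, length 1: the single unit at index i.
        NSRange oneUnit = NSMakeRange(i, 1);
        [pieces addObject:[string substringWithRange:oneUnit]];
    }
    return pieces;
}
```

    Like characterAtIndex:, this walks UTF-16 units, so it can split a composed character sequence in two.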
  5. Sydde macrumors 68020


    Aug 17, 2009
    Seems to me that NSScanner would be the most efficient way to go. Just use -scanUpToCharactersFromSet:intoString: to find the first digit, then you could use one of the number-scanning methods (like -scanDecimal:) to capture a numeric value, or -scanCharactersFromSet:intoString: to collect the digits into a string. You could use -scanLocation with accumulating variables or an NSMutableIndexSet if you need to keep track of character counts. I have taken the attitude that the less you muck around directly with NSString contents, the better off you are.
  6. neptunet thread starter macrumors newbie

    Sep 12, 2005
    thanks so much for all the replies!

    Actually, what is a composed character sequence? In my case I want to check for colons in a string of digits. This may be a dumb question, but what exactly does that string enumeration method do?

    Ohh, a range of one. I'm still not sure exactly what that means, but would that be like using characterAtIndex:? What did you mean by backwards compatibility?

    Does scanUpToCharactersFromSet:intoString: mean it will read the characters sequentially (into another string?) up until it hits my colon?
  7. jiminaus macrumors 65816


    Dec 16, 2010
    An accented character, for example á, could be encoded in two different ways in Unicode. One way would be to encode the precomposed character 00E1 (Latin Small Letter A with acute). Another way would be to use the composed character sequence of 0061 (Latin Small Letter A) followed by 0301 (Combining Acute Accent). They are visually the same, but they are not equal.
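
    To make the two encodings concrete, here is a small sketch. The helper names are invented for illustration. It builds both forms of á from their code units and compares them after normalizing with NSString's precomposedStringWithCanonicalMapping:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the precomposed form, a single unit 00E1.
static NSString *PrecomposedAAcute(void) {
    unichar units[] = { 0x00E1 };
    return [NSString stringWithCharacters:units length:1];
}

// Hypothetical helper: the composed sequence, 0061 followed by 0301.
static NSString *DecomposedAAcute(void) {
    unichar units[] = { 0x0061, 0x0301 };
    return [NSString stringWithCharacters:units length:2];
}

// Compares two strings after converting both to precomposed (NFC) form,
// so canonically equivalent spellings count as the same.
static BOOL SameAfterNormalization(NSString *a, NSString *b) {
    return [[a precomposedStringWithCanonicalMapping]
            isEqualToString:[b precomposedStringWithCanonicalMapping]];
}
```

    The two strings have different lengths (1 versus 2), which is why index-based methods like characterAtIndex: see them differently even though they draw the same glyph.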
  8. chown33 macrumors 604

    Aug 9, 2009
    Sailing beyond the sunset
    Essentially, it's a single character, such as the letter 'A', along with all subsequent combining accents and other combining forms.

    Here's a bunch of Frequently Pasted Links, though some are doubtless obsolete:
    Minimum Knowledge of Charsets - Joel on Software
    Additional links:

    UTF-16 for Processing:
    Canonical Equivalence in Applications:
    UAX #15: Unicode Normalization:
  9. Sydde macrumors 68020


    Aug 17, 2009
    If you use the default numerics NSCharacterSet (decimalDigitCharacterSet) and -scanCharactersFromSet:intoString:, it will fill the string with the characters it finds in the set until it runs into a character that is not in the set. Then you can get to the next numeric digit with -scanUpToCharactersFromSet:intoString:, back and forth until you run out of string. Read the docs on those methods and on NSCharacterSet.
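
    The back-and-forth scanning described above might look roughly like this. DigitRuns is a made-up helper name, and clearing charactersToBeSkipped is a choice made for this sketch so that whitespace is not silently dropped:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): collects every run of
// decimal digits found in `string`, in order.
static NSArray *DigitRuns(NSString *string) {
    NSCharacterSet *digits = [NSCharacterSet decimalDigitCharacterSet];
    NSScanner *scanner = [NSScanner scannerWithString:string];
    // Scanners skip whitespace by default; turn that off for this sketch.
    [scanner setCharactersToBeSkipped:nil];

    NSMutableArray *runs = [NSMutableArray array];
    while (![scanner isAtEnd]) {
        NSString *run = nil;
        if ([scanner scanCharactersFromSet:digits intoString:&run]) {
            // Scanned one or more consecutive digits.
            [runs addObject:run];
        } else {
            // Not at a digit: advance to the next digit (or the end).
            [scanner scanUpToCharactersFromSet:digits intoString:NULL];
        }
    }
    return runs;
}
```

    For a string of digits separated by colons, the colons are skipped by the else branch and each group of digits comes back as its own string.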
  10. JoshDC macrumors regular

    Apr 8, 2009
    On re-reading your request and seeing Sydde's suggestion, I think that may be the way to go.

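    The method I suggested will go through each composed character sequence and perform the block, for example:

    [@"hello" enumerateSubstringsInRange:NSMakeRange(0, [@"hello" length])
        options:NSStringEnumerationByComposedCharacterSequences
        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
            NSLog(@"%@", substring); }];
    Will print each letter of "hello" (h, e, l, l, o) on its own line.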

    Then it'll be up to you to do something based on the value of substring. It's a little heavy-handed for a number of cases, and I think this is one of them.
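
    As a sketch of doing "something based on the value of substring", the block can simply count how many composed character sequences go by. CountComposedCharacters is an invented name for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): counts user-visible
// characters (composed character sequences) rather than UTF-16 units.
static NSUInteger CountComposedCharacters(NSString *string) {
    __block NSUInteger count = 0;  // __block so the block can modify it
    [string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        // Called once per composed character sequence.
        count++;
    }];
    return count;
}
```

    For plain ASCII such as a timecode this gives the same answer as -length; the two only differ when combining marks or similar sequences are present.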
  11. seepel macrumors 6502


    Dec 22, 2009
    I think in a case like this to give you the best solution it would help to know what the end goal is. As you can see from previous posts there are a few ways to do this that have certain strengths. So what do you want to end up with at the end? And what do you want to do with it? So far I would vote for NSScanner as your best bet.
  12. neptunet, Mar 1, 2012
    Last edited: Mar 1, 2012

    neptunet thread starter macrumors newbie

    Sep 12, 2005
    In reply to that last post,

    Thanks for asking about that. I'm trying to work with video timecode. As far as I know the only way to input timecode is into a string with digits and colons. Once I have the string, I want to see what the timecode is. So I need to read the digits out of the string and discard the colons. When I need to display the timecode, I'll put the colons back in.

    Reading over the suggestions so far, I realize that I don't know what a range or scanner is. Could someone be so kind as to explain? Especially the range. I've read the class references but I'm still clueless. Sounds like range is the actual memory range? Why would I ever want to know that just to figure out what kind of character it is? *head explodes*

    I noticed in C# (unrelated I know) there are some nice properties like


    Wow, how useful that would be right now! Is there anything that simple in Objective-C?

    Thanks again for the wonderful help. :)
  13. neptunet thread starter macrumors newbie

    Sep 12, 2005
    Oh, am I talking about "tokenizing"? My numbers are delimited by colons. So I just want to get the numbers. It's funny how a little jargon can help.
  14. Sydde macrumors 68020


    Aug 17, 2009
    Well, if you are working with timecodes, it might be easier (and faster in the code) to just get the raw timecode data and convert it mathemagically to usable numbers. According to Wikipedia, timecodes are stored in BCD, meaning each byte is two decimal digits, which you can convert to an int or whatever with a little simple math
    timeValueComponent = ( timeByte >> 4 ) * 10 + ( timeByte & 0x0F );
    though the frame number might require more conversion. QuickTime can provide you with this raw data in a QTTime record, not sure how it works in AV Foundation.
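
    Wrapped up as a function, the conversion above could be sketched like this. DecodeBCDByte is a made-up name, and the sketch assumes the byte holds nothing but two packed decimal digits; any flag bits stored alongside the digits would have to be masked off first:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): decodes one packed-BCD byte.
// High nibble = tens digit, low nibble = ones digit.
// Assumes both nibbles are in the range 0-9 and carry no flag bits.
static unsigned int DecodeBCDByte(uint8_t timeByte) {
    return (timeByte >> 4) * 10 + (timeByte & 0x0F);
}
```

    For example, the byte 0x59 has a high nibble of 5 and a low nibble of 9, so it decodes to 5 * 10 + 9 = 59.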
  15. hchung, Mar 2, 2012
    Last edited: Mar 2, 2012

    hchung macrumors 6502a

    Oct 2, 2008
    Try this....

    NSArray* timeElements = [timecodeString componentsSeparatedByString:@":"];

    This gets you an array of strings from your timecode.
    Then you'd get [timeElements objectAtIndex:0] for your hours, [timeElements objectAtIndex:1] for your minutes, and so on.
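
    Going one step further, here is a sketch that turns those strings into numbers. TimecodeFields is an invented helper, and it assumes a four-field timecode like 11:22:33:44 (hours, minutes, seconds, frames):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): splits a timecode string on
// colons and returns four NSNumbers, or nil if the field count is wrong.
// Assumption for this sketch: exactly four fields (HH:MM:SS:FF).
static NSArray *TimecodeFields(NSString *timecodeString) {
    NSArray *timeElements = [timecodeString componentsSeparatedByString:@":"];
    if ([timeElements count] != 4) {
        return nil;
    }
    NSMutableArray *fields = [NSMutableArray array];
    for (NSString *element in timeElements) {
        // integerValue returns 0 for text that does not start with a number,
        // so stricter validation would be needed for untrusted input.
        [fields addObject:[NSNumber numberWithInteger:[element integerValue]]];
    }
    return fields;
}
```

    To display the timecode again, the numbers can be formatted back with stringWithFormat: and a two-digit format such as %02ld for each field, with colons in between.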

    In the future, if you're not sure of how to ask how to do what you want to do, it helps if you describe the overall problem statement first because we might be able to provide easier ways to accommodate your task. :) "I have video timecode strings and want to break them up into pieces; they're delimited by colons like 11:22:33:44" versus "I want to read a string character by character, determine if number or punctuation, and then record that and move on" will get you very different results.
  16. PatrickCocoa, Mar 2, 2012
    Last edited: Mar 2, 2012

    PatrickCocoa macrumors 6502a

    Dec 2, 2008
    hchung is right

    In Cocoa, there is almost always a method that does what you want. Your programming technique needs to change from:

    "what sequence of atomic level operations can I string together to create a procedure that does what I want"

    to:

    "where has Steve hidden the secret method in Cocoa that does what I want".
