How do NSSets work?

Discussion in 'Mac Programming' started by moonman239, Jan 4, 2016.

  1. moonman239 macrumors 68000

    Joined:
    Mar 27, 2009
    #1
    Here's what I know: NSSets are unordered. Generally, you use them when you want to see if an object is a member of a group. You can filter the set using an NSPredicate. You can turn it into an array. If you call anyObject on a set, you'll get an object, but you can't know ahead of time which object you'll get.

    Which leads me to my questions: How are objects in NSSets stored? How are they filtered? How does containsObject work? And how does anyObject work?
     
  2. lee1210 macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #2
    This won't be that helpful, but underneath most Set implementations is normally a hash table or hash map, where the keys of the table represent the set, since they are unique.

    Great, so how does a hash table work? The keys need to support a hash method that tries to differentiate instances of a class by turning all identifying properties into a single integer.

    Fantastic, what does this have to do with storing unique elements? The hash table implementation will get the hash of an object, then do a calculation (smear) to decide what index in an array should be used for an object with that hash. Two objects may hash to the same value, so if there is a collision a test is done to see if the objects are equal if there's already an object at the index determined. If equal, then the value (the table maps keys to values) is updated in the same position in the value array. If the key at this index is not equal, this constitutes a collision, and a new index must be determined (sometimes the very next index, sometimes a resmear). The same process is used if that position is occupied.

    What this boils down to is that a Set's elements are stored in an array (that may be sparse, having unused elements), and which index is used for each element is determined by its hash code and the smearing algorithm.

    While things like "any element" seem a bit magical, this could be as simple as the first element in the array to an actual random position. Contains object will do a key lookup like the object is being stored as described above, but when the index is found (through repeated smearing if needed), ultimately YES or NO is returned based on the key being present in the table. Filtering would just be enumerating the Set and testing each element. Enumerating can be as simple as walking through the array finding non-nil elements.

    This isn't enough info to implement a hash-based set yourself, but hopefully it's a good enough sketch that you can imagine the mechanics.

    -Lee
     
  3. gnasher729, Jan 5, 2016
    Last edited: Jan 8, 2016

    gnasher729 macrumors P6

    gnasher729

    Joined:
    Nov 25, 2005
    #3
    It's just like NSDictionary, except that NSDictionary stores pairs of keys + values, whereas NSSet only stores the keys. Other than that, they are practically identical.

    Not at all. An NSDictionary must have unique keys and can have many identical values. An NSSet cannot contain duplicate items; that makes the items in an NSSet the same as the keys in an NSDictionary. If you take an implementation of NSDictionary, and leave out everything that has to do with storing values, what remains is an NSSet. If you didn't have NSSets and wanted one, you could take an NSDictionary, and add the items you want as keys with a value @0 for each key.

    BTW. Fast enumeration over a dictionary iterates over the _keys_, not the values.
     
  4. ChristianJapan macrumors 601

    ChristianJapan

    Joined:
    May 10, 2010
    Location:
    日本
    #4
    I would disagree to say: NSSet store the key; it's more like storing the value; to keep the analogy to NSDictionary. There is no need to use again a key to get an item from the set; just enumerate over it and get each item.
     
  5. lee1210 macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #5
    The point is that in an NSDictionary the keys must be hashable and are unique, while the values do not need to be hashable or unique. If you needed a quick Set implementation and already had a Map/Dictionary implementation, it would de trivial to use a Dictionary with dummy values, and use the keys as your Set.

    In a Set the elements must be hashable (well, for most) and unique just like a Dictionary's key set.

    -Lee
     

Share This Page