Coredata entity calculation approach

Discussion in 'iOS Programming' started by Sam77, Apr 25, 2012.

  1. Sam77 macrumors newbie

    Joined:
    Aug 17, 2010
    #1
    I've learned a lot here from posting questions about Core Data, and a lot of things have cleared up. Since my object model has some uncertainties, I run into a lot of challenges and end up questioning the whole approach.

    Oh well, here's the problem. I'm going to write a little more detail than required to explain this.

    I have 2 entities:

    Code:
    Parent Object <-->> Entry object. Parent has a to-many relationship with Entry Object.
    Parent 1:
     Name: Height
     Tag: Ht
     
    Parent 2:
     Name: Weight
     Tag: Wt
    
    Parent 3:
     Name: Body Mass Index
     Tag: BMI
    
    
    Entry Object:{
      NSNumber: Value
      NSDate: timeStamp
      ForParent: -->Parent Object
    }

    These objects are listed in a tableView, and there's an "Add" button in the navigationItem.

    The user selects a parent in the list and taps "Add"; a modal view appears, the user inputs the value, and an Entry object is created for that Parent.
    For example, I select Height, then tap "Add" and enter the height, and the app records the date added. The date can be changed manually.

    Same procedure for "Weight" Parent.

    Now, if BMI is selected, the modal view shows two text fields, one for height and one for weight. After the values are entered, an NSNumber is returned by calculating BMI = Wt/(Ht*Ht). The difference here is that NO Entry object is created with this value; the number is simply returned.

    Up until now, things work just fine and as desired.

    If I fetch Entries (object) of each Parent Object, I'd get:

    Code:
    Parent-Height = {  Entry(178cm, 10thApril),
                              Entry(187cm, 24thApril)
                           }
    
    Parent-Weight = { Entry(167lbs, 24thApril)
                            }
    
    Parent-BMI = { NO Entry Objects } Instead, it pulls in the Entry objects of Height and Weight, parses them, and presents the array at runtime. The way I do that is to create a dictionary whose keys are the timeStamps (dates of entry) and whose objects are the entries taken at that timeStamp.

    Code:
    The dictionary is { 10thApril { 
                                               Height=1.78
                                            }
                              24thApril {
                                               Height=1.87
                                               Weight=167lbs
                                           }
                            }
    After parsing, the only NSNumber returned is for the 24thApril, as both values are present.
    Code:
    So the resultant Array becomes: {  167/(1.87*1.87), 24thApril }
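
    Note that the expression above divides pounds by meters squared, while the usual BMI definition is kilograms divided by meters squared. A minimal sketch of the conversion follows, assuming the app stores weight in pounds and height in meters; the helper names are hypothetical and not from the app.

```objective-c
#import <Foundation/Foundation.h>
#include <math.h>

// Hypothetical helpers for illustration; not part of the original app.
// One avoirdupois pound is defined as exactly 0.45359237 kg.
static const double kKilogramsPerPound = 0.45359237;

static double KilogramsFromPounds(double pounds) {
    return pounds * kKilogramsPerPound;
}

// BMI = weight (kg) / height (m)^2.
// Returns NAN when the height is not positive, so callers can detect bad input.
static double BMIFromKilogramsAndMeters(double kilograms, double meters) {
    if (meters <= 0.0) {
        return NAN;
    }
    return kilograms / (meters * meters);
}
```

    With the values from the example, BMIFromKilogramsAndMeters(KilogramsFromPounds(167), 1.87) comes out at about 21.7, whereas the unconverted 167/(1.87*1.87) gives about 47.8.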
    
    I wanted to develop it so that even if ONE parent of BMI is missing (in this case, Weight was missing on 10thApril), I'd pull the Weight entry from another date and calculate the BMI.
    The idea is to step back in time, grab the nearest Weight entry, and use that.

    Why am I doing this, you ask? Why not just say, hey, no weight entered on the 10th, hence no BMI? Well, that's because:
    1- Some leniency can be granted in health matters; BMI doesn't change suddenly.
    2- I wouldn't want to show "N/A"; it wouldn't be fair to expect the user to always enter both values.


    This would not apply in every case, but I'm concerned about BMI only.

    Any advice on structuring the model, or anything that makes this more efficient, would be great.
    I'd really like advice on how to show BMI in the desired way.

    I'm fetching all entries for BMI by creating an NSPredicate and an NSSortDescriptor.

    Code:
    Sorting is via the timeStamp.
    Predicate =  "self.ForParent IN parentArray", [NSArray of HeightParent, WeightParent].
    And so I'd get the entries sorted like this:
    Code:
    Entries: { 
                  EntryWeight - 24April,  ***
                  EntryHeight - 24April,   ---
                  EntryHeight - 10April,   ---
                  EntryWeight - 1stApril,  *** 
                  EntryHeight -  1stApril,  ---
                  EntryWeight - 20March, ***
                 }
    There is an equal number of height and weight entries, BUT some dates may have single entries, and since I sort by date first into a dict, it comes out like this:

    Code:
           Key-24April:      [Array= Wt, Ht]
               key 10APRIL:     [Array= Ht      ]
               Key 1stApril:      [Array = Wt, Ht]
               Key 20March:     [Array= Wt     ]
    
    When I build and parse the BMI, for example when I get the 10April object, I have ONLY Ht in it; I would like to get the next/nearest Weight object so that a non-nil value is returned.

    I've taken a lot of space to write in detail, and granted, most of you got the point already, but this is just so critical to the app I'm working on...
     
  2. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #2
    Instead of allowing dates with missing height or weight, create each new timestamp by copying both height and weight from the most recent entry. You now have a new entry with the same height and weight as the prior entry.

    Now, if the program doesn't change the height, it's the same as the prior entry. Likewise with the weight.

    I don't know about you, but my height hasn't changed for a few decades. My weight changes more often. So copying the height and leaving it unchanged seems like an obvious thing to do, from a user's viewpoint.


    If you need to keep track of which values were copied and which ones were entered on each date, keep a boolean variable for each; example: a copiedHeight and copiedWeight boolean. Both are true when you create a new entry (because height and weight are both copied from prior entry). Then if the weight is set to a new value, clear copiedWeight. Same for copiedHeight.

    You'd only do this if you needed to know when values are copied vs. when new values are entered. If you don't need to know this, don't do it.
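
    A minimal sketch of the copy-forward idea with the two flags, assuming a plain NSObject subclass rather than a Core Data entity. The class name BodyEntry and its method names are hypothetical.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical class for illustration; names are not from the original app.
@interface BodyEntry : NSObject
@property (nonatomic, strong) NSDate *timestamp;
@property (nonatomic) double weightInKilograms;
@property (nonatomic) double heightInMeters;
@property (nonatomic) BOOL copiedWeight;   // YES while the weight is inherited
@property (nonatomic) BOOL copiedHeight;   // YES while the height is inherited
+ (instancetype)entryCopiedFrom:(BodyEntry *)previous timestamp:(NSDate *)timestamp;
- (void)enterWeight:(double)kilograms;
- (void)enterHeight:(double)meters;
@end

@implementation BodyEntry

// A new entry starts with both values copied from the most recent entry.
+ (instancetype)entryCopiedFrom:(BodyEntry *)previous timestamp:(NSDate *)timestamp {
    BodyEntry *entry = [[self alloc] init];
    entry.timestamp = timestamp;
    entry.weightInKilograms = previous.weightInKilograms;
    entry.heightInMeters = previous.heightInMeters;
    entry.copiedWeight = YES;
    entry.copiedHeight = YES;
    return entry;
}

// Entering a value clears the matching flag; the other value stays as copied.
- (void)enterWeight:(double)kilograms {
    self.weightInKilograms = kilograms;
    self.copiedWeight = NO;
}

- (void)enterHeight:(double)meters {
    self.heightInMeters = meters;
    self.copiedHeight = NO;
}

@end
```

    Because each entry carries its own copies, editing or deleting one entry leaves every other entry untouched.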
     
  3. Sam77 thread starter macrumors newbie

    Joined:
    Aug 17, 2010
    #3
    Hey thanks for the reply,

    You've got valid points, and my height hasn't changed for about a decade, I think :). For that case, I have a variable in the Parent that says whether it's fixed or not.
    The reason being:
    - If the user is a kid, or someone's checking their kid's BMI, height is actually a good index in a growth chart.

    - When the height is static, I don't even check Entry; I just grab it from elsewhere, in the Parent object. There's no point in creating entries for something that's static and going to remain unchanged for good. The benefit of not creating new entries for statics is simply fewer results when fetching and less data to save (even if it's only bytes, it should be beneficial in the long term).

    Code:
    Instead of allowing dates with missing height or weight, create each new timestamp by copying both height and weight from the most recent entry. You now have a new entry with the same height and weight as the prior entry.
    I didn't quite get what you're saying. All dates that have either height or weight or both are already stored as keys in a dict (as described in my post above), and each object is an array of those entries.
    When you say create a new timeStamp, since that is attached to an Entry, I'm assuming you're saying create a new EntryObject.
    The whole point was to get the missing height and weight and 'calculate' the BMI. I prefer not to store an entry for BMI itself; it's parsed at runtime. I see you asking why?
    • Potentially I can calculate it and add an entry object, but in case a user decides to change the weight for that specific date (an ability I chose to give), I'd have to recalculate the BMI for that Entry of that date.
    • If I simply add Height (assuming I'm a kid) and don't add Weight (assuming it's not changing much; don't hammer me for this, I'm also working out unlikely scenarios in my stupid unrealistic quest for app smoothness glory), would I have to add entries in retrospect for BMIs to fill in the gap, because weights and heights are added on different dates?
    • Why store these values when they are essentially equations in nature? I'm sacrificing processing power here, probably slowing things down, but stored values may pile up if you think about it, and I wanted to keep it simple and calculate it at runtime.

    I'll expand graphically for a little more understanding. BMI is a charting equation, and for each date in the graph, I plot a BMI dot. I present [dictionary allKeys] as the X-axis, and each key should have height and weight objects, from which I calculate and present the BMI.
    This is the problem: I need to fill in the blanks where ONE (height or weight) might be missing, in a good way.

    Any more ideas? Or am I totally insane?
    EDIT:

    Here's how I create the dictionary. How should I fill in those blanks?


    Code:
     + (NSDictionary*)sortFetchedArrayOfEntries:(NSArray *)EntriesArray{
        // Group entries by their exact timeStamp; each key maps to an array of entries.
        NSMutableDictionary *dateSortedDict = [NSMutableDictionary new];
        for(EntryObject *EntryObj in EntriesArray){
            NSDate  *entryDate = EntryObj.timeStamp;
            NSMutableArray *array = [dateSortedDict objectForKey:entryDate];
            if(!array){
                // First entry for this timeStamp: create the array and store it once.
                array = [NSMutableArray new];
                [dateSortedDict setObject:array forKey:entryDate];
            }
            [array addObject:EntryObj];
        }
        return [NSDictionary dictionaryWithDictionary:dateSortedDict];
    }
     
  4. chown33, Apr 25, 2012
    Last edited: Apr 25, 2012

    chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #4
    Yes, I botched my terminology: I wrote timestamp when I meant entry.


    I don't understand that at all. You don't have to add any entries in retrospect. The values added when the entry is created aren't changed. That's the value of replicating the data. If you change one entry later, it only affects that entry, not all subsequent entries that might be implicitly relying on the default weight or height.


    I already understood why you weren't storing BMI: it can always be calculated from two values.

    I don't see the big problem with storing multiple values. I don't think the scale of the data is large enough to worry about it. Even an empty value in an entry has a cost in size, so unless you've analyzed this down to the exact number of bytes, I think you're doing Premature Optimization.


    I don't understand the "parent" relationships. Why? What does it do for you?

    You have a series of tuples: date, weight, height. You can select by date, weight, or height. What value does a "parent" have for any of these? Maybe I'm thinking of it too much as simple SQL, but I don't see any obvious entity-relationship diagram from this that includes a "parent" relationship.


    Am I missing something about Core Data? Since when does an NSDictionary (mutable or not) have an "order" to its keys or values? I.e. how is it possible to sort a dictionary, since the keys have no defined order?

    Yes, you can put the keys or values into an array and sort the array, but your method is this:
    Code:
     + (NSDictionary*)sortFetchedArrayOfEntries:(NSArray *)EntriesArray
    
    and clearly shows a returned NSDictionary.

    Maybe the problem is you're using the wrong kind of collection. If you use an NSArray, and it was sorted by timestamp, then it would be trivial to take the index of any entry and find the nearest predecessor that has either a weight or a height.
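
    A minimal sketch of that sorted-array approach: walk the entries oldest first, remember the latest height and weight seen so far, and emit a BMI point for every timestamp once both are known. The Reading class is a hypothetical stand-in for the thread's Entry objects, and values are assumed to be in meters and kilograms.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the Entry entity; not the app's actual class.
typedef NS_ENUM(NSInteger, ReadingKind) { ReadingKindHeight, ReadingKindWeight };

@interface Reading : NSObject
@property (nonatomic) ReadingKind kind;
@property (nonatomic) double value;               // meters or kilograms (assumed)
@property (nonatomic, strong) NSDate *timeStamp;
+ (instancetype)readingWithKind:(ReadingKind)kind value:(double)value timeStamp:(NSDate *)timeStamp;
@end

@implementation Reading
+ (instancetype)readingWithKind:(ReadingKind)kind value:(double)value timeStamp:(NSDate *)timeStamp {
    Reading *reading = [[self alloc] init];
    reading.kind = kind;
    reading.value = value;
    reading.timeStamp = timeStamp;
    return reading;
}
@end

// Returns an array of @{ @"date": NSDate, @"bmi": NSNumber }, oldest first.
// A timestamp with only one kind of reading reuses the nearest earlier value
// of the other kind; timestamps before both kinds exist produce no point.
static NSArray *BMIPointsFromReadings(NSArray *readings) {
    NSSortDescriptor *byDate = [NSSortDescriptor sortDescriptorWithKey:@"timeStamp" ascending:YES];
    NSArray *sorted = [readings sortedArrayUsingDescriptors:@[byDate]];
    NSMutableArray *points = [NSMutableArray array];
    NSNumber *lastHeight = nil;
    NSNumber *lastWeight = nil;
    NSUInteger i = 0;
    while (i < sorted.count) {
        Reading *first = sorted[i];
        NSDate *date = first.timeStamp;
        // Absorb every reading that shares this exact timestamp.
        while (i < sorted.count) {
            Reading *reading = sorted[i];
            if (![reading.timeStamp isEqualToDate:date]) break;
            if (reading.kind == ReadingKindHeight) lastHeight = @(reading.value);
            else lastWeight = @(reading.value);
            i++;
        }
        double h = lastHeight.doubleValue;
        if (lastHeight && lastWeight && h > 0.0) {
            [points addObject:@{ @"date": date, @"bmi": @(lastWeight.doubleValue / (h * h)) }];
        }
    }
    return points;
}
```

    The look-back never needs a nested search, because the "most recent" values are carried along as the loop advances.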

    I think it would also be pretty easy to select a set of tuples (entries) that included at least one non-null weight and height, which would guarantee you'd be able to find a predecessor with weight and height.

    If that's too complicated for general use, then wrap the raw Core Data model inside a method that does that automatically. You don't have to expose the Core Data model directly, especially if one of the things you're doing is filling in missing data that isn't actually stored in the model.


    You should make the simplest data model that works. If that means duplicating data, then do that. Then write the rest of your program using the simple data model. When that all works, you can go back and optimize the data model, possibly wrapping the more complex data model inside methods that still provide the same simple model, but are smarter about removing duplicate data. Or if it turns out the simple data model is acceptable in size and speed, you don't have to change anything.

    Instead, you're spending a lot of time and effort making a complex data model, one that allows missing values, and then you're exposing all that complexity of working with possibly missing values in the rest of the program. Now you can't simplify your data model, because everything else assumes the complicated model. Abstraction and encapsulation are your friends.
     
  5. Sam77 thread starter macrumors newbie

    Joined:
    Aug 17, 2010
    #5
    Code:
    I don't see the big problem with storing multiple values. I don't think the scale of the data is large enough to worry about it. Even an empty value in an entry has a cost in size, so unless you've analyze this down to the exact number of bytes, I think you're doing Premature Optimization.
    I should read through that link. I've been playing around with Core Data a lot over the last couple of months.
    Anyway, you've mentioned the "empty value" of an entry. Yes, it has a cost, and I don't even create that for BMI. BMI exists only as a Parent object; if I were to "fetch its entries", so to speak, I'm actually fetching entries for the Height and Weight Parent objects using a predicate. I have something elsewhere in the app that tells me that BMI needs the Height and Weight objects and does not itself have any entries (though it potentially could), but for the reasons I've outlined in the above posts, I chose to calculate it and simply return it as an NSNumber.


    Code:
    Entry Entity{
       NSNumber value;
       NSDate timestamp;
       forParent  (relationship with Parent e.g., height/weight)
    }

    Potentially, when writing height and weight, what I do is create a new Entry entity, add the value (the timestamp is recorded), and then add it to the parent in question, height or weight, whatever:
    entry.forParent = ParentHeightObject;
    One entity for all kinds of entries.

    For fetching the BMI,

    [NSPredicate predicateWithFormat:@"forParent IN %@", [NSArray of HeightObject, WeightObject]];
    Parent <--- to-many--->> Entry. Each entry.forParent would point to either height or weight in the above case; at least that's how I'm working this.


    My apologies here. I don't conform to any naming conventions and end up being confused. I just created that function to sort the data for charting, in this case BMI.

    Again,
    Code:
     + (NSDictionary*)sortFetchedArrayOfEntries:(NSArray *)EntriesArray
    
    What I'm actually doing is:

    Fetching Entry objects for BMI from the store; as I mentioned above, that fetches Entry objects for both Height and Weight, and in the fetchRequest I use timeStamp as the sort descriptor.
    Now I get an array of Entry objects sorted by timestamp.

    I use the above method to create a dict.
    NSDictionary *dict=[self sortFetchedArrayOfEntries:[self fetchedEntries]];
    //fetchedEntries method gets the array from the fetch request. I get it in the form of
    Entry 1.4, time=11April //entry.forParent = height
    Entry 69 time=11April //entry.forParent = weight
    Entry 1.2, time=4April //entry.forParent = height
    Entry 1.1, time=9March //entry.forParent = height

    Using that method, I'm attempting to create a dictionary that grabs all the dates from all entries and adds them as keys, with all the Entry objects for a date in an array as that key's object.

    There are times when a user enters Height and Weight both in a single form, which creates two entry objects, one for each, with the exact same timestamp. Then I'd know that both entries are available and there's no need to look for a missing one.
    This is the dictionary I make:


    Code:
         Key-24April:      [Array= Wt, Ht]
               key 10APRIL:     [Array= Ht      ]
               Key 1stApril:      [Array = Wt, Ht]
               Key 20March:     [Array= Wt     ]
    [dict allKeys] is the array for my x-axis; in a way this is my definitive way of saying that a user has entered some kind of data on this date, and I need to show it.

    Day1 = [[dict allKeys] objectAtIndex:0];

    For that day I grab the entries:
    [Array: EntryHeight(1.4), EntryWeight(69)] = [dict objectForKey:Day1];

    Since I have both, I can go ahead and calculate the BMI.

    There will be days when there's only Height and no Weight. In that case, what should be my best move forward? Assuming both are changing (growing), I just want to grab the nearest PAST Weight to fill in the blank.

    I think it would also be pretty easy to select a set of tuples (entries) that included at least one non-null weight and height, which would guarantee you'd be able to find a predecessor with weight and height.

    I hope it's clearer now..

    I understand little of what you're saying :eek:. What should that be?

    And a little rephrase of my question: where in this whole process should I
    ---> INTERRUPT
    ---> Check for the missing field
    ---> Identify that it's Weight
    ---> Grab the past Weight entry
    ---> Fill it in for the missing one
    ---> CONTINUE with the rest.

    The thought of looking back while a for(); loop is running is making me nauseous.


    Really appreciate your help, man..
     
  6. Sam77 thread starter macrumors newbie

    Joined:
    Aug 17, 2010
    #6
    Oh man, I was posting a reply and missed this part.

    You're probably right.. this effort was an attempt to simplify future uses of this model. And maybe I'm making this whole thing a lot more complex.

    In which case, let's say:

    A user enters ht/wt in the text fields. I store them as 3 entries: one for height, one for weight, and one for BMI (calculated value).

    Action: Now I go back and edit the Weight entry value for that date.
    Desired Result: I would like the change to also be reflected in the BMI for that date.

    Action: I delete the Weight entry completely. Should I be leaving the BMI alone, as it already stores the value calculated from the earlier weight?
    Desired Result: Recalculate the BMI based on the value from an earlier date.
    How would I do that?

    Contrast this with what I was doing: the calculation of BMI was not based on stored values at all; it took whatever fetched values were at hand and showed the output. (At least that's what I thought)...


    Another reason I forgot to mention was the potential use of iCloud. I thought I'd be reducing at least 0.1% of the dependency by not adding calculated entries, which I could otherwise just compute at runtime. Since all Core Data changes are synced as transaction logs, and the app would absolutely need to show BMI, I thought: why sync a BMI entry log when height and weight are already being propagated?

    I guess those were some reasons, but I'd take your advice seriously if you still recommend the safe old way.
     
  7. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #7
    Why store BMI? You always calculate it.

    What kind of object would you use to represent one entry? In short, design the class, then design the data storage for it.

    I don't know how you'd do it, but my class would have something like:
    Code:
    @property NSDate * timestamp;
    @property float weight;
    @property float height;
    @property (readonly) float bmi;
    
    This is just a rough outline, but it shows the important relationships. BMI is purely a dependent value of weight and height, so it's purely readonly. Also, weight and height should be specified with units, so weightInKilograms or heightInMeters or whatever.

    A readonly dependent property doesn't have to be stored, because it can always be calculated from other properties of the object. It's like the area of a rectangle: knowing the width and height, area can always be calculated. You don't need to store area.

    In the implementation:
    Code:
    -(float) bmi 
    {  float h = self.height;  return self.weight/(h*h);  }
    

    If bmi is readonly and calculated, then changing weight automatically alters what bmi will return (see above). You don't have to do anything for the bmi to change when the weight changes.
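
    A fuller sketch of that outline as a compilable class, assuming kilograms and meters as the units. The class name HealthEntry is hypothetical, and returning 0 for a non-positive height is just one possible choice for "unknown". The keyPathsForValuesAffectingBmi method is the standard KVO hook for declaring a dependent key, so observers of bmi are notified when either input changes.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical class name; the property layout follows the outline above.
@interface HealthEntry : NSObject
@property (nonatomic, strong) NSDate *timestamp;
@property (nonatomic) float weightInKilograms;
@property (nonatomic) float heightInMeters;
@property (nonatomic, readonly) float bmi;   // derived, never stored
@end

@implementation HealthEntry

// KVO: bmi depends on weight and height, so a change to either counts as a
// change to bmi for anyone observing it.
+ (NSSet *)keyPathsForValuesAffectingBmi {
    return [NSSet setWithObjects:@"weightInKilograms", @"heightInMeters", nil];
}

// Computed on every access; there is no backing instance variable.
- (float)bmi {
    float h = self.heightInMeters;
    if (h <= 0.0f) {
        return 0.0f;   // assumption: treat a missing height as "no BMI"
    }
    return self.weightInKilograms / (h * h);
}

@end
```

    Setting weightInKilograms to 64 and heightInMeters to 1.6 gives a bmi of 25; changing the weight afterwards changes what bmi returns with no extra bookkeeping.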


    The underlined part makes no sense to me. What does it mean to "delete the weight entry"? Are you saying you weighed zero kilograms (or pounds) on that date? Why? Exactly what is deleting weight supposed to mean?

    And I don't see a "weight entry" in my class outline. There's one entry (object) that has 3 specified values (timestamp, height, weight) and one dependent value (bmi). The whole thing, a single object, makes one entry.

    If you change one of the values in an entry, then the meaning of that entry changes. Change the timestamp and you're saying "My weight and height were thus on the date D". Change the weight and you're saying "I weighed W on that date". Change the height and you're saying "I was really H cm tall on that date". Change weight or height and BMI automatically changes, because it's never stored, only calculated.

    None of the individual values can be deleted, though two of them can be zeroed. But what does it mean to zero a value? Well, plug in a zero to what you're saying: "I weighed zero kg on that date" or "I was zero cm tall on that date". That seems nonsensical to me, but maybe you have some reason for allowing it. If so, please explain that reason.

    Deleting means removing the entire entry, the single object representing a height and weight on a specific date. There's no such thing as deleting a weight or height, AFAICT.

    If you have some other meaning for "deleting a weight", then explain it in terms of what happens to the simple object model when a weight is deleted. It has an effect, or you intend it to, so describe exactly what effect it has.

    You should be able to do this with every operation on the data: be able to explain the effect exactly. Not only the effect, but the non-effects as well. Example: deleting an entire entry has no effect on any earlier or later entries. It simply removes one recorded date,weight,height datum.


    I never suggested storing calculated values (bmi). I suggested copying the weight and height.

    I still suggest making the simplest data model that works. See the class properties I outlined. That seems like the simplest data model to me. One simple feature is no entry (object) depends on any other entry. Each entry is completely independent from changes made to any other entry.

    You can write the entire program using that data model, i.e. that class, if you write an actual implementation for it. It has timestamps, height, weight, and bmi. BMI is modeled as a dependent value of height and weight. A single entry consists of the tuple: timestamp, height, weight, with bmi calculated. You can zero any value, but that doesn't delete the entry (tuple) from the set of stored data.

    If you want to delete an entry, you delete the entire entry as a whole. Doing so has no effect on earlier or later entries, because when each entry is created, it starts with default height & weight values copied from the most recent entry, i.e. the entry with the latest timestamp.

    After you write the program using the simple data model (i.e. the simple class, the simple object model), you can evaluate revising the storage model that underlies the simple object model.

    If height changes less frequently, maybe keep tuples of timestamp,height. Correlate that with tuples of timestamp,weight. Now given any timestamp (even one for a date not stored) you can find the height and weight for that date. The correct height is the one whose timestamp is equal to or less than the date of interest. Likewise, the correct weight is the one whose timestamp is equal to or less than the date of interest. Get those from the stored data (i.e. the storage model). Put the date, height, and weight into an object instance, and boom there's your simple data model. The data model no longer has 1-to-1 correspondence to the storage model, but so what? The simple data model is still what prevails in all the other parts of the code.
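
    A minimal sketch of that lookup over two separate series, assuming each sample is a dictionary with hypothetical keys timeStamp (NSDate) and value (NSNumber, in meters or kilograms). It runs on plain in-memory arrays; the function names are illustrative only.

```objective-c
#import <Foundation/Foundation.h>

// Each sample is assumed to be @{ @"timeStamp": NSDate, @"value": NSNumber }.
// Returns the value whose timestamp is the latest one that is equal to or
// earlier than `date`, or nil when no such sample exists.
static NSNumber *LatestValueOnOrBefore(NSArray *samples, NSDate *date) {
    NSPredicate *notLater = [NSPredicate predicateWithFormat:@"timeStamp <= %@", date];
    NSSortDescriptor *byDate = [NSSortDescriptor sortDescriptorWithKey:@"timeStamp" ascending:YES];
    NSArray *candidates = [[samples filteredArrayUsingPredicate:notLater]
                           sortedArrayUsingDescriptors:@[byDate]];
    NSDictionary *latest = [candidates lastObject];
    return [latest objectForKey:@"value"];
}

// Combines the two series into a BMI for any date, stored or not.
// Returns nil when either value is unavailable on or before that date.
static NSNumber *BMIOnDate(NSArray *heights, NSArray *weights, NSDate *date) {
    NSNumber *height = LatestValueOnOrBefore(heights, date);
    NSNumber *weight = LatestValueOnOrBefore(weights, date);
    if (!height || !weight || height.doubleValue <= 0.0) {
        return nil;
    }
    double h = height.doubleValue;
    return @(weight.doubleValue / (h * h));
}
```

    Against a Core Data store, the same predicate could be set on an NSFetchRequest with a descending sort on the timestamp and a fetchLimit of 1, so only the single matching object is loaded; that variant is not shown here because it needs a configured managed object context.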

    The only thing that changed is what's hidden inside the implementation of the simple data model. And again, only do that after the whole program is working, and after you've collected profiles and measurements that show a clear need for a different storage model. Personally, I don't think it matters.

    My weight and height don't change that often in one day, so I'm not going to be updating the data hundreds or even ten times a day. I doubt that it would transfer more than a couple hundred bytes per day, even with the simple data storage model. Is a couple hundred bytes really a big deal? Icons are bigger than that. Two SMS messages are bigger than that.


    That's a longish reply, but I think you're being grossly misled by a complex data model with no clear purpose. It's not clear to me you've ever thought about it as an object model, i.e. as plain simple properties defined in a class, and how that would be implemented, and the relationships between objects, or between collections of objects.

    Core Data is supposed to help you manage and store your object graph. It's not supposed to lead you to create unnecessary complexity in order to fit with some preconceived notion of optimized transaction logs for iCloud storage.

    Here's the first sentence of "Introduction to Core Data Programming Guide"
    The Core Data framework provides generalized and automated solutions to common tasks associated with object life-cycle and object graph management, including persistence.
    If you don't have a clear outline of what the objects are, then you should define them first.
     
  8. Sam77 thread starter macrumors newbie

    Joined:
    Aug 17, 2010
    #8
    Hey thanks aloooooooot for your detailed answer :)

    Really put things into perspective. I wish there were some reward points here like Stack Overflow; I always find this forum very helpful!


    cheers :)
     
