
Rhalliwell1

macrumors 6502a
Original poster
May 22, 2008
I'm doing some research into an iPhone application idea I have had. The idea has two parts: the app itself and an online back end. The back end will log to some form of database. The data structure will be straightforward, but the iPhone app could potentially update and log to the back end multiple times per minute. Now, let's speculate that the app becomes successful and I have 1,000,000 iPhones updating twice per minute. If each log entry is 512 bytes, that's 512 × 2 × 1,000,000 = 1,024,000,000 bytes per minute, roughly 1 GB/min, which comes to 1,474,560,000,000 bytes, or about 1.5 TB, per day. So a lot of data being produced.

So my question is: how well does Core Data cope with large quantities of data?

I have read some of the developer docs on it, and from what I can gather the data is stored in memory until you tell it to save to disk:

Saving Changes
All changes to the objects managed by Core Data happen in memory and are transient until they are committed to disk. To commit changes to the data model to disk, simply send a save: message to the managed object context. This behavior preserves the traditional document semantics that users expect in document-based applications.

What is the intended use of this save: method? Should you call it every time the application closes? Every time you have x amount of objects in memory and feel it's time to free some up?
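
For reference, the commit itself is just that save: message on the context. A minimal sketch of what that looks like (modern Swift shown for brevity; it assumes `context` comes from an already-configured Core Data stack):

```swift
import CoreData

// Rough sketch: flush pending in-memory changes to the persistent store.
// `context` is assumed to be an NSManagedObjectContext from an existing stack.
func saveIfNeeded(_ context: NSManagedObjectContext) {
    guard context.hasChanges else { return }  // skip needless disk I/O
    do {
        try context.save()  // commits transient changes to disk
    } catch {
        print("Core Data save failed: \(error)")
    }
}
```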
 
What are you using for the back end? Is it a Mac?

There really isn't an issue in terms of the quantity of data. Your host's internet connection bandwidth might be a concern, but you shouldn't have an issue otherwise.

Not sure what Core Data does for you; someone else would have to answer that.

Are you writing objects? Does the iPhone app need real-time access to the DB, or is the DB used for statistics?

You can cache data on the back end and put it into a queue, then have a separate thread read the queue and write it to disk or the database.

If you want to write directly to a DB, you need to batch the rows (multiple rows per commit). You would probably have to do some work to get the DB up to speed.
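
Something along these lines, combining the queue idea above with batched commits (a rough sketch; `writeBatchToDatabase` is a hypothetical stand-in for whatever DB layer you end up using):

```swift
import Foundation

// Sketch of the cache-and-batch idea: incoming requests append records to
// an in-memory buffer; a background flush drains the buffer and writes the
// whole batch in one commit. `writeBatchToDatabase` is a hypothetical hook.
final class LogBuffer {
    private var pending: [String] = []
    private let lock = NSLock()

    func append(_ record: String) {
        lock.lock()
        pending.append(record)
        lock.unlock()
    }

    // Call this periodically from a separate thread or timer.
    func flush() {
        lock.lock()
        let batch = pending
        pending.removeAll()
        lock.unlock()
        guard !batch.isEmpty else { return }
        writeBatchToDatabase(batch)  // one transaction for many rows
    }

    private func writeBatchToDatabase(_ batch: [String]) {
        // e.g. BEGIN; INSERT ...; INSERT ...; COMMIT;
    }
}
```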

If you're not depending on live access, one strategy I've seen is to write to a flat file in CSV format and cut it off at, say, 50 MB. Each time the limit is reached, create a new file with an incrementing number in the filename, e.g. myfile0001.csv, myfile0002.csv, etc. (see the sketch below). Then just create some scripts that check for the next file in sequence and load it using the normal DB utilities.

This saves you a lot of effort in building the DB interaction.
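
To make the flat-file idea concrete, here is a sketch of the rotation logic (the file names follow the myfileNNNN.csv pattern above; the 50 MB cutoff and the type names are assumptions, not a definitive implementation):

```swift
import Foundation

// Sketch: append CSV lines to the current file, and once it passes the
// size limit, cut it off and start the next numbered file.
final class RotatingCSVWriter {
    private let directory: URL
    private let maxBytes = 50 * 1024 * 1024  // ~50 MB cutoff, per the post
    private var index = 1

    init(directory: URL) { self.directory = directory }

    private var currentURL: URL {
        directory.appendingPathComponent(String(format: "myfile%04d.csv", index))
    }

    func append(line: String) throws {
        let fm = FileManager.default
        // Rotate when the current file has reached the limit.
        if let attrs = try? fm.attributesOfItem(atPath: currentURL.path),
           let size = attrs[.size] as? Int, size >= maxBytes {
            index += 1
        }
        let url = currentURL
        if !fm.fileExists(atPath: url.path) {
            fm.createFile(atPath: url.path, contents: nil)
        }
        let handle = try FileHandle(forWritingTo: url)
        defer { handle.closeFile() }
        _ = handle.seekToEndOfFile()
        handle.write(Data((line + "\n").utf8))
    }
}
```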
 
At the moment I'm just researching the best solution. I'd like it to be OS X based, and Core Data is preferred as it seems hassle-free to implement and maintain.

Writing objects would seem the best way. The iPhone app will not need real-time access to the data; the only time the app will pull data from the DB is when a user authenticates.

At the moment I have written very little code for this project, so realistically I am unsure how much data is going to be produced; I'm just in the research stage. The bandwidth problem you mentioned is something I'm concerned about, but I'll have to get a bit further along with the project before I know more accurately how much data will be produced.

Thanks a lot for your help and the flat file suggestion. I shall look into that.
 
From a real-world perspective, I would never write objects to a file.

The problem is that when something goes wrong, you need to build a set of tools to analyze and search the data. Any time I build a new system, I always build a text file format for the output.

In the line of work I do most (financial services), we always write to file in addition to writing to the database. It allows portability and backup, so you can bring data into either QA or development and re-create/reproduce things fairly quickly. Serialized objects sound cool, but I've never found them to be a good solution. It's easier to pop the data into an editor or grep it for something specific.

Think about how you will need the data in the future, not how you can take advantage of built-in features.

As far as bandwidth goes, host providers are pretty flexible, but for what you're doing you may need to think about how to deal with regional server farms, etc. Make the infrastructure flexible; maybe take advantage of the GPS capability and figure out which regional host to send the data to.

Are you going to open the product up to different countries? If so, are you going to host in those countries? Two messages per second per iPhone might be a bit much to ship halfway around the world (protocol timeouts, etc.).

Lots of basic decisions to make first... The rest will follow.
 
Thanks for your advice, it all seems to make sense.

It would be about two messages per minute. Some more, some less, depending on the user.

To begin with it would only be released in the UK; with it being a small place, one host should be adequate. I'll tackle each region separately, depending on its geographical and economic attributes.
 