Downloading data from many sources concurrently

Discussion in 'Mac Programming' started by Soulstorm, Jun 28, 2010.

  1. Soulstorm macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #1
    Normally, I would post this question under the iPhone programming section, but since the answer may also apply to OS X, I thought I should post it here.

    I have a program that needs to download data from many sources concurrently. There are so many sources (75) that I want to limit the concurrent download operations to 5 at a time. So, when a download completes, another download starts, and the data from the previous download is displayed on screen.

    All these will be done in the background. My question is: Will NSOperationQueue help? If yes, how? I'm not looking for plain code, I am rather looking for general directions. If you know of any better way, please tell me so.

    Thank you in advance for any information.
     
  2. whooleytoo macrumors 603

    whooleytoo

    Joined:
    Aug 2, 2002
    Location:
    Cork, Ireland.
    #2
    Typically, I'd just create my own queue (it could be as simple as an NSArray of NSDictionary objects, with each NSDictionary storing the download URL and status for one download).

    Start 5 downloads (asynchronously), and tag them as 'in progress', and every time you get a download complete, you change the status field for that download to 'complete' and iterate through the array to find the next download. When all are 'complete', you're done.

    If you use a separate thread for each download, you just need to be careful updating the array on complete, since multiple threads might finish and update the array simultaneously. Updating the array only on the main thread should help.
     
  3. kainjow Moderator emeritus

    kainjow

    Joined:
    Jun 15, 2000
    #3
    NSOperationQueue is probably what you want. It lets you set the maximum number of operations that run concurrently, and it handles all the queue management you'd have to do yourself if you rolled your own. Just provide a custom NSOperation that handles the downloading. NSURLConnection already works in the background, and if you're on 10.6 the queue uses GCD, so the operations are automatically run in the background.

    I have a URLConnectionOperation class that I wrote for a test project a few months ago. If you're curious I can post it.
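
    As a rough illustration of the queue setup described above, here is a minimal sketch. It assumes ARC and blocks (10.6 / iOS 4 or later), whereas the code later in this thread uses manual retain/release. The function name StartDownloads and the handler block are hypothetical and are not part of any library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: queues one download per URL and lets the queue
// run at most five of them at the same time.
static NSOperationQueue *StartDownloads(NSArray *urls,
                                        void (^handler)(NSURL *url, NSData *data))
{
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    [queue setMaxConcurrentOperationCount:5];

    for (NSURL *url in urls) {
        [queue addOperationWithBlock:^{
            // Runs on a background thread, so a blocking fetch is acceptable here.
            // data is nil if the download failed.
            NSData *data = [NSData dataWithContentsOfURL:url];

            // Hand the result back to the main thread for display.
            [[NSOperationQueue mainQueue] addOperationWithBlock:^{
                handler(url, data);
            }];
        }];
    }
    return queue; // keep a reference so the remaining work can be cancelled later
}
```

    With 75 URLs in the array, the queue starts the next download as soon as one of the five running ones completes, which is the behaviour asked for in the first post.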
     
  4. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #4
    Thanks a lot for your answer. Although I wrote that I'm not looking for plain code, I would be grateful if I saw some code with NSOperations. Apple's examples left me with some questions.

    1) When does it end? I mean, I have some functions that happen in the NSOperation, but how does Cocoa understand when an Operation is complete and another one must start from the Queue?
    2) I have an RSS downloader and parser in the same object. The downloader works with NSURLConnection, since it provides asynchronous downloading, methods for testing connection establishment, and methods to validate the data as it arrives (in chunks). But since NSURLConnection already works in the background, would it make sense to also put it in a queue operation?
     
  5. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #5
    NSOperation is not meant to be used directly. You use one of the pre-defined subclasses (NSBlockOperation or NSInvocationOperation) or subclass yourself. Assuming you use the block or invocation operation then the operation completes when that block or invocation completes.
     
  6. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #6
    I use the subclassing method. I will also check NSInvocationOperation and NSBlockOperation to see if they suit my needs.
     
  7. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #7
    If you've subclassed it then, as per the documentation, you override

    start
    isConcurrent
    isExecuting
    isFinished

    When your operation completes you (again as per the documentation) send KVO notifications for the last two of those methods. In that way the queue/execution system knows your operation is complete.
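
    The overrides and KVO notifications listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes ARC, and the class name AsyncOperation, its ivars, and the -finish method are hypothetical names introduced here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical base class for a concurrent operation.
@interface AsyncOperation : NSOperation {
    BOOL _opExecuting;
    BOOL _opFinished;
}
- (void)finish; // call this once the asynchronous work has completed
@end

@implementation AsyncOperation

- (BOOL)isConcurrent { return YES; }
- (BOOL)isExecuting  { @synchronized (self) { return _opExecuting; } }
- (BOOL)isFinished   { @synchronized (self) { return _opFinished; } }

- (void)start
{
    if ([self isCancelled]) {
        // Even a cancelled operation must report that it finished,
        // otherwise the queue never removes it.
        [self willChangeValueForKey:@"isFinished"];
        @synchronized (self) { _opFinished = YES; }
        [self didChangeValueForKey:@"isFinished"];
        return;
    }

    [self willChangeValueForKey:@"isExecuting"];
    @synchronized (self) { _opExecuting = YES; }
    [self didChangeValueForKey:@"isExecuting"];

    // Subclasses begin their asynchronous work in -main and call -finish later.
    [self main];
}

- (void)finish
{
    // These two notifications are what tell the queue the operation is done.
    [self willChangeValueForKey:@"isExecuting"];
    [self willChangeValueForKey:@"isFinished"];
    @synchronized (self) {
        _opExecuting = NO;
        _opFinished = YES;
    }
    [self didChangeValueForKey:@"isFinished"];
    [self didChangeValueForKey:@"isExecuting"];
}

@end
```

    An operation whose isFinished never changes to YES, or changes without the KVO notifications, stays in the queue indefinitely and keeps occupying one of the concurrent slots.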
     
  8. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #8
    Thanks. However, according to the documentation, the methods you suggest I override apply only to concurrent operations. I need to download all this data and parse it in the background, and I want the results to be shown on the main thread.

    Wouldn't it be better if I overrode only the -main method to make the NSOperation subclass non-concurrent? I'm just asking, not correcting; I'm not yet clear on how concurrent and non-concurrent operations would help me in different ways... :)
     
  9. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #9
    So if you want your operations running in the background and more than one at once I think you need concurrent operations.
     
  10. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #10
    OK now I have a problem:

    I didn't change my code in the class that downloads and parses the RSS feed. Instead, I made a new NSOperation Subclass:

    Code:
    @interface DownloadOperation : NSOperation <SFHTTPDownloaderDelegate> {
    	SFRSSDownloaderParser *rssParser;
    	id<SFRSSDownloaderParserDelegate> delegate;
    	BOOL finishedParsing;
    }
    @property (nonatomic, retain) SFRSSDownloaderParser *rssParser;
    @property (nonatomic, assign) id<SFRSSDownloaderParserDelegate> delegate;
    
    - (id) initWithURLLocation:(NSString *)urlString andDelegate:(id)del;
    @end
    
    Implementation:

    Code:
    @implementation DownloadOperation
    @synthesize rssParser;
    @synthesize delegate;
    
    - (id) initWithURLLocation:(NSString *)urlString andDelegate:(id)del
    {
    	self = [super init];
    	if (self != nil) {
    		self.delegate = del;
    		rssParser = [[SFRSSDownloaderParser alloc]initWithURLLocation:urlString delegate:self];
    		finishedParsing = NO;
    	}
    	return self;
    }
    
    - (void) start
    {
    	NSLog(@"starting...");
    	[self.rssParser start];
    }
    
    - (BOOL) isConcurrent
    {
    	return YES;
    }
    
    - (BOOL) isExecuting
    {
    	return YES;
    }
    
    - (BOOL) isFinished
    {
    	return NO;
    }
    
    - (void)sfRssParserDidEndParsing:(SFRSSDownloaderParser *)sfparser
    {
    	NSLog(@"did end parsing!");
    }
    
    - (void) dealloc
    {
    	[rssParser release];
    	[super dealloc];
    }
    
    @end
    The code that inits and starts the NSOperationQueue is this:
    Code:
    	
    operationQueue = [[NSOperationQueue alloc]init];
    [operationQueue setMaxConcurrentOperationCount:3];
    int i;
    for (i=0; i<1; i++) {
    	// The queue retains the operation, so balance the alloc with an autorelease.
    	[operationQueue addOperation:[[[DownloadOperation alloc]initWithURLLocation:@"http://images.apple.com/main/rss/hotnews/hotnews.rss" andDelegate:self]autorelease]];
    }
    ...and it exists inside the -viewDidLoad method of a view controller in the iPhone SDK (similar to -awakeFromNib).

    However, I am not getting any result! It seems that "[rssParser start]" IS being called, but none of the delegate methods of the NSURLConnection created inside it are being called.

    The strangest thing is that if I change the "-init" function to this:

    Code:
    - (id) initWithURLLocation:(NSString *)urlString andDelegate:(id)del
    {
    	self = [super init];
    	if (self != nil) {
    		self.delegate = del;
    		rssParser = [[SFRSSDownloaderParser alloc]initWithURLLocation:urlString delegate:self];
    		finishedParsing = NO;
    		[rssParser start];
    	}
    	return self;
    }
    The code will work normally! Although that would defeat the purpose of having to use NSOperationQueues at all.

    Any ideas?
     
  11. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #11
    I've no idea if this is really the issue or not but it's one path to investigate. If your NSOperations are being scheduled on threads that are not the main thread then do those threads have NSRunLoops? If not then I'm not sure the async methods of NSURLConnection will work correctly as I think they depend on being in a run loop...
     
  12. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #12
    Can you elaborate? How can I test if this is the case?
     
  13. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #13
    The simplest way to check which thread each NSOperation is running on would be to NSLog the name of the currentThread in the code that actually performs the operation. If they are executing on separate threads, I would try creating my own NSRunLoop before creating the NSURLConnection and see if that fixes the issue.
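
    One way the run loop suggestion could look in practice is sketched below. A thread's run loop is obtained with +[NSRunLoop currentRunLoop] rather than allocated directly, and it has to be run for the connection's delegate callbacks to be delivered on that thread. The class name RunLoopDownloader and its method are hypothetical, the sketch assumes ARC, and NSURLConnection is deprecated in current SDKs, so treat this as an illustration of the idea only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: performs an NSURLConnection download on the calling
// (background) thread by running that thread's run loop until it completes.
@interface RunLoopDownloader : NSObject {
    NSURLConnection *_connection;
    NSMutableData *_data;
    BOOL _done;
}
- (NSData *)downloadURL:(NSURL *)url; // returns nil on failure
@end

@implementation RunLoopDownloader

- (NSData *)downloadURL:(NSURL *)url
{
    _data = [NSMutableData data];
    _done = NO;

    // The connection is scheduled on the current thread's run loop,
    // in the default mode, when it is created.
    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    _connection = [[NSURLConnection alloc] initWithRequest:request delegate:self];
    if (_connection == nil) {
        return nil;
    }

    // Without this loop a worker thread returns immediately and
    // the delegate methods below are never called.
    while (!_done) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode
                                 beforeDate:[NSDate distantFuture]];
    }
    _connection = nil;
    return _data;
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)chunk
{
    [_data appendData:chunk];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    _done = YES;
}

- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error
{
    _data = nil;
    _done = YES;
}

@end
```

    The delegate callbacks arrive on the thread that created the connection, so _done is only ever touched from that one thread and needs no locking here.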
     
  14. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #14
    Perhaps you are right:

    First of all, using the method you suggested, the -main method of my NSOperation, and the -start method of RSSParser exist in different threads.
    Another thing I noticed:

    This is what my code looks now in the -start method of my rss parser:

    Code:
    - (void)start
    {
    	NSLog(@"now this is the name: %@", [[NSThread currentThread]name]);
    	NSLog(@"starting download");
    	NSURL *requestURL = [NSURL URLWithString:self.url];
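    	// NSURLCacheStorageNotAllowed is a cache *storage* policy, not a request cache policy; bypass the local cache explicitly instead.
    	NSURLRequest *request = [NSURLRequest requestWithURL:requestURL cachePolicy:NSURLRequestReloadIgnoringLocalCacheData timeoutInterval:60.0f];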
    	NSURLConnection *newConnection = [[NSURLConnection alloc]initWithRequest:request delegate:self];
    	self.httpConnection = newConnection;
    	[newConnection release];
    	
    	if (self.httpConnection == nil) {
    		NSLog(@"connection is nil!");
    	}else {
    		NSLog(@"connection is ok...");
    	}
    }
    
    All the NSLogs appear correctly; only the NSURLConnection delegate methods are not called. If I change the method to this:
    Code:
    - (void)start
    {
    	// Synchronous fetch: blocks the calling thread until the feed has loaded.
    	self.httpData = [[[NSString stringWithContentsOfURL:[NSURL URLWithString:self.url]]dataUsingEncoding:NSUTF8StringEncoding]mutableCopy];
    	[self connectionLoaded];
    }
    It will work just fine.

    So I think I will try creating run loops to see what happens. Must I create them inside the -main function of the NSOperation class, or the -start function of my RSS Parser?
     
  15. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #15
    robbieduncan, thank you very much.

    By creating a run loop for NSURLConnection, I was able to make things work. Something is crashing in the app now, but I'm sure that I will find the cause somewhere else, as I was able to download and parse data using NSOperation.

    Thanks a lot, robbie, and everyone who has posted answers in this thread.
     
  16. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #16
    The -start I would have thought. Otherwise they won't be in the correct thread.
     
  17. iSee macrumors 68040

    iSee

    Joined:
    Oct 25, 2004
    #17
    I don't know exactly what your current problem is, but I do have a suggestion that may help:

    Make your subclass of NSOperation be a non-concurrent operation.
    Therefore, you will only override -main.

    It doesn't make sense to use concurrent operations with an operation queue.
    That's because a concurrent operation manages running itself in the background. But an operation queue will also manage running its operations in the background. There's no point in both of them doing it, and the additional complexity can only lead to problems.

    Furthermore, I'd make the download and parse that SFRSSDownloaderParser does synchronous (non-concurrent). Because you call start and wait for a delegate callback, it looks like it is also managing a background task (the download, I assume). But again, there's no reason to make it work in the background too. The extra complexity can only cause problems. That way your operation's -main method can be a simple block of code that executes start to finish. Remember, -main executes in the background, so you're free to be as synchronous as you like within it.

    At the end of main you'd probably want to notify your view controller so that it can update the UI with a call something like:

    [myViewController performSelectorOnMainThread:@selector(didFinishDownloadAndParsing:) withObject:resultsOfParsing waitUntilDone:NO];
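
    Put together, the suggestion above might look something like this minimal sketch. It assumes ARC, and the class name FeedDownloadOperation, the target object, and the feedOperationDidFinish: selector are all hypothetical names introduced for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical non-concurrent operation: only -main is overridden,
// and the queue takes care of running it in the background.
@interface FeedDownloadOperation : NSOperation
- (id)initWithURL:(NSURL *)url target:(id)target;
@end

@implementation FeedDownloadOperation {
    NSURL *_url;
    __weak id _target; // usually the view controller; not retained
}

- (id)initWithURL:(NSURL *)url target:(id)target
{
    self = [super init];
    if (self != nil) {
        _url = url;
        _target = target;
    }
    return self;
}

- (void)main
{
    // -cancel only sets a flag; the operation has to check it itself.
    if ([self isCancelled]) {
        return;
    }

    // Blocks this background thread until the whole response has arrived.
    NSError *error = nil;
    NSData *data = [NSData dataWithContentsOfURL:_url options:0 error:&error];

    if ([self isCancelled] || data == nil) {
        return;
    }

    // Parsing would go here, still on the background thread.

    // Deliver the result on the main thread so the UI can be updated safely.
    [_target performSelectorOnMainThread:@selector(feedOperationDidFinish:)
                              withObject:data
                           waitUntilDone:NO];
}

@end
```

    When -main returns, the operation is marked as finished automatically, so no manual KVO notifications are needed in this variant.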
     
  18. Soulstorm thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #18
    I understand. But, if I make the downloading of the URL completely synchronous, how will I be able to cancel the downloading of the data when the user exits the current view?

    Will simply calling -cancel be enough?
     
