Slow SQLite Query

Discussion in 'Mac Programming' started by Starfox, Sep 9, 2012.

  1. Starfox macrumors regular

    Joined:
    Apr 7, 2011
    #1
    I have this query:

    Code:
    SELECT UserID, JSON FROM TweetsJSON WHERE TweetID <= 999999999999999 AND Time >= 1315535679 AND Time <= 1315708479 ORDER BY TweetID DESC  LIMIT 100;
    
    that's taking ages to complete - TweetID is INTEGER PRIMARY KEY and there's an index on Time. Any idea what's wrong there?
     
  2. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #2
    How long is "ages"? Hundreds of milliseconds? Hundreds of hours? Be specific.


    Break it down. Try a statement with each WHERE subexpression singly:
    Code:
    SELECT UserID, JSON FROM TweetsJSON WHERE TweetID <= 999999999999999 ORDER BY TweetID DESC  LIMIT 100;
    
    SELECT UserID, JSON FROM TweetsJSON WHERE Time >= 1315535679 ORDER BY TweetID DESC  LIMIT 100;
    
    SELECT UserID, JSON FROM TweetsJSON WHERE Time <= 1315708479 ORDER BY TweetID DESC  LIMIT 100;
    
    Which one is fastest? Which one is slowest?

    Try again with two subexpressions (3 combinations). Which pair is fastest? Slowest?

    Reorder the AND clauses. Which is fastest? Slowest?

    If they're all equally slow, eliminate clauses, starting with LIMIT 100 and working backwards (i.e. eliminate DESC, then eliminate ORDER BY TweetID).


    None of this is magically going to give a complete answer. It's just basic debugging detective work. But you'll need the basic data in order to build a complete answer.
     
  3. xStep macrumors 68000

    Joined:
    Jan 28, 2003
    Location:
    Less lost in L.A.
    #3
    The query suggests you are searching through a quadrillion records. That's an interesting problem. The database optimizer may not handle such a generic query with so many possibilities. Even it is using the best solution, that is a lot of indexes to be searching through for a range rather than exact values.

    The optimizer might get some help if you place parenthesis around the time selection like so: (Time >= 1315535679 AND Time <= 1315708479). Or use the BETWEEN option instead.

    Also, is 1315708479 a valid time? I'm reading it as HHMMSS.ssss. If so, then 70 isn't valid.

    I'd experiment by setting the limit to 1, 10, 20, 30, etc and time the result. I'd do that to understand if the time to get data goes up exponentially. I'd also experiment with using a date range, if available, rather than TweetID.

    As chown33 said, it takes work to find a best solution.
     
  4. Starfox thread starter macrumors regular

    Joined:
    Apr 7, 2011
    #4
    Time values are stored as UNIX timestamps, seconds since January 1st 1970, in 64-bit integers. I'll try what you guys suggested and see what I can find out, thanks.
     

Share This Page