Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

Cerebrus' Maw

macrumors 6502
Original poster
Mar 9, 2008
409
1
Brisbane, Australia
The background I have a PHP script that calls a Mysql table that has many address on it. The script basically calls the Google Map API, and checks to see if the address's can geo-locate. If google returns an error, then that address cannot be geo located, and the script asks the user to update that particular address.

The Problem There are a muliptude of tables, each with a varying amount of address. My problem is that the default 30 second max execution time for a script seems to get exceeded, as the simple_load_xml call can take a bit of time (relatively speaking)

I've tried using the set_time_limit() to over ride the php.ini limit of 30 seconds, but I think I'm still being caught out by the httpd.conf, since I timed it at about 5 minutes before the page basically kaputs and blanks out on me.

I dont have access to this file unfortunately, so I'm wondering, does anyone have a tip for getting around this?
 

Cabbit

macrumors 68020
Jan 30, 2006
2,128
1
Scotland
The only thing i can think of is breaking your SQL query down into smaller chunks, if your asking it to find to much information and therefore causing a timeout then your script is ether inefficient or the the SQL database to slow.

As for changing the php_time_limit() perhaps this is changed in the php.ini script or can be over written in your own script i will need to look that up before i can make a educated comment.
 

Cerebrus' Maw

macrumors 6502
Original poster
Mar 9, 2008
409
1
Brisbane, Australia
The number of fields in the tables are pretty small, its just first address line, second address line, an increment ID (for uniqueness), and a customer name, so my sql cant get any faster/smaller

I think I'll just have to limit the number of calls to maybe a hundred ago, and find some way of keeping track of which records I've tried...
 

SrWebDeveloper

macrumors 68000
Dec 7, 2007
1,871
3
Alexandria, VA, USA
"Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running." -- from PHP.net docs on this function

If you use Apache you can change maximum execution time by .htaccess with this line:

php_value max_execution_time 200


-jim
 

Cerebrus' Maw

macrumors 6502
Original poster
Mar 9, 2008
409
1
Brisbane, Australia
"Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running." -- from PHP.net docs on this function

If you use Apache you can change maximum execution time by .htaccess with this line:

php_value max_execution_time 200


-jim

The script itself is really small, just a single file maybe 40 lines, so where php.net states that it does not include the execution time of database queries outside of the script I dont think is a factor. I just thinks its the time for the XML to send to the Maps and get a response. A single table probably has over 12000 address (not much if I was just using locally)

I'll see if I can get access to the .htaccess file from work, and if I can implement the change you suggested . Sigh, corporate big wigs interfering...
 

savar

macrumors 68000
Jun 6, 2003
1,950
0
District of Columbia
"Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running." -- from PHP.net docs on this function

If you use Apache you can change maximum execution time by .htaccess with this line:

php_value max_execution_time 200


-jim

That's a misleading quote. Any call you make in PHP which blocks (such as a database query) will still count toward the time limit.

To the OP. You're asking for a silver bullet. Let's look at the facts. You have a long script that needs to run, and the Apache timeout is not long enough to complete the script, and you can't modify the Apache conf.

My suggestion: re-design.

Move the long-running query to a command line script. (Long running queries in the browser interface are bad for usability anyway. Nobody wants to sit and stare at the screen for 15 minutes while there is no activity on the screen.) The long-running query should set flags in the database that indicate a user's intervention is needed. Then the web interface just queries for that flag.

Edit: More importantly, your users may get bored and leave. Or their browsers might be the source of the timeout. And that's definitely out of your control. The apache timeout configuration item is related to persistent connections, not single transactions.

And the solution Jim posted is just for setting the PHP value, not the Apache value. The PHP value is just the same thing as calling set_time_limit. You can call set_time_limit(0) to remove the PHP time limit. The apache directive you [might] want to look at is here: http://httpd.apache.org/docs/2.0/mod/core.html#timeout
 

SrWebDeveloper

macrumors 68000
Dec 7, 2007
1,871
3
Alexandria, VA, USA
Any call you make in PHP which blocks (such as a database query) will still count toward the time limit.

If I read the OP's most recent comment properly, the DB is not at issue anyway, savar. The quote from PHP.net was just so the user knows the limitations of the two timeout functions and what they really do. And it makes perfect sense to me. However, as to what you wrote. Which blocks what, i.e. database? Time is "blocked"? Confused by your choice of words, I guess! :eek:

Anyway, you hit the nail on the head as to the OP rethinking the approach to handling this data. I'm with you on that point, entirely, and thank you for making the CLI suggestion. Out of the box thinking like that shows you're a true pro.

Another approach is to simply make the viewable portion of the map smaller, and only query points of interest within those coordinates. If you're doing this already, then identify a sensible drill down process, i.e. the user only sees "states" on the first query, then if they select a state via Ajax query "cities" within that state, and so on. I don't know the actual heirchy of your data, but you get the idea.

It should be noted this approach is sensible for any dataset - always think about how you can speed up a script and reduce database calls to the absolute minimum on every project you do, large or small. My .02, not everyone might agree, and as usual - to each their own, this is only a suggestion.

-jim
 

Cerebrus' Maw

macrumors 6502
Original poster
Mar 9, 2008
409
1
Brisbane, Australia
Cheers for the input guys.

Luckily, this is not for user usage, it's more of a maintainence script that I will eventually hand over to crontab, to run once every month. (sorry, should have menioned that at the start.

Each table links to an Area in Australia. So for example, one table would have all the address within Western Australia (pretty big place) but another table might only have address's within South Sydney.


Edit: Got it working. 23436 address geo-ed over 3 domains, in 2 countries.
 

savar

macrumors 68000
Jun 6, 2003
1,950
0
District of Columbia
However, as to what you wrote. Which blocks what, i.e. database? Time is "blocked"? Confused by your choice of words, I guess! :eek:
I meant "blocked" in the intransitive sense, as in a thread "blocks" when it is waiting for something else to happen. When you issue a DB query in PHP, PHP has to wait for the DB to do some calculations and send results back. PHP halts where it is while that external action happens, and it doesn't make any forward progress while it waits for the DB.

Not usually applicable in web programming since we're almost always single-threaded, but technically system calls and DB calls are separate threads. (Actually, separate threads in separate processes, possibly on a separate machine!)

Luckily, this is not for user usage, it's more of a maintainence script that I will eventually hand over to crontab, to run once every month. (sorry, should have menioned that at the start.

Ahh. If you run in command line mode from the crontab, then that solves your problem right there. Call set_time_limit(0) and then you can spend all day processing if you want.

Each table links to an Area in Australia. So for example, one table would have all the address within Western Australia (pretty big place) but another table might only have address's within South Sydney.

Sounds like an odd design. Why wouldn't all addresses be in a single table?

Edit: Got it working. 23436 address geo-ed over 3 domains, in 2 countries.

Congrats! Very cool. Do you have a commercial license for Google maps API? I know they have limits on how many addresses you can geocode. Are you able to do all that with the free account?

Project sounds like fun. I'm guessing you're looking at GPS data and trying to figure out street addresses automatically? Post a link if you can. I'd like to check it out.
 

Cerebrus' Maw

macrumors 6502
Original poster
Mar 9, 2008
409
1
Brisbane, Australia
Cheers mate.

The free Google API key allows 15000 requests every day off your domain. There are some great API's, which include directions, reverse geo locating, lat/lng geocoding, and you can do some really interesting stuff with overlays (continuous zoom anyone?) They also limit the speed of which you request, but I've never hit this limit.

As for the purpose. I work as part of a car/camper/truck rental compare website. We implemented this little page, where when you confirm a vehicle, you can enter an address, and we'll show you how to get from there, to the depot, or vice versa, directions included

But as we have thousands of these depots, we have to validate their address's (which are in human readable format, not geo coordinates), hence running them against Google.

Usually, its something silly, like for an Airport address, it will be : Domestic and International Terminals, outbound, Sydney Airport. Google is really robust, but even it wont get that. So we just change it to Sydney Airport, Sydney, NSW.

I cant show you the page unfortuanately (unless you want a car rental :D ) , and the other is on the test environment.


And yes, our tables are pretty odd. It has to do with the fact that we have expanded over time, growing into more regions. I believe it was originally thought that we wanted to provide a more local experience to customers (we get their location, and then only supply those applicable, but it was before my time here)

Here is the code (unashamedly copied from the example by Google)
Code:
while ($rowloc = mysql_fetch_array($resloc))
{
     $supplier = $rowloc[supplier_id];
     $loc_id = $rowloc[loc_id];
     $geocode_pending = true;
     $address= $rowloc[loc_addr1]." ".$rowloc[loc_addr2]." Australia";         

     while ($geocode_pending) {
     $request_url = $base_url . "&q=" . urlencode($address);
     $xml = simplexml_load_file($request_url) or die("url not loading");

     $status = $xml->Response->Status->code;

    if (strcmp($status, "200") == 0) {
      // Successful geocode     
      $geocode_pending = false;
    } else if (strcmp($status, "620") == 0) {
      // sent geocodes too fast
      $delay += 100000;
    } else {
      // failure to geocode
      $geocode_pending = false;
      echo "$supplier - $loc_id :" . $address . " <b>Failed</b> to geocode. Please update.<br /><br /> ";      
    }
    usleep($delay);
  }
echo "Search Complete";
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.