modest Posted April 18, 2010 Report Posted April 18, 2010 Say i have a coordinate of nyc, 42.101483,-72.589811, how do i find out what the offset in degrees is for 5 miles north/south and east/west of that location is (maybe somewhere around 0.02 degrees or something around that, i am guessing) Ok, hopefully I've stumbled out of my mid-Saturday haze—although I think I may have done so drunkenly at this point To do this I think you need the radius of curvature for lat and long whic hwiki gives here:Latitude - Wikipedia, the free encyclopedia So... if I follow the discussion, you want to draw a square on the earth (expressed in degrees lat and long) and make it larger and larger. It's would be easy for latitude because a degree of latitude is always very near 110 km. But, Longitude get's all messed up toward the poles. Nonetheless, all I think you need are the N(phi) and M(phi) radius of curvature given at that link. I thinkk, written in Perl, this might be what you're looking for:use Math::Trig; $a=6378.137; $b=6356.7523142; print "What is the latitude (positive in degrees)?"; $Lat = <>; print "What is is the search radius in km?"; $ScaleF = <>; $Phi=deg2rad($Lat); $M=($a*$**2/((($a*cos($Phi))**2+($b*sin($Phi))**2)**(3/2)); $N=($a**2)/(sqrt(($a*cos($Phi))**2+($b*sin($Phi))**2)); $dLat=$ScaleF/(3.14159265/180*$M); $dLong=$ScaleF/(3.14159265/180*cos($Phi)*$N); print "add plus and minus $dLat degrees to the latitude n"; print "add plus and minus $dLong degrees to the longitude n"; Sorry, I really am not anywhere near a stone's throw from a coder. The NYC example you give at 42.101483,-72.589811 and 5 miles (8 km) gives this ouput: What is the latitude (positive in degrees)?42.101483 What is is the search radius in km?8 add plus and minus 0.0720232426099533 degrees to the latitude add plus and minus 0.0967129797928483 degrees to the longitude for a hundred kilometers surrounding NYC:What is the latitude (positive in degrees)?42.101483 What is is the search radius in km?100 add plus and minus 0.900290532624416 degrees to the latitude add plus and minus 1.2089122474106 degrees to the longitude For 200 km surrounding Reykjavik Iceland: What is the latitude (positive in degrees)?64.15 What is is the search radius in km?200 add plus and minus 1.79404922995349 degrees to the latitude add plus and minus 4.10936993425326 degrees to the longitude I think that's working. ~modest edit: A thought just woke me. Latitude can be negative. I don't know why I wrote that it should be positive. Quote
jedaisoul Posted April 18, 2010 Report Posted April 18, 2010 To do this I think you need the radius of curvature for lat and long which wiki gives here: So... if I follow the discussion, you want to draw a square on the earth (expressed in degrees lat and long) and make it larger and larger. It's would be easy for latitude because a degree of latitude is always very near 110 km. But, Longitude get's all messed up toward the poles. Nonetheless, all I think you need are the N(phi) and M(phi) radius of curvature given at that link. I thinkk, written in Perl, this might be what you're looking for:The original question was: Basically problem is such, you have a set of coordinates (lat, long), you have a database that references coordinates to locations, say big cities. Now how do you find where the nearest city is?I agree with alexander that selecting likely cities by iteratively incrementing an offset until some are found is more efficient than calculating the distance from every city. However, I maintain that:a) Selecting likely cities in a radius is unnecessarily cumbersome. Compensation for the curvature of the Earth is either: i. Totally unnecessary (for distances up to hundreds of miles) or... ii. Can be approximated by linear adjustment (trapezoid), which should be much quicker than spherical calculations. Basically, what I'm proposing is similar to normalized maps, where the lines of longitude are "straightened up" to make them parallel. So you can divide the map into regular squares of longitude and latitude. This distorts the distances, but I suspect that the effect is not significant up to the hundreds of miles. For distances in the thousands of miles I propose using a regular trapezium as the basic shape. This would be:c) More accurate than a square.d) Less computationally intensive than a true curved surface. The process could be:1. Normalize the longitude offset to compensate for the latitude of the location. This "squares up" the search area, irrespective of the latitude. Locations near the poles might need special consideration, but thats a detail. When I last checked there were no cities near either pole anyway! 2. Chose an arbitrary offset size (in degrees) and find all cities within that latitude and (normalized) longitude of the location. 3. If no cities are found, increment and repeat 2 until some are found. 4. Increment the offset size by > 1.4142135 (square root of 2) and search once more. The purpose of 4 is that, if the nearest city found happens to be NE, SE, NW or SW of the location, there could be a closer city N, S, E or W of the location that is not included in the search. So extend the search area to include those locations w.r.t. the chosen location. This is all done with simple maths. No need to calculate a radius from the location, and to select cities on that basis. 5. Having selected all the likely cities, and having already normalized the longitude, the nearest city can be found by using Pythagoras' theorem on the latitude and (normalized) longitude offset of each city to compute the "distance" of each from the selected location. Note: This "distance" is a nominal figure. It does not need to be expressed in miles. 6. If the distance is small enough that the distortions are not significant:a) Convert the nominal distance for this city into miles. Return the closest city and its distance. This leaves the question of what to do if the distortion is significant. There are a number of approaches to this, and it would need to be empirically tested which is the most efficient. Possible solutions are:i. Repeat the selection of cities and the calculation of their distances using the "trapeziod" adjustment. Or...ii. Include the "trapezoid" adjustment into the basic selection of cities and calculation of their distances. I might see about actually coding this, to see if it works... Quote
modest Posted April 18, 2010 Report Posted April 18, 2010 Basically, what I'm proposing is similar to normalized maps, where the lines of longitude are "straightened up" to make them parallel. So you can divide the map into regular squares of longitude and latitude. This distorts the distances, but I suspect that the effect is not significant up to the hundreds of miles. Not exactly. The distortion is significant at less than a mile near the north pole. As you saw, near Reykjavik Iceland the distortion is more than double. Really, you need some method to account for longitude's increasing disagreement with latitude as latitude approaches 90 degrees. If you want to be exact, you should also account for earth being a spheroid rather than a sphere. My method accounts for both. Except for those objections, I believe what I just wrote does exactly what you ask. You pass a function a latitude and longitude and a search radius and it gives you the appropriate bounds in lat and long. You can increase the search radius as needed. I agree with you that the search radius needs to be scaled up even after a city is found because, as they say, a circle fits in a square. ~modest EDIT: I guess really, it should be scaled up until a few cities are found then the exact distance should be found on all of them with Haversine. Quote
jedaisoul Posted April 18, 2010 Report Posted April 18, 2010 Not exactly. The distortion is significant at less than a mile near the north pole. As you saw, near Reykjavik Iceland the distortion is more than double. Really, you need some method to account for longitude's increasing disagreement with latitude as latitude approaches 90 degrees. If you want to be exact, you should also account for earth being a spheroid rather than a sphere. My method accounts for both. Except for those objections, I believe what I just wrote does exactly what you ask. You pass a function a latitude and longitude and a search radius and it gives you the appropriate bounds in lat and long. You can increase the search radius as needed. I agree with you that the search radius needs to be scaled up even after a city is found because, as they say, a circle fits in a square. ~modest EDIT: I guess really, it should be scaled up until a few cities are found then the exact distance should be found on all of them with Haversine.You seem to be missing the point, i.e. execution time. My linear approximations should be far quicker to compute than spherical ones (depending on the computer language used). However, I agree that the poles are a problem, so I suggest a novel solution:1. Set up a table of longitudes and latitudes for required cities.2. Convert the true longitudes and latitudes to a system placing north at 0o,0o (where the Prime Meridian meets the Equator, and place in "alternate location" fields in the database (or array).3. Then:a) For latitudes -45o to 45o use the original longitude and latitude. For latitudes outside this range use the "alternate location". The table of alternate locations need only be computed once, so does not affect the run time execution for a given location. The whole point is achieving a practical system capable of locating the city nearest to any location, with the minimum computational overhead. I would suggest that the accuracy need be no better that a mile or so, because all cities are a number of miles across anyway, so the location of any point in the city is only approximately given by the location of the "centre" of the city, which is arbitrary anyway! That would largely be true if we were using locations of airports rather than whole cities, e.g. for computing nearest alternate landing sites in an emergency. By the time you've decided which airport to head for, the aircraft would have moved from its original location anyway (a plane travelling at 360 mph moves a mile in just 10 seconds)! P.s. I'd also convert all the longitudes and latitudes from decimals to integers. Again for computation speed. Quote
modest Posted April 18, 2010 Report Posted April 18, 2010 You can save computationally by considering the earth a sphere: use Math::Trig; $a=6378.137; print "What is the latitude (positive in degrees)?"; $Lat = <>; print "What is is the search radius in km?"; $ScaleF = <>; $Phi=deg2rad($Lat); $dLat=$ScaleF/(3.14159265/180*$a); $dLong=$ScaleF/(3.14159265/180*cos($Phi)*$a); print "add plus and minus $dLat degrees to the latitude n"; print "add plus and minus $dLong degrees to the longitude n"; I would think this could be off by as much as a kilometer per thousand. The previous code was exact. $a and $b, I didn't make clear last night, are the equatorial and polar radius. The search radius can be done in any units of measure (even degrees latitude) so long as $a and $b are expressed in those units. Also, the point of having 3.14159265/180 in the last couple lines coverts radians back to degrees. That could have used some annotation. Conceptually though, I think your idea of scaling the points of the database then keeping the search blocks square should, at best, give exactly the same results as searching the scaled latitude and longitude of the normal database. Since the above method doesn't need an additional database I would think it should be easier. I also don't think your method would be faster. But, I do see what you're saying and I do think it would work. ~modest Quote
jedaisoul Posted April 18, 2010 Report Posted April 18, 2010 Conceptually though, I think your idea of scaling the points of the database then keeping the search blocks square should, at best, give exactly the same results as searching the scaled latitude and longitude of the normal database. Since the above method doesn't need an additional database I would think it should be easier. I also don't think your method would be faster. But, I do see what you're saying and I do think it would work.Thanks for these comments. I'm an old programmer and my instincts are to simplify the computation based on experience when computers were very much slower than today. So, yes, it is possible that my proposal would not be significantly faster today than more complex maths. This is also dependent on the computer language used. It needs to be strongly typed and support integer maths. Otherwise much of the speed advantage would be lost anyway. Quote
alexander Posted April 19, 2010 Author Report Posted April 19, 2010 So here are some interesting findings: first of all, i have been testing and toying with this for about a 1/2 a day now and i should really get some other work done, but, here's the code that i am currently using with some success: function latlongOffset($lat, $long, $distance) { $a = 6378137; $b = 6356752.3142; //equatorial and polar radii $Phi=deg2rad($lat); $M=pow($a*$b,2)/pow(pow($a*cos($Phi),2)+pow($b*sin($Phi),2),(3/2)); $N=pow($a,2)/(sqrt(pow(($a*cos($Phi)),2)+pow(($b*sin($Phi)),2))); return array("lat"=>$distance/((pi()/180)*$M), "long"=>$distance/((pi()/180)*cos($Phi)*$N)); } //takes range in meters function getNearest($lat, $long, $amount=1, $range=2500, $maxrange=NULL, $link=NULL) { $offset = ($maxrange==NULL || $maxrange<=$range)? $range : $maxrange; $ang_offset = latlongOffset($lat, $long, $offset); if(!$link) $link = new mysqli('127.0.0.0','user','pass','db'); $sql = "SELECT latitude, longitude, city, state, 6378137*(2*atan2(sqrt(pow(sin(radians(latitude-($lat))/2),2)+cos(radians(latitude))*cos(radians($lat))*pow(sin(radians(longitude-($long))/2),2)), sqrt(1-pow(sin(radians(latitude-($lat))/2),2)+cos(radians(latitude))*cos(radians($lat))*pow(sin(radians(longitude-($long))/2),2)))) as distance FROM cities WHERE latitude BETWEEN ".(($lat<$lat+$ang_offset["lat"]) ? $lat." AND ".($lat+$ang_offset["lat"]) : ($lat+$ang_offset["lat"])." AND ".$lat). " AND longitude BETWEEN ".(($long<$long+$ang_offset["long"]) ? $long." AND ".($long+$ang_offset["long"]) : ($long+$ang_offset["long"])." AND ".$long) . " ORDER BY distance"; echo $sql; $stmt = $link->prepare($sql); $stmt->execute(); $results = new MyResult($stmt); $cities = $results->getArray("MYSQLI_ASSOC"); if(count($cities)<=$amount && $offset != $maxrange) {return getNearest($lat,$long,$amount,$range*2,$maxrange,$link);} else return array_slice($cities, 0, $amount); } function toMiles($m) { return ($m/1609.344); } function toMeters($mi) { return ($mi*1609.344); } $lat=42.130821; $long=-72.590103; echo "Original location: nLat: $latnLong: $longn"; $cities = getNearest($lat, $long, 5, toMeters(5)); print_r($cities); things that i have found. I was using the Pythagorean theorem but i quickly found that even with a 5 mile offset, the distances would vary enough for the rows to be returned out of sync, so i had to implement haversine in the query to compute the distance. this is actual output:Array( [latitude] => 42.177620 [longitude] => -72.562566 [city] => CHICOPEE [state] => MA [angle] => 0.0542994730177)distance 3.52607024974 milesArray( [latitude] => 42.195565 [longitude] => -72.542478 [city] => CHICOPEE [state] => MA [angle] => 0.0803736658427)distance 5.0940196751 milesArray( [latitude] => 42.151278 [longitude] => -72.510524 [city] => INDIAN ORCHARD [state] => MA [angle] => 0.0821663318519)distance 4.32473107055 miles angle was the name there, because it computed angle differences, and it's ordered by angle (as you can see) I had a gocha with limits in mysql, mysql will honor a limit before it honors the order, so regardless of where the limit is. So even though it might seem like its simple to just throw in a limit by $amount into the query, you won't necessarily get a correct result set. but yeah so i can call this function in a multitude of ways, i can seach for cities in a range, i can search for any amount of cities (regarless or range) and i will allways get them closest to farthest, with distance in meters (you can trust those to maybe 1/10th of a meter or so) There's no post processing, though if anyone wants, i have some other functuions function destinationVincenti($lat1, $lon1, $brng, $dist) { $a = 6378137; $b = 6356752.3142; $f = 1/298.257223563; // WGS-84 ellipsiod $alpha1 = deg2rad($brng); $sinAlpha1 = sin($alpha1); $cosAlpha1 = cos($alpha1); $tanU1 = (1-$f) * tan(deg2rad($lat1)); $cosU1 = 1 / sqrt((1 + pow($tanU1,2))); $sinU1 = $tanU1*$cosU1; $sigma1 = atan2($tanU1, $cosAlpha1); $sinAlpha = $cosU1 * $sinAlpha1; $cosSqAlpha = 1 - pow($sinAlpha,2); $uSq = $cosSqAlpha * (pow($a,2) - pow($b,2)) / pow($b,2); $A = 1 + $uSq/16384*(4096+$uSq*(-768+$uSq*(320-175*$uSq))); $B = $uSq/1024 * (256+$uSq*(-128+$uSq*(74-47*$uSq))); $sigma = $dist / ($b*$A); $sigmaP = 2*pi(); while (abs($sigma-$sigmaP) > 0.000000000001) { $cos2SigmaM = cos(2*$sigma1 + $sigma); $sinSigma = sin($sigma); $cosSigma = cos($sigma); $deltaSigma = $B*$sinSigma*($cos2SigmaM+$B/4*($cosSigma*(-1+2*pow($cos2SigmaM,2))-$B/6*$cos2SigmaM*(-3+4*pow($sinSigma,2))*(-3+4*pow($cos2SigmaM,2)))); $sigmaP = $sigma; $sigma = $dist / ($b*$A) + $deltaSigma; } $lat2 = atan2($sinU1*$cosSigma + $cosU1*$sinSigma*$cosAlpha1, (1-$f)*sqrt(pow($sinAlpha,2) + pow($sinU1*$sinSigma - $cosU1*$cosSigma*$cosAlpha1,2))); $lambda = atan2($sinSigma*$sinAlpha1, $cosU1*$cosSigma - $sinU1*$sinSigma*$cosAlpha1); $C = $f/16*$cosSqAlpha*(4+$f*(4-3*$cosSqAlpha)); $L = $lambda - (1-$C) * $f * $sinAlpha * ($sigma + $C*$sinSigma*($cos2SigmaM+$C*$cosSigma*(-1+2*pow($cos2SigmaM,2)))); return array("lat"=>rad2deg($lat2), "long"=>$lon1+rad2deg($L)); } function distanceVincenty($lat1, $lon1, $lat2, $lon2) { $a = 6378137; $b = 6356752.3142; $f = 1/298.257223563; // WGS-84 ellipsoid params $L = deg2rad($lon2-$lon1); $U1 = atan((1-$f) * tan(deg2rad($lat1))); $U2 = atan((1-$f) * tan(deg2rad($lat2))); $sinU1 = sin($U1); $cosU1 = cos($U1); $sinU2 = sin($U2); $cosU2 = cos($U2); $lambda = $L; $iterLimit = 100; do { $sinLambda = sin($lambda); $cosLambda = cos($lambda); $sinSigma = sqrt(pow(($cosU2*$sinLambda),2) + pow(($cosU1*$sinU2-$sinU1*$cosU2*$cosLambda),2)); if($sinSigma==0) return 0; // co-incident points $cosSigma = $sinU1*$sinU2 + $cosU1*$cosU2*$cosLambda; $sigma = atan2($sinSigma, $cosSigma); $sinAlpha = $cosU1 * $cosU2 * $sinLambda / $sinSigma; $cosSqAlpha = 1 - $sinAlpha*$sinAlpha; $cos2SigmaM = $cosSigma - 2*$sinU1*$sinU2/$cosSqAlpha; if (is_nan($cos2SigmaM)) $cos2SigmaM = 0; // equatorial line: cosSqAlpha=0 (§6) $C = $f/16*$cosSqAlpha*(4+$f*(4-3*$cosSqAlpha)); $lambdaP = $lambda; $lambda = $L + (1-$C) * $f * $sinAlpha * ($sigma + $C*$sinSigma*($cos2SigmaM+$C*$cosSigma*(-1+2*pow($cos2SigmaM,2)))); } while (abs($lambda-$lambdaP) > 0.000000000001 && --$iterLimit>0); if ($iterLimit==0) return false; // formula failed to converge $uSq = $cosSqAlpha * (pow($a,2) - pow($b,2)) / pow($b,2); $A = 1 + $uSq/16384*(4096+$uSq*(-768+$uSq*(320-175*$uSq))); $B = $uSq/1024 * (256+$uSq*(-128+$uSq*(74-47*$uSq))); $deltaSigma = $B*$sinSigma*($cos2SigmaM+$B/4*($cosSigma*(-1+2*pow($cos2SigmaM,2))-$B/6*$cos2SigmaM*(-3+4*pow($sinSigma,2))*(-3+4*pow($cos2SigmaM,2)))); return round($b*$A*($sigma-$deltaSigma), 3); } function distanceHaversine($lat1, $long1, $lat2, $long2) { $lat1 = deg2rad($lat1); $long1 = deg2rad($long1); $lat2 = deg2rad($lat2); $long2 = deg2rad($long2); $dlat = ($lat2-$lat1); $dlong=($long2-$long1); $a = pow(sin($dlat/2),2)+cos($lat1)*cos($lat2)*pow((sin($dlong/2)),2); $c = 2*atan2(sqrt($a), sqrt(1-$a)); return round((6378137*$c),2); } Not the fastest we can go with the database, but it does split the processing off to something that doesn't use much processor anyways (database servers don't tend to, or else the DBA doesn't know what they are doing), and i expect this will, in the end, run faster this way then getting the data set back, calculating distances and sorting the array in php... Quote
alexander Posted April 22, 2010 Author Report Posted April 22, 2010 function getNearest($lat, $long, $amount=1, $range=5, $maxrange=NULL, $link=NULL) { if(!$link) $link = new mysqli('127.0.0.1','user','pass','db'); do { $offset = ($maxrange==NULL || $maxrange<=$range)? $range : $maxrange; $ang_offset=array("lat"=>$offset/69.05, "long"=>$offset/abs(cos(deg2rad($lat))*69.05)); $result = $link->query("SELECT latitude, longitude, city, state, zipcode FROM zipcodes WHERE latitude BETWEEN ".($lat-$ang_offset["lat"])." and ".($lat+$ang_offset["lat"])." and longitude BETWEEN ".($long-$ang_offset["long"])." and ".($long+$ang_offset["long"])); $offset=$offset*2; } while ($result->num_rows < $amount && $offset != $maxrange); $i=0; while($cities[] = $result->fetch_array(MYSQLI_ASSOC)) { $tmp_array[$i]=3963.19*2*ASIN(SQRT(POW(SIN(($lat-$cities[$i]['latitude'])*pi()/180/2),2)+COS(($lat)*pi()/180)*COS(latitude*pi()/180)*POW(SIN(($long-$cities[$i]['longitude'])*pi()/180/2),2))); $cities[$i]['distance']=$tmp_array[$i]; $i++; } asort($tmp_array); foreach ($tmp_array as $pos=>$val) $return_array[$pos] = $cities[$pos]; return array_slice($return_array, 0, $amount); } so this is, so far, the fastest working implementation, it's not super precise anymore, but it works quite quickly... also it now correctly gets springfield as the closest city to the springfield coordinates (there was an issue with the previous code, minor issue, of the "duuuh" kind) Quote
jedaisoul Posted April 23, 2010 Report Posted April 23, 2010 alexander, I'm impressed by the amount of work you have put into this problem. However, it occurs to me that, whilst your approach may work on one continent (North America), it may have difficulties if used globally. The reason is that -179o is next to 180o. I'm not sure your approach works if you have to take this into account. So I've prototyped my "linear" approach in a spreadsheet. This gives me no idea of the speed of an actual database implementation, but does give a good idea of the logical steps necessary. That brings me to the conclusion that you have to check the "distance" in degrees to EVERY city w.r.t. your arbitrary location first to see if it is >180o. If so, take the complement and use that. If I'm right, then this seems to negate the idea of selecting by offset first. My prototype is not super accurate because the only list of cities I could find on the internet were specified in degrees and minutes. So the distances are only accurate to plus or minus 1 Km or so. Also the list contained 120 or so cities in North America, but only a similar number of cities in the rest of the world! So there is worldwide coverage, but it is nowhere near comprehensive. But as it is only intended as a "proof of concept" I found this acceptable. The steps I used were:1. Simulate a database read of the static data on the cities (longitude, latitude and KM per degree longitude at this latitude) into an array by constructing a table of cities. In view of the limited accuracy of the raw data, all degrees are to three places of decimal. The rows are the cities. The columns are:- Longitude (degrees).- Latitude (degrees).- Size of a degree of longitude at this latitude (in Km).- Latitude difference (degrees).- Longitude difference (degrees).- Adjusted longitude difference (degrees) - adjusting for -180o and 180o discontinuity.- Latitude difference converted to Km.- Adjusted longitude difference converted to Km.- The distance between the chosen location and the city (Km). Note:a) The latitude difference (degrees) is converted to kilometers at a fixed rate of 110.9 Km per degree.:bouquet: The longitude difference (degrees) is converted to kilometers at the LESSER of: i. The conversion figure for the city (from the table), or... ii. The conversion figure for the chosen location (calculated once). 2. The actual amount of calculation required is quite limited: i. The Km per degree of longitude for the chosen location. ii. The latitude and longitude differences in degrees between the location and each city. iii The correction of the latitude difference in degrees for the -180o and 180o discontinuity. iv. The latitude and (adjusted) longitude conversion to Km. v. The distance between each city and the location (using Pythagoras' theory). 3. The indexing of the distances to return the closest city is simulated by a lookup of the lowest distance. This returns the city name, distance and direction (degrees N/S and E/W). If anyone is interested I can put the spreadsheet on the internet (in Excel 2003 format), so you can look at it... Quote
alexander Posted April 24, 2010 Author Report Posted April 24, 2010 jedai, example above shows that with a linear approach, even in a 5 mile radius yields a faulty result set.. (i.e. it is out of order) To make it more globally applicable, there are a couple of changes that would need to be made to the code... I'll ponder that over the weekend, but i think i just need to add the checking for which coordinates are more/less then the other set, i think that part of it flips on the south side of the globe, and i have to see what happens near the poles, and latitude effect when trig functions near zero. Quote
jedaisoul Posted April 24, 2010 Report Posted April 24, 2010 jedai, example above shows that with a linear approach, even in a 5 mile radius yields a faulty result set.. (i.e. it is out of order) To make it more globally applicable, there are a couple of changes that would need to be made to the code... I'll ponder that over the weekend, but i think i just need to add the checking for which coordinates are more/less then the other set, i think that part of it flips on the south side of the globe, and i have to see what happens near the poles, and latitude effect when trig functions near zero.I accept what you say, and perhaps a linear approach is inappropriate for your method. That does not make it inappropriate for mine. You see, although my calculations are linear, the raw data of Km per degree latitude and longitude at the equator are GREAT CIRCLE distances (or should be). So, the calculation implicitly includes the curvature of the Earth. Also I have no problem near the poles. At the north pole, the longitude input for the location acts as a compass direction (e.g. at 90oN, 30oW acts as if you were facing in that direction). The E/W direction returned is the amount and direction in which you need to turn to then travel due South to reach the specific location. The only trig functions I use are to determine the length in Km of a degree of longitude at the latitude of each city (which is built in to the table beforehand) and the length of a degree of longitude of the selected location. That is the only trig function calculated dynamically, and at the north and south poles it correctly returns zero. I'm convinced that my method works. If so, yours may be over-complicated (which may have made it slow in the first place). I've put the sheet here. It is zipped up. Hopefully if you have a look at it, you will agree, or at least be able to point out some flaw in it... Quote
alexander Posted April 25, 2010 Author Report Posted April 25, 2010 which may have made it slow in the first place just so you know what slow is, the original method, that returned me the nearest any number of cities sorted by distance from current location to the precision of 1mm, took approximately 4 thousands of a second to run, the later iteration was still slow, but slow as in .001 sec slower then my current method that takes .0012 sec to run (including getting data from a database on a different server)... I could simplify the math though, i ran the numbers using your method and it seems to hold up pretty well, example: starting point:Me 42.130821-72.590103 and 2 cities: A42.195565-72.542478 B42.151278-72.510524 Your algorithm (in a non-convoluted spreadsheet form):first you calculate degree of longitude distance (this would not change for our purpose):degree lat = 110.9degree long = 111.3*sin(radians(90-42.130821)) = 82.541760 then for each city you would calculate the differences:A lat diff = 42.195565 - 42.130821 = 0.064744A long diff = -72.542478 - -72.590103 = 0.047625 B lat diff = 42.151278 - 42.130821 = 0.020457B long diff = -72.510524 - -72.590103 = 0.079579 then (and this is the step that you erroneously think that makes your result set imply a curve, rather imply it correctly, it does imply a curve and is more efficient in shorter distances) you calculate the lat and long distance based on the diffs:A lat km = 0.064744 * 110.9 = 7.1801096A long km = 0.047625 * 82.541760 = 3.93105132 B lat km = 0.020457 * 110.9 = 2.2686813B long km = 0.079579 * 82.541760 = 6.56859072 And finally you calculate the distances based on the theorem: A dist = [math]\sqrt{7.1801096^2 + 3.93105132^2}[/math] = 8.1857888B dist = [math]\sqrt{2.2686813^2 + 6.56859072^2}[/math] = 6.94933802 The margin or error in these short distances seems neglegeable, but i am not positive if it will work for extreme distances, like say finding nearest US city to : -25.069507,-130.109539 for example with haversine formula calculating distances for my function the margin of error will still be within a few meters, but the margin of error in your function increases with distance from the object, and i would think significantly, but i haven't tested... That said to modify my program to use your math is pretty simle, one line changes:$tmp_array[$i] = sqrt(pow(($lat-$cities[$i]['latitude'])*69.05,2)+pow(($long-$cities[$i]['longitude'])*(69.05*abs(cos(deg2rad($lat)))),2)); However my way of obtaining the results to act on, is still much more efficient then letting the database do math on every row, even simple math... Also i will try to see if i can get the full database to query on, we only have the 5 digit US codes in ours atm, which is only about 45K rows, full database, actually full US/Canadian database is 800k+ rows.... Quote
jedaisoul Posted April 26, 2010 Report Posted April 26, 2010 I could simplify the math though, i ran the numbers using your method and it seems to hold up pretty well,Good, we seem to be getting somewhere... ;) then (and this is the step that you erroneously think that makes your result set imply a curve, rather imply it correctly, it does imply a curve and is more efficient in shorter distances) you calculate the lat and long distance based on the diffs:A lat km = 0.064744 * 110.9 = 7.1801096A long km = 0.047625 * 82.541760 = 3.93105132 B lat km = 0.020457 * 110.9 = 2.2686813B long km = 0.079579 * 82.541760 = 6.56859072I still maintain that this is 100% accurate (assuming the world is a regular sphere), provided that the original distance per degree lat and long at the equator are great circle distances. You are just increasing the arc. But... And finally you calculate the distances based on the theorem:A dist = [math]\sqrt{7.1801096^2 + 3.93105132^2}[/math] = 8.1857888B dist = [math]\sqrt{2.2686813^2 + 6.56859072^2}[/math] = 6.94933802This is where a margin of error IS introduced. Pythagoras' theorem is plane geometry. It will be accurate for small distances on the surface of a sphere (which approximates to a plane surface), but, as you say, will be less and less accurate for larger distances. By the way, I would suggest trying:A dist = [math]\sqrt{7.1801096 * 7.1801096 + 3.93105132 * 3.93105132}[/math] Sometimes, multiplying a number by itself is calculated faster than squaring a number. That's an old coder's trick. Whether it is still valid, I'm not sure, but you seem to have the resources to be able to check it out... The margin or error in these short distances seems negligible, but i am not positive if it will work for extreme distances, like say finding nearest US city to : -25.069507,-130.109539I agree in principle, but, I think your degree of accuracy is spurious anyway. See Wikipedia. There are two factors that limit your accuracy:1. Height above sea level.2. The Earth is not a sphere, but an irregular shape approximating a biaxial ellipsoid. The Earth has an equatorial bulge making the radius at the equator about 0.3% larger than the radius measured through the poles. These factors blows apart your claimed "precision of 1mm". So the inaccuracy of using plane geometry may, or may not be the limiting factor anyway. But sadly I lack the mathematical ability to actually quantify the various factors. I'm just a humble code pusher not an academic ;) However my way of obtaining the results to act on, is still much more efficient then letting the database do math on every row, even simple math... Also i will try to see if i can get the full database to query on, we only have the 5 digit US codes in ours atm, which is only about 45K rows, full database, actually full US/Canadian database is 800k+ rows....I agree that using a database query do the maths is not a good idea. By the way you seem to have a lot of "cities" in your database? Quote
alexander Posted April 26, 2010 Author Report Posted April 26, 2010 I agree in principle, but, I think your degree of accuracy is spurious anyway. Undoubtedly it will be off, but the problem with the way your model works is it basically assumes that the earth is a cylinder that is always the same height (which is fine), but is as wide as the earth is at the current latitude (and remember it assumes a cylinder not a sphere), the haversine formula at least compensates for some curvature to give me constant results that are accurate to within a few kilometers, something i am willing to put up with. Its a magnitude more math (see the vinceti functions above) to use the model that is accurate to within a millimeter, that models the ellipsoid earth. But to show you what i mean: start: 42.130821-72.590103 end:-25.069507-130.109539 distanceyour method: degree lat = 110.9degree long = 111.3*sin(radians(90-42.130821)) = 82.541760lat diff = 67.200328long diff = 57.519436lat km = 7452.51638long km = 4747.75548distance = 8836.35572 km Same coordinates using:haversine: 9530.35762 kmvincetys: 9499.761761 km As you can see the degree of inaccuracy with haversine formula is still fairly negligible, 31km ~ 0.3%, the degree of inaccuracy with your method is quite a bit more significant in this case, 663km or almost 7%. Also let me correct something i said above, my whole function takes on average about 0.0108 sec to run (per iteration, it will speed up after the first iteration too), so i was off by an order of magnitude before, and actually i will run the condition i specified above, nearest city to -25.069507, -130.109539 Original location: Lat: -25.069507Long: -130.109539 0.062113 Array( [0] => Array ( [latitude] => 7.367000 [longitude] => -170.855000 [city] => Majuro [state] => MH [zipcode] => 96960 [distance] => 3543.34338202 ) )so this took 0.062113 seconds to run (including the dozen or so queries) also latest, much more troubleshot code function getNearest($lat, $long, $amount=1, $range=5, $maxrange=NULL, $link=NULL) { if(!$link) $link = new mysqli('127.0.0.1','user','pass','db'); do { $offset = ($maxrange==NULL || $maxrange<=$range)? $range : $maxrange; $ang_offset=array("lat"=>$offset/69.05, "long"=>$offset/abs(cos(deg2rad($lat))*69.05)); $result = $link->query("SELECT latitude, longitude, citymixedcase as city, state, zipcode FROM zipcodes WHERE latitude BETWEEN ".($lat-$ang_offset["lat"])." and ".($lat+$ang_offset["lat"])." and longitude BETWEEN ".($long-$ang_offset["long"])." and ".($long+$ang_offset["long"])); $range*=2; } while ($result->num_rows < $amount && $offset != $maxrange); $i=0; while($cities[] = $result->fetch_array(MYSQLI_ASSOC)) { $a=SIN(abs(($lat-$cities[$i]['latitude']))*pi()/180/2); $b=SIN(abs(($long-$cities[$i]['longitude']))*pi()/180/2); $tmp_array[$i]=3963.19*2*ASIN(SQRT(($a*$a)+COS(($lat)*pi()/180)*COS($cities[$i]['latitude']*pi()/180)*($b*$:))); $cities[$i]['distance']=$tmp_array[$i]; $i++; } asort($tmp_array); foreach ($tmp_array as $pos=>$val) $return_array[$pos] = $cities[$pos]; return array_slice($return_array, 0, $amount); } had a little blonde moment there with incrementing the offset instead of the range (lol duuuh) Quote
jedaisoul Posted April 26, 2010 Report Posted April 26, 2010 You are right. My method just does not work at large distances. Ho hum. :phones: Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.