Geocoding solution
Geocoding with Google and Yahoo became increasingly slow and difficult, until I finally went looking for another solution. I found it in the Perl module Geo::Coder::US. My final setup uses Geo::Coder::US for most of the geocoding, with Google and Yahoo as fallbacks for the more difficult addresses. Batch processing 20,000 real estate listings went from around 8 hours in the beginning down to 1-3 minutes in the end.
Batch geocoding 20,000+ real estate listings with the Google and Yahoo geocoders ended up taking about 8 hours. At various points the process was cut off for making too many requests, even after I placed a sleep(1) call before each transaction. I became very discouraged by the performance and went looking for another solution.
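To make the throttling concrete, here is a minimal sketch of a batch loop that sleeps before every request. The remote_geocode sub is a hypothetical stand-in for the real HTTP call to Google or Yahoo, and the coordinates it returns are placeholders.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a remote geocoder; a real script would make
# an HTTP request here. The coordinates are placeholders.
sub remote_geocode {
    my ($address) = @_;
    return { address => $address, lat => 0, long => 0 };
}

# Geocode a list of addresses, pausing $delay seconds before each request.
sub geocode_batch {
    my ($addresses, $delay) = @_;
    my @results;
    for my $address (@$addresses) {
        sleep($delay) if $delay;    # throttle before every request
        push @results, remote_geocode($address);
    }
    return \@results;
}

my $results = geocode_batch(['100 Main St, Springfield, IL'], 1);
print scalar(@$results), " address(es) geocoded\n";
```

If every listing needs one throttled request, 20,000 listings spend at least 20,000 seconds (about 5.6 hours) in sleep alone, before any network time is counted.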
I investigated the Perl module Geo::Coder::US and found it extremely easy to set up and configure. The lookup process is nice because the module massages the input to standardize abbreviations such as ‘av’ and ‘ave’. Like Google and Yahoo, it accepts the address as a single string. The lookups are extremely fast. With the cache disabled, Geo::Coder::US handled 80% of the lookups, Google 13%, and Yahoo 7%, and the whole run took about 18 minutes.
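The fallback order described above can be sketched roughly as follows. This is not my actual script: the three lookup subs are hypothetical stubs with made-up coordinates, and in a real run the first one would wrap the Geo::Coder::US lookup while the other two would call the web services.

```perl
use strict;
use warnings;

# Try each geocoder in order. Return the name of the first one that
# produced coordinates, together with its result.
sub geocode_with_fallback {
    my ($address, @geocoders) = @_;
    for my $geocoder (@geocoders) {
        my ($name, $lookup) = @$geocoder;
        my $match = $lookup->($address);
        return ($name, $match) if $match && defined $match->{lat};
    }
    return;    # no geocoder could resolve the address
}

# Hypothetical stubs; all coordinates below are placeholders.
my %local_db = (
    '100 Main St, Springfield, IL' => { lat => 39.80, long => -89.64 },
);
my @geocoders = (
    [ 'Geo::Coder::US', sub { return $local_db{ $_[0] } } ],
    [ 'Google',         sub { return +{ lat => 41.00, long => -87.00 } } ],
    [ 'Yahoo',          sub { return undef } ],
);

# Count how many addresses each geocoder ended up handling.
my %handled_by;
for my $address ('100 Main St, Springfield, IL', '1 Unknown Rd, Nowhere, IL') {
    my ($name) = geocode_with_fallback($address, @geocoders);
    $handled_by{ $name // 'unresolved' }++;
}
print "$_: $handled_by{$_}\n" for sort keys %handled_by;
```

Putting the local lookup first means the rate-limited web services only see the addresses it could not resolve.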
With the cache enabled, the majority of the lookups were served from the database and the process took 1-3 minutes to complete.
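A minimal sketch of the caching idea, under some loud assumptions: an in-memory hash stands in for the database table, slow_geocode is a hypothetical placeholder for the real lookup chain, and the key normalization (lowercasing and collapsing whitespace) is only one possible choice.

```perl
use strict;
use warnings;

my %cache;          # normalized address => result (stand-in for a database table)
my $lookups = 0;    # how many times the slow geocoder was actually called

# Hypothetical slow geocoder; the coordinates are placeholders.
sub slow_geocode {
    my ($address) = @_;
    $lookups++;
    return { lat => 0, long => 0 };
}

# Lowercase and collapse whitespace so trivially different spellings
# of the same address share one cache key.
sub cache_key {
    my ($address) = @_;
    my $key = lc $address;
    $key =~ s/^\s+|\s+$//g;
    $key =~ s/\s+/ /g;
    return $key;
}

# Return the cached result if there is one; otherwise geocode and store it.
sub cached_geocode {
    my ($address) = @_;
    my $key = cache_key($address);
    $cache{$key} = slow_geocode($address) unless exists $cache{$key};
    return $cache{$key};
}

cached_geocode('100 Main St, Springfield, IL');
cached_geocode('100 main st,  springfield, il');    # served from the cache
print "geocoder calls: $lookups\n";
```

Since most listings carry over from one batch run to the next, nearly every address is already in the cache on later runs, which is where the drop from 18 minutes to a few minutes comes from.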