Geocoding MLS data

The data provided by REIL does not include coordinates, which means that I have to geocode it myself. I am using Perl to download and build the data, so I have designed a Perl process to bulk geocode each of the addresses. I also designed a caching system so that each address only needs to be looked up once.

I use the Google and Yahoo geocoders, which are free. Yahoo allows something like 5,000 requests per day and Google allows around 15,000. Even before I approached these limits I was blocked because I was sending too many requests per second. To solve this I placed a ‘sleep(1)’ line before each geocoder call, ensuring that I don’t call either geocoder more than once a second. Since I implemented this (and moved to a different IP), it seems to work fine.
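The throttling idea can be sketched roughly as follows. This is an illustration in Python rather than the Perl process described above, and every name in it (Throttle, geocode_all, the geocode callable) is hypothetical. It is also a slight variant of a plain ‘sleep(1)’: it sleeps only for whatever remains of the one-second interval, so slow geocoder responses are not penalized twice.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first one

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def geocode_all(addresses, geocode, throttle):
    """Geocode each address in turn, pausing between requests.

    `geocode` stands in for whatever function calls the real service;
    it is an assumption, not an actual Google or Yahoo API.
    """
    results = {}
    for address in addresses:
        throttle.wait()
        results[address] = geocode(address)
    return results
```

With the default arguments, Throttle() allows at most one request per second, which matches the limit described above.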

My caching system uses a GOOD and a BAD cache table. If an address is found in either table, the cached result is returned instead of making a Google/Yahoo geocode call. I have about 20,000 addresses in my ‘GOOD’ cache right now. I will create a cron job to expire these after some period of time, perhaps a couple of weeks.
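A two-table cache along these lines might look like the sketch below. It is written in Python with SQLite purely for illustration. The table layout, the column names, the cached_at timestamp, and the assumption that the BAD table holds addresses that failed to geocode are all guesses, not the actual schema. The geocode argument is again a hypothetical callable that returns a (lat, lng) pair or None.

```python
import sqlite3
import time


def open_cache(path=":memory:"):
    """Open the cache database and create both tables (assumed schema)."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS good "
               "(address TEXT PRIMARY KEY, lat REAL, lng REAL, cached_at REAL)")
    db.execute("CREATE TABLE IF NOT EXISTS bad "
               "(address TEXT PRIMARY KEY, cached_at REAL)")
    return db


def cached_geocode(db, address, geocode, now=time.time):
    """Return (lat, lng) or None, calling `geocode` only on a cache miss."""
    row = db.execute("SELECT lat, lng FROM good WHERE address = ?",
                     (address,)).fetchone()
    if row is not None:
        return row                      # hit in the GOOD table
    if db.execute("SELECT 1 FROM bad WHERE address = ?",
                  (address,)).fetchone():
        return None                     # known failure, skip the remote call
    result = geocode(address)
    if result is None:
        db.execute("INSERT OR REPLACE INTO bad VALUES (?, ?)",
                   (address, now()))
    else:
        db.execute("INSERT OR REPLACE INTO good VALUES (?, ?, ?, ?)",
                   (address, result[0], result[1], now()))
    db.commit()
    return result


def expire_good(db, max_age_seconds, now=time.time):
    """Delete GOOD entries older than max_age_seconds (what a cron job could run)."""
    cutoff = now() - max_age_seconds
    db.execute("DELETE FROM good WHERE cached_at < ?", (cutoff,))
    db.commit()
```

A nightly cron job could then call expire_good(db, 14 * 24 * 3600) to drop entries older than two weeks, so those addresses are geocoded again the next time they are requested.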

