Download REIL Property data from scratch
My first activity was to download the property data. REIL has a bunch of property ‘classes’ such as RES (residential) and CON (condominium), and each class needs to go into its own table. So I designed my download process to take in a Resource (Property) and a Class (RES, CON, etc). Designing it this way meant I could use the same routines for all classes of data.
The first step is to create the data table. At first I tried this manually, but later realized that I could just download the metadata and use it to create the table. So my first task was actually to download the metadata (for each class) and put it into a metadata table. A fringe benefit of this is that I now have information about each column of data such as datatype, long description, short description, length, etc. I will use the metadata to build the table now, but will also use it later for search forms and other system stuff.
After downloading the metadata for a particular class, I iterate through each record and build a CREATE TABLE statement. An obvious benefit here is that any column additions/modifications on REIL’s end will be incorporated each time the dataset is rebuilt. At the end of this process I have a fresh new up-to-date table ready to load with data.
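To make that step concrete, here is a minimal Python sketch of turning metadata records into a CREATE TABLE statement. It is not the code behind this post. The field keys (`SystemName`, `DataType`, `MaximumLength`, `Precision`) follow common RETS metadata naming but are assumptions here, and the type mapping and table name are purely illustrative.

```python
# Assumed mapping from metadata data types to MySQL column types.
SIMPLE_TYPES = {
    "Boolean": "TINYINT(1)",
    "Tiny": "TINYINT",
    "Small": "SMALLINT",
    "Int": "INT",
    "Long": "BIGINT",
    "Date": "DATE",
    "DateTime": "DATETIME",
    "Time": "TIME",
}


def column_type(field):
    """Pick a column type for one metadata record (a dict)."""
    data_type = field.get("DataType", "")
    length = int(field.get("MaximumLength") or 0)
    if data_type == "Character":
        # Short strings become VARCHAR, anything else falls back to TEXT.
        return f"VARCHAR({length})" if 0 < length <= 255 else "TEXT"
    if data_type == "Decimal":
        precision = int(field.get("Precision") or 0)
        return f"DECIMAL({length},{precision})" if length > 0 else "DECIMAL"
    # Unknown types are stored as TEXT so the load never fails on them.
    return SIMPLE_TYPES.get(data_type, "TEXT")


def build_create_table(table_name, fields):
    """Build one CREATE TABLE statement from a list of metadata records."""
    columns = [f"  `{f['SystemName']}` {column_type(f)}" for f in fields]
    return f"CREATE TABLE `{table_name}` (\n" + ",\n".join(columns) + "\n)"


if __name__ == "__main__":
    sample = [
        {"SystemName": "ListPrice", "DataType": "Decimal",
         "MaximumLength": "12", "Precision": "2"},
        {"SystemName": "City", "DataType": "Character", "MaximumLength": "50"},
    ]
    print(build_create_table("property_res_temp", sample))
```

Because the statement is rebuilt from the metadata on every run, a column added upstream shows up in the table without any code change.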
There is a limit to the amount of data you can download in one batch. This limit gets tighter as you download more columns, and there are a lot of columns (177). You can use pagination, but the documentation mentions that modifications to the data between the time the download begins and the time it ends can cause issues. What happens if someone deletes a property partway through? The remaining records shift position, which could throw off the pagination and cause a record to be skipped.
I chose to download all of the propertyIDs at once, which the system allows since it is only one field. I load these into an array and then fetch the complete property data in chunks of 1000 or so IDs. This way the whole download works from a single initial snapshot of the ID list.
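The chunking itself is simple. Here is a small Python sketch of the idea, under the assumption that some function (called `fetch_chunk` here, a hypothetical name) takes a list of IDs and returns the full records for them.

```python
def chunked(ids, size=1000):
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def download_all(ids, fetch_chunk, size=1000):
    """Fetch full records for every ID, one chunk at a time.

    `fetch_chunk` is a hypothetical callable standing in for the real
    server request: it receives a list of IDs and returns a list of rows.
    """
    rows = []
    for chunk in chunked(ids, size):
        rows.extend(fetch_chunk(chunk))
    return rows


if __name__ == "__main__":
    ids = [str(n) for n in range(2500)]
    # Stand-in fetcher that just echoes the IDs back as rows.
    rows = download_all(ids, lambda chunk: [{"propertyID": i} for i in chunk])
    print(len(rows))
```

Since the ID list is captured once up front, a record deleted on the server mid-download just comes back missing from its chunk instead of shifting every page after it.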
I download the property data in COMPACT mode and load it into a TEMP table. This allows me to build the complete dataset and verify it before moving it into production. When the load is complete and the row count is verified, a RENAME TABLE swaps the data into production almost instantly.
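A rough Python sketch of the verify-then-swap step, assuming MySQL (where a single RENAME TABLE can rename several tables at once) and a DB-API style cursor. The table names, the `_old` suffix and the `promote` helper are all made up for illustration.

```python
def swap_statements(prod, temp):
    """Return the SQL that swaps the temp table into production.

    Assumes the production table already exists; on a very first run
    there is nothing to rename out of the way, so this would need a
    plain single rename instead.
    """
    old = f"{prod}_old"
    return [
        f"DROP TABLE IF EXISTS `{old}`",
        # One RENAME TABLE statement performs both renames together.
        f"RENAME TABLE `{prod}` TO `{old}`, `{temp}` TO `{prod}`",
    ]


def promote(cursor, prod, temp, expected_count):
    """Check the temp table's row count, then swap it into production."""
    cursor.execute(f"SELECT COUNT(*) FROM `{temp}`")
    (actual,) = cursor.fetchone()
    if actual != expected_count:
        # Leave production untouched if the load looks incomplete.
        raise RuntimeError(f"expected {expected_count} rows, found {actual}")
    for statement in swap_statements(prod, temp):
        cursor.execute(statement)
```

A natural value for `expected_count` is the length of the ID list downloaded at the start, so the check confirms that every ID in the snapshot produced a row.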
The photos are next. I use a similar process for downloading the photos into a TEMP area. I download photo file names and other metadata into a Resource-Class specific table, and place the actual files in a hashed directory structure for quick lookup. The photo downloads are by far the longest part of the download process. I download the files to a TEMP directory and move the directory to production after validation.
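One way to lay out a hashed directory structure, sketched in Python. The post does not say which hashing scheme is used, so the choice here (SHA-256 of the propertyID, two levels of two hex characters each) and the file naming are assumptions.

```python
import hashlib
from pathlib import Path


def photo_path(root, property_id, photo_index, ext="jpg"):
    """Return where one photo file would live under `root`.

    The first four hex characters of the ID's hash pick two nested
    directories, which spreads files evenly and keeps any single
    directory from growing huge.
    """
    digest = hashlib.sha256(str(property_id).encode("utf-8")).hexdigest()
    filename = f"{property_id}_{photo_index}.{ext}"
    return Path(root) / digest[:2] / digest[2:4] / filename


if __name__ == "__main__":
    print(photo_path("photos_temp", "1234567", 1))
```

The path is computed from the propertyID alone, so finding a photo later needs no database lookup, and every photo for one property lands in the same directory.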
I found that some files error out with a “No Object Found” error message. Investigating further, I found that sometimes these files become available later. So I search for the error message and write a record to a ‘photo error’ table which includes the propertyID and attempt count. Later I run a cron job to attempt to re-download these failed files.
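Here is a small Python sketch of that error table and retry pass. It uses an in-memory SQLite database so it runs on its own. The table layout, the `download` callable and the `max_attempts` cutoff are assumptions for illustration, not details from the actual system.

```python
import sqlite3

ERROR_MARKER = b"No Object Found"


def ensure_table(conn):
    """Create the photo error table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photo_error ("
        "property_id TEXT PRIMARY KEY, attempts INTEGER NOT NULL)"
    )


def record_failure(conn, property_id):
    """Insert a failure row, or bump the attempt count if one exists."""
    cur = conn.execute(
        "UPDATE photo_error SET attempts = attempts + 1 WHERE property_id = ?",
        (property_id,),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO photo_error (property_id, attempts) VALUES (?, 1)",
            (property_id,),
        )


def retry_failures(conn, download, max_attempts=5):
    """Retry each failed property; return the IDs that now succeed.

    `download` is a hypothetical callable that returns the raw response
    body (bytes) for one property's photos.
    """
    rows = conn.execute(
        "SELECT property_id FROM photo_error WHERE attempts < ? "
        "ORDER BY property_id",
        (max_attempts,),
    ).fetchall()
    recovered = []
    for (property_id,) in rows:
        body = download(property_id)
        if ERROR_MARKER in body:
            record_failure(conn, property_id)  # still missing, count it
        else:
            conn.execute(
                "DELETE FROM photo_error WHERE property_id = ?", (property_id,)
            )
            recovered.append(property_id)
    return recovered
```

A cron job would call `retry_failures` on a schedule. Capping the attempts keeps photos that never show up from being retried forever.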