Tuesday, December 31, 2013

What's HOT for the GeoHipster in 2014


Skybox Imaging and Planet Labs have launched imaging satellites, so expect a bunch of cool new image products and imagery-derived data in 2014. Also note that Frank Warmerdam is at Planet Labs.

But wait, there's more! There's another readily available source of imagery data: the photos people are posting to Instagram, Flickr and Facebook. Expect tools to exploit this source of imagery.

Hardware hacking

Arduino and Raspberry Pi are moving out of their respective blinky-lights infancy. Geohipsters will be connecting them to sensors and talking to them via Node.js. Expect to see other hardware platforms such as Tessel making inroads in the hardware hacking movement.

Car hacking is still in its infancy with Bluetooth OBD-II modules. But as more cars roll out as mobile platforms replete with an API, car modding will be more than just chip modding for performance.

Thursday, March 21, 2013

A little data for geocoding

What's a geocoder to do without data? Fortunately, there's tons of it, and more is produced every day. I have a project where I need to verify the addresses of non-profits. The IRS provides the Statistics of Income (SOI) Tax Statistics for Exempt Organizations Business Master File Extract. The data is provided both as Excel files and as fixed-width text files. The fixed-width files contain all the records, and there is one per state.

Using the same technique I used for importing 2010 Census headers, I imported each line/record as a single field into a temporary table. Using the SQL substring function, I extracted the data into their respective fields. Information about the file structure and fields is available in The Instruction Booklet on the download page.
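To illustrate the substring idea outside the database, here is a minimal Python sketch that slices one fixed-width record into fields. It is not the import script itself. The field names, start columns and widths in LAYOUT are hypothetical placeholders; the real positions come from the instruction booklet.

```python
# Hypothetical layout: (field name, start column, width).
# Start columns are 1-based, like the SQL substring function.
# The real positions must be taken from the IRS instruction booklet.
LAYOUT = [
    ("ein", 1, 9),
    ("name", 10, 30),
    ("state", 40, 2),
    ("zip", 42, 5),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of stripped field values."""
    record = {}
    for field, start, width in layout:
        # substring(line from start for width) becomes a zero-based slice
        record[field] = line[start - 1:start - 1 + width].strip()
    return record


# A made-up record that matches the hypothetical layout above
sample = "123456789" + "EXAMPLE FOUNDATION".ljust(30) + "CA" + "94105"
print(parse_record(sample))
```

Keeping the layout in one list means a change in the published file structure only touches the table of positions, not the slicing logic.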

Below is the script for importing the data.

When all is said and done, you will have a table containing California tax-exempt organizations. The next step may seem a little backward: I exported the data back to a tab-delimited text file.

Until there is a built-in Postgres geocoder, handling text files is simpler and faster than writing code that extracts data for geocoding using an external service.
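For readers who want to see what the tab-delimited output looks like, here is a small Python sketch that serializes records as tab-separated text. The export in this post was done from the database; the function name to_tsv, the column names and the sample row are assumptions made up for illustration.

```python
import csv
import io


def to_tsv(rows, fieldnames):
    """Serialize a list of dicts as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up row with hypothetical column names
rows = [
    {"ein": "123456789", "name": "EXAMPLE FOUNDATION",
     "city": "SAN FRANCISCO", "state": "CA"},
]
print(to_tsv(rows, ["ein", "name", "city", "state"]))
```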

Saturday, February 2, 2013

Data Science Tool Kit on Vagrant

Pete Warden has released a version of the Data Science Tool Kit on Vagrant. DSTK is a website for munging all sorts of data and includes a geocoder based on TIGER 2010. The website can be unreliable, requiring an occasional restart, so running a VM is a nice option. The Vagrant version upgrades the geocoder to TIGER 2012 and is a drop-in replacement for Google geocoder requests. To run the DSTK locally:

Install Vagrant from http://www.vagrantup.com/. Create a directory to hold the Vagrantfile, then run the following:

Go to http://localhost:8080 to start using the DSTK.
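Because the local DSTK is meant to stand in for Google geocoder requests, pointing a client at it should mostly be a matter of swapping the host. The sketch below only builds the request URL. It assumes the VM answers on localhost:8080 as above and that it mirrors the path of Google's geocoding web service; the helper name geocode_url and the sample address are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the Vagrant VM is reachable on localhost:8080 and exposes
# the same path as Google's geocoding web service.
DSTK_BASE = "http://localhost:8080"


def geocode_url(address, base=DSTK_BASE):
    """Build a Google-style geocode request URL against a given host."""
    query = urlencode({"sensor": "false", "address": address})
    return f"{base}/maps/api/geocode/json?{query}"


# Made-up address; fetch the printed URL with any HTTP client
print(geocode_url("1 Main St, Springfield"))
```

Keeping the host in a single constant makes it easy to switch between the local VM and a hosted service without touching the rest of the code.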