Monday, September 10, 2012

Code for America Austin Hackathon: Buses and fun with GTFS

Last Saturday, I participated in the Code for America Austin Hackathon and was impressed by the people who attended. The crowd ranged from very experienced coders to folks who understood and could articulate a real problem as well as policies surrounding the problem. When I said well attended, did I mention that the City of Austin CIO and some of his staff were also there to help out? (Wink, wink, nudge, nudge, say no more to @COSAGOV and rising Democratic political star San Antonio Mayor Julian Castro @juliancastro).

Projects proposed at the hackathon ranged from automating campaign finance forms for the City of Austin, mapping bike routes and amenities, creating a mobile app and API for the  Aunt Bertha service which matches people to services, an online form for clients of the Homeless RV project, and improving bus transportation for riders.

I fell in with the bus group and the problem was presented by Glenn Gabois of Movability Austin. The problem we focused on was notifying riders the location and time of arrival of their bus while they wait at their stop.  Capital Metro, Austin's transportation agency, currently does not provide a feed for their buses while they are in transit. So we divide the problem into three componentes:
  • a crowd sourced bus tracking application
  • the server side infrastructure to provide bus data
  • a mobile application that tells rider a bus' current stop and the number of stops away
Crowd sourcing real time bus locations was an interesting discussion, the group discussed how a mobile app could be used to track buses. Some of the features of the app included:

  • would not require user to turn while on the bus or turn off when off the bus
  • determine which stop where the user boarded the bus
  • disambiguate between buses that used the same stops
The group discussion resulted in a mobile application that ran as a background process on a smart phone that would start sending coordinates when the accelerometer registered a change in speed. Determining which bus a rider boarded was a tougher problem and solutions ranged from QR codes on buses (rejected because it required the user to interact with the application), RFID (problematic because of equipment costs), and NFC (Near Field Communication is relatively new an not a common features on smart phones). In the end, the group centered on algorithmically determining the bus and the route based on data sent by the mobile application. Although this was an interesting discussion, the group decided not to address this problem

In order to use crowd sourced bus tracking data, the group discussed a service architecture to receive the bus tracking coordinates and compare them against the current transit data which Capital Metro provided as GTFS (General Transit File Specification) and as ESRI shape files.  We didn't spend much time discussing this part of the solution and I didn't feel up to the task of setting up a cloud based instance of PostgreSQL and importing the GTFS data into the data base (however, I later proved myself wrong and imported the GTFS data in PostgreSQL and created a sql dump of the database for anyone who wants to use it). For a quick feel good burst of "we're doing something", I uploaded the shape file data to Google Fusion Tables for a quick map. We had five hours to build and demo something at the end of the day.

The group decided to focus on the bus stop application (working title of Bus Time). Naqi Sayed and Kevin Reilly of Hidaya Solutions (@HidayaSolutions) worked on an Android application for determining how many stops away from a rider's current bus stop. With the clock ticking, the group decided that the application would work of just one trip of a single route. My contribution to the group was to rework the GTFS files into a single workable data sets. The GTFS data consisted of the following files

agency.txt
amenities.txt
calendar.txt
calendar_dates.txt
fare_attributes.txt
fare_rules.txt
routes.txt
shapes.txt
stop_times.txt
stops.txt
trips.txt
Using the GTFS Reference, I was able to decode how to get from a route to a list of stops. In our case, we picked a route and a trip on that routed, we then needed to find the stops for that trip. Since stops are sequential we just needed a sorted list of stops of the application.

To get the list of stops, I went through the GTFS Reference to find the keys that would allow me to associated stops with a specific trip. The joins between the data tables looked like this.


  1. Routes are comprised of trips (northbound, southbound, and time). Select a route from routes.txt, then select a trip by the route_id.
  2. A single trip list stop_times. Select stop_times by trip_id.
  3. Stop_times are related to stops by stop_id in the stops.txt file. Select the the stops using the stop_id.
I produced a csv file that was an ordered list of the stops for the application using a combination of sed/grep/awk/bash/ruby scripting. It wasn't pretty but in worked under our time constraint.

Naqi already had the barebones of an Android app with geolocation so he incorporated the data into the app. I did not work on the app but I think it worked this way:
  • the app determined the location of the rider via WiFi location or GPS and choose the closest stop
  • the bus position feed was simulated and was snapped to the closest stop
  • the app compared the rider's stop to the bus' stop, since it was an ordered list it was matter of counting the between the two stops
In Lean Startup speak, the minimum viable product (MVP) was this application.


To me, it was amazing that a group of people who had just met could hammer out an application in six hours. Everyone brought something to the table that contributed to the overall goal. I learned lots of new things from other folks and also some technical lessons along the way, such as importing GTFS data into a database using these scripts

We did what we did because we had to explore a problem, generate solutions, and implement one of then in six hours. Now that I've had a little more time to think about it, I think we could vastly improve the application by importing the GTFS data into Sqlite and using it directly in the application. The application would then have every route, trip, fares, and amenities available. The application could then just subscribe to real-time transit feed without having to deploy application back end. Additionally, we could revisit the QR code idea. Instead of riders with a crowd sourcing app, the QR code would be on the bus and the reader would be at each stop, as buses load and unload passengers the QR code reader could send the bus information to the real-time transit feed.

As you can see the ideas keep flowing, but it's important to remember that creating a mobile application would have been impossible with open data in the form of GTFS.