Python for Transit: Line frequencies in a map from GTFS

Dive deeper into the gtfs_functions Python package

Santiago Toso
3 min read · Dec 7, 2020

Update March 2023!!

This package was updated in March 2023. This article reflects the usage of the package’s latest version.

Introduction

In this article, we will see how to get line frequencies from a GTFS using the Python package gtfs_functions. You can find the repository and official documentation on GitHub.

Friendly reminder: please help me with a clap (or many!) when you finish reading if you find this article helpful.

If you are looking for an extensive explanation of the package, I recommend you first read this introduction. Here, we are going to dive directly into the specific use case of getting line frequencies in a map.

Package installation and GTFS parsing

To install the package and parse the GTFS, run the code below. For this article, I downloaded the GTFS feed from SFMTA (San Francisco, CA).

# In your terminal run
pip install gtfs_functions

# Or in a notebook (or similar)
!pip install gtfs_functions

# Import package
from gtfs_functions import Feed, map_gdf

feed = Feed("SFMTA.zip", time_windows=[0, 6, 9, 15, 19, 22, 24])

routes = feed.routes
trips = feed.trips
stops = feed.stops
stop_times = feed.stop_times
shapes = feed.shapes
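
To make the time_windows argument more concrete, here is a minimal sketch of how a list of hourly cutoffs splits the day into consecutive service windows. The helper window_labels is hypothetical (it is not part of gtfs_functions), and the "6:00-9:00" label format is an assumption, so check the window column of your own output for the exact strings.

```python
def window_labels(cutoffs):
    """Turn consecutive hourly cutoffs into 'start-end' window labels.

    Hypothetical helper for illustration only; the label format is an
    assumption, not something guaranteed by gtfs_functions.
    """
    if len(cutoffs) < 2:
        raise ValueError("need at least two cutoffs to define a window")
    if any(end <= start for start, end in zip(cutoffs, cutoffs[1:])):
        raise ValueError("cutoffs must be strictly increasing")
    # Each pair of neighbouring cutoffs defines one window.
    return [f"{start}:00-{end}:00" for start, end in zip(cutoffs, cutoffs[1:])]


labels = window_labels([0, 6, 9, 15, 19, 22, 24])
print(labels)
```

With the seven cutoffs used above, the day is split into six windows, from the early morning (0 to 6) to the late evening (22 to 24).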

Calculate line frequencies

In the current version of the package, lines_freq is a property of the Feed object, so there are no arguments to pass explicitly. It is computed from the data parsed in the previous section:

  • stop_times
  • trips
  • shapes
  • routes
  • the time windows: the list of numbers passed as time_windows when creating the Feed, which defines the windows we want to aggregate the data by.

lines_freq = feed.lines_freq

The output for one specific line shows:

GeoDataFrame output for the function lines_freq().

Which has the following columns:

  • route_id from the GTFS
  • route_name
  • dir_id: the direction is “Inbound” if direction_id was 0 in the GTFS and “Outbound” if it was 1.
  • window: service window, defined by the time_windows cutoffs passed to Feed.
  • frequency: service frequency in the window, expressed in minutes per trip.
  • ntrips: number of trips in the window.
  • max_freq: highest frequency of the day, expressed in minutes per trip.
  • max_trips: maximum number of hourly trips that take place on that line.
  • geometry
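
To see how frequency and ntrips relate, here is a small worked example. It assumes that "minutes per trip" is simply the length of the window divided by the number of trips in it, and that window labels look like "6:00-9:00". Both are assumptions for illustration, and the package may compute or round the value differently.

```python
def minutes_per_trip(window, ntrips):
    """Average minutes between trips in a window such as '6:00-9:00'.

    Illustrative arithmetic only, not the package's implementation.
    Returns None when there are no trips in the window.
    """
    def to_minutes(label):
        hours, minutes = label.split(":")
        return int(hours) * 60 + int(minutes)

    start, end = window.split("-")
    window_length = to_minutes(end) - to_minutes(start)
    if ntrips <= 0:
        return None
    return window_length / ntrips


# 18 trips in a 3-hour (180-minute) window is one trip every 10 minutes.
headway = minutes_per_trip("6:00-9:00", 18)
print(headway)
```

Note that a lower number of minutes per trip means a more frequent service.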

Show results on a map

You can always export the GeoDataFrames we saw and open them in your favorite GIS software, but I added a function to allow the user to quickly take a look from the notebook before going into that workflow. It is not meant to be presentation-ready or fully customizable, just to take a quick look.
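
For the export route, a minimal sketch could look like the one below. It relies on the fact that the outputs are GeoDataFrames, so the standard geopandas to_file method is available. The helper name export_for_gis and the output file name are hypothetical.

```python
def export_for_gis(gdf, path="lines_freq.geojson"):
    """Write a GeoDataFrame to a GeoJSON file that GIS software can open.

    `gdf` is expected to be a geopandas GeoDataFrame, for example
    feed.lines_freq. The helper name and default path are illustrative.
    """
    gdf.to_file(path, driver="GeoJSON")
    return path


# Usage, assuming `feed` was created as shown earlier:
# export_for_gis(feed.lines_freq, "sfmta_lines_freq.geojson")
```

GeoJSON is used here only because it is a single, widely supported file; other drivers supported by your geopandas installation should work the same way.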

The function map_gdf() is built on top of the folium library and allows you to quickly visualize and style the data on a map.

It takes 6 arguments as shown below. For example, to visualize line frequencies:
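
A minimal sketch of such a call follows. The argument names (gdf, variable, colors, tooltip_var, tooltip_labels, breaks) follow the package README as I recall it, and the column names are the ones listed earlier in this article, so treat all of them as assumptions and check the documentation and lines_freq.columns for your installed version. The colors and breaks are illustrative values.

```python
# Style arguments for map_gdf(). Names are assumed from the package README;
# colors and breaks are illustrative and should have the same length.
STYLE = {
    "variable": "ntrips",               # column used to color the lines
    "colors": ["#d13870", "#e895b3", "#55d992",
               "#3ab071", "#0e8955", "#066a40"],
    "tooltip_var": ["frequency"],       # columns shown in the tooltip
    "tooltip_labels": ["Frequency: "],  # label shown next to each tooltip value
    "breaks": [10, 20, 30, 40, 120, 200],
}


def map_line_frequencies(lines_freq, direction="Inbound", window="6:00-9:00"):
    """Map one direction and one service window of the lines_freq output.

    The column names 'dir_id' and 'window' are assumptions taken from the
    column list in this article.
    """
    from gtfs_functions import map_gdf  # same import as earlier in the article

    mask = (lines_freq["dir_id"] == direction) & (lines_freq["window"] == window)
    gdf = lines_freq.loc[mask].reset_index(drop=True)
    return map_gdf(gdf=gdf, **STYLE)


# Usage, assuming `feed` was created as shown earlier:
# map_line_frequencies(feed.lines_freq)
```

Filtering to a single direction and window before mapping keeps each line from being drawn several times on top of itself.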

Did you find this article helpful? Please let me know by leaving a few claps!

Acknowledgments & References

Even if this is not a corporate package, some members of Via’s Data Science NYC team collaborated on the last update of the package. A special shout out to Mattijs De Paepe who considerably improved the segment-cutting function and Tobias Bartsch who implemented pattern calculation.

As for relying on other packages, map_gdf() is just a folium wrapper, so much of the credit goes to folium's creators.
