Python for Transit: Stop frequencies in a map from GTFS

Dive deeper into gtfs_functions Python package

Santiago Toso
3 min read · Dec 7, 2020

Update March 2023!!

This package was updated in March 2023. This article reflects the usage of the package’s latest version.

Introduction

In this article, we will see how to get stop frequencies from a GTFS using the Python package gtfs_functions. You can find the repository and official documentation on GitHub.

If you are looking for an extensive explanation of the package, I recommend you first read this introduction. Here, we are going to directly dive into the specific use case of getting stop frequencies in a map.

Friendly reminder: please help me with a clap (or many!) when you finish reading if you find this article helpful.

Package installation and GTFS parsing

To install the package and parse the GTFS, run the code below. For this article, I downloaded the GTFS from SFMTA (San Francisco, CA).

# In your terminal run
pip install gtfs_functions

# Or in a notebook (or similar)
!pip install gtfs_functions

# Import package
from gtfs_functions import Feed, map_gdf

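# Parse the feed; time_windows holds the cutoffs (in hours) that define the service windows
feed = Feed("SFMTA.zip", time_windows=[0, 6, 9, 15, 19, 22, 24])

# The parsed GTFS tables are available as attributes of the feed
routes = feed.routes
trips = feed.trips
stops = feed.stops
stop_times = feed.stop_times
shapes = feed.shapes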

Calculate stop frequencies

With the GTFS parsed, we can start playing with it. In the latest version, stops_freq is an attribute of the Feed object, so it takes no arguments of its own. It is built from 3 inputs that the feed already holds:

  • stop_times: GeoDataFrame created in the previous step, needed to calculate the number of trips per stop and direction.
  • stops: GeoDataFrame created in the previous step, needed to have the location of the stops.
  • cutoffs: list of numbers that define the time windows we want to aggregate the data by. We passed them as time_windows when creating the Feed.

Since the cutoffs were already set when we created the Feed, we only need to access the attribute:

stops_freq = feed.stops_freq

The output for one specific stop shows:

Figure: GeoDataFrame output for stops_freq.

The output has the following columns:

  • stop_id from the GTFS
  • dir_id: the direction, “Inbound” if it had a 0 in the GTFS and “Outbound” if it had a 1.
  • window: service window defined by the cutoffs (the time_windows input).
  • ntrips: number of trips in the window.
  • frequency: service frequency in the window, expressed in minutes per trip.
  • max_trips: maximum number of hourly trips that take place at that stop.
  • stop_name from the GTFS
  • geometry
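To make ntrips and frequency concrete, here is a minimal sketch of the arithmetic. It assumes that frequency is the window length in minutes divided by the number of trips in the window. Both that formula and the helper name minutes_per_trip are illustrative assumptions, not code taken from the package.

```python
def minutes_per_trip(window_start_hour, window_end_hour, ntrips):
    """Average minutes between trips in a service window (hypothetical helper)."""
    if ntrips <= 0:
        return None  # no service in this window, so there is nothing to report
    window_minutes = (window_end_hour - window_start_hour) * 60
    return window_minutes / ntrips


# A stop with 18 trips in the 6:00-9:00 window: 180 minutes / 18 trips
print(minutes_per_trip(6, 9, 18))  # 10.0
```

Under this assumption, a lower value means better service: 10 minutes per trip is the same as 6 trips per hour.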

Maybe you already noticed that this GeoDataFrame is ready to be used in a BI or mapping tool. It even has the fields already labeled with easily readable values like “Inbound” or “6:00–9:00” instead of the numbers we have in the GTFS.
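As a rough illustration of where those readable labels can come from, the sketch below turns a list of cutoffs into window labels and maps the GTFS direction codes to names. The helper window_labels and the DIRECTION_LABELS dictionary are hypothetical, written only for this example. The package may build its labels differently (for instance, with a different dash character).

```python
def window_labels(cutoffs):
    """Turn consecutive cutoffs (in hours) into labels such as '6:00-9:00'."""
    def fmt(hour):
        whole = int(hour)
        minutes = round((hour - whole) * 60)
        return f"{whole}:{minutes:02d}"

    # Pair each cutoff with the next one: (0, 6), (6, 9), ...
    return [f"{fmt(start)}-{fmt(end)}" for start, end in zip(cutoffs, cutoffs[1:])]


# GTFS direction_id values and the labels described in the article
DIRECTION_LABELS = {0: "Inbound", 1: "Outbound"}

cutoffs = [0, 6, 9, 15, 19, 22, 24]
print(window_labels(cutoffs))
# ['0:00-6:00', '6:00-9:00', '9:00-15:00', '15:00-19:00', '19:00-22:00', '22:00-24:00']
print(DIRECTION_LABELS[0])  # Inbound
```

Note that 7 cutoffs produce 6 windows, since each window sits between two consecutive cutoffs.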

Often you will want hourly values. In that case, it is enough to pass hourly cutoffs as the time_windows input when creating the Feed.
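A minimal sketch of building hourly cutoffs is shown below. The variable name hourly_cutoffs is only illustrative, and the commented-out Feed call mirrors the one used earlier in the article.

```python
# Hourly cutoffs: 0, 1, 2, ..., 24 gives 24 one-hour windows
hourly_cutoffs = list(range(25))

print(hourly_cutoffs[:4], "...", hourly_cutoffs[-1])  # [0, 1, 2, 3] ... 24
print(len(hourly_cutoffs) - 1)  # 24

# Assumed usage, mirroring the Feed call from earlier in the article:
# feed = Feed("SFMTA.zip", time_windows=hourly_cutoffs)
```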

Show results on a map

You can always export the GeoDataFrames we saw and open them in your favorite GIS software, but I added a function to allow the user to quickly take a look from the notebook before going into that workflow. It is not meant to be presentation-ready or fully customizable, just to take a quick look.

The function map_gdf() is built on top of the folium library and allows you to quickly visualize and style the data on a map.

It takes 6 arguments as shown below. For example, to visualize stop frequencies:

Did you find this article helpful? Please let me know by leaving a few claps!!

Acknowledgments & References

Even though this is not a corporate package, some members of Via’s Data Science NYC team collaborated on the last update of the package. A special shout-out to Mattijs De Paepe, who considerably improved the segment-cutting function, and Tobias Bartsch, who implemented pattern calculation.

As for the other packages this one relies on heavily, map_gdf() is just a folium wrapper, so much of the credit goes to folium’s creators.
