pygtfs - a database backed python gtfs interface!
Get it
The source is available on github: https://github.com/jarondl/pygtfs
Basic usage
To include pygtfs functionality in your application, use import pygtfs.
The first thing you need to do is create a new schedule object:
sched = pygtfs.Schedule(":memory:")
This creates an in-memory sqlite database. Instead, you can supply a filename to be used for sqlite (such as 'gtfs.sqlite'), or a sqlalchemy database connection.
Then you can load GTFS feeds into the database using append_feed:
pygtfs.append_feed(sched, "sample-gtfs-feed.zip")
The GTFS feed can be either a .zip file or a folder full of .txt files. You can add as many feeds as you want to a single database without fear of conflicts (although you may end up with duplicates, for example two stop entries for one place, one from each feed). Another option for loading feeds is the 'gtfs2db' script, explained later.
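Since a feed may arrive as either a .zip archive or a folder of .txt files, it can be handy to check which tables a feed contains before loading it. The following is a minimal sketch using only the standard library; list_feed_tables is a hypothetical helper for illustration and is not part of pygtfs.

```python
import os
import zipfile


def list_feed_tables(feed_path):
    """Return the sorted .txt file names found in a GTFS feed.

    feed_path may be a directory or a .zip archive, the two forms
    that append_feed is documented to accept. (Hypothetical helper,
    not provided by pygtfs.)
    """
    if os.path.isdir(feed_path):
        names = os.listdir(feed_path)
    elif zipfile.is_zipfile(feed_path):
        with zipfile.ZipFile(feed_path) as archive:
            names = archive.namelist()
    else:
        raise ValueError("not a zip file or directory: %r" % (feed_path,))
    # Keep only the GTFS tables; feeds sometimes ship extra files.
    return sorted(name for name in names if name.endswith(".txt"))
```

Calling list_feed_tables("sample-gtfs-feed.zip") would return names such as agency.txt and stops.txt if the archive contains them.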
The Schedule object represents a collection of objects that correspond to the
contents of a GTFS feed. You can get the list of agencies, stops, routes, etc.
with fairly straightforwardly named attributes, see pygtfs.schedule
for more details.
>>> sched.agencies
[<Agency BART: Bay Area Rapid Transit>, <Agency AirBART: AirBART>]
>>> sched.routes
[<Route AirBART: >, <Route 01: >, <Route 03: >, <Route 05: >, <Route 07: >, <Route 11: >]
For GTFS entities that are identified by a dataset-unique identifier, there is also a function to get them by id:
>>> sched.agencies_by_id('AirBART')
[<Agency AirBART: AirBART>]
>>> sched.stops_by_id('SFIA')
[<Stop SFIA: San Francisco Int'l Airport>]
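As the output above shows, the by-id functions return a list rather than a single object (presumably because the same id can occur in more than one feed). When exactly one match is expected, a small wrapper can make that assumption explicit. This is only a sketch; only is a hypothetical helper name, not part of pygtfs.

```python
def only(matches, what="entity"):
    """Return the single element of a *_by_id result list.

    Raises LookupError when the list is empty or has several
    elements, instead of silently picking the first one.
    """
    if len(matches) != 1:
        raise LookupError(
            "expected exactly one %s, found %d" % (what, len(matches))
        )
    return matches[0]
```

With a loaded schedule this would be used as only(sched.stops_by_id('SFIA'), "stop").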
The GTFS entity objects have attributes that correspond in name to the field definitions in the [GTFS reference](https://developers.google.com/transit/gtfs/reference).
>>> sched.stops_by_id('SFIA')[0].stop_name
u"San Francisco Int'l Airport"
>>> sched.routes[1].route_long_name
u'Pittsburg/Bay Point - SFIA/Millbrae'
GTFS entities which cross-reference each other can also be obtained straightforwardly with attributes (again, see “Reference” below for full details):
>>> sched.trips_by_id('01SFO10').service # the service associated with trip 01SFO10
<Service WKDY (MTWThFSSu)>
gtfs2db
Running setup.py install also installs a command-line script, gtfs2db, that takes a GTFS zip file or directory as an argument and loads the data into a database usable with pygtfs. Run gtfs2db --help for more.
Detailed reference
The best place to start is pygtfs.schedule
Contents:
The schedule module
class pygtfs.schedule.Schedule(db_connection)
Represents the full database.
The schedule is the most important object in pygtfs. It represents the entire dataset. Most of the properties come straight from the GTFS reference. Two of them were renamed: calendar is called services, and calendar_dates is called service_exceptions. One addition is the feeds table, which is there to support more than one feed in a database.
Each of the properties is a list created upon access by sqlalchemy. Each element of the list has attributes following the GTFS reference. In addition, if an element is related to another table, the related objects can also be accessed by attribute.
Parameters: db_connection – Either a sqlalchemy database url or a filename to be used with sqlite.
- agencies: A list of pygtfs.gtfs_entities.Agency objects
- agencies_by_id(id): A list of pygtfs.gtfs_entities.Agency objects with matching id
- agencies_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Agency objects
- fare_rules: A list of pygtfs.gtfs_entities.FareRule objects
- fare_rules_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.FareRule objects
- fares: A list of pygtfs.gtfs_entities.Fare objects
- fares_by_id(id): A list of pygtfs.gtfs_entities.Fare objects with matching id
- fares_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Fare objects
- feed_infos: A list of pygtfs.gtfs_entities.FeedInfo objects
- feed_infos_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.FeedInfo objects
- feeds: A list of pygtfs.gtfs_entities.Feed objects
- feeds_by_id(id): A list of pygtfs.gtfs_entities.Feed objects with matching id
- feeds_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Feed objects
- frequencies: A list of pygtfs.gtfs_entities.Frequency objects
- frequencies_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Frequency objects
- routes: A list of pygtfs.gtfs_entities.Route objects
- routes_by_id(id): A list of pygtfs.gtfs_entities.Route objects with matching id
- routes_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Route objects
- service_exceptions: A list of pygtfs.gtfs_entities.ServiceException objects
- service_exceptions_by_id(id): A list of pygtfs.gtfs_entities.ServiceException objects with matching id
- service_exceptions_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.ServiceException objects
- services: A list of pygtfs.gtfs_entities.Service objects
- services_by_id(id): A list of pygtfs.gtfs_entities.Service objects with matching id
- services_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Service objects
- shapes: A list of pygtfs.gtfs_entities.ShapePoint objects
- shapes_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.ShapePoint objects
- stop_times: A list of pygtfs.gtfs_entities.StopTime objects
- stop_times_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.StopTime objects
- stops: A list of pygtfs.gtfs_entities.Stop objects
- stops_by_id(id): A list of pygtfs.gtfs_entities.Stop objects with matching id
- stops_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Stop objects
- transfers: A list of pygtfs.gtfs_entities.Transfer objects
- transfers_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Transfer objects
- translations: A list of pygtfs.gtfs_entities.Translation objects
- translations_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Translation objects
- trips: A list of pygtfs.gtfs_entities.Trip objects
- trips_by_id(id): A list of pygtfs.gtfs_entities.Trip objects with matching id
- trips_query: A sqlalchemy.orm.Query object to fetch pygtfs.gtfs_entities.Trip objects
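The plain properties build a full list on access, while the matching _query properties expose a sqlalchemy.orm.Query, so filtering and counting can be left to the database. The sketch below assumes sched is a Schedule with a feed already loaded; count_stops_named is a hypothetical helper, while filter_by and count are standard Query methods.

```python
def count_stops_named(sched, name):
    """Count stops whose stop_name equals name.

    Uses the stops_query property so the database does the
    filtering and counting, rather than building the full
    sched.stops list in Python first. (Hypothetical helper.)
    """
    return sched.stops_query.filter_by(stop_name=name).count()
```

The same pattern should apply to the other _query properties, for example sched.routes_query.filter_by(route_type=1).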
The loader module
GTFS entities
GTFS entities.
These are the entities returned by the various pygtfs.schedule
lists.
Most of the attributes come directly from the GTFS reference. Also, when possible, relations are taken into account, e.g. a Route class has a trips attribute, with a list of trips for the specific route.
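To illustrate how such relations can be walked, here is a minimal sketch that counts trips per route. It relies only on the routes property of the schedule and on the route_id and trips attributes of Route described in this reference; trips_per_route itself is a hypothetical helper.

```python
from collections import Counter


def trips_per_route(sched):
    """Map each route_id to the number of trips on that route.

    Counts are summed, so routes that share a route_id across
    several feeds end up in one combined entry.
    """
    counts = Counter()
    for route in sched.routes:
        counts[route.route_id] += len(route.trips)
    return counts
```

Because the result is a Counter, trips_per_route(sched).most_common(5) lists the five busiest routes.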
class pygtfs.gtfs_entities.Agency(**kwargs)
- agency_email
- agency_fare_url
- agency_id
- agency_lang
- agency_name
- agency_phone
- agency_timezone
- agency_url
- feed_id
- id
- routes
class pygtfs.gtfs_entities.Fare(**kwargs)
- agency_id
- currency_type
- fare_id
- feed_id
- id
- payment_method
- price
- transfer_duration
- transfers
class pygtfs.gtfs_entities.FareRule(**kwargs)
- contains_id
- destination_id
- fare_id
- feed_id
- origin_id
- route_id
class pygtfs.gtfs_entities.Feed(**kwargs)
- agencies
- fare_rules
- fares
- feed_append_date
- feed_id
- feed_name
- feedinfo
- frequencies
- id
- routes
- service_exceptions
- services
- shape_points
- stop_times
- stops
- transfers
- translations
- trips
class pygtfs.gtfs_entities.FeedInfo(**kwargs)
- feed_end_date
- feed_id
- feed_lang
- feed_publisher_name
- feed_publisher_url
- feed_start_date
- feed_version
class pygtfs.gtfs_entities.Frequency(**kwargs)
- end_time
- exact_times
- feed_id
- headway_secs
- start_time
- trip_id
class pygtfs.gtfs_entities.Route(**kwargs)
- agency_id
- fare_rules
- feed_id
- id
- route_color
- route_desc
- route_id
- route_long_name
- route_short_name
- route_text_color
- route_type
- route_url
- trips
- valid_extended_route_types = [0, 1, 2, 3, 4, 5, 6, 7, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 300, 400, 401, 402, 403, 404, 405, 500, 600, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 800, 900, 901, 902, 903, 904, 905, 906, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1200, 1300, 1301, 1302, 1303, 1304, 1305, 1306, 1307, 1400, 1401, 1402, 1500, 1501, 1502, 1503, 1504, 1505, 1506, 1507, 1600, 1601, 1602, 1603, 1604, 1700, 1701, 1702]
class pygtfs.gtfs_entities.Service(**kwargs)
- end_date
- feed_id
- friday
- id
- monday
- saturday
- service_id
- start_date
- sunday
- thursday
- trips
- tuesday
- wednesday
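The seven weekday attributes of Service can be combined into a readable summary of when a service runs. This is a minimal sketch that assumes each weekday attribute is truthy when the service operates on that day (as the calendar table of the GTFS reference suggests); active_weekdays is a hypothetical helper, not part of pygtfs.

```python
WEEKDAYS = (
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
)


def active_weekdays(service):
    """Return the names of the weekdays on which a service runs.

    Assumes the weekday attributes are truthy for operating days.
    Dates and service exceptions are not taken into account.
    """
    return [day for day in WEEKDAYS if getattr(service, day)]
```

For a weekday-only service this would return the five names from monday to friday.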
class pygtfs.gtfs_entities.ServiceException(**kwargs)
- date
- exception_type
- feed_id
- id
- service_id
class pygtfs.gtfs_entities.ShapePoint(**kwargs)
- feed_id
- shape_dist_traveled
- shape_id
- shape_pt_lat
- shape_pt_lon
- shape_pt_sequence
- trips
class pygtfs.gtfs_entities.Stop(**kwargs)
- feed_id
- id
- location_type
- parent_station
- platform_code
- stop_code
- stop_desc
- stop_id
- stop_lat
- stop_lon
- stop_name
- stop_times
- stop_timezone
- stop_url
- transfers_from
- transfers_to
- translations
- wheelchair_boarding
- zone_id
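A common use of the stop_lat and stop_lon attributes is finding the stop closest to a given position. The sketch below uses the haversine great-circle formula and assumes the coordinates are WGS84 degrees that float() can convert; haversine_km and nearest_stop are hypothetical helpers, not part of pygtfs.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_stop(stops, lat, lon):
    """Return the stop closest to (lat, lon).

    stops is any iterable of objects with stop_lat and stop_lon,
    such as sched.stops. Raises ValueError if it is empty.
    """
    return min(
        stops,
        key=lambda s: haversine_km(lat, lon, float(s.stop_lat), float(s.stop_lon)),
    )
```

This scans every stop in Python, which is fine for small feeds; for large databases a bounding-box filter on stops_query would likely be a better first step.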
class pygtfs.gtfs_entities.StopTime(**kwargs)
- arrival_time
- departure_time
- drop_off_type
- feed_id
- pickup_type
- shape_dist_traveled
- stop_headsign
- stop_id
- stop_sequence
- timepoint
- trip_id
class pygtfs.gtfs_entities.Transfer(**kwargs)
- feed_id
- from_stop_id
- min_transfer_time
- to_stop_id
- transfer_type
The gtfs2db script
This is a script to manage the database. Here is its help message:
gtfs2db - convert a gtfs feed to a pygtfs database

Usage:
  gtfs2db append <feed_file> <database> [--chunk-size <integer>]
  gtfs2db overwrite <feed_file> <database> [-i, --interactive] [--chunk-size <integer>]
  gtfs2db delete <feed_file> <database> [-i, --interactive]
  gtfs2db list <database>
  gtfs2db (-h | --help)
  gtfs2db --version

Options:
  -h --help           Show this help screen.
  --version           Show version.
  -i --interactive    Ask before deleting or overwriting existing feeds.
  --chunk-size <int>  How often to flush database. If memory consumption is high,
                      lower this number. [default: 10000]
  <feed_file>         The gtfs file on which to operate. Can be either a folder
                      containing .txt files, or a .zip file.
  <database>          The database. Can be either a file, which is interpreted
                      as an sqlite database stored in this file, or a sqlalchemy
                      database connection.

Commands:
  append      appends the gtfs feed to the database
  overwrite   delete any existing feeds which had the same original
              filename as the new file, and then append the new file.
  delete      delete from the database any feeds with the name supplied.
  list        list existing feeds in the database.

Description:
  This is a tool to manage a database containing several gtfs feeds. The
  database is in a pygtfs 0.1.0 format, and can be stored as any database
  supported by sqlalchemy (the default being sqlite).
  The database file can later be used to create a `pygtfs.Schedule` instance.
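When gtfs2db is driven from another Python program, the usage text above can be turned into an argument list. The following is a minimal sketch; gtfs2db_command is a hypothetical helper that only builds the list from the commands and options shown in the help message, and it does not run anything itself.

```python
def gtfs2db_command(action, database, feed_file=None,
                    chunk_size=None, interactive=False):
    """Build an argument list for gtfs2db, following its usage text.

    action is one of "append", "overwrite", "delete" or "list".
    Options that the usage text does not list for an action are
    silently left out.
    """
    if action == "list":
        return ["gtfs2db", "list", database]
    if action not in ("append", "overwrite", "delete"):
        raise ValueError("unknown gtfs2db command: %r" % (action,))
    if feed_file is None:
        raise ValueError("%s requires a feed file" % action)
    args = ["gtfs2db", action, feed_file, database]
    if interactive and action in ("overwrite", "delete"):
        args.append("--interactive")
    if chunk_size is not None and action in ("append", "overwrite"):
        args += ["--chunk-size", str(chunk_size)]
    return args
```

With gtfs2db installed, the resulting list can be handed to subprocess.call, for example subprocess.call(gtfs2db_command("append", "gtfs.sqlite", "sample-gtfs-feed.zip", chunk_size=5000)).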