Mini Project 01: Fiscal Characteristics of Major US Public Transit System

Introduction

Public transit systems plays an important role in providing the essential transportation service for millions of residents in the United States. In this project, we are looking to analyze the fiscal characteristics of public transit systems to evaluate their financial performance.

Fiscal Characteristics:

We are going to analyze public transit systems based on the following:

  1. Expenses: costs to operate (fuel, maintenance, employee salaries, etc)

  2. Fare revenue: money earned from passenger fares

  3. Farebox recovery ratio: how much of the operating expense is covered by the fares revenue (total farewell revenue/total operating expense)

  4. UPT: Unlinked Passenger Trips

  5. VRM: Vehicle Revenue Miles

Data source:

We used data from the National Transit Database. Data files required:

I created the table using the code provided in the assignment instructions. In this code, I renamed “UZA Name” to “metro_area” (Task 1). I also modified the “Mode” column (Task 2) so we can understand the context of each abbrevations. The interpretation of the codes can be found on the NTD website in the glossary section.

A sample of 1000 observations is shown below:

if(!require("DT")) install.packages("DT")
Loading required package: DT
library(DT)
library(gt)

sample_n(USAGE, 1000) |> 
  select(-`NTD ID`, -`3 Mode`) |>
  rename(Unlinked_Passenge_Ttrips = UPT, Vehicle_Revenue_Miles = VRM) |>
  mutate(month=as.character(month)) |>
  DT::datatable()  #visualize a sample of 1000 observations

Note: In this sample, I unselected columns “NTD ID” and “3 Mode” and renamed “UPT” and “VRM” to make the table look cleaner and easier to interpret.

Task 3: Using dplyr to Analyze Transit Agencies

1. What transit agency had the most total VRM in this sample?

The MTA New York transit has the most vehicle mileage. It has operated over 10.8 billion (almost 11 billion) mileage.

most_vrm_agency <- USAGE |>
  group_by(Agency) |>
  summarize(total_vrm = sum(VRM, na.rm = TRUE)) |>
  ungroup() |>
  arrange(desc(total_vrm)) |>
  slice(1) 

most_vrm_agency |>
  gt() |>
  tab_header(title = "Most Total VRM Agency")
Most Total VRM Agency
Agency total_vrm
MTA New York City Transit 10832855350

2. What transit mode had the most total VRM in this sample?

Bus has the most total VRM of over 49 billion mileage.

most_vrm_mode <- USAGE |>
  group_by(Mode) |>
  summarize(total_vrm = sum(VRM, na.rm = TRUE)) |>
  ungroup() |>
  arrange(desc(total_vrm)) |>
  slice(1)
most_vrm_mode |>
  gt() |>
  tab_header(title = "Most Total VRM Mode")
Most Total VRM Mode
Mode total_vrm
Bus 49444494088

3. How many trips were taken on the NYC Subway (Heavy Rail) in May 2024?

In May 2024, a total of 180,458,819 trips were taken on the NYC Subway (Heavy Rail).

nyc_subway_trips_may_2024 <- USAGE |>
  mutate(month = format(ymd(month), "%Y-%m")) |>
  filter(Mode == "Heavy Rail",
         Agency == "MTA New York City Transit", 
         month == "2024-05") |>
  summarize(total_trips = sum(UPT, na.rm = TRUE)) 
nyc_subway_trips_may_2024 |>
  gt() |>
  tab_header(title = "NYC Subway (Heavy Rail) in May 2024")
NYC Subway (Heavy Rail) in May 2024
total_trips
180458819

5. How much did NYC subway ridership fall between April 2019 and April 2020?

Between April 2019 and April 2020, NYC Subway ridership fall about 91.28%.

nyc_subway_ridership_april_2019_to_april_2020 <- USAGE |>
  mutate(month = format(ymd(month), "%Y-%m")) |>
  filter(Agency == "MTA New York City Transit", Mode == "Heavy Rail", month %in% c("2019-04", "2020-04")) |>
   summarise(
    April_2019 = sum(case_when(month == "2019-04" ~ UPT, TRUE ~ 0), na.rm = TRUE),
    April_2020 = sum(case_when(month == "2020-04" ~ UPT, TRUE ~ 0), na.rm = TRUE), percent_change = (April_2020 - April_2019)/April_2019 *100) |>
  ungroup()

nyc_subway_ridership_april_2019_to_april_2020 |>
  gt() |>
  tab_header(title = "% Change in NYC Subway Ridership")
% Change in NYC Subway Ridership
April_2019 April_2020 percent_change
232223929 20254269 -91.27813

Task 4: Explore and Analyze

Unlinked Passenger Trips per Vehicle Revenue Miles

I also wanted to explore the ratio of unlinked passenger trips per vehicle revenue miles to evaluate the efficiency of transit systems. Using the code below, I categorized and ranked 18 transit modes by the UPT per VRM ratio. The UPT per VRM evaluates how many passenger trips is generated per vehicle mile.

UPT_per_VRM <- USAGE |>
  group_by(Mode) |>
  summarize(total_upt = sum(UPT, na.rm = TRUE), total_vrm = sum(VRM, na.rm = TRUE)) |>
  mutate(UPT_per_VRM = total_upt/total_vrm) |>
  arrange(desc(UPT_per_VRM))

UPT_per_VRM |>
  gt() |>
  tab_header(title = "Unlinked Passenger Trips per Vehicle Revenue Mile")
Unlinked Passenger Trips per Vehicle Revenue Mile
Mode total_upt total_vrm UPT_per_VRM
Aerial Tramway 16258014 292860 55.51462815
Inclined Plane 23995934 705904 33.99319738
Ferry Boat 1265902052 65589783 19.30029334
Cable Car 138759762 7386019 18.78681357
Trolleybus 1913533394 236840288 8.07942521
Streetcar Rail 476858350 63389725 7.52264425
Bus Rapid Transit 663875986 118425283 5.60586362
Monorail and Automated Guideway 194254572 37879729 5.12819329
Heavy Rail 73571166122 14620362107 5.03210287
Light Rail 8645186793 2090094714 4.13626556
Bus 97720629073 49444494088 1.97637029
Hybrid Rail 58617303 37787608 1.55123084
Commuter Rail 9155539922 6970644241 1.31344243
Commuter Bus 764738385 1380948975 0.55377744
Publico 516329565 1021270808 0.50557556
Alaska Railroad 2938715 13833261 0.21243834
Vanpool 449397572 3015783362 0.14901520
Demand Response 1768846847 17955073508 0.09851515

Findings:

  1. High Efficiency Modes:

     - Aerial Tramway
     - Inclined Planes
  2. Moderate Efficiency Modes:

     - Ferry Boats
     - Cable Cars
  3. Lower Efficiency Modes:

     - Demand Response
     - Bus
     - Commuter Bus

Highest VRM Mode for Each Agency

Agencies have different transit modes. I want to analyze which transit mode has the highest UPT for each agency. For example, King County has transit modes of ferry boats, bus, demand response, etc. I want to explore which transit modes in King County generates the highest VRM. Using the code below, I generated the top 3 highest VRM transit mode for each agency.

top_3_highest_vrm_mode_per_agency <- USAGE |>
  group_by(Agency,Mode) |>
  summarize(total_vrm = sum(VRM, na.rm = TRUE), .groups = 'drop') |>
  arrange(desc(total_vrm)) |>
  slice_head(n=3)

top_3_highest_vrm_mode_per_agency |>
  gt() |>
  tab_header(title = "Highest VRM Mode for Each Agency") |>
  cols_label(total_vrm = "Total Vehicle Revenue Miles")
Highest VRM Mode for Each Agency
Agency Mode Total Vehicle Revenue Miles
MTA New York City Transit Heavy Rail 7732916753
New Jersey Transit Corporation Bus 3781858802
Los Angeles County Metropolitan Transportation Authority Bus 3501202902

Findings:

The highest transit mode for MTA New York City Transit agency is heavy rail. It generates 7,732,916,753 miles. Recall in question 1 of Task 3, MTA New York City Transit has a total of 10,832,855,350 miles in total. If we calculate the percentage of total VRM operated by heavy rail, we get 71.4%. Heavy rails operates a big portion of the vehicle miles travelled by MTA.

Task 5: 2022 Annual Summary Table

##creating 2022 summary table
USAGE_2022_ANNUAL <- USAGE |>
  filter(year(month) == 2022) |>
  group_by(`NTD ID`, Agency, metro_area, Mode) |> 
  summarize(
    total_upt = sum(UPT, na.rm = TRUE),
    total_vrm = sum(VRM, na.rm = TRUE),
    .groups = 'drop') |>
  filter(total_upt > 400000) |> #filter major transit system
  ungroup()

Note: Since we are restricting our answers to major transit system, I have filtered total UPT to be greater than 400,000 (shown in code above).

Merging 2022 Summary Table with Financial Data Table

##merge summary with Financials report
USAGE_AND_FINANCIALS <- left_join(USAGE_2022_ANNUAL, 
                                  FINANCIALS, 
                                  join_by(`NTD ID`, Mode)) |>
  drop_na()

Task 6: Farebox Recovery Among Major Systems

Note: For Task 6 questions, the output table will show the top 5 results.

1. Which transit system (agency and mode) had the most UPT in 2022?

In 2022, the MTA New York City Transit heavy rail has the great number of passenger trips. It has over 1.7 billion trips.

most_total_upt <- USAGE_AND_FINANCIALS |>
  filter(total_upt >= 400000) |>
  arrange(desc(total_upt)) |>
  slice_head(n=5) |>
  select(Agency, Mode, total_upt)
most_total_upt |>
  gt() |>
  tab_header(title = "Transit System with the Most UPT in 2022")
Transit System with the Most UPT in 2022
Agency Mode total_upt
MTA New York City Transit Heavy Rail 1793073801
MTA New York City Transit Bus 458602305
Los Angeles County Metropolitan Transportation Authority Bus 193637448
Chicago Transit Authority Bus 140013945
New Jersey Transit Corporation Bus 112739990

2. Which transit system (agency and mode) had the highest farebox recovery, defined as the highest ratio of Total Fares to Expenses?

In 2022, Port Imperial Ferry Corporation ferry boat has the highest farebox recovery ratio of 1.428.

highest_farebox_recovery <- USAGE_AND_FINANCIALS |>
  mutate(farebox_recovery = `Total Fares`/Expenses) |>
  select(`NTD ID`, Agency, Mode, `Total Fares`, Expenses, farebox_recovery) |>
  arrange(desc(farebox_recovery)) |>
  slice_head(n=5) 
highest_farebox_recovery |>
  gt() |>
  tab_header(title = "Transit System with the Highest Farebox Recovery")
Transit System with the Highest Farebox Recovery
NTD ID Agency Mode Total Fares Expenses farebox_recovery
20190 Port Imperial Ferry Corporation Ferry Boat 33443241 23417248 1.428146
11239 Hyannis Harbor Tours, Inc. Ferry Boat 25972659 18383764 1.412804
20169 Trans-Bridge Lines, Inc. Commuter Bus 11325199 8495611 1.333065
40001 Chattanooga Area Regional Transportation Authority Inclined Plane 3005198 2290714 1.311904
90001 Regional Transportation Commission of Washoe County Vanpool 3561776 2876745 1.238127

3. Which transit system (agency and mode) has the lowest expenses per UPT?

In 2022, North Carolina State University Bus has the lowest expense per UPT of $1.18.

lowest_expenses_per_upt <- USAGE_AND_FINANCIALS |>
  mutate(expenses_per_upt = Expenses/total_upt) |>
  arrange(expenses_per_upt) |>
  slice_head(n=5)
lowest_expenses_per_upt |>
  gt() |>
  tab_header(title = "Transit System with the Lowest Expenses per UPT")
Transit System with the Lowest Expenses per UPT
NTD ID Agency metro_area Mode total_upt total_vrm Agency Name Total Fares Expenses expenses_per_upt
40147 North Carolina State University Raleigh, NC Bus 2313091 531555 North Carolina State University 0 2727412 1.179120
90211 Anaheim Transportation Network Los Angeles--Long Beach--Anaheim, CA Bus 7635011 895608 Anaheim Transportation Network 8438881 9751600 1.277221
70019 University of Iowa Iowa City, IA Bus 2437750 579820 University of Iowa 0 3751241 1.538813
40025 Chatham Area Transit Authority Savannah, GA Ferry Boat 582988 13149 Chatham Area Transit Authority 870706 935249 1.604234
60269 Texas State University San Marcos, TX Bus 2348943 702051 Texas State University 0 4825081 2.054150

4. Which transit system (agency and mode) has the highest total fares per UPT?

In 2022, commuter bus under agency Hampton Jitney Inc has the highest fares per UPT of $41.30.

highest_fares_per_upt <- USAGE_AND_FINANCIALS |>
  mutate(fares_per_upt = `Total Fares`/total_upt) |>
  arrange (desc(fares_per_upt)) |>
  slice_head(n=5) 
highest_fares_per_upt |>
  gt() |>
  tab_header(title = "Transit System with the Highest Total Fares per UPT")
Transit System with the Highest Total Fares per UPT
NTD ID Agency metro_area Mode total_upt total_vrm Agency Name Total Fares Expenses fares_per_upt
20217 Hampton Jitney, Inc. New York--Jersey City--Newark, NY--NJ Commuter Bus 521577 2039368 Hampton Jitney, Inc. 21539188 17957368 41.29628
30057 Pennsylvania Department of Transportation Philadelphia, PA--NJ--DE--MD Commuter Rail 452034 1919730 Pennsylvania Department of Transportation 14580664 24920257 32.25568
11239 Hyannis Harbor Tours, Inc. Barnstable Town, MA Ferry Boat 878728 188694 Hyannis Harbor Tours, Inc. 25972659 18383764 29.55711
20169 Trans-Bridge Lines, Inc. New York--Jersey City--Newark, NY--NJ Commuter Bus 403646 1259602 Trans-Bridge Lines, Inc. 11325199 8495611 28.05726
20226 SeaStreak, LLC New York--Jersey City--Newark, NY--NJ Ferry Boat 750392 143935 SeaStreak, LLC 16584600 21572770 22.10125

5. Which transit system (agency and mode) has the lowest expenses per VRM?

In 2022, Metropolitan Transportation Commission vanpool has the lowest expense per VRM of $0.45.

lowest_expenses_per_vrm <- USAGE_AND_FINANCIALS |>
  mutate(expenses_per_vrm = Expenses/total_vrm) |>
  select(-`NTD ID`,
         -metro_area, 
         -`Agency Name`) |>
  arrange(expenses_per_vrm) |>
  slice_head(n=5)
lowest_expenses_per_vrm |>
  gt() |>
  tab_header(title = "Transit System with the Lowest Expenses per VRM")
Transit System with the Lowest Expenses per VRM
Agency Mode total_upt total_vrm Total Fares Expenses expenses_per_vrm
Metropolitan Transportation Commission Vanpool 1024804 12341055 6504406 5491767 0.4449998
San Joaquin Council Vanpool 819111 9297516 5450599 4629125 0.4978884
San Diego Association of Governments Vanpool 989804 9740828 6021472 5264624 0.5404699
Regional Transportation Commission of Washoe County Vanpool 725712 5079308 3561776 2876745 0.5663655
Los Angeles County Metropolitan Transportation Authority Vanpool 1350335 16551651 9148488 9610858 0.5806586

6. Which transit system (agency and mode) has the highest total fares per VRM?

In 2022, Jacksonville Transportation Authority ferryboat has the highest total fares per VRM of $157.70.

highest_fares_per_vrm <- USAGE_AND_FINANCIALS |>
  mutate(fares_per_vrm = `Total Fares`/total_vrm) |>
  select(-`NTD ID`,
         -metro_area,
         -`Agency Name`) |>
  arrange(desc(fares_per_vrm)) |>
  slice_head(n=5)
highest_fares_per_vrm |>
  gt() |>
  tab_header(title = "Transit System with the Highest Total Fares per VRM")
Transit System with the Highest Total Fares per VRM
Agency Mode total_upt total_vrm Total Fares Expenses fares_per_vrm
Jacksonville Transportation Authority Ferry Boat 416129 9084 1432549 3162214 157.70024
Chattanooga Area Regional Transportation Authority Inclined Plane 481957 20128 3005198 2290714 149.30435
Hyannis Harbor Tours, Inc. Ferry Boat 878728 188694 25972659 18383764 137.64433
SeaStreak, LLC Ferry Boat 750392 143935 16584600 21572770 115.22284
Cape May Lewes Ferry Ferry Boat 721923 71640 6663334 24488286 93.01136

Conclusion

The metrics to analyze the most efficient transit system can be measured in different ways.

Looking at the farebox recovery ratio, Port Imperial Ferry Corporation ferry boat would be the top candidate as it can shows a strong ratio of 1.428 (refer back to Task 6 question 2). This shows that Port Imperial Ferry Corporation ferry boat has strong profitability and has a healthy financial performance.

If we measure efficiency from vehicle miles, MTA New York City Transit Heavy Rail would be the most efficient transit system. It has the highest vehicle miles travelled (refer to question 1 of Task 3). It provides extensive amount of service indicating high ridership demand. Its vehicles are well utilized and therefore operating efficiently.