Behind the scenes with MBTA data.

On Monday, December 3, 2018, the walkway that links Independence Avenue in Quincy to the Red Line’s Quincy Adams Station reopened, giving the adjacent neighborhood easier and more direct access to the station.

An image of the newly reopened walkway linking Independence Avenue in Quincy to the Red Line's Quincy Adams Station.

This gated entry point near Independence Avenue provides a more direct pedestrian connection to the station. As we highlighted in the early days of the Data Blog, the walkshed around this station was severely limited while this gate was closed, and the only practical ways to reach the station were by car or bus. You can see that the entire neighborhood, which is just steps from the station as the crow flies, was not accessible along the pedestrian network:

A map of the surrounding neighborhood and walkshed affected by the closing and reopening of the walkway linking Independence Avenue and Quincy Adams Station.

With the reopening of the Independence Avenue gate, would there be an increase in the number of riders boarding at Quincy Adams station? To learn more about how the reopening of this entrance would impact our riders, we dug into the data. First, we queried our Automated Fare Collection (AFC) database to get the tap counts on fare gates at Quincy Adams station and on buses that stop on Independence Avenue.

We started by identifying the total number of taps on each of the fare gates at Quincy Adams station. The results are shown below. The fare gates are numbered according to their ease of access from the station entrance; for example, fare gate 1 is the one closest to the entrance. If more riders had started commuting from Quincy Adams station after the Independence Avenue entrance reopened, we would expect the tap count to increase on every fare gate, or at least on the one closest to the entrance. However, we observed little change in the tap counts compared to previous years. The tap count on fare gate 4 increased from December 2018 to January 2019, but this fare gate is in the middle and not very close to the station entrance. We concluded that this fare gate was probably used the most because the other fare gates were down (or perhaps this was just random noise), and that the increase was not an effect of reopening the Independence Avenue entrance.

A chart tracking faregate 1 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 2 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 3 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 4 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

A chart tracking faregate 5 use over time. Little change is seen from before and after Quincy Adams Station became more accessible.

You can also see the entries visualized by day in the following chart. While we see a slight increase from normal in the daily entries on a few days soon after the new entrance opened, we also see a high number of entries on November 29, 2018, the week before it opened. The increase also does not continue through December and January.


After looking into the fare gate data, we took a look at the bus ridership for the routes that travel through the newly accessible neighborhood. We assumed that, before the Independence Avenue entrance to Quincy Adams opened, a few riders took the bus to Quincy Center and transferred to the Red Line there, so we looked at the weekday tap counts on bus route 230, which has several stops on Independence Avenue. Unfortunately, we didn't have good Automated Passenger Counter (APC) coverage on buses along this route over the whole time period, and the data we do have at the stop level from ODX is spotty. But here is what we found:

If passengers who previously took the bus to Quincy Center had started walking to Quincy Adams once the entrance opened, there should have been a decrease in the weekday tap count at these stops after December 3, 2018, but there is no significant decrease compared to previous years in the data we have.

We then thought that passengers who lived near Quincy Adams, or perhaps in between the two stations, may have been walking, biking, or getting dropped off at Quincy Center before the Quincy Adams gate opened. If this hypothesis was correct, we expected to find a number of CharlieCards that typically tapped in at Quincy Center switching to Quincy Adams instead. We took the CharlieCards/Tickets that were tapped at Quincy Center between September 1, 2018, and December 3, 2018, and checked whether the same cards were tapped at Quincy Adams between December 4, 2018, and May 31, 2019, after the entry point on Independence Avenue had reopened. About 9.22% of them were.

We wanted to see whether this percentage was unique to that time period or whether a similar change had happened in previous years. We therefore repeated the check for the previous year: of the cards tapped at Quincy Center between September 1, 2017, and December 3, 2017, 8.49% were also tapped at Quincy Adams between December 4, 2017, and May 31, 2018.
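The card comparison described above amounts to intersecting two sets of card IDs, each drawn from its own station and date window. Here is a minimal sketch of that idea. The record layout (card ID, station, date), the function name `share_also_seen`, and the toy records are illustrative assumptions, not the AFC database schema or the query we actually ran.

```python
from datetime import date


def share_also_seen(taps, first_station, first_window, second_station, second_window):
    """Share of cards seen at first_station during first_window that were
    also seen at second_station during second_window (windows are inclusive)."""

    def cards_at(station, window):
        start, end = window
        return {card for card, stn, day in taps if stn == station and start <= day <= end}

    base = cards_at(first_station, first_window)
    if not base:
        return 0.0
    return len(base & cards_at(second_station, second_window)) / len(base)


# Made-up records for illustration only: (card_id, station, tap_date).
taps = [
    ("A", "Quincy Center", date(2018, 9, 10)),
    ("A", "Quincy Adams", date(2019, 1, 15)),
    ("B", "Quincy Center", date(2018, 10, 2)),
    ("C", "Quincy Center", date(2018, 11, 20)),
    ("C", "Quincy Center", date(2019, 2, 1)),
    ("D", "Quincy Adams", date(2019, 3, 3)),   # never seen at Quincy Center
    ("E", "Quincy Center", date(2018, 9, 5)),
    ("E", "Quincy Adams", date(2018, 9, 6)),   # falls outside the second window
]

share = share_also_seen(
    taps,
    "Quincy Center", (date(2018, 9, 1), date(2018, 12, 3)),
    "Quincy Adams", (date(2018, 12, 4), date(2019, 5, 31)),
)
print(f"{share:.2%}")
```

Running the same function with both windows shifted back one year gives the baseline share to compare against.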

This led us to conclude that even though some riders who used CharlieCards/Tickets at Quincy Center station before December 3, 2018, used them at Quincy Adams station afterward, this is not necessarily a direct effect of reopening the Quincy Adams entrance at Independence Avenue: the share was less than one percentage point higher than in the previous year (9.22% versus 8.49%). Riders in this area commonly take the Red Line from either Quincy Center station or Quincy Adams station.


We looked at the data in a number of ways, but could not find a significant indicator of use of the pedestrian gate at Quincy Adams. Of course, we also don’t have any counters or sensors at that gate or on the path, so we are not concluding that no one is using it, only that the change was not big enough to notice in the data, especially compared to the number of people who use the other entrance and parking garage to access the station.

Looking at the neighborhood near the new entrance, it is not particularly dense (mostly single- and two-family homes), so we would not expect a huge number of new riders compared to the thousands who were already accessing the station from the other side. It is also likely that since the entrance was closed historically, people whose usual trip was on the Red Line would not choose to live there in great numbers. This may change as time goes on and new people move to the neighborhood, or people living there change their travel patterns. 

This investigation also shows the limitations of our current data, especially on ridership. We don’t always have the level of detail needed to discern relatively small changes like this, especially since there is always normal variation in passenger behavior. We are always investigating new cost-effective ways to learn things about how our passengers travel while continuing to respect their privacy.


Ridership is an essential measure of the MBTA’s performance, and we report our best estimates of it each month both on the MBTA Back on Track Dashboard and to the National Transit Database (NTD). As we have discussed on the blog, the source data for ridership comes from different systems and is measured in different ways. There are also many riders and trips that we are unable to measure with our equipment, and whose travel we need to estimate. This post will discuss the methods we use to count riders and trips, and to estimate those we can’t directly count. We will also discuss some of our future plans for improving these estimates and our reporting.

Recap: The Sources

We use different systems to collect the raw data depending on the technology available. The two main sources are Automated Passenger Counters (APCs), which are currently installed on most of the bus fleet, and the Automated Fare Collection (AFC) system, which counts CharlieCard taps and other payment methods on rail and bus services. APCs are also being installed on Commuter Rail coaches and on the MBTA’s new Green, Red, and Orange Line vehicles, which are expected to come into service over the next few years.

For services where we do not have significant APC coverage, we use estimates based on data from the AFC system. The AFC system counts every interaction with a piece of fare equipment (for ridership purposes, these are faregates and fareboxes). We also conduct manual counts at various times and places to check against our automatically collected data, or in cases like Commuter Rail where we have limited automatic data.

Recap: The Measure

We report ridership as Unlinked Passenger Trips (UPT), which counts each boarding of each vehicle as one “unlinked” trip, even if it was part of a longer journey. While this gives additional credit to transfer trips, it is the industry standard and is required by the NTD, so we currently report ridership in this manner. We are investigating other measures of ridership and hope to be able to provide them along with UPT in the future.

How we estimate ridership from raw data

Bus: For our bus network, with a few small exceptions, we have enough APCs installed that we can use them to estimate ridership with minimal scaling and uncertainty. For each day type and route, we take the boardings counted on trips run by APC-equipped buses and scale them up to the total number of scheduled trips. We then scale the result back down to account for scheduled service that did not run.
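The two-step bus scaling can be sketched as follows for a single route and day type. This is a simplified illustration under stated assumptions: the function name, its parameters, and the example numbers are hypothetical, and it assumes trips with and without APCs carry similar loads on average.

```python
def estimate_route_boardings(apc_boardings, apc_trips, scheduled_trips, operated_trips):
    """Scale APC-counted boardings up to all scheduled trips, then back down
    to the trips that actually ran (one route, one day type)."""
    if apc_trips <= 0 or scheduled_trips <= 0:
        raise ValueError("need at least one APC-counted trip and one scheduled trip")
    boardings_per_trip = apc_boardings / apc_trips
    scaled_up = boardings_per_trip * scheduled_trips       # as if every scheduled trip ran
    return scaled_up * (operated_trips / scheduled_trips)  # remove service that did not run


# Hypothetical inputs: 4,500 boardings counted on 90 APC-equipped trips,
# 100 trips scheduled, 96 of which actually operated.
estimate = estimate_route_boardings(4500, 90, 100, 96)
print(round(estimate))
```

With these made-up inputs, the average of 50 boardings per counted trip is applied to the 96 trips that ran.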

Rapid Transit: We currently have very limited coverage of APCs on the Rapid Transit system and need to use the AFC data to estimate ridership. We start with the raw validations (taps, ticket insertions, or cash payments) at each AFC location. From here we apply three different factors in order to estimate total ridership from the validations. These factors are explained below:

  • Non-Interaction: Non-Interaction factors account for people who entered the MBTA system without interacting with fare equipment. These are most often children, employees, people actively evading the fare or people who entered when the fare equipment was not functioning. These factors are calculated based on a sample of manual observations of people entering faregates, conducted each year.
  • Station Splits: We usually assume that every validation at a faregate at a station leads to a person boarding the line that serves that station. At stations that serve multiple lines, we do not directly know which line someone who validated there then boarded. For example, someone validating at Government Center could then board either a Green Line or Blue Line vehicle without any further interaction with fare equipment. To estimate these data, we apply a factor called a “station split” to “split” the boardings at such stations between the lines that serve each station. These factors are currently based on past surveys of passengers, but at the conclusion of this fiscal year we will update them using ODX.
  • Behind-the-Gate: As noted above, we report ridership as unlinked passenger trips, meaning every boarding of each vehicle. For trips where passengers transferred lines without passing through a faregate or an APC, we cannot directly measure their second trip, so we need to estimate it with a factor. Currently, we do this using the answers from surveys of passengers. We ask them where they are going as they wait for a train, and determine how many additional unlinked trips we can estimate for each boarding based on which line they boarded. For example, if our survey showed that there were 121 unlinked trips for 100 passengers surveyed, the “behind the gate” factor for that line would be 0.21, and we would multiply the count of boardings (after the other factors were applied) by 1.21 to estimate total unlinked trips. We are also updating this factor at the end of the fiscal year using the ODX algorithm.
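Two of the factors above reduce to simple arithmetic, sketched here. The behind-the-gate numbers (121 unlinked trips per 100 passengers) come from the example in the text. The function names, the 10,000 validations, and the 60/40 split between lines are hypothetical values chosen only for illustration.

```python
def behind_the_gate_factor(unlinked_trips_surveyed, passengers_surveyed):
    """Additional unlinked trips per boarding implied by a passenger survey."""
    return unlinked_trips_surveyed / passengers_surveyed - 1.0


def split_validations(validations, line_shares):
    """Divide faregate validations at a multi-line station between its lines."""
    if abs(sum(line_shares.values()) - 1.0) > 1e-9:
        raise ValueError("line shares must sum to 1")
    return {line: validations * share for line, share in line_shares.items()}


# From the text: 121 unlinked trips for 100 surveyed passengers gives 0.21.
factor = behind_the_gate_factor(121, 100)
unlinked_trips = 1000 * (1 + factor)  # 1,000 boardings is a made-up input

# Hypothetical split shares for a station served by two lines.
by_line = split_validations(10_000, {"Green": 0.6, "Blue": 0.4})

print(round(factor, 2), round(unlinked_trips), by_line)
```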

Putting it all together

The following chart shows an example of how we calculate final ridership from raw faregate interactions, with all three factors applied. These numbers are rounded to the nearest thousand.

A chart depicting average Red Line weekday ridership, with examples of how non-interaction, station splits, and behind-the-gate activity affects our ridership estimates.

First, we sum all the interactions at all faregates at stations with Red Line service. This will over-count the riders at stations that serve multiple lines. Second, we apply the “split factors” to the total interactions at stations that serve multiple lines (there is a different factor for each station-line combination) and reassign the share that belongs to the other lines. This is represented by the -27 in the second column on the chart above. We then have a subtotal of 194,000 interactions that can be attributed to the Red Line.

Third, we apply the non-interaction factor to scale these taps to account for people who entered without interacting with the faregate. This brings our running total to 206,000.

Finally, we apply the additional trips from the other lines that could have behind-the-gate transfers to the Red Line (Green and Orange). These are counted in a similar calculation that is conducted on the interactions recorded at gates on those lines. This adds an additional 36,000 unlinked trips to our total, giving us our final ridership estimate of 242,000 average weekday UPT on the Red Line.
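The running totals in this example can be retraced with a few lines of arithmetic. All values are in thousands and come from the text, except the raw total of 221 and the non-interaction multiplier, which are back-calculated here from the rounded figures and are therefore only approximate.

```python
# All values in thousands of average weekday interactions or trips.
split_adjustment = -27        # from the text: reassigned to other lines
after_splits = 194            # from the text
after_non_interaction = 206   # from the text
behind_the_gate_trips = 36    # from the text: transfers from Green and Orange
final_upt = 242               # from the text

# Back-calculated, not stated in the text: raw interactions before the split.
implied_raw = after_splits - split_adjustment

# Back-calculated from rounded totals, so approximate.
implied_non_interaction_multiplier = after_non_interaction / after_splits

# The final step is an addition of transfer trips estimated on other lines.
recomputed_final = after_non_interaction + behind_the_gate_trips

print(implied_raw, round(implied_non_interaction_multiplier, 3), recomputed_final)
```

The non-interaction step therefore adds roughly 6% to the Red Line interactions in this example.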

Green Line Surface

The Green Line is the most extensive and complex light rail system in the country, and this complexity presents myriad data challenges, as we have detailed on the blog. For ridership reporting, the surface-running portion of the Green Line presents some unique issues that we must account for. First, there is a high level of non-interaction on the Green Line due to the operational practice of allowing passengers with passes to board at the back door. While we believe the revenue loss from this is relatively low, it does mean we use a large non-interaction factor for the Green Line. We continually monitor and improve this factor, and as the new Type 9 cars, which are equipped with APCs, come into service, we will be able to use them to better estimate non-interaction.

Second, the Green Line fareboxes are not hard-wired to the AFC central database. This means they must be manually “probed” to download their transaction data (cash payments into the fareboxes are collected through a different process). Because the AFC system was installed nearly 15 years ago, this is a much more difficult process than it might seem: data can only be probed in certain places in the train yard, and vehicles do not always have an operational reason to come into those places (by contrast, fareboxes on buses are probed much more regularly because probing is part of the nightly re-fueling process). In fact, a large portion of the data from surface AFC interactions is not downloaded to our database until weeks or sometimes months after the transaction occurred.

In order to account for this probing lag, we have developed a process to impute taps for which we do not have data yet, based on the amount of service we see that each vehicle has provided (measured by stations visited from our AVL system) and the number of taps per vehicle-stop visit that we have recorded in each month in the past.

This process consists of four steps. First, we evaluate how much AFC data is missing and likely to come in through a future probing. We conservatively treat AFC data as missing if a vehicle is seen to be in service on a particular date but did not record any AFC records. Next, we estimate what the missing data is likely to be, in terms of the taps per vehicle-stop visit we tend to see, based on the same month of the prior year (to account for seasonal ridership trends). We then take the number of stop visits made by the vehicles with currently missing AFC data and scale them up by this estimate. Finally, every month, as more probed data comes in, we replace the estimates with real data.
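The four steps can be sketched as a single pass over vehicle-day records. This is a simplified illustration, not our production process: the record layout, the function name, the vehicle numbers, and the rate of 1.5 taps per stop visit are all hypothetical.

```python
def estimate_surface_taps(vehicle_days, taps_per_stop_visit):
    """Sum probed taps, imputing a value for vehicle-days that were in service
    (stop visits seen in AVL) but have no AFC records yet.

    vehicle_days: iterable of (vehicle, service_date, stop_visits, probed_taps).
    taps_per_stop_visit: rate observed in the same month of the prior year.
    """
    total = 0.0
    imputed = []
    for vehicle, service_date, stop_visits, probed_taps in vehicle_days:
        if stop_visits > 0 and probed_taps == 0:
            # Steps 1-3: flag as missing, then scale stop visits by last year's rate.
            total += stop_visits * taps_per_stop_visit
            imputed.append((vehicle, service_date))
        else:
            total += probed_taps
    return total, imputed


# Made-up records: one probed vehicle, one awaiting probing, one not in service.
records = [
    ("3801", "2019-03-04", 200, 310),
    ("3802", "2019-03-04", 180, 0),
    ("3803", "2019-03-04", 0, 0),
]
first_total, first_imputed = estimate_surface_taps(records, taps_per_stop_visit=1.5)

# Step 4: once vehicle 3802 is probed, rerunning replaces the estimate with real data.
records[1] = ("3802", "2019-03-04", 180, 255)
later_total, later_imputed = estimate_surface_taps(records, taps_per_stop_visit=1.5)

print(first_total, first_imputed, later_total, later_imputed)
```

Because the estimate is recomputed from whatever data is currently available, rerunning it each month naturally swaps imputed values for probed ones.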

Ridership on the Dashboard

We put all of the above together into our ridership update six weeks after the end of each month. This is the earliest date we feel confident that we have enough Green Line surface data to estimate its ridership. After QA/QC, we combine the above calculated ridership with the ridership reporting we get from Commuter Rail, Ferry and the RIDE to display our average weekday ridership for each month. 

We are working on more detailed and granular ridership tools, which will allow users of the Dashboard to explore our ridership data in different ways as data quality and availability improve. Look for these in a future update to the Dashboard.


In the last five years, the MBTA and other large transit agencies across the country have seen drops in their ridership, especially on buses and during off-peak times. This is counter to historical trends; given increased population and economic growth in Boston, we would typically expect ridership to increase. The changes are also not uniform; ridership on the Commuter Rail system, for example, seems to be growing significantly.

Analysts at the Office of Performance Management and Innovation (OPMI) decided to investigate possible causes of this decline in ridership. We have posted some of this work on the blog here and here, and we are excited to post the full report below. The report linked below explores bus ridership, examining what factors are causing the decline in bus ridership specifically, and how these factors differ depending on the neighborhood.

The paper includes two significant analyses: a longitudinal regression looking at bus ridership at the transit-system level across the United States and a geographically weighted regression (GWR) focusing on local differences within the MBTA area. 

Read the full report: “Location, Location, Location: A Neighborhood-Level Analysis of Changes in MBTA Bus Ridership”

Suggested Citation


Thistle, I., & Zimmer, A. (2019). Location, location, location: A neighborhood-level analysis of changes in MBTA bus ridership (unpublished). MBTA – Office of Performance Management and Innovation, Boston. Retrieved from: https://massdot.box.com/v/busridershipreport


If you have any further questions or concerns related to this report, please reach out to us.