Conversations with the Coalition: Marisa Hughes

As part of a new Climate TRACE series, we are interviewing individual coalition members about their work. Up first is Marisa Hughes from the Johns Hopkins University Applied Physics Laboratory (APL), where she manages the team of data scientists applying remote sensing and AI to estimate emissions from road transportation.
How do you define your sector and/or where are you starting within it?
We are studying the emissions in the transportation sector, specifically focusing on internal combustion engine vehicles in use on roadways. But we’re not tracking individual cars and trucks themselves. Our analysis works on the level of road segments, calculating the average annual daily traffic and expected emissions factors associated with each segment. This provides a high-resolution view of where road emissions are coming from. We are starting with a focus on urban areas, which often include city centers as well as associated suburbs, to provide actionable data where traffic and the resultant pollution and air quality effects are most concentrated.
How is the JHUAPL team approaching road transportation, where the emissions come from lots of “little” things (i.e., cars and trucks) that move around a lot?
We start by looking at annual averages for the way that roads are used. We take the average annual daily traffic and combine it with emissions factors that are based on country-level estimates of the vehicle fleet, along with characteristics of the road itself, such as speed limit.
Machine learning models help us determine which are the busiest roads. We are able to generate traffic estimates for the roads within a city using road network layouts from OpenStreetMap, plus population statistics and satellite data. Satellites help us look at and understand the layout of the roads and the way a city is designed. Then we combine that with population factors to estimate how roads are being used. We rely on widely available datasets that we can apply all over the world.
Policies in different countries impact our results, too. For example, Norway has policies that incentivize electric vehicles (EVs) at the country level. So the estimate for cities in Norway has to factor in higher percentages for EVs than in U.S. cities.
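The segment-level calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the APL team's actual model: the `RoadSegment` class, the `annual_co2_tonnes` function, the emissions factor of 200 g CO2 per vehicle-km, and the idea of scaling traffic by a single country-level EV share are all assumptions made for the example. It shows how average annual daily traffic, segment length, and an emissions factor might combine into an annual estimate, and how the same road can yield very different totals depending on the share of electric vehicles.

```python
from dataclasses import dataclass

DAYS_PER_YEAR = 365
GRAMS_PER_TONNE = 1_000_000


@dataclass
class RoadSegment:
    """A hypothetical road segment; field names are illustrative."""
    name: str
    length_km: float
    aadt: float  # average annual daily traffic, in vehicles per day


def annual_co2_tonnes(segment: RoadSegment,
                      grams_co2_per_vehicle_km: float,
                      ev_share: float) -> float:
    """Estimate annual tailpipe CO2 for one segment, in tonnes.

    Assumption: EVs have zero tailpipe emissions, so traffic is simply
    scaled by the share of combustion vehicles in the national fleet.
    """
    if not 0.0 <= ev_share <= 1.0:
        raise ValueError("ev_share must be between 0 and 1")
    # Total distance driven on this segment over a year.
    vehicle_km = segment.aadt * segment.length_km * DAYS_PER_YEAR
    # Only combustion-engine vehicles contribute in this sketch.
    combustion_vehicle_km = vehicle_km * (1.0 - ev_share)
    return combustion_vehicle_km * grams_co2_per_vehicle_km / GRAMS_PER_TONNE


# Same road and traffic, two hypothetical fleets (all numbers made up).
segment = RoadSegment("hypothetical beltway segment", length_km=2.0, aadt=50_000)
low_ev = annual_co2_tonnes(segment, grams_co2_per_vehicle_km=200.0, ev_share=0.2)
high_ev = annual_co2_tonnes(segment, grams_co2_per_vehicle_km=200.0, ev_share=0.8)
print(f"20% EVs: {low_ev:.0f} t CO2/yr, 80% EVs: {high_ev:.0f} t CO2/yr")
```

With these made-up inputs, the 20% EV fleet produces about four times the emissions of the 80% EV fleet on the identical segment. A real emissions factor would also depend on road characteristics such as speed limit and on the mix of vehicle types, which this sketch leaves out.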
Why is road transportation emissions data so hard to gather and/or not widely available?
The fundamental problem is that there are over a billion cars on the road, distributed everywhere. There is no governing body to control or track all the cars, and vehicle tracking is handled differently in different countries. Plus, when looking from space, cars are not like big power plants or factories: they are small sources, and thus hard to identify. Satellites can’t understand what those billion cars are doing.
We were able to leverage road transport studies in the U.S., where there are great direct emissions datasets from road transport. Some studies put emissions sensors along roadways and those ground truth sensors are fantastic for understanding transportation emissions. That’s where we started with understanding the problem and training algorithms. But those ground sensors are expensive and don’t last forever.
Another advantage of starting our training in the U.S. was our familiarity with the area. Our team at APL works between Washington, D.C., and Baltimore, and we frequently pulled our results from these regions to sanity check our methods as we experimented. We knew we were getting closer as the beltways around these cities crystallized in bright red. Our team knew from personal experience how heavy traffic in those regions could be!
Why did you start looking into this?
Here at APL, we have a lot of experience in satellite data analysis and remote sensing, having developed algorithms for detecting infrastructure and patterns of life. When climate change became a strategic priority at the lab, we sought ways to apply these strengths to environmental challenges.
We learned that Climate TRACE had a gap in the transportation sector and that’s because of the challenges I mentioned. It’s a hard problem but we had done projects looking at infrastructure in the past, and we thought that we could leverage some of that expertise when looking at roads themselves to have an impact on this problem. We did a successful pilot and now here we are!
What challenges did you face tackling this?
We hit a couple metaphorical road bumps along the way!
Early on, we had been using the U.S. for a lot of our initial training of the model with high-detail data. But not every country is similar to the U.S. in car ownership, car culture, and road networks. When we initially tried to scale the algorithm globally, there was a drop in accuracy. We were not hitting the mark.
In year 2, we restructured how we analyze road network data, adding in country-level emissions data, and that really helped to overcome the challenge of the U.S. behavior not being like other countries. So for example, you could see the same road network and the same population but it makes a huge difference if 20% of cars are electric vs. 80%. We were also not considering methane emissions at first in our study since this is not a pollutant cars are known for, but we later observed that countries with high rates of motorcycle ownership had increased methane emissions because of incomplete combustion. This is concerning considering how potent methane is as a greenhouse gas in the short term.
Another challenge, which we are wrestling with right now, is that our algorithms rely on having road network data. We primarily use OpenStreetMap, a free and open mapping resource supported by volunteers around the world. This resource continues to grow in coverage, but is imperfect, especially in low-population areas. We are still trying to understand the way vehicles are used and it’s why we focus on cities, where coverage and accuracy are much better. It’s a crowd-sourced thing and you get more OpenStreetMap data where more people live.
And when we extend our algorithm into more rural regions, we have challenges accurately assessing where the roads are, how big they are, and how they are used. So there is a huge gap in our emissions knowledge in rural areas.
Also, shipping is an enormous component of transport emissions overall, beyond our focus on road transport emissions. But when we consider shipping, there’s the question of what happens to goods as they enter a port to be loaded onto a ship, and how they leave a port once a ship arrives at its destination and its cargo or containers are unloaded onto trucks.
That last mile from the ship pulling into port, dumping the goods, and bringing them to a central location creates the question of how to get the goods from there to the final customer. This is where things get really complicated. You’ve got to worry about the efficiency and logistics of individual orders vs. larger orders because a lot of emissions could be from the last part: getting it to whoever actually wants your product. Is it an individual consumer or a store? But we are already modeling emissions in cities and our strength at APL is at the city level. So we are better suited to tackle that last mile. Because long term, we want to be able to address all of those emissions costs.
How might various organizations use this type of data?
We want to help cities understand what the drivers of their road transportation emissions are, and what the impact would be of taking action to reduce overall vehicle activity, or reduce emissions factors. We hope our data can also inform how the structure of roads relative to population centers might impact both total regional emissions and local air quality. Companies concerned about their carbon footprints should also be looking to the roads — particularly when it comes to last mile delivery services and customer movements.
Interviewed by Ann Marie Gardner.


