Newsletter of the
ITS Cooperative Deployment Network

(Please read the Disclaimer)


Insights from the ITS Data Quality Workshop

(Last updated 3/1/04)


Participants in the recent ITS Data Workshop in Houston, TX spent two days sharing their perspectives, experiences, and lessons-learned about the vital role that quality ITS data plays in their own agencies or companies. Clearly, having adequate data quality from infrastructure-based detectors, for example, is already an important issue that is likely to become even more critical for next-generation ITS applications. As workshop organizer and moderator James Pol from the U.S. DOT ITS Joint Program Office said at the outset, "all of the new [ITS] initiatives that have been developed or proposed have a very strong thrust of data quality and integrity."

In his opening remarks, Pol said that the workshop had two primary purposes:

  1. To foster the exchange of knowledge by experts in the field, and
  2. To help identify key facets of future ITS data quality research

This article provides snapshots of just a few of the interesting and insightful presentations and discussions at that workshop. ITS America has already posted a news article about that meeting (see ITS Data Quality The Hot Topic At Houston Workshop), and will shortly post a more detailed synthesis report on the workshop as well as the slide presentations.

Data Quality: A Big Issue?

Pol posed this fundamental question at the outset of the workshop: How big of an issue is data quality? Virtually all workshop participants acknowledged that the availability of quality data was absolutely essential to support a wide range of traffic management/operations and Advanced Traveler Information Systems (ATIS) applications. Further, many were concerned that data quality issues would likely become much more critical in the future, as compelling new ITS applications come online. Some of the individual observations included:

Defining What "Data Quality" Means

Shawn Turner from Texas Transportation Institute (TTI) said that measuring data quality requires an understanding of all the intended purposes for that data. He proposed six different measures to assess data quality:

  1. Accuracy
  2. Completeness (availability)
  3. Validity
  4. Timeliness
  5. Coverage
  6. Accessibility (usability)

Turner acknowledged that the importance of each measure depends on the specific problem the data is intended to help evaluate or solve. He said that TTI is currently experimenting with different "composite measures" that attempt to take all six factors into account.
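
One way to picture a composite measure is as a weighted average of the six factors, with weights that change by application. The sketch below is purely illustrative: the 0-to-1 scoring scale, the function name, and every weight and score are assumptions made for this example, not TTI's actual method.

```python
# Hypothetical composite data-quality score over the six measures.
# The 0-1 scale and all weights/scores below are illustrative assumptions.
MEASURES = ("accuracy", "completeness", "validity",
            "timeliness", "coverage", "accessibility")


def composite_score(scores, weights):
    """Weighted average of per-measure scores, each expected in [0, 1]."""
    for name in MEASURES:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score must be between 0 and 1")
    total_weight = sum(weights[name] for name in MEASURES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(scores[name] * weights[name] for name in MEASURES)
    return weighted / total_weight


# The same detector data scored for two different uses (made-up numbers).
scores = {"accuracy": 0.9, "completeness": 0.6, "validity": 0.8,
          "timeliness": 0.5, "coverage": 0.7, "accessibility": 0.9}
traveler_info_weights = {"accuracy": 3, "completeness": 1, "validity": 2,
                         "timeliness": 3, "coverage": 2, "accessibility": 1}
planning_weights = {"accuracy": 2, "completeness": 3, "validity": 2,
                    "timeliness": 0, "coverage": 3, "accessibility": 2}

print(round(composite_score(scores, traveler_info_weights), 3))
print(round(composite_score(scores, planning_weights), 3))
```

Setting the timeliness weight to zero for the planning case reflects Turner's point that a measure critical to one application may matter little to another.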

Key Needs for Quality ITS Data

Turner’s presentation led into a discussion by all workshop participants about "key applications" that require adequate and accurate data. The following key applications were collected:

Traffic data                                Incident data
Operations policy                           Roadway network performance
Real-time status (for travel time on VMS)   Modeling
Evacuation management                       Modeling for planners (Incident/alternates/diversions)
Congestion measures (planners)              Speed monitoring (law enforcement)
Commercial vehicle applications             Incident management decision support
TMC diagnostics & system performance        Travel time

Table 1: Data Needs Identified by Workshop Participants

While many participants mentioned that accurate data is essential to support Advanced Traveler Information Systems (ATIS), Mark Hallenbeck said that WSDOT doesn’t initially deploy cameras or detectors to provide traveler information. Instead, these deployments are initially justified for traffic operations purposes, and their use to support ATIS applications is a spin-off benefit.

Early "Lessons-Learned" about Data Quality from the ADMS Project

Catherine McGhee from the Virginia Transportation Research Council provided an update to the workshop participants about the Virginia ADMS (Archived Data Management System) operational test. ADMS is designed to enable archived ITS data to be used for many different transportation applications, particularly in the areas of planning and mobility measurement, with an emphasis on applications that support transportation management center (TMC) operations. The ADMS database currently incorporates data from 19 miles of coverage in the Hampton Roads area; that coverage area is slated to expand in March.

McGhee said that ADMS' functionality was being released using a "build approach" as follows:

McGhee and Stephany Hanshaw from the Virginia Department of Transportation, whose department is beginning to use ADMS, both agreed that ADMS had opened their eyes about the issue of data quality. McGhee said that when they began the project, only about 10% of the data was being rated as good. In digging further into the reasons for this substandard quality, they discovered that part of the cause was screening tests that were inappropriate for the type of data. For example, Hampton Roads has a significant number of single loop detectors that can report occupancy but not speed, and the original screening tests flagged this lack of speed data as an error.
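
The single-loop problem McGhee described can be illustrated with a screening routine that only requires speed from detectors able to measure it. This is a minimal sketch, not the ADMS screening logic: the field names, the detector-type labels, and the 120 mph ceiling are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_SPEED_MPH = 120.0  # assumed plausibility ceiling, not an ADMS value


@dataclass
class Reading:
    detector_type: str          # "single_loop" or "dual_loop" (hypothetical labels)
    volume: Optional[int]       # vehicles counted in the interval
    occupancy: Optional[float]  # percent of time the loop was occupied, 0-100
    speed: Optional[float]      # mph; single loops do not report this


def screen(reading: Reading) -> List[str]:
    """Return the names of failed checks; an empty list means the reading passes."""
    failures = []
    if reading.volume is None or reading.volume < 0:
        failures.append("volume")
    if reading.occupancy is None or not 0.0 <= reading.occupancy <= 100.0:
        failures.append("occupancy")
    # Only require speed from detectors that can actually measure it.
    if reading.detector_type != "single_loop":
        if reading.speed is None or not 0.0 <= reading.speed <= MAX_SPEED_MPH:
            failures.append("speed")
    return failures


single = Reading("single_loop", volume=12, occupancy=8.5, speed=None)
dual = Reading("dual_loop", volume=12, occupancy=8.5, speed=None)
print(screen(single))  # missing speed is expected for a single loop
print(screen(dual))    # missing speed is a real fault for a dual loop
```

A test that ignores detector type would reject every single-loop reading, which is how otherwise usable data can end up counted as bad.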

Hanshaw characterized ADMS as a "powerful tool" that will help him formulate better responses to incidents and special events in the future. He said that it's been an "eye-opener" to see that his agency wasn't doing as well on data quality as it had thought. "We need to do a better job of designing systems with the end-use in mind, including a consideration of maintenance costs. If we're going to put thousands of detection devices out there, we had better be in a position to maintain them," he said. "I'm personally in favor of fewer detectors that produce better quality data," he concluded.

Hybrid (public/private) Data Sources

Hanshaw said that in the future a "hybrid" data collection system, in which the agency supplements its own detectors with information from private-sector partners, might make the most sense for many public sector agencies. He described the ongoing AirSage feasibility test that is currently providing segment speeds in the Hampton Roads area (see USDOT and VDOT Support a New Approach to Deriving Traveler Information from Cell Phones).

Final testing and evaluation of that test should be complete in the next few weeks, Hanshaw said. AirSage currently has an agreement with Sprint PCS to access Sprint's data, although the firm has not yet finalized similar agreements with other cellular providers in the region. Hanshaw said that data provided by AirSage includes segment speeds, segment ID (location), and a confidence value that indicates how many of the data points have passed AirSage's own quality tests. "We will know how good the AirSage data is according to their own processes, but we will also calibrate that against ground truth," he said.

While the notion of gaining alternative sources of data seemed popular with many workshop participants, Jim Kranig of Mn/DOT raised a cautionary note. Kranig said that public sector agencies needed to be careful not to get "caught in the lurch" should a private-sector firm on whom they depend for data go out of business. Hanshaw agreed, but said that the "flip-side of that issue is that the public cannot bear the total cost of building out the detection infrastructure."

Data vs. Information

Hallenbeck from the Univ. of Washington stressed to the workshop participants that there's a big difference between "data" and "information." As an example, he said that agencies may have a lot of loop data from arterial roadways, but may not be able to derive a useful "arterial speed" from that data because that speed greatly depends on whether or not drivers experience progression through traffic signals.

The University is currently looking into the feasibility of a "fastest truck" algorithm to compute a meaningful travel-time measure, Hallenbeck said. The idea is that if the fastest truck can make the trip in a certain amount of time, regular drivers should be able to make it in the same amount of time. He said that the "fastest truck" data they're using is valid, but they're not sure yet whether the information gleaned from that data is valid.
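
The core idea of a "fastest truck" measure can be sketched in a few lines: within a time window, take the shortest travel time observed among trucks that traversed the segment. The data layout (start and end timestamps per truck trip) and the function name are assumptions for illustration; the article does not describe how the University's algorithm actually works or how the truck data is collected.

```python
def fastest_truck_travel_time(trips, window_start, window_end):
    """Shortest truck travel time (seconds) among trips that began in the window.

    `trips` is an iterable of (start_time, end_time) pairs in seconds.
    Returns None when no truck started the segment during the window.
    """
    durations = [end - start for start, end in trips
                 if window_start <= start < window_end and end > start]
    return min(durations) if durations else None


# Hypothetical truck trips over one segment: (start, end) in seconds.
trips = [(0, 600), (100, 580), (950, 1400)]
print(fastest_truck_travel_time(trips, 0, 900))
```

Note that the result is only as informative as the sample: with few trucks in a window, the fastest one may still have been delayed, which is the data-versus-information gap Hallenbeck described.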

Hallenbeck predicted that data showing how well "managed lanes" like high-occupancy toll (HOT) lanes are working is going to be a big issue five years from now. "Data quality will be an important issue," he predicted. "If pricing is changed on a minute-by-minute basis, will you need detectors every mile?" He concluded that the performance requirements for such facilities will likely drive the need for future detection data.

Upcoming Variable Speed Limit Test in Florida

Schuman of PBS&J said that the iFlorida model deployment will be conducting a variable speed limit (VSL) trial on Interstate 4 through Orlando. The concept of operations for this trial will be developed in the next year. iFlorida will be integrating many types of data -- including weather forecasts and probe data -- into an expanded "data warehouse," and will implement a new "condition system" application that will drive many of the new traveler information services. That condition system will also recommend messages to be put on variable message signs (VMS) as well as on the variable speed signs used for the VSL trial. Schuman said that, at least initially, the VSL speeds would likely be advisory rather than enforced.

In a separate presentation entitled "Florida Data Quality," Liang Hsia from the Florida Dept. of Transportation described two different "data warehouses," as well as the role that the Florida Statewide architecture, regional TMCs, and numerous other initiatives will play in ensuring adequate data quality. Hsia said that FDOT is now using the agency's "Road Rangers" to provide new information about when incidents start and stop.

In another presentation, Mike Pietrzyk of Transportation Solutions, Inc. described a new set of ITS Performance Measures that are under development in Florida. Many of these measures will likely depend on accurate underlying data from Florida's freeways and arterial roads. Pietrzyk distributed an "Interim Recommendations Report for ITS Performance Measures," as well as a revised set of measures and related survey questions. More information is available on FDOT's web site.

Focusing on More than just Red, Yellow, and Green

Martin Knopp from the FHWA's National Resource Center in Olympia Fields, IL described an interesting application that he personally coded while with the Utah Department of Transportation that lets users analyze detected data in much greater detail. "The goal was to let UDOT staff dig down deeper into the data, not just look at red, yellow, and green on a speed map," he said. While that application was initially targeted for transportation operations center (TOC) operators and engineers, Knopp said that transportation planners immediately saw the benefits of that tool and wanted to use it.

This application lets users look at data such as volume vs. speed at individual detection stations over time, and also computes the "travel time index," which is the ratio of the current travel time to the free-flowing travel time. It also helps operations staff quickly identify malfunctioning detectors.
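
The travel time index defined above is simple to compute. The following sketch uses a hypothetical function name and made-up travel times; it is not taken from Knopp's UDOT application.

```python
def travel_time_index(current_travel_time, free_flow_travel_time):
    """Ratio of current travel time to free-flow travel time (same units for both)."""
    if free_flow_travel_time <= 0:
        raise ValueError("free-flow travel time must be positive")
    return current_travel_time / free_flow_travel_time


# A trip that takes 15 minutes now but 10 minutes in free-flowing traffic.
print(travel_time_index(15.0, 10.0))  # 1.5
```

An index of 1.0 means traffic is moving at free-flow conditions; higher values mean the trip is taking proportionally longer.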

ATIS Implications of Poor Data Quality

Alan Toppen from Mitretek Systems provided insight into the impact that poor quality data can have on the potential benefits of ATIS. Mitretek Systems had earlier calculated the time-management benefits to travelers from pre-trip notification of traffic congestion. Using archived estimates of travel time from the Caltrans/U.C. Berkeley PeMS database, his group recently analyzed the level of those benefits in light of the accuracy of the travel time estimates. In many cases, those estimates are directly dependent on the quality of underlying detector data.

Figure 1: ATIS Benefit vs. Error (Mitretek Systems)

The results from this investigation were very interesting (see Figure 1). Depending on the time of day, the average benefit from pre-trip ATIS information drops to zero if the error of pre-trip travel-time estimates is in the 14% to 21% range. According to this analysis, for larger errors many travelers would actually see a disbenefit from pre-trip ATIS information. Clearly, this analysis underscores the crucial role that accurate data plays in providing real user benefits from ITS technologies.

Toppen said that his colleague at Mitretek Systems, Soojung Jung, conducted a related investigation into the percentage of transportation network detection/surveillance coverage needed to provide benefits from pre-trip ATIS. The results of that research show that the first 25% of network coverage provides approximately 50% of the total available benefit. Conversely, going from 80% coverage to 100% coverage only provides 5% more user benefit.

Toppen then presented what he called a "notional nomograph" (see Figure 2) illustrating the relative importance of travel time error in any decision to increase network detection coverage. That chart showed that for systems with high error rates in computing travel time estimates, improving detection accuracy is the first priority.

Figure 2. Potential Decisionmaking Regimes (Mitretek Systems)

Late on the workshop's second day, Grant Zammit from the FHWA Resource Center in Atlanta, GA presented an interesting flowchart model that links the ultimate goal of ITS deployments (such as "improve regional traffic operations on arterials") to the issue of data quality. In Zammit's model -- which is useful both for new ITS initiatives and for selling existing ones -- a flow chart is created starting with the ultimate goal on the left-hand side. That goal is then progressively linked to "performance objective," "performance measure" and, ultimately, to the individual initiative. Using this approach, those who are asked to fund data quality initiatives can immediately see the payoff from those initiatives.

In the final workshop session on the "Political Value for Sustaining ITS Funding," Jim Kranig from Mn/DOT recounted the key importance accurate data played in maintaining political support for the agency's highway helper/incident response team. He said that some members of Minnesota's State Legislature were saying things like "it's a wonderful thing that we're changing tires and getting gas, but can we truly afford to do these nice things for people?" Kranig said that data collected by Mn/DOT showed that highway helpers were the initial detection source for approximately 20% of incidents that block limited access roadways. "Luckily, we were able to convince them that we can't turn this program off," he said. Based on this data, Kranig said, Mn/DOT is now looking for innovative ways to expand the highway helper program next year. "These efforts in data quality are really valuable -- especially if we can analyze the data quickly," he added.

At the conclusion of the workshop, Hallenbeck encouraged the participants to attend this year's North American Travel Monitoring Exhibition and Conference (NATMEC), which will be held June 27-30, 2004 in San Diego, CA. "Every two years, the 'data wonks' get together and talk about data-related issues," he said. This year's meeting will be held concurrently with the mid-year meeting of the TRB Freeway Operations Committee.

--Jerry Werner


This web page was created by the National Associations Working Group for ITS (NAWG), a cooperative effort of organizations whose members are spearheading ITS deployment in the U.S. The NAWG makes every effort to ensure the accuracy of these pages, although errors can and do occur. Report any errors or omissions to the NAWGITS webmaster. Each participating member of the National Associations Working Group for ITS is responsible only for the information it provides.