
Newsletter of the
ITS Cooperative Deployment Network
A Discussion with Catherine McGhee of VTRC (Last updated 3/15/04)
The goal of the ADMS (Archived Data Management System) Virginia Field Operational Test (FOT) is to demonstrate the value of archived transportation system data to many different types of users, including both planning and operations specialists. How is the project accomplishing that goal, and what lessons have been learned so far? ICDN Editor Jerry Werner recently discussed these and related issues with Catherine McGhee, Senior Research Scientist for the Virginia Transportation Research Council and Co-Project Manager of ADMS Virginia, and Dale Thompson, Congestion Mitigation Coordinator for the FHWA's Office of Operations.
ICDN: Did the ADMS project get started with the Request for Proposals (RFP) for the Field Operational Test (FOT)?
Thompson: The ADMS Field Operational Test (FOT) got started with a solicitation from FHWA. We had a desire to advance data archiving nationally. For example, a few years ago we folded the Archived Data User Service (ADUS) into the National ITS Architecture. We issued the FOT solicitation two years ago and selected ADMS Virginia. However, I know that the idea of ADMS Virginia pre-dated the FOT. Isn't that right, Cathy?
McGhee: At the time of the FOT, we had already established a substantial data archive as part of the Smart Travel Lab, a joint effort between the Virginia Dept. of Transportation (VDOT) and the Univ. of Virginia that was created in 1998. The Lab was looking into the whole arena of ITS and traffic management and all the data issues that go along with it. The Smart Travel Lab is connected in real time to the Smart Traffic Centers in Hampton Roads and Northern Virginia, as well as to the signal system in Northern Virginia. We receive data in real-time from those systems, and are designated as the "official archive" for the data from those systems. So we had already established a substantial archive and were dealing with such issues as "how do we best store this data?" and "how do we make it available to people who could use it?" The FOT solicitation that came out seemed a natural fit.
ICDN: Does the Smart Travel Lab archive the data in perpetuity?
McGhee: At this point we do keep it all -- we haven't dumped any data from our database.
ICDN: I've talked to some TMCs who say that their official policy is not to archive such data, because they're concerned about being asked for that data by litigants in traffic accidents.
McGhee: Our STCs also do not archive data beyond a particular period of time -- 30 or 60 days -- which is why we're designated as the long-term archive. I haven't heard anyone in Virginia being particularly concerned about liability issues stemming from the data. There have been lots of those kinds of conversations regarding video, and as a policy we generally don't record or store any video.
ICDN: Another factor limiting concern, I would guess, is that we’re talking about raw data that's useless to anybody who doesn't have a sophisticated engine that’s able to analyze it.
McGhee: Certainly, until the ADMS Virginia system was developed that data was a bunch of bits in a bucket. It was useful for research, but it was cumbersome at best to dig through and find what you needed.
ICDN: So you were archiving this data before the ADMS FOT came along. Was your system called the ADMS?
McGhee: The ADMS Virginia name came about through the field operational test project.
ICDN: Were the user-oriented tools and plans for all the Builds in place before the FOT? (See Table 1: The Three "Builds" of ADMS Virginia.)
McGhee: No, that came about as part of the FOT.
Thompson: The build approach was actually a recommendation from the ADMS Virginia project team subsequent to the award. We [FHWA] agreed that it would be an effective way to manage the project and allow it to evolve.
McGhee: We were primarily concerned with getting good input from our stakeholders. The build approach let them react to the initial build to see how they could use it, and then to give us some ideas for how else they could use it.
Table 1. The Three "Builds" of ADMS Virginia
ICDN: It sounds like prior to the FOT you were archiving data, but the FOT has kicked the use of this data into a new gear with new analysis tools. Is that a fair statement?
McGhee: We analyzed a lot of data internally, and received requests from different people for information derived from that data, but those outside of the lab had no way to query the data. They had to ask us to do it for them. While we did keep different kinds of historical averages, we had nothing as formal or structured as we've created through the ADMS FOT.
ICDN: Are the current ADMS stakeholders primarily from VDOT?
McGhee: Because we built this system on Hampton Roads, our stakeholders are mostly from that region, but we do have VDOT people from the central office, including from both the Planning and the Mobility Management divisions. Our stakeholders also include representatives from the Hampton Roads Smart Traffic Center, the Hampton Roads Planning District Commission, from Hampton Roads Transit, and from different cities in the region. We’re trying to accommodate the widest audience that might have a use for the data or be involved in operations in the region.
ICDN: What kind of information are these stakeholders typically looking for? Are they often looking for something that you don't provide yet?
McGhee: Sometimes. We recently visited Northern Virginia to talk to stakeholders there about hosting the next implementation of the ADMS system. They looked at what we've done so far and came up with a whole list of things to be added and ways that they would like to be able to use the system.
ICDN: Could you mention a couple of these new uses?
McGhee: Sure. One thing they mentioned involved HOV lane monitoring. They have reversible lanes, and time after time they get questions like: "Are we reversing the lanes at the right times?" "Do we have them open in the right direction on the weekends?" They would love to be able to perform tailored queries that would answer those questions so that they could monitor how those lanes are performing over time.
ICDN: They can't do any of that analysis right now without a tool like ADMS, is that right?
McGhee: They collect information once a year related to HOV performance, but right now they call us every time they want to run a query on the data, because we have detectors in both the HOV and general purpose lanes. They would also love to see a tool that would help them plan work zones so that they could better anticipate the impact of lane closures.
ICDN: It seems that a big advantage of ADMS to you is that it offloads from you the need to do custom inquiries and data analyses, right?
McGhee: Certainly that's a benefit to the lab, but for every question that we do get we have to wonder how many people don't bother to call, either because they don't realize that the data exists or don't want to take the time to ask somebody else to analyze it. If query tools were available from their desktops, they could run and review reports every week. Then this data becomes much more useful.
ICDN: Dale, I guess that gets into the FHWA's motivation of supporting this project. I presume that you'd like to make tools like ADMS available to operations people so that they can get a better handle on how well their facilities are performing. Is that a fair statement?
Thompson: Yes, you're absolutely right. The primary objective of the Archived Data User Service FOT was to show the operational benefits of an archiving system like this. That's one of the proven benefits of the "build approach." As Cathy was saying, many stakeholders didn’t have a good feel for how an archiving system could help them, especially on the operations side. By presenting Build 1 to stakeholders and demonstrating what functionality the system can provide, they're discovering "wow, it can do that for me -- can it also do this?" Those stakeholders now know that this tool is available to them, but there are other professionals across the country that really don’t know that data is available to them, or what archived data can do for them from an operations perspective.
McGhee: Certainly, the folks in the Smart Traffic Center don't have time to call us to run a query for them if they're dealing with an active incident. Even if they want to do a post event analysis, it's much more useful for them if they can query that tool for information that they need within the confines of their center.
ICDN: Is the tool intended to be useful for what you might call "near-real-time analysis?" Can people actually use it for useful information about ongoing incidents, for example?
McGhee: Some services within the tool are intended to help in near-real-time. Our incident management tool lets you query the database for similar events that have occurred in the past. It then automatically brings up the traffic data that was associated with each of those previous events. For example, it might tell you that the last time you had an event near this location with similar characteristics the duration of the incident was three hours. That sort of information could be useful in determining what messages you post on signs and what detour routes you might want to consider. The fully automated version of that tool will be included in Build 3, so we don’t know yet how that capability will play out in terms of real-time operation. Certainly we hope that it will be useful, but the evaluation will show how useful it turns out to be.
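The similar-event lookup McGhee describes could be sketched roughly as follows. This is a minimal illustration, not ADMS Virginia code: the `Incident` fields, the `similar_incidents` helper, the one-mile matching radius, and all sample values are assumptions made up for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    milepost: float            # location along the route (hypothetical field)
    incident_type: str         # e.g. "crash" or "disabled vehicle"
    lanes_blocked: int
    duration_min: float = 0.0  # left at 0.0 for an incident still in progress


def similar_incidents(history, current, max_distance_mi=1.0):
    """Return past incidents of the same type and severity near the current one."""
    return [
        past for past in history
        if past.incident_type == current.incident_type
        and past.lanes_blocked == current.lanes_blocked
        and abs(past.milepost - current.milepost) <= max_distance_mi
    ]


# Made-up history for illustration only.
history = [
    Incident(12.4, "crash", 2, 170),
    Incident(12.9, "crash", 2, 190),
    Incident(30.1, "crash", 2, 45),             # same type, but too far away
    Incident(12.6, "disabled vehicle", 1, 20),  # nearby, but a different type
]
current = Incident(12.5, "crash", 2)

matches = similar_incidents(history, current)
typical_duration = mean(m.duration_min for m in matches)
print(len(matches), "similar incidents; typical duration (min):", typical_duration)
```

An operator could use the typical duration of the matched events as a first estimate when choosing sign messages or detour routes.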
ICDN: Are you toward the end of the FOT timeframe? Obviously, you're approaching Build 3, which is the final release of the system.
McGhee: Yes. Build 3 is due for delivery on April 1, and the project is scheduled to end in mid-June.
ICDN: I presume that both of you anticipate that ADMS will live on past the June timeframe -- is that a fair statement?
McGhee: Yes. I guarantee it will live on in Virginia. We've already started discussions about how we're going to fund ongoing operations and maintenance as well as system expansion. We want to move it to Northern Virginia as a first step, but there are other areas in the state that we'd like to add to the system, as well. We envision that as more people start to use it, they’ll request additional services and new and better ways to use it and the data. The system will definitely continue within VDOT.
Thompson: From the Federal perspective, one of the main criteria for choosing the FOT location was a commitment by the winning team to explore the benefits of data archiving to operations. A number of places across the country met that criterion, and ADMS Virginia rose to the top. Another major element of ADMS Virginia’s selection was the "transferability" of that archiving system to other parts of the state and other parts of the country.
ICDN: Is the software intended to be publicly available to other states or agencies?
Thompson: Yes. The contract agreement specifies that the source code is available to FHWA for FHWA use, to include the iFlorida model deployment for example, as it is an FHWA-funded and sponsored project. Distributing it more widely would be a choice of VDOT, UVa, and Open Roads Consulting.
ICDN: It occurs to me, Cathy, that training must be a big deal with ADMS. Obviously, people will be much more active users if they know more about how the tool works. Are you explicitly training various groups in using the various Builds?
McGhee: Beyond the fact that we walk through the system functionality with the stakeholder groups for each new build, we haven't conducted any formal training. To be honest, I haven’t received any requests or other indication that our stakeholders need it.
ICDN: Is that to say that the tool is probably intuitive enough for them to use it right out-of-the-box?
McGhee: That's been my assumption. It's all menu-driven, so it's fairly intuitive.
Thompson: I agree. The use of GIS (Geographical Information System), in particular, makes it more user friendly and display friendly. It gives the operations environment, in particular, a more useful tool for interpreting and using results.
ICDN: Cathy, at the recent Data Quality Workshop in Houston (see Insights from the ITS Data Quality Workshop), you and Stephany Hanshaw, the Facility Manager of the Hampton Roads Smart Traffic Center, both talked about how the tool has opened your eyes to some of the data quality issues. How did it do that?
McGhee: As part of ADMS Virginia, we’ve created some tools that illustrate data quality at any point in time. We started looking at our data quality -- the percentage of data that was passing our screening tests -- when we were in the midst of Build 1, and were shocked at what we were seeing.
ICDN: You were seeing a very low good data rate on the order of 10% good data, right?
McGhee: It was very low. We found that a big piece of this problem was that we were interpreting the data incorrectly. A key lesson learned in this project is that coordination is vital. We can develop anything in the lab, but in order for it to be meaningful and useful for the people who are going to use it, we have to understand what the data is as it comes to us and how it's going to be used out the other side. Understanding how the data was formatted in the field obviously became very important to us, because we had a tremendous number of these single loops that had never been formatted to report speed. Because there was no calculation process in place to report speed, they were all reporting zeros. Our screening test said that the combination of positive volume and zero speed meant bad data. So we were marking a tremendous amount of data as bad that really wasn't bad.
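The screening problem described here can be illustrated with a short sketch. It is not the ADMS Virginia screening code: the `Reading` type, the `passes_screening` function, and the `reports_speed` flag are hypothetical names, and the rule shown is only the volume/speed check mentioned in the interview.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    detector_id: int
    volume: int    # vehicles counted in the interval
    speed: float   # reported speed; 0.0 when no speed is calculated


def passes_screening(reading, reports_speed):
    """Apply the volume/speed consistency check to one reading.

    reports_speed is an assumed per-detector flag: False for single loops
    that were never configured to calculate speed.
    """
    if reading.volume > 0 and reading.speed == 0:
        # Vehicles were counted but no speed came back. That is suspicious
        # only if the detector is expected to report speed at all.
        return not reports_speed
    return True


dual_loop = Reading(detector_id=1, volume=12, speed=0.0)
single_loop = Reading(detector_id=2, volume=12, speed=0.0)

print(passes_screening(dual_loop, reports_speed=True))     # False
print(passes_screening(single_loop, reports_speed=False))  # True
```

Without the per-detector flag, both readings would be marked bad, which is the over-rejection the team found.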
ICDN: Has that issue been resolved at this point?
McGhee: Partially. As a result of digging into this issue, we created new tools that allow users to essentially say, "tell me what the data quality looks like for this detector or corridor or corridor section over time." You can create plots of both data quality and data availability, so you know whether or not a detector was reporting data and how much of that data passed the screening test. That gives you a pretty good indication of how much faith you should put into the data that you're getting out of a query at that particular location.
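One plausible way to compute the two measures mentioned here is sketched below. The definitions are assumptions based on the description in the interview (availability as the share of expected intervals that produced any data, quality as the share of received readings that passed screening), and the function name and sample counts are made up.

```python
def availability_and_quality(expected_intervals, received, passed):
    """Return (availability %, quality %) for one detector over a period."""
    if expected_intervals == 0:
        availability = 0.0
    else:
        availability = 100.0 * received / expected_intervals
    quality = 100.0 * passed / received if received else 0.0
    return availability, quality


# A day of 20-second intervals is 86400 / 20 = 4320 expected readings.
availability, quality = availability_and_quality(
    expected_intervals=4320, received=3888, passed=2916
)
print(availability, quality)  # 90.0 75.0
```

Plotting these two percentages per day for a detector or corridor section would give the kind of over-time view described above.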
ICDN: What kind of data quality are you seeing now that you've taken single loops into account?
McGhee: I can't answer that off the top of my head because it fluctuates, but it's definitely much better. We now count data from these single loops as good data, although we don’t have speed information for them. We’re also in the midst of a transition at the moment: last week, Phase II of the Hampton Roads Smart Traffic Center came on-line, so now all of a sudden we have a whole bunch of new detectors. The Smart Traffic Center staff has also been great about including us in discussions to determine which locations are the most critical, and they're actually working to install some new detectors at places where they have lost detection altogether due to construction or communication failures.
ICDN: Are those new loop detectors?
McGhee: They will be RTMS (radar) detectors, which are used for all of the Phase II detection in Hampton Roads. They're trying to get out of the pavement as much as possible.
ICDN: Will the upcoming "Build 3" impact your data quality?
McGhee: A big part of Build 3 involves going back and computing speeds for the single loops. Right now, when you query the database, some of the speeds that you see are low because the algorithm is still including those zeros in the computation of average speed values. We didn't want to change the software knowing that we were going to compute speeds for those locations.
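A small worked example shows why zero placeholders drag the average down. The speed values are made up, and treating every zero as "no speed computed" is an assumption for illustration; in real data a zero could also mean stopped traffic.

```python
# Made-up readings: 0.0 stands for a single loop with no computed speed.
speeds = [58.0, 0.0, 62.0, 0.0, 60.0]

# Average with the zero placeholders included.
average_with_zeros = sum(speeds) / len(speeds)

# Average over only the readings that actually carry a speed.
reported = [s for s in speeds if s > 0]
average_reported_only = sum(reported) / len(reported)

print(average_with_zeros)     # 36.0
print(average_reported_only)  # 60.0
```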
ICDN: It sounds like you have to pay a lot of attention when you get a new data source to understanding exactly where it's coming from and what it's showing, right?
McGhee: Absolutely -- it's not just a matter of plunking it down and making it available. We definitely have fields in the database that characterize the source of the data, and of course every complete new source -- like the Norfolk signal data that will be part of Build 3 -- becomes its own data set.
ICDN: In your presentation in Houston, you talked about "imputing data," and I'm embarrassed to say that I don't exactly know what you mean by that.
McGhee: Invariably, in any given data set you're going to have values that are missing or bad. They could be missing because you had a temporary comm. drop or because something happened in your controller cabinet -- there could be all sorts of reasons why. You could have a detector that reports for only two of three 20-second intervals, and you don't want to lose the whole minute's worth of data, nor do you want to lose the whole day because you missed a minute. So it becomes useful to fill in that missing data through imputation. "Imputation" is just a fancy word, but all it means is that you are filling in the data through some sort of logical process. We currently do it using historic averages. If the average volume at detector 121 Monday at 3:22 p.m. is "x" and if we're missing that particular reading, we plug that [historical information] in there. There are certainly more intelligent ways to do imputation that take into account what's currently happening at upstream or downstream locations. We're working on putting some of the more advanced or complex methods of imputation into place.
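Imputation from historical averages, as described here, could be sketched along these lines. This is a simplified illustration rather than the ADMS Virginia implementation: the function names, the (detector, weekday, time) key, and the sample volumes are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def build_historical_averages(history):
    """Average past volumes per (detector_id, weekday, time_of_day) key.

    history is an iterable of (detector_id, weekday, time_of_day, volume).
    """
    buckets = defaultdict(list)
    for detector_id, weekday, time_of_day, volume in history:
        buckets[(detector_id, weekday, time_of_day)].append(volume)
    return {key: mean(values) for key, values in buckets.items()}


def fill_missing(key, observed, averages):
    """Return the observed value, or the historical average if it is missing.

    Returns None when the reading is missing and no history exists for the key.
    """
    if observed is not None:
        return observed
    return averages.get(key)


# Made-up history for detector 121 on Mondays at 3:22 p.m.
history = [
    (121, "Mon", "15:22", 30),
    (121, "Mon", "15:22", 34),
    (121, "Mon", "15:22", 32),
]
averages = build_historical_averages(history)

print(fill_missing((121, "Mon", "15:22"), None, averages))  # 32
print(fill_missing((121, "Mon", "15:22"), 41, averages))    # 41
```

A more advanced method, of the kind McGhee mentions, would also look at current readings from upstream or downstream detectors instead of relying on history alone.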
ICDN: I presume that's quite an active research area.
McGhee: Yes. However, some people complain about imputation, saying that you somehow invalidate your data set if you make up data, so to speak. Many applications can't run on an incomplete data set, so there's real value in somehow completing it. We mark imputed data in our database, which gives you the option of querying just the received data or the received plus imputed data, so you always know if imputed data has been used in your query.
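Marking imputed values makes the two query modes straightforward. The sketch below uses a plain list of dictionaries in place of a database, and the `imputed` field name and `query_volumes` function are hypothetical.

```python
# Made-up records; "imputed" marks values that were filled in rather than received.
records = [
    {"detector_id": 121, "volume": 31, "imputed": False},
    {"detector_id": 121, "volume": 32, "imputed": True},
    {"detector_id": 121, "volume": 35, "imputed": False},
]


def query_volumes(records, include_imputed):
    """Return volumes, optionally leaving out any that were imputed."""
    return [
        r["volume"] for r in records
        if include_imputed or not r["imputed"]
    ]


print(query_volumes(records, include_imputed=False))  # [31, 35]
print(query_volumes(records, include_imputed=True))   # [31, 32, 35]
```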
ICDN: Does faulty data usually imply that you're going to substitute imputed data for that location?
McGhee: We can't impute if too much data is bad or missing. It's too big of a risk. Obviously, there's also the risk that you're going to tamp out any variation in your data, which would invalidate analyses that use standard deviation, for example. TTI's buffer index, or anything like it that relies on the actual variation in the data, could be impacted if you impute too freely.
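The damping effect is easy to see with made-up numbers. In this sketch, two readings from a series are treated as missing and replaced with the historical average (assumed here to be 50), and the standard deviation drops sharply.

```python
from statistics import pstdev

# Made-up volumes as they would have been received.
complete = [20, 35, 50, 65, 80]

# The same series if the first and last readings were missing and
# filled in with an assumed historical average of 50.
with_imputation = [50, 35, 50, 65, 50]

spread_complete = pstdev(complete)
spread_imputed = pstdev(with_imputation)

print(round(spread_complete, 1))  # 21.2
print(round(spread_imputed, 1))   # 9.5
```

Any reliability measure built on the spread of the data would shrink in the same way.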
ICDN: Does ADMS currently calculate the buffer index?
McGhee: Right now we're not reporting a buffer index, but certainly we would like to report some sort of performance measure that gets at reliability and variability. We spend a lot of time talking about what kind of performance measures we want to provide: what's the most useful to agencies like VDOT or the planning district commission, and what's useful to other folks in general.
ICDN: I understand that Build 3 will provide something called the "normality index." What's that about?
McGhee: That gets at the idea that our database is full of data that looks a whole lot like the rest of the data. A typical Monday looks almost the same as every other Monday -- there's very little difference between what would be considered normal unless some sort of event occurs. So the idea is that if we could somehow figure out what "normal" is -- which is not a simple task, obviously -- then we could just run a tally of how many of those we see. How many Mondays at 3:22:20 look normal? Any exceptions to that would be stored in an exception table. That makes querying a lot faster because you can reduce the size of your database. As opposed to storing every single reading, you really only record a tally of how many normals you saw plus any exceptions to normal.
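The tally-plus-exceptions idea could look something like the sketch below. The interview does not say how "normal" is decided, so the test used here (within a fixed tolerance of a typical value) is purely an assumption, as are the function names and sample readings.

```python
from collections import Counter


def compress(readings, is_normal):
    """Split readings into a tally of normal ones and a list of exceptions.

    readings is an iterable of (slot, value), where slot identifies the
    weekday and time of day. is_normal is a callable(slot, value) -> bool.
    """
    normal_tally = Counter()
    exceptions = []
    for slot, value in readings:
        if is_normal(slot, value):
            normal_tally[slot] += 1             # count it; do not store the value
        else:
            exceptions.append((slot, value))    # keep the unusual reading itself
    return normal_tally, exceptions


# Assumed definition of normal: within 5 vehicles of a typical volume.
typical = {("Mon", "15:22:20"): 32}


def within_tolerance(slot, value, tolerance=5):
    return abs(value - typical[slot]) <= tolerance


slot = ("Mon", "15:22:20")
readings = [(slot, 30), (slot, 33), (slot, 31), (slot, 12), (slot, 34)]

tally, exceptions = compress(readings, within_tolerance)
print(tally[slot])   # 4
print(exceptions)    # [(('Mon', '15:22:20'), 12)]
```

Five stored readings become one counter and one exception row, which is where the reduction in database size would come from.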
ICDN: Information about what’s abnormal could also be useful for traveler information, couldn’t it?
McGhee: Absolutely. As opposed to looking for incidents, you're really only looking for anything that varies from normal. However, that's not an easy thing to do, because it's a big task figuring out what "normal" is. Then there are questions like: Does normal change seasonally? How do you know when you've started to drift from normal to a new normal? All of those sorts of questions come into play.
ICDN: Presumably you've answered some of those questions for the "normality index" in Build 3, right?
McGhee: The normality index will be computed and available in Build 3. It's an indication from 0 to 1 of how normal a reading is based on historical information.
ICDN: Is the data in ADMS exclusively freeway data?
McGhee: No. The largest percentage of data is freeway data from VDOT, but Build 3 adds data from the city of Norfolk’s signal system. As we take the system to Northern Virginia, the arterial data will be a larger part of the system as we work to incorporate data from the large signal system in VDOT’s NOVA district.
ICDN: Is the transit data you're adding in Build 3 derived from real-time vehicle locations using GPS? It would be interesting to correlate bus locations with normal traffic conditions.
McGhee: The transit agency, Hampton Roads Transit, doesn't currently have that capability with their fleet. We basically overlay current traffic conditions onto their static routes, so that they can see where they might have problems adhering to their schedules. They can also quickly see incidents that have occurred on a route, for example. At this point we're providing a visual representation of data for them. ADMS doesn't do any analysis for them, because all we have is static route information.
ICDN: Do any other lessons-learned come to mind that we haven't talked about?
McGhee: The build approach was key for getting the stakeholders involved and thinking about how they might use the data.
ICDN: It not only gets them involved, it also gets their buy-in, doesn't it? You get them involved in an early build, and they say, "Well, we wish you had something like this." I guess that means that you need to keep the next couple of builds a little loose, so you can add things that users want, right?
Thompson: One of the key things with this approach is the flexibility of how you can help shape it to meet the needs of the stakeholders. You have to have some boundaries, but the build approach allows you to do that. Obviously, if the stakeholders help design it they'll have the buy-in and will use it, which is basically the objective of the system.
McGhee: Absolutely. The coordination has been really important for us, not only in identifying what tools the stakeholders needed, but also in identifying additional data sets. The addition of the data from VDOT’s continuous count stations (TMS data) is due primarily to the active involvement of our stakeholders. Obviously, the data quality and formatting issues also couldn't have been resolved without active coordination.
Catherine McGhee can be reached at Cathy.McGhee@VirginiaDOT.org
Dale Thompson can be reached at Dale.Thompson@fhwa.dot.gov
For More Information