Internationales Verkehrswesen
iv
0020-9511
expert verlag Tübingen
10.24053/IV-2017-0110
51
2017
69Collection
Using GPS technology for demand data collection
Jakob Baum
Enrico Howe
Travel demand data is a necessary basis for urban mobility planning, but especially in developing and emerging economies data availability is often weak or non-existing. The Global Positioning System (GPS) technology offers a cheap alternative for data collection to traditional diary or survey methods. This article elaborates on advantages and disadvantages of the approach. Also different aspects of the post-processing of GPS data in order to determine trips, mode choice and trip purposes are discussed. In practice, GIZ collects first experiences with the methodology in four Ukrainian cities.
International Transportation (69) 1 | 2017 30 BEST PRACTICE Data Tracking

Using GPS technology for demand data collection
Introduction to opportunities and challenges of the methodology in developing and emerging economies

Keywords: Tracking, travel demand, data collection, GPS, smartphone, Ukraine

The goal of every transport planner is to meet citizens’ demand for mobility. To provide a reliable, safe and efficient transport system, planners can choose from a set of measures, ranging from infrastructure measures that enhance the system to travel demand management measures such as congestion pricing and many more.

Data necessities for transport system planning

Faced with such an array of possibilities, the planner needs to evaluate and appraise the options and their impact on the system, its users and society as a whole. This is done through various appraisal methods, including cost-benefit analyses that are based on models. To run these models, three types of data are needed:

• Existing network and prices - Especially in developing and emerging economies, data availability poses a serious challenge: while network information is often freely available through OpenStreetMap (given precautions in quality control), travel demand data and behavioural models are very rare.
• Behavioural models - Behavioural models determine how a person travels under given circumstances. Most modelling software on the market ships with a built-in behavioural model that is applied in projects all over the world. Such a generic model fails to account for cultural and local characteristics of travel decisions. For example, the cost of travel time varies widely between countries, age groups and sexes [1, Ch. 5.2.7].
• Travel demand - This paper gives an introduction to some aspects of GPS tracking as a travel demand data collection methodology. The most common travel demand format is an Origin-Destination matrix (OD-matrix), which describes the number of people travelling between different places with a given mode of transport.

Classical approach to data collection

Since the rise of transportation science in the 1950s, transport models have been based on data obtained from household surveys and travel diaries [2]. In these paper- or telephone-based collection formats, participants report information on every trip made on a given day. Using detailed questionnaires, this method produces a dataset with relatively high information density. Representative surveys are comparatively easy to conduct, but expensive. Moreover, their reliability is often very weak because they rely on self-reported information: the high manual workload for participants may result in low-quality information on trips as well as on their distances and durations. Unintentional misreporting, especially of multimodal trips and short trips on foot, is a further challenge. The costs of a representative, traditional data collection are often close to prohibitive for cities in developing and emerging economies.

To overcome the reliability problems of traditional data collection, transport researchers and practitioners started to add supplemental technology to their methodological designs.
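To make the Origin-Destination matrix named above as the most common travel demand format more concrete, the following minimal sketch counts trips between zones for one mode. The zone names, the trip records and the function `od_matrix` are purely illustrative assumptions and do not come from any survey described in this article.

```python
from collections import Counter

# Hypothetical trip records: (origin zone, destination zone, mode).
trips = [
    ("A", "B", "car"),
    ("A", "B", "car"),
    ("A", "C", "bus"),
    ("B", "A", "car"),
    ("C", "A", "bus"),
]


def od_matrix(trips, mode):
    """Count trips between each origin-destination pair for one mode."""
    counts = Counter((o, d) for o, d, m in trips if m == mode)
    # Use every zone that appears as origin or destination, in sorted order.
    zones = sorted({z for o, d, _ in trips for z in (o, d)})
    # Rows are origins, columns are destinations.
    return zones, [[counts[(o, d)] for d in zones] for o in zones]


zones, car_matrix = od_matrix(trips, "car")
for origin, row in zip(zones, car_matrix):
    print(origin, row)
```

Each row of the result holds the trips leaving one zone, each column the trips arriving in one zone. In practice the counts from a sample would still have to be weighted to represent the whole population.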
Global Positioning System (GPS)-based data collection

GPS technology was first used in 1997 in the context of a mobility survey, in order to automate data collection and to enhance its robustness ([3] as cited by [4, p. 34]). This first study already showed a main advantage of GPS tracking over self-reported travel diaries: the share of short trips in this survey was higher than in previous, traditional surveys. Many short trips had simply not been included in household surveys or travel diaries - probably because participants do not perceive them as ‘real’ trips in the first place, or because filling out a diary entry for a short trip is not perceived as necessary. An overview of advantages and disadvantages is given in table 1.

Table 1: Travel Diary vs. GPS-based Data Collection (own work, based on [4])

Travel Diary or Survey                    | GPS-based
- low reliability                         | + high reliability on trip times and durations
- no information on chosen travel route   | + travel route information
- low scalability                         | + easily scalable
- high collection cost                    | + low collection cost
+ information on chosen mode              | + inter-modality detected with higher accuracy
+ information on trip purposes            | - complex post-processing needed:
                                          |   - no direct information on used mode
                                          |   - no direct information on trip purpose

Initially, GPS-based data collection mostly took place with special in-vehicle or handheld devices, which guarantee high data accuracy. They are easy to handle, as participants simply have to carry them without any further interaction. Because their range of functions is small, energy consumption is limited, even though the capture frequency is high and the generated tracks are usually very accurate.
Downsides are the burden on participants of carrying an additional device (which may lead to unrecorded trips if it is intentionally or unintentionally left at home) as well as the costs of acquiring the devices.

Smartphones: the omnipresent data-collection device

With the spread of smartphones, their potential as personal tracking devices has become tremendous. Because participants need not carry an additional device and no further hardware costs arise, the barriers to using them are very low. The strongest arguments for the use of smartphones are therefore the ease of recruiting participants and the scalability through simple software distribution.

However, today’s smartphone batteries are not designed for constant tracking applications. For this reason, existing tracking software often reduces the frequency of GPS tracking points and the number of requested satellites. As a result, track quality decreases substantially compared to specialized handheld devices. In addition, a low battery level, or the fear of it, can lead to interruptions of tracking.

Another challenge for GPS tracking is the loss of signal. GPS devices need a constant line of sight to multiple satellites. Tracking underground or in an urban environment with many skyscrapers is therefore difficult. Smartphones have the advantage that they can use auxiliary signals such as Wi-Fi, mobile network or Bluetooth to locate their user.

A general challenge is the misrepresentation of population subgroups in the sample. This bias may be reinforced by using smartphones as the tracking device, since smartphone ownership is expected to vary between population groups. Especially in many developing countries, a large share of the population does not own a smartphone. Motivation to participate also varies between socio-demographic groups. This has to be accounted for in the study design, e.g.
through accompanying traditional interviews, targeted recruiting of underrepresented groups, supply of smartphones and, finally, weighting in the data analysis (see table 2). Using data from the network provider (Call Detail Records) instead may also be an option - however, that comes with its own challenges.

Why and how did you move? The need to post-process GPS data

A GPS track has information about location, elevation and time attached to each data point. From this, speed, heading and acceleration can easily be derived. A higher frequency of data point collection leads to a higher resolution of location, speed and acceleration data (see figure 1). At trip level, the track gives valuable and highly accurate insights into route choice, trip distance and duration. However, it does not directly reveal the modes used or the trip purposes. Therefore, post-processing algorithms have been developed that derive this information from the tracks.

Identifying Trips

A raw GPS track consists of an array of points. To divide those points into trips and stops, a set of criteria is applied, as Gong et al. show in their literature review [5]: most researchers define that a trip has ended and a stop has been reached when a certain “dwell” time has been reached. The threshold is usually set between 120 and 300 seconds. Further common criteria are the change of location, the heading and the density of track points. To account for the different segments of a multimodal trip, the threshold can be decreased: more segments are detected, analysed and merged again afterwards [6, p. 324]. The main challenges arise from signal loss and inaccuracy. However, existing algorithms are able to separate up to 98 % of trips and stops correctly [7].
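The dwell-time criterion described above can be sketched as follows. This is a simplified illustration, not one of the algorithms reviewed in [5]: the 50 m stop radius, the function names and the sample track are assumptions, and only the 120-second threshold is taken from the range quoted in the text.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def split_trips(points, dwell_s=120, radius_m=50.0):
    """Split a time-sorted list of (t_seconds, lat, lon) points into trips.

    A stop is assumed when consecutive points stay within radius_m of the
    first point of the run for at least dwell_s seconds (both thresholds
    are illustrative assumptions).
    """
    trips, current = [], []
    i, n = 0, len(points)
    while i < n:
        # Extend the run while the following points stay near points[i].
        j = i
        while j + 1 < n and haversine_m(points[i][1], points[i][2],
                                        points[j + 1][1], points[j + 1][2]) <= radius_m:
            j += 1
        if points[j][0] - points[i][0] >= dwell_s:
            # Stop detected: the current trip ends where the stop begins.
            current.append(points[i])
            if len(current) > 1:
                trips.append(current)
            # The next trip starts at the last point of the stop.
            current = [points[j]]
            i = j + 1
        else:
            current.append(points[i])
            i += 1
    if len(current) > 1:
        trips.append(current)
    return trips


# Hypothetical track: movement, a 180-second stop, then movement again.
track = [
    (0, 50.000, 30.0), (30, 50.002, 30.0), (60, 50.004, 30.0),
    (90, 50.006, 30.0), (150, 50.006, 30.0), (210, 50.006, 30.0),
    (270, 50.006, 30.0), (300, 50.008, 30.0), (330, 50.010, 30.0),
]
trips = split_trips(track)
print(len(trips), "trips found")
```

Lowering `dwell_s` produces more segments, which corresponds to the approach mentioned above for multimodal trips: short segments are detected first and merged again afterwards. Real tracks additionally require handling of signal gaps and position noise, which this sketch ignores.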
Table 2: Handheld Tracking Devices vs. Smartphones (own work, based on [4])

Handheld Tracking Devices                       | Smartphones
+ easy to handle for participants               | +/- knowledge about smartphones required
+ long battery lifetime                         | - short battery lifetime
+ high tracking quality                         | - lower tracking quality
- need to carry additional device               | +/- personal availability of smartphones varies largely between countries (smartphone penetration rate)
- higher cost of provision                      | + lower cost of provision
- tracking capability in challenging            | + Wi-Fi and other signals can be used in challenging environments
  environments (high building density,          |
  underground)                                  |
                                                | - bias in sample / ownership

Figure 1: Example of an array of three GPS points stored in a GPX file

    [...]
    <trkpt lat="46.57638889" lon="8.89302778"><ele>2374</ele><time>2017-01-14T10:13:20Z</time></trkpt>
    <trkpt lat="46.57652778" lon="8.89322222"><ele>2375</ele><time>2017-01-14T10:13:48Z</time></trkpt>
    <trkpt lat="46.57661111" lon="8.89344444"><ele>2376</ele><time>2017-01-14T10:14:08Z</time></trkpt>
    [...]

AT A GLANCE

Knowledge and data about travel demand are necessary foundations of urban transport planning. In developing and emerging economies in particular, there is often no knowledge of the population’s movement patterns. The Global Positioning System (GPS) offers a low-cost alternative to traditional data collection methods such as household surveys and travel diaries. This article covers the advantages and disadvantages of the method as well as some aspects of the post-processing of GPS data that is needed to determine trips, transport modes and trip purposes. Currently available methods generally have to draw on spatial data to achieve valid results. Transport modes can then already be determined relatively well. Determining trip purposes remains a major challenge. GIZ is now gathering first experience with GPS data collection in four Ukrainian cities.
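As an illustration of how speed and heading can be derived from track points, the sketch below parses the three points of Figure 1 and computes both values for each pair of consecutive points. Wrapping the points in a `<trkseg>` element, the helper names and the use of the haversine formula are assumptions made for this example; elevation differences are ignored.

```python
import math
import xml.etree.ElementTree as ET
from datetime import datetime

# The three points of Figure 1, wrapped in a <trkseg> element so that the
# fragment is well-formed XML. Real GPX files also declare an XML namespace,
# in which case the tag names below would need the namespace prefix.
GPX_FRAGMENT = """<trkseg>
<trkpt lat="46.57638889" lon="8.89302778"><ele>2374</ele><time>2017-01-14T10:13:20Z</time></trkpt>
<trkpt lat="46.57652778" lon="8.89322222"><ele>2375</ele><time>2017-01-14T10:13:48Z</time></trkpt>
<trkpt lat="46.57661111" lon="8.89344444"><ele>2376</ele><time>2017-01-14T10:14:08Z</time></trkpt>
</trkseg>"""

EARTH_RADIUS_M = 6371000.0


def parse_points(xml_text):
    """Return a list of (time, lat, lon) tuples from <trkpt> elements."""
    points = []
    for trkpt in ET.fromstring(xml_text).iter("trkpt"):
        time = datetime.strptime(trkpt.findtext("time"), "%Y-%m-%dT%H:%M:%SZ")
        points.append((time, float(trkpt.get("lat")), float(trkpt.get("lon"))))
    return points


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def heading_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2 in degrees (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360) % 360


points = parse_points(GPX_FRAGMENT)
speeds_kmh, headings = [], []
for (t1, lat1, lon1), (t2, lat2, lon2) in zip(points, points[1:]):
    seconds = (t2 - t1).total_seconds()
    speeds_kmh.append(distance_m(lat1, lon1, lat2, lon2) / seconds * 3.6)
    headings.append(heading_deg(lat1, lon1, lat2, lon2))

for speed, heading in zip(speeds_kmh, headings):
    print(f"{speed:.1f} km/h, heading {heading:.0f} degrees")
```

For the points of Figure 1 this yields speeds in the range of a slow walk and a north-easterly heading. Acceleration could be obtained in the same way from the difference between consecutive speeds, although with sampling intervals of 20 to 30 seconds the result would be coarse.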
Identifying Mode Choice

After trips have been extracted, the mode used has to be identified for each trip. In an urban context, walking, cycling, car and different forms of public transport are the most commonly distinguished modes. Modes can be distinguished with different approaches such as machine learning, probabilistic methods, criteria-based algorithms or a mix of these. Criteria-based (or rule-based) methods often work with speed patterns alongside other criteria such as transit network data. Some modes have distinctive speed and acceleration profiles. The typical speed pattern of a car can easily be differentiated from that of a pedestrian or a cyclist (see figure 2). The probability that an algorithm detects the right mode is very high in this case.

Distinguishing public transport use from car use is more complex, since the speeds of the two modes are similar. Regular stops are a potential way to identify public transit - but how to differentiate between cars and buses stuck in traffic? To tackle this issue, many algorithms rely on spatial data as a second source of information. By matching the GPS track with road networks as well as public transport routes and schedules, the algorithms can e.g. detect whether a person followed a bus route and stopped close to public transport stations, which makes it very likely that transit was used. Open-source solutions such as the OpenStreetMap public transport database provide important data such as routes, stops and public transport modes. This data set varies globally in accuracy and information density, but can be used as a solid input source for public transport recognition. If necessary, local planners can edit the OpenStreetMap database. More sophisticated algorithms are able to detect the mode of more than 90 % of trips correctly [6, p. 325ff.].
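A criteria-based classifier of the kind described above might look like the following minimal sketch. All speed thresholds, the 80 % share of stops near transit stations and the function name `classify_mode` are illustrative assumptions rather than values from the cited literature; a real implementation would calibrate them locally and match the track against actual network data.

```python
from statistics import median


def classify_mode(speeds_kmh, stops_near_transit=0, total_stops=0):
    """Guess the mode of one trip segment from its speed profile.

    speeds_kmh: speeds between consecutive GPS points of the segment.
    stops_near_transit / total_stops: how many of the intermediate stops
    lie close to a known public transport station (assumed to come from
    a separate spatial matching step, e.g. against OpenStreetMap data).
    """
    typical = median(speeds_kmh)
    top = max(speeds_kmh)
    # Assumed thresholds: pedestrians rarely exceed 10 km/h.
    if top < 10 and typical < 7:
        return "walk"
    # Assumed thresholds: cyclists rarely exceed 35 km/h.
    if top < 35 and typical < 25:
        return "bike"
    # Car and bus speeds are similar, so spatial evidence decides.
    if total_stops > 0 and stops_near_transit / total_stops >= 0.8:
        return "public transport"
    return "car"


print(classify_mode([4, 5, 5, 6]))
print(classify_mode([12, 18, 20, 22]))
print(classify_mode([0, 30, 45, 50, 0, 40], stops_near_transit=4, total_stops=5))
print(classify_mode([0, 30, 45, 50, 0, 40], stops_near_transit=0, total_stops=5))
```

The last two calls use the same speed profile and differ only in the spatial evidence, which mirrors the point made above: speed alone cannot separate a bus from a car stuck in traffic. Because single noisy GPS points can inflate the maximum speed, a robust variant would use a high percentile instead of `max`.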
However, the described limitations in mode detection show that actively involving users in tasks like track validation or transport mode editing can help to obtain adequate data sets (see figure 3).

Identifying Trip Purpose

Next to the transport mode choice, the identification of trip purposes - the reason why trips were made - is a major issue in creating a reliable data basis for transport system planning. Common categories are home, job and leisure trips. To detect a trip’s purpose, the GPS data is combined with spatial data that includes land uses (e.g. residential, industrial) and points of interest (e.g. restaurants, shops). If a trip starts in a residential area and ends at a school, chances are very high that it was a home-school or home-work trip. By adding some personal data, e.g. the home and work address of a person, the predictions can be made even more precise. However, the accuracy of automated trip purpose detection is not yet as high as that of mode detection, and it is therefore much less applied [4, p. 48]. With very high data quality, an accuracy of just above 70 % has been reached, while many studies are in the range of 40-60 % ([8] and [9], cited by [6, p. 328f.]). An overview of the three steps of post-processing is given in figure 5.

Figure 2: Speed Patterns of Different Modes (speed in km/h over time for walking, cycling, bus, waiting and car)
Figure 3: A GPS-Track of a Person using different Modes (Graphic: Enrico Howe, modalyzer)
Figure 4: Ukrainian Cities in which GPS Tracking Technology is used by GIZ

Best Practice: using GPS tracking in Ukraine

The Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) supports local governments in shaping sustainable mobility systems worldwide.
In Ukraine, the development agency GIZ cooperates with modalyzer, a smartphone-based traffic data collection app developed by the Innovation Centre for Mobility and Societal Change (InnoZ). It is currently being used in four cities within the project “Integrated Urban Development in Ukraine” to gain a better understanding of mobility demand and to identify the travel patterns and behaviour of the population. The generated data will ultimately be used for the development of integrated urban mobility concepts. Modalyzer has been adapted to Ukrainian needs and automatically identifies nine transportation modes and transit types. Further modes can be added manually - e.g. users can specify and edit trips by tagging them with marshrutka, a local form of minibus service. Within the first three months, more than 1,000 participants recorded over 140,000 km combined. The collection period will last five months. Individual users of modalyzer can benefit from the app by monitoring their travel patterns through user-friendly diagrams and tables, including information on their CO2 footprint and the kilometres travelled by each mode.

Verifying GPS-tracking: Prompted Recall Surveys

As elaborated above, GPS post-processing algorithms are already capable of revealing information on trip mode and purpose to a large extent. To verify these findings or to gather further information, e.g. on the reasons why a person chooses a certain mode, so-called Prompted Recall Surveys (PRS) can be used. This slightly increases the burden on the participant, but ensures a higher prediction quality and may also be used to enhance the algorithms. PRS has also been used in the Ukrainian case: modalyzer presents the result of the tracking to the user at the end of each day, and the user ensures its quality by verifying the detected trips.

Conclusion

GPS tracking technology has become usable for travel demand data collection.
It is cheaper and, when used appropriately, more detailed than traditional survey methods. However, in order to reach high accuracy in detecting trips, modes and trip purposes, high-quality spatial data and public transportation data are needed. Particularly in cities of the Global South, data availability on informal transit is often low. The Ukrainian example has shown that implementing Prompted Recall Surveys in a data collection application for smartphones is a viable option to overcome the issue of data availability and to enhance data reliability. Limited smartphone ownership is a remaining challenge.

REFERENCES

[1] T. Litman, “Transportation Cost and Benefit Analysis: Techniques, Estimates and Implications.” Victoria Transport Policy Institute, 2009.
[2] K. W. Axhausen, “Draft Travel Diaries: An Annotated Catalogue 2nd Edition,” ResearchGate, 1995.
[3] Battelle Transport Division, “Lexington Area Travel Data Collection Test, Final report.” 1997.
[4] M. Schönau, “GPS-basierte Studien zur Analyse der nachhaltigen urbanen Individualmobilität,” Dissertation, Universität Ulm, 2016.
[5] L. Gong, T. Morikawa, T. Yamamoto, and H. Sato, “Deriving Personal Trip Data from GPS Data: A Literature Review on the Existing Methodologies,” Procedia - Soc. Behav. Sci., vol. 138, pp. 557-565, Jul. 2014.
[6] L. Shen and P. R. Stopher, “Review of GPS Travel Survey and GPS Data-Processing Methods,” Transp. Rev., vol. 34, no. 3, pp. 316-334, May 2014.
[7] H. Gong, C. Chen, E. Bialostozky, and C. T. Lawson, “A GPS/GIS method for travel mode detection in New York City,” Comput. Environ. Urban Syst., vol. 36, no. 2, pp. 131-139, Mar. 2012.
[8] W. Bohte and K. Maat, “Deriving and validating trip purposes and travel modes for multi-day GPS-based travel surveys: A large-scale application in the Netherlands,” Transp. Res. Part C Emerg. Technol., vol. 17, no. 3, pp. 285-297, Jun. 2009.
[9] P. T. McGowen and M. G.
McNally, “Evaluating the Potential To Predict Activity Types from GPS and GIS Data,” presented at the Transportation Research Board 86th Annual Meeting, 2007.

Jakob Baum, M.Sc.
Transport Policy Advisor, Deutsche Gesellschaft für Internationale Zusammenarbeit GmbH (GIZ), Bonn (DE)
jakob.baum@giz.de

Enrico Howe, M.Sc.
Expert, Innovationszentrum für Mobilität und gesellschaftlichen Wandel GmbH (InnoZ), Berlin (DE)
enrico.howe@innoz.de

Figure 5: Overview on one Day of Tracking (post-processing steps: identify trips, detect modes, detect purpose)
