EFFECT OF TRAFFIC INFORMATION ON TRAVEL TIME OF MEDIUM-DISTANCE TRIPS: A CASE STUDY

In populated cities with high traffic congestion, traffic information may play a key role in choosing the fastest route between origins and destinations, thus saving travel time. Several research studies investigated the effect of traffic information on travel time. However, little attention has been given to the effect of traffic information on travel time according to trip distance. This paper aims to investigate the relation between real-time traffic information dissemination and travel time reduction for medium-distance trips. To examine this relation, a methodology is applied to compare travel times of two types of vehicle, with and without traffic information, travelling between an origin and a destination employing probe vehicles. A real case study in the metropolitan city of Tehran, the capital of Iran, is applied to test the methodology. There is no significant statistical evidence to prove that traffic information would have a significant impact on travel time reduction in a medium-distance trip according to the case study.


INTRODUCTION
Routing is a decision support system that guides a traveler from a certain origin to a specific destination across a network. Although this system can be different in many aspects, its main goal is to provide travelers with an optimal solution, i.e., to identify the shortest or fastest path to travel from an origin to a destination. Routing methods can be classified into four categories: static and dynamic routing, deterministic and stochastic routing, reflective and forecasting routing, centralized and non-centralized routing. In static routing, historical traffic data is utilized, while in dynamic routing real-time traffic data is used as the system input. Stochastic routing considers the random nature of traffic data, whilst the deterministic approach treats traffic data as fixed variables. Forecasting routing applies models to predict traffic data, while reflective routing employs the current state of a network. Non-centralized routing carries out optimization for each user, whilst centralized routing takes the whole network into consideration [1]. Reliable traffic data is vital for planning efficient routing.
Reliable sources of traffic data may provide travelers and decision-makers with accurate and real-time information enabling them to select appropriate routes to reach their destinations. Many developed countries employ Advanced Travel Information Systems (ATIS) to capture real-time traffic data and disseminate it to travelers using devices such as monitoring cameras, inductive loop detectors, speed cameras, and Bluetooth detectors. Nevertheless, those systems require high capital investment. Recently, other sources of real-time traffic information, such as crowdsourcing, have been applied as emerging cost-effective alternatives. In this regard, smartphone applications have been widely deployed to send and receive information from and to travelers with highly developed systems. Those systems not only require low-cost infrastructures but also deliver information in real time to the city traffic control center, which provides travelers with visual outputs through smartphone applications [2]. However, this alternative is highly dependent on travelers holding smartphones connected to the internet. Practically speaking, the related penetration rate would not be high enough, especially in developing countries.
One of the most cost-effective methods of real-time traffic data collection is to deploy probe vehicles equipped with smartphones or handheld GPS devices while they are running in transportation networks. Parameters such as the location, speed, acceleration, and direction of vehicles are collected and sent to traffic centers. After data cleaning, useful information such as traffic conditions is transmitted to travelers to optimize their routes in terms of travel time, distance, or convenience to reach their destinations [3]. This method has an advantage over the abovementioned ones because it needs neither high capital investment nor travelers holding smartphones connected to the internet.

LITERATURE REVIEW
Different studies have been conducted on the application of real-time traffic data in transportation planning. Saraydar et al. proposed a method called mobility management to use wireless technology in collecting data. Mobility management made it possible to detect users within a wireless network, yielding link flows [4]. Tsui and Shalaby designed a real-time data collection system based on GPS data to detect transportation modes using fuzzy logic. They claimed that the main challenge in real-time data collection was to obtain similar results with limited data from short-distance trips [5]. Yim and Cayford designed a field experiment to compare the efficiency of smartphones and GPS. The results showed that GPS technology was more accurate than technologies based on Base Transceiver Station (BTS) signals for routing and collecting traffic conditions. Likewise, the authors declared that the low accuracy of BTS signals made them inappropriate for traffic condition data collection, especially where routes had complicated geometric characteristics [6]. Guo et al. presented a new approach to traffic data collection called the link condition data collection system. This system received information from smartphones and fleet management systems to estimate link speeds and travel times. This information was applied to assist transportation agencies in transportation planning [7]. Conversely, some researchers focused on the limitations and drawbacks of employing smartphones in traffic data collection [8]. They stated that smartphones captured a high volume of traffic data which needed to be transferred to a data center, resulting in high battery and data usage and threatening user privacy.
Demers et al. employed 200 probe vehicles equipped with GPS to collect traffic data over three months. Seemingly, the main problem of this research was the low number of vehicles equipped with data collection devices compared to the total number of vehicles in the associated transportation network [9]. In a study conducted at the California Innovative Transportation Center, 100 probe vehicles equipped with smartphones were utilized to collect data from 10 miles of a highway in San Francisco. The vehicles' speeds and locations were captured every three seconds and transferred to a data center. Traffic condition data was sent to travelers to help them select their routes effectively [10]. Herrera et al. applied the traffic data collected in the California Innovative Transportation Center to determine traffic conditions based on GPS embedded in the smartphones. They also attempted to evaluate the precision of traffic speed in free-flow and congested traffic conditions. Moreover, they investigated the penetration rate, i.e., the ratio of vehicles equipped with data-transferring devices to all passing vehicles [11]. Guanwan and Chandra examined the averaging time interval, which affected the accuracy of the results of the probe vehicle technique. They found that there was an optimum averaging time interval (i.e., seven seconds) at which the accuracy of traffic velocity was highest [12].
Levinson et al. studied the effects of an information dissemination system on reducing drivers' travel time and vehicle operation costs. Although the system reduces vehicle operation costs, it may increase travel demand. The value of this system is apparent during recurring congestion when traffic flow is close to capacity. However, its value is much more significant during non-recurring congestion [10]. Rouhani and Gao examined the effects of the Advanced Traveler General Information System (ATGIS) on the road network in Fresno, CA, and found that the total travel time in the city could be reduced by 17% (no pre-system perceived costs) to 1% (accurate pre-system perceived costs), and even increased by 1% (higher-than-actual pre-system perceived costs) [13]. Leontiadis et al. evaluated the effect of using a decentralized traffic-based navigation system. They designed a system where vehicles were allowed to reroute by using individually collected traffic information. By performing a case study, they evaluated the possibility of minimizing the travel time of drivers by using a decentralized system. The authors concluded that traffic congestion in a realistic scenario could be reduced by the decentralized approach. They also claimed that monitoring only a subset of streets could cause undesirable results even in a congested street [14].
To sum up, several valuable achievements have been accomplished by researchers in terms of collecting traffic condition information via conventional intelligent transportation systems or affordable devices such as smartphones or GPS. It has been claimed by many researchers that real-time traffic information could assist travelers or related authorities. However, little attention has been paid to how helpful traffic information is with regard to the length of trips, especially over medium distances. Long-distance or inter-city trips are defined with thresholds of 50 to 100 km [15][16], while short-distance trips of less than 5 km can be made on foot or by bicycle [17]. In short-distance trips, traffic information might not have a significant effect on travel time for two reasons: (1) such trips mostly use modes of transportation with separate lanes (walking and biking), which are not largely affected by traffic jams; (2) the short distances between origins and destinations do not allow travel time to decrease significantly by selecting uncongested routes using traffic information. Therefore, it is concluded that there has not been enough research conducted on the significance of real-time traffic data dissemination for travel time in medium-distance trips (from 5 to 50 km, mostly in urban areas).

Experiment design
The selected case study was located in Tehran, the capital of Iran. In a metropolitan city such as Tehran, with a population of around nine million, making it the third largest city in the Middle East, the number of private cars and trips grows faster than the urban transportation network expands. According to the statistics presented by the Tehran Traffic and Transportation Center, almost 45% of daily trips in Tehran are taken by private cars [18]. During peak hours, 80% of trips are taken by private cars and taxis. The average speed of vehicles in Tehran is 26.5 kilometers per hour and the ratio of delay time (difference between free-flow travel time and actual travel time) over mean travel time is 50.7 percent. Almost 29 percent of the network operates at a level of service lower than "E" [18].
The design of the study included three parts: defining response variables and factors, determining the block of data collection, and preparing protocols and a schedule. First, the speed and travel times of probe vehicles were defined as response variables. Using smartphones mounted in the probe vehicles, parameters such as vehicle speed and travel time were captured. After that, parts of the transportation network including an origin, a destination, and routes connecting the origin and destination were selected for data collection, where the origin and destination were at a medium distance from each other. For this purpose, a pilot study was conducted to investigate which part of the Tehran transportation network is

OBJECTIVE AND SCOPE
The main objective of this study was to investigate the statistical significance of disseminating real-time traffic condition information to travelers in reducing their travel time when moving from an origin to a destination on a medium-distance trip. The scope of this research is limited to urban transportation networks. The traffic data collection was performed using probe vehicles equipped with smartphones and cameras running on a medium-distance origin-destination loop under recurring congestion, covering peak and off-peak hours on weekdays and a weekend.

METHODOLOGY
After a thorough literature review, the problem was defined, including a description of the problem context along with the objective and scope. Then, an experiment was designed to collect traffic data over a medium-distance origin-destination pair in the Tehran transportation network. After data collection, data cleaning and data preprocessing were executed to remove noise, inconsistencies, and incomplete records. Finally, the information was analyzed to assess the importance of real-time traffic data dissemination among travelers in medium-distance trips. The flowchart of the research methodology is depicted in Figure 1.

The average traffic condition of the chosen routes (based on the level of service) is "E". The routes may include freeways, highways, and arterial roads of Tehran, which are specified in Table 1 with the associated number of lanes and lengths. The speed limits of arterial roads, highways, and freeways in Tehran are 50, 80, and 100 km/h, respectively. Figure 2 illustrates the origin and destination along with the five routes between them. It should be mentioned that only one OD pair was chosen here as a case study. Of course, a single case study is insufficient to assess the effect of trip distance on travel time. However, this research focuses only on medium-distance trips and proposes a methodology to evaluate the effect of travel time information on OD travel time, which is elaborated using a case study. In fact, more samples with various OD networks and distances in different regions of the city (i.e., CBD and non-CBD areas) over weekdays and weekends in peak and non-peak hours with at least three to five replications with a minimum of 30 samples for

appropriate for traffic data acquisition.
To choose the origin and destination, the following criteria were taken into consideration:
- The origin and destination should be outside the Tehran traffic restricted zone (located in the CBD of Tehran, where only authorized vehicles can enter) to ensure that the traffic pattern is not affected by the zone's movement restrictions on working days.
- The number of feasible routes connecting the origin to the destination should be higher than one to provide opportunities to select a route with the best traffic conditions between the origin and destination.
- The routes should include either highways or arterial roads. Also, these routes should have a wide range of traffic conditions in peak and off-peak hours.
- The origin and destination areas should have sufficient space, such as a parking lot, for probe vehicles to remain organized by the survey coordinator before starting data collection.
- The routes should be equipped with cameras and monitored by the Tehran Traffic Control Center (TTCC) to enable utilizing their captured data to validate the data collected by the probe vehicles.
- The origin and destination should be at a medium distance (between 5 and 50 kilometers) in an urban area.
After various field investigations in the pilot study, the Azadi and Sanat squares, approximately ten kilometers apart in the west part of Tehran, were selected as the origin and destination. Those squares are among the main squares in Tehran, with a high rate of trip generation, and they meet all the criteria mentioned above. The five most common routes between the origin and destination were chosen (among various available routes) based on local cab drivers' opinions.
It should be noted that by selecting just five routes and instructing the drivers to follow one of those routes, other minor routes which may not actually be common alternatives were intentionally neglected, since some of them were so far from each other that switching between them after receiving real-time information did not seem reasonable. other sat in the front seat running the equipment and filling the traffic condition form. Both surveyors rated the traffic condition in such a way that the passenger first wrote his/her judgment down on the form on his/ her own and then asked the driver to express his/her rating about the traffic condition. This approach led to unbiased rating by surveyors. More specifically, if the passenger had asked the driver's rating first, it would have affected the passenger's rating, that is, it might have caused bias in the data. The panel rating was employed to generally validate the data captured by the equipment mounted on the vehicles. In each probe vehicle, two pieces of equipment were employed in the survey, consisting of smartphones and cameras. Smartphones were mounted in the vehicles in a horizontal position (Figure 3b). A proper smartphone application was utilized to record travel time, vehicle location, and speed. The data was captured and stored every one second on the smartphones. Simultaneously, a camera was mounted on each vehicle behind the windscreen to capture the video of the whole survey. The video assisted the panel to verify their qualitative rating of traffic conditions after the survey. After all the arrangements, the initial experiment was executed on different routes to check whether the designed experiment was implementable or not. The initial experiment proved that the routes were properly selected, and the experiment could be successfully executed.

Data collection and preprocessing
The data collection was carried out over three time slots with different traffic conditions so that the collected data would be unbiased and support a valid conclusion. The time slots were off-peak hours on a weekday morning, peak hours on a weekday afternoon, and a weekend. Three to five replications were conducted in each time slot to decrease the error inherent in the collected data. Each route was surveyed in both directions: first from the origin to the destination

each OD are required to examine a significant effect of OD distance on travel time while travelers are using real-time traffic information.

Survey preparation
After designing the study, preparation for the survey was carried out. The preparation included assigning tasks and developing instructions, preparing forms and equipment, and conducting an initial experiment. A panel of 42 surveyors was employed in this survey, categorized in 21 probe vehicles running from the origin to destination, i.e., the Azadi square and Sanat square, respectively, and vice versa. It is noted that the routes from the origin to destination and from the destination to origin are abbreviated as AS and SA, respectively.
An instruction was developed to provide detailed information to the surveyors about the routes and associated segments that each vehicle should pass, equipment that should be mounted on the vehicles, methods of applying it, and the traffic condition form which should be qualitatively completed for each segment. The instruction also included the verbal and visual descriptions of traffic, which assisted the panel to rate the traffic condition of each segment. The descriptions were defined based on the concepts applied by the TTCC in preparing a color-coded map to be able to validate the panel rating. The TTCC subjectively rates the Tehran transportation network according to traffic conditions using a color-code scheme which is associated with the letters A to E (A relates to free flow, while E is a congested traffic condition).
A few discussion sessions were held with the panel to clearly explain the traffic conditions, simulate the situations they encountered in the field by showing videos of various traffic conditions and expressing how to rate them. The panel was to subjectively rate the traffic condition of each segment with the letters A, B, C, D, and E, where A is the perfect traffic condition, while E represents the worst one. Two surveyors were on board in each vehicle (Figure 3a). One was the driver and the travel time for the informed vehicle as compared to the application of travel time prediction models. In the latter approach, a model is fitted to the captured data which inevitably contains errors in travel time prediction. Nevertheless, the proposed method reports the real traffic conditions of the routes that travelers are going to pass. The five routes provide enough flexibility for drivers to decide to reroute based on the best travel time and traffic condition. This concept is illustrated in Figure 4. Concurrently with the data collection, a person in the office was in charge of recording and storing a color-coded qualitative traffic condition map of the selected routes provided by the TTCC. It is of great importance to note that being familiar with a route or being a commuter plays an important role in choosing a route. Sometimes commuters do not even trust or pay attention to real-time traffic information. This becomes even more important for medium-distance trips as compared to long-distance ones. In this study, the drivers' levels of familiarity with the routes were categorized into two groups: the uninformed vehicle which decided based on previous experience, such as that of a commuter, and the informed vehicle that selected the route based only on traffic information, rather than previous knowledge. The drivers were all at the same level of knowledge and familiarity with the routes.
After data collection, data preprocessing was carried out. Incomplete data, outliers, and inconsistencies were investigated. In some cases, smartphones did not work properly, or cameras failed to store data. In those cases, incomplete data was discarded. The box plot method was applied to detect outliers in the data and eliminate them. Moreover, the consistency of the data was checked with regards to units and dimensions; no inconsistencies were detected. and second from the destination to the origin. The data collection took on average three to five hours in different traffic conditions.
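The box plot screening mentioned in the preprocessing step can be sketched as follows. This is only an illustration: the paper does not state the whisker multiplier, so the conventional 1.5 x IQR rule is assumed here, and the function name `remove_boxplot_outliers` and the sample travel times are hypothetical.

```python
from statistics import quantiles


def remove_boxplot_outliers(values, k=1.5):
    """Keep only values inside the box plot fences [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional whisker length (an assumption, not a value
    reported in the paper).
    """
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical travel times in minutes; 100 mimics a faulty record.
travel_times_min = [10, 11, 12, 13, 14, 15, 100]
cleaned = remove_boxplot_outliers(travel_times_min)
```

With these made-up values the quartiles are 11 and 15, so the fences are 5 and 21 and only the 100-minute record is discarded.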
In each survey, five vehicles drove through five different routes (illustrated in Figure 2) from the origin to the destination starting at the same time, and two vehicles began the survey five minutes later. The travel times of all vehicles were captured. One of those two vehicles (running five minutes later than the others) called "the informed vehicle" received real-time traffic conditions of upcoming routes from those five vehicles running earlier to be able to wisely take the one route which most probably had the lowest congestion and shortest travel time. The information received by the informed vehicles covered a spatially sufficient area, i.e., considering the average speed of 50 km/h (assumed based on the observations in this survey), the five lead vehicles were roughly 5 km (almost half of the length of the routes) ahead of the informed vehicle. It seems that real-time data from the intermediate subroutes (sub-routes used when switching from one path to another one) was provided to the informed vehicle. This may result in a wise route choice by the informed vehicle. However, it should be noted that due to the subjectivity of inputs such as traffic congestion reported by the vehicles running earlier, the decision made by the informed vehicle might be susceptible to error in decision-making regarding the best route to take.
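As a quick check of the spacing implied by the five-minute gap, the lead distance is simply speed multiplied by headway. The sketch below uses the 50 km/h average speed and five-minute gap quoted above; the helper name `lead_distance_km` is hypothetical.

```python
def lead_distance_km(speed_kmh, gap_min):
    """Distance covered at a constant speed during the time gap."""
    return speed_kmh * gap_min / 60.0


# Values taken from the survey description: 50 km/h and a 5-minute gap.
lead = lead_distance_km(50, 5)
```

Under the constant-speed assumption this gives about 4.2 km, slightly below the rounded figure of roughly 5 km used in the text.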
Hence, another vehicle, called "the uninformed vehicle", took a route based on the best of the driver's knowledge, i.e., without traffic information on the upcoming routes. The novel idea behind this five-minute gap was to monitor the traffic conditions of all possible routes that the informed vehicle could take. This information was provided by the five probe vehicles and disseminated as future traffic conditions to the informed vehicle. This idea results in a more realistic and accurate future

Figure 4 - Routing based on the best traffic condition and shortest travel time

is meaningful (i.e., negative) only in the peak hours of weekdays, although its magnitude is relatively low. In a case of non-recurring congestion due to an unexpected event (e.g., an accident), there would be a significant effect of real-time traffic information on travel time reduction.
Moreover, it was expected that the travel time of the informed vehicle would be shorter than that of the uninformed vehicle. Almost all cases, especially surveys conducted in peak hours on weekdays, followed this fact, but some replications (such as replication 2, SA on the weekend) violated this rule. The reason is that the traffic data of upcoming routes provided to the informed vehicle would be useful in case of encountering congestion on some routes and uncongested conditions on others or even in non-recurring congestion caused by accidents or special events. In other words, if all routes experience smooth flow, or if they are all fully congested, the knowledge of traffic conditions of upcoming routes does not play a significant role in reducing travel time of the informed vehicle since all choices roughly face the same travel time. On the other hand, if a traveler can pick a route with smoother traffic conditions (using future traffic condition data) among other routes suffering from traffic jams, he/she will effectively decrease his/her total travel time.

RESULTS AND DISCUSSION
Data exploration was executed through computing descriptive statistics (e.g. mean and standard deviation) for speeds and travel times of the informed and uninformed vehicles over different time slots and origin-destination pairs. A sample of data is presented in Table 2.
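A minimal sketch of this kind of data exploration is given below, using only the Python standard library. The grouping keys and all numbers are hypothetical placeholders and are not values from Table 2.

```python
from statistics import mean, stdev


def summarize(samples_by_group):
    """Return sample size, mean, and sample standard deviation per group."""
    return {
        group: {"n": len(vals), "mean": mean(vals), "std": stdev(vals)}
        for group, vals in samples_by_group.items()
    }


# Hypothetical travel times (minutes), keyed by (time slot, vehicle type).
travel_times = {
    ("weekday peak", "informed"): [20, 22, 24],
    ("weekday peak", "uninformed"): [21, 23, 25],
}
summary = summarize(travel_times)
```

The same helper could be applied to speed samples; `stdev` uses the n - 1 denominator, which is the usual choice for a small number of replications.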
As expected, the average speed of vehicles on the weekend is higher than in off-peak hours on weekdays, and the average speed in peak hours on weekdays is the lowest. Travel times follow the same ordering in reverse, i.e., the highest travel time values correspond to peak hours on weekdays. The same pattern can be seen in the panel rating and TTCC outputs. It should be mentioned that the captured videos from the routes were consistent with the traffic conditions rated by the panel and the TTCC.
Similarly, the travel times of different replications are almost the same. On average, the travel time of the direct route from the origin to the destination (AS) is shorter than that of the opposite route (SA), which is due to the longer distance of SA compared to AS. Furthermore, the difference in travel time (informed minus uninformed vehicle travel time value/percentage)
Furthermore, the high standard deviation of speed (in most cases) reflects the variability of speed over the routes. This is more visible on weekdays due to unstable, fluctuating traffic patterns (several stop-and-go and free-flow movements). Moreover, the averages and standard deviations presented in Table 2 were calculated using instantaneous speed (captured almost every 1-2 seconds), which results in a high variability of speed. Nevertheless, this variability would not affect the travel time since the travel time was measured directly (not derived from the instantaneous speed).
The histograms of travel speeds of informed and uninformed vehicles from origin to destination (AS) and vice versa (SA) on weekdays and weekends are depicted in Figure 5. As shown in this figure, no logical pattern can be identified with regard to the significance of traffic condition information for travel time. In other words, it was expected that the histogram of the informed vehicle would be skewed to the left, i.e., most of the captured speed data would be located in the higher speed bins. However, there is no significant evidence to support this expectation. For instance, as shown in Figure 5a, related to the vehicle speed from the Azadi to Sanat squares on weekdays, the frequency of the uninformed vehicle in the speed bin of [90,105] is higher than that of the informed vehicle, and likewise in the opposite direction, as in Figure 5b. The uninformed vehicle frequency in the range of 60 to 95 km/h is similar to or even higher than that of the informed vehicle.
This pattern is even vaguer in the case of weekends (Figures 5c and 5d), which seems logical. That is, in case of lower traffic congestion, i.e., on weekends, using traffic condition information would not significantly affect the travel time since most of the routes between the targeted origin and destination are at a higher (better) level of service, as stated before. Therefore, the same result is derived based on the speed histograms of various replications on the weekend. Figure 6 shows the travel times of informed and uninformed vehicles on weekdays and the weekend from the Sanat to Azadi squares (Figure 6a) and from the Azadi to Sanat squares (Figure 6b). Only in the case of peak hours on weekdays does the pattern seem logical, showing that the travel time of the informed vehicle is shorter than that of the uninformed one. There is no strong evidence from the figures of a significant difference between the informed and uninformed vehicles in either direction (i.e., from the Azadi to Sanat squares and from the Sanat to Azadi squares) in terms of travel times.
Travel time, in fact, is a random variable which should be properly expressed by a probability distribution. Using the travel time data (replications) collected for each vehicle (informed/uninformed) in various traffic conditions, a probability distribution was defined, as schematically illustrated, for instance, in Figure 6a. The area over which the two probability distributions overlap (highlighted in the figure) represents the concept of type I and type II errors, i.e., although the probability distribution mean of the uninformed vehicle's travel time is greater than that of the informed one (in peak hours on working days), in rare cases the situation is completely opposite, meaning that the travel time of the informed vehicle is longer than that of the uninformed vehicle. Therefore, this error is partially due to the inherent randomness of travel time. Figure 6 can only subjectively support the hypothesis of travel time equality of informed and uninformed vehicles, i.e., no evidence can be found to support the claim that traffic information has a significant effect on the travel time of the informed vehicle between the origin and destination. In order to retain or reject the hypothesis, a statistical test is required, namely the two-tailed t-test. This test is applied in the case that the
In this test, the null hypothesis stated the equality of the mean travel times of the informed ($\mu_{\text{info}}$) and uninformed ($\mu_{\text{uninfo}}$) vehicles (i.e., $H_0: \mu_{\text{info}} = \mu_{\text{uninfo}}$). Hence, the alternative hypothesis stated the inequality of the mean travel times of the informed and uninformed vehicles (i.e., $H_1: \mu_{\text{info}} \neq \mu_{\text{uninfo}}$). The test, carried out at the 95% confidence level (i.e., a significance level of 0.05), resulted in a t statistic of 0.791 (as shown in Table 3), which was much lower than the critical t value (almost 2). Since the t statistic was lower than the critical t value, there was no significant evidence to reject the null hypothesis.
Therefore, the null hypothesis was retained (could not be rejected), meaning that there was no statistically significant difference between the travel times of the informed and uninformed vehicles. It should be noted that the strength of hypothesis testing lies in rejection. In cases such as this one, where the null hypothesis is not rejected (it is not called "accepted", just noted as "retained"), more investigations are suggested. For future work, it is recommended to expand the sample size to ensure that the conclusion drawn is valid. It seems that in medium-distance trips the traffic information would not significantly impact travel time. Obviously, this result does not take real-time traffic information dissemination for granted. It just emphasizes the possibility of the trip distance effect on travel time of vehicles with real-time traffic information. It is worth noting that to draw a concrete conclusion whether to retain or reject this statistical test, a more comprehensive database, as mentioned earlier, is required.
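The decision rule of the two-tailed t-test can be sketched as below. This is an illustration under stated assumptions: the paper does not say which variant of the t-test was used, so the pooled-variance (Student) form is assumed; the function names and the sample data are hypothetical, and the critical value is supplied by the user from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # degrees of freedom
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


def retain_null(t_stat, t_critical):
    """Two-tailed rule: retain H0 when |t| is below the critical value."""
    return abs(t_stat) < t_critical


# Hypothetical travel times (minutes), not data from the case study.
informed = [20, 22, 24]
uninformed = [21, 23, 25]
t_stat, df = pooled_t_statistic(informed, uninformed)
decision = retain_null(t_stat, t_critical=2.776)  # t table, df = 4, alpha = 0.05
```

Applied to the values reported above, `retain_null(0.791, 2.0)` also returns `True`, which corresponds to retaining the null hypothesis of equal mean travel times.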

CONCLUSION
Distributing traffic information may generally affect travel times of travelers within transportation networks. Knowing which route between the targeted origin and destination has better traffic conditions may assist travelers in reaching their destination earlier. This fact has not been thoroughly investigated on medium-distance trips. The main aim of this study was to investigate the effect of traffic information on reduction in travel time on medium-distance trips. An experiment was designed to gather data via probe vehicles to be able to study this effect. Not enough evidence was provided to reject the hypothesis of equality of travel times of informed and uninformed vehicles (with regards to real-time traffic information) in medium-distance trips based on a case study run in the metropolitan city of Tehran, the capital of Iran. Of course, to draw a concrete conclusion whether to retain or reject this statistical test, more comprehensive data gathering is required.
The randomness inherent in the travel times resulted in some cases where the travel times of informed vehicles were longer than those of uninformed ones, which seems unreasonable. This fact was mostly observed in off-peak hours, where traffic information would not have a positive impact on travel time due to the fact that all routes were encountering smooth traffic, and traffic information could not make the travel time shorter. In peak hours, however, the difference between informed and uninformed vehicles was more visible but not statistically significant for medium-distance trips.