Determining the macroscopic fundamental diagram from mixed and partial traffic data

The macroscopic fundamental diagram (MFD) is a graphical method used to characterize the traffic state in a road network and to monitor and evaluate the effect of traffic management. For the determination of an MFD, both traffic volumes and traffic densities are needed. This study introduces a methodology to determine an MFD using combined data from probe vehicles and loop detector counts. The probe vehicles in this study were taxis with GPS. The ratio of taxis in the total traffic was determined and used to convert taxi density to the density of all vehicles. This ratio changes over the day and between different links. We found evidence that the MFD was rather similar for days in the same year based on real data collected in Changsha, China. The difference between MFDs made of data from 2013 and 2015 reveals that the modification of traffic control can influence the MFD significantly. A macroscopic fundamental diagram could also be drawn for an area with incomplete data gained from a sample of loop detectors. An MFD based on incomplete data can also be used to monitor the emergence and disappearance of congestion, just as an MFD based on complete traffic data.


INTRODUCTION
Dynamic management of vehicle flows on a road network requires both monitoring and control instruments. In this study we focus on urban roads. The more traditional, still effective way to control urban road traffic on intersections is the application of traffic signals, which often makes use of vehicle detectors on the lanes leading to the stop line or leaving the intersections. In many cases congestion on a specific road may have an impact on larger parts of the network through gridlock and spill-back of queues on intersections. These phenomena make it necessary to base traffic network management on the macroscopic state of the whole network instead of single links (e.g. Giglio and Minciardi [1], van Zuylen et al. [2], Le et al. [3]).
The macroscopic traffic state in an urban road network can be characterized in a simplified way by just two overall parameters. One parameter characterizes the (weighted) average flow of vehicles traveling over the links in the network. The other one represents the (weighted) average density of vehicles in the road network (e.g. Daganzo [4], Daganzo and Geroliminis [5]). Queues remaining at the intersections after the end of a green phase will block the traffic flow during the following signal phase. If this happens, the average flow even diminishes with growing traffic densities. This pattern has been observed in some empirical data (e.g. Geroliminis and Daganzo [6]) and several simulations (e.g. Mühlich et al. [7]).
At present, more sources of traffic data are available, but in most cases those sources are not sufficient to provide all data needed for the determination of an MFD. The objectives of this article are to show 2008 [5], Daganzo and Geroliminis 2008 [6], Yuxuan et al. 2014 [11]). In this diagram, the traffic state is characterized by the sum of the weighted average traffic flow and the sum of the weighted average traffic density. This characterization of the traffic state makes it possible to diagnose the emergence of congestion and choose measures to mitigate traffic problems, e.g. by redirecting flows to areas with spare capacity (e.g. Daganzo 2007 [2], Gayah and Daganzo, 2011 [3], van Zuylen et al. 2014 [2], Ortigosa and Menendez, 2014 [12], Le et al. 2015 [3]).
The applicability of the MFD is basically limited to networks that are homogeneous, i.e., whose traffic conditions are similar on the different roads of the network and the data is obtained by the same method, i.e., similar locations and detectors (Geroliminis and Sun 2011 [13], Geroliminis and Sun 2011 [14], Buisson and Ladier 2009 [15], Knoop et al. 2013 [16]).
Generally, it is assumed that this relation in the MFD (between weighted average density and weighted average flow) does not depend on traffic demand and is stable in time, although it depends on traffic control and network geometry. We observed that in reality this relation has a lot of 'noise', i.e., deviations from a real function. As Leclercq et al. (2014) [17] showed, MFDs are an envelope of possible traffic states in practice. This is partly due to the inhomogeneity of the network and partly to the stochastic character of the saturation flow [18]: the flow at a fixed density fluctuates, owing to the variation of the saturation flow rate, i.e., the change of the number of vehicles that can pass a signalized intersection per time unit during the green phase. We denote the length of link i by l_i, the outflow per link by q_i, and vehicle density (number of vehicles per length of road) by k_i. Then the weighted average flow q_w is defined by Equation 1. The outflow q_i can be measured by the loop detectors of a traffic control system. In the study area in Changsha volumes were collected in five-minute periods by the SCATS network management system.
The other main parameter in the fundamental diagram is the network density. In an urban area the corresponding characteristic of traffic density is the weighted sum k_w of the number of vehicles per kilometer queuing or driving on the relevant roads.
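The displayed definitions of q_w and k_w do not appear in the text above. As a minimal sketch, assuming the length-weighted averages that are common in the MFD literature (q_w = Σ q_i l_i / Σ l_i and k_w = Σ k_i l_i / Σ l_i), the two network parameters could be computed as follows. The function name and the three-link numbers are hypothetical and serve only as an illustration; they are not data from the study.

```python
def weighted_average(values, lengths):
    """Length-weighted average over links: sum(v_i * l_i) / sum(l_i).

    Assumption: this is the weighting behind Equations 1 and 2.
    """
    if len(values) != len(lengths):
        raise ValueError("values and lengths must cover the same links")
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total link length must be positive")
    return sum(v * l for v, l in zip(values, lengths)) / total_length


# Hypothetical three-link network (illustrative numbers only).
lengths_km = [0.4, 0.3, 0.5]             # l_i
flows_veh_h = [600.0, 900.0, 300.0]      # q_i, e.g. from loop detectors
densities_veh_km = [40.0, 80.0, 20.0]    # k_i, e.g. from probe vehicles

q_w = weighted_average(flows_veh_h, lengths_km)       # one MFD ordinate
k_w = weighted_average(densities_veh_km, lengths_km)  # one MFD abscissa
```

Each five-minute interval then contributes one point (k_w, q_w) to the diagram.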
The MFD is considered to be a useful evaluation and control tool for networks with a homogeneous flow pattern. In practice, this condition strongly restrains its applicability. This article shows how an MFD can be derived for an urban network from measured traffic volumes and GPS data from probe vehicles, and what the properties of the MFD are. It also demonstrates that even if some of the detectors are not operational, it is still possible to determine an MFD for the links combining detectors with GPS data. We show with this empirical data that the MFD is not a stable relationship, but may change over time depending on traffic management measures, such as traffic control.
In the next section, the macroscopic fundamental diagram (MFD) will be introduced, along with the way in which network characteristics can be used for monitoring and control purposes. The case study area will be described in Section 3, followed by the method used to match GPS data to the digitized network in Section 4. The combination of detector data (for traffic volumes) and GPS data provided by probe vehicles (taxis) makes it possible to determine the shape of the MFD as shown in Section 5. Section 5 also discusses the ways in which the proportion of taxis in the total traffic flow can be determined. Section 6 will show how the MFD was determined for different days of the week in April 2013 and 2015, and will demonstrate that traffic control apparently influenced the shape of the MFD. Section 7 will show that even where data availability is limited, an approximated MFD can be derived, and this limited data can still serve for dynamic monitoring of the traffic state. Section 8 will discuss the shape of the MFD derived for Changsha and present the conclusions of this article.
The data used to illustrate the methodology was collected in Changsha (PR China) in April 2013 and 2015. Traffic volumes were counted by loop detectors that are a part of the SCATS network traffic control system (one detector per lane for all signalized intersections). The probe vehicle data originated from more than 7000 taxis with GPS, monitored once every 30 seconds.

Definition of MFD
Since the 1970s, the total distance traveled (TDT) and the total time spent (TTS) in an urban network have been used as evaluation criteria to compare different traffic control strategies (e.g. Godfrey 1969 [8], Holroyd and Hillier 1971 [9], Olszewski et al. 1995 [10]). It was rather easy to characterize the network traffic state, because flows could directly be measured by detectors while journey times were measured by probe vehicles.
An alternative and equivalent method that is frequently used at present is the macroscopic fundamental diagram, MFD (e.g. Geroliminis and Daganzo collected from taxis. Because they did not have any traffic volume data, they had to make assumptions concerning the proportion of taxis in total traffic. They used a simulation in which they assumed a constant percentage of taxis, but even in that case, due to stochastic effects, the calculation of this percentage on the basis of simulated flows led to a rather large uncertainty in the outcome. Ambuhl and Menendez (2016) [27] proposed a methodology for estimating the MFD based on both loop detector and floating car data. The estimation error was significantly reduced compared with the case where only one data source is used. Lu et al. (2013) [28] combined taxi GPS data and visually-counted traffic volumes on a network in the central business district (CBD) of Changsha (PR China). They were able to determine the proportion of taxis in the total flow and to derive an MFD, but due to the short time period (two hours), only a part of the MFD could be drawn.
The study described in this article uses GPS data collected from more than 7000 taxis in Changsha and loop detector data generated by the SCATS traffic control system. Our study deals with another part of the city's road network than the study by Lu et al. (2013) [28]. For our article, more detector data was available. The data collection and analysis took place over two longer observation periods (4 times 24 hours in April 2013 and 3 times 24 hours in April 2015 versus 2 hours only in Lu's study).
A problem that loop detector traffic monitoring faces in urban areas is the fact that road works and heavy traffic often result in the loops' wires getting broken, and subsequently loop detectors in the road surface are no longer operational (Li et al. 2014 [29]). In the network studied in this article the detectors were all giving reliable traffic counts. In practice, necessary and reliable traffic data is often not completely available. To be able to deal with such situations, we analyzed what the possibilities would be to obtain an MFD if a number of detectors did not work well. One research question is whether it is possible to derive a well-defined and useful MFD from probe vehicles and partially failing loop detectors. Section 7 gives more details about this.
The loop detectors of the SCATS control system cannot give a reliable traffic density. If loops are located in the middle of a link, the traffic density k_i might be estimated from the occupancy of the loops (Geroliminis and Daganzo 2008 [5], Buisson and Ladier 2009 [15]), but SCATS detectors measure occupancy just at the stop line and no clear relation exists between this kind of measurement and traffic density (Saeedmanesh and Geroliminis 2015 [30]). One possibility is to derive the number of vehicles on a road section from the number of probe vehicles, if the proportion of probe vehicles in the total flow is known. its applicability. Most networks experience tidal flows, and congested sections co-exist with uncongested links where traffic runs in the opposite direction. Some researchers have proposed to give the MFD a third dimension which would characterize the variability of the accumulation over different links of the network (e.g. Knoop et al. 2013 [16]). This relation between k_w and q_w shows a maximum q_w at a certain level of average density in several existing studies. Several studies (especially simulations) show that q_w decreases at average densities higher than a certain threshold. The decreasing branch of the MFD is assumed to be the consequence of the fact that more traffic in the area reduces q_w because spill-back emerges and intersections become blocked. This status in the MFD can be used as a warning and as a reason to limit the inflow into or increase the outflow from certain areas.

Data provision for the MFD
At this moment a great amount of traffic data is becoming available; it is generated by, for instance, loop detectors, GPS-equipped probe vehicles, mobile phones, Bluetooth, and automated number plate recognition cameras (ANPR) (Herrera et al. 2010 [19]). There are clear differences in the usability of such data collection tools on freeways and on urban roads. For travel time measurements, Bluetooth scanners are more suitable for freeways where mainly cars are driving and most trips between two scanners have no interruptions. On urban roads, cyclists and pedestrians are also detected, as are bus passengers, and deviations and stops between Bluetooth scanners are more frequent than on freeways. The trajectories obtained from probe vehicles with GPS can easily be analyzed and outliers can be removed for the calculation of travel times (Li et al. 2011 [20]). Link travel times can be combined with traffic volumes to estimate traffic densities (Edie 1965 [21], Du et al. 2016 [22]).
Courbon and Leclercq (2011) [23] and Leclercq et al. (2014) [17] evaluated different methods to determine the MFD. They tested those methods in a simulation environment. They could keep network loadings homogeneous. The combination of loop detector data and probe vehicles appears to give valid results in these simulation studies. Nagle and Gayah (2013 and 2014) [24] [25] discussed the use of probe vehicles. They concluded that the percentage of probe vehicles had to be known in advance, which was not the case in their study or in most other studies. In general, the percentage of probe vehicles is not only unknown but also depends on the time of day, route choice of the drivers, and traffic conditions, as we will show in this article. Ji and Geroliminis (2012) [26] studied the road network of Shenzhen using GPS data and 26, 2013) were processed for further analysis; afterwards the data obtained during three days in April 2015 was analyzed. Traffic volumes generated by the loop data were recorded every 5 minutes.

MAP MATCHING OF THE TAXIS
Several methods have been developed to map GPS data on a digitized network. Some of those methods map trajectories on separated links of the network. The methods estimate routes through a network, even if location registrations on some sections of the route are missing (e.g. Wei et al. 2013 [31]). This is especially suited for sparse GPS data that is collected with a low frequency. The data available in Changsha is collected once every 30 seconds, which guarantees that on each link of the route one or more location registrations are made, as long as links are longer than about 300 meters (assuming a cruising speed of 40 km/h and an additional journey time of 2 seconds for each turning into a perpendicular road). The example in Figure 2 illustrates that indeed taxis can be followed from link to link. However, since the length of the links is in general shorter than 500 m, most taxis are counted only one or two times on a link.
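The 300 m threshold follows from simple arithmetic: a taxi is certain to be registered on a link only if the link is at least as long as the distance covered between two GPS fixes. The sketch below reproduces that estimate; the function name is hypothetical, and treating the 2 s turning delay as time in which no distance is covered is an assumption about how the figure was derived.

```python
def min_link_length_m(speed_kmh, sample_interval_s, turn_delay_s=0.0):
    """Distance driven between two GPS fixes, in meters.

    Assumption: the turning delay is time lost without covering distance,
    so it is subtracted from the sampling interval.
    """
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    return speed_m_per_s * (sample_interval_s - turn_delay_s)


# Values quoted in the text: 40 km/h cruising speed, one fix every 30 s.
without_turn = min_link_length_m(40.0, 30.0)                 # about 333 m
with_turn = min_link_length_m(40.0, 30.0, turn_delay_s=2.0)  # about 311 m
```

Both values are consistent with the stated threshold of about 300 m.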
The other mapping method searches the closest link for each GPS registration. This is the method that was applied in this study. The GPS positions of the taxis were mapped on the road network by the shortest distance method. The links were represented by one or more straight link segments, and each link segment was defined by a center line.
The number of taxis driving along a certain link could be counted; the dwell time on a link (Nagle and Gayah 2013 [24]) and the density of taxis could be calculated (Lu et al. 2013 [28]). Taxis stopping for more than 2 minutes at a position along a link where no queues existed were eliminated from further analysis, because apparently there was a change of passengers, or the driver took a rest.
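The shortest distance method can be illustrated with a small sketch. It assumes that GPS positions and center lines have already been projected to planar coordinates in meters, and it adds a rejection threshold (`max_distance_m`) that is not mentioned in the text; the function names and the example network are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the straight segment a-b (all (x, y) in meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_to_link(p, links, max_distance_m=30.0):
    """Return the id of the link whose center line is closest to p.

    links maps a link id to a list of (a, b) center-line segments.
    Returns None if no link lies within max_distance_m (assumed threshold).
    """
    best_id, best_d = None, float("inf")
    for link_id, segments in links.items():
        for a, b in segments:
            d = point_segment_distance(p, a, b)
            if d < best_d:
                best_id, best_d = link_id, d
    return best_id if best_d <= max_distance_m else None


# Hypothetical network: link "A" is straight, link "B" has two segments
# to follow a road curve.
links = {
    "A": [((0.0, 0.0), (100.0, 0.0))],
    "B": [((100.0, 0.0), (100.0, 50.0)), ((100.0, 50.0), (150.0, 100.0))],
}
```

A position such as (104, 30) is then assigned to link "B", while a position far from every center line is left unmatched.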
Because the length of the links is relatively short and the registrations are only made once per 30 seconds, most taxis are only registered once or twice on a

DESCRIPTION OF THE CASE STUDY AREA
The methodology to determine the MFD from loop detector counts and taxi GPS was applied in a study area of about 2 km², close to the central business district (CBD) of Changsha. Changsha is the capital of Hunan province in south-central China and has a population of about 5 million people. Car ownership (more than 1 million motor vehicles in 2015) is rising rapidly, and congestion is already a serious, growing problem. Figure 1a shows a map of the center of Changsha and the study area, and Figure 1b shows the Google Earth map of links used for map matching. Some links were approximated by creating two link segments in order to take road curves into account.
Traffic in the city center is controlled by a SCATS network control system. SCATS is designed to provide traffic-responsive traffic control. It uses occupancy and flow rates measured by loop detectors close to the stop lines on the signalized lanes of intersections. Due to frequent road works in Changsha, many loop detectors had been damaged (giving no counts at all) and several gave counts with an unlikely distribution of volumes over parallel lanes (Li et al. 2014 [29]). After a check of the loop data across the whole network, an area of 1.2 × 1.9 km² with 12 signalized intersections was selected in the south of the city center, where all 175 loop detectors were working well and gave reliable output during most of the day. This area is shown in At the same time, GPS data from more than 7000 taxis driving around the whole network of Changsha were recorded every 30 seconds. On average, about 1000 of these taxis were driving in the study area during day time. The GPS data was mapped on the roads of the study area. The mapping procedure is described in the next section. The taxi GPS device gives information about the taxi's position and speed, but no information about the presence of passengers.
We selected the data generated during one particular day (April 25, 2013) for the development of the methodology and the primary analysis. The data collected on three other days of the week (April 23, 24, The summation for j ∈ Δt is the summation for all vehicles that pass the end stop line in the time period Δt. Those times are estimated from the measurements of the taxi positions on the link itself and on the preceding and following links as visualized in  The ratio of taxis to all vehicles is determined according to Formula 3. As for average traffic density, it can be calculated by dividing the taxi density by the taxi ratio as shown in Formula 4. The taxi density is also obtained from GPS data. The estimated numbers of cars and taxis are shown in Figure 4b. In Figure 5 the average proportion of taxis on all roads of the network is shown by a fitted curve connecting averages over two-hour periods. It also shows a 95% range aggregated over two-hour time intervals. During daytime the taxi ratio is lower than at night. For link, sufficient to estimate their dwell time, but insufficient to determine queuing behavior in detail (Li et al. 2014 [29], Ramezani and Geroliminis 2015 [32]).

COMBINING TAXI AND LOOP DETECTOR DATA

5.1 Drawing the MFD on the basis of loop detector and probe vehicle data
Loop detectors that are located close to the stop line of the intersection cannot give reliable traffic density.However, we can use the ratio of taxis in total traffic which passed the loops to estimate the link density as described below.
The MFD is the relationship between the average weighted flow q_w and average weighted density k_w, as defined in Equations 1 and 2. Taxis that left a road segment during a five-minute time interval and continued along an adjacent road were counted as taxi volume passing one of the loop detectors at the end of the link. The ratios of taxis in total traffic (d_i) which pass the loop detectors are first calculated from the loop data, as shown in Equation 3.

$$ d_i = \frac{\text{taxi volume that passes the loop on link } i \text{ each 5 minutes}}{\text{volume of all vehicles that passes the loop on link } i \text{ each 5 minutes}} \qquad (3) $$
The density during time period Δt on link i is calculated according to the definition by Edie (1965).
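Edie's generalized definition expresses density as the total time that vehicles spend in a space-time region divided by the area of that region (link length times period length). A minimal sketch of the resulting calculation is given below, assuming that taxi dwell times have already been clipped to the five-minute period; the function names and all numbers are hypothetical. The last step divides the taxi density by the taxi ratio, as the text describes for Formula 4.

```python
def edie_density(dwell_times_s, link_length_km, period_s):
    """Edie density in vehicles/km: total time spent / (length * period).

    dwell_times_s holds the time each observed vehicle spent on the link
    within the period (assumed already clipped to the period).
    """
    if link_length_km <= 0 or period_s <= 0:
        raise ValueError("link length and period must be positive")
    return sum(dwell_times_s) / (link_length_km * period_s)


def taxi_ratio(taxi_volume, total_volume):
    """Share of taxis among all vehicles passing the loop (Equation 3)."""
    if total_volume <= 0:
        return None  # no vehicles counted: ratio undefined
    return taxi_volume / total_volume


def total_density(taxi_density, ratio):
    """Density of all vehicles: taxi density divided by the taxi ratio."""
    if not ratio:  # None or zero: cannot scale up
        return None
    return taxi_density / ratio


# Hypothetical 5-minute interval on a 0.4 km link.
k_taxi = edie_density([60.0, 90.0, 30.0, 120.0], link_length_km=0.4, period_s=300.0)
d_i = taxi_ratio(taxi_volume=6, total_volume=80)
k_i = total_density(k_taxi, d_i)
```

Intervals without any counted vehicles yield no ratio and therefore no density estimate in this sketch; how such intervals were treated in the study is not stated.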

A further analysis of taxi ratios using the Student's t-test
We investigated the difference in taxi ratios on congested and uncongested roads. In general, taxis are less present on congested roads, especially during peak hours. It is likely that taxi drivers are aware of congestion and avoid such roads. A spokesman from the Changsha Traffic Police confirmed this and stated that this phenomenon also made it difficult to determine travel times on congested links from taxi GPS data. The literature that reports on this possibility is based on simulations (e.g. Gayah and Daganzo 2011 [33]). The result of the simulations is that route adaptation reduces the congested branch of the MFD. The real data that we use confirms that taxis are less present on congested roads.
XinShao Lu suffers the most congestion during peak hours, compared with WanFu Lu and XiangFu Lu. The taxi ratios on XinShao Lu are lower than those on WanFu Lu and XiangFu Lu. During peak hours, the taxi ratio on XinShao Lu is very low, only 0.03 to 0.05, as can be seen in Figure 6.
example, the percentage of taxis ranges from 13% to 30% in the night period from 22:00 to 7:00, while it reaches only 5% to 12% from 7:00 to 22:00. At peak hours, the ratio is much lower than at off-peak times. For example, from 7:00 to 9:00 during the morning peak and from 17:00 to 19:00 during the evening peak, only around 6% to 7% of vehicles were taxis. In daytime, the absolute number of taxis remains rather stable while the volume of other vehicles increases quickly; this results in a low taxi share. At night, both the volumes of taxis and other vehicles decrease, but the number of other vehicles decreases more quickly, so the proportion of taxis at night is higher than in daytime, as shown in Figure 5. The analysis of the statistical significance of the differences in taxi ratios will be presented in Section 5.2.
Furthermore, taxis are less present on congested roads: a smaller percentage of taxis drove during peak hours on critical links. Therefore, it is necessary to estimate the percentage of taxis locally and dynamically if such data are used for the determination of the traffic state. Lu were also significantly lower than on XiangFu Lu, except at 9:05~10:00 (p value is 1.08 × 10⁻¹), 12:05~13:00 (p value is 1.46 × 10⁻¹), 13:05~14:00 (p value is 3.25 × 10⁻¹) and 19:05~20:00 (p value is 2.40 × 10⁻¹) in the daytime. This is an indication that many taxi drivers avoid the most congested road to reduce the time spent on driving. For the MFD, this means that rerouted taxis will reduce their time spent (i.e., the accumulation) in the network without reducing average flow. The differences between the taxi ratios on different roads for 3 days in 2015 (e.g. XinShao Lu/WanFu Lu) are significant: except at off peak times (normally after 19:00, sometimes in the noon time 12:00~14:00) during the day, taxi ratios on XinShao Lu were significantly lower than on WanFu Lu. This indicates that many taxi drivers are familiar with the traffic condition so that they try to avoid the most congested roads to reduce the time spent on driving.
We used four kinds of significance tests in total to check the correlation between congestion and taxi ratio: the t-test and three further tests. Firstly, the descriptive statistics of the taxi proportion for XinShao Lu, WanFu Lu, and XiangFu Lu are shown in Table 2.
We investigated whether these differences in taxi ratios at different times of day were significant on XinShao Lu by applying the Student's t-test. The null hypothesis H_0 was that there would be no difference between the taxi ratios during certain time periods, tested at a significance level of 0.05. For the time periods 7:05-8:00 (peak hour) and 6:05-7:00 (off-peak), the p value is 6.54 × 10⁻⁵. This indicates that taxi ratios in the period 7:05-8:00 were significantly lower than the ratios in the period 6:05-7:00. The p value is 0.0401 for the periods 17:05-18:00 (peak hour) compared with 12:05-13:00 (off peak), which also indicates that taxi ratios between 17:05-18:00 were significantly lower than the ratios between 12:05-13:00.
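For illustration, the two-sample comparison can be sketched as below. This is a minimal version of the pooled-variance Student's t statistic using only the Python standard library; whether the study used the pooled or another variant of the test is not stated, and the two samples of five-minute taxi ratios are hypothetical values, not data from Changsha.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # variance() is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical five-minute taxi ratios on one link.
peak_ratios = [0.04, 0.05, 0.03, 0.04, 0.05, 0.03]
off_peak_ratios = [0.09, 0.11, 0.10, 0.12, 0.08, 0.10]

t_stat, dof = student_t(peak_ratios, off_peak_ratios)
# For 10 degrees of freedom the two-sided critical value at the 0.05 level
# is about 2.228, so |t_stat| above that value rejects H_0.
reject_h0 = abs(t_stat) > 2.228
```

The standard library offers no t distribution, so the sketch compares the statistic with a tabulated critical value instead of reporting a p value.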
The differences between the taxi ratios in several time periods on different roads (e.g. XinShao Lu/WanFu Lu and XinShao Lu/XiangFu Lu) are shown in Table 1. They are also significant, except at off peak times 13:05~14:00 (p value is 8.77 × 10⁻²) and 19:05~20:00 (p value is 3.73 × 10⁻¹). During the day, taxi ratios on XinShao Lu were significantly lower than on WanFu Lu. Similar differences were found between XinShao Lu and XiangFu Lu. The taxi ratios on XinShao  or a slightly decreasing trend. Such moments are important for traffic management purposes, since these traffic states are a warning that network management measures should be taken. The fitting curves are used to analyze whether the configuration of the MFD significantly varies in different days. The real, practical value of these fitted curves is simply that they describe the data, show the bending of the MFD, and provide a way to determine the maximum network flow and the critical network density. A polynomial of a higher degree slightly reduces the RMSE, but according to the Akaike criterion [36], the increase of the number of model parameters does not sufficiently improve the quality of the fit to justify the use of a higher degree polynomial for this data.
The minimum and the maximum of the taxi proportion for XinShao Lu are 0.0264 and 0.1244, which are clearly lower than those for WanFu Lu and XiangFu Lu shown in Table 2.
The other 3 kinds of significance tests were the Mann-Whitney, Kolmogorov-Smirnov, and Wald-Wolfowitz tests, which were used to check the correlation between congestion and taxi ratio, in addition to the Student's t-test. The results are shown in Table 3. All significance values are much lower than 0.05, which indicates that the taxi ratios of the peak hours are significantly different from those of the off-peak hours.
Furthermore, most simulation studies in which probe vehicles are involved (e.g. Ji et al. 2012 [26], Leclercq et al. 2014 [17], and Du et al. 2016 [22]) do not take into account the fact that drivers of probe vehicles have different route choice strategies than other drivers. This might exaggerate the congestion level in these studies.

Drawing the MFD
Using these results, a macroscopic fundamental diagram for the network could be drawn in Figure 7. The data for this MFD covers 24 hours, including peak hours where congestion occurs. The common assumption is that congestion should be represented by a decreasing branch on the right side of the MFD. Such a decreasing part of the MFD has been reported by several simulation studies and in the empirical study of Yokohama by Geroliminis and Daganzo [6]. In several other empirical MFD studies this decreasing part of the MFD is lacking (e.g. Zhang et al. 2013 [34], Buisson and Ladier 2009 [15], Zheng et al. 2017 [35]). The MFD that was found in this case study shows a very limited decreasing tendency at higher traffic density.
We estimated a second degree polynomial with significant coefficients to model the points in the MFD; the data fit the increasing, flat, and slightly decreasing parts of the polynomial. The coefficients of the polynomial are given in Table 4. In Figure 7, situations can be identified where an increase in average traffic density coincides with a stabilization of average weighted flow,  7 Fit of a quadratic function:  The maximum average flow in 2015 is clearly higher than in 2013: this is probably due to the fact that between 2013 and 2015 the traffic police improved traffic signal settings in the SCATS system for the whole city. This was confirmed by a spokesman of the police.
In these years dynamic route information panels were also installed, which may have improved the balanced loading of traffic on the network. The maximum average flow increased by 19.7%, while the average density at which maximum average flow occurs also increased (by 35%), from 128 to 172 vehicles per kilometer. Indeed, the number of motor vehicles in Changsha has been growing in the past few years by about 20% per year. This significant improvement in road network capacity in terms of maximum average flow and storage capacity, achieved in only two years, is impressive.
The fact that the MFD derived in 2015 is different from the MFD derived in 2013 is consistent with the result of simulation studies of several other networks (e.g. Zhang et al. 2013 [34], Gayah et al. 2014 [37]), but as far as we know no empirical evidence for this phenomenon has been published in the research literature so far.
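The quadratic fit used in this section, and the way it yields the maximum network flow and the critical density, can be sketched as follows. The code is a plain least-squares fit of q = a·k² + b·k + c via the normal equations, written with the standard library only; it is not the authors' fitting procedure, and the sample points are hypothetical values placed on a known parabola so that the result can be checked by hand.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(k, q):
    """Least-squares coefficients (a, b, c) of q = a*k^2 + b*k + c."""
    s = [sum(ki ** p for ki in k) for p in range(5)]                    # sum k^p
    t = [sum(qi * ki ** p for ki, qi in zip(k, q)) for p in range(3)]  # sum q*k^p
    normal = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    return tuple(solve_linear(normal, [t[2], t[1], t[0]]))


def critical_point(a, b, c):
    """Critical density and maximum flow of a downward-opening parabola."""
    if a >= 0:
        raise ValueError("fitted curve has no interior maximum")
    k_crit = -b / (2.0 * a)
    return k_crit, a * k_crit ** 2 + b * k_crit + c


# Hypothetical MFD points lying on q = -0.05 k^2 + 14 k + 100.
k_obs = [20.0, 60.0, 100.0, 140.0, 180.0]
q_obs = [360.0, 760.0, 1000.0, 1080.0, 1000.0]
a, b, c = fit_quadratic(k_obs, q_obs)
k_crit, q_max = critical_point(a, b, c)
```

For real data with densities spanning a wide range, centering or scaling k before fitting would improve the numerical conditioning of the normal equations.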

COMPARISONS OF THE MFD FOR DIFFERENT DAYS OF THE WEEK, AND BETWEEN 2013 AND 2015
The analysis in the previous sections was based on a single day. In order to verify whether the results could be generalized to other days, we analyzed more days in 2013 (April 23 to 26) and 2015 (April 20, 21, and 22). The fitting curves have almost the same trend for these 3 days, but the top throughput flow of the network slightly decreased from Monday to Wednesday, as shown by the fitting curves.
There is a critical density at which the network has the maximum throughput flow (weighted average flow). If the density in the network is higher than the critical density, the data points are more scattered and the output flows decrease, indicating that network efficiency is reduced and that there is more congestion in the network.
The observations made on the four days in 2013 and the three days in 2015 are shown in Figure 9; the parameters of the fitted curve are shown in Table 5. We investigated what would happen with the MFD that we determined for the full network of our study area if we eliminated some of the detector data and at the same time excluded the links without counts from the summation as given in Equations 1 and 2.
In order to investigate this, we divided our network into two subnetworks. One subnetwork consisted of the busiest 30% of the links, and the other one consisted of the least busy 30% of the links. Both subnetworks still had 14 links that were monitored. The selection was based on the link volumes counted for 24 hours. The question we investigated was whether for each subnetwork an MFD could be derived, and whether it was possible (and useful) to use only the subnetwork's MFD to identify the moment that critical vehicle densities occur in the subnetwork.
The choice for the 30% of the busiest links and 30% of the least busy links was made based on the accumulated link densities during the morning and evening peak hours.
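The construction of such subnetworks can be sketched as below. The ranking measure is passed in as a dictionary, so it can hold 24-hour volumes or accumulated peak-hour densities; the function names, link ids, and numbers are hypothetical. The second helper lists the time intervals in which a density series exceeds a critical value, which is the comparison later made between the subnetworks and the full network.

```python
def select_links(link_measure, fraction=0.3, busiest=True):
    """Ids of the busiest (or least busy) share of links.

    link_measure maps a link id to the value used for ranking, for example
    the 24-hour volume. At least one link is always returned.
    """
    n_selected = max(1, round(fraction * len(link_measure)))
    ranked = sorted(link_measure, key=link_measure.get, reverse=busiest)
    return ranked[:n_selected]


def intervals_above(density_series, critical_density):
    """Indices of the time intervals whose density exceeds the critical value."""
    return [t for t, k in enumerate(density_series) if k > critical_density]


# Hypothetical 24-hour volumes for ten links.
volumes = {
    "L1": 12000, "L2": 3000, "L3": 9500, "L4": 700, "L5": 15000,
    "L6": 4200, "L7": 800, "L8": 11000, "L9": 2500, "L10": 6000,
}
busy_links = select_links(volumes, fraction=0.3, busiest=True)
quiet_links = select_links(volumes, fraction=0.3, busiest=False)

# Hypothetical weighted density series of one subnetwork (veh/km).
congested_intervals = intervals_above([20.0, 35.0, 60.0, 80.0, 55.0, 30.0], 50.0)
```

If the intervals found for a subnetwork, using its own critical density, coincide with those found for the full network, the subnetwork MFD can serve the same monitoring purpose.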
In Figure 10, three different MFDs are shown: the full network, the 30% of the busiest links and the 30% of the least busy links. We found that the use of only a sample of the links could generate an MFD, but that the shape and critical average traffic density for a sample of the links might be different from those for the full network, since the network size was different when we only considered a sample of all links. In all cases we found a similar pattern. The most interesting question is whether we could derive a certain critical average traffic density for each (sub)network that could be used for monitoring and control. Since in most cases there is a strong correlation between traffic conditions on different parts of the network, one could intuitively argue that it is sufficient to select a (representative) part of the network in order to characterize the whole network. If for all MFDs the times of day when the mean density is higher than a certain critical value are the same (which is specific for a certain subnetwork), it would mean that the MFD of a subnetwork may be used for traffic monitoring and network management purposes. In the investigated cases we found indeed that for all selections the times at which critical mean traffic density occurs are the same. Figure 11 shows the times where the average density is higher than a For the peak hours, the flows and standard deviation are given in Table 6. The distribution of the average weighted flows does not follow a normal distribution, so that the evidence that the flows are significantly different is based on the Mann-Whitney, Kolmogorov-Smirnov, and Wald-Wolfowitz tests.
Table 7 shows that the average weighted peak flows are indeed significantly higher for 2015 than for 2013.

MFD FOR AN INCOMPLETELY MONITORED AREA
Since a large proportion of the loop detectors fail in the parts of the Changsha road network outside our study area, we wished to find a similar diagnostic tool for the traffic state of road networks in which detectors fail on part of the roads; this alternative method should be applicable to imperfectly monitored networks. Keyvan-Ekbatani et al. (2013) [38] recently studied such a method for a simulated network. They found that an MFD derived from only a sample of the links within a network could also be used for traffic monitoring and control. In this section, we examine whether this is also the case for a real-life network generating empirical data.
Ortigosa et al. (2014) [12] carried out a similar analysis on a simulated network in order to determine on which links traffic flows and densities should be measured to obtain a useful MFD. They tried out several selections and optimized the number and location of detectors, aiming to derive an MFD that would be usable for traffic management purposes. They found that a 25% sampling of the roads was sufficient to achieve a 15% accuracy of density estimates. Although Ortigosa et al. [12] and Keyvan-Ekbatani et al. [38] used a simulated network with simulated detectors (no probe vehicles), we can use their work as a reference for our analysis of the effect of missing detector data on the determination of an MFD. We assessed the quality of an MFD based on limited network data by asking whether such a diagram was able to identify the transition from free-flowing traffic to network congestion.
A question may be posed whether incomplete traffic counts for roads that do not form a connected part of a network would have a large impact on the usability

CONCLUSIONS
The macroscopic fundamental diagram is an interesting framework for monitoring the traffic state in an urban network. Combining real traffic data collected from both loop detectors and probe vehicles with GPS enabled us to determine the shape of an MFD. In contrast to most studies on the MFD, this study is not based on simulations but on measured data from real traffic in a network where drivers, probe vehicles, and traffic police were each operating in a specific way, which is nearly impossible to model in simulations.
In this study, a methodology has been developed to determine an MFD for an urban area. It combines traffic volumes counted by loop detectors with positions of taxis in the road network measured by GPS. The proportion of taxis in the total traffic flow is calculated as the ratio of the number of taxis leaving a link to the total traffic counted by the detectors. The taxi density divided by the proportion of taxis in the traffic flow gives the total vehicle density.
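The two steps described above (taxi proportion from counts, total density from taxi density) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, units, and example numbers are hypothetical, and the taxi density is computed with an Edie-style definition (total time spent by probe vehicles on the link divided by link length times interval length), which is assumed here rather than taken from the paper's formulas.

```python
def taxi_ratio(taxi_outflow, detector_count):
    """Proportion of taxis: taxis leaving the link / all vehicles counted."""
    if detector_count <= 0:
        return None  # no valid detector count in this interval
    return taxi_outflow / detector_count


def taxi_density(times_on_link_s, link_length_km, interval_s):
    """Taxi density (taxis/km), assuming an Edie-style definition:
    total time spent on the link / (link length * interval length)."""
    return sum(times_on_link_s) / (link_length_km * interval_s)


def total_density(k_taxi, ratio):
    """Scale the taxi density up to all vehicles using the taxi proportion."""
    if not ratio:
        return None  # ratio missing or zero: density cannot be estimated
    return k_taxi / ratio


# Hypothetical numbers for one link and one 5-minute interval.
ratio = taxi_ratio(taxi_outflow=12, detector_count=150)
k_taxi = taxi_density([60, 90, 30, 120], link_length_km=0.5, interval_s=300)
k_all = total_density(k_taxi, ratio)
print(ratio, k_taxi, k_all)
```

Returning `None` when no valid count is available keeps intervals with failing detectors visible instead of silently producing a density of zero.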
An important practical problem is that loop detector data is not always fully available and its reliability has to be verified. In the case study carried out in Changsha we found that in the whole urban network about 25% of the detectors did not work at all and only 50% gave reliable counts. Since in the original MFD detector concept counts have to be available for all lanes of a road link, the real availability of detector data for the calculation of an MFD is even less than 50%. Only in the selected subarea of the Changsha network did all loop detectors work sufficiently well.
A test was carried out in this subnetwork of Changsha. Valid loop detector data were available for all road links. We showed first of all that it was possible to determine the proportion of taxis in the total traffic. This proportion varied over 24 hours and was much smaller in daytime than at night, because in daytime many more private cars use the roads. The proportion of taxis also depended on traffic conditions. Because taxi drivers apparently know where congestion spots are located, they seem to choose their route along uncongested roads, while other drivers seem not to display similar congestion avoidance.
The MFDs were drawn for different days in April 2013 and 2015. The maximum weighted flow of the network differs slightly between days. A much larger difference exists between the 2013 MFDs and the 2015 MFDs. The results show that the network is sensitive to the increased traffic demand and improved traffic management; the network could process more traffic in 2015 than two years earlier.
The MFD has changed due to the changes in the network with respect to both traffic demand and traffic management measures. In that way, the MFD appears to be an evaluation instrument that shows differences in traffic performance and in the effectiveness of traffic management measures.
Across the different (sub)networks, most moments at which the average density crosses its critical value during the morning and afternoon peak hours occur practically simultaneously, within a margin of 5 minutes. The state of higher average density around 12:00 has a short duration and is not so clearly visible for all (sub)networks.
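The comparison of transition moments between (sub)networks can be sketched as follows. This is an illustrative sketch only: the function names, the 5-minute sampling, the density series, and the critical values are all hypothetical and are not taken from the study's data.

```python
def congested_periods(times_min, densities, critical):
    """Return (start, end) pairs, in minutes since midnight, during which
    the average density is above the critical value."""
    periods, start = [], None
    for t, k in zip(times_min, densities):
        if k > critical and start is None:
            start = t                      # transition into the dense state
        elif k <= critical and start is not None:
            periods.append((start, t))     # transition back to free flow
            start = None
    if start is not None:                  # still dense at the end of the series
        periods.append((start, times_min[-1]))
    return periods


def max_transition_gap(periods_a, periods_b):
    """Largest time difference between matching transition moments.
    Assumes both series contain the same number of dense periods."""
    if len(periods_a) != len(periods_b):
        raise ValueError("series have a different number of dense periods")
    gaps = [max(abs(a0 - b0), abs(a1 - b1))
            for (a0, a1), (b0, b1) in zip(periods_a, periods_b)]
    return max(gaps, default=0)


# Hypothetical 5-minute series from 7:00 to 7:55 (veh/km).
times = list(range(420, 480, 5))
full_network = [10, 12, 18, 22, 25, 26, 24, 21, 17, 12, 10, 9]
busiest_links = [20, 25, 41, 43, 46, 48, 45, 42, 38, 28, 22, 20]

full_periods = congested_periods(times, full_network, critical=20)
sub_periods = congested_periods(times, busiest_links, critical=40)
gap = max_transition_gap(full_periods, sub_periods)
print(full_periods, sub_periods, gap)
```

Each (sub)network gets its own critical value, so only the timing of the transitions is compared, not the density levels themselves.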
Apparently, an MFD determined for a smaller subnetwork can act as a monitoring tool to find the transition moments from free-flowing traffic to congestion. Full coverage of all relevant links in the whole study area with detectors is not necessary for this purpose. For cases where detector data was not completely available for all links, we investigated whether it was possible to draw an MFD using only data from a sample of the links. By using the data of only 30% of the links, we could draw an MFD similar to the one derived from 100% of the data. The critical densities appear to occur in all cases at the same moments during the peak hours, which means that an MFD based on data from a selected set of links may be used for monitoring and control purposes.
These results are very promising as regards the practical applicability of the MFD methodology to networks with a limited availability of reliable loop detector data and when relying on probe vehicles that behave differently from private cars. Therefore, this methodology is suited for application in other parts of the network in Changsha. It leads to a traffic monitoring methodology applicable to road networks with imperfect loop detectors that can display both problematic network areas and areas with some spare capacity.

Figure 1 - Study area
Legend entries: position of a taxi; estimated entry and exit moments of the taxi

Figure 3 - Example of trajectories of taxis (dashed lines) and other vehicles and the entry and exit moments of the taxis
Figure 4 shows an example of these volumes. The taxi volumes calculated in this way from the GPS dataset and the all-vehicle volumes measured from loop data, for all links together, are compared in Figure 4a. Both display the same trend. The ratio of taxis to all vehicles is determined according to Formula 3. The average traffic density can be calculated by dividing the taxi density by the taxi ratio, as shown in Formula 4. The taxi density is also obtained from GPS data. The estimated numbers of cars and taxis are shown in Figure 4b. In Figure 5 the average proportion of taxis on all roads of the network is shown by a fitted curve connecting averages over two-hour periods. It also shows a 95% range aggregated over two-hour time intervals. During daytime the taxi ratio is lower than at night.

Figure 5 - Proportion of taxis in the total vehicle flow ($d_i$)

Figure 4 - a) Volumes of all vehicles and volumes of taxis; b) Numbers of taxis and estimated numbers of vehicles

In Figure 7, the morning (between 7:30 and 10:00) and evening (between 16:05 and 19:00) peak hours are represented by red circles and points, which form the congestion branch of the MFD. Early mornings and late evenings produce the traffic states in the lower left area of the MFD.

Figure 7 - The MFD matched with the data collected from the probe vehicles (taxis) and loop detectors in the study area for one day (April 25, 2013)

Figure 8 shows the data points for the observation days in 2015.

Figure 9 - Comparison between the April 2013 MFD and the April 2015 MFD
Legend entries: MFD based on whole links; MFD based on 30% congested links; MFD based on 30% uncongested links

Figure 10 - Three alternative MFDs, based on the full network, 30% of the busiest links and 30% of the least busy links
Symbol definitions (fragment): the length of link i; $tt_j$ - the time spent by probe vehicle j on link i during $\Delta t$;

Table 1 - Significance analysis using a t-test between taxi ratios on XinShaoLu and WanFuLu. $H_0$ is the hypothesis that the ratios are the same

Table 2 - Descriptive statistics of the taxi proportion for three roads: XinShaolu, WanFulu and XiangFulu

Table 4 - Parameters of a second degree polynomial describing the observed MFD in Figure

Table 5 - Estimated parameters for the fitted parabolic curve $f(x) = p_1 x^2 + p_2 x$ to all points per year with the 95% confidence bounds in brackets
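A parabola of this form can be fitted by ordinary least squares, as the following sketch shows. It is an assumption that this matches the fitting procedure behind Table 5; the function names and the synthetic data points are hypothetical, and no confidence bounds are computed. For $p_1 < 0$ the vertex gives the critical density $-p_2 / (2 p_1)$ and the maximum flow $-p_2^2 / (4 p_1)$.

```python
def fit_parabola_through_origin(densities, flows):
    """Least-squares fit of f(x) = p1*x**2 + p2*x (no constant term).

    Solves the 2x2 normal equations
        [sum x^4  sum x^3] [p1]   [sum x^2 * q]
        [sum x^3  sum x^2] [p2] = [sum x   * q]
    """
    s2 = sum(x ** 2 for x in densities)
    s3 = sum(x ** 3 for x in densities)
    s4 = sum(x ** 4 for x in densities)
    t1 = sum(x * q for x, q in zip(densities, flows))
    t2 = sum(x ** 2 * q for x, q in zip(densities, flows))
    det = s4 * s2 - s3 ** 2
    if det == 0:
        raise ValueError("densities do not determine a unique parabola")
    p1 = (t2 * s2 - t1 * s3) / det
    p2 = (s4 * t1 - s3 * t2) / det
    return p1, p2


def vertex(p1, p2):
    """Critical density and maximum flow of the fitted parabola (needs p1 < 0)."""
    if p1 >= 0:
        raise ValueError("parabola has no maximum")
    return -p2 / (2 * p1), -p2 ** 2 / (4 * p1)


# Synthetic points lying exactly on f(x) = -0.5*x**2 + 40*x (not study data).
k = [10, 20, 30, 40, 50, 60]
q = [-0.5 * x ** 2 + 40 * x for x in k]
p1, p2 = fit_parabola_through_origin(k, q)
k_crit, q_max = vertex(p1, p2)
print(p1, p2, k_crit, q_max)
```

Omitting the constant term forces the curve through the origin, which matches the physical condition that an empty network carries no flow.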

Table 6 - Descriptive statistics for weighted flow at peak hours in 2013 and 2015

Table 7 - Significance test of the difference of average weighted flows in 2013 and 2015