ESTIMATION OF ORIGIN-DESTINATION TRIP MATRICES FOR SMALL CITIES

The paper presents a data estimation model that meets the requirements of the classical four-step model of traffic demand for individual traffic in small cities. The procedure creates an initial origin-destination trip matrix from traffic count data and from the average trip generation rate of single households. Fuzzy logic is applied to correct the initial trip matrix. The paper also presents recommendations for defining the boundaries of traffic zones and the locations of traffic counts. A flowchart summarizes the proposed model. In the last part of the paper the model is tested on the example of a small city in the Republic of Croatia.


INTRODUCTION
In countries with a lower GDP per capita, many traffic problems arise because individual problems are solved in a disorganized and isolated way, without taking into consideration the entire picture and the long-term consequences. Above all, no traffic studies are carried out, since they are expensive and time-consuming and do not provide fast, politically measurable effects.
This paper presents one way of obtaining, at relatively low cost, the data necessary for the application of the classical four-step model of transport demand. The starting assumption is that, with the data obtained by traffic counts and the development of a simulation model, one can estimate an initial origin-destination (OD) trip matrix which is approximately equal to the actual origin-destination matrix of passenger car trips. The quality of the initial OD trip matrix is affected by the parameters used to determine the boundaries and facilities of the traffic zones. By using fuzzy logic it is possible to correct the initial OD trip matrix so that its assignment to the traffic network yields results that deviate minimally from the traffic count data.
The calibration of the gravity model parameters is one of the first methods of estimating origin-destination matrices from traffic count data [1]. The basic idea consists in setting up a special form of the gravity model and analysing the traffic flows on the links after their assignment to the network. After the calibration of the gravity model parameters, which is still applied very often today, Wilson [2] proposed the method of entropy maximisation for the estimation of origin-destination matrices. In Bayesian probability the principle of entropy maximisation assumes that, under the known restrictions, the probability distribution which best describes the current condition is the one with the highest entropy. The estimation of origin-destination matrices by maximum likelihood was first applied by Spiess [3]. The model is based on the existence of a sample matrix which contains "a priori" data on the matrix. The sample matrix is obtained by sampling every origin-destination pair, assuming a Poisson distribution with unknown mean value. The objective of this model is to estimate the origin-destination pairs of the matrix by means of the known matrix of constant values and data on the traffic load of the links [3]. In [4] and [5] Cascetta describes the generalized least squares (GLS) method, and somewhat later Bell in his paper [6] solves the problems of applying the GLS method, primarily in the form of inequality constraints imposed in the model. The GLS method was the first OD matrix estimation model to take into consideration the existence of error in the traffic count, by using variance-covariance matrices. A number of authors have later improved the GLS method, mainly by including the time dimension, i.e. by varying the transport demand in time. Thus, Cremer and Keller in their paper [7] use the Kalman filter, as do later Kang in [8], Ashok and Ben-Akiva in [9], and Zhou and Mahmassani in [10]. Nihan and Davis estimate the OD matrix with time-variable elements using the recursive GLS method [11]. Hyunmyung et al. in their paper [12] estimate the origin-destination matrix on the example of a hypothetical city by using a genetic algorithm, and Gong in [13] and Kim and Chang in [14] by applying neural networks.

Determining the boundaries and facilities of the traffic zones
Traffic zones represent land areas of approximately homogeneous purpose with aggregated trip origins and destinations. Defining the number of zones and their boundaries requires empirical as well as theoretical knowledge, and a good balance has to be found between the number of zones on the one hand and the costs of collecting and processing the input data on the other.
Based on repeated testing with simulation tools, the following rules for defining zone boundaries have been applied in this paper:
1. The zone boundaries should define areas of approximately homogeneous characteristics. This reduces the model error, because the trip generation factors obtained by counting deviate little between households.
2. It is necessary to separate zones containing only residential objects from big attractors and producers of trips.
3. Special zones should be defined for big attractors and producers of trips: primary and secondary schools, hospitals and clinics, faculties, factories, administrative city centres, business zones, airports, shopping centres, parking lots and garages (Park&Ride systems), transport terminals, big recreation centres and parks, etc.
4. Outer zones have to be defined at all entries into and exits out of the city; their trip attraction and production are obtained by traffic count. The production and attraction of the outer zones are important data in the model, since transit traffic accounts for a high share of the total traffic load.
5. If possible, align the zone boundaries with the statistical zones on the basis of which the census is performed.
6. It is necessary to select several residential zones of similar characteristics (newer construction, older construction, city centre, city periphery, residential buildings, etc.) in which the traffic count will determine the trip production and attraction factors. One of the selection criteria is that the zones selected in this way carry no transit traffic.

Determining locations of traffic counts
Traffic counts are a crucial input for designing the model of estimating the OD trip matrix, and also the key element of the calibration and validation of the model. For the model developed within the framework of this paper, the locations of traffic counts have been determined at:
1. all input/output links of the observed traffic network. In this way a large share of the traffic load of the city is captured and data about transit traffic are obtained;
2. entries/exits of special zones of large attractors and producers of trips;
3. entries/exits of zones of typical residential character (city centre, periphery, apartments, houses, new/old construction). In this way the traffic count data provide different trip production and attraction factors for different types of residential zones;
4. intersections with high traffic load. This count serves for the calibration of the model and for comparing the link loads obtained by the model with those obtained by counting;
5. intersections of connection and collector roads. This count is also used for the calibration and validation of the proposed model;
6. several arbitrary locations. These counts improve the quality of the OD matrix estimate and allow the modelled link loads to be compared with the actual ones.

MODEL OF ASSESSING THE OD MATRIX OF TRAVELLING BY PASSENGER CARS
This section explains the procedure of developing the model of assessing the OD matrix of passenger car trips by using traffic count data, without performing surveys of households and motorists. The OD matrix of individual transport of a small city is assessed. The assessment of OD matrices of public transport is very complex, there are no relevant studies in this area, and it is necessary to use an "a priori" OD matrix. However, small cities usually have no public transport, or only very few lines, so this model can satisfy their needs to the largest extent.

L. Novačko, Lj. Šimunović, D. Krasić: Estimation of Origin-Destination Trip Matrices for Small Cities

Model of assessing initial OD matrix
The greatest difficulty is the assessment of trip generation and attraction for a certain zone without performing a household survey. As user attributes for each zone, the number of houses and the number of apartments are defined as input parameters of the trip generation sub-model. The number of houses is determined from the cadastral base and digital orthophoto maps. The cadastral plots can be read from the cadastre; however, since the boundaries of these plots are not completely reliable, digital orthophoto maps are used to verify the actual situation in the field. After the houses have been counted, the number of apartments in each of them is determined, which requires a field visit. When the results of the counting of houses and apartments are summed up, the obtained data are compared with the data from the census. This method of determining the number of households per zone substantially reduces the costs and the time of working on the model.
The results of the traffic counts at external entry-exit zones, and likewise at big producers/attractors of trips located in separate zones, represent the attraction and production of trips of these zones in the study period. It should be emphasised that the count results have to be multiplied by the average number of passengers per vehicle in order to be used as the production or attraction of a zone. In other zones the attraction and production of trips are determined by multiplying the number of households by the factors of production and attraction.
Production or attraction of trips of single zones in this model is obtained in the following manner:
1. For zones consisting of households:
$$P_i = p_k B_K + p_s B_S \qquad (1)$$
$$A_i = a_k B_K + a_s B_S \qquad (2)$$
where: $P_i$ - production of zone i; $A_i$ - attraction of zone i; $p_k$, $p_s$ - factor of production for residential houses, apartments; $a_k$, $a_s$ - factor of attraction for residential houses, apartments; $B_K$, $B_S$ - number of residential houses, apartments.
To improve the precision of the model output, the residential houses and apartments can be divided according to location (centre of the city, periphery), age of construction (new, old construction) or some other criteria.
2. For the zones of big producers and attractors (factories, primary and secondary schools, hospitals, faculties, healthcare centres, shopping centres, etc.) the data are obtained by traffic counts that are multiplied by the factor of vehicle occupancy:
$$P_i = BP_{iz} \cdot f_{zv} \qquad (3)$$
$$A_i = BP_{ul} \cdot f_{zv} \qquad (4)$$
where: $BP_{iz}$ - traffic count at the exit from the zone; $BP_{ul}$ - traffic count at the entry into the zone; $f_{zv}$ - factor of vehicle occupancy.
Due to the lack of relevant studies in Croatia (which is characteristic of other similar countries as well), the trip generation factors in a minor city have been assessed by isolating several zones of exclusively residential character (zones with family houses and zones with apartments in buildings, without the presence of transit traffic). When the boundaries of such zones are defined, the vehicles entering and exiting these zones are counted. The estimated factors of attraction and production are obtained by multiplying the number of vehicles in the morning and afternoon peak periods by the factor of average vehicle occupancy and then dividing by the number of houses/apartments. The counting is performed in several zones, in order to determine the statistical dependence, i.e. the reliability of using these data in further modelling.
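The trip generation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the household relations are taken to be the linear forms implied by the symbol definitions (factor times number of houses plus factor times number of apartments).

```python
from dataclasses import dataclass


@dataclass
class ResidentialZone:
    """Hypothetical container for the two zone attributes used by the sub-model."""
    houses: int      # B_K, number of residential houses
    apartments: int  # B_S, number of apartments


def zone_production(zone: ResidentialZone, p_k: float, p_s: float) -> float:
    """Production of a household zone: P_i = p_k * B_K + p_s * B_S."""
    return p_k * zone.houses + p_s * zone.apartments


def zone_attraction(zone: ResidentialZone, a_k: float, a_s: float) -> float:
    """Attraction of a household zone: A_i = a_k * B_K + a_s * B_S."""
    return a_k * zone.houses + a_s * zone.apartments


def special_zone_trips(vehicle_count: float, occupancy: float) -> float:
    """Trips of a big producer/attractor zone: counted vehicles times f_zv."""
    return vehicle_count * occupancy


def estimate_factor(vehicle_count: float, occupancy: float, dwellings: int) -> float:
    """Trip factor of an isolated residential zone without transit traffic:
    counted vehicles times average occupancy, divided by houses/apartments."""
    if dwellings <= 0:
        raise ValueError("the zone must contain at least one dwelling")
    return vehicle_count * occupancy / dwellings
```

For example, with purely illustrative numbers, `estimate_factor(60, 1.5, 120)` gives a factor of 0.75 trips per dwelling, which can then be passed to `zone_production` for zones of the same residential type.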
After balancing the production and attraction of trips, the gravity model is set up as part of the trip distribution sub-model. The calibration of the gravity model, i.e. the estimation of the parameters of the trip cost function, is performed in iterations, with the trip cost distribution as the input datum. According to [15], the combined or Tanner function of trip cost distribution is recommended for unimodal trips (Figure 1):
$$f(U_{ij}) = a \cdot U_{ij}^{b} \cdot e^{c \cdot U_{ij}} \qquad (5)$$
where: $c_{ij}$ - value of the costs of trips between zone i and zone j; $U_{ij}$ - utility of travelling; a, b, c - parameters that need to be assessed.
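A minimal sketch of how such a cost function can be evaluated is given below. It assumes that the combined (Tanner) function has the usual form f(U) = a · U^b · e^(c·U); the function names are hypothetical, and the exponential function is included only because Figure 1 compares the two shapes.

```python
import math


def combined_cost_function(utility: float, a: float, b: float, c: float) -> float:
    """Assumed combined (Tanner) form: f(U) = a * U**b * exp(c * U)."""
    if utility <= 0:
        raise ValueError("utility (generalized cost) must be positive")
    return a * utility ** b * math.exp(c * utility)


def exponential_cost_function(utility: float, a: float, c: float) -> float:
    """Pure exponential form, i.e. the combined function with b = 0."""
    return a * math.exp(c * utility)
```

With b > 0 and c < 0 the combined function first rises and then decays with increasing cost, which is the property that distinguishes it from the monotonically decreasing exponential function.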
Based on the Tanner function of the trip distance (travelling cost) distribution, the initial distribution of the number of trips in each distance class of the trip distance distribution diagram is defined. The traffic modelling software PTV Visum includes the KALIBRI module, which is used to accelerate the iterations and the determination of the gravity model parameters a, b and c. In each iteration a demand matrix is formed which tends towards the given trip distance distribution. The procedure is repeated until, by applying linear regression, the number of trips in relation to the travelling costs is adjusted to the given trip distance distribution function.
Since in minor cities transit traffic often accounts for most of the traffic load, video recording of vehicles at the entries and exits of the transit traffic in the city is proposed. According to [16], in cities of up to 5,000 inhabitants transit traffic accounts for 45% to 65% of the overall traffic.
For the distribution of trips on the traffic network, simulation tools are used to accelerate the process which determines the number of trips from and into each zone of the traffic network:
a) Number of trips which go from zone i towards zone j:
$$t_{ij} = O_i \cdot \frac{D_j \cdot f(U_{ij})}{\sum_{j'} D_{j'} \cdot f(U_{ij'})}$$
where: $O_i$ - total number of trips generated in zone i; $D_j$ - total number of trips attracted to zone j; $j'$ - destination zones different from j.
b) The number of trips that are attracted from zone j to zone i:
$$t_{ij} = D_j \cdot \frac{O_i \cdot f(U_{ij})}{\sum_{i'} O_{i'} \cdot f(U_{i'j})}$$
where: $i'$ - origin zones different from i.
In the model, the initial calibration of the gravity model is performed so that, after the trip distribution, the initial assignment of trips to the traffic network is done (equilibrium assignment is recommended since it requires the fewest input parameters) and on several of the more heavily loaded links the data are compared with those from the traffic counts. This is only a "rough" calibration, which serves to identify major errors in the procedure of estimating the OD trip matrix and to obtain the best possible results after trip assignment and OD matrix correction by means of fuzzy logic. If substantial deviations are noted, the parameters of the combined function of generalized travelling cost should be reassessed, varying the classes of the trip distance distribution function.
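The distribution step can be illustrated with a small production-constrained gravity model. This is a sketch under assumptions, not the procedure implemented in PTV Visum: the function names are hypothetical, the deterrence function is assumed to be the combined form a · U^b · e^(c·U), and the generalized cost is used directly as the utility.

```python
import math
from typing import List


def deterrence(cost: float, a: float, b: float, c: float) -> float:
    """Assumed combined cost function: a * cost**b * exp(c * cost)."""
    return a * cost ** b * math.exp(c * cost)


def distribute_production_constrained(
    origins: List[float],
    destinations: List[float],
    costs: List[List[float]],
    a: float,
    b: float,
    c: float,
) -> List[List[float]]:
    """Split each O_i over destinations in proportion to D_j * f(cost_ij).

    Every row of the result sums to the corresponding O_i; the column sums
    match D_j only approximately unless a doubly constrained model is used.
    """
    matrix = []
    for i, o_i in enumerate(origins):
        weights = [
            d_j * deterrence(costs[i][j], a, b, c)
            for j, d_j in enumerate(destinations)
        ]
        total = sum(weights)
        if total == 0:
            matrix.append([0.0] * len(destinations))
        else:
            matrix.append([o_i * w / total for w in weights])
    return matrix
```

Setting b = 0 and c = 0 makes the deterrence constant, so trips are split purely in proportion to the attractions; this is a convenient check before calibrated parameters are used.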
After assigning the OD trip matrix to the traffic network, the coefficient of determination $R^2$, the relative root mean square error (RelRMSE), the standard deviation, the linear regression line parameters and GEH are determined. One of the problems that occur when comparing traffic data is that their values differ widely in magnitude. For instance, in the case of data on the traffic load of roads, it is difficult to compare data from motorways (large numbers) with data from local roads. As the solution to this problem Geoffrey E. Havers proposed the following empirical formula [1]:
$$GEH_i = \sqrt{\frac{2 (M_i - O_i)^2}{M_i + O_i}}$$
where: $O_i$ - actual values obtained by measurements; $M_i$ - values obtained from the model.
According to [1], GEH values lower than 5.0 are considered acceptable and represent a good match of actual and modelled data. Also, on 60% to 85% of links the traffic load should have a GEH lower than 5.0.
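The GEH check can be written compactly, as in the following sketch. The function names and the sample values in the usage note are illustrative assumptions; the formula is the standard GEH statistic.

```python
import math
from typing import Sequence


def geh(modelled: float, observed: float) -> float:
    """GEH statistic for one link: sqrt(2 * (M - O)**2 / (M + O))."""
    if modelled + observed == 0:
        return 0.0  # both values are zero, so there is no deviation
    return math.sqrt(2.0 * (modelled - observed) ** 2 / (modelled + observed))


def share_of_links_below(
    modelled: Sequence[float], observed: Sequence[float], threshold: float = 5.0
) -> float:
    """Share of links whose GEH is lower than the threshold."""
    values = [geh(m, o) for m, o in zip(modelled, observed)]
    if not values:
        raise ValueError("at least one link is required")
    return sum(v < threshold for v in values) / len(values)
```

For a link with a modelled load of 120 veh/h and a counted load of 80 veh/h the statistic equals 4.0, so the link would pass the GEH < 5.0 criterion but would still be selected for correction under a stricter limit of 3.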

Correction of OD matrix by means of fuzzy logic
According to [17], traffic count data can be treated as a fuzzy set, since it is known that the trips originating in one zone vary by about 20% from day to day. In this way the fuzzy set is defined by its boundaries.
Once the initial OD matrix of passenger vehicle trips has been determined in the previous step and assigned to the links of the traffic network, the traffic load of links q is obtained:
$$q = P \cdot t$$
where: P - matrix of OD pair shares on a certain link; t - origin-destination trip matrix; q - traffic load of links.

Figure 1 - Tanner and exponential distribution function of travelling costs
Source: [15]

Since there is a large number of OD matrices and combinations of values $t_{ij}$ whose assignment to traffic links yields results that match the traffic count, according to [18] the best of all possible corrected OD matrices is obtained by using the method of entropy maximisation.
In the entropy maximisation formulation: $t_{ij}^{W}$ - traffic demand on one OD pair of the initial OD matrix; $t_{ij}$ - traffic demand on one OD pair of the corrected OD matrix; p - number of non-negative elements of the OD matrix.
The drawback of the above formulations is that they assume that the traffic count data are reliable, without deviations and without errors. In reality, the counting is performed in one relatively short time period and is prone to sampling errors. Fuzzy logic describes this imprecision of counting and defines its deviation (e.g. 10% of the counted values). The counted values of the traffic load are substituted by the fuzzy set $\tilde{q}$ with the bottom value of deviation $\underline{s}$ and the upper value of deviation $\overline{s}$ (Figure 2) [18]. The problem of entropy maximisation presented in (8) now takes the following form [18]:
$$\max \left( q(t) + q(\underline{s}) + q(\overline{s}) \right)$$
so that:
$$P \cdot t + \overline{s} = \overline{q}$$
$$P \cdot t + \underline{s} = \underline{q} \qquad (12)$$
where: $\overline{q}$, $\underline{q}$ - maximal/minimal value of the fuzzy set; $\overline{s}$, $\underline{s}$ - variables of deviation from the data obtained by traffic counts.
After assigning the initial OD matrix to the traffic network, the links on which GEH > 3 are isolated. On these links the correction is gradually performed by applying fuzzy logic by means of entropy maximisation (Figure 3).
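The following sketch illustrates only the two preparatory steps described above: turning a counted value into a fuzzy interval and selecting the links with GEH > 3 for correction. It does not implement the entropy maximisation itself. The function names are hypothetical, and the triangular membership shape (degree 1 at the counted value, 0 at the bounds) is an assumption suggested by the axis labels of Figure 2.

```python
import math
from typing import List, Sequence, Tuple


def fuzzy_bounds(count: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Lower and upper bound of the fuzzy set: (q - s, q + s) with s = tolerance * q."""
    deviation = tolerance * count
    return count - deviation, count + deviation


def membership(load: float, count: float, tolerance: float = 0.10) -> float:
    """Assumed triangular membership degree of a modelled load in the fuzzy count."""
    deviation = tolerance * count
    if deviation <= 0:
        return 1.0 if load == count else 0.0
    return max(0.0, 1.0 - abs(load - count) / deviation)


def links_to_correct(
    modelled: Sequence[float], counted: Sequence[float], geh_limit: float = 3.0
) -> List[int]:
    """Indices of links whose GEH exceeds the limit and therefore need correction."""
    selected = []
    for index, (m, o) in enumerate(zip(modelled, counted)):
        total = m + o
        value = math.sqrt(2.0 * (m - o) ** 2 / total) if total > 0 else 0.0
        if value > geh_limit:
            selected.append(index)
    return selected
```

A count of 200 veh/h with a 10% tolerance gives the interval from 180 to 220 veh/h; a modelled load of 210 veh/h then has a membership degree of 0.5 under the triangular assumption.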
The flowchart in Figure 4 summarizes the steps of the described model of estimating OD matrices of travelling by passenger cars.

APPLICATION OF THE MODEL ON THE EXAMPLE OF A SMALL CITY
The model of estimating the OD trip matrix proposed in the previous section was tested on the example of the city of Čazma (9,000 inhabitants). This small city was selected primarily because of the availability of traffic count data, and also because of its suitability for implementing the proposed model of estimating the origin-destination matrices of passenger car trips. The city also contains major attractors and producers of trips: shopping centres, primary and secondary schools, a bus station, a healthcare centre, a factory, etc. The traffic count data were taken from the Concept of traffic system of the city of Čazma, which was prepared by the Faculty of Transport and Traffic Sciences in 2009 [19].
Based on the obtained traffic count data, three streets of residential character were selected for determining the trip generation factors. Transit traffic is not possible in these streets, i.e. all trips have their origin or destination in these zones. The trip generation factors are divided into three groups: factors for residential units in the city centre, factors for residential units of older construction on the city periphery, and factors for residential units of newer construction on the city periphery. A significant deviation of the trip generation factors from those in the Trip Generation manual [20] has been noticed, particularly for the attraction factors in the morning peak hour.

Figure 2 - Fuzzy logic set (horizontal axis: traffic counts (veh/h), marked at q - s, q and q + s; vertical axis: fuzzy function membership degree, from 0 to 1)
Source: [18]

After the trip generation procedure has been carried out, the sums of production and attraction over the zones are not equal. The balancing of production and attraction can be realized in three ways: according to production, according to attraction, or according to the mean value of both sums. In the model, modelling has been performed for every method separately in order to determine which approach is the best one. In this phase the transport demand is split into layers: 12 layers of transport demand are defined (Table 1), for which 12 OD matrices are defined in the trip distribution sub-model. The parameters a, b and c of the combined function of travelling costs from (5) significantly affect the trip distribution between zones. The percentage distribution of trip volume in relation to the travelled distance is entered as the input datum. By using the KALIBRI module in the software tool PTV Visum, the parameters of the combined travelling costs function have been obtained (Table 2), with the corresponding travelling costs distribution functions (Figure 5).
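The three balancing options mentioned above (according to production, according to attraction, or according to the mean of both sums) can be sketched as follows. The function name and the `method` argument are hypothetical; the sketch simply scales both vectors to a common target total.

```python
from typing import List, Sequence, Tuple


def balance(
    production: Sequence[float], attraction: Sequence[float], method: str = "production"
) -> Tuple[List[float], List[float]]:
    """Scale production and attraction so that both sum to the same total.

    method: "production" keeps the production total, "attraction" keeps the
    attraction total, "mean" uses the mean value of both totals.
    """
    total_p, total_a = sum(production), sum(attraction)
    if total_p <= 0 or total_a <= 0:
        raise ValueError("both totals must be positive")
    if method == "production":
        target = total_p
    elif method == "attraction":
        target = total_a
    elif method == "mean":
        target = (total_p + total_a) / 2.0
    else:
        raise ValueError(f"unknown balancing method: {method}")
    balanced_p = [p * target / total_p for p in production]
    balanced_a = [a * target / total_a for a in attraction]
    return balanced_p, balanced_a
```

With productions summing to 300 and attractions summing to 400, balancing according to production scales every attraction by 0.75, while the mean method scales both vectors to a total of 350.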

Once the gravity model parameters have been estimated, the cells of the OD matrix can be filled in. The OD trip matrix can be presented graphically by trip desire lines between zones (Figure 6).
After the initial OD trip matrix has been made for every transport demand layer, the trips can be assigned to the links of the traffic network by means of equilibrium and stochastic assignment. After the trips have been assigned, the deviation of the link loads obtained by the proposed model from those obtained by the traffic counts has been analyzed. The results were compared by calculating the coefficient of determination $R^2$ for every transport demand layer (Figure 7).
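The comparison of modelled and counted link loads can be reproduced with a short sketch like the one below. The function names are hypothetical, and the relative RMSE definition used here (root mean square error with N - 1 in the denominator, divided by the mean counted value) is an assumption; it is one common definition and may differ from the one used in the software.

```python
import math
from typing import Sequence, Tuple


def linear_fit(counted: Sequence[float], modelled: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line modelled = slope * counted + intercept, plus R^2."""
    n = len(counted)
    if n < 2 or n != len(modelled):
        raise ValueError("need two equally long series with at least two links")
    mean_x = sum(counted) / n
    mean_y = sum(modelled) / n
    sxx = sum((x - mean_x) ** 2 for x in counted)
    syy = sum((y - mean_y) ** 2 for y in modelled)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counted, modelled))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def relative_rmse(counted: Sequence[float], modelled: Sequence[float]) -> float:
    """Assumed definition: sqrt(sum((M - O)**2) / (N - 1)) / mean(O)."""
    n = len(counted)
    if n < 2 or n != len(modelled):
        raise ValueError("need two equally long series with at least two links")
    squared = sum((m - o) ** 2 for m, o in zip(modelled, counted))
    return math.sqrt(squared / (n - 1)) / (sum(counted) / n)
```

Computing both indicators for every transport demand layer makes it possible to rank the layers in the same way as described in the text, i.e. by the highest coefficient of determination and the lowest relative error.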
The layer which has the highest coefficient of determination $R^2$ and the lowest relative root mean square error (RelRMSE) in the morning and afternoon peak hours is then corrected by applying fuzzy logic. Table 3 shows the transport demand layers that featured the best results. In the morning peak hour this was the layer with the trip generation factors according to the proposed model, balanced according to production and assigned to the traffic network by the equilibrium method. In the afternoon peak hour the best results were obtained by the layer with the modelled trip generation factors, balanced according to attraction and likewise assigned to the traffic network by the equilibrium method.
After selecting the transport demand layer which best describes the current situation on the traffic network, GEH has been determined for those links for which traffic count data are available. GEH > 5 was found on only seven links in the morning peak hour and on six links in the afternoon peak hour, so the correction of the initial OD matrix by means of fuzzy logic was performed for the links with GEH > 3. On these links the correction of data by means of fuzzy logic was implemented gradually in the software tool PTV Visum. The counted values and their tolerance are entered in the link attributes so that the fuzzy set can be defined. The result of applying fuzzy logic is the corrected OD trip matrix, which is then assigned to the traffic network by equilibrium trip assignment. In the morning peak hour the use of fuzzy logic changed the production of the zones by approximately 38% and the attraction by 45%. In the afternoon peak hour the production of the zones changed on average by 52% and the attraction by 48%.
With the application of fuzzy logic, in the model of the morning peak hour GEH remained higher than 4 on only four links, the coefficient of determination $R^2$ reached the value of 0.975, and the relative root mean square error the value of 0.112 (Figure 8). In the model of the afternoon peak hour GEH remained greater than 4 on nine links, the coefficient of determination $R^2$ reached the value of 0.976, and the relative root mean square error was 0.110.
The traffic load of links after equilibrium assignment of the corrected OD trip matrices is presented graphically in Figure 9.

CONCLUSION
In this paper a procedure has been developed which allows traffic models of smaller cities to be made on the basis of data collected by traffic counts. The starting assumption is that the traffic load of a road represents the percentage share of the total number of trips from zone i to zone j which are realized through the observed traffic route. The results of implementing the proposed model on the example of a smaller city, presented in the paper, are satisfactory, particularly when fuzzy logic is used for the correction of the initial OD trip matrix.
One of the major obstacles in applying the proposed model is the organization of the human resources needed to collect and analyze traffic volumes on the transport network. As shown in the model, one of the major steps is the calibration of the gravity model parameters. It is very important to predict the trip length distribution at a satisfactory level, since it is the input datum for gravity model calibration. The division of the traffic network into zones can also be challenging, especially in areas with mixed trip purposes. In some cities it may be difficult to determine representative zones for the prediction of trip generation rates if residential streets without transit traffic cannot be isolated.
The developed countries have a long tradition of collecting traffic data through surveys and interviews of household members, passenger car motorists and passengers in vehicles of public urban transport. Until the developing countries recognise the importance of continuous data collection for the needs of traffic planning and modelling, the application of the transitional solution presented in this paper is proposed. Future research will also have to study models for assessing trips made by public urban transport.

Figure 3 - Application of fuzzy logic for the correction of initial OD matrix

Figure 5 - Trip costs distribution function in relation to the distance for the demand layers JMP, JMA and JMSR

Figure 7 - Regression line when assigning the initial OD trip matrix in morning peak hour

Table 1 - Transport demand layers in the proposed model

Table 2 - Estimated parameters a, b and c of combined cost function for single transport demand layers

Table 3 - Transport demand layers with highest coefficient of determination and lowest RMSE