PREDICTION OF COMMUTERS’ DAILY TIME ALLOCATION

This paper presents a model system to predict time allocation in commuters’ daily activity-travel patterns. Departure time and arrival time are estimated with an Ordered Probit model, and Support Vector Regression is introduced for travel time and activity duration prediction. Applied in a real-world time allocation prediction experiment, the model system shows a satisfactory level of prediction accuracy. This study provides useful insights into commuters’ activity-travel time allocation decisions by identifying the important influences, and the results can be readily applied to a wide range of transportation practice, such as travel information systems, by providing reliable forecasts of variations in travel demand over time. By introducing Support Vector Regression, it also makes a methodological contribution in enhancing the prediction accuracy of travel time and activity duration.


INTRODUCTION
An individual's daily activity-travel schedule is an important determinant of the temporal pattern of traffic demand. Understanding time allocation behaviour is not only indispensable in the development of full-scale models of daily activity-travel patterns but also crucial in forecasting temporal variation in travel demand and informing time-dependent transportation policies and measures, such as flexible working hours and real-time traveller information systems [1,2].
A daily activity-travel schedule includes the timing and duration of all the activities and trips in a day. Since individuals have limited time, they need to make joint decisions on the timing and duration of all the trips in a day. However, timing and duration have been largely treated separately in previous literature, such as the report in [3], and many existing studies only considered part of the daily activity agenda, like the one reported in [4]. Although the number of studies on daily activity-travel modelling that cover both timing and duration prediction has increased considerably in recent years, the accuracy of time prediction in these studies is determined by the performance of a whole model system, which usually includes an activity pattern model and a mode choice model in addition to the time allocation model. It is argued that an ad hoc time allocation model may enhance the accuracy of timing and duration prediction.
This study aims to address the above issues by developing a model system to predict the typical daily commute activity-travel schedule, which includes both the timing and duration of daily activities and trips. Commute time allocation is modelled because commute trips are one of the major causes of congestion during peak hours; for example, according to the travel survey data in Beijing (2005), over 32% of the total trips during peak hours are commute trips [5].
The remainder of this paper is organized as follows. Section 2 reviews the activity-travel time allocation literature in general. The data and the conceptual framework of the model are described in Section 3. This is followed by the model of travel departure time and stop arrival time in Section 4 and the model of travel time and activity duration in Section 5. Prediction of a commute activity-travel schedule is presented in Section 6. The paper closes with some major conclusions and a discussion of future research directions.

EXISTING LITERATURE
A number of studies have investigated the timing of a wide span of activities. For example, Bowman and Ben-Akiva [6] predicted the departure time of individuals' daily tours by using a Multinomial logit (MNL) model. Bhat [7] modelled the departure time of shopping trips using an Ordered generalized extreme value (OGEV) model. Small [8] constructed an MNL time allocation model for home-to-work morning departure time choice. Small [9] compared MNL, OGEV and Nested logit models in time-of-day modelling. As for duration prediction, Bhat and Steed [10] estimated a Hazard model of urban shopping trip durations. Juan and Xianyu [11] predicted daily travel time using the Hazard model.
Instead of analyzing activity timing and duration separately, joint analysis can capture the interdependence between timing and duration. Pendyala and Bhat [12] used a joint discrete-continuous model to investigate the relationship between maintenance activity start time and activity duration. Other examples, to name a few, are the report in [13] on work start time and work duration, and the study reported in [14] on the relationship between commuting time and work duration. Although these studies estimated timing and duration jointly, they were concerned with only part of the daily activity-travel pattern.
In recent years, several travel demand model systems have been developed to predict the entire daily activity-travel pattern. For instance, Bowman and Ben-Akiva [6] developed an activity-based travel demand model system, which consisted of a daily activity pattern model, a time allocation model, and a mode choice model. Kitamura et al. [15] presented a sequential simulation approach to model the daily activity-travel pattern, which includes the type, duration, location and mode of daily activities and trips. Guo and Bhat [16] constructed a model system to predict both workers' and non-workers' activity-travel patterns, including departure time, activity duration, destination and mode choice. Since these travel demand model systems jointly model time decision, mode choice and activity pattern, the accuracy of time prediction is determined by the performance of the whole model system. Therefore, instead of modelling the daily activity pattern, travel mode and time allocation with one model system, a separate daily activity-travel time prediction model is developed in this paper.

Data
This study employs data from a large-scale daily travel survey conducted in Beijing in 2005 [5]. The sur- [...] Table 1 presents all the socio-demographic and trip characteristic variables in the model, which have proven to be important influences in time allocation decisions in previous studies [3,7,17].

Analysis of the commute activity-travel agenda
In general, the daily commute activity-travel pattern (shown in Figure 1) is characterized by five different (sub-)patterns: a) Before-work pattern, which represents the activity-travel undertaken before leaving home for work in the morning; b) Home-to-work commute pattern, which represents the activity-travel undertaken during the morning commute; c) Work-based sub-pattern, which represents the activity-travel undertaken from work during daytime; d) Work-to-home commute pattern, which represents the activity-travel undertaken during the evening commute; e) Post-work pattern, which represents the activity-travel undertaken after arriving home at the end of the evening commute.
There may be more than one tour and also stops in each pattern. However, in order to reduce the complexity of the model, only the first tour in each pattern and only the first stop in the home-to-work / work-to-home commute pattern are modelled in this paper.
As shown in Figure 2, the key time and duration values in the daily activity-travel pattern include: [...] Then, the modelling framework is set up, which is composed of four categories of models as shown in [...]

TRAVEL DEPARTURE TIME AND STOP ARRIVAL TIME MODELLING

Alternatives in the models
In order to model the travel departure time and stop arrival time, continuous time was divided into discrete time intervals. These alternative time intervals were chosen by analyzing the distributions of travel departure times and stop arrival times in the survey data. Moreover, in order to analyze and predict the pattern of time allocation in the peak periods, two specific alternatives were set in the morning peak hours (6:00-9:00) and two in the evening peak hours (16:00-19:00) for the TDhome-work and TDwork-home models, respectively. Table 2 shows the alternatives for the five departure time choice models and the two arrival time choice models.
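The discretization step can be illustrated with a minimal sketch. The interval boundaries below are hypothetical (the actual alternatives are those of Table 2, which is not reproduced here), and the helper names `to_minutes` and `time_alternative` are introduced only for illustration.

```python
from bisect import bisect_right

# Hypothetical interval boundaries in minutes after midnight; the actual
# alternatives are listed in Table 2 and are not reproduced here.
BOUNDARIES = [6 * 60, 7 * 60 + 30, 9 * 60]  # 06:00, 07:30, 09:00


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' clock string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def time_alternative(hhmm: str, boundaries=BOUNDARIES) -> int:
    """Return the 1-based index of the ordered interval containing the time.

    Alternative 1 covers everything before the first boundary, and the last
    alternative covers everything at or after the final boundary.
    """
    return bisect_right(boundaries, to_minutes(hhmm)) + 1
```

With these assumed boundaries, the morning peak (6:00-9:00) is split into two alternatives, and the resulting indices preserve the time ordering that an ordered choice model relies on.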

Ordered Probit model
As shown in Table 2, the alternatives in the travel departure time and stop arrival time choice models are ordered time periods. Since the MNL model, which is commonly used in discrete choice modelling, would fail to account for the ordinal nature of the dependent variable and suffers from the IIA (independence from irrelevant alternatives) problem [18], this study employs an Ordered multiple choice model for departure time and arrival time modelling.
The Ordered multiple choice model assumes the relationship: [...] where $P_n(j)$ is the probability that alternative $j$ is chosen as the departure time of trip $n$, $\alpha$ is an alternative-specific constant, $X_n$ is a vector of the attributes of trip $n$, $\beta_j$ is a vector of estimable coefficients, and $\theta$ is a parameter that controls the shape of the probability distribution $F$. Therefore, $F$ can take various shapes depending on the value of $\theta$.
The Ordered Probit model, which assumes a standard normal distribution for $F$, is the most commonly used Ordered multiple choice model [19]. The Ordered Probit model has the following form: [...] where $\Phi(\cdot)$ is the cumulative standard normal distribution function. For all the probabilities to be positive, we must have [...]
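As an illustration only, the following sketch computes choice probabilities under the standard textbook form of the Ordered Probit model, in which alternative $j$ has probability $\Phi(\kappa_j - \beta' X_n) - \Phi(\kappa_{j-1} - \beta' X_n)$ for strictly increasing cutpoints $\kappa_1 < \dots < \kappa_{J-1}$. The common coefficient vector, the cutpoint notation and the function names are assumptions of this sketch and need not match the exact specification used in this paper.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Cumulative standard normal distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb: float, cutpoints: list[float]) -> list[float]:
    """Probabilities of the J ordered alternatives given J-1 cutpoints.

    xb is the linear index beta'X_n (assumed common to all alternatives).
    The cutpoints must be strictly increasing so that every alternative
    receives a positive probability.
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(y <= j); the last one is 1 by definition.
    cdf = [std_normal_cdf(c - xb) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cdf:
        probs.append(value - previous)
        previous = value
    return probs
```

The predicted time interval for a trip is then the alternative with the largest probability, e.g. `max(range(len(p)), key=p.__getitem__)` for a probability list `p`.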

Estimation results
Using the oprobit command in Stata [20], both the travel departure time and the stop arrival time choice models are estimated, and the results are shown in Table 3.
The results indicate that the older the commuter, the earlier the departure time of the before-work trip, while older commuters tend to depart later for both home-to-work and work-to-home commute trips. Moreover, older commuters are also found to make the first home-to-work stop later than younger commuters. Gender affects all the daily travel departure times and stop arrival times except the stop arrival time of the work-to-home pattern. Compared with women, men tend to depart later for the work-to-home commute trip and the post-work evening trip, and earlier for the other four types of trips (stops). In addition, travel departure times and stop arrival times get later for commuters with higher income, except the before-work departure time, which gets earlier as income increases. According to the survey data, people with higher income are older in general, and older people are more likely to leave home earlier for before-work morning activities, e.g., morning exercise. At the same time, people with higher income have more flexible working schedules than those with lower income, which gives them the luxury of leaving home for work later.
As for the trip-related independent variables, travel distance is significant in all the models except the SAhome-work model. For the before-work morning trip and the home-to-work commute trip, the longer the travel distance, the earlier the commuter leaves home. In contrast, the longer the distance, the later the work-based travel departure time, work-to-home travel departure time, post-work travel departure time and stop arrival time of the work-to-home pattern are. In addition, mode does not affect the stop arrival time, but it does affect the choice of travel departure time. The reason is that the stop arrival time is decided mainly by the activity that motivates the stop, whereas the travel departure time is affected by both the activity and the travel time. Since mode influences travel time, it in turn affects travel departure time.
The results also show that the higher the speed of the mode, the later the before-work and work-to-home travel departure times are, while the higher the speed, the earlier the home-to-work, work-based and post-work travel departure times are.

SVR model
As one of the Support Vector Machine (SVM) methods, a family of machine learning methods for analyzing data and recognizing patterns, Support Vector Regression (SVR) is widely applied in building regression models with a continuous dependent variable [21,22]. This study makes a first effort to introduce SVR into the estimation of travel time and activity duration.
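Given a set of input-output data pairs $\{(x_i, y_i)\}_{i=1}^{l}$ ($l$ is the number of training samples) that are randomly and independently generated from an unknown function, SVM estimates the function using the following Equation [23]:

$$f(x) = w \cdot \Phi(x) + b \tag{4}$$

where $\Phi(x)$ represents the high-dimensional feature space which is non-linearly mapped from the input space $x$. Here $w$ denotes a parameter vector and $b$ is the threshold [24]. If the domain of the output space $y$ contains continuous real values, the learning problem is referred to as SVR. Otherwise, if $y$ only takes category values, i.e., -1 and +1, it denotes Support Vector Classification (SVC) [25].
For SVR, the coefficients $w$ and $b$ in Equation (4) can be estimated by minimizing the so-called regularized risk functional:

$$R_{reg}[f] = \frac{1}{2}\|w\|^2 + C \cdot R_{emp}[f] \tag{5}$$

The first term $\frac{1}{2}\|w\|^2$ is called the regularized term, which is used as the measurement of function flatness. The second term $R_{emp}[f]$ is the so-called loss function to measure the empirical error. $C$ is the regularization constant that determines the trade-off between the training error and the generalization performance.
Here, the $\varepsilon$-insensitive loss function is employed to measure the empirical error, $R_{emp}[f] = \sum_{i=1}^{l} L_\varepsilon(y_i, f(x_i))$, with

$$L_\varepsilon(y, f(x)) = \max\{0,\ |y - f(x)| - \varepsilon\} \tag{6}$$

Equation (6) defines the $\varepsilon$ tube (shown in Figure 4). The loss is zero if the predicted value is within the tube. If it is outside the tube, the loss is the amount by which the absolute prediction error exceeds the radius $\varepsilon$ of the tube. Both $C$ and $\varepsilon$ are user-determined parameters. Two positive slack variables $\xi_i$, $\xi_i^*$ are used to cope with infeasible constraints of the optimization problem. To get the estimation of $w$ and $b$, Equation (5) can be transformed to the primal objective function (7):

$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad \begin{cases} y_i - w \cdot \Phi(x_i) - b \le \varepsilon + \xi_i \\ w \cdot \Phi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0 \end{cases} \tag{7}$$

This constrained optimization problem is solved by using the following primal Lagrangian form:

$$L = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) - \sum_{i=1}^{l} (\eta_i \xi_i + \eta_i^* \xi_i^*) - \sum_{i=1}^{l} \alpha_i \left(\varepsilon + \xi_i - y_i + w \cdot \Phi(x_i) + b\right) - \sum_{i=1}^{l} \alpha_i^* \left(\varepsilon + \xi_i^* + y_i - w \cdot \Phi(x_i) - b\right) \tag{8}$$

where $L$ is the Lagrangian, and $\eta_i$, $\eta_i^*$, $\alpha_i$, $\alpha_i^*$ are Lagrange multipliers. Hence the dual variables in Equation (8) have to satisfy the positive constraints:

$$\eta_i,\ \eta_i^*,\ \alpha_i,\ \alpha_i^* \ge 0$$

The above problem can be converted into a dual problem where the task is to optimize the Lagrangian multipliers $\alpha_i$ and $\alpha_i^*$. The dual problem contains a quadratic objective function of $\alpha_i$ and $\alpha_i^*$ with one linear constraint:

$$\max_{\alpha,\,\alpha^*}\ -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, \Phi(x_i) \cdot \Phi(x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) \quad \text{s.t.} \quad 0 \le \alpha_i,\ \alpha_i^* \le C,\ \ \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0 \tag{12}$$

By introducing the kernel function $K(x_i, x_j)$, Equation (12) can be rewritten as follows:

$$\max_{\alpha,\,\alpha^*}\ -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(x_i, x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)$$

where $K(x_i, x_j)$ is the so-called kernel function, which is equal to the inner product of the two vectors $x_i$ and $x_j$ in the feature space, $\Phi(x_i)$ and $\Phi(x_j)$, that is

$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$$

Because the map $\Phi(x)$ never has to be computed explicitly, the kernel function simplifies the use of the mapping. Some popular kernel functions are the linear kernel, the polynomial kernel and the radial-basis function (RBF) kernel. Using different kernel functions, one can construct different learning machines with arbitrary types of decision surfaces. In general, the RBF kernel, as a non-linear kernel function, is a reasonable first choice [26,27]. Thus, the RBF kernel is chosen in this work:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$$

where $\sigma$ is a parameter which determines the area of influence a support vector has over the data space.
As user-determined parameters in the SVR model and the RBF kernel, $C$, $\varepsilon$ and $\sigma$ have to be defined before model estimation. To reduce the search space, following previous literature using SVM [28,29], it is recommended to constrain the three parameters to fixed ranges, with $C$ and $\varepsilon$ bounded by powers of 2 [...] and $\sigma \in [0, 2]$. The optimal values of these three parameters are calculated by employing an exhaustive search method. The values with which the minimum RRMSE of the model is generated are taken as the optimal values. RRMSE (Relative Root Mean Square Error) is a statistic for examining the goodness-of-fit of a model. It is calculated by [...]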
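The building blocks of the parameter search can be sketched as follows. This is a minimal illustration under stated assumptions: RRMSE is taken here as the root of the mean squared relative error (other definitions exist, and the one used in this paper is not recoverable from the text), the RBF kernel uses the $2\sigma^2$ normalization, and `score` is a hypothetical callable standing in for "train an SVR with these parameters and return its RRMSE".

```python
import itertools
import math


def eps_insensitive_loss(y: float, y_hat: float, eps: float) -> float:
    """Zero inside the eps-tube, linear in the excess deviation outside it."""
    return max(0.0, abs(y - y_hat) - eps)


def rbf_kernel(x_i, x_j, sigma: float) -> float:
    """Gaussian RBF kernel, assuming the exp(-||x_i - x_j||^2 / (2 sigma^2)) form."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rrmse(observed, predicted) -> float:
    """Assumed definition: square root of the mean squared relative error."""
    terms = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(terms) / len(terms))


def exhaustive_search(score, c_grid, eps_grid, sigma_grid):
    """Return the (C, eps, sigma) triple with the lowest score.

    `score` is a hypothetical callable that trains an SVR with the given
    parameters and returns its RRMSE on held-out data.
    """
    return min(itertools.product(c_grid, eps_grid, sigma_grid),
               key=lambda params: score(*params))
```

The search evaluates every grid combination, so its cost grows with the product of the three grid sizes, which is why the parameter ranges are restricted beforehand.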

Comparing SVR and Hazard model
Instead of applying only the Hazard model [30], which is often used in duration modelling, this paper introduces SVR and compares the performance of the two methods in the TThome-work, TTwork-home and ADbefore&ADpost models. Both the Hazard and SVR models are estimated, using the streg command in Stata and the SVM Toolbox for Matlab, respectively [20,31,32]. The prediction accuracies of the TThome-work, TTwork-home and ADbefore&ADpost models are shown in Table 4. As mentioned above, a lower RRMSE value represents a higher goodness-of-fit of the model [33]. Since the RRMSE values of all three SVR models are lower than those of the corresponding Hazard models, SVR is chosen for the travel time and activity duration modelling.

Estimation results
Besides the TThome-work, TTwork-home and ADbefore&ADpost models, the estimation results of the other five travel time and activity duration prediction models are also shown in Table 4. RRMSE indicates that the forecast accuracies of all the models are acceptable, with those of the ADwork-based model and the ADbefore&ADpost model being lower than the others. Travel distance is an important factor in both travel time and activity duration decisions, as it has a significant impact on all the dependent variables. Travel mode, a proxy for travel speed, influences all the dependent variables except the travel time of the work-based sub-trip and the activity duration of before-work and post-work trips. One possible reason is that most of the work-based sub-trips and before-work and post-work trips are short-distance trips, and 55.40% of them are made on foot. This results in the insignificant effect of travel mode on travel time and activity duration. Moreover, the results show that the departure time affects the travel time of the home-to-work commute trip and the work-based sub-tour, which indicates that the traffic condition at different times of day affects the travel time. In addition, the influence of travel departure time (or stop arrival time) in the activity duration models (including the ADhwstop&ADwhstop model and the ADbefore&ADpost model) shows that commuters decide how long to stay in an activity partly based on the arrival time.

PREDICTION OF THE COMMUTE ACTIVITY-TRAVEL AGENDA
The model system developed can be directly applied to predict daily commute activity-travel schedule.
Here is an example: the first member of the family with ID number 0101040**** in our sample is a 35-year-old blue-collar worker who earns 1501-2500 RMB per month. He had a typical commute activity-travel pattern on the survey day, composed of one morning and one evening commute trip with one stop each, two work-based sub-tours, a before-work tour and two post-work tours. To examine the developed model system, only the first work-based sub-tour and the first post-work tour are considered. According to the basic travel information of this respondent (shown in Table 5), his real daily commute schedule can be mapped as in Figure 5. In comparison, his predicted daily commute activity-travel schedule is reported in Table 5 and Figure 6.
The results indicate that the typical daily commute schedule can be predicted by the model system developed in this paper with high accuracy. The maximum errors for travel time and activity duration prediction in the above example are 4.69 minutes and 13.74 minutes, while the minimum errors are 0.27 and 0.03 minutes, respectively. Furthermore, the hit ratio of the travel departure time and stop arrival time choice models is 57%. This indicates that there are some limitations with the Ordered Probit model, which are discussed below. As for two of the models that produced wrong predictions, i.e., TDwork-home and SAwork-home, the forecasted time ranges are both later than the observed values in our sample data. One potential reason may be that this commuter did not follow a common schedule for the work-to-home commute trip or for the stop during this trip, i.e., these two time points are a little earlier than the regular times, which are commonly after 16:30 according to the survey data.
The results also reveal that the prediction accuracies of the SVR models are higher than those of the Ordered Probit models. The main reason is that we divided the naturally continuous departure and arrival times into several discrete time intervals, and the number of alternative time intervals is heuristically determined. More alternatives would potentially increase the prediction accuracy, but would also significantly increase the complexity of the model.
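The two accuracy measures used above can be computed as in the following sketch. The interval indices are made-up illustrative values, not the respondent's data from Table 5, and the function names are assumptions of this sketch.

```python
def hit_ratio(observed, predicted):
    """Share of models whose predicted interval equals the observed one."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def max_abs_error(observed, predicted):
    """Largest absolute gap between observed and predicted durations."""
    return max(abs(o - p) for o, p in zip(observed, predicted))


# Made-up interval indices for seven choice models (not the paper's data):
# four predictions match, three fall in a later interval.
observed_alt = [2, 3, 1, 4, 2, 3, 5]
predicted_alt = [2, 3, 1, 5, 2, 4, 6]

print(f"hit ratio: {hit_ratio(observed_alt, predicted_alt):.0%}")
```

With seven choice models, four correct predictions give a hit ratio of 4/7, which rounds to the 57% reported above.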

CONCLUSION
In this paper, a model system of commuters' daily activity-travel time allocation has been constructed. The Ordered Probit model is employed for travel departure time and stop arrival time forecast, and Support Vector Regression is used in travel time and activity duration modelling. Time allocation of the typical daily commute activity-travel pattern is predicted with the model system.
F. Zong, J. Hongfei, P. Xiang, W. Yang: Prediction of Commuters' Daily Time Allocation
This work develops a model system that can be used to predict the time allocation decisions of all the activity-travel segments (including stops) in a typical commute day, and it focuses on temporal pattern learning and modelling, which can not only enhance the model's prediction accuracy but also make it more suitable for the analysis of time-related policies. By introducing Support Vector Regression, it also makes a methodological contribution in enhancing the prediction accuracy of travel time and activity duration. This work also serves as a foundation on which future models of the full-scale daily activity-travel pattern can be built.
The study results can be applied to a wide range of Transportation Demand Management (TDM) policies, especially measures aimed at reducing commute trips, such as flexible working hours. Flexible working hours provide commuters with a wide span of departure time choices. The model system can then be applied to predict the commuters' travel time for each departure time alternative. The predicted travel time can be used to indicate the optimal departure time and thus support the development of a flexible working hours program targeting this departure time. This study is also essential for planning the development and construction of new transportation infrastructure by providing predicted temporal travel demands. Moreover, the temporal travel demand information can be integrated into traveller information systems to help people make better travel decisions.
One limitation of the current study is that the prediction accuracy of the discrete choice model is lower than that of the continuous model, as it divides naturally continuous time into artificially defined time periods. To address this problem, we may try to use more continuous models instead of discrete choice models in daily time allocation. For example, we may employ SVR to examine how long it takes a commuter to arrive at the first stop after leaving home for work. Then the stop arrival time of the home-to-work trip can be calculated as the travel departure time of the home-to-work trip plus this time interval, instead of being modelled directly with a discrete choice model. By changing the discrete models to continuous models, the prediction accuracy of the daily time allocation model system can increase.
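The proposed continuous alternative amounts to simple clock arithmetic, sketched below. The departure time and the predicted interval are hypothetical inputs; in practice the interval would come from a fitted SVR model, which is not shown here.

```python
from datetime import datetime, timedelta


def stop_arrival_time(departure_hhmm: str, predicted_minutes: float) -> str:
    """Stop arrival time = travel departure time + predicted time to the stop.

    `predicted_minutes` stands for a continuous model's prediction (for
    example an SVR output); seconds are dropped when formatting the result.
    """
    departure = datetime.strptime(departure_hhmm, "%H:%M")
    arrival = departure + timedelta(minutes=predicted_minutes)
    return arrival.strftime("%H:%M")
```

For instance, a hypothetical home-to-work departure at 07:40 with a predicted 20 minutes to the first stop yields a stop arrival time of 08:00, without forcing the result into a predefined time interval.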

Figure 4 - Parameters for Support Vector Regression

Figure 5 - Observed daily commute time allocation

Figure 6 - Predicted daily commute time allocation

Table 1 - Variables and statistics based on survey data

Table 2 - Alternatives in the departure time and start time choice models

Table 3 - Estimation results of the departure time and arrival time choice models

Table 5 - Observed and predicted values for daily commute time allocation
1 "-" represents that there is no significant value for the item.