A MULTI-CLASSIFICATION METHOD OF IMPROVED SVM-BASED INFORMATION FUSION FOR TRAFFIC PARAMETERS FORECASTING

With the enrichment of perception methods, the modern transportation system contains many physical objects whose states are influenced by many information factors, which makes it a typical Cyber-Physical System (CPS). Traffic information is therefore generally multi-sourced, heterogeneous and hierarchical. Existing research shows that accurately classifying multi-sourced traffic information during information fusion yields better parameter forecasting performance. To solve the problem of accurate traffic information classification, this paper analyses the characteristics of multi-sourced traffic information, uses a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion, and proposes a multi-classification method using an improved SVM in information fusion for traffic parameters forecasting. An experiment was conducted to examine the performance of the proposed scheme, and the results show that the method produces more accurate and practical outcomes.


INTRODUCTION
Traffic parameters forecasting is important for understanding the mechanism of the traffic system and for realizing traffic flow control and guidance. The forecasting results can be used as the input of the traffic control centre for making active traffic control strategies and publishing traffic information, providing real-time, effective information that helps travellers choose better routes [1].
Meanwhile, the modern transportation system has many physical objects and information factors. The information elements and the physical elements are fused together to achieve the information communication, coordination, optimal decision-making and control of a Cyber-Physical System (CPS). Its core is the organic integration of computation, communication and control technology to achieve real-time monitoring, analysis and control of interconnected physical systems of different scales, and to provide the feedback through which the CPS realizes accurate identification of traffic physical targets [2]. Thus, the modern transportation system is a typical CPS, and the traffic information for traffic parameters forecasting comes from a large number of multi-sourced sensors. Existing research results show that accurately classifying multi-sourced traffic information in the process of information fusion achieves better parameter forecasting performance. Reference [3] reviews traffic information fusion approaches applied to traffic monitoring and forecasting, emphasizing the importance of using a suitable and practical method for traffic information classification in traffic information fusion. As [4] states, traffic information fusion encompasses two dimensions: (a) spatio-temporal semantics (point or section measurements of traffic information), and (b) aggregation level (single event or aggregated over a given period of time); these dimensions add complexity to the problem of fusing traffic information from multi-sourced sensors.
Therefore, how to classify multi-sourced traffic information and solve the multi-classification problem in the process of information fusion, so as to achieve better parameter forecasting performance, is worthy of study. The ability to provide such traffic parameters forecasting performance is the result of technological and computational advances that have enabled researchers to collect data and subsequently predict at very high temporal resolutions [5]. However, traffic information is collected from a large number of homogeneous and heterogeneous sensors. The collected information mainly covers four elements: people, cars, roads and environment, and has multi-sourced, heterogeneous and hierarchical attributes. In a sense, traffic information classification is a typical multi-classification problem.
To solve this problem, some authors have worked within a Bayesian framework [6]. Others have employed the Kalman filtering technique [7], neural networks [8] or system identification [9], and more recently a non-parametric paradigm has been adopted via kernel functions [10]. These proposals are not suitable for traffic information classification and fusion except in some special situations (for some network configurations or with high detector coverage).
Some authors have also adopted the typical Support Vector Machine (SVM) [11] for information classification. Although SVM is an excellent method for information classification, it cannot be used directly for traffic information fusion, because its multi-classification problem is hard to solve and large-scale sample experiments are difficult to implement. In addition, there is little relevant research on information fusion from the CPS point of view, and the importance of either the cyber system or the physical system tends to be neglected. A CPS is a highly hybrid system that deeply integrates the physical system and the cyber system to solve application problems more effectively. T-CPS is one specific CPS application, the deep coupling of CPS with the Intelligent Transport System (ITS), and it uses the cyber-physical perspective to solve traffic problems. The status information of traffic physical entities is mapped to the cyber system; the information elements and the physical elements are then fused together through 3C (Computation, Communication, Control) technology to realize information communication, coordination and optimal decision-making. Finally, the traffic physical entities are accurately identified, controlled and optimized using the fused information with feedforward and feedback mechanisms [2].
Thus, traffic information fusion in the cyber system of T-CPS is one of the key segments for obtaining accurate information to influence the physical system. In detail, traffic information classification is an objectively existing problem in traffic information fusion, because the traffic information is collected by heterogeneous physical sensors and must be classified for further information processing in the cyber system. Traffic information classification is therefore one of the typical cyber-physical fusion steps in T-CPS.
The above shortcomings are alleviated and the problem of multi-classification is solved by analysing the characteristics of T-CPS information and improving SVM. Then, an improved multi-classification method is designed for SVM-based information fusion for traffic parameters forecasting, and our approach is validated using traffic parameters forecasting. Our protocol is also evaluated and compared with other methods; the results of those analyses show that it leads to much smaller prediction errors and demonstrate its effectiveness.
The remainder of the paper is structured as follows: Section 2 gives the theoretical foundation of SVM and the shortcomings of typical SVM applied to traffic information fusion; Section 3 improves SVM for multi-classification and proposes a multi-classification method of improved SVM-based information fusion for traffic parameters forecasting; Section 4 validates our protocol and compares it against other methods; Section 5 summarizes the conclusions and discusses open research directions in information processing.

THEORETICAL FOUNDATION
SVM has excellent statistical and learning ability. Compared with the heuristic learning style and heavy reliance on experience of neural networks, SVM has a stricter theoretical and mathematical foundation and does not suffer from the local minimum problem. Many problems involving small samples, non-linearity, high dimensionality and local minima in pattern recognition can be solved by SVM. SVM has become a new research hotspot after neural networks and has been applied successfully in a wide variety of fields, such as signal processing, regression analysis and function approximation [12].
SVM is a pattern recognition method based on statistical learning theory. Its essence is that the input vectors are mapped to a high-dimensional feature space through the kernel function of a selected non-linear mapping.
SVM has two types: Support Vector Classification (SVC) and Support Vector Regression (SVR) [13]. The output U(x) of SVC has two options: -1 means not belonging to the class and 1 means belonging to the class, while U(x) of SVR can be any real number. Thus, the traffic information fusion issue belongs to the SVR type. An optimal separating hyperplane is constructed in this paper [14]. The typical SVM process is illustrated in Figure 1.

Figure 1 - The process of SVM classification
The training result expression is then obtained from this process.
Although SVM has many advantages, it also has two main shortcomings:
1) It is hard to handle large-scale samples with SVM. SVM solves for the support vectors by means of Quadratic Programming, but solving the Quadratic Programming problem involves an n-order matrix (n is the number of samples). Thus, it costs a large amount of memory and running time when n is large.
2) It is difficult to solve multi-classification problems with SVM. Typical SVM only gives a two-category classification algorithm, but traffic information fusion needs to solve a multi-classification problem.
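To give a feel for the first shortcoming, the following minimal sketch estimates the memory needed to hold the n-order matrix in full. It assumes dense storage with 8-byte floating-point entries; the function name `kernel_matrix_bytes` is hypothetical and used only for illustration, and practical solvers may avoid storing the whole matrix.

```python
def kernel_matrix_bytes(n_samples: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed for a dense n x n matrix (assumption: 8-byte floats)."""
    return n_samples * n_samples * bytes_per_entry


if __name__ == "__main__":
    # Memory grows quadratically: ten times more samples, a hundred times more memory.
    for n in (1_000, 10_000, 100_000):
        gib = kernel_matrix_bytes(n) / 2**30
        print(f"n = {n:>7}: {gib:10.3f} GiB")
```

The quadratic growth is the point of the example: the sample count, not the feature dimension, drives the cost.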
On the basis of the above statements, the general approach is to decompose a multi-classification problem into many binary-class problems, and combining many binary-class SVMs has become the most popular method for multi-classification problems [15]. Commonly used multi-class classification methods based on SVM include one-against-one (OAO), one-against-all (OAA), directed acyclic graph SVM (DAG-SVM) and binary-tree SVM (BT-SVM). Among these, OAO and OAA are the most common. Although they are superior in classification accuracy, both require a great amount of calculation, so they are mainly used for non-real-time applications [16]. DAG-SVM and BT-SVM offer superior computing speed because fewer binary-class SVMs are evaluated when an unknown sample is classified by a trained model. Besides, BT-SVM needs fewer binary-class SVMs in the training phase than DAG-SVM. Thus, BT-SVM takes advantage of both the efficient computation of the tree architecture and the high classification accuracy of SVMs. These advantages make it applicable to some typical multi-classification problems, such as image restoration [17], tree species classification [18], fingerprint identification [19], eleven benchmarking classification problems [20] and so on.
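The comparison above can be made concrete by counting binary-class SVMs for k classes. The sketch below uses the usual textbook counts for each scheme; the function name `binary_svm_counts` is hypothetical, and the BT-SVM evaluation count assumes a balanced tree (an unbalanced tree may need up to k - 1 evaluations).

```python
def binary_svm_counts(k: int) -> dict:
    """Number of binary SVMs trained, and evaluated per test sample, for k classes."""
    if k < 2:
        raise ValueError("at least two classes are required")
    pairs = k * (k - 1) // 2
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 2: depth of a balanced tree.
    balanced_depth = (k - 1).bit_length()
    return {
        "OAO": {"trained": pairs, "evaluated": pairs},
        "OAA": {"trained": k, "evaluated": k},
        "DAG-SVM": {"trained": pairs, "evaluated": k - 1},
        "BT-SVM": {"trained": k - 1, "evaluated": balanced_depth},
    }


if __name__ == "__main__":
    for name, counts in binary_svm_counts(8).items():
        print(f"{name:8s} trained={counts['trained']:3d} evaluated={counts['evaluated']:3d}")
```

Under these assumptions BT-SVM trains the fewest classifiers and, with a balanced tree, evaluates the fewest at test time.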
To the best of our knowledge, the use of BT-SVM for the typical multi-classification problem of traffic information classification in information fusion has rarely been reported. Thus, BT-SVM is applied to our fusion model for traffic parameters forecasting.

BT-SVM
In this paper, the improved SVM is used to classify the information fusion nodes and to improve the feedback hierarchical structure of the original algorithm, and a cross feedback model is used for information fusion. The parallel cross feedback algorithm can effectively shorten the training time of the SVM algorithm and has good scalability.
To solve the multi-classification problem, assume that the target has k types and construct a binary tree SVM in which every leaf node corresponds to a type and every non-leaf node, which has degree two, corresponds to an SVM. The hierarchical structure of the binary decision tree SVM for k-class traffic information classification is shown in Figure 2. The BT-SVM solves multi-class problems with a binary tree in which every node makes a binary decision using a binary-class SVM [21].
Thus, it takes advantage of both the efficient computation of the tree architecture and the high classification accuracy of SVMs. In SVM training, the training phase determines the architecture of the BT-SVM from knowledge of the training data set. Then, a binary-class SVM is placed at each non-leaf node to train the classes. The tree should be as balanced as possible to reduce the number of layers, so that fewer SVM nodes are employed in the testing process.
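The tree structure described above can be sketched as follows. This is only an illustration of the routing logic, not the paper's implementation: a nearest-centroid rule stands in for each binary-class SVM, the class list is simply split in half to keep the tree balanced, and all names (`train_binary_stub`, `build_tree`, `classify`, the `type_i` labels and the sample points) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple, Union

Vector = Sequence[float]
BinaryRule = Callable[[Vector], bool]  # True means "take the left branch"


def _centroid(points: List[Vector]) -> List[float]:
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def _sq_dist(a: Vector, b: Vector) -> float:
    return sum((u - v) ** 2 for u, v in zip(a, b))


def train_binary_stub(left: List[Vector], right: List[Vector]) -> BinaryRule:
    """Assumed stand-in for a binary SVM: assign to the nearer group centroid."""
    c_left, c_right = _centroid(left), _centroid(right)
    return lambda x: _sq_dist(x, c_left) <= _sq_dist(x, c_right)


@dataclass
class Node:
    rule: BinaryRule  # binary decision made at this non-leaf node
    left: "Tree"
    right: "Tree"


Tree = Union[Node, str]  # a leaf is just a class label


def build_tree(labels: List[str], data: Dict[str, List[Vector]]) -> Tree:
    """Split the class list in half at every level so the tree stays balanced."""
    if len(labels) == 1:
        return labels[0]
    mid = len(labels) // 2
    left_labels, right_labels = labels[:mid], labels[mid:]
    left_pts = [p for lab in left_labels for p in data[lab]]
    right_pts = [p for lab in right_labels for p in data[lab]]
    return Node(train_binary_stub(left_pts, right_pts),
                build_tree(left_labels, data),
                build_tree(right_labels, data))


def classify(tree: Tree, x: Vector) -> Tuple[str, int]:
    """Return the predicted label and the number of binary decisions used."""
    evaluated = 0
    while isinstance(tree, Node):
        evaluated += 1
        tree = tree.left if tree.rule(x) else tree.right
    return tree, evaluated


def count_nodes(tree: Tree) -> int:
    """Number of non-leaf nodes, i.e. binary classifiers that were trained."""
    if isinstance(tree, Node):
        return 1 + count_nodes(tree.left) + count_nodes(tree.right)
    return 0


# Toy, made-up samples for k = 4 types.
samples = {
    "type_1": [(0.0, 0.0), (0.0, 1.0)],
    "type_2": [(10.0, 0.0), (10.0, 1.0)],
    "type_3": [(0.0, 10.0), (1.0, 10.0)],
    "type_4": [(10.0, 10.0), (11.0, 10.0)],
}
tree = build_tree(list(samples), samples)
label, used = classify(tree, (10.0, 0.5))
print(label, used, count_nodes(tree))
```

With k = 4 the sketch trains k - 1 = 3 node classifiers and uses two decisions per test sample; replacing `train_binary_stub` with a real binary-class SVM would leave the routing unchanged.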

The process of the traffic information fusion
Based on the above BT-SVM and the characteristics of T-CPS data, the improved SVM-based traffic information fusion model finds a feasible regression function from the training samples. The test data are then input into the regression function, and the fusion results are taken as the final traffic forecasting parameters. The process of the traffic information fusion is shown in Figure 3.
The fusion process is designed as follows:
1) Information pre-processing. Assume that the original information has been pre-processed to eliminate errors and redundancy.
2) Sample construction. The pre-processed information is divided into training samples and test samples.
3) Select a proper generalization parameter g, which determines the generalization ability of the SVM, namely the permitted error of the function; select a proper upper bound D, also called the penalty factor, which is a parameter shared by all kernels; select the loss function, the common choice being the g-insensitive loss function, in which ξ is the slack variable; and select a suitable kernel function f(x_i, x), which is the core of SVM because the non-linear extension of SVM mainly depends on the kernel function acting on the feature space. In this paper, the cross validation method is used to estimate the predictive parameters. Then the optimization problem is constructed and solved to obtain the optimal solution a_i, i = 1, 2, ..., n.
4) Build the decision function to represent the relationship between real data and detector data. The g and kernel function obtained under the optimal conditions are then taken as the parameters of the fusion predictive model.
5) Fusion calculation. After the above training, g and the kernel function are selected and the test samples are used to predict the traffic parameters.
6) Fusion evaluation. The prediction results are evaluated. If they satisfy the target accuracy, the trained g and kernel function are chosen as the final predictive model parameters. If not, the samples are reconstructed and steps 2 to 5 are repeated until the results satisfy the accuracy requirement.
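For orientation, steps 3) and 4) can be read against the textbook formulation of support vector regression sketched below. This is the standard form, assumed here for illustration rather than taken from the paper: g is used as the insensitivity width, D as the penalty factor, a_i and a_i^* as the multipliers, f(x_i, x) as the kernel and U(x) as the output, while w, b, \phi and \xi_i^* are additional symbols introduced only for this sketch.

```latex
% g-insensitive loss: deviations smaller than g are not penalised
L_g\bigl(y, U(x)\bigr) = \max\bigl(0,\ |y - U(x)| - g\bigr)

% Primal problem with penalty factor D and slack variables \xi_i, \xi_i^*
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2} + D\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^*\bigr) \\
\text{s.t.}\quad
  & y_i - w^{\top}\phi(x_i) - b \le g + \xi_i, \\
  & w^{\top}\phi(x_i) + b - y_i \le g + \xi_i^*, \\
  & \xi_i,\ \xi_i^* \ge 0, \qquad i = 1, 2, \ldots, n
\end{aligned}

% Decision function built from the optimal multipliers,
% with kernel f(x_i, x) = \phi(x_i)^{\top}\phi(x)
U(x) = \sum_{i=1}^{n}\bigl(a_i - a_i^*\bigr)\, f(x_i, x) + b
```

In this reading, a larger g tolerates larger deviations before any penalty applies, and a larger D penalises the slack variables more heavily.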

EXPERIMENT AND RESULT ANALYSIS
To test and verify our method, our traffic information fusion algorithm based on the improved SVM multi-classification method is compared with the information fusion algorithms of the Bayesian Inference Model (BIM) and the Dempster-Shafer theory Model (DSM) [3]. After traffic information fusion, two forecasting algorithms, the Chaotic model [22] and the Non-parametric regression model [23], are used to test and verify BIM, DSM and our algorithm, respectively.
The cross validation method is used to select the relevant functions and parameters, which are as follows: the kernel function is linear (because the randomly weighted average function is close to a linear function) and D = 120.
Firstly, the three traffic information fusion methods are tested using the Chaotic forecasting model. The forecasting performance for flow and velocity is shown in Figure 4 and Figure 5.
The performance of our scheme is compared with the others by analysing the mean absolute error (MAE), relative mean error (RME) and root mean square error (RMSE) of the relative errors of flow prediction and velocity prediction. The results of the analysis are shown in Table 1 and Table 2.
Let N be the number of flow or velocity samples, let $y_i$ be the actual value of flow or velocity, and let $\hat{y}_i$ be the predicted value. MAE, RME and RMSE are computed from these quantities; for instance, $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}|y_i - \hat{y}_i|$.
From Table 1 and Table 2, it can be seen that, after information fusion using our scheme, the MAE, RME and RMSE of flow and velocity are all lower than those of the other methods.
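The three error measures can be computed as in the following minimal sketch. The MAE and RMSE formulas are the standard ones; the RME formula (mean of the absolute errors divided by the actual values) is an assumption about how the relative mean error is defined, and the function names and sample numbers are made up for illustration.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> int:
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    n = _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def rme(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Relative mean error (assumed definition: mean of |error| / |actual|)."""
    n = _check(actual, predicted)
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    n = _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


if __name__ == "__main__":
    real_flow = [100.0, 200.0, 400.0]   # made-up actual values
    fused_flow = [110.0, 190.0, 400.0]  # made-up predictions
    print(mae(real_flow, fused_flow), rme(real_flow, fused_flow), rmse(real_flow, fused_flow))
```

Note that the assumed RME is undefined when an actual value is zero, so such samples would need separate handling.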
Secondly, the three traffic information fusion methods are tested using the Non-parametric regression forecasting model. The forecasting performance for flow and velocity is shown in Figure 6 and Figure 7.
From Table 3 and Table 4, it can also be seen that, after information fusion using our fusion scheme, the MAE, RME and RMSE of flow and velocity forecasting are all lower than those of the other methods.

CONCLUSION
To achieve better traffic parameters forecasting performance, this paper focuses on the problem of multi-sourced traffic information classification in the process of information fusion. By analysing the disadvantages of SVM, comparing different SVM multi-classification methods and then improving SVM with a binary tree, the proposed method solves the multi-classification problem in the information fusion process for better traffic parameters forecasting.
To the best of our knowledge, there have been few applications of SVM in the field of analysing, processing and utilizing traffic information. This study provides exploratory research on the application of SVM to traffic information classification and fusion and solves the typical multi-classification problem. Further studies on improving the binary tree SVM for adaptively classifying all kinds of multi-source data, and on the selection of accurate parameters and kernel function models, are recommended for the future.

Figure 2 - The BT-SVM of traffic information k classification hierarchical structure

H. Zhao, et al.: A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

1 g
= pcm/min [16]. Microwave sensor data and loop detector data are used, collected from a road section west of XiShanPing in Chongqing, People's Republic of China. The detector data from March 13th, 2013 to March 25th, 2013 are used as the historical data of the flow (vehicles/h) and the velocity (km/h) for fusing and forecasting on March 26th, 2013. The detector data of March 140

Figure 4 - The flow forecasting performance of the three fusion methods for the Chaotic model

Figure 5 - The velocity forecasting performance of the three fusion methods for the Chaotic model

Figure 7 - The velocity forecasting performance of the three fusion methods for the Non-parametric regression model

Table 1 - The error comparison of predicted results of flow

Figure 6 - The flow forecasting performance of the three fusion methods for the Non-parametric regression model

Figure legend (flow figures): the real flow value; the BIM-based predicted flow value; the DSM-based predicted flow value; our model-based predicted flow value. Figure legend (velocity figures): the real velocity value; the BIM-based predicted velocity value; the DSM-based predicted velocity value; our model-based predicted velocity value.

Table 2 - The error comparison of predicted results of velocity

Table 3 - The error comparison of predicted results of flow

Table 4 - The error comparison of predicted results of velocity