CHEMICAL ENGINEERING TRANSACTIONS VOL. 51, 2016 A publication of The Italian Association of Chemical Engineering Online at www.aidic.it/cet Guest Editors: Tichun Wang, Hongyang Zhang, Lei Tian Copyright © 2016, AIDIC Servizi S.r.l., ISBN 978-88-95608-43-3; ISSN 2283-9216 An Improved Support Vector Machine Algorithm and its Application in Intelligent Transportation System Ronghui Fu College of Computer Science, Neijiang Normal University, Neijiang, SiChuan, China, 641110 furonghui@163.com Intelligent transportation system as the combination of computer technology, information technology, communication technology, electronic control technology and transportation has been a hot point has been widely used to save many existing problems in transportation. It is widely accepted that traffic incident has strong randomness and unpredictable destructiveness Support vector machine proposed by Vapnik et al. is introduced in this paper to solve the existing problems in traffic incidents in order to help improve the efficiency and effect in intelligent transportation system. Here, support vector machine is improved by introducing particle swarm optimization (PSO) which is a powerful and easy way to implement. This improved method can optimize both the optimal feature subset and parameters in SVM, which can further to solve time in computation. Finally, an experiment is demonstrated to show the application of the proposed method in intelligent transportation system. 1. Introduction With the rapid development of highway construction, our country has acquired significant economy benefits and social benefits. Highway as the result of economy development has been an important symbol of modernization of a country (Levin and Krause, 1978; Li and McDonald, 2004). However, recently many dangerous events have happened and resulted in losing lives. Then, intelligent transportation system, as the combination of computer technology, information technology, communication technology, electronic control technology and transportation has been a hot point in the academy and industry fields (Janusova and Cicmancova, 2016). Intelligent transportation system has been widely used to save many existing problems in transportation such as highway. Traffic incident has strong randomness and unpredictable destructiveness (Yuan and Chen, 2003). So, forecast related to traffic incidents in transportation especially in highway is also a difficult problem. Up to now, many researchers have proposed many algorithms and models to deal with this problem (Aburomman and Ibne, 2016; Dae, et al., 2015). As early as 1978, Levin and Krause developed Bayesian algorithm to distinguish crowded events by calculating the change of occupancy of traffic emergency. Based on floating car, Li and McDonald proposed a new detection algorithm to detect traffic incidents in highway. Here, support vector machine is a popular algorithm to handle prediction and detection problems in intelligent transportation system (Le, 2014). Support vector machine proposed by Vapnik et al. is a set of related supervised learning methods used for prediction and classification. To build a SVM based prediction and classification model, feature subset selection is an important issue. There is advantage to limit the number of input features in a classifier to produce a good predictive and less computationally intensive model. With a small and appropriate subset, the rational classification decision can be generated easier. In general, SVM uses a function to map the data into a different feature space. This transformation often comes in the form of mapping to a high-dimensional space. A function used to perform this transformation is called kernel function which plays a critical role both in the theory and application of SVM. In this paper, based on particle swarm optimization (PSO) (Malik et al., 2011; Kenedy and Eberhart, 1995), an improved SVM alogrithm is proposed to be applied in intelligent transportation system so as to detect or predict traffic incidents in time. POS is a population-based search algorithm that is initialized with a population DOI: 10.3303/CET1651101 Please cite this article as: Fu R.H., 2016, An improved support vector machine algorithm and its application in intelligent transportation system, Chemical Engineering Transactions, 51, 601-606 DOI:10.3303/CET1651101 601 of random particles. Each particle in the POS flies through the search space at a velocity that is dynamically adjusted according to its own and its companion’s historical behavior. This improved method can optimize both the optimal feature subset and parameters in SVM, which can further to solve time in computation. Finally, an experiment is demonstrated to show the application of the proposed method in intelligent transportation system. This pape is organized as follows. Section 2 describes the basic concepts of kernel function and support vector machine. Section 3 demonstrates the concepts of particle swarm optimization and the process of the improved SVM algorithm. Section 4 shows the appliction of the proposed method in intelligent transportation system. Stection 5 concludes this paper. Section 6 gives the references in this paper. 2. Basic concepts In this section, we will introduce the basic concepts about kernel function and SVM so as to help understand the proposed method in section 3. 2.1 Kernel function In general, kernel function as an important function in support degree machine can provide a way to transform from the linear learning machine to nonlinear learning machine. Then, we will demonstrate some relative concepts of kernel function in the following. Definition 1. (Feature Space) Given an initial data set s={(x1, y1),(x2, y2),(xi, yi)}, where xiRn and yiRm. A feature mapping is : n n x R H R    , (1) where H is feature space, denoting space Hilbert . Definition 2. (Kernel function) Let binary function K(x,x’) belonging to RnRn be a kernel function of RnRn. If there is a mapping from Rn to any feature space H, then it can be obtained that ( , ) ( ), ( ) i j K x x x x     , (2) where <(xi), (xj)> denotes inner product of H. Definition 3. (Positive semi-definite matrix) Let binary function K(xi, xj):RnRnR be positive definite. If it is symmetric such as K(xi, xj)=K(xj, xi), and satisfies , 1 ( , ) 0 m i j i j i j K x x    , (3) where ml (positive integer), xi, x2…,xmRn and ai, a2…,amRn. Definition 4. (Gram matrix) Given K(x,x’):RnRnR and xi, x2…,xiRn. Then, the element in the i line of the j column constructs a ll matrix of Kij=K(xi, xj). Therefore, K is a Gram matrix of K(x,x’) related to xi, x2…,xl. The common core function includes the following types: (1) polynomial core function,    , 1 d i i k x x xx  ; (2)Gauss core function,   2 2 , exp 2 i i x x k x x           ; (3) Sigmoid core function,    , tanh ,i ik x x v x x c   . 2.2 Modelling of SVR Based on VC dimension theory of statistical learning and structural risk minimization, Vapnik et al. proposed support vector machine as a new general learning method. Through using finite sample information, SVM can search an optimal compromise between complexity and learning capability of model to obtain better predictive ability and solve some practical problems such as small sample, non-linearity, high dimension and partial 602 minimal point. Here, SVM only need to determine insensitive coefficient, penalty factor and core parameter where the modelling is simpler than BP network. Firstly, given the training set T= s={(x1, y1),…,(xN, yN)}, where xiRn and yi(-1,1), i=1, 2, …, N. Then, we can obtain that 21 min 2  (4) s.t. ( ) 1T i i y x b   (5) where b is a threshold value used to classify. In order to deal with this binary optimization problem, Lagrange equation where optimal solution is under constraint condition is constructed as follows: 2 1 1 ( , , ) ( ( ) 1) 2 N T i i i i L b a y x b          (6) where αi is Lagrange multiplier. Based on optimality principle, it can be acquired that ( , , ) 0 L b      (7) ( , , ) 0 L b b     (8) Solving the Eqs. (7)-(8), it can be further obtained that 1 N i i i i w y x    (9) 1 0 N i i i y    (10) Then, using Eq. (10), we can obtain that   1 1 1 1 min 2 N N N i j i j i j j i j j y y x x         (11) S.t. 1 0, 0 N i i i i y      1, 2, ,i N . (12) Using optimization algorithm, we can acquire that * * 1 ( ) N i i i i k i b y y x x     . (13) After that, it can be derived that   * * 1 sgn ( ) N i i i k i y x y x x b           , (14) where sgn represents sign function. Parameter optimization and feature selection are two important aspects in SVM. Now, many researchers have devoted to study these two points such as grid searching method, gradient descent and meta-heuristic algorithm (Cordeiro and Pappa, 2011; Garcia Nieto et al., 2016). However, there are some drawbacks in these methods. So, it is difficult to obtain optimal result. Then, in this paper, we propose an improved SVM method based on POS algorithm. 3. Improved support vector machine In this section, the new SVM algorithm is proposed and the process is demonstrated below. 603 Firstly, we will introduce the concepts of particle swarm optimization which is an evolutionary computation technique. Inspired by social behaviour among individuals, particles can represent a potential problem solution moving through n-dimensional search space. The PSO starts out with a set of agents called particles in random velocity. Each particle shows a record of the position of its previous best performance in a vector called pbest. The nbest is another best value and can be tracked by the particle swarm optimizer, which can be also obtained so far by any particle in that particle’s neighbourhood. If a particle takes the entire population as its topological neighbours, the best value is a global best and is called gbest. It is important step to define fitness in POS. Classification accuracy and the number of feature are two criteria used to define a fitness function. Let pi,j denote the best previous position encounter by the ith particle. pg,j denotes the global best position so far and t denotes the iteration counter. The current velocity of dth dimension of the ith particle at time t is defined as        , j , j 1 1 , , 2 2 , ,1 t t t t i i i j i j g j g j v t v t c r p x c r p x          where r1 and r2 are random function in the range [0,1], position constant c1 and c2 are personal and social learning factors, X = {x1, x2, …, xn} is population consisted by n particles, velocity is restricted to the [-vmax,vmax] range in which vmax denotes the predefined boundary value. Compared with genetic algorithm, POS has no evolution operators such as crossover which makes it easy to implement with great success to several problems. In order to obtain better performance of SVM, POS algorithm is introduced to acquire balance between global level and local level. Step 1: Select n+2 dimensional particles to code individual. Step 2: Initialize population randomly and assign upper and lower bound, initial speed, size of population and the number of iterations. Step 3: Train SVM model using the feature sets selected in Step 2. Step 4: Construct objective function considering ACC, SVs and the number of features. 1 1 2 1 3 1 2 3 1 1 K i n i i Test Accuracy f avgacc K nsv f m ft f avgacc n f f f f                                             (15) α, β, γ represents the weights of classification accuracy, the number of support vector and selected feature sets. Weight can adjust the contributions of different objectives on classification. Here, α, β, γ can be obtained from  1 2 2 max = - t t     ,  1 2 2 max = - t t     ,  1 2 2 max = - t t     which satisfy 1 1 1 + + =1   and 2 2 2 + + =1   . Step 5: Increase the number of iterations, iter = iter +1. Step 6: Increase the number of population and renew the position and speed of each particle. Step 7: Introduce mutation strategy as follow:     ' , 0 , 1 k k k k k x t UB x if rand x x t x LB if rand          , (16) where xk demonstrates a variation in particles, x' k demonstrates the value after mutation and UB and LB shows upper limit and lower limit of xk. Step 8: Train SVM model using the feature sets selected in Step 6 and calculate fitness values of each particle based on Eq. (15). 604 Step 9: Compare the existing fitness value and pfit in internal storage to renew optimal fitness value of individual pfit and optimal position of individual pbest. Step 10: If the order of particles reaches the maxi size of population, go to Step 11. Otherwise, go to Step 6. Step 11: Compare the gfit value and the optimal pfit to renew optimal fitness value of global level gfit and optimal position of global level gbest. Step 12: If condition is satisfied, then go to Step 13. Otherwise, go to Step 5. Step 13: Obtain optimal parameters from gbest. 4. Simulated Experiment In this section, we will demonstrate the application of the proposed method in intelligent transportation system. The simulated data set is collected from traffic flow of road system in main road, which can signify the classificatory accuracy and correct features selection of the improve SVM method. Generally, the drawback of evolutionary-based such as the GA is the long training time when deal with a large scale data set. But in this case, by using POS, the improved method can implement very well. This experiment is performed by using MATLAB 2011. In the method, the particle movement update is performed in the application server site. A new SVM classification is constructed and the feature subset is prepared. The application server can find the global and personal best of each particle according to particle’s fitness and update the new position for each particle using the particle’s velocity and update formula. Some parameters are set as 1, 1, 1, 2, 2 and 2 are equal to 0.3, 0.7, 0, 0.6, 0.3, and 0.1. c1 and c2 are set as 2. The fitness value of selected two data sets from 9.00-10.00 and 18.00-17.00 in Figure 1. The accuracy of data sets is demonstrated in the Figure 2. Figure 1: The fitness value of the training set Figure 2: The accuracy of data sets during different period Then, based on the mentioned two figures, it is easy to know that the accuracy of the proposed method range from 50%-90%. Different data set in different time has different accuracy. With the number of iteration, the fitness is decreased from the global view. To sum up, the improved method in this paper have better accuracy of data sets and feature selection which can help us to predict the data flow in intelligent transportation system and further control traffic incidents. 605 5. Conclusion With the rapid development of highway construction, our country has acquired significant economy benefits and social benefits. However, recently many dangerous transportation events have happened and resulted in losing lives. Then, intelligent transportation system, as the combination of computer technology, information technology, communication technology, electronic control technology and transportation has been a hot point in the academy and industry fields. It is widely accepted that traffic incident has strong randomness and unpredictable destructiveness Support vector machine proposed by Vapnik et al. is introduced in this paper to solve the existing problems in traffic incidents in order to help improve the efficiency and effect in intelligent transportation system. Here, support vector machine is improved by introducing particle swarm optimization (PSO) which is a powerful and easy way to implement. This improved method can optimize both the optimal feature subset and parameters in SVM, which can further to solve time in computation. Finally, an experiment is demonstrated to show the application of the proposed method in intelligent transportation system. Acknowledgment Project Name: Provincial Excellent Engineer Programs-The Software Engineering(the file of Neijiang Normal University in 2014 with number 19). Reference Aburomman A. A., Ibne Reaz M. B., 2016, A novel SVM-KNN-POS ensemble method for intrusion detection system, Applied Soft Computing, 38, 360-372. Cordeiro Z., Pappa C. L., 2011, A POS algorithm for improving multi-view classification, in: 2011 IEEE Congress on Evolutionary Computation (CEC), IEEE, 925-932. Dae Ko Y., Jae Jang Y., Seok Lee M., 2015, The optimal economic design of the wireless powered intelligent transportation system using genetic algorithm considering nonlinear cost function, 89, 67-79. Garcia Nieto P. J., Garcia-Gonzalo E., Alonso Fernandez J. R., Diaz Muniz C., 2016, A hybrid PSO optimized SVM-based model for predicting a successful growth cycle of the spirulina platrnsis from raceway experiments data, Journal of Computational and Applied Mathematics, 291, 293-303. Janusova L., Cicmancova S., 2016, Improving safety of transportation by using intelligent transport systems, Procedia Engineering, 134, 14-22. Kenedy J., Eberhart R. C., 1995, Particle swarm optimization, in: Proceedings of IEEE International Conference on Neural Networks, Piscataway, NJ, 1942-1948. Le Thi H. A., Thanh Vo X., Pham Dinh T., 2014, Feature selection for linear SVMs under uncertain data: Robust optimization based on difference of convex functions algorithms, Pattern Recognition Letters, 42, 40-46. Levin M., Krause G. M., 1978, Incident detection: A Bayesian approach. Li Y., McDonald M. 2004, Motorway incident detection using probe vehicles. Transport, 158, 11-15. Yuan F., Cheu R. L., 2003, Incident detection using support vector machines. Transportation Research Part C, 11, 309-328. Malik A. J., Shahzad W., Kahn F. A., 2011, Binary POS and random forests algorithm for probe attacks detection in a network, in: 2011 IEEE Congress on Evolutionary Computation (CEC), IE 606