Decision Making: Applications in Management and Engineering
Vol. 5, Issue 1, 2022, pp. 208-224.
ISSN: 2560-6018, eISSN: 2620-0104
DOI: https://doi.org/10.31181/dmame0315052022u

MULTI-LAYER PERCEPTRON BASED TRANSFER PASSENGER FLOW PREDICTION IN ISTANBUL TRANSPORTATION SYSTEM

Anıl Utku1 and Sema Kayapınar Kaya2*

1 Munzur University, Department of Computer Engineering, Tunceli, Turkey
2 Munzur University, Department of Industrial Engineering, Tunceli, Turkey

* Corresponding author.
E-mail addresses: anilutku@munzur.edu.tr (A. Utku), semakayapinar@munzur.edu.tr (S.K. Kaya)

Received: 1 April 2022; Accepted: 15 May 2022; Available online: 15 May 2022.

Original scientific paper

Abstract: Predicting passenger movement in transportation networks is a critical aspect of public transportation systems. It allows for a better understanding of traffic patterns, as well as efficient system evaluation and monitoring. It can also support a timely response to emergencies or important events, and help address weaknesses in the urban transport system and improve service quality. In this study, transfer passenger demand in Istanbul, Turkey's biggest and most developed metropolis, was used to construct a real-world forecasting model. The number of transfer passengers has been forecasted using popular machine learning methods: kNN (k-Nearest Neighbours), LR (Linear Regression), RF (Random Forest), SVM (Support Vector Machine), XGBoost and MLP (Multi-Layer Perceptron). The dataset is made up of hourly passenger transfer counts gathered at two public transportation transfer stations in Istanbul in January 2020. The experimental results of each model have been thoroughly evaluated using the MSE, RMSE, MAE and R² metrics. According to the experimental results, MLP outperformed the other machine learning algorithms on the majority of transportation lines.

Key words: Machine learning, passenger flow management, transfer data.

1. Introduction

In 2020, the city, which straddles the Bosporus and lies in both Europe and Asia, had a population of over 15 million people, accounting for 20 percent of Turkey's total population (TUIK, 2021). According to world demographics data, Istanbul is the most crowded city in Europe and the world's fifteenth most densely populated metropolis (Statista, 2020). As a result of this high population density, the number of passengers using public transportation is also very high (Pamucar et al., 2020). Nearly 11.5 million people use public transportation in Istanbul every day. Passengers who prefer highway transportation (metrobus, public urban transportation, private bus, etc.) account for nearly 84%, followed by railway transportation (metro, light metro, tram, etc.) at around 14%, and sea transportation at just under 2% (IETT, 2021). Alongside the constant increase in the number of people and vehicles, the number of cars per thousand people is also constantly rising, which points to increasing traffic density in Istanbul. According to the traffic index published by TomTom, Europe's largest navigation systems company, Istanbul ranks fifth in the world for traffic density, at 51% (TomTom Traffic Index, 2021). As urban transportation problems grow, forecasting the number of people entering and departing Istanbul's transit terminals has become more difficult. Passenger flow forecasting provides a better understanding of travel patterns and supports efficient monitoring and evaluation of the state of the Istanbul transportation system. It may also help in responding promptly to crises or special events, in correcting defects, and in enhancing the quality of public transportation services.

Several prediction methodologies have been proposed to enhance the effectiveness of passenger forecasting models, encompassing mathematical modelling methods, statistical methods, and non-parametric methods. The machine learning (ML) framework is one of the best-known non-parametric approaches today (Boukerche and Wang, 2020). It is a subfield of AI that relates the problem of learning from data samples to the general concept of reasoning (Mitchell, 2006; Boukerche and Wang, 2020). There are two stages to any learning process: (i) estimation of the unknown relationships in a system from a given dataset, and (ii) use of the estimated relationships to predict new outputs of the system. Machine learning has also proved to be an active topic of study in passenger demand prediction, with several applications (Gummadi and Edara, 2018; Ye et al., 2019; Liu et al., 2020; Messinis and Vosniakos, 2020; Bozanic et al., 2021; Ge et al., 2021; Guo et al., 2021; Hayadi et al., 2021; Wang et al., 2021; Yang et al., 2021; Zheng et al., 2021; Kamandanipour et al., 2022; Müller-Hannemann et al., 2022). The ability to predict passenger traffic in transportation networks is critical to public transportation management. It helps to improve transportation services, provide early warnings for unusual traffic situations, and make cities smarter and safer. Furthermore, transfer passenger flow prediction can improve transfer operation efficiency, reduce transfer waiting times, and enhance passenger satisfaction.

To address this problem, a prediction model for the flow of passengers transferring between various modes of transportation (metro and tram, bus and metrobus, rail, and ferries and sea-bus) in Istanbul has been developed for the first time in the literature. The following are some of the study's contributions:

I. This paper offers a clear theoretical foundation and decision support for the practical work of using intelligent technologies to improve the prediction of the number of passengers moving between different modes, including "metro and tram," "bus and metrobus," "rail," and "ferries and sea-bus."
II. As Istanbul has very heavy traffic, the number of lines can be increased or decreased according to the number of passengers. An accurate estimate of transfer passenger volume is fundamental to transportation scheduling in Istanbul.
III. This enhances the service standards of an urban public transportation system and provides passengers with real-time transfer passenger demand information across several routes, allowing them to make better travel decisions.
IV. Predicting transfer passenger flow assists Istanbul IETT authorities and management in increasing the reliability of the public transit system, improving passenger experience, and optimizing routing plans.

The motivation of this paper is the prediction of the number of passengers transferring on various lines in Istanbul, recorded at 1-hour intervals between 1-31 January 2020. Istanbul is Turkey's biggest and most developed metropolis; therefore, the dataset used comprises passenger transfer numbers on several transportation lines in Istanbul. The goal of this research is to predict the number of transfer passengers in Istanbul, based on passenger data gathered at one-hour intervals.
The Istanbul Public Urban Transport Company (IETT), Private Public Bus (ÖHO), motor/boat, and IETT tunnel lines have been the subject of the empirical investigation. In this way, hourly, daily, and weekly time information has been revealed on these lines. The goal of this research is to use machine learning techniques to predict the number of transfer passengers on transportation lines using the kNN, LR, RF, SVM, XGBoost and MLP methods.

2. Literature Review

With the development of big data technology, using machine learning algorithms to detect the principles of urban passenger movement has become one of the research hotspots in the field of public transportation. In recent decades, there has been a huge amount of work on passenger flow forecasting using statistical approaches and, notably, machine learning. Xie et al. (2014) combined Seasonal Decomposition (SD) and Least Squares Support Vector Regression (LSSVR) to forecast short-term air passenger volume. Sun et al. (2015) proposed a hybrid Wavelet and Support Vector Machine (SVM) method consisting of three stages to predict the number of people entering and leaving the subway in Beijing. Roos et al. (2017) proposed a forecasting technique based on a dynamic Bayesian Network (BN), built to function even when passenger flow data are missing or uncertain. Ni et al. (2017) created a combined time series model based on seasonal ARIMA and a Loss Function (LF), using data from the Twitter social media platform to monitor subway passengers. Toqué et al. (2017) addressed passenger flow prediction in multimodal transport using ML methods such as Random Forest (RF) and Long Short-Term Memory (LSTM) neural networks. Zhang et al. (2017) predicted short-term passenger flow from GPS device and smart card data using the two-Step Real Time Prediction (2RTP) approach, which is based on the extended Kalman Filtering (KF) method.
Milenković et al. (2018) fitted an ARIMA model to the monthly number of train passengers, taking seasonal variations into consideration. Gummadi and Edara (2018) employed ARIMA and seasonal ARIMA to estimate short-term bus passenger flow in India's transport industry. Ye et al. (2019) predicted daily bus passenger traffic using the ARIMA method and examined the predictions against complete weekday non-peak data collected from January to March 2018. Li et al. (2020) predicted shared passenger demand in various locations with a hybrid WT-FCBF-LSTM algorithm (Wavelet Transform, Fast Correlation-based Filter, and Long Short-Term Memory). Liu et al. (2020) focused on a short-term estimation model for local bus passenger flow using SVM. Hayadi et al. (2021) proposed a Random Forest (RF) model using location data from GPS devices in the buses, the locations of the bus stops used for operation management, and the traffic volume estimated by an image processing method. Li et al. (2021) adopted seasonal ARIMA and SVM to predict the periodic flow of railway passengers. Guo et al. (2021) proposed a regression tree combined with copula-based simulations, employing passenger-level data to generate real-time distributional estimates of travel in an airport. Rajendran et al. (2021) developed logistic regression (LR), artificial neural network (ANN), RF, and gradient boosting (GB) models for assessing air taxi demand, considering factors such as temperature, weather conditions and visibility. Zheng et al. (2021) designed an integrated LR, fully connected neural network (NN) and LSTM model for anticipating abnormally large passenger flows at a metro station. Rodríguez-Sanz et al.
(2021) presented two RF algorithms that integrate flight data and passenger judgement to predict the duration of queues at check-in counters and at the security control area of Palma de Mallorca airport in Spain. Wang et al. (2021) established a LightGBM method to estimate railway passenger flow from features such as railway specifications, past weather trends, and public transport time series. According to their findings, the LightGBM methodology outperformed the XGBoost, RF, and ARIMA algorithms. Yang et al. (2021) proposed a short-term transit passenger flow prediction model that combines wavelet analysis (WA) and LSTM. Abeyrathna et al. (2021) used the Regression Tsetlin (RT) machine algorithm with pandemic indicators, such as daily COVID-19 cases and deaths and pandemic control measures, to estimate the number of transport passengers under different scenarios. Jackson et al. (2021) used various Bayesian Network (BN) models to predict bus schedule times. Ge et al. (2021) combined ARIMA and SVM to obtain a highly predictive model for passenger flow on the Shanghai-Guangzhou railway. Kamandanipour et al. (2022) presented a multi-layer ANN system to forecast the strength of demand caused by seasonal conditions using train ticket service data. Müller-Hannemann et al. (2022) investigated a new technique for approximating scenario-based resilience, employing XGBoost, Catboost, SVR and ANN models built on carefully selected key features of public transport systems. Wood et al. (2022) analysed the use of traditional LR analysis and an RF model to predict passenger occupancy on a bus at its next stops, using real-time bus operating data and meteorological data.
Reitmann and Schultz (2022) combined the gradient boosting (XGBoost) algorithm with a point-of-interest (POI) model to forecast bus passenger flow in Beijing, which helped reduce the total training time of the passenger flow forecasting model. These models are compared in detail in Table 1.

Table 1. Literature review of passenger flow prediction

| Author (year) | Models | Passenger type | Period |
|---|---|---|---|
| Xie et al. (2014) | SD, LSSVR | Air | Short |
| Sun et al. (2015) | Wavelet, SVM | Subway | Short |
| Roos et al. (2017) | BN | Metro | Short |
| Ni et al. (2017) | Seasonal ARIMA, LF | Subway | Short |
| Toqué et al. (2017) | RF, LSTM | Multimodal | Long |
| Zhang et al. (2017) | 2RTP, KF | Bus | Short |
| Milenković et al. (2018) | ARIMA | Railway | Short |
| Li et al. (2020) | WT-FCBF-LSTM | Railway | Long |
| Liu et al. (2020b) | SVM | Bus | Short |
| Li et al. (2021) | Seasonal ARIMA, SVM | Urban | Short |
| Guo et al. (2021) | RT | Urban | Short |
| Rajendran et al. (2021) | LR, ANN, RF, GB | Taxi urban | Short |
| Zheng et al. (2021) | NN, LSTM, LR | Metro | Short |
| Rodríguez-Sanz et al. (2021) | RF | Airport | Long |
| Wang et al. (2021) | LightGBM, XGBoost, RF, ARIMA | Railway | Long |
| Yang et al. (2021) | WA, LSTM | Transit | Short |
| Abeyrathna et al. (2021) | RT | Public transport | Short |
| Jackson et al. (2021) | BN | Bus | Short |
| Ge et al. (2021) | ARIMA, SVM | Railway | Long |
| Kamandanipour et al. (2022) | ANN | Railway | Short |
| Müller-Hannemann et al. (2022) | XGBoost, Catboost, SVR and ANN | Public transport | - |
| Wood et al. (2022) | LR, RF | Bus | Short |
| Reitmann and Schultz (2022) | XGBoost, POI | Bus | Short |

3. Machine Learning-Based Passenger Flow Prediction

The amount of real-time data produced by urban transportation systems is expanding, thanks to the growth of big data, internet of things, sensor network, and cloud computing applications. Passenger flow forecasting in urban transportation networks is critical in areas such as safety management, emergency response efficiency, and urban traffic management.
Passenger flow planning is important for concerns including scheduling, traffic planning, and passenger flow control. The goal of this research is to predict the number of transfer passengers in Istanbul, Turkey's largest and most developed metropolis, using passenger flow data. The dataset comprises passenger numbers on various transportation lines in Istanbul, covering both transfers and normal boardings, recorded for one month between January 1, 2020 and January 31, 2020. In practice, kNN (k-Nearest Neighbors), LR (Linear Regression), RF (Random Forest), SVM (Support Vector Machine), XGBoost (eXtreme Gradient Boosting), and MLP (Multi-layer Perceptron) have been applied, and each model's experimental findings have then been thoroughly examined using the MSE, RMSE, MAE, and R² metrics.

3.1. Original Data Analysis

This study uses a dataset of passenger numbers, covering both transfers and normal boardings, on different transportation lines in Istanbul, recorded at 1-hour intervals between 1-31 January 2020 by the Istanbul Metropolitan Municipality. The dataset consists of 23163 rows of transportation data. It contains the id, date_time, transport_type_id, transport_type_desc, line, transfer_type_id, transfer_type and number_of_passenger fields. The IETT, ÖHO, motor/boat and IETT tunnel transfer lines have been selected for prediction because they have the highest numbers of transfer passengers. The IETT transfer line refers to all bus lines offered by the Istanbul Metropolitan Municipality. The ÖHO transfer line refers to all bus lines offered by private public bus companies. Motor/boat, on the other hand, refers to all sea transportation made by marine vehicles.
IETT tunnel refers to all transfers made using the underground metro. Table 2 shows the first rows of the dataset as an example.

Table 2. A sample from the dataset

| Date_time | Line | Transfer_type | Number_of_passenger |
|---|---|---|---|
| 1.01.2020 00:00 | Motor_Tekne | Normal | 1393 |
| 1.01.2020 00:00 | Kabataş_Bağcılar | Normal | 4310 |
| 1.01.2020 00:00 | Aksaray_Airport | Normal | 2936 |
| 1.01.2020 00:00 | Kabataş_Bağcılar | Transfer | 1586 |
| 1.01.2020 00:00 | Kadıköy-Kartal Metro | Transfer | 677 |
| 1.01.2020 00:00 | Kirazlı-Olimpiyatköy | Transfer | 10 |
| 1.01.2020 00:00 | Edirnekapı-Sultançiftliği | Normal | 793 |
| 1.01.2020 00:00 | Şehir Hatları | Transfer | 59 |
| 1.01.2020 00:00 | Taksim-4.Levent | Normal | 8119 |

3.2. Methodology

In this study, popular machine learning algorithms commonly used in the literature, namely kNN, LR, RF, SVM, XGBoost and MLP, have been applied. The dataset has been pre-processed before being fed to the models: it has been checked for blank or incorrect fields. After the data pre-processing step, the training, validation, and test datasets have been selected. The dataset is split into 80% for training and 20% for testing, and 10% of the training data has been set aside for validation. The validation data has been used for the optimization of model parameters.

Time series data refers to a series of numbers ordered according to a time index. Supervised learning problems aim to estimate the output from the inputs by using a function of the form y=f(x). Time series data can be transformed into a supervised learning problem by using the values from the previous time steps to predict the value at the next time step, as seen in Figure 1.

Figure 1. Converting time series data to a supervised learning problem

In this study, time series data has been converted to a supervised learning problem by using the sliding window method, as seen in Figure 1. The number of previous timestamps determines the size of the sliding window. As a result of the experimental studies, the size of the sliding window has been set to 3. The algorithms have then been applied with the parameters optimized on the validation data, and prediction values have been obtained. The pseudo code of the developed system is presented below:

Input: Passenger transfer data on IETT, ÖHO, motor/boat and IETT tunnel lines
Output: Predicted passenger numbers
1: Start.
2: Check the data for missing and incorrect fields (data pre-processing).
3: Split the data into training, validation and test sets and normalize it.
4: Optimize the model parameters using the validation data.
5: Perform walk-forward validation.
6: Have the parameters with the lowest MSE value been selected? If yes, go to step 7; if no, go to step 4.
7: Create the model.
8: Make predictions using the created model.
9: Calculate the MSE, RMSE, MAE and R² values from the prediction results.
10: Finish.

3.3. Developed Model

This study presents a comparative analysis of the developed MLP-based model against popular machine learning algorithms on the passenger number estimation problem. MLP is a neural network model inspired by the neuron structure in the brain. An MLP is a combination of perceptrons that are connected in different ways and operate with different activation functions. It consists of input nodes, hidden nodes and output nodes. Input nodes provide input information to the network. No computation is performed on any of the input nodes.
They only relay information to the hidden nodes. Hidden nodes are not directly connected to the outside world; they perform calculations and transmit information from the input nodes to the output nodes. A collection of hidden nodes forms a hidden layer. While a network has only a single input layer and a single output layer, it can have zero or multiple hidden layers. An MLP has one or more hidden layers. Output nodes, on the other hand, are responsible for information processing and for transferring information from the network to the outside world. The developed MLP model takes the passenger flow data in the training dataset as input and predicts the passenger numbers in the test dataset. The training process has been continued according to the results obtained. The architecture of the developed model is shown in Figure 2.

Figure 2. The architecture of the developed model

The developed MLP-based model has an input layer, three hidden layers and an output layer, as seen in Figure 2. Hidden layers represent an intermediate processing step that is combined using weighted sums to obtain the classification result. The developed model is a sequential model with linear layers. There is a dropout layer between the input layer and the hidden layer. In the output layer, there are two output units that return the prediction of the probability of customer loss. The ReLU activation function is used in the input layer and hidden layers, and the sigmoid activation function is used in the output layer.

3.4. Experimental Results

This study uses a dataset covering one month of passenger numbers, including transfers and normal boardings, on different transportation lines in Istanbul, recorded at 1-hour intervals in January 2020. The IETT, ÖHO, motor/boat and IETT tunnel transfer lines, which have the highest transfer numbers, have been selected for prediction.
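
The data preparation described in Section 3.2 and the four evaluation metrics can be illustrated with a short sketch. It is a minimal pure-Python example under stated assumptions, not the authors' implementation: the function names and the toy `counts` series are hypothetical, the 80/20 split is assumed to be chronological (the paper does not say whether the data were shuffled), normalization and the validation split are omitted, and a naive "repeat the last value" predictor stands in for a trained model.

```python
import math

def sliding_window(series, window=3):
    """Turn a 1-D series into (inputs, target) pairs: the previous
    `window` values are the inputs, the next value is the target."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

def chronological_split(inputs, targets, train_fraction=0.8):
    """Split without shuffling (an assumption), so the test set holds
    the most recent samples."""
    cut = int(len(inputs) * train_fraction)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])

def regression_metrics(actual, predicted):
    """Compute MSE, RMSE, MAE and R2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}

# Hypothetical hourly counts, for illustration only (not from the dataset).
counts = [10, 12, 15, 20, 18, 16, 14, 22, 25, 21, 19, 17, 13]

X, y = sliding_window(counts, window=3)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)

# Naive stand-in for a trained model: predict the last value in each window.
predictions = [window[-1] for window in test_X]
scores = regression_metrics(test_y, predictions)
```

With these toy values the first sample is `X[0] = [10, 12, 15]` with target `y[0] = 20`. Truncating the 80% cut, as `chronological_split` does, is at least consistent with the split sizes reported for the ÖHO line: `int(716 * 0.8)` gives 572 training rows, leaving 144 for testing.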
The kNN, LR, RF, SVM, XGBoost and MLP algorithms, which are widely used in the literature, have been applied to the dataset. For each algorithm, the experimental results obtained using the MSE, RMSE, MAE and R² metrics have been compared.

The IETT transfer line covers all bus lines offered by the Istanbul Metropolitan Municipality. The IETT transfer line consists of passenger flow information for 687 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 6070 rows of data have been used for training and 1518 rows for testing. Figure 3 shows the change in the number of transfer passengers on the IETT line over time. Table 3 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the IETT line.

Figure 3. Change over time in the number of transfer passengers on the IETT line

Table 3. Experimental results for each model according to the MSE, RMSE, MAE and R² for IETT line

| Model | MSE | RMSE | MAE | R² |
|---|---|---|---|---|
| kNN | 259682.920 | 509.590 | 323.066 | 0.958 |
| LR | 635367.832 | 797.091 | 653.525 | 0.906 |
| RF | 365886.694 | 604.883 | 354.582 | 0.944 |
| SVM | 237559.150 | 487.400 | 317.741 | 0.946 |
| XGBoost | 392332.530 | 626.364 | 411.415 | 0.942 |
| MLP | 227419.633 | 476.885 | 315.104 | 0.961 |

The experimental results show that the MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 259682.920, 635367.832, 365886.694, 237559.150, 392332.530 and 227419.633, respectively. The RMSE values of kNN, LR, RF, SVM, XGBoost and MLP are 509.590, 797.091, 604.883, 487.400, 626.364 and 476.885, respectively. The MAE values are 323.066, 653.525, 354.582, 317.741, 411.415 and 315.104, respectively. The R² values are 0.958, 0.906, 0.944, 0.946, 0.942 and 0.961, respectively.

The ÖHO line covers all passenger transfers offered by private public bus companies. The ÖHO transfer line consists of passenger flow information for 716 different timestamps.
80% of this data is used for training and 20% for testing. After the train/test split, 572 rows of data have been used for training and 144 rows for testing. Figure 4 shows the change in the number of transfer passengers on the ÖHO line over time. Table 4 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the ÖHO line.

Figure 4. Change over time in the number of transfer passengers on the ÖHO line

Table 4. Experimental results for each model according to the MSE, RMSE, MAE and R² for ÖHO line

| Model | MSE | RMSE | MAE | R² |
|---|---|---|---|---|
| kNN | 1335463.741 | 1155.624 | 712.375 | 0.965 |
| LR | 2050366.100 | 1431.909 | 1117.355 | 0.949 |
| RF | 1332583.670 | 1154.375 | 710.408 | 0.967 |
| SVM | 1640037.728 | 1280.639 | 943.779 | 0.959 |
| XGBoost | 2236156.800 | 1495.378 | 931.525 | 0.944 |
| MLP | 1252185.815 | 1119.011 | 692.050 | 0.969 |

The experimental results show that the MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 1335463.741, 2050366.100, 1332583.670, 1640037.728, 2236156.800 and 1252185.815, respectively. The RMSE values are 1155.624, 1431.909, 1154.375, 1280.639, 1495.378 and 1119.011, respectively. The MAE values are 712.375, 1117.355, 710.408, 943.779, 931.525 and 692.050, respectively. The R² values are 0.965, 0.949, 0.967, 0.959, 0.944 and 0.969, respectively.

The motor/boat transfer line refers to all transfers made by sea vehicles that provide sea transportation. The motor/boat transfer line consists of passenger flow information for 618 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 494 rows of data have been used for training and 124 rows for testing. Figure 5 shows the change in the number of transfer passengers on the motor/boat line over time. Table 5 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the motor/boat line.

Figure 5. Change over time in the number of transfer passengers on the motor/boat line

Table 5. Experimental results for each model according to the MSE, RMSE, MAE and R² for motor/boat line

| Model | MSE | RMSE | MAE | R² |
|---|---|---|---|---|
| kNN | 57453.940 | 239.695 | 160.711 | 0.884 |
| LR | 48366.810 | 219.924 | 168.390 | 0.903 |
| RF | 55962.547 | 236.565 | 159.844 | 0.887 |
| SVM | 34556.885 | 185.894 | 136.343 | 0.930 |
| XGBoost | 45494.010 | 213.293 | 144.571 | 0.907 |
| MLP | 30629.115 | 175.012 | 125.101 | 0.938 |

The experimental results show that the MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 57453.940, 48366.810, 55962.547, 34556.885, 45494.010 and 30629.115, respectively. The RMSE values are 239.695, 219.924, 236.565, 185.894, 213.293 and 175.012, respectively. The MAE values are 160.711, 168.390, 159.844, 136.343, 144.571 and 125.101, respectively. The R² values are 0.884, 0.903, 0.887, 0.930, 0.907 and 0.938, respectively.

The IETT tunnel transfer line refers to all transfers made using the underground metro. The IETT tunnel transfer line consists of passenger flow information for 502 different timestamps. 80% of this data is used for training and 20% for testing. After the train/test split, 401 rows of data have been used for training and 101 rows for testing. Figure 6 shows the change in the number of transfer passengers on the IETT tunnel line over time.

Figure 6. Change over time in the number of transfer passengers on the IETT tunnel line

Table 6 shows the average MSE, RMSE, MAE and R² results obtained for each algorithm for the IETT tunnel line.

Table 6. Experimental results for each model for IETT tunnel line

| Model | MSE | RMSE | MAE | R² |
|---|---|---|---|---|
| kNN | 1909.902 | 43.702 | 34.096 | 0.879 |
| LR | 2619.355 | 51.179 | 40.426 | 0.835 |
| RF | 2070.691 | 45.504 | 34.879 | 0.869 |
| SVM | 2336.181 | 48.334 | 36.738 | 0.852 |
| XGBoost | 2832.245 | 53.218 | 40.686 | 0.836 |
| MLP | 1904.229 | 43.637 | 32.560 | 0.880 |

The experimental results show that the MSE values of kNN, LR, RF, SVM, XGBoost and MLP are 1909.902, 2619.355, 2070.691, 2336.181, 2832.245 and 1904.229, respectively. The RMSE values are 43.702, 51.179, 45.504, 48.334, 53.218 and 43.637, respectively. The MAE values are 34.096, 40.426, 34.879, 36.738, 40.686 and 32.560, respectively. The R² values are 0.879, 0.835, 0.869, 0.852, 0.836 and 0.880, respectively.

The prediction results of the developed MLP-based model are shown in Figure 7.

Figure 7. Prediction results of developed MLP-based model

Figure 7.a shows the prediction results of the developed model for the IETT line, Figure 7.b for the ÖHO line, Figure 7.c for the motor/boat line and Figure 7.d for the IETT tunnel line. As can be seen in Figure 7, the MLP-based model successfully predicted the patterns in the training and test data.

4. Conclusions and Future Studies

This study presents a comparative analysis of popular machine learning algorithms, namely kNN, LR, RF, SVM, XGBoost and MLP, for passenger flow prediction. The models have been extensively evaluated on the IETT, ÖHO, motor/boat and IETT tunnel lines using MSE, RMSE, MAE and R². For the IETT line, MLP was more successful than the other models compared. After MLP, the most successful models were SVM, kNN, RF, XGBoost and LR, in that order. For the ÖHO line, MLP was more successful than the other models compared. After MLP, the most successful models were RF, kNN, SVM, LR and XGBoost, in that order. For the motor/boat line, MLP was more successful than the other models compared.
After MLP, the most successful models were SVM, XGBoost, LR, RF and kNN, in that order. For the IETT tunnel line, MLP was more successful than the other models compared. After MLP, the most successful models were kNN, RF, SVM, XGBoost and LR, in that order.

Experimental results show that these machine learning methods can be used in passenger flow prediction problems. Among the compared algorithms, MLP achieved the best results on all of the transportation lines. MLP is a neural network model developed based on biological neural network structures. It consists of interconnected processing units, similar to the functioning of neurons. MLP's ability to model both linearly and non-linearly distributed data makes it perform well on most datasets. XGBoost is a machine learning model based on decision trees and the gradient boosting framework. It works successfully on non-structured data such as images, text and audio. kNN may be inefficient in terms of performance on small datasets. SVM is successful when there is a limited set of points. SVM handles outliers well, as it uses only the most relevant points to find the support vectors. For this reason, SVM had successful results in this study. LR is expected to be successful when the dataset is truly linear, especially when there are many features with a very low signal-to-noise ratio. RF, however, may fail to model linear combinations of many features.

All methods compared in this study had successful results. All methods had R² values above 0.90 for the IETT line, above 0.94 for the ÖHO line, above 0.88 for the motor/boat line, and above 0.83 for the IETT tunnel line. Experimental results showed that the developed MLP-based model gives better results than the compared models for all transfer lines used in the prediction of the number of passengers.
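
As a quick consistency check on the comparison above, the ranking for one line can be recomputed from the tabulated values. The sketch below copies the MSE and R² columns of Table 6 (IETT tunnel line) and sorts the models by each metric. The variable names are illustrative and not part of the original study.

```python
import math

# (MSE, R2) pairs copied from Table 6 (IETT tunnel line).
table6 = {
    "kNN": (1909.902, 0.879),
    "LR": (2619.355, 0.835),
    "RF": (2070.691, 0.869),
    "SVM": (2336.181, 0.852),
    "XGBoost": (2832.245, 0.836),
    "MLP": (1904.229, 0.880),
}

# Lower MSE is better; higher R2 is better.
by_mse = sorted(table6, key=lambda model: table6[model][0])
by_r2 = sorted(table6, key=lambda model: table6[model][1], reverse=True)

# RMSE is the square root of MSE, so it can be re-derived from the table.
derived_rmse = {model: math.sqrt(mse) for model, (mse, _) in table6.items()}

print("Ranked by MSE:", by_mse)
print("Ranked by R2: ", by_r2)
```

The two orderings agree except for the last two places: LR has the lower MSE, while XGBoost has the marginally higher R², so the relative position of these two models depends on which metric is used.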
The prediction of the number of passengers is an important factor for urbanization and city management. Transportation planning is also important for avoiding disruptions in transportation and reducing the traffic load. The developed model can be applied to real-world transportation planning problems by providing effective passenger prediction. In future studies, longer-term predictions can be made using passenger data covering a longer time period. In addition, the results can be evaluated by applying different models such as deep learning.

In this study, traditional machine learning methods and MLP, a neural network-based model, were compared in practice. The aim was to benefit from the prominent features of neural networks in the time series prediction problem. The ability of a neural network to process data in detail stems from its ability to reveal hidden patterns between input and output data. An important advantage of neural networks is their ability to learn and generalize information. MLP is tolerant of missing values and can model complex relationships such as nonlinear trends. It can also support multiple inputs.

One of the important limitations of this study is that it considers only the prediction of transfer passenger volume. For this reason, external factors such as transfer time, rush hours and holidays could be examined in future passenger prediction models. Secondly, ML algorithms such as kNN, LR, RF, SVM, XGBoost and MLP were employed for short-term prediction. In further studies, a state-of-the-art deep neural network algorithm could be developed to improve the prediction of the number of transferring passengers.

Author Contributions: Conceptualization, Software, Methodology, Validation, Writing, Visualization, Editing, A.U.; Review, Writing, Original draft preparation, Resources, Editing, S.K.K.
All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Acknowledgement: The authors would like to express their gratitude to the editors and anonymous referees for their informative and helpful remarks and suggestions, which improved this paper and provided important guidance for our research.

Conflicts of Interest: The authors declare no conflicts of interest.

References

Abeyrathna, K. D., Rasca, S., Markvica, K., & Granmo, O.C. (2021). Public Transport Passenger Count Forecasting in Pandemic Scenarios Using Regression Tsetlin Machine. Case Study of Agder, Norway. In Smart Transportation Systems 2021 (pp. 27–37). Springer.
Boukerche, A., & Wang, J. (2020). Machine Learning-based traffic prediction models for Intelligent Transportation Systems. Computer Networks, 181, 107530.
Bozanic, D., Tešić, D., Marinković, D., & Milić, A. (2021). Modeling of neuro-fuzzy system as a support in decision-making processes. Reports in Mechanical Engineering, 2(1), 222–234.
Ge, M., Junfeng, Z., Jinfei, W., Huiting, H., Xinghua, S., & Hongye, W. (2021). ARIMA-FSVR Hybrid Method for High-Speed Railway Passenger Traffic Forecasting. Mathematical Problems in Engineering, 2021.
Gummadi, R., & Edara, S. R. (2018). Analysis of Passenger Flow Prediction of Transit Buses Along a Route Based on Time Series. In S. C. Satapathy, J. M. R. S. Tavares, V. Bhateja, & J. R. Mohanty (Eds.), Information and Decision Sciences (pp. 31–37). Springer. https://doi.org/10.1007/978-981-10-7563-6_4
Guo, X., Grushka-Cockayne, Y., & De Reyck, B. (2021).
Forecasting Airport Transfer Passenger Flow Using Real-Time Data and Machine Learning. Manufacturing & Service Operations Management. https://doi.org/10.1287/msom.2021.0975
Hayadi, B. H., Kim, J.-M., Hulliyah, K., & Sukmana, H. T. (2021). Predicting Airline Passenger Satisfaction with Classification Algorithms. International Journal of Informatics and Information Systems, 4(1), 82–94.
In Public Transportation in Istanbul. (2021). https://iett.istanbul/en/main/pages/public-transportation-in-istanbul/316, Accessed 13 January 2022.
Jackson, M. D., Leung, C. K., Mbacke, M. D. B., & Cuzzocrea, A. (2021). A Bayesian framework for supporting predictive analytics over big transportation data. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), 332–337.
Kamandanipour, K., Yakhchali, S. H., & Tavakkoli-Moghaddam, R. (2022). Learning-based dynamic ticket pricing for passenger railway service providers. Engineering Optimization, 10(1), 1–15.
Li, W., Sui, L., Zhou, M., & Dong, H. (2021). Short-term passenger flow forecast for urban rail transit based on multi-source data. EURASIP Journal on Wireless Communications and Networking, 2021(1), 1–13.
Li, X., Zhang, Y., Du, M., & Yang, J. (2020). The forecasting of passenger demand under hybrid ridesharing service modes: A combined model based on WT-FCBF-LSTM. Sustainable Cities and Society, 62, 102419.
Liu, W., Tan, Q., & Wu, W. (2020a). Forecast and early warning of regional bus passenger flow based on machine learning. Mathematical Problems in Engineering, 2020, 6625435. https://doi.org/10.1155/2020/6625435
Liu, W., Tan, Q., & Wu, W. (2020b). Forecast and Early Warning of Regional Bus Passenger Flow Based on Machine Learning. Mathematical Problems in Engineering, 2020, e6625435. https://doi.org/10.1155/2020/6625435
Messinis, S., & Vosniakos, G. C. (2020). An agent-based flexible manufacturing system controller with Petri-net enabled algebraic deadlock avoidance. Reports in Mechanical Engineering, 1(1), 77–92.
Milenković, M., Švadlenka, L., Melichar, V., Bojović, N., & Avramović, Z. (2018). SARIMA Modelling Approach for Railway Passenger Flow Forecasting. Transport, 33(5), 1113–1120.
Müller-Hannemann, M., Rückert, R., Schiewe, A., & Schöbel, A. (2022). Estimating the robustness of public transport schedules using machine learning. Transportation Research Part C: Emerging Technologies, 137, 103566. https://doi.org/10.1016/j.trc.2022.103566
Ni, M., He, Q., & Gao, J. (2017). Forecasting the Subway Passenger Flow Under Event Occurrences With Social Media. IEEE Transactions on Intelligent Transportation Systems, 18(6), 1623–1632. https://doi.org/10.1109/TITS.2016.2611644
Pamucar, D., Deveci, M., Canıtez, F., & Bozanic, D. (2020). A fuzzy Full Consistency Method-Dombi-Bonferroni model for prioritizing transportation demand management measures. Applied Soft Computing, 87, 105952. https://doi.org/10.1016/j.asoc.2019.105952
Rajendran, S., Srinivas, S., & Grimshaw, T. (2021). Predicting demand for air taxi urban aviation services using machine learning algorithms. Journal of Air Transport Management, 92, 102043. https://doi.org/10.1016/j.jairtraman.2021.102043
Reitmann, S., & Schultz, M. (2022). An Adaptive Framework for Optimization and Prediction of Air Traffic Management (Sub-) Systems with Machine Learning. Aerospace, 9(2), 77, 1–15.
Rodríguez-Sanz, Á., de Marcos, A. F., Pérez-Castán, J. A., Comendador, F. G., Valdés, R. A., & Loreiro, Á. P. (2021). Queue behavioural patterns for passengers at airport terminals: A machine learning approach. Journal of Air Transport Management, 90, 101940. https://doi.org/10.1016/j.jairtraman.2020.101940
Roos, J., Gavin, G., & Bonnevay, S. (2017).
A dynamic Bayesian network approach to forecast short-term urban rail passenger flows with incomplete data. Transportation Research Procedia, 26, 53–61.
Statista Demographies. (2020). https://www.statista.com/statistics/1101883/largest-european-cities/, Accessed 17 February 2022.
Sun, Y., Leng, B., & Guan, W. (2015). A novel wavelet-SVM short-time passenger flow prediction in Beijing subway system. Neurocomputing, 166, 109–121.
Tom Mitchell. (2006). The Discipline of Machine Learning. Pittsburgh, PA. http://ra.adm.cs.cmu.edu/anon/usr0/ftp/anon/ml/CMU-ML-06-108.pdf
Toqué, F., Khouadjia, M., Come, E., Trepanier, M., & Oukhellou, L. (2017). Short & long term forecasting of multimodal transport passenger flows with machine learning methods. 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), 560–566. https://doi.org/10.1109/ITSC.2017.8317939
Traffic congestion ranking | TomTom Traffic Index. (2021). https://www.tomtom.com/en_gb/traffic-index/ranking/
TUIK. (2021). https://data.tuik.gov.tr/Bulten/Index?p=Adrese-Dayal%C4%B1-N%C3%BCfus-Kay%C4%B1t-Sistemi-Sonu%C3%A7lar%C4%B1-2020-37210&dil=1, Accessed 19 February 2022.
Wang, B., Wu, P., Chen, Q., & Ni, S. (2021). Prediction and Analysis of Train Passenger Load Factor of High-Speed Railway Based on LightGBM Algorithm. Journal of Advanced Transportation, 2021, 9963394. https://doi.org/10.1155/2021/9963394
Wood, J., Yu, Z., & Gayah, V. V. (2022). Development and evaluation of frameworks for real-time bus passenger occupancy prediction. International Journal of Transportation Science and Technology. https://doi.org/10.1016/j.ijtst.2022.03.005
Xie, G., Wang, S., & Lai, K. K. (2014).
Short-term forecasting of air passenger by using hybrid seasonal decomposition and least squares support vector regression approaches. Journal of Air Transport Management, 37, 20–26.
Yang, X., Xue, Q., Yang, X., Yin, H., Qu, Y., Li, X., & Wu, J. (2021). A novel prediction model for the inbound passenger flow of urban rail transit. Information Sciences, 566, 347–363.
Ye, Y., Chen, L., & Xue, F. (2019). Passenger Flow Prediction in Bus Transportation System using ARIMA Models with Big Data. 2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 436–443. https://doi.org/10.1109/CyberC.2019.00081
Zhang, J., Shen, D., Tu, L., Zhang, F., Xu, C., Wang, Y., Tian, C., Li, X., Huang, B., & Li, Z. (2017). A Real-Time Passenger Flow Estimation and Prediction Method for Urban Bus Transit Systems. IEEE Transactions on Intelligent Transportation Systems, 18(11), 3168–3178.
Zheng, Z., Ling, X., Wang, P., Xiao, J., & Zhang, F. (2021). Hybrid model for predicting anomalous large passenger flow in urban metros. IET Intelligent Transport Systems, 14(14), 1987–1996.

© 2022 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).