On the Performance of Machine Learning Based Flight Delay Prediction – Investigating the Impact of Short-Term Features

  • Delia Schösser Technical University Dresden "Friedrich List", Faculty of Transport and Traffic Sciences, Institute of Transport and Economics
  • Jörn Schönberger Technical University Dresden "Friedrich List", Faculty of Transport and Traffic Sciences, Institute of Transport and Economics
Keywords: flight delay prediction, machine learning, aviation, feature importance, classification, SHAP


People and companies today are connected around the world, which has increased the importance of the aviation industry. Flight delays are a major challenge in aviation, and machine learning algorithms can be used to forecast them. This paper investigates the prediction of the occurrence of flight arrival delays with three prominent machine learning algorithms on a data set of domestic flights in the USA. The task is treated as a classification problem. The focus is on how short-term features influence the quality of the results. To this end, three scenarios are created, each characterised by a different input feature set. When short-term information is omitted so that the prediction can be made at an early point in time, an accuracy of 69.5% with a recall of 68.2% is achieved. Including the delay that the aircraft had on its previous flight increases the prediction quality slightly. This second scenario is therefore a compromise between the early prediction timing of the first model and the good prediction quality of the third model, in which the departure delay of the aircraft is added as an input feature. In the third scenario, an accuracy of 89.9% with a recall of 83.4% is obtained. Because short-term features significantly improve the prediction quality, the desired timing of the prediction determines which features to use as inputs.
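As a minimal sketch of the setup summarised above, the three scenarios can be thought of as nested feature sets, each evaluated with accuracy and recall. All feature names below are hypothetical placeholders rather than the columns used in the paper, and treating the third scenario as also containing the previous-flight delay is an assumption. The labels in the usage example are made up for illustration and are not results from the study.

```python
from typing import Sequence

# Hypothetical feature names; the paper's actual input columns may differ.
BASE_FEATURES = ["month", "day_of_week", "scheduled_departure",
                 "origin", "destination", "airline"]

# Assumed nesting: each scenario adds one short-term feature to the previous one.
SCENARIOS = {
    "early": BASE_FEATURES,
    "previous_flight": BASE_FEATURES + ["previous_flight_delay"],
    "departure": BASE_FEATURES + ["previous_flight_delay", "departure_delay"],
}


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of flights whose delayed/on-time label is predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actually delayed flights (label 1) that are predicted as delayed."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    false_neg = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    total_pos = true_pos + false_neg
    return true_pos / total_pos if total_pos else 0.0


# Illustrative labels only (1 = delayed arrival, 0 = on time).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

print(accuracy(y_true, y_pred))  # 0.8
print(recall(y_true, y_pred))    # 0.75
```

Recall is reported alongside accuracy because it measures how many of the flights that really were delayed the classifier catches, which accuracy alone does not show.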


How to Cite
Schösser D, Schönberger J. On the Performance of Machine Learning Based Flight Delay Prediction – Investigating the Impact of Short-Term Features. Promet [Internet]. 2022 Dec. 1 [cited 2023 Jan. 29];34(6):825-38. Available from: https://traffic.fpz.hr/index.php/PROMTT/article/view/4132