DATA-DRIVEN OPERATIONAL TOOLS FOR FREIGHT RAIL SYSTEMS
Motivated by increasing traffic and network capacity constraints on freight railroads in the United States, and by the growing availability of operational data, this dissertation develops new methods that leverage large volumes of railroad data for operational analysis and prediction. The main contributions are 1) data reconciliation, an automatic method to clean and impute historical railroad dispatch data while guaranteeing feasibility of the overall dataset; 2) a dispatch analysis methodology that answers questions about which decisions impacted future replanning ability, which hypothetical changes to those decisions could have improved replanning, and how specific trains affect other traffic and the dispatch plan; and 3) a machine learning framework for the problem of estimating the time of arrival (ETA) of individual freight trains on networks. I demonstrate that the data reconciliation method effectively uses the constraint set of the optimal dispatch problem to guarantee feasibility of individual train trajectories and of interactions between trains. Compared to baseline imputation methods, it improves recovery of timing points and of the locations of meet and overtake events on a synthetically decimated dataset. The dispatch analysis methodology isolates specific areas of dispatching decisions that impose downstream effects on the ability to run the optimal plan, and it finds small changes to these decisions that could have reduced future impact. It also isolates the impact of individual trains on the dispatch plan, showing that small deviations from a train’s optimal plan can have a large secondary effect on other trains and on the optimal plan as a whole. The machine learning framework for train ETA prediction considers a multitude of data features describing train characteristics, network properties, and other train traffic. Multiple machine learning algorithms are applied to the problem, and the framework is shown to produce an average estimation improvement of over 40% across a subdivision of the rail network. Together, the contributions of this dissertation comprise a powerful set of data-driven tools for freight railroads to analyze and improve specific operational aspects.