On any given day, 500,000 passengers and pedestrians, 150,000 privately owned vehicles, and approximately $7.6 billion worth of imported goods cross U.S. borders. Delays at border crossing points are a recurring problem. A limited number of agents, officers, and government professionals conduct operations every day across more than 300 ports of entry, any of which can experience unexpected surges or declines in traffic volume. Wait times to enter the U.S. from Mexico can exceed 10 hours, and these delays cost upwards of $7 billion in economic activity annually.
DataRobot’s AI Cloud Platform can enable effective and secure border transportation by predicting activity at crossing points to support better decisions about staffing levels. This use case can reduce wait times to spur economic trade, and it can help ensure that enough personnel are on hand to screen for illegal goods and criminal activity. For instance, every day Customs and Border Protection (CBP) arrests an average of 25 wanted criminals at ports of entry and seizes over 4,700 pounds of drugs. Having more agents in the right place for more effective inspections can increase those seizures and help keep America safer. AI-enabled staffing can also improve efficiency by predicting periods when activity will be low, allowing CBP to reduce staffing to minimal levels without increasing risk.
Department of Transportation Data
The U.S. Department of Transportation (USDOT) Bureau of Transportation Statistics (BTS) provides publicly available monthly summary statistics for both the U.S.-Canada and U.S.-Mexico borders at the port-of-entry level. The database contains entry data from Mexico to the U.S. for 26 years, dating back to 1996. It includes pedestrian, bus, personal vehicle, rail container, train, and truck data. For this example, DataRobot is only predicting truck crossings.
An example of the truck data is shown to the left. This image displays the total truck crossings per port of entry in January 2021. In this example, DataRobot used all 26 years of data to predict unexpected increases or decreases in truck crossings at a specific port of entry for the next month.
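To make the shape of the data concrete, the sketch below filters monthly records down to truck crossings on one border and totals them per port and month. The field names (`port`, `border`, `month`, `measure`, `value`) and all values are assumptions made up for illustration; they are not the actual BTS schema or real crossing counts.

```python
from collections import defaultdict

# Made-up sample rows. Field names and values are illustrative assumptions,
# not the real BTS schema or real crossing counts.
rows = [
    {"port": "Port A", "border": "US-Mexico", "month": "2021-01", "measure": "Trucks", "value": 1200},
    {"port": "Port A", "border": "US-Mexico", "month": "2021-01", "measure": "Buses", "value": 40},
    {"port": "Port B", "border": "US-Mexico", "month": "2021-01", "measure": "Trucks", "value": 300},
    {"port": "Port A", "border": "US-Mexico", "month": "2021-02", "measure": "Trucks", "value": 1100},
    {"port": "Port C", "border": "US-Canada", "month": "2021-01", "measure": "Trucks", "value": 500},
]


def monthly_truck_totals(records, border="US-Mexico"):
    """Sum truck crossings per (port, month) for a single border."""
    totals = defaultdict(int)
    for r in records:
        if r["measure"] == "Trucks" and r["border"] == border:
            totals[(r["port"], r["month"])] += r["value"]
    return dict(totals)


totals = monthly_truck_totals(rows)
```

Each `(port, month)` key then corresponds to one observation in one port's monthly series.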
DataRobot Time Series Modeling
DataRobot’s Automated Time Series Modeling rapidly builds forecasting models to scale across an organization’s needs. Time series modeling is different from other types of machine learning and requires specialized data handling, preprocessing, and modeling capabilities. Using DataRobot’s built-in automation and no-code user interface, users can easily access the full spectrum of time-based machine learning techniques. DataRobot automatically identifies the ports of entry as different series in the dataset and treats them independently. DataRobot also automatically handles complicated time series requirements like date and time partitioning while generating explainable predictions and visualizations, which builds trust with users.
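Date and time partitioning means that validation data always comes after the data used for training, rather than being sampled at random. The following is a minimal sketch of that idea in plain Python, not DataRobot's implementation; the function name `backtest_splits` and its parameters are hypothetical.

```python
def backtest_splits(months, n_backtests=3, validation_len=1):
    """Return (train_months, validation_months) pairs, most recent fold first.

    Each fold trains only on months strictly before its validation window,
    so no future information leaks into training.
    """
    months = sorted(months)  # "YYYY-MM" strings sort chronologically
    splits = []
    for i in range(n_backtests):
        val_end = len(months) - i * validation_len
        val_start = val_end - validation_len
        if val_start <= 0:
            break  # not enough history left to train on
        splits.append((months[:val_start], months[val_start:val_end]))
    return splits


months = [f"2021-{m:02d}" for m in range(1, 7)]  # 2021-01 through 2021-06
splits = backtest_splits(months, n_backtests=2)
```

In a multi-series setting, the same date boundaries would be applied to every port of entry so that all series are evaluated over the same periods.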
Predicting Border Surges
In this example, the DataRobot team used truck data from the USDOT dataset to forecast the next month’s total truck crossings at each port of entry using the DataRobot AI Cloud Platform. With this information, leaders could modify staffing levels, alter lane openings and closures, and plan major repairs around surges or shortfalls in expected volume, thereby decreasing wait times and increasing trade throughput.
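One way a forecast could feed a staffing decision is to compare it with recent volume at the same port and flag large deviations. The sketch below assumes a simple rule: a forecast more than 20% above or below the recent average counts as a surge or shortfall. The threshold, the function name `classify_forecast`, and the sample numbers are assumptions for illustration, not part of the DataRobot example.

```python
from statistics import mean


def classify_forecast(recent_counts, forecast, threshold=0.20):
    """Label a forecast as 'surge', 'shortfall', or 'normal'.

    The label is based on the relative change from the mean of recent
    monthly counts. The 20% default threshold is an illustrative assumption.
    """
    baseline = mean(recent_counts)
    if baseline == 0:
        raise ValueError("baseline volume is zero; relative change is undefined")
    change = (forecast - baseline) / baseline
    if change > threshold:
        return "surge"
    if change < -threshold:
        return "shortfall"
    return "normal"


# Made-up monthly truck counts for a single port.
recent = [1000, 1000, 1000]
labels = [classify_forecast(recent, f) for f in (1300, 700, 1050)]
```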
An indicator variable was created in the dataset to account for COVID-19, an abrupt shift in underlying conditions known in data science as a “regime change.” For more accurate predictions, truck traffic could be recorded at a finer granularity, such as hourly or daily. DataRobot model performance could also be improved by training on organization-specific data such as border-specific events and historic staffing levels at ports of entry.
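An indicator variable of this kind is simply a 0/1 column that marks whether each observation falls in the affected period. The sketch below is a minimal illustration; the source does not state which dates were used, so the `2020-03` start month and the optional `end` parameter are assumptions.

```python
def covid_indicator(month, start="2020-03", end=None):
    """Return 1 if `month` ("YYYY-MM") falls in the assumed COVID-19 period, else 0.

    The start month is an assumption; the original example does not say
    which cutoff dates were used.
    """
    if month < start:
        return 0
    if end is not None and month > end:
        return 0
    return 1


months = ["2020-01", "2020-02", "2020-03", "2020-04"]
flags = [covid_indicator(m) for m in months]
```

Adding the flag as a feature lets a model learn a different baseline for the affected months instead of treating them as ordinary noise.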
A six-month feature derivation window generated the best results for forecasting the next month’s truck volumes. DataRobot enables quick and easy iteration over various backtest configurations to rapidly find the best-performing model parameters. DataRobot also took the nine original input features and generated 135 new features during automated Feature Discovery to improve model performance. Using these new features, DataRobot automatically built 63 models for comparison.
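A feature derivation window limits how far back the model looks when building inputs for each forecast. The sketch below derives a few lag and rolling statistics from the six months before each target month. The specific features and names are assumptions for illustration and are not the 135 features DataRobot generated.

```python
from statistics import mean


def derive_features(history, window=6):
    """Build one training row per target month for a single port.

    `history` is a chronologically ordered list of monthly counts.
    Each row uses only the `window` months before the target month.
    """
    rows = []
    for t in range(window, len(history)):
        past = history[t - window:t]
        rows.append({
            "lag_1": past[-1],        # most recent month in the window
            "lag_oldest": past[0],    # oldest month in the window
            "window_mean": mean(past),
            "window_min": min(past),
            "window_max": max(past),
            "target": history[t],     # value to forecast
        })
    return rows


# Made-up monthly truck counts for one port.
history = [100, 110, 120, 130, 140, 150, 160, 170]
feature_rows = derive_features(history)
```

With eight months of history and a six-month window, only the last two months have a full window behind them, so two training rows are produced.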
DataRobot quickly produced a multi-series time series forecasting model capable of predicting surges of truck traffic at each port of entry across the southwest border. Model accuracy dropped sharply at the onset of COVID-19, then recovered rapidly. DataRobot Time Series Modeling can be applied to numerous use cases across homeland security organizations, including staffing, demand forecasting, supply chain management, predictive maintenance, anomaly detection, and more. Contact a member of the DataRobot team today to see how your organization can become AI-driven.