
AI is the last thing you should do, but you should do it.

Charles Donly


Case study: Real-time predictive analytics for the logistics and trucking industry.


Summary: A logistics/trucking client has lots of data and experience and wants to use it to improve their daily predictions, which are stressful, urgent, and material to the business.


We walk through the engagement and lay out how the goal of implementing Machine Learning (ML) and Artificial Intelligence (AI) is the last step in the process of establishing a “rate of learning” and working our way up the ladder of “continuous improvement” (mapping, digitizing, analyzing, testing, creating, etc.).


When AI is the last step, you end up with a more robust final process because you have investigated and improved the process before automating it. The AI model benefits from the pruning of features as your continuous improvement advances, and the implementation suffers less from inconsistent input data or “scope creep”. An increased understanding of the “hidden factory” lets you target better and less risky places to add the “right amount” of AI (a more complex and potentially divergent form of computation) after you have already implemented descriptive statistics (think dashboards) and predictive analytics (a form of ML). Setting the BHAG (Big Hairy Audacious Goal, from Built to Last) of implementing AI is the motivation that gives the team all the “rate-of-learning” gains along the way.


List of Big Ideas in this case study:


1. Once you can measure something and give that information to the people who actually do the job, they will improve the process with their knowledge and creativity. Their solutions will pleasantly surprise you.


2. Being able to run experiments in digital space is many times quicker and less expensive than in real life (but you need some form of digital twin).


3. Predictive analytics does not replace someone’s job; it improves it. How? By converting existing data into most-likely outcomes, which, when done conservatively and with the minimum number of relevant factors, are reliable. This is very similar to the golden rule in product design: the best part is the part you can eliminate from the design.


4. There is a treasure chest of continuous improvement just from: 1. confirming you are collecting the relevant data via a process map audit, 2. monitoring a dashboard of descriptive statistics to confirm anecdotal evidence, 3. process re-engineering, and 4. using predictions along with an indicator of confidence or risk. None of those steps requires complicated optimization techniques, which are great to apply once you understand how a system performs and reacts.


5. Never let best become the enemy of better.


Problem statement: imagine being a dispatcher in one of the busiest shipping hubs, and you need to predict how many overnight trucks and drivers you need. Your deadline to submit this plan is 5pm for that night. When you start the day (let’s say 10am), only 20% of the freight pick-ups are currently scheduled, and you only start estimating the nightly demand on the shipping lanes after a pick-up is complete and loaded into the system. If you are wrong and over-estimate the number of drivers and trucks, drivers get sent home that night without a job. If you under-estimate, freight gets delayed. This leaves a dispatcher with about a three-hour window to gather all their data and turn in their prediction. Even when it is accurate, it is stressful every single day. That is the problem we faced with our client.


Goal: To take as much calculation off the “critical path” as possible, and provide a dynamic, daily prediction for the dispatcher along with a confidence level they can use to gauge the amount of risk in the prediction.


Structure: Our team of three comprised Asher Nuckolls, who brought much of the predictive analytics to life; Joe Fuqua, who managed the business relationship and brought decades of logistics, government, and financial consulting, technical leadership, and Machine Learning research; and myself, the engagement manager and architect. Brian Rogers, my business partner, created this opportunity for us. The project ran 3 months (12 weeks) from first meeting to final phase review. In this case study, we focus on the predictive analytics outcome and skip over the continuous improvement recommendations that came out of our weekly rate of learning.


Definitions: Predictive analytics is a method of extrapolation (at least in the time domain) that includes as many blocking variables (orthogonal) or KPIVs (key process input variables) in a prediction of future behavior/demand as can be understood, so that the prediction is robust to known and meaningful fluctuations (pooled vs. total sigma); fishbone diagrams and cause-and-effect matrices are good tools to ferret these out early in a project. A key building block toward predictive analytics is descriptive statistics, which creates a short-hand snapshot of large amounts of data.


Machine Learning is the ability to build an analytics engine that ingests data and uses computational power to make more accurate future predictions. This offloads the analytical updating steps of predictive analytics to a computer and frees data scientists, statisticians, and engineers to work on the architecture.

Artificial Intelligence (AI) is the ability, within machine learning, to implement logic, and then reasoning, that is on par with some level of human decision making. Many new Large Language Model (LLM) agents are coming out now that perform well when the question (or prompt) given to them is broken down into smaller, more manageable calculation steps. It can be expected that future research will focus on Machine Learning agents that can more efficiently break down unstructured problems, apply them against a problem-specific set of data (encoder/decoder architecture), and process that through an RLHF-trained (reinforcement learning from human feedback) LLM, which is basically a connected representation of digital human knowledge.


The engagement: The engagement was broken into five phases, not all run in series. For example, we started building a simulator while we were still doing analysis and asking questions based on the patterns we were seeing in the historic data. The engagement phases were:


1. Document (create a visual process map and audit it against the datasets we received),


2. Analyze the data we had, which led to more discussions and potential ideas (this is the core of the engagement, where the wheel of “the rate of learning” starts),


3. Recreate what we understood in the digital space,


4. Define the features and rules we wanted for our analytics engine and


5. Build and test the predictive analytics engine as we add or remove features (continuation of the “rate of learning” wheel).


Process Mapping: Leading up to building the predictive analytics engine, we had to create a process map of daily activity around a distribution center so that we could align the 10+ datasets we were given. Below is a high-level process map for a distribution center:



This process map allowed us to break the project into two parts. The “grey” project worked on overnight arrivals: 1. reducing delays between cities and 2. improving estimated arrival times.


An example of a quick win was accounting for time-sensitive routes where driving into traffic caused predictable and meaningful trip delays. We also found lost opportunity during trailer hand-offs that ran ahead of schedule.



Now we can focus on the very dynamic day at a distribution center. Asher took on the challenge of creating a simulator and training the neural network (pattern recognition of distributions) to mimic the daily drop-off and pick-up of goods. Here is where we get to apply Machine Learning. Asher created the simulator while I investigated the descriptive statistics and identified patterns that would lead to future “what-if” experiments in the simulator. The simulator served as our Digital Twin and allowed us to create distributions and test hypotheses that turned into features of our predictive analytics model. This type of problem is referred to as a VRP (vehicle routing problem) with back-fill.
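
To make the Digital Twin idea concrete, here is a minimal sketch of how a simulator like this can bootstrap a day of pickups by resampling historical distributions. The data values, field names, and pickup counts are illustrative assumptions, not the client’s actual figures:

```python
import random

# Hypothetical history: shipment weights (lbs) and first-destination lanes
# observed at this distribution center. The real inputs came from 10+ datasets.
historical_weights = [1200, 800, 2500, 640, 1900, 3100, 450]
historical_lanes = ["NORTH", "SOUTH", "EAST", "WEST"]

def simulate_day(n_pickups, seed=None):
    """Generate one simulated day of pickups by resampling history."""
    rng = random.Random(seed)
    return [{"weight": rng.choice(historical_weights),  # bootstrap a weight
             "lane": rng.choice(historical_lanes)}      # bootstrap a lane
            for _ in range(n_pickups)]

# Run many simulated days to build distributions for "what-if" experiments.
simulated_days = [simulate_day(n_pickups=120, seed=s) for s in range(1000)]
```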



One of the first decisions was the scope of the prediction. The engagement focused on the distribution center with the largest volume and the densest arrival of goods in the morning, which suggested that this distribution center had the most urgent need for predictive analytics and the highest potential to include multiple types of variation in the data.



Get an understanding of the current process: In addition to building the simulator, we also had a demo of the current tools and steps a dispatcher uses in their daily work. The demo revealed that their current predictions were based only on the actual weight picked up that day. This is an example of what Operations Research calls a “hidden factory”. One reason for this is that their day was viewed as dynamic, with just 20% (as mentioned before) of pick-ups scheduled at the start of a day, and most pick-ups lacking an estimate of what is being shipped or where. So, how can you predict the weight and the shipment’s destination if you only know 20% of the orders at the start of the day? That is where the work begins…



Another concern and “hidden factory” is that “last-minute” calls may come late in the day for a shipment to be picked up. Understanding who is likely to make a last-minute call, and how full a trailer is projected to be at that moment, is very important information for estimating the final number of drivers and lanes. Sometimes that last-minute pickup is material, and sometimes it does not affect whether another trailer or a reroute is needed.


Creating and testing factors: With our simulator and analysis, we looked for patterns and made some assumptions that we worked towards verifying.


One assumption was that there were two categories of equally valuable and awesome customers: predictable customers, and customers who either were new or had a variable shipment history. We had not yet defined what predictable looked like, except that a predictable customer would have a unique and repeatable value for the prediction algorithm, like shipment destination and weight. A new or unpredictable customer would have values randomly selected from a larger, generic distribution of weights and locations based on previous history.
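
As a rough illustration of how that split might be encoded, here is a sketch that labels a customer “predictable” when one first-destination lane dominates a sufficiently long history. The thresholds and field names are assumptions for illustration, not the engagement’s actual rules:

```python
from collections import Counter

def is_predictable(history, min_pickups=10, min_lane_share=0.8):
    """Label a customer 'predictable' when they have enough history and one
    first-destination lane dominates it (thresholds are illustrative)."""
    if len(history) < min_pickups:
        return False
    lane_counts = Counter(pickup["lane"] for pickup in history)
    _, top_count = lane_counts.most_common(1)[0]
    return top_count / len(history) >= min_lane_share
```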



When it came time to measure a shipment, we had to choose the right metrics. For example, shipments can have multiple freight bills and can also be measured by an estimate of volume (cubes). Or we can look at the aggregate metric of a “pickup”, which may include multiple freight bills or final destinations. So, which one is a better historical predictor of a customer’s pick-up?


Refining the analysis: During this process we also refined our analysis and definitions and vetted them with our client. This sounds boring but was a key tool in creating the final product.


For example, we try to predict where the shipment goes. The analysis can use the shipping address, or it can use the trailer the shipment goes on. Those variables give two very different predictions and correlations. Asking “what trailer does the shipment go on?” reduces the analysis to the shipment’s first destination, not its final destination.


When we refined that analysis, we found our predictable customers used the same first destination every time, and they made up more than half of the customers. So, someone who shipped North kept shipping North. We could now predict where a shipment was going, even if the final cities were changing. We had recognized one of our first key predictable patterns.



After identifying the distribution center, and defining predictable customers and how to identify them, we looked at predictive accuracy by time of day. Here we learned that accuracy rises as the day progresses. We used that end-of-the-day estimate to make our trailer and destination prediction about 4 hours earlier than the current manual prediction.



Not all hypotheses and experience are material. One hypothesis we heard about, tested, and rejected was that pickups on certain days of the week happened later in the day and thus impacted the prediction model. The idea was that companies rushed to complete orders toward the end of the week and then had more “last-minute” shipments depending on the day of the week. It is plausible and certainly happens at some frequency, but the key question when building predictive analytics is whether it is meaningful to the prediction. We found that over many months of data, day-of-the-week did not change how early or late in the day shipments were picked up (as measured by weight). To confirm this, we converted each day’s data to a percentage of the maximum weight picked up that day, which allowed us to compare all days, and the rate at which shipments were collected during the day, on an equal footing. This is one of the normalization techniques used to verify or reject a hypothesis.
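
For the curious, here is a minimal sketch of that normalization, assuming a simple table of pickups with date, time, and weight columns (the schema is hypothetical):

```python
import pandas as pd

def normalize_daily_curve(pickups: pd.DataFrame) -> pd.DataFrame:
    """Express each day's cumulative picked-up weight as a percentage of
    that day's final total, so light and heavy days compare on one scale."""
    df = pickups.sort_values(["date", "time"]).copy()
    # Running total of weight collected within each day...
    df["cum_weight"] = df.groupby("date")["weight"].cumsum()
    # ...divided by that day's final (maximum) total, as a 0-100% scale.
    day_total = df.groupby("date")["weight"].transform("sum")
    df["pct_of_day_max"] = 100 * df["cum_weight"] / day_total
    return df
```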



Creating the actual predictive analytics tool:


We have predicted:


1. how many shipments we will have at the end of the day, and


2. where they will go and how much they weigh, by type of customer:


a. using a historic predictable value, or


b. a random historic value if they have no history or lack consistency.


Creating the prediction:


1. We update our predictions every 30 minutes to account for any pickups that happened.


a. If the pickup is from a predictable customer and was scheduled, it is moved from the prediction list to the actual list.


b. Otherwise, assign the weight and overnight lane to the actual list based on a random selection from the overall customers’ historical data (using our Digital Twin).


2. Historically, we have confidence that by early afternoon we have reached the maximum trailer weight on a typical day.
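
In code form, one 30-minute update cycle might look like the sketch below. The field names and list structures are assumptions for illustration; the real engine worked against the client’s datasets and our Digital Twin:

```python
import random

def half_hour_update(predicted, actual, completed, historical_pool, rng=random):
    """One 30-minute update cycle (sketch; field names are illustrative).

    predicted       -- remaining predicted pickups for the day
    actual          -- pickups already completed and loaded
    completed       -- pickups completed in the last 30 minutes
    historical_pool -- (weight, lane) pairs pooled from all past customers
    """
    for pickup in completed:
        scheduled = next((p for p in predicted
                          if p["customer"] == pickup["customer"]), None)
        if scheduled is not None:
            # 1a. Scheduled, predictable customer: prediction becomes actual.
            predicted.remove(scheduled)
            actual.append(pickup)
        else:
            # 1b. Unscheduled pickup: borrow a weight/lane from the overall
            # historical pool when the real values are not yet in the system.
            weight, lane = rng.choice(historical_pool)
            actual.append({**pickup,
                           "weight": pickup.get("weight", weight),
                           "lane": pickup.get("lane", lane)})
```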



Predictive model: The model is a combination of predicted and actual shipments. Early in the day we have mostly predicted pick-ups, and by the end of the day we have mostly actual pickups (with some percentage chance of a late call-in, which we estimate from our Digital Twin).


1. We combine the above findings into a predictive model that outputs the weight and number of trailers for each overnight destination.


2. We add a confidence interval around that estimate based on the amount of predicted weight and its associated historic variation.


3. As more actual weight for a trailer gets picked up by customers, we add it to the trailer and reduce the predicted weight, which improves the confidence band of our prediction.


4. Once the confidence band is within the range of one trailer, then, to that confidence level (we used 96%, about +/- 2 sigma), we know that is the number of trailers we need.
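
As a sketch of steps 2-4 for one overnight lane, the trailer decision could look like the code below. The trailer capacity is a placeholder value, and only the still-predicted weight is assumed to carry uncertainty:

```python
import math

TRAILER_CAPACITY_LBS = 40_000  # placeholder capacity, not the client's value

def trailers_needed(actual_weight, predicted_weight, predicted_sigma, z=2.0):
    """Return (trailer_count, decided) for one overnight lane (sketch).

    As pickups land, weight moves from predicted to actual, predicted_sigma
    shrinks, and the +/- z-sigma band tightens."""
    total = actual_weight + predicted_weight
    low = max(total - z * predicted_sigma, 0.0)
    high = total + z * predicted_sigma
    low_trailers = math.ceil(low / TRAILER_CAPACITY_LBS)
    high_trailers = math.ceil(high / TRAILER_CAPACITY_LBS)
    # Once the band maps to a single trailer count, we can commit to that
    # count at roughly the stated confidence level.
    return high_trailers, low_trailers == high_trailers
```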



Model and validation: Below is the predictive model on a validation day that was not used in training the distributions. As described, each bar is an overnight destination or trailer load. We start the day with medium blue estimates vs. the light blue actuals at the end of the day. Our estimate gets updated per the model above with actual values and updated predictions every 30 minutes. The key is refining the prediction of predictable customers (which gives better accuracy on weight and location) and of overall pick-ups remaining in the day (which prevents underestimating).


Description of the model: Below, light blue is the actual end-of-day value. Integers are the number of trailers/trucks needed at the end of the day. Each bar is an evening destination. Medium blue is the prediction, and dark blue is the actual amount of shipments picked up at that time. The day starts with all medium blue (prediction) and ends with all dark blue (actuals). By early afternoon, we have a complete prediction where the confidence intervals are small enough (they don’t overlap between numbers of trailers) that we can predict the number of trailers we need at the end of the day.



Using the model: Using a model like this doesn’t end with the prediction. The dispatcher reviews the shipments by overnight destination and adjusts shipments to consolidate trailers and optimize for other requirements. This tool gives the dispatcher an earlier view of the end-of-day shipments and destinations, so they can start their work earlier in the day, reduce their stress, and have time to better optimize their plan before they turn it in, at which point it becomes a demand for overnight drivers.



A note about Optimization: Another key take-away is that we did not jump to any optimization techniques or attempt to define routing of trucks.



For example, data and predictive analytics can be extended to an optimization routine with different goals: 1. the ideal lowest cost, 2. the most robust to weather, drivers, or trucks, and 3. the most robust to changes in customer demand or to handling unexpected upside deliveries. First, though, we want to understand and get improvements “on the shop floor”.




Future work: As we discussed at the outset, there is a place for more advanced AI once we have a Digital Twin (digital representation of the real-world process) and enough continuous improvement processes that we know how the system responds to intentional changes. For overnight trailer/lane estimation, here are some ideas for enhancements that include AI:


1. Weighted historical time averaging: using historic data but weighting it so that the previous week counts for more than data from 20 weeks ago (see the sketch after this list).


2. Updating predictable customers on a monthly rolling schedule.


3. Further customized variables, like trailer weight by destination. On average, if we see a repeatable and meaningful pattern, we can include a unique average trailer weight by destination to better refine our estimate of the number of drivers needed.



4. Agents that run multiple scenarios using actual historic distributions and create strategies that optimize for either lowest cost, the fewest number of trucks, or the types of shipments received (number of cubes or no-stacking requirements).


5. Refining estimates with additional projects, like vision recognition of incoming loads, to better understand whether a shipment has additional impact on filling a trailer.


6. Using a trained AI network to predict last-minute shipments toward the end of the day: who is it likely to come from, and what type of shipment, based on the season and, say, day of week?


7. Additional AI recommendations from external sources, like sentiment analysis by industry (based on feeds from X.com, etc.) and weather predictions.


8. Rerouting pick-ups to collect the least predictable shipments first. Then we would reduce our uncertainty about the end of the day faster and complete our prediction with higher accuracy, sooner in the day.
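
For idea 1 above, here is a minimal sketch of weighted historical time averaging, using an exponential decay whose rate is an assumption for illustration:

```python
def time_weighted_average(weekly_values, decay=0.9):
    """Exponentially weighted average. weekly_values[0] is the most recent
    week; each older week counts `decay` times as much as the one after it.
    The decay rate of 0.9 is illustrative."""
    weights = [decay ** age for age in range(len(weekly_values))]
    return sum(v * w for v, w in zip(weekly_values, weights)) / sum(weights)

# Example: the latest week dominates; a week 20 weeks back contributes
# only about 0.9**20 ~= 12% as much as last week.
avg_weight = time_weighted_average([52_000, 48_500, 61_000, 47_250])
```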









