Scientific Paper / Artículo Científico


pISSN: 1390-650X / eISSN: 1390-860X





Fernando Hernández Gobertti1,*, Rafael Sotelo1, Marcelo Forets2


Received: 06-06-2023, Received after review: 14-09-2023, Accepted: 06-10-2023, Published: 01-01-2024




This article explores the design and application of machine learning techniques to enhance traditional approaches for solving NP-hard optimization problems. Specifically, it focuses on the Last-Mile Routing Research Challenge (LMRRC), supported by Amazon and MIT, which sought innovative solutions for cargo routing optimization. While the challenge provided travel times and zone identifiers, the dependency on these factors raises concerns about the algorithms’ generalizability to different contexts and regions with standard delivery services registries. To address these concerns, this study proposes personalized cost matrices that incorporate both distance and time models, along with the relationships between delivery stops. Additionally, it presents an improved approach to sequencing stops by combining exact and approximate algorithms, utilizing a customized regression technique alongside fine-tuned metaheuristics and heuristics refinements. The resulting methodology achieves competitive scores on the LMRRC validation dataset, which comprises routes from the USA. By carefully delineating route characteristics, the study enables the selection of specific technique combinations for each route, considering its geometrical and geographical attributes. Furthermore, the proposed methodologies are successfully applied to real-case scenarios of last-mile deliveries in Montevideo (Uruguay), demonstrating similar average scores and accuracy on new testing routes. This research contributes to the advancement of last-mile delivery optimization by leveraging personalized cost matrices and algorithmic refinements. The findings highlight the potential for improving existing approaches and their adaptability to diverse geographic contexts, paving the way for more efficient and effective delivery services in the future.

Este artículo explora el diseño y aplicación de técnicas de aprendizaje automático para mejorar los enfoques tradicionales y así resolver problemas de optimización NP-hard. En particular, se enfoca en el Last-Mile Routing Research Challenge (LMRRC), apoyado por Amazon y MIT, que buscaba soluciones innovadoras para la optimización de rutas de carga. Si bien el desafío proporcionó tiempos de viaje e identificadores de zona, la dependencia de estos factores plantea preocupaciones sobre la generalización de los algoritmos a diferentes contextos y regiones con registros de servicios de entrega estándar. Para abordar estas interrogantes, este estudio propone matrices de costos personalizadas que incorporan modelos de distancia y tiempo, junto con las relaciones entre las paradas de entrega. Además, presenta enfoques mejorados para la secuenciación de paradas mediante la combinación de algoritmos exactos y aproximados, utilizando técnicas de regresión personalizada junto con metaheurísticas y refinamientos heurísticos ajustados. La metodología resultante logra puntajes competitivos en el conjunto de datos de validación LMRRC, que usa rutas de EE. UU. Al delinear cuidadosamente las características de la ruta, el estudio permite la selección de combinaciones de técnicas específicas para cada ruta, considerando sus atributos geométricos y geográficos. Además, las metodologías propuestas se aplican con éxito a escenarios de casos reales de entregas en Montevideo (Uruguay), demostrando puntajes promedio y precisión similares en nuevas rutas de prueba. Esta investigación contribuye al avance de la optimización de la entrega de última milla al aprovechar matrices de costos personalizadas y refinamientos algorítmicos. Los hallazgos resaltan el potencial para mejorar los enfoques existentes y su adaptabilidad a diversos contextos geográficos, allanando el camino para servicios de entrega más eficientes y efectivos en el futuro.


Keywords: Optimization, Routing, Operations, Logistics,

International Competition

Palabras clave: Optimización, enrutamiento, operaciones, logística, competencia internacional













1,*Facultad de Ingeniería, Universidad de Montevideo, Uruguay.Corresponding author :

2Centro Universitario Regional del Este, Universidad de la República, Uruguay.


Suggested citation: Hernández, F.; Sotelo, R.; Forets, M. “Optimization algorithms for adaptative route sequencingon real-world last-mile deliveries,” Ingenius, Revista de Ciencia y Tecnología, N.◦ 31, pp. 64-80, 2024, doi:


1.      Introduction


The proper definition and sequencing of multiple stops on cargo routes has a direct practical impact on the transports’ efficacy, logistics and fuel consumption [1]. These factors encourage the search of time-efficient and optimal solutions capable of performing acceptably under real and scalable use cases on every day and ever-growing demands of deliveries [2]. This section includes information with regard to the project genesis over an Amazon’s and MIT’s recent contest and continues to define the main problems subdivided as both supervised and unsupervised learning treated on this document. Finally, once these topics are considered, a consequent thematic flow is proposed, specifying basis and original contributions of this work.


1.1.  Amazon’s Last-Mile Routing Research Challenge, supported by MIT


During the course of March to June 2021, the competition known as LMRRC [3] took place, promoting the participants to group and solve a terrestrial cargo routing optimization problem with real routes data from cities of USA with great demand of last-mile deliveries (i.e. transportation from the initial deposit or station to several dropoff sites), seeking innovative original solutions for sequencing geographical nodes. These data corresponded to stops, packages and travel times information for each of the 6125 training and validation routes whose segmentation was already defined. There was a total of 1,457,175 packages and 898,415 stops, where several packages were linked to each stop, which at the same time were associated to a zone identifier specific to the city. Each of these routes also had an actual sequence with which to compare the results according to an ’order-differential’ score that became smaller the best the algorithms performed and later generalized on testing set. The authors’ team, named AlphaCentauri, classified as one of the top-performing team of the competition.


1.2.  Problem Statement


With focus on expanding the LMRRC initiative, the global difficulties range from certain supervised learning aspects through prediction and routing algorithms with objectives of reaching the best possible results with respect to predefined solutions.

The current cargo routing problem requests to approximate a stops sequence solution diagrammed by experienced drivers based on previous package transports and route state knowledge. All packages are weightless but dimensioned, as well as the single transport units per route, which may involve a necessity to resupply at station. Also, some packages are subject to fairly loose

time windows that limit the packages provisioning hours. Then, most stops are linked to geographically distributed zones, with variable dependencies 1:ni for packages and stops as well as stops and zones. All possible travel times combinations between stops are known in addition to their coordinates and the planned service time needed for each package. The problem is framed as a type of Asymmetric Travelling Salesman Problem with no return or a weighted Hamiltonian Path Problem with Time Windows and Capacity constraints.

This article follows a logical disposition in order to handle the full process of data preparation, structures organization, algorithms experimentation and results observation.

Figure 1 shows the global proposed thematic sequence. This article will primarily focus on the personalized cost matrices formulations and the adaptative stops routing through extremely diverse routes while examining main applications, limitations and errors of the methodologies.


1.3.  Literature Review


Large-scale logistics clustering and routing optimization problems have been research topics since mid-20th century, even before the terminology and practices for unsupervised and supervised learning and contemporary applications premiered. Documents such as [4] sought using clustering among other techniques for production scheduling and mobilization optimal estimations, while [5] remains one of the first records of large routing problem solution approximation. Generalities of recent investigations on these topics are mentioned in this section, focusing on contents related to the current research.


1.3.1.      Research on Cargo Routing


The general case of cargo routing involves searching for the optimal sequence of nodes (known as stops) on a set of routes for a fleet of vehicles that must satisfy some customer demands such as the roads’ state, the vehicles’ load capacity, the travel times between each stop, the processing times per package delivered, the time windows when some package needs to be dispatched and the drivers’ working hours. In the following subsections, an overview of usual routing approaches is undertaken and an exposition through examples of recent routing investigation topics is presented.


1.3.2.      Overview of Routing Approaches.


Certain cargo routing optimization are NP-hard problems [6] that involve ordering stops to fulfill constraints for a physical unit carrying dimensioned and weighted packages [7].



Figure 1. Diagram of the document’s entire flow consisting of grouped and ordered inter-dependable blocks. Delineated central block illustrates required validation over provided LMRRC’s datasets.


The initial problem formulation can vary, considering revisiting the starting stop and the use of single or multiple transport units. Figure 2 illustrates typical interrelated problems, including the NP-complete and NP-hard Vehicle Routing Problem (VRP) [8], that seeks minimum-length and minimum-time routes for a fleet. The Quadratic Assignment Problem (QAP) [9], assigns facilities to different locations to minimize distances multiplied by flows. The Travelling Salesman Problem (TSP) [10], finds the shortest sequence visiting each stop

exactly once before returning to the starting station. This

problem has applications in logistics, microchip manufacturing [11], and DNA sequencing [12]. The Hamiltonian Path Problem (HPP) [13], a subproblem of TSP, searches for a global optimal sequence without returning to the station, aiming for a Hamiltonian-connected graph with unique paths between all vertices. Exhaustive search is not feasible due to the number of different Hamiltonian cycles: a complete undirected graph with n stops and (n-1)! in a complete directed graph.



Figure 2. Different standard types of cargo routing problems. Nodes are labelled using stop names.



1.3.3.      Related Routing Applications.


Globally, the routing scenarios tend to seek innovative and particular procedures through time-efficient algorithms that facilitate the enterprise’s effort for providing effective service as well as reduce transportation’s costs. This can be seen on documents such as [14], with recent Quantum Computing approaches to VRP using Quantum and Simulated Annealers. Similarly, possible QAP resolutions are analyzed in [15–17], using QUBO and Ising formulations. In any case, although problem specific in nature, the pursue for finding general use routing optimization protocols for air, sea and land transportation remains an active research area [18, 19].

Regarding TSP, multiple approaches have been developed, including standard logic algorithms such as greedy (e.g. nearest neighbour) and/or with dynamic programming, as well as with constructive heuristics based approximations like multi-fragment [20] and different variable-opt techniques [21]. However, none of these possibilities reaches a global solution in polynomial time, and are deemed not applicable to the current problem, given the extended amount of stops in each route and their interrelationships.

With regards to the specific routing problem, the MIT’s LMRRC Technical Proceedings illustrate the proposals of several competitors that faced the challenge with a wide range of perspectives. Analyzing the finalists’ documents and algorithms, CHH [22] suggests doing local searches by travel time on which to apply precedence, path and neighbor restrictions for sequencing learned areas based on penalties. Then, GMW [23] presents an untrained approach by zones in three levels, together with a linear modification of the travel times cost matrix and a later stage of postprocessing of the final sequence for possible sequence inversion. Similarly, ArsAb [24] brings a greedy procedure along with an already trained genetic algorithm with a TSP subroutine also based on penalizations. Finally, HSFv1 [25] puts forward a combination of exact and heuristic approaches that conciliates multiple possible perspectives and generalize appropriately to the LMRRC’s testing dataset (thus improving the resulting score with regards to validation routes).

1.3.4.      Research on Multimethodologies


Multimethodologies for routing logistics have evolved over time to address the complex challenges associated with optimizing transportation routes together with improving e-commerce [26] and collaborative approaches [27] in order to improve efficiency as well as

reduce delivery costs. Related recent studies [28,29] suggest diverse multistage hierarchical methods to solve the vehicle routing problem for a heterogeneous fleet with various constraints, and its unique feature is the close proximity to real logistics practice. Other proposals [30, 31] help to reduce post-harvest wastage during the collection process by using both internal and hired fleets with heterogeneous capacities and employing a Greedy algorithm-based heuristic and several local search methods to obtain near-optimal solutions.

Apart from interesting simultaneous methodologies for supply chain pickup and delivery for e-commerce [32] when being trained with a great deal of routes [33], there has been an increasing demand of efficient realword mixed approaches due to the recent COVID epidemic [34]. The provided experimental results enrich the research related to vehicle routing problem models and algorithms under major public health emergencies and provide optimized relief distribution solutions for decision-makers of emergency logistics [35]. Similar articles [36] investigate a collaborative truck-drone routing problem for contactless parcel delivery in epidemic areas, which combines the Metropolis acceptance criterion [37] of Simulated Annealing and Tabu Search for urban logistics in smart cities [38].


2.      Materials and methods


2.1.  Cost Matrices Formulation


Defining cost matrices is crucial for sequencing procedures, enabling problem structuring and resolution. Even slight variations in these matrices can lead to significant changes in results for routing optimization. Figure 3 illustrates two approaches for defining cost matrices, each with their own variations for parameter combination and tuning.





Figure 3. Cost prospects analysis considering temporal and distance information.



2.2.  Temporal Information through Travel Times


Time-related data used as cost matrix is useful for abstract distance-independent approaches. The traditional parameter for general routing scenarios is using an asymmetrical squared travel times matrix where each element contains the value in seconds needed for unidirectional movement from one stop to another. For a given set of l stops, a general temporal matrix is defined in Equation (1).




This expression is known as the static travel times matrix, where usual transport characteristics (e.g. vehicle speed, acceleration) are considered.

A relevant alternative consists of the expected travel times matrix (Equation (2)), where an individual perturbation is associated to each element as transit properties for a given time and date (e.g. traffic jams, no signaling) are reported.




Thus, defining P as the set of packages associated to a given route, the current time trace (Equation (3)) counted for a sequence interval i is calculated as the sum of the starting route time with the planned service time per package and the travel times duration up to that point.








Time considerations are important for travel times, including time windows, restrictions, and packagespecific constraints. Simpler approaches use a constant vehicle speed, while more complex options consider distributed speeds based on street configurations.


2.3.  Spatial Information through Distances


Distance-based data wielded as cost matrix is advantageous for geographical time-independent perspectives. Similar to previous temporal case, the general instance depicts a mainly asymmetric square matrix with distance values that unidirectionally links one stops with another. Analogous to time-dependent expression, all possible combinations are examined as in Equation (4).






The mentioned distance element is considered to be indirect (i.e. Route distances, where street directions and silhouettes are respected) or direct (e.g. Euclidean or Manhattan distances, where street-related examinations are not needed). Formulations regarding these distances are:




Then, the current distance trace (Equation (5)) with respect to a sequence interval i is calculated as the sum of all previous distances following the stops ordering.




Contrasting with current time calculation, the current distance expression is computationally easier to manage and dependable of fewer parameters once the spatial cost matrix is determined.


2.4.  Mixed Information through Matrix Combinations


Matrix variations in time and space costs offer a different perspective on the problem, considering seemingly independent attributes. However, it is necessary to examine and tune combinations of standard parameters to achieve simplicity and generality.

The formulation of the mixed information matrix CCM is expressed in Equation (6), considering temporal and spatial matrices. Empirical observations show the similarity and redundancy of StaticTT and ExpectedTT for last-mile trajectories. DirectTT is deemed too general and imprecise, while the RouteD cost matrix varies based on coordinates and local transportation updates.





2.5.  Route Sequencing Algorithms


Route planning is vital for efficient transportation sequencing, determining the order of stops to optimize delivery. The LMRRC proposal considers various factors, such as time-expanded variations, metric and distance-based algorithms, and symmetric/asymmetric alternatives [39]. To address the challenge of multiple perspectives, a multi-approach method combining learning, exact, and heuristics methodologies is proposed.

Figure 4 depicts the procedural flow, including stop sequence generation, attribute, and cost matrix determination. The global section analyzes route features, employs regression for cluster ordering, and presents exact/heuristics approaches for individual stops. Validationresults are compared using a variation score. This seque ncing challenge was central in the LMRRC competition.




Figure 4. General routing sequencing flow treated in the section.


2.6.  General Analysis


The proposed methodology adopts a systematic process that incorporates regression, exact, and heuristic approaches to provide a direct and comprehensive solution. Regression approaches utilize specific learning algorithms and past experimentation to make predictions. However, relying solely on regression can lead to predictions that are too dependent on the training data, especially when considering multiple cities and contexts. Exact approaches aim to find precise solutions for constrained problems but can be computationally

intensive for practical scenarios. Heuristic approaches, on the other hand, offer approximate solutions within reasonable time frames, albeit with reduced accuracy and precision. The proposed procedure strategically combines these approaches to maximize their utility.

Figure 5 illustrates the overall procedure that encompasses all the considered approaches, ensuring relevant results within a realistic timeframe. This noniterative nature of the methodology facilitates efficient execution.




Figure 5. General routing process including a learning phase as well as exact and heuristics approaches.


Additionally, to enhance the initial analysis and facilitate comparison of final results, certain route features of interest are studied as part of the characteristics extraction step:

·         Route Extension Ratio (RER): Spatial relation per route, considering its entire set of zones.

·         Zone Extension Ratio (ZER): Spatial relation per zone, considering its entire set of stops.

·         Zone Extension Variation (ZEV): Difference between maximum and minimum spatial dimensions of zones included in the route (averaging in longitude and latitude).

·         Stops Amount per Zone (SAZ): Quantity of stops in a given zone from the data of a route.

·         Stops Variation per Zone (SVZ): Difference between the maximum and minimum number of stops of the total set of zones belonging to the route of interest.

·         Zones Amount per Route (ZAR): Quantity of zones in a given route.

The learning step estimates zone sequencing using LMRRC’s training data and the observation that cluster order is generally preserved regardless of the city, aiming to minimize Unicode variation between contiguous stop groups. However, to account for cases where learned cluster sequences are not faithfully continued in subsequent routes, exclusion conditionals are defined. Global exact and heuristic perspectives with specific parameters are then employed to complete cluster ordering not covered by the learning step and arrange remaining stops.


2.7.  Learning Methodology


The presented regression seeks to find relationships between contiguous stops clusters, based on the hypothesis that previously learned zone sequence is mostly maintained and reiterated on further routes. Clusters are divided in four layers and identified with alphanumerical characters with an heterogeneous spatial distribution and dynamic silhouettes. This necessity for a learning methodology is rooted on routes visualization


and examination of historical sequences, thus determining a correlation and dependency towards initially independent stops orderings which stem from clustering definition and identification consistencies.


2.7.1.      Empirical Motivation.


Observed stop sequences follow a pattern of visiting all stops within a zone before proceeding, prioritizing contiguous clusters and accounting for zone and package structure. This suggests a layered approach, simplifying routing with an average of 8 stops per cluster and emphasizing accurate zone sequencing. Figure 6 illustrates this layered perspective, completing stops within zones before advancing, regardless of proximity. The learning approach focuses on major and minor zones, aiming for simplicity and variable results, as major layers remain stable within a route.


2.7.2.      Algorithm for Cluster Ordering.


The training step of the regression algorithm builds a model from a training dataset, counting the repetitions of

cluster sequences for major and minor zones independently, grouped in sets of 7. Major and minor zones consist of alphabetic and numeric layers, with the numeric layer being the most variable (e.g., ’A-F’ and ’1-25’ character ranges). Relevant exclusion conditions include unequal first layers for major zones (i.e., dif (M1(j), M1(j+1)) ≥ 1) a difference of 3 or more points in the first layer of minor zones (i.e., dif (m(j)1, m(j+1)1) ≥ 3), and a Unicode variation of 6 or more points across all zone layers (i.e., ∑i dif(z(j)i, z(j+1)i) ≥ 6). The model’s depth is shallower for major zones but grows in specificity for minor zones. Considering consistent repeatability of identifiers and an average of 21 clusters per route, the training algorithm has a complexity of O(m) for routes with m zones and takes approximately 0.15 seconds on a 2.3 GHz Quad-Core i7 processor.

In the evaluation step, zones are initially ordered using the model sequences with the highest repetitions. Unused clusters are then checked against sequences with progressively lower repetitions and incorporated into the initial ordering if found. If there are any remaining unused zones, they are ordered using exact and approximate designs explained in the next section. This process also has a complexity of O(m) and takes around 0.4 seconds per route.



Figure 6. Layered approach to routing problem visualized through a route of California on Google Maps.



2.8.  Exact and Approximate Approaches


In order to determine each cluster’ stops sequence as well as to consider outliers and particularities from previous learning step, a set of rules exact and heuristics possibilities are defined. The suggested approaches, realized empirically through observation and parameter determinations, seek to provide solutions with realistic time complexities that approximate the given optimal results.


2.8.1.      Procedure Design.


The process of defining rules begins by understanding the problem and classifying each input route based on RER,

ZER, SAZ, and ZAR characteristics. Relevant attributes are then determined, forming a hierarchy that mirrors the decisions made by actual sequences. The defined approaches and configurations are tested using a scheduling procedure with new routes. Finally, the results obtained from these formulations and executions are compared to optimize the output.

Figure 7 outlines the chronological guidelines for defining and testing exact and heuristic parameters. The scheduling involves three steps based on the attributes hierarchies mentioned earlier. The capacity approach has the least influence on the overall route structuring, making variables such as package dimensions auxiliary. The temporal approach, which involves calculations based on the current time and time window specifications, is also considered secondary compared to typical global route characteristics.




Figure 7. General design for exact and heuristics parameters’ process definition.



2.8.2.      Parameters Definition.


The following attributes are thus observed useful for routing designation, focusing on empirically relevant cases and approaches:


·         Clusters Sequencing (CS): Order of zones that remain unused from learning approach.


-          Metric distance (CS1): Exact minimum euclidean separation from contiguous zones’ baricenters determined as the geographical middle point of each cluster polygon.

-          Set distance (CS2): Heuristic minimum Hausdorff interval between contiguous zones.


·         Stops Ordering (SO): Sequence of stops inside a given cluster.


-          Global approach (SO1): Exact reduced or exhaustive search for global minimum cost.

-          Local approach (SO2): Heuristic search for local minimum cost on contiguous stops.


·         Transition Stop Definition (TSD): Determination of stops inbetween clusters.


-          Last Inner Stop (TSD1): Last stop of current cluster is first stop of following cluster.

-          First Outer Stop (TSD2): Local approach between clusters for first stop of next cluster.

·         Time Window Conditional (TWC): Stops boost or delay in sequence based on time occurrence.


-          Cluster perspective (TWC1): Order modification based on exact time windows per cluster.

-          Route perspective (TWC2): Order modification based on heuristic global time windows.


·         Capacity Limitations (CL): Necessity to resupply transport at deposit based on load excess.

-          At cluster division (CL1): Resupply before entering following cluster if needed.

-          Maximum saturation (CL2): Resupply when the unit locally reached maximum capacity.


The utilization of these tasks or parameters does not rely on a training step as each scenario is evaluated directly through each route examination. The mentioned cost matrix for this routing optimization problem is defined as mostly time dependent as in Equation (7).






The complexity needed for the evaluated cases using these approaches consists of O(n + m) per route with n stops and m clusters, demanding for each route an average of 0.22 seconds on a 2.3 GHz Quad-Core i7 processor.


3.      Results and discusión


3.1.  Sequencing Examination


This section analyzes the validation sequences using qualitative and quantitative methods. LMRRC provides a score metric to observe the ordering results, including Sequence Deviation (SD) and Edit Distance with Real Penalty (ERPe and ERPnorm).






The metric yields positive values, where lower scores indicate better routing matching the optimal sequence. A score between 0.8 and 1.2 represents a uniformly random


order, and the sequence must start at a station without repetition. Scores below 0.1 are considered competitive based on the LMRRC criteria.


3.2.  Validation Results for Regression.


The regression-based zones ordering had an average score variation of 0.06 between correct and incorrect cluster sequencing. Major zones order accounted for approximately 0.02 of the variation, while minor zones order contributed to a 0.04 variation due to increased variability.

The proposed segmentation of 7 clusters facilitated the identification of coherent repetitions during the training step, disregarding transition sequences between segments. However, this clustering error affected less than 5% of observed scenarios, ensuring efficient data management. The error arose from a maximum repetition of 3 contiguous clusters per segment, resulting in a significant number of interleaving clusters.

Figure 8 presents example results and route sequences illustrating the aforementioned considerations. The main cause of error during the evaluation step was the secondary checks of model sequences with fewer repetitions and cluster orders that had the same repetition quantity. An improvement could involve weighted repetitions counts based on Unicode variations or additional empirical observations.





Figure 8. Results of approaches combinations on validation dataset, joined by sequencing layout of example routes with actual and proposed routing.



3.3.  Multimethodologies Considerations.


Utilizing metaheuristic, exact, and heuristic methods enhances the precision of the global stop routing system, especially when combined with the initial regression-based approach. Combinations of these methods, as shown in Figure 9, adapt to different route attributes. By modifying routing parameters, the combinations exchange global and local analysis functionalities, assessing their impact on validation paths and identifying the most suitable permutations.

The combined exact and heuristic approaches yield superior results for routes with higher RER and ZAR, particularly in cities like Chicago, Los Angeles, and Seattle. CS2 outperforms CS1 with a score variance of

0.013, as seen in Figure 9 and Table 1. SO1 achieves the best mean score of approximately 0.006 for small SAZ and larger ZER, while TSD1 generally outperforms TSD2. TWC and CL have minimal effects, but TWC1 and CL2 yield slightly better results. CS1 and SO1 are more suitable for low ZEVs and high SVRs, while TSD2 and TWC1 also provide improvements with minimal variation in the CL attribute.

Figure 9 illustrates the method combinations based on route characteristics and their associated validation scores. The dominant source of error is the CS analysis, which significantly impacts results. Potential improvements involve defining new parameters and exploring additional variants within the proposed approaches.




Figure 9. Average results and distributions of particular combinations of exact and approximate stop routing techniques, using characteristic route attributes.


Table 1. Evaluation of average results of combinations of methodologies of interest using the validation routes in different cities from USA provided by LMRRC. 



3.4.  Application to General Case


This section analyzes the testing results of the entire process, including stop differentiations, zone predictions, and customized routing mechanisms. The main observations pertain to the custom cost functions and adaptable routing techniques discussed in Sections 3 and 4. It concludes with observations on sequencing, clustering applications, and generalizations to routes outside of the United States. These route features have varying effects when combined with optimization algorithms derived from LMRRC competition.


3.5.  Routes Characteristics


Obtained real datasets consists of 66 routes from the country Uruguay located in the Southern Cone of South America. Its cities are considerably smaller than those LMRRC selected, which usually translates to shorter distances and travel times between stops. They also have a centralized and populous downtown where businesses abound while bidirectional streets meager, and a peripheral and extended uptown where most markets and residential homes are localized. Given that transports travel daily to both city sections, their stations remain separated from about 20 kilometers of the nearest urban stop. Moreover, its cities’ structures are not uniformly diagrammed, laying globally an average of 85 metres per block and a maximum of 145 metres which also impacts considered costs. Finally, the whole extension is fairly plain (i.e. with no stepped mountain nor valley) and lacking of bridges, tunnels or subways, which complicates long-distanced travel times.


3.6.  Optimization Algorithms Selection


The decision over four routing methodologies derives from their LMRRC’s ranking, availability and variability, being all defined using Julia and Python programming languages [40]. In particular, GMWis used given its focus and baseline on zone identifiers and its relationships. Then, ArsAb is selected due to its original genetic training phase and greater results on routes with more than 100 stops. Finally, the initial HSFv1perspective, which declares a non-trained procedure as in GMWcase, provides better results with extremes latitude over longitude route ratio.

These algorithms are all extracted from the MIT’s LMRRC Technical Proceedings [41], discussed on Section 1.3.3. Also, in addition to juxtapose results over mentioned algorithms, the current proposition HSFv2with a previous training phase and improved evaluative approaches is compared, reaching satisfactory and competitive results with more direct approaches that

successfully determines and makes use of the characteristics of each location. Indeed, the proposed procedure empirically benefit from the diversity of selected multimethodologies and the simplicity of its combinations. Preceding stops and zones determination are used as preliminary steps for all algorithms.


3.7.  Optimization Algorithms Results


The selected routes had square spatial extensions and clustered stops with minimal intersections, aiding zone identification and stop classification. Increased stops per zone led to non-square spatial extensions, introducing higher complexity favoring local heuristics over global approaches. Multiple nexus zones had distinct identifiers in different route analyses, but tuned parameters remained useful for new routes with minimal score variation.

Figure 10 shows a test route with concentrated stops in downtown areas and scattered stops elsewhere, with zone prediction not prioritizing cluster characteristics.

Table 2 displays quantitative results for the four routing algorithms. GMW consistently achieved better metrics with lower variance as the number of stops increased. ArsAb performed similarly for routes with many stops. HSFv2 improved over HSFv1, especially for intermediate route lengths. Zone sequencing was the main source of error, as learning was based on non-local training routes.

The proposed algorithms perform and escalates adequately in terms of time and spatial complexity when applied to last-mile routes with lengths of at most 155 stops on a 2.3 GHz Quad-Core i7 processor, with comparable and lower complexities than the aforementioned competition algorithms. This is considered acceptable, given the demands considerations of current real-world last-mile routes.

The selected procedures also remain flexible to further customization by modifying the cost matrices parameters as well as by enabling a different combination of sequencing attributes, aiming at supporting multiple transportation modes and constraints. Indeed, filtering sequencing attributes, particularly CS considerations, allow for more operative resource availability which translates to faster processing times and a greater scalability of the sequencing process.

Despite additional error sources from customized stops and zone predictions, the scores generally outperformed LMRRC’s sequences with similar averages.




Figure 10. Visualization over key flow steps using an example testing route of Montevideo (Uruguay).



Table 2. Applied algorithms result scores on testing routes, displayed with format (Min, Average, Max) 



4.      Conclusions


This section summarizes and describes the main accomplishments of this project with focus on processes generalization, including observations on future related research possibilities and considerations aimed at corroborating and improving obtained results.


4.1.  Global Summary


To summarize, this document provides a review of important project components, followed by key principles and methodologies employed. Limitations and sources of error related to the problems of interest are also discussed. The proposed methodology expands the LMRRC contest initiative by sequencing and planning geographically diverse routes using raw GPS records and datasets from the US and Uruguay. The process includes record filtering, prediction of new stop groups, and a combination of regression, exact, and heuristic approaches for routing. Applying this procedure to the provided data yields competitive scores in validation routes (US) and demonstrates acceptable adaptation and generalization in test routes (Uruguay) from both LMRRC and OTUC.

4.1.1.      Relevant Limitations


The designs and applications of problem models in transport logistics aim to achieve efficient objectives. However, there are practical limitations that hinder their adaptability:


·         Limited availability of routes and real routes from specific cities at an international level, provided by official entities, restricts the ability to conduct extensive and comprehensive studies on various structural possibilities.

·         Fixed precision of coordinate and numerical data from GPS for waypoint detection, as well as numerical values related to temporal and spatial attributes for cost matrix formulations, impose constraints on the accuracy of the models.

·         Customized cost matrices can be complex to create and maintain, especially for large and complex routing problems.

·         Euclidean distance is not scaling invariant, meaning that multiplying the data by a common factor will change the distance. Manhattan distance does not take into account the curvature of the Earth, which can lead to inaccurate distance calculations for long distances.



·         Expected travel time can be difficult to estimate accurately, especially in dynamic traffic conditions.

·         Finite number of proposed combinations of exact and heuristic approaches for stop sequencing balances the need for precise results while avoiding overfitting the global model for additional traversal cases.


Considering limitations helps analyze the proposal, suggest improvements, and guide future works.


4.1.2.      Main Error Sources


The main errors identified in the proposal’s global procedure are as follows:


·         Loss of sequences of zones of interest in the regression model due to the grouping of 7 zones and limited repetition of zones across routes within the same city.

·         The accuracy of a customized cost matrix depends on the quality of the data used to create it. If the data is inaccurate or incomplete, then the cost matrix will not be accurate.

·         Increased variance of temporal information in comparison to relatively static distance measurements.

·         In contrast to time measurements, different distances may be measured in different units (e.g., miles and kilometers) depending on the source.

·         High variability of path attributes, posing challenges in determining combinations of metaheuristic and heuristic approaches while maintaining a balance between performance and score differences.


These errors significantly impact the final results and hinder achieving optimal resolutions. However, despite these limitations, the project still meets the satisfactory accuracy target within its scope.


4.2.  Future Research Possibilities


Logistics and optimization techniques are constantly evolving with technology, aiming to provide better services. This project can be expanded within the same research area and complementary themes to offer original contributions for complex issues in academic and commercial environments.

Future possibilities include extending the proposed algorithms to similar contexts and sharing tools with relevant entities. Emphasis would be placed on disseminating and increasing the project’s visibility, ensuring functional compatibility with applications and developing user-friendly software.

Another possibility is exploring the cargo routing problem using quantum perspectives, such as adiabatic annealers, variational methods, and quantum learning. These approaches address combinatorial complexity and utilize quantum computing’s potential for massive data management, analysis, and processing. They align with the growing demand for optimization in transport logistics and attract scientists and engineers worldwide to collaborate on routing optimization strategies.


We would like to share our appreciation with Universidad de Montevideo for their initiative and support, together with, MIT (Massachusetts Institute of Technology) as well as CINOI (Industrial Organization’s Innovation Center) and OTUC (Urban Cargo Transport Observatory) for the fundamental real route data provisioning.


[1] S. Pratap, M. Zhang, C. Shen, and G. Q. Huang, “A multi-objective approach to analyse the effect of fuel consumption on ship routing and scheduling problem,” International Journal of Shipping and Transport Logistics, vol. 11, p. 161, 01 2019. [Online]. Available:

[2] O. Turbaningsih, “The study of project cargo logistics operation: a general overview,” Journal of Shipping and Trade, vol. 7, no. 1, p. 24, Nov 2022. [Online]. Available:

[3] MIT, Amazon last-mile routing research challenge. Supported by the MIT center for transportation logistics. MIT center for transportation logistics, 2023. [Online]. Available:

[4] A. Charnes, W. W. Cooper, and R. O. Ferguson, “Optimal estimation of executive compensation by linear programming,” Management Science, vol. 1, no. 2, pp. 138–151, 1955. [Online]. Available:




[5] G. B. Dantzig, D. R. Fulkerson, and S. M. Johnson, Solution of a Large-Scale Traveling-Salesman Problem. Santa Monica, CA: RAND Corporation, 1954. [Online]. Available:

[6] Z. Wang, M. Zhang, R. Chu, and L. Zhao, “Modeling and planning multimodal transport paths for risk and energy efficiency using and/or graphs and discrete ant colony optimization,” IEEE Access, vol. 8, pp. 132 642–132 654, 2020. [Online]. Available:

[7] Y. Niu, Z. Yang, P. Chen, and J. Xiao, “A hybrid tabu search algorithm for a realworld open vehicle routing problem involving fuel consumption constraints,” Complexity, vol. 2018, pp. 1–12, 02 2018. [Online]. Available:

[8] M. Asghari and S. M. J. Mirzapour Al-e-hashem, “Green vehicle routing problem: A state-of-the-art review,” International Journal of Production Economics, vol. 231, p. 107899, 2021. [Online]. Available:

[9] H. Zhang, F. Liu, Y. Zhou, and Z. Zhang, “A hybrid method integrating an elite genetic algorithm with tabu search for the quadratic assignment problem,” Information Sciences, vol. 539, pp. 347–374, 2020. [Online]. Available:

[10] M. M. Flood, “The traveling-salesman problem,” Operations Research, vol. 4, no. 1, pp. 61–75, 1956. [Online]. Available:

[11] K. Saito, M. Aono, and S. Kasai, “Amoebainspired analog electronic computing system integrating resistance crossbar for solving the travelling salesman problem,” Scientific Reports, vol. 10, no. 1, p. 20772, Nov 2020. [Online]. Available:

[12] H. Bennaceur and E. Alanzi, “Genetic algorithm for the travelling salesman problem using enhanced sequential constructive crossover operator,” International Journal of Computer Science and Security (IJCSS), vol. 11, p. 42, 01 2017.

[13] A. Sergeenko, O. Granichin, and M. Yakunina, “Hamiltonian path problem: the performance comparison deoxyribonucleic acid computing and the branch-and-bound method,” Journal of Physics: Conference Series, vol. 1536, no. 1, p. 012003, may 2020. [Online]. Available: 

[14] M. Borowski, P. Gora, K. Karnas, M. Błajda, K. Król, A. Matyjasek, D. Burczyk, M. Szewczyk, and M. Kutwin, “New hybrid quantum annealing algorithms for solving vehicle routing problem,” in Computational Science – ICCS 2020, V. V. Krzhizhanovskaya, G. Závodszky, M. H. Lees, J. J. Dongarra, P. M. A. Sloot, S. Brissos, and J. Teixeira, Eds. Cham: Springer International Publishing, 2020, pp. 546–561. [Online]. Available:

[15] F. Hernández, K. Díaz, M. Forets, and R. Sotelo, “Application of quantum optimization techniques (QUBO method) to cargo logistics on ships and airplanes,” in 2020 IEEE Congreso Bienal de Argentina (ARGENCON), 2020. [Online]. Available:

[16] S. J. Weinberg, F. Sanches, T. Ide, K. Kamiya, and R. Correll, “Supply chain logistics with quantum and classical annealing algorithms,” Scientific Reports, vol. 13, no. 1, p. 4770, Mar 2023. [Online]. Available:

[17] R. Haba, M. Ohzeki, and K. Tanaka, “Travel time optimization on multi-agv routing by reverse annealing,” Scientific Reports, vol. 12, no. 1, p. 17753, Oct 2022. [Online]. Available:

[18] P. T. G. dos Santos and D. Borenstein, “Multiobjective optimization of the maritime cargo routing and scheduling problem,” International Transactions in Operational Research, vol. 31, no. 1, pp. 221–245, 2024. [Online]. Available:

[19] M. Oliskevych, V. Danchuk, and O. Mastykash, “Cross-docking cargo delivery routing for guaranteed minimum period,” Transport Technologies, vol. 3, no. 1, 2022. [Online]. Available:

[20] M. El Krari, B. Ahiod, and B. El Benani, “An empirical study of the multi-fragment tour construction algorithm for the travelling salesman problem,” in Proceedings of the 16th International Conference on Hybrid Intelligent Systems (HIS 2016), A. Abraham, A. Haqiq, A. M. Alimi, G. Mezzour, N. Rokbani, and A. K. Muda, Eds. Cham: Springer International Publishing, 2017, pp. 278–287.

[21] Punnen, Margot, and Kabadi, “Tsp heuristics: Domination analysis and complexity,” Algorithmica, vol. 35, no. 2, pp. 111–127, Feb 2003. [Online]. Available:

[22] W. Cook, S. Held, and K. Helsgaun, “Constrained local search for last-mile routing,” in Technical Proceedings of the Amazon Last Mile Routing Research Challenge, 12 2021. [Online]. Available:

[23] X. Guo, B. Mo, and Q. Wang, “Amazon last-mile delivery trajectory prediction using hierarchical TSP with customized cost matrix,” Technical Proceedings of the Amazon Last Mile Routing Research Challenge, 2023. [Online]. Available:

[24] O. Arsian and R. Abay, Data-driven vehicle routing in last-mile delivery. Cirrelt, 2021. [Online]. Available:

[25] F. Hernández, R. Sotelo, and M. Forets, “Combined exact and heuristicsbased approach to hamiltonian path problem optimization for routeplanning,” Technical Proceedings of the Amazon Last Mile Routing Research Challenge, 2021. [Online]. Available:


[26] A. Escudero-Santana, J. Muñuzuri, A. Lorenzo-Espejo, and M.-L. Muñoz-Díaz, “Improving e-commerce distribution through last-mile logistics with multiple possibilities of deliveries based on time and location,” Journal of Theoretical and Applied Electronic Commerce Research, vol. 17, no. 2, pp. 507–521, 2022. [Online]. Available:

[27] H. Wen, Y. Lin, H. Wan, S. Guo, F. Wu, L. Wu, C. Song, and Y. Xu, “Deeproute+:Modeling couriers’ spatial-temporal behaviors and decision preferences for package pick-up route prediction,” ACM Trans. Intell. Syst. Technol., vol. 13, no. 2, jan 2022. [Online]. Available:

[28] M. Bilgin and N. Bulut, “Compatibility themed solution of the vehicle routing problem on the heterogeneous fleet,” The International Arab Journal of Information Technology, vol. 19, 01 2022. [Online]. Available:

[29] X. Xu, W. Liu, M. Jiang, and Z. Lin, “A multicycle and multi-echelon location-routing problem for integrated reverse logistics,” Industrial Management & Data Systems, vol. 122, no. 10, pp. 2237–2260, Jan 2022. [Online]. Available:

[30] M. Fernando, A. Thibbotuwawa, H. N. Perera, and R. C. Ratnayake, “Close-open mixed vehicle routing optimization model with multiple collecting centers to collect farmers’ perishable produce,” in 2022 International Conference for Advancement in Technology (ICONAT), 2022. [Online]. Available:

[31] R. Ivut and I. Tsarenkova, “Formation of logistics approach to economic development of road sector of the republic of belarus,” Science & Technique, vol. 21, no. 1, pp. 73–81, 2022. [Online]. Available:

[32] M. I. D. Ranathunga, A. N. Wijayanayake, and D. H. H. Niwunhella, “Simulation-based efficiency assessment of integrated first-mile pickup and last-mile delivery in an ecommerce logistics network,” in 2022 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 5, 2022, pp. 246–253. [Online]. Available:

[33] Y. Zhang, “Research on electronic commerce logistics dispatching routing optimization method under the background of big data,” in Proceedings of the 7th International Conference on Intelligent Information Processing, ser. ICIIP ’22. New York, NY, USA: Association for Computing Machinery, 2023. [Online]. Available:

[34] F. Baig, K. Kirytopoulos, J. Lee, E. Tsamilis, R. Mao, and P. Ntzeremes, “Changes in People’s Mobility Behavior in Greece after the COVID-19 Outbreak,” Sustainability, vol. 14, no. 6, March 2022. [Online]. Available:

[35] L. Du, X. Li, Y. Gan, and K. Leng, “Optimal model and algorithm of medical materials delivery drone routing problem under major public health emergencies,” Sustainability, vol. 14, no. 8, 2022. [Online]. Available:

 [36] G. Wu, N. Mao, Q. Luo, B. Xu, J. Shi, and P. N. Suganthan, “Collaborative truck-drone routing for contactless parcel delivery during the epidemic,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 12, pp. 25 077–25 091, 2022. [Online]. Available:

[37] J. Chang, C. Y. Tang, and Y. Zhu, “Efficiently handling constraints with metropolis-adjusted langevin algorithm,” ArXiv, 2023. [Online]. Available:

[38] M.-Y. Wu, C.-K. Ke, and S.-C. Lai, “Optimizing the routing of urban logistics by context-based social network and multi-criteria decision analysis,” Symmetry, vol. 14, no. 9, 2022. [Online]. Available:

[39] A. Arigliano, G. Ghiani, A. Grieco, E. Guerriero, and I. Plana, “Time-dependent asymmetric traveling salesman problem with time windows: Properties and an exact algorithm,” Discrete Applied Mathematics, vol. 261, pp. 28–39, 2019, gO X Meeting, Rigi Kaltbad (CH), July 10–14, 2016. [Online]. Available:

[40] H. F. (2021) Smartrouting. Github. [Online]. Available:

[41] M. Winkenbach, S. Parks, and J. Noszek, Technical Proceedings of the Amazon Last Mile Routing Research Challenge. MIT Libraries, 2021. [Online]. Available: