Efficient Algorithms for Identifying Loop Formation and Computing θ Value for Solving Minimum Cost Flow Network Problems

While the minimum cost flow (MCF) problems have been well documented in many publications, due to its broad applications, little or no effort have been devoted to explaining the algorithms for identifying loop formation and computing the θ value needed to solve MCF network problems. This paper proposes efficient algorithms, and MATLAB computer implementation, for solving MCF problems. Several academic and real-life network problems have been solved to validate the proposed algorithms; the numerical results obtained by the developed MCF code have been compared and matched with the built-in MATLAB function Linprog() (Simplex algorithm) for further validation. Key-Words: Minimal Cost Flow, Loop Formation, theta Received: December 2, 2020. Revised: May 31, 2021. Accepted: June 16, 2021. Published: June 24, 2021.


Introduction
There exist linear-programming (LP) problems [1][2][3][4][5][6][7][8] which exhibit a special structure, one balanced equation for each node in the network, that can be exploited for the development of efficient algorithms to obtain optimal solutions. The minimum cost flow (MCF) problems [9][10][11] are considered in this study due to their broad industrial, logistics applications. The transportation problem, the assignment problem, the shortest path problem, and more can be considered special cases of the MCF problem [12][13][14][15]. Although the MCF problem can be naturally formulated as a LP problem, the special structure of MCF problems need to be exploited to efficiently solve large-scale systems, which are otherwise not possible, or highly resource intensive, using a traditional LP/SIMPLEX algorithm.
A common scenario of a network-flow problem arising in industrial logistics concerns regarding the distribution of goods/products from a set of originating (supply) nodes to a set of destination (demand) nodes. The number of units available at each supply node and the number of units required at each demand node is assumed to be known. The delivered product from supply nodes to demand nodes often pass through "transshipment" nodes (neither supply nor demand), which may represent warehouses or distribution centers along the distribution path. The objective is to minimize the total cost of shipping the products to satisfy all consumers' requests at demand nodes. To facilitate the discussions, a simple 5-node and 7-link network [10], shown in Fig 1, was chosen; it will be used to explain different steps in the conventional MCF algorithm, with special emphasis on developing an efficient algorithm for identifying the loop-formation and computing the value for θ. In Fig 1, nodes #1 and #3 are identified as supply nodes (with the values 1 = 6 and 3 = 4), while nodes #4 and #5 are identified as demand nodes (with the values 4 = −5 and 5 = −5), and node #2 is a transshipment node (with the value 2 = 0). The cost ( ) for transporting each unit of product on any directional link i-j (from node "i" to node "j") are also shown in  In Section 2, the MCF network presented in Fig 1 is formulated as a Linear Programming (LP) problem, so that its special structure can be highlighted. Details of the three steps involved in each iteration for Phase 1 and Phase 2 of the MCF algorithms are explained in Section 3. The efficient algorithm for identifying loop-formation and computing θ value for MCF problems is also presented in Section 3. Numerical results for several academic and real-life networks are presented and validated in Section 4. Finally, conclusions are drawn in Section 5.

The MCF Linear Programming Problem
The MCF network presented in Fig 1 is formulated as a LP problem, to take advantage of one balanced equation for each node in the network. Based on the conservation of flow balance (equilibrium) at each node, such that: The following equations can be written for the network shown in Fig 1: Thus, the following LP problem (Eq. 7) can be formulated from the network costs shown in Fig 1, and the balanced equations above.
The objective is to find the amount of link flow ( ) directly connected between nodes "i" and "j" such that the summation of the link costs times the link flows for the network is minimized: with the constraints presented in Eqs. (2)(3)(4)(5)(6), n = number of nodes, = non-negative (integer) cost, and = non-negative (integer) flow. It should be noted that the summation of all terms on the left-hand-side (LHS) of Eqs. (2)(3)(4)(5)(6) and the corresponding summation of all terms on the righthand-side (RHS) of Eqs. (2)(3)(4)(5)(6) are both equal to zero. This special (unimodal) property will allow the MCF algorithm, which can be considered a special case of LP/Simplex solver, to obtain the integer values for at the optimum, without being required to solve a more costly Linear Integer Programming (LIP) problem. From the "duality theories" [1][2][3][4], the above 7-variable and 5-constraint "Primal" LP problem can be associated with the following 5-variable and 7-constraint "Dual" LP problem.

The Minimum Cost Flow Algorithm
With regards to the conventional MCF problem, the objective for Phase 1 is to find the BFS, while the objective for Phase 2 is to obtain the OFS. The MCF method is an iterative process that consists of identifying which non-basic link (not currently in the solution) should be brought into the basic (active links) set, identifying the current loop, adjusting the link flow along the loop, and removing unneeded links.

Initialization
To initialize the network and begin the iterative process of calculating the BFS, an artificial node #A Xode Flow Out -Flow In Net Supply/Demand +x,: + x,.
(2) -X1: +x., (with = 0) is added to the original network (shown in Fig 1). Then, we add artificial links connected from supply nodes (node #1 and node #3) to node #A and artificial links connected from node #A to each demand and transshipment nodes (node #2, node #4, and node #5).

Iterative Step #1 -Calculate Node Weights
At the beginning of each iteration, we repeatedly apply the following equation among the basic links to calculate , with = 0 (associated with the artificial node #A): The equivalent version of Eq. (10), in matrix notation, has already been explained in Eq. (8) of Section 2.
Then, for our 5-node example with an artificial node: Remarks:  In general, the selected #th node with the known/assigned value is NOT always to be the artificial node. In the MATLAB code, = 0 for node ## in which node ## has the largest number of connected basic links is selected; this should reduce the number of unknown node weights.  In this example, since there are a total of six nodes (five original nodes plus the artificial node), the number of basic links required to solve for all is five (six nodes total minus one).  In applying Eq. (10), it is required that either or is known, so that either (if is known) or vice versa can be computed.  Since this Step #1 is straight forward, the details for the MATLAB coding are not discussed here.

Iterative Step #2 -Determine Which Non-Basic Link Enters the Basic Set
Among the non-basic links (dashed links in Fig 2), one needs to determine which non-basic link should be added to the basic (solid links in Fig 2) group, based on the maximum positive value, v, in the following equation: Then, for the current example's non-basic links: some researchers have suggested a random selection is adequate in the case of a tie, the recommended criterion is to select the link which has the minimum, original cost (see Fig 1). Hence, in this example, link 3-5, with an original cost of $3, will be selected to enter the basic group.  Since this Step #2 is straight forward, the details for MATLAB coding will not be discussed here.

Iterative Step #3 -Determine Which Basic Link is Removed
Among the basic links, one needs to determine which basic link(s) is/are to be removed from the basic set, by identifying the loop-formation and computing the value of θ.
At the end of Step #2, it has been determined that link 3-5 should "enter the basic set", with the flow value 35 = , as shown in   The analysis is further restricted to only realistic solutions, introducing the constraint that the flow on any link must be "non-negative", then basic links: Based on the requirements stated in Eqs. (24-26), one concludes that the requirement θ ≤ 4 will control the outcome; when θ ≤ 4 is satisfied, θ ≤ 5 is also automatically satisfied. Thus, the maximum positive value for θ is, θ = 4.
Substituting the value of θ = 4 into Eqs.
To develop a simple algorithm which will automatically identify the loop-formation and compute the appropriate value for θ, one must answer the following 4 questions: 1. Among the basic links, how is it determined (automatically) which basic link i-j does NOT belong to the loop-formation; which links should NOT have their flow adjusted by the value of θ? 2. How is it determined (automatically) which basic links i-j (such as basic links 3-5, 3-A, and A-5) belong to the loop-formation? 3. How is the appropriate value for θ computed (automatically)? 4. For those links that should belong to the loop-formation (such as links 3-5, 3-A, and A-5), how is it determined that the value of θ should be "ADDED" or "SUBTRACTED"?
To answer the previous questions, the following terms are defined:  A "dead node" is defined as a node which is connected by only one or no basic link (solid line in the figures), and  A "dead basic link" is defined as a basic link (solid line) which is connected to a "dead node"; furthermore, "dead nodes" and any associated "dead links" can be DELETED (ignored or discarded) during the process of finding the θ value. Based on the above definitions, and using Fig 4 (focusing on basic links only), one concludes that (see Fig 5):  Node #1 is a "dead node" (because there is only one basic link 1-A connected to node #1), hence node #1's associated "dead link 1-A" can be DELETED.  Node #2 is a "dead node" (because there is only one basic link A-2 connected to node #2), hence node #2's associated "dead link A-2" can be DELETED.  Node #4 is a "dead node" (because there is only one basic link A-4 connected to node #4), hence node #4's associated "dead link A-4" can be DELETED.

Fig 5. Dead Links Deleted
Note:  The remaining nodes (#3, #5, and #A), and the remaining basic links (3-A, A-5, and 3-5) will form the closed loopformation needed to calculate the value of θ.  All nodes that belong to the closed loopformation will be connected (or surrounded) by at most two basic links (i.e., node #3 has link 3-5 and link 3-A; no others).
The following algorithm can be developed to automatically identify the closed loop-formation and compute θ value (in Step #3): Step #3.1 Identify all the "dead nodes".
Step #3.2 Identify and DELETE all the "dead basic links". Step #3.3 In Step #2, it was determined that the non-basic link 3-5 will "enter the basic group"; node #3 is defined as the "loose node #i" and node #5 is defined as the "loose node #j".
Step #3.4 The "link status" for the "loose node #i = loose node #3" can be defined as one of the following four possibilities as detailed in Fig 6: Link Status = 1: "out more", (Link Status = 1 is selected see Fig 6) Link Status = 2: "out less", Link Status = 3: "in more", or Link Status = 4: "in less" Once the status for the current "loose node #i" is updated, the new "loose node #i" shifts to the next connected node, in this case node #A.
Step #3.5 The "link status" for the "loose node #j = loose node #5" can be defined as one of the following four possibilities as detailed in Fig 6: Link Status = 1; "out more", Link Status = 2; "out less", Link Status = 3; "in more", or (link Status = 3 is selected see Fig 6)

Link Status = 4; "in less"
Once the status for the current "loose node #j" is updated, the new "loose node #j" shifts to the next connected node, in this case node #A. Step #3.6 In Fig 7, a basic link which is connected to the "loose node #i = loose node #3" can be EITHER case #1 OR case #4. In case #1, if the basic link u-i is connected TO a loose node #i (at this point node #3), then the flow for link u-i is + (i.e., flow on u-i), so that the loose node #i will maintain its equilibrium. Under this scenario, the UPDATED loose node #i becomes loose node #u (node #A in our example), and the UPDATED link status for the loose node #u is 1, which means "out more". Since the new flow (= + ) is required to be non-negative, one can have any non-negative value for θ.
In case #4, if the basic link i-u is connected FROM a loose node #i (at this point node #3), then flow for link i-u is − (i.e., flow on i-u), so that the loose node #i will maintain its equilibrium. Under this scenario, the UPDATED loose node #i becomes loose node #u (node #A in our example), and the UPDATED link status for the loose node #u is 4, which means "in less". Since the new flow (= − ) is required to be non-negative, the value of θ must be ≤ (LESS THAN or EQUAL TO ).
Step #3.7 In Fig 7, a basic link which is connected to the "loose node #j = loose node #5" can be EITHER case #2 OR case #3.
In case #2, if the basic link v-j is connected TO a loose node #j (at this point node #5), then the flow for link v-j is − (i.e., flow on v-j), so that the loose node #j will maintain its equilibrium. Under this scenario, the UPDATED loose node #j becomes loose node #v (node #A in our example), and the UPDATED link status for the loose node #v is 2, which means "out less". Since the new flow (= − ) is required to be non-negative, the value of θ must be ≤ (LESS THAN or EQUAL TO ). In case #3, if the basic link j-v is connected FROM a loose node #j, then the flow for link j-v is + (i.e., flow on j-v), so that the loose node #j will maintain its equilibrium. Under this scenario, the UPDATED loose node #j becomes loose node #v (node #A in our example), and the UPDATED link status for the loose node #v is 3, which means "in more". Since the new flow (= + ) is required to be non-negative, one can have any non-negative value for θ.
Steps #3.6 and #3.7 are repeated until the "loose #i = loose #j" and the loop is closed.
The final value for θ should be the smallest, positive value among cases #1, #2, #3, and #4, to guarantee the flows on every basic link of the closed loop will be non-negative and maintain equilibrium. After Step #3 in each iteration is completed, Steps #1 through #3 will be repeated in subsequent iterations Fig 8 -Fig 10), until the feasible solution, for Phase 1, is identified. The cost at the end of Phase 1 is zero because all artificial links with any flow have been removed (Fig 10).

B. Phase 2 -Calculate the Optimal Final Solution (OFS)
The objective for Phase 2 is to find an overall, OFS for the original (in our example, Fig 1, a 5-node and 7-link) network.
The BFS obtained at the end of Phase 1 (see Fig 10) will be used as the beginning of Phase 2 (see Fig 11). In Phase 2, the original links' cost with the value are used, and an #th node, with the most connected links (in this case, node #3), is selected such that 3 = known value = 0. Then, the same 3step algorithm (steps #1, #2, and #3) discussed in Phase 1 will be applied for Phase2.

Numerical Results
Based on the 3-step MCF algorithm (applied in Phase 1 and Phase 2) presented in Section 3, and the newly proposed shortest path (SP) algorithm used in Phase 1 and presented in Section 4, the following academic and real-life networks are used to validate our proposed algorithm for identifying the closed loop-formation and computing the appropriate value for θ (see the discussions in 3.1.4, for Step #3 of Phase 1 and Phase 2). All our optimum results, for these examples, have been compared with, and validated using, the built-in MATLAB Linprog() function (Simplex algorithm) [16] and the results were recorded with each example in this section. The process times were measured over 36 trials and the average of the median 30 measurements were included in the analysis of each example.

Example #1: 5-Node/7-Link Network
The example #1 network is displayed in Fig 12. In this example the conventional MCF and Linprog() solutions produced the same optimum solution.

Example #2: 9-Node/14-Link Network
The example #2 network is displayed in Fig 16. In this example, the MCF solution results in an optimal (minimum cost) value of 71 which is verified by the Linprog() solution; however, it required both Phase 1 (solution = 80) AND Phase 2 to reach the Linprog() verified solution (Fig 19).

Example #3: 20-Node/74-Link Network
The example #3 network is displayed in Fig 20. In this example, the MCF solution results in an optimal (minimum cost) value of 1775 which is verified by the Linprog() solution; however, it required both Phase 1 (solution = 2000) AND Phase 2 to reach the Linprog() verified solution (Fig 33).

Example #4: 24-Node/76-Link Sioux Falls
The example #4 network is displayed in Fig 25. In this real-life, Sioux Falls (Fig 24) example, the conventional MCF method required Phase 2 to refine the feasible solution and deliver the OFS (Fig 27). However, the conventional MCF method produced an alternate optimal flow; indicating that there is more than one path that will result in the same optimal value or cost.

Conclusion
The most complicated step in the conventional MCF (Phase 1 and Phase 2) algorithm occurs in Step #3 (identifying the closed loop and computing the appropriate value for θ). To the authors' best knowledge, most (if not all) existing literature does NOT discuss the (automated) numerical algorithm to accomplish Step #3 of the conventional MCF method. This work has explained, in detail, the stepby-step procedures to implement the conventional MCF algorithm, with emphasis on Step #3 (see Section 3.1.4) to determine the loop-formation and value for θ.