A Novel Algorithm for Freeing Network from Points of Failure

Rahul Gupta and Suneeta Agarwal

Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, India
rahulgupta_mnnit@yahoo.co.in, suneeta@mnnit.ac.in

Abstract. A network design may have many points of failure, the failure of any of which breaks the network into two or more parts and thereby disrupts communication between the nodes. This paper presents a heuristic for making an existing network more reliable by adding new communication links between certain nodes. The algorithm ensures the absence of any point of failure after the addition of the minimal number of communication links that it determines. The paper further presents theoretical proofs and results which establish the minimality of the number of new links added to the network.

Keywords: Points of Failure, Network Management, Safe Network Component, Connected Network, Reliable Network.

1 Introduction

A network consists of a number of interconnected nodes that communicate with each other through the communication channels between them. A wired communication link between two nodes is more reliable [1]. Various topology designs have been proposed for various network protocols and applications [2][3], such as bus topology, star topology, ring topology and mesh topology. These network designs can leave certain nodes as failure points [4][5][6]. Such nodes become very important and must remain working all the time. If one of them goes down for any reason, the network breaks into segments and communication among nodes in different segments is disrupted. Hence these nodes make the network unreliable. In this paper, we design a heuristic which can handle the failure of a single node. The algorithm adds a minimal number of new communication links between the nodes so that a single node failure does not disrupt communication among the communicating nodes.
E. Corchado et al. (Eds.): CISIS 2008, ASC 53, pp. 219–226, 2009. © Springer-Verlag Berlin Heidelberg 2009

2 Basic Outline

The network designs in common use are ring topology, star topology, mesh topology and bus topology [1][4][5]. Each of these designs has its own advantages and disadvantages. Ring topology does not contain any point of failure. Bus topology, on the other hand, has many points of failure. Star topology contains a single point of failure, the failure of which disrupts communication between every pair of communicating nodes. Points marked P in figure 2 are the points of failure in the network design.

Star topology and bus topology are the least reliable from the point of view of the failure of a single communication node. In star topology there is always exactly one point of failure, the failure of which breaks the communication between all pairs of nodes, so that no nodes can communicate further. In a network of n nodes connected by bus topology, there are (n-2) points of failure. Ring topology is the most advantageous and has no points of failure.

For a reliable network, there must be no point of failure in the network design. Points of failure can be made safe by adding new communication links between nodes in the network. In this paper, we present an algorithm which finds the points of failure in a given network design. The paper further presents a heuristic which adds a minimal number of communication links to the network to make it reliable. This ensures the removal of points of failure at the least possible cost.

3 Algorithm for Making Network Reliable

We have designed an algorithm to find the points of failure in the network and an algorithm for converting these failure points into non failure points by the addition of a minimal number of communication links.
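The counts quoted in Section 2 (none for a ring, one for a star, n-2 for a bus) can be checked on small instances. The following is a minimal brute-force sketch, not part of the paper's algorithm: it removes each node in turn and tests whether the remaining nodes can still reach each other. The helper names (`is_connected`, `points_of_failure`, `build`, `bus`, `star`, `ring`) are hypothetical, and modelling the bus as a simple chain of nodes is an assumption made here because it matches the (n-2) figure.

```python
from collections import deque

def is_connected(adj, removed=None):
    """Breadth-first check that all nodes except `removed` reach each other."""
    nodes = [v for v in adj if v != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w != removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

def points_of_failure(adj):
    """Nodes whose removal disconnects the remaining nodes (brute force)."""
    return sorted(v for v in adj if not is_connected(adj, removed=v))

def build(n, links):
    """Adjacency sets for nodes 0..n-1 joined by the given undirected links."""
    adj = {v: set() for v in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def bus(n):
    # Assumption: the bus is modelled as a chain 0-1-...-(n-1).
    return build(n, [(i, i + 1) for i in range(n - 1)])

def star(n):
    # Node 0 is the hub.
    return build(n, [(0, i) for i in range(1, n)])

def ring(n):
    return build(n, [(i, (i + 1) % n) for i in range(n)])

n = 6
print(len(points_of_failure(bus(n))))  # 4, i.e. n - 2 interior nodes
print(points_of_failure(star(n)))      # [0], only the hub
print(points_of_failure(ring(n)))      # [], no point of failure
```

The brute-force test costs one traversal per node, which is fine for checking small designs but is not the depth first search technique of Section 3.2.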
3.1 New Terms Coined

We have coined the following terms, which aid in developing the algorithm and in understanding the network design.

N – Nodes of the Network
E – Links in the Network
P – Set of Points of Failure
S – Set of Safe Network Components
Pi – Point of Failure
Si – Safe Network Component
Si(a) – Safe Network Component Attached to the Failure Point ‘a’
B – Set of all Safe Network Components each having a Single Point of Failure in the Original Network
Bi – A Safe Network Component having a Single Point of Failure in the Original Network
|B| – Cardinality of Set B
Fi – Point of Failure Corresponding to the Original Network in the member ‘Bi’
NFi – Non Failure Point corresponding to the Original Network in the member ‘Bi’
L – Set of New Communication Links Added
Li – A New Communication Link
C – Matrix List for the Components Reachable
dfn(i) – Depth First Number of the node ‘i’
low(i) – Lowest Depth First Number of the Node Reachable from ‘i’

The points of failure are the nodes in the network, the failure of any of which breaks the network into isolated segments which cannot communicate with each other. A safe network component is a maximal subset of connected nodes of the complete network which, considered on its own, does not contain any point of failure. A safe component can handle a single failure occurring at any of its nodes. We have developed an algorithm which finds the minimal number of communication links to be added to the network to make it capable of handling a single failure of any node. A safe component may contain more than one point of failure of the original network. The algorithm treats the components having only a single point of failure differently. ‘B’ is the set of all safe components having only a single point of failure in the original network. ‘Fi’ is the point of failure of the original network design that lies in ‘Bi’.
‘C’ is the matrix of reachable components. Each row in the matrix corresponds to the components reachable through one outgoing link from the point of failure. The representatives in each row correspond to all the components that have a single point of failure and lie on that outgoing link. The new communication links added to the network are collected in the set ‘L’. The set contains the pairs of nodes between which links must be added to make the network free of points of failure.

Fig. 1. (a) An example network design, (b) Safe components in the design

3.2 Algorithm for Finding Points of Failure and Safe Components

To find all the points of failure in the network, we use the depth first search technique [7][8], starting from any node in the network. Nodes that are reachable through more than one path become part of a safe component. Nodes that are connected through only one path are vulnerable: their communication can be disrupted by the failure of any one node on the single path available to them. The network is represented by a matrix of nodes connected to each other with edges representing the communication links. Each node of the network is numbered sequentially in depth first search order, starting from 0. This number forms the dfn of the node. The unmarked nodes reached from a node are called its children, and the node itself becomes their parent. The algorithm finds the low of each node, all the points of failure and all the safe components in the network. The starting node is a point of failure if some unmarked nodes remain even after fully exploring any one single path from it. Any other node u is a point of failure if it has a child w with low(w) ≥ dfn(u), since no node below w can then reach an ancestor of u without passing through u.

3.3 Algorithm for Converting Points of Failure into Non Failure Points

In this section, we describe our algorithm for the conversion of points of failure into non failure points by the addition of new communication links. The algorithm adds a minimal number of new links, which ensures the least possible cost of making the network reliable. The algorithm is based on the observation that a safe component having more than one point of failure is necessarily connected, directly or indirectly, to a safe component having only one point of failure. Such a component can therefore become part of a larger safe component through more than one path, each originating from one of the points of failure of the original network present in the component. Thus, if the components having only one point of failure in the original network are made part of a larger safe component, the components having more than one point of failure become safe as well. The algorithm finds new links that make the safe component larger and larger until it finally includes all the nodes of the network. When the maximal safe component consists of all the nodes of the network, the whole network is safe and all points of failure are removed. The following steps are followed in order.

1. Initially the set L = ∅ is taken.

2. P, the set of points of failure, is found using the algorithm described in section 3.2. The algorithm also finds all safe components of the network and adds them to the set S. Each Si has a copy of the failure point within it. Hence, the failure points are replicated in each component.

3. Find the subset B of safe components having only a single point of failure in the original network by using the sets S and P found in step 2. Let these component members be B1, B2, B3, …, Bk. These Bi’s are mutually disjoint with respect to non failure points.

4. Each of the components Bi has at least one non failure point. Any one non failure point node is named NFi and taken as the representative of the component Bi.

5. The failure point present in the maximum number of safe components is chosen, i.e., the node whose failure creates the maximum number of safe components. Let it be named ‘s’.

6. Let S1(s), S2(s), S3(s), …, Sm(s) be all the safe components having the failure point ‘s’. Each of these components may have one or more points of failure corresponding to the original network. If a component has more than one point of failure, other safe components are reachable from it through points of failure other than ‘s’.

7. Now we create the lists of components reachable from the point of failure ‘s’. For each Sj(s), j = 1, 2, …, m: if the component contains only one point of failure, add the representative of this component to the list as the next row element; if the component contains more than one point of failure, take the reachable safe components having only one point of failure and add their representatives to the list C. These components are found by a depth first search starting from this component. All the components that are reachable from the same component are considered for the same row, and their representatives are added to the same row of the matrix C. The number of elements in each row of matrix C corresponds to the number of components that are reachable from the point of failure ‘s’ through that one outgoing link. Note that only the components having exactly one point of failure are considered by the algorithm. We now have a row for each Sj(s), j = 1, 2, …, m. Thus the number of rows in matrix C is m.

8. The number of elements in each row of matrix C corresponds to the number of components that are reachable from the point of failure ‘s’ through that one outgoing link.
It is to be noted that each component is represented just by a non failure point representative. Arrange the matrix rows in non decreasing order of row size, i.e., on the basis of the number of elements in each row.

9. If all rows Ci(s) are of size 1, pair the only member of each row with the only member of the next row. Here pairing means adding a communication link between the non failure point members acting as representatives of their corresponding components. This gives (k-1) new communication links to be added to the network for ‘k’ members. Add all these edges to the set L, the set of all new communication links, and exit from the algorithm. If the size of some Ci(s) is greater than 1, start with the last list Ci (the list of maximum size). For every k ≥ 2, pair the kth element of this row with the (k-1)th element of the preceding row (if it exists). Here again pairing means the addition of a communication link between the representative nodes. Remove the paired elements from the lists, which are thereby contracted.

10. Now if more than one element is left in the second last list, shift the last element of this list to the last list and append it there.

11. If the number of non empty lists is greater than one, go to step 8 for further processing. If the size of the last and only remaining row is one, pair its only member with any of the non failure points in the network and exit from the algorithm. If the last and only remaining row has exactly two elements, pair the two representatives and exit from the algorithm. If the size of the last and only remaining row is greater than two, add the edges from the set L to the network design and repeat the algorithm from step 2 on the updated network design.

Since at least one communication link is added to the set L in every iteration of the algorithm and only a finite number of edges are added, the algorithm terminates in a finite number of steps.
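Step 2 relies on the depth first procedure of Section 3.2. A minimal sketch of that procedure is given below, using the standard dfn/low technique from the textbooks cited as [7][8]; it is an illustration under stated assumptions rather than the authors' implementation. The function name `find_failure_points`, the adjacency-set representation and the six-node example network are hypothetical. The sketch assumes a connected network without parallel links and uses recursion, so it is suited to small designs only.

```python
def find_failure_points(adj, root=0):
    """Return (points_of_failure, safe_components) of a connected network.

    adj maps each node to the set of its neighbours.
    """
    dfn, low = {}, {}
    points, components, stack = set(), [], []
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        dfn[u] = low[u] = counter          # depth first number, from 0
        counter += 1
        children = 0
        for w in adj[u]:
            if w not in dfn:               # unmarked node: w is a child of u
                children += 1
                stack.append((u, w))
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] >= dfn[u]:       # nothing below w bypasses u
                    if parent is not None:
                        points.add(u)
                    comp = set()           # pop one safe component
                    while True:
                        a, b = stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, w):
                            break
                    components.append(comp)
            elif w != parent and dfn[w] < dfn[u]:   # link to an ancestor
                stack.append((u, w))
                low[u] = min(low[u], dfn[w])
        if parent is None and children > 1:         # rule for the start node
            points.add(u)

    dfs(root, None)
    return points, components

# Hypothetical example: two triangles joined by the single link 2-3.
network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
points, comps = find_failure_points(network)

# Set B: safe components containing exactly one point of failure,
# each represented by one of its non failure points (steps 3 and 4).
B = [c for c in comps if len(c & points) == 1]
reps = sorted(min(c - points) for c in B)
print(sorted(points), reps)                # [2, 3] [0, 4]

# With k = 2 representatives, a single new link joins them.
a, b = reps
augmented = {v: set(nb) for v, nb in network.items()}
augmented[a].add(b)
augmented[b].add(a)
print(find_failure_points(augmented)[0])   # set(): no point of failure left
```

In this small example both rows of the matrix C would have size one, so, as we read step 9, the two representatives are simply paired. The sketch does not reproduce the general row-by-row pairing of steps 5 to 11.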
The algorithm ensures that there are at least two paths between any pair of nodes in the network. Because of these multiple paths of communication, the failure of any one node does not affect the communication between any other pair of nodes in the network. Thus the algorithm makes the points of failure of the original network safe by adding a minimal number of communication links.

4 Theoretical Results and Proofs

In this section, we give theoretical proofs of the correctness of the algorithm and of the sufficiency of the number of new communication links added. Further, the lower and upper bounds on the number of links added to the network are proved.

Theorem 1. If |B| = k, i.e., there are only k safe network components having only one point of failure in the original network, then the number of new edges necessary to make all points of failure safe varies between ⎡k/2⎤ and (k-1), both inclusive.

Proof: Each safe component Bi has only one point of failure corresponding to the original network. Failure of this node separates the whole component Bi from the remaining part of the network. Thus, for any node of Bi to communicate with any node outside Bi, at least one extra communication link must be added to this component. This argument is valid for each Bi. Thus at least one extra edge is to be added from each component Bi. This requires at least ⎡k/2⎤ extra links, each being incident on a distinct pair of Bi’s. This forms the lower bound on the number of links to be added to make the points of failure safe in the network design.

Fig. 2. (a) and (b) Two Sample Network Designs

In figure 2(a), there are k = 6 safe components each having only one point of failure, thus requiring k/2 = 3 new links to make all the points of failure safe. It is easy to see that k/2 = 3 new links are sufficient to make the network failure free.
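The counting behind the lower bound can be written out explicitly. The following is a short worked derivation under the assumptions already stated in the text: every new link joins non failure points, and the Bi’s are mutually disjoint with respect to non failure points, so one endpoint of a link can serve only one Bi. The symbol ℓ for the number of new links is introduced here for illustration and does not appear in the original notation.

```latex
% \ell = number of new links (symbol introduced here), k = |B| as in Theorem 1.
% Requires the amsmath package.
\begin{align*}
  2\ell &\ge k
    && \text{(each link has two endpoints; each $B_i$ needs at least one)} \\
  \ell &\ge \left\lceil \tfrac{k}{2} \right\rceil
    && \text{($\ell$ is an integer)} \\
  k = 6 &\;\Rightarrow\; \ell \ge 3,
  \qquad k = 5 \;\Rightarrow\; \ell \ge 3 .
\end{align*}
```

For odd k the bound rounds up because one link must then have at least one endpoint in a component that already has a link.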
Now, we consider the upper bound on the number of new communication links to be added to the network. This occurs when |B| = |S| = k, i.e., when each safe component in the network contains only one point of failure. Since there is then no safe component which can become safe through more than one path, all the safe components are to be considered by the algorithm. Thus, it requires the addition of (k-1) new communication links to join ‘k’ safe components.

Theorem 2. If the edges determined by the algorithm are added to the network, the nodes will keep on communicating even after the failure of any single node in the network.

Proof: We arbitrarily take two nodes ‘x’ and ‘y’ from the set ‘N’ of the network. We show that ‘x’ and ‘y’ can communicate even after the failure of any single node of the network.

CASE 1: If the node that fails is not a point of failure, ‘x’ and ‘y’ can continue to communicate with each other.

CASE 2: If the node that fails is a point of failure and both ‘x’ and ‘y’ are in the same safe component of the network, then by the definition of a safe component ‘x’ and ‘y’ can still communicate, because the failure of this node has no effect on the nodes that are in the same safe component.

CASE 3: The node that fails is a point of failure, ‘x’ and ‘y’ are in different safe components, and both are members of safe components in the set ‘B’. We know that the algorithm makes all members of the set ‘B’ safe by using only non failure points of each component, so the failure of any point of failure will not affect the communication of any node of the safe component formed. This is because the algorithm has already created an alternate path for each node in any of the safe members.
CASE 4: The node that fails is a point of failure, ‘x’ and ‘y’ are in different safe components, ‘x’ is a member of a component belonging to the set ‘B’ and ‘y’ is a member of a component belonging to the set ‘(S-B)’. We know that any node occurring in a member of the set ‘(S-B)’ is connected to at least 2 points of failure in its safe component, and through each of these points of failure we can reach a member of the set ‘B’. So even after the deletion of any point of failure, ‘y’ remains connected with at least one member of the set ‘B’. The algorithm has already connected all the members of the set ‘B’ by finding new communication links, hence ‘x’ and ‘y’ can still communicate with each other.

CASE 5: The node that fails is a point of failure, ‘x’ and ‘y’ are in different safe components, and both ‘x’ and ‘y’ belong to components that are members of the set ‘(S-B)’. Each member of the set ‘(S-B)’ has at least 2 points of failure. So after the failure of any one of the failure points, ‘x’ can send a message to at least one component that is a member of the set ‘B’. Similarly, ‘y’ can send a message to at least one component that is a member of the set ‘B’. The algorithm has already connected all the components belonging to the set ‘B’, so ‘x’ and ‘y’ can continue to communicate with each other after the failure of any one node.

After the addition of the links determined by the algorithm, there exist multiple paths of communication between any pair of communicating nodes. Thus, no node is dependent on just one path.

Theorem 3. The algorithm provides the minimal number of new communication links to be added to the network to make it capable of handling any single failure.

Proof: The algorithm considers only the components having a single point of failure corresponding to the original network.
Since |B| = k, at least ⎡k/2⎤ new communication links must be added to pair up these k components and make them capable of handling a single failure of any node in the network. Thus adding fewer than ⎡k/2⎤ new communication links can never result in a safe network. The algorithm attains this minimal number of new communication links in the example discussed in theorem 1. In all the steps of the algorithm except the last, only one link is added to join 2 members of the set ‘B’, and these members are not considered further by the algorithm and hence do not generate any further edge in the set ‘L’. In the last step, when only one vertical column of x rows, each having a single member, is left, (x-1) new links are added. These members have the property that the single point of failure ‘s’ alone can separate them into x disjoint groups, hence the addition of (x-1) links is justified. When only a single row of just one element is left, it can only be made safe by joining it with any one of the non failure nodes. Hence, the algorithm adds a minimal number of new communication links to make the network capable of handling any single failure.

5 Conclusion and Future Research

This paper described an algorithm for making points of failure safe in the network. The new communication links determined by the algorithm are minimal in number and guarantee that the network is capable of handling a single failure of any node. The algorithm guarantees at least two paths of communication between any pair of nodes in the network.

References

1. Tanenbaum, A.S.: Computer Networks, 4th edn. Pearson Education, London (2004)
2. Perlman, R.: Interconnections: Bridges, Routers, Switches, and Internetworking Protocols, 2nd edn. Pearson Education, London (2006)
3. Kamiyana, N.: Network Topology Design Using Data Envelopment Analysis. In: IEEE Global Telecommunications Conference (2007)
4. Dengiz, B., Altiparmak, F., Smith, A.E.: Efficient optimization of all-terminal reliable networks, using an evolutionary approach. IEEE Transactions on Reliability 46(1), 18–26 (1997)
5. Mandal, S., Saha, D., Mukherjee, R., Roy, A.: An efficient algorithm for designing optimal backbone topology for a communication network. In: International Conference on Communication Technology, vol. 1, pp. 103–106 (2003)
6. Ray, G.A., Dunsmore, J.J.: Reliability of network topologies. In: IEEE INFOCOM 1988 Networks, pp. 842–850 (1988)
7. Horowitz, E., Sahni, S., Anderson-Freed, S.: Fundamentals of Data Structures in C, 8th edn. Computer Science Press (1998)
8. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. Prentice-Hall, India (2004)