PIM-DM and PIM-SM Failover Behaviour In a High-Availability Configuration

By Rahul Bahadur and Dhaval Shah
Microsoft Word Format by David Taylor
Version 1.5
February 7, 2003

Executive Summary

This document is a summary of PIM dense-mode and PIM sparse-mode failover behavior when these protocols are used in conjunction with RIPv2. It includes configuration recommendations for both PIM and VRRP to ensure that the router that is the VRRP master, and hence the designated forwarder for unicast traffic, is also the forwarder for multicast traffic. It also recommends timer settings that help reduce failover time. In addition, a PIM High Availability feature is described that helps PIM-SM failover.

Contents

1. Configuration Guidelines.
2. HA-mode scenario - PIM Dense-Mode, M is receiver, N is sender.
3. HA-mode scenario - PIM Dense-Mode, M is sender, N is receiver.
4. HA-mode scenario - PIM Sparse Mode, M is sender, N is receiver.
5. HA-mode scenario - PIM Sparse Mode, M is receiver, N is sender.

Test Network

[Figure: test network diagram. Labels recovered from the diagram:]
- Net A: IP=192.168.22/24, VRRP VRID: 22
- Net B: IP=192.168.23/24, VRRP VRID: 23
- VRRP backup address: (.225) on each LAN
- FW 1 (Nokia IP650, primary): 22.222 on Net A, 23.222 on Net B, VRRP priority 150
- FW 2 (Nokia IP650): 22.137 on Net A, 23.137 on Net B, VRRP priority 100
- Host M: 192.168.22.50
- Also shown: Host N, Router 1, Router 2

Configuration Guidelines.

Assume that FW 1 is the primary firewall and FW 2 is the secondary firewall.

FW 1 is configured with higher IP addresses on both Net A and Net B.

FW 1 is configured with a higher VRRP priority. This is to ensure that it always becomes the master, irrespective of the order in which the two routers are turned on, and also when it comes back up after going down.

PIM DR priorities are also left unconfigured.
If all routers on a LAN advertise the DR priority option in their Hello messages, the router with the highest DR priority becomes the designated router; if the DR priority values are the same, the router with the highest IP address becomes the designated router. In this network, both routers advertise the same DR priority (one) and hence FW 1 elects itself the DR on each LAN.

When using PIM-SM, RIPv2 interface metrics are configured such that the cost to reach host M via FW 1 is less than the cost via FW 2. This is to ensure that FW 1 is the next hop towards M with respect to Router 1. For example:
- FW 1: default (zero) on both Interface A and Interface B.
- FW 2: default on Interface A, 3 on Interface B.

VRRP and PIM: It should be noted that VRRP and PIM do not communicate, and hence a VRRP failover does not influence PIM behavior in any way. As in the example above, the routers should be configured such that the VRRP master is also the designated router or designated forwarder with respect to PIM.

HA-mode scenarios and timer recommendations for failover.

Scenario 1. PIM Dense-Mode, M is receiver, N is sender

[Figure: test network with Net A, Net B, Firewall 1 and Firewall 2 (Nokia IP650), Host M, Router 1, Router 2 and Host N.]

Initially both FW 1 and FW 2 forward traffic from N to Net A:
- Assert election on Net A results in FW 1 becoming the designated forwarder, since it has a higher IP address and the cost to reach N from both FW 1 and FW 2 is the same.
- FW 2 stops forwarding data traffic to Net A and sets the data interval timer (default 210 seconds).
- Every time the timer expires, FW 2 forwards traffic to Net A, resulting in another assert election.

Interface A of FW 1 goes down:
- Traffic stops.
- When the data interval timer expires, FW 2 starts forwarding traffic to Net A (assuming traffic is still flowing).
- If the data interval timer has expired but an MFC (multicast forwarding cache) entry still exists on FW 2 and the join prune interval timer (default period 60 seconds) expires, FW 2 will send a join to Router 1 to rebuild the pruned tree. This will not occur if traffic is still flowing, because the multicast shared tree is complete up to Net B.

Interface A of FW 1 comes back up:
- FW 1 will forward data traffic to Net A.
- If MFC state still exists, FW 1 will also send a graft upstream to Router 1.
- Assert election on Net A will elect FW 1 as forwarder.

The results are the same if only Interface B, or both Interface A and Interface B, of FW 1 go down and come back up.

TO REDUCE FAILOVER TIME:
Configure a low value for data interval (Advanced PIM Options) on FW 2. This timer has a default value of 210 seconds and can be configured to a value between 11 and 3600 seconds.
Con: every time the data interval timer expires on FW 2, duplicate packets will be seen on Net A before an assert election is processed.
Note: Based on discussions in multicast mailing lists, it seems to be common knowledge that applications using multicast data are capable of handling duplicate packets. However, there are no technical guidelines for such applications that we can reference at the moment, and we would like to note this as a potential problem when debugging such networks.

Scenario 2. PIM Dense Mode, M is sender, N is receiver

[Figure: test network with Net A, Net B, Firewall 1 and Firewall 2 (Nokia IP650), Host M, Router 1, Router 2 and Host N.]

Initially traffic is forwarded by both FW 1 and FW 2 to Net B, resulting in an assert election:
- FW 1 wins on the basis of its higher IP address and becomes designated forwarder on Net B.
- FW 2 prunes the outgoing interface for the data interval (default 210 seconds).

Interface A of FW 1 goes down:
- Traffic stops.
- FW 2 forwards traffic when the data interval timer expires.

Interface A of FW 1 comes up again:
- FW 1 forwards traffic to Net B.
- Assert election results in FW 1 becoming designated forwarder.
- FW 2 prunes the outgoing interface for the data interval.

The results are the same if only Interface B, or both Interface A and Interface B, of FW 1 go down and come back up.

TO REDUCE FAILOVER TIME:
Configure a low value for data interval (Advanced PIM Options) on FW 2. This timer has a default value of 210 seconds and can be configured to a value between 11 and 3600 seconds.
Con: every time the data interval timer expires on FW 2, duplicate packets will be seen on Net B before an assert election is processed.

Scenario 3. PIM Sparse Mode:
- M is sender
- N is receiver
- Router 1 is the RP.
- Initially FW 1 is elected DR on Net A and encapsulates the data traffic in register messages to Router 1.

[Figure: test network with Net A, Net B, Firewall 1 and Firewall 2 (Nokia IP650), Host M, Router 1 (RP), Router 2 and Host N.]

(I) Assume that the shortest path tree (SPT) from source to receiver has not been built.

Interface A of FW 1 goes down:
- Data traffic stops.
- FW 2 does not forward traffic until the neighbor state of FW 1 on Net A expires and it (FW 2) becomes the DR on Net A (default hello hold time is 105 seconds, based on the default hello interval of 30 seconds).

Interface A of FW 1 comes up again:
- FW 1 elects itself DR on Net A and starts sending register messages to the RP.
- FW 2 realizes it is not DR and stops sending register messages.

Interface B of FW 1 goes down (while Interface A is up):
- Data traffic stops.
- After unicast routing converges, providing FW 1 a route to Router 1 via FW 2, FW 1 starts sending register messages via Interface A of FW 2.
Although this is normal protocol behavior, it is not desirable in an HA-mode situation. To help work around this, a new feature called the PIM High-Availability (HA) Mode option is being proposed. You can find a summary of this feature at the bottom of this document.

Interface B comes up again:
- After unicast routing converges, FW 1 starts sending register messages via Interface B again.

If both Interface A and Interface B of FW 1 go down:
- Same result as for Interface A failure above.

(II) Assume the shortest path tree (SPT) from source to receiver has been built and data traffic is being forwarded natively, i.e. no register encapsulation is in progress. Depending on the traffic threshold, the SPT is initiated by either the RP or the DR of the receiver. Unicast routing must be configured such that the cost to reach M via FW 1 is lower than the cost via FW 2, to ensure the last hop router (Router 1 in this case) does not attempt to build the SPT via FW 2.

Interface A of FW 1 goes down:
- Data traffic forwarding to Net B stops. After unicast routing converges and Router 1 realizes that the next hop to the source is FW 2, it will send an (S, G) SPT join to FW 2. FW 2 will start forwarding data traffic as register messages after it elects itself DR on Interface A. The RP (Router 1) may not accept the register packets until its current SPT state expires. This depends on how soon unicast routing converges and the PIM implementation clears the state. At a later time the RP or the DR of the receiver may build another SPT to FW 2.

Interface A of FW 1 comes back up:
- FW 1 elects itself DR on Net A and immediately starts sending data packets encapsulated in register messages to the RP.
- FW 2 continues sending packets until it receives a prune message from Router 1.

Interface B of FW 1 goes down:
- Data traffic stops.
- After unicast routing converges, FW 1 will start sending register messages via Interface A of FW 2. Even if Router 1 tries to build an SPT via FW 2, it will not succeed because FW 2 is not the DR on Net A. This situation is not desirable, and the PIM HA-mode option will help provide a workaround for it.

Interface B comes up again:
- After unicast routing converges, FW 1 will start sending register messages via Interface B. If upstream routers decide to build the SPT again via Interface B of FW 1, this tree will be re-established.

Both Interface A and Interface B of FW 1 go down:
- FW 2 starts sending register packets after it elects itself DR on Net A. The RP (Router 1) may not accept the register packets until its current SPT state expires. This depends on how soon unicast routing converges and the PIM implementation clears the state. At a later time the RP or the DR of the receiver may build another SPT to FW 2.

Both Interface A and Interface B of FW 1 come back up:
- FW 1 starts sending register packets to the RP as soon as it elects itself DR. FW 2 also continues sending data until it receives a prune message from the RP.

TO REDUCE FAILOVER TIME:
(a) Configure a low hello interval (Advanced PIM Options) on FW 1. The hello interval has a default value of 30 seconds, which results in advertising a hold time of 105 seconds (3.5 x hello interval). A lower value will result in FW 2 timing out FW 1 earlier and electing itself the DR on the LAN.
(b) Configure lower update intervals for the unicast routing protocol being used. In this example RIPv2 is the unicast protocol used; it has a default update interval of 30 seconds that is configurable from 1 to 65535 seconds. This recommendation is for all the routers in the domain, since the update interval also dictates the route hold down time, i.e., the amount of time a route is considered reachable in the absence of any new updates.

Scenario 4. PIM Sparse Mode:
- M is receiver
- N is sender
- Router 1 is the RP.

[Figure: test network with Net A, Net B, Firewall 1 and Firewall 2 (Nokia IP650), Host M, Router 1, Router 2 and Host N.]

Initially FW 1 is elected DR on Net A and sends a (*, G) join to the RP on receiving an IGMP membership message for group G from M.

It should be noted that most commonly deployed PIM-SM implementations (including Nokia) switch to the SPT on receiving the first data packet from a source. This is because the default threshold for the switchover is set to zero. This threshold can be raised using the sparse-mode SPT threshold configuration (Advanced PIM Options).
This option is configurable in the range 0 (default) to 1000000 kbps. If no SPT switchover is required, the option should be configured with the value “infinity”.

Interface A of FW 1 goes down. There could be two situations here:
(a) FW 2 does not have any MFC or PIM state: When FW 1 hello state expires on Interface A of FW 2, FW 2 will elect itself the DR on Net A, generate a (*, G) join towards the RP and also forward data traffic from N to M.
(b) FW 2 has MFC or PIM state (because a failover occurred earlier or because data traffic was already flowing when FW 2 came up): FW 2 will start forwarding data traffic to M after it has elected itself DR on Net A. A bug that prevented this step from occurring properly has been fixed in IPSO 3.7.

Interface A of FW 1 comes back up:
- FW 1 will immediately generate a (*, G) join and also forward traffic. Assert election will result in FW 1 becoming forwarder.

Interface B of FW 1 goes down:
- When unicast routing converges, FW 1 will build the multicast tree via Interface A of FW 2. Now FW 2 is the forwarder, although FW 1 is still responsible for building the join-prune tree. Although this is normal protocol behavior, it is not desirable in an HA-mode situation. The PIM HA-mode described below will help provide a workaround for this.

Interface B of FW 1 comes up:
- When duplicate data packets are forwarded, assert election results in FW 1 becoming the winner.

Both Interface A and Interface B of FW 1 go down:
- Same result as for failure of Interface A alone.

TO REDUCE FAILOVER TIME:
(a) Configure a low hello interval (Advanced PIM Options) on FW 1. The hello interval has a default value of 30 seconds, which results in advertising a hold time of 105 seconds (3.5 x hello interval). A lower value will result in FW 2 timing out FW 1 earlier and electing itself the DR on the LAN.
(b) Configure a low join prune interval on FW 1. This interval has a default value of 60 seconds and can be configured to a value between 1 and 3600 seconds.
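
The effect of lowering the hello interval can be estimated with a short calculation. The following is a rough sketch only, not IPSO code: it assumes the advertised hold time is 3.5 times the hello interval (the ratio implied by the 30 second hello interval and 105 second hold time quoted under Scenario 3), and the function names are hypothetical. It ignores unicast routing convergence and any timer jitter.

```python
# Assumption: hold time = 3.5 x hello interval (30 s hello -> 105 s hold time).
HOLD_TIME_FACTOR = 3.5


def hello_hold_time(hello_interval_s: float) -> float:
    """Hold time a router would advertise for a given hello interval."""
    return HOLD_TIME_FACTOR * hello_interval_s


def dr_takeover_window(hello_interval_s: float) -> tuple[float, float]:
    """Return (best, worst) seconds until the backup times out the failed DR.

    The neighbor timer restarts on each hello received. If the DR fails just
    before its next hello was due, almost one hello interval of the hold time
    has already elapsed (best case). If it fails right after sending a hello,
    the backup waits for the full hold time (worst case).
    """
    hold = hello_hold_time(hello_interval_s)
    return hold - hello_interval_s, hold


if __name__ == "__main__":
    for hello in (30, 10, 5):
        best, worst = dr_takeover_window(hello)
        print(f"hello={hello}s hold={hello_hold_time(hello)}s "
              f"takeover between {best}s and {worst}s")
```

Under these assumptions, reducing the hello interval on FW 1 from 30 seconds to 5 seconds shortens the worst-case wait before FW 2 elects itself DR from 105 seconds to 17.5 seconds, at the cost of more frequent hello messages.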

New IPSO Feature - PIM High-Availability (HA) Mode:

When using PIM-SM in an HA-mode situation, it is possible that if only one of the interfaces of the primary forwarder goes down, the primary forwarder could still continue to be the router responsible for both generating control traffic and sending register messages to the RP, because it is the DR on the LAN to a directly connected receiver or sender respectively. Although this is normal protocol behavior, it is not desirable in an HA-mode situation, where it would be preferable that the secondary forwarder take over all responsibility for multicast traffic.

The PIM HA-mode option has been added to IPSO 3.7 to help facilitate this. When this option is configured in conjunction with PIM-SM, PIM will monitor all the PIM-enabled interfaces. It will bring up the interfaces only if they all have valid addresses and are all up. Also, if any of the PIM-enabled interfaces goes down, PIM will bring all the other interfaces down (with respect to itself) and keep them in that state until they are all up again.

Since PIM-DM is different from PIM-SM, it does not require this feature, and turning this option on with PIM-DM will have no influence on the PIM daemon.
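
The all-or-nothing rule described above can be illustrated with a minimal sketch. This is not the IPSO implementation: the `Interface` type and the `pim_interface_states` function are hypothetical names introduced here only to show the decision logic as the text describes it.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    """Hypothetical model of one PIM-enabled interface."""
    name: str
    link_up: bool
    has_valid_address: bool


def pim_interface_states(interfaces: list[Interface], ha_mode: bool) -> dict[str, bool]:
    """Map each interface name to whether PIM treats it as up.

    Without HA mode, each interface is judged on its own. With HA mode,
    PIM treats every interface as down unless all of them are up and
    have valid addresses.
    """
    usable = {i.name: i.link_up and i.has_valid_address for i in interfaces}
    if not ha_mode:
        return usable
    all_ok = all(usable.values())
    return {name: all_ok for name in usable}


if __name__ == "__main__":
    # FW 1 with Interface B down and Interface A still up.
    fw1 = [
        Interface("A", link_up=True, has_valid_address=True),
        Interface("B", link_up=False, has_valid_address=True),
    ]
    print(pim_interface_states(fw1, ha_mode=False))
    print(pim_interface_states(fw1, ha_mode=True))
```

In the example, without HA mode FW 1 still runs PIM on Interface A and so remains DR on Net A, which is the undesirable case in Scenarios 3 and 4. With HA mode, PIM on FW 1 treats both interfaces as down, so FW 1 stops sending hellos on Net A and FW 2 can take over as DR once FW 1's neighbor state expires.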