OSPF Fast Hello – What’s the Point of Changing Hello Multiplier?

OSPF Fast Hello lets you reduce OSPF convergence time by setting the OSPF Dead Interval to 1 second and the Hello Interval to a fraction of a second. On Cisco IOS, OSPF Fast Hello is configured with the following command in interface configuration mode:

ip ospf dead-interval minimal hello-multiplier 5

Dead Interval affects convergence time directly: it specifies how long OSPF waits for a Hello message from a neighbor before considering it down. Once you enable OSPF Fast Hello with the command above, the Dead Interval is fixed at 1 second and cannot be tuned any further. But what about the second parameter? You can change the Hello Multiplier, which effectively changes the Hello Interval, since the Hello Interval is 1 / Hello Multiplier seconds. What could be the purpose of making routers send Hello packets more or less frequently?
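The relationship between the multiplier and the Hello Interval can be tabulated with a short sketch. The function name hello_interval_ms is made up for illustration, and the sample multipliers are assumed to fall inside the range the command accepts (the post itself only states that 20 is the highest value).

```python
def hello_interval_ms(multiplier: int) -> float:
    """Hello Interval in milliseconds when the Dead Interval is fixed at 1 second."""
    if multiplier < 1:
        raise ValueError("multiplier must be a positive integer")
    return 1000.0 / multiplier


# Sample multipliers chosen for illustration only.
for m in (4, 5, 10, 20):
    print(f"hello-multiplier {m:>2}: one Hello every {hello_interval_ms(m):.0f} ms")
```

With the multiplier of 5 used in the command above, this gives a Hello every 200 ms.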

The first argument is quite obvious: CPU resources. The more Hello packets a router receives per second, the higher its CPU load. And the more neighbors the router has, the more it is affected.
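As a rough back-of-the-envelope sketch of that scaling, the number of Hello packets a router has to process grows with both the neighbor count and the multiplier. The function name is hypothetical, and the sketch assumes every neighbor uses the same multiplier and that no Hello is lost.

```python
def hellos_received_per_second(neighbors: int, multiplier: int) -> int:
    """Hello packets received per second with Fast Hello (Dead Interval = 1 s).

    Assumption: every neighbor sends `multiplier` Hellos per second.
    """
    return neighbors * multiplier


# Example neighbor counts are arbitrary.
for neighbors in (2, 10, 50):
    for multiplier in (5, 20):
        rate = hellos_received_per_second(neighbors, multiplier)
        print(f"{neighbors:>2} neighbors, multiplier {multiplier:>2}: {rate} Hellos/s")
```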

The second is sensitivity to failures. More Hello packets mean more sensitivity to short communication interruptions between neighbors. With fewer Hello packets, a link can go down and come back up between two received Hello packets without being noticed.

The third is that convergence time is still affected, but in a different way than when you change the Dead Interval. By convergence time I mean here the time that passes from the actual communication failure until the neighbor is declared down. When you decrease the Hello Interval (i.e. increase the Hello Multiplier), this time gets closer to the full 1 second. It’s important to understand that the Dead Interval is counted from the moment the last Hello packet was received, not from the moment of the failure. Here is an example of different convergence times with the same Dead and Hello intervals:

[Figure: OSPFConvergence]

Both diagrams show four Hello packets being received per second (Hello Multiplier = 4), or one every 0.25 sec. The time between the communication failure and OSPF taking down the neighbor depends on the time between the last received Hello packet and the failure. In the example above it can vary between 0.751 and 0.999 sec. The shorter the interval between Hello packets, the smaller this variation in convergence time.
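The range above follows from simple arithmetic: the dead timer starts at the last received Hello, so a failure right after a Hello takes almost the full Dead Interval to detect, while a failure right before the next Hello was due takes one Hello Interval less. A minimal sketch, with a hypothetical function name and idealized timing (no jitter, Hellos arriving exactly on schedule):

```python
def detection_window_ms(multiplier: int, dead_interval_ms: float = 1000.0) -> tuple[float, float]:
    """Return (shortest, longest) time from failure to neighbor down, in ms.

    Assumes Hellos arrive exactly every dead_interval / multiplier and the
    dead timer restarts on each received Hello.
    """
    hello_ms = dead_interval_ms / multiplier
    # Failure just before the next Hello was due: one Hello Interval of the
    # dead timer has already elapsed.
    shortest = dead_interval_ms - hello_ms
    # Failure just after a Hello was received: the whole dead timer remains.
    longest = dead_interval_ms
    return shortest, longest


low, high = detection_window_ms(4)
print(f"multiplier 4: between {low:.0f} and {high:.0f} ms")
```

For a multiplier of 4 the bounds are 750 and 1000 ms; the 0.751 and 0.999 sec figures are the same limits approached from inside the interval.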

Here are two logs showing similar situations (the Hello Interval is 200 ms, so the convergence times differ from the example above, but they still illustrate the idea):

*Dec 17 13:43:13.537: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:13.628: OSPF-100 HELLO Et0/0: Rcv hello from 10.0.0.1 area 0 10.0.0.1
*Dec 17 13:43:13.739: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:13.835: OSPF-100 HELLO Et0/0: Rcv hello from 10.0.0.1 area 0 10.0.0.1
*Dec 17 13:43:13.872: %PARSER-5-CFGLOG_LOGGEDCMD: User:console logged command:ip access-group denyany in
*Dec 17 13:43:13.935: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:14.128: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:14.331: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:14.527: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:14.725: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:43:14.837: %OSPF-5-ADJCHG: Process 100, Nbr 10.0.0.1 on Ethernet0/0 from FULL to DOWN, Neighbor Down: Dead timer expired

 

*Dec 17 13:44:27.188: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:27.212: OSPF-100 HELLO Et0/0: Rcv hello from 10.0.0.1 area 0 10.0.0.1
*Dec 17 13:44:27.378: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:27.410: OSPF-100 HELLO Et0/0: Rcv hello from 10.0.0.1 area 0 10.0.0.1
*Dec 17 13:44:27.559: %PARSER-5-CFGLOG_LOGGEDCMD: User:console logged command:ip access-group denyany in
*Dec 17 13:44:27.575: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:27.778: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:27.982: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:28.186: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:28.383: OSPF-100 HELLO Et0/0: Send hello to 224.0.0.5 area 0 from 10.0.0.2
*Dec 17 13:44:28.416: %OSPF-5-ADJCHG: Process 100, Nbr 10.0.0.1 on Ethernet0/0 from FULL to DOWN, Neighbor Down: Dead timer expired

Communication is “broken” by applying an ACL that blocks all traffic arriving on the interface. In the first log the last Hello packet is received at 13:43:13.835, communication stops at 13:43:13.872, and the neighbor is declared down at 13:43:14.837. The time between losing communication and tearing down the neighbor relationship is 965 msec.

The second log shows communication being interrupted at 13:44:27.559 and the neighbor going down at 13:44:28.416, which gives an interval of 857 msec between losing communication and tearing down the neighbor.
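The intervals can be re-derived from the log timestamps. The sketch below uses a hypothetical helper, elapsed_ms, and assumes both timestamps fall on the same day. It also measures the time from the last received Hello to the neighbor going down, which should come out close to the 1 second Dead Interval in both logs.

```python
from datetime import datetime, timedelta


def elapsed_ms(start: str, end: str) -> int:
    """Milliseconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta // timedelta(milliseconds=1)


# (last Hello received, ACL applied, neighbor declared down), copied from the logs
logs = {
    "first log": ("13:43:13.835", "13:43:13.872", "13:43:14.837"),
    "second log": ("13:44:27.410", "13:44:27.559", "13:44:28.416"),
}

for name, (last_hello, acl_applied, neighbor_down) in logs.items():
    print(
        f"{name}: failure to down = {elapsed_ms(acl_applied, neighbor_down)} ms, "
        f"last Hello to down = {elapsed_ms(last_hello, neighbor_down)} ms"
    )
```

In the first log the ACL was applied shortly after a Hello arrived, so almost the whole dead timer was still to run. In the second it was applied later in the Hello cycle, so more of the timer had already elapsed.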

By setting the Hello Multiplier to its highest value, 20, you can narrow the spread of convergence times to somewhere between 951 and 999 ms.

Don’t forget that all of this applies to the case where communication between OSPF neighbors is lost while the line protocol and media on the connecting interfaces stay up. When the interface itself goes down, convergence occurs much faster, because the interface failure directly triggers a new LSA and an SPF recalculation.
