Bidirectional Forwarding Detection Protocol

Bidirectional Forwarding Detection (BFD) is a relatively new protocol that can significantly lower dynamic routing protocol convergence times. Previously, faster convergence in the LAN was often achieved by tuning protocol hello/keepalive timers; a good example is the Fast Hello setting for OSPF. Lowering routing protocol timers, however, can lead to higher CPU utilization and somewhat unexpected network behavior. For example, in a network where OSPF, BGP and LDP run simultaneously, with the latter two depending on the former, and you try to improve convergence by tweaking the timers of all three protocols, it is difficult to predict how the whole thing will react to a flapping link, in terms of both convergence time and hardware resource utilization during the convergence period.

Again, I am talking here about the complicated situations, such as when you lose communication over a link without the interfaces going down, maybe in just one direction. When Layer 1 or Layer 2 goes down, this normally triggers protocol events and convergence occurs quite quickly.

BFD offers a lightweight mechanism for detecting link communication failures and notifying routing protocols so that they react quickly. Convergence times with BFD are lower than for routing protocols configured with shorter timers (see my OSPF example below). Of course, as detection times get shorter, sensitivity to link flapping grows, which may be undesirable. BFD also has a dampening mechanism that introduces an exponential delay into communication failure detection.

For now BFD can work with the following protocols: 

  • Static Routing
  • BGP
  • EIGRP
  • ISIS
  • OSPF
  • HSRP
  • LDP
  • ATM Pseudowires

OSPF Example

Here is a simple topology I used to test BFD operations:

[Figure: BFD test topology]

One important thing to understand with BFD is that it works only in conjunction with the protocol it is supposed to notify of failures. If you just configure it on two adjacent interfaces like this:

R1:
interface Ethernet0/0
  bfd interval 50 min_rx 50 multiplier 5

R2:
interface Ethernet0/0
  bfd interval 50 min_rx 50 multiplier 5

It won’t do anything. No neighborship is going to be established at this stage and no packets will be sent. What you need to do is enable it for the routing protocol, in our case OSPF:

R1:
router ospf 100
  bfd all-interfaces

R2:
router ospf 100
  bfd all-interfaces

And here it is:

R1#show bfd neighbors

IPv4 Sessions
NeighAddr      LD/RD      RH/RS      State      Int
10.0.0.2       1/1        Up         Up         Et0/0

The BFD neighborship is established.

Now let's break communication between the OSPF neighbors by applying an inbound deny-any ACL (with debug bfd events enabled):

*Dec 21 17:14:32.811: %PARSER-5-CFGLOG_LOGGEDCMD: User:console logged command:ip access-group denyany in
*Dec 21 17:14:33.078: %BFDFSM-6-BFD_SESS_DOWN: BFD-SYSLOG: BFD session ld:1 handle:1,is going Down Reason: ECHO FAILURE
*Dec 21 17:14:33.078: BFD-DEBUG Event: V1 FSM ld:1 handle:1 event:ECHO FAILURE state:UP (0)
*Dec 21 17:14:33.078: BFD-DEBUG EVENT: bfd_session_destroyed, proc:OSPF, handle:1 act
*Dec 21 17:14:33.078: %BFD-6-BFD_SESS_DESTROYED: BFD-SYSLOG: bfd_session_destroyed, ld:1 neigh proc:OSPF, handle:1 act
*Dec 21 17:14:33.078: %OSPF-5-ADJCHG: Process 100, Nbr 10.0.0.2 on Ethernet0/0 from FULL to DOWN, Neighbor Down: BFD node down
*Dec 21 17:14:33.078: BFD-DEBUG Event: notify client(OSPF) IP:10.0.0.2, ld:1, handle:1, event:DOWN, cp independent failure (0)
*Dec 21 17:14:33.078: BFD-DEBUG Event: notify client(CEF) IP:10.0.0.2, ld:1, handle:1, event:DOWN, cp independent failure (0)

BFD reports the communication failure 267 ms after the ACL is applied and signals OSPF to tear down the neighborship, which OSPF does immediately. This gives us a convergence time much shorter than with OSPF Fast Hello, where the dead interval is still one second.
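As a rough sanity check, the figures above can be reproduced with a few lines of Python. This is only a sketch: the function names are my own, and the interval × multiplier formula is a simplified estimate of the detection time, not an exact model of what IOS does internally.

```python
from datetime import datetime


def expected_detection_ms(interval_ms: int, multiplier: int) -> int:
    """Simplified estimate: the session goes down after `multiplier` silent intervals."""
    return interval_ms * multiplier


def elapsed_ms(start: str, end: str) -> int:
    """Milliseconds between two log timestamps (HH:MM:SS.mmm, same day assumed)."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return round(delta.total_seconds() * 1000)


# Values taken from the interface configuration and the debug output above.
print(expected_detection_ms(50, 5))                # 250
print(elapsed_ms("17:14:32.811", "17:14:33.078"))  # 267
```

The observed 267 ms is in the same range as the 250 ms estimate for `bfd interval 50 min_rx 50 multiplier 5`.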

In real life you should, of course, check that you have enough hardware resources to run BFD with these parameters, and consider whether you really need such a short convergence time (because of the possibility of false positives).

What’s inside

Looking at a BFD packet capture, you might expect to see some sort of keepalive exchange between the neighbor IP addresses, which in our case are 10.0.0.1 and 10.0.0.2. Here is what you actually see:

[Figure: BFD packet capture]

Both R1 and R2 are sending UDP messages to their own interface IP addresses. This is another important thing about BFD: these echo packets are steered to the neighbor by Layer 2 addressing only. Here are the details of these packets:

[Figure: BFD packet capture details]

R1 sends a BFD echo message to R2's MAC address; R2 swaps the source and destination MAC addresses and sends it back out the interface it was received on. R1 gets its own message back and sees that communication across the link is working. The same process happens in the opposite direction, so R2 doesn’t process BFD data coming from R1 and vice versa. Apart from this, the routers exchange control messages carrying session state information. These messages are sent at Layer 3, between the neighbors' IP addresses:

[Figure: BFD control message exchange]
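To give a feel for what such a control message carries, here is a minimal Python sketch that packs and unpacks the 24-byte mandatory section of a BFD control packet as laid out in RFC 5880 (version 1, no authentication section). The function names are hypothetical, and the timer values are illustrative rather than taken from the capture. The discriminators match the LD/RD of 1/1 seen in the show output earlier. For single-hop sessions RFC 5881 assigns UDP port 3784 to control packets and 3785 to echo packets.

```python
import struct

BFD_VERSION = 1
STATE_UP = 3  # RFC 5880 state codes: 0 AdminDown, 1 Down, 2 Init, 3 Up
HEADER_FORMAT = "!BBBBIIIII"  # network byte order, 24 bytes in total


def build_control_packet(my_disc, your_disc, state, detect_mult,
                         desired_min_tx_us, required_min_rx_us,
                         required_min_echo_rx_us, diag=0):
    """Pack the mandatory section of a BFD control packet (timers in microseconds)."""
    vers_diag = (BFD_VERSION << 5) | (diag & 0x1F)
    state_flags = (state & 0x03) << 6  # P, F, C, A, D, M flag bits left at zero
    length = struct.calcsize(HEADER_FORMAT)
    return struct.pack(HEADER_FORMAT, vers_diag, state_flags, detect_mult, length,
                       my_disc, your_disc, desired_min_tx_us,
                       required_min_rx_us, required_min_echo_rx_us)


def parse_control_packet(data):
    """Unpack the mandatory section back into a dictionary of fields."""
    (vers_diag, state_flags, detect_mult, length, my_disc, your_disc,
     tx_us, rx_us, echo_rx_us) = struct.unpack(HEADER_FORMAT, data[:24])
    return {
        "version": vers_diag >> 5,
        "diag": vers_diag & 0x1F,
        "state": state_flags >> 6,
        "detect_mult": detect_mult,
        "length": length,
        "my_disc": my_disc,
        "your_disc": your_disc,
        "desired_min_tx_us": tx_us,
        "required_min_rx_us": rx_us,
        "required_min_echo_rx_us": echo_rx_us,
    }


# Illustrative values only: 50 ms timers expressed in microseconds.
packet = build_control_packet(1, 1, STATE_UP, 5, 50_000, 50_000, 50_000)
print(len(packet), packet[:4].hex())
print(parse_control_packet(packet)["state"] == STATE_UP)
```

The state field in this header is what lets each router tell its neighbor whether it considers the session Up, Init or Down.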

Here is what the link communication failure detection process looks like:

[Figure: BFD link failure detection process]
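The detection logic can be approximated with a small Python model. This is a deliberately simplified sketch under my own assumptions: it counts consecutive lost echoes, whereas a real implementation is timer-driven, and the function name is hypothetical.

```python
def detect_failure(replies, interval_ms, multiplier):
    """Return the time in ms (from the first echo) at which the session is declared down.

    `replies` holds one boolean per echo sent every `interval_ms`:
    True if the echo came back, False if it was lost.
    Returns None if the session never goes down.
    """
    missed = 0
    for index, came_back in enumerate(replies):
        if came_back:
            missed = 0  # any returned echo resets the counter
            continue
        missed += 1
        if missed == multiplier:
            # Echo `index` is judged lost one interval after it was sent.
            return (index + 1) * interval_ms
    return None


# Three good echoes, then the link stops passing traffic.
print(detect_failure([True] * 3 + [False] * 5, interval_ms=50, multiplier=5))  # 400
# A single lost echo followed by recovery does not bring the session down.
print(detect_failure([True, False, True, True], interval_ms=50, multiplier=5))  # None
```

In the first case the failure starts at 150 ms and is declared at 400 ms, that is 250 ms later. The second case shows why a larger multiplier makes the session more tolerant of isolated packet loss, at the cost of slower detection.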
