Monday, March 14, 2011

OSPF vs ISIS - The Basics

OSPF operates directly over IP, with a protocol number of 89. The source address of all OSPF messages is always the local end of an adjacency, and all messages are either multicast to one of two reserved multicast addresses,224.0.0.5 and 224.0.0.6, or unicast to the distant end of an adjacency. At no time does OSPF broadcast its
messages.

IS-IS operates not over the network layer like OSPF, but over the data link layer. But like OSPF, IS-IS messages are either unicast or multicast never broadcast. The source address of IS-IS messages is always the data link layer address (the MAC address, for example) of the local end of the adjacency, and the destination address is either the data link layer address of the distant end of the adjacency or, on broadcast media such as Ethernet, one of two reserved multicast MAC addresses: 0180:c200:0014 or 0180:c200:0015.

Because it runs over IP, OSPF can be and has been the target of spoofing and denial-of-service (DoS) attacks. Attacking IS-IS requires direct access to a network link or router.

Because IS-IS is not an IP protocol, prioritizing its messages is more problematic. Some router manufacturers, such as Juniper Networks and Cisco Systems, use proprietary internal mechanisms to tag IS-IS messages in such a way that they can be added to the same network control queue as OSPF



OSPF uses five message types:

  1. Hello
  2. Database Description (DD)
  3. Link State Request
  4. Link State Acknowledgement
  5. Link State Update


IS-IS uses four basic message types:
  1. IS-IS Hello (IIH)
  2. Complete Sequence Number PDU (CSNP)
  3. Partial Sequence Number PDU (PSNP)
  4. Link State PDU (LSP)
Unlike OSPF, IS-IS messages have subtypes. There are LAN and Point-to-Point Hellos, used as the names imply on either broadcast or point-to-point media. The LAN Hellos are also subdivided into level 1 and level 2 types and are sent over level 1 and level 2 adjacencies. Likewise, Sequence Number PDUs (CSNPs and PSNPs) and LSPs are also subdivided into level 1 and level 2 types. So although there are only four basic types of IS-IS messages, when divided by function, there are nine actual types:
  1. Level 1 LAN IIH
  2. Level 2 LAN IIH
  3. Point-to-Point IIH
  4. Level 1 CSNP
  5. Level 2 CSNP
  6. Level 1 PSNP
  7. Level 2 PSNP
  8. Level 1 LSP
  9. Level 2 LSP


Each router must be able to uniquely identify itself within the routing domain. This is the purpose of the OSPF router ID
(RID) and the IS-IS system ID (SysID). In addition, the router must be able to identify its general position within the routing domain. This is the purpose of the area ID (AID).


A loopback interface is a logical interface it exists only in software and has no physical presence on the router it is not susceptible to physical failures. So, there is no risk that an interface failure or shutdown on a router could force


OSPF to find a new RID and re-advertise its LSAs using the new RID, which in turn causes SPF runs on routers throughout the area and contributes to network instability.


There are two approaches by which a particular OSPF implementation could handle the loss of a RID. One approach is that the failure of an interface will have no effect on the RID. After all, the OSPF process just needs to know some 32-bit value, with some confidence that the value is unique within the OSPF routing domain, to use as its


RID at start-up. Once the value is known, it can be remembered, and the subsequent failure of the interface from which the RID was derived is irrelevant. The problem with this approach is that the loss of an IP address on an interface might not have been accidental. What if the IP address is intentionally removed from an interface and is reused on another router, and that router selects that IP address as its own RID? If the first router retains the same address as its RID, you now have duplicate RIDs in your network.
The second approach avoids the problem just described and is therefore the lesser of two evils. This approach to the loss of an IP address from which the RID is derived is to force the router to acquire a new RID from its remaining IP addresses.


When you have duplicate RID’s in the network the ospf database would show different next hops for the same subnet intermittently thus it may look like a routing loop


Some implementations (Juniper)will fill in the leading 0s in
configuration of area ID’s output displays, whereas others display the configuration exactly as typed (Cisco)


If the decimal number entered for the AID is greater than 255, JUNOS expresses the binary equivalent of the number in dotted-decimal format


In contrast to the OSPF AID and RID, which are expressed separately, the IS-IS AID and SysID are specified together in the Network Entity Title(NET). The NET is a special version of an ISO network service access point (NSAP) address,

Link State Basics

If the routing information is received and forwarded before a calculation is performed, the information must be independent of the calculation. In other words, each router along a route is performing a local calculation independent of any other router's calculation, and the result of each router's calculation is not shared with other routers. If the calculations are local, it is imperative that all routers along a route derive the same conclusions so that packets are forwarded consistently. Therefore, the information must be exactly the same for each router. No router can alter the information in any way as it is passed along.

If no router is sharing the results of its route calculations with any other router, the only thing it can know before any calculation is itself and its directly connected links

It is not enough, however, for some router a number of hops away from the destination to receive an announcement of, "I am router A, and I have a directly connected subnet of X." The receiving router must still know how to reach router A. Therefore, every router in the network must send an announcement identifying itself, its links, and the cost associated with each link.

One final piece of information must be included in the individual announcements for everything to work properly:

Each router must not only identify itself and its directly connected links, but must also identify any directly connected neighboring routers on those links

•    Every router in a network sends Hello messages on its local links and listens for Hellos, to discover neighboring routers.
•    Every router sends an announcement identifying itself, its directly connected links, and its directly connected neighboring routers.
•    Every router receiving an announcement keeps a copy of the announcement in a database, and forwards the announcement to its downstream neighbors.
•    When a router has a copy of an announcement from every other router in its database, it can accurately calculate routes to destinations.

Therefore, each link state router has a router ID (RID), which is an address unique to each router within a single routing domain. The router ID can be administratively assigned, or it can be automatically derived by some means such as using an interface address. The only requirement is that it must be different from the ID used by any other router in the domain, and it must be consistent a router cannot identify itself differently to different neighbors.

When 2 routers have verified two-way communication, by seeing each other RID. This method of handshaking is called three-way handshaking. I see you, an you see me.

Every router in the network must receive every announcement and record a copy in its database. The process of getting the announcement to every router is called flooding

To guarantee "sameness," no router can modify another router's announcement. That means that every router is responsible for its own announcements

If the failure is a protocol daemon or the router itself, neighbors will detect the failure through the loss of Hello messages. In either case, the effected neighbors send new announcements indicating the change.
But now there can be a dilemma. If router A fails, nonadjacent routers receive announcements from A's adjacent neighbors indicating the loss of A. But these routers also still have A's link announcement in their databases. What should be done with those announcements? Can the other routers safely deduce that A has failed and remove A's announcement? Is this a violation of the rule that no router can modify another router's announcement? And more important, how can each router be confident that all other routers have removed the announcement, to preserve database consistency?

To help resolve this dilemma, link state protocols include an age field in each link announcement. When a router originates an announcement, it sets a value in the age field

This value is changed (either incremented or decremented)[ OSPF increments the age, whereas IS-IS decrements it.] by other routers during flooding, and by every router as the announcement resides in its database. Some absolute value is specified in the protocol at which, if the age reaches this value, the announcement is declared invalid or "aged out," and is deleted from the database. The originating router is responsible for sending a new copy of the announcement at some time prior to this age expiration. The origination of a new announcement is called a refresh. In our example scenario, router A's announcement ages out in all databases because A is no longer refreshing the announcement. Every router in the network then has some assurance that as it deletes A's announcement, all of the other routers are also deleting the announcement

An announcement will be replicated numerous times during flooding, and as a result some routers will receive multiple copies of the same announcement. If all announcements arrive at the same time, any announcement can be chosen. Of course, delays across different network paths back to the originator are going to vary, so in most cases the multiple copies of the same announcement arrive at different times. In this case, the router might simply accept the announcement with the lowest age, on the grounds that the lowest age indicates the newest announcement.

It is entirely possible that delay variations in the network could cause an updated announcement to arrive with a greater age than the first announcement, causing the second announcement to be incorrectly dropped. Therefore, age is not a reliable determinant for choosing the most recent announcement. A
second value, the sequence number, specifies the most recent version of a router's announcement. Given two announcements from the same router, the announcement with the higher sequence number is newer

The simplest sequence numbering scheme is a linear one: A sequence number starts at zero, and increments up to some maximum. For example, a 32-bit sequence number (used by both OSPF and IS-IS) can range from zero to 232, or about 4.3 billion. If a router always starts numbering its announcements at sequence number one, this many available numbers should last far beyond the lifetime of the router. Even if a router produces a new announcement every second a sign of a very unstable link the maximum sequence number would not be reached for more than 130 years

If for some weird reason the router reaches the max counter it has to be reset to 1, If a new announcement is sent with a sequence number of 1, that announcement will be viewed by other routers as an older announcement and will be rejected. To avoid this situation, the router must wait until the existing announcement ages out of all databases

A better approach is for the router cycling its sequence numbers back to the beginning to first cause its previous announcement to be deleted from all databases. The router can do this by issuing another copy of the announcement with the maximum sequence number, but with the age set to a value indicating that the announcement's age has expired. (MaxAge or 0, depending on whether the age counter is up-counting or downcounting).

For example, if there is a 32-bit sequence number space, the number after 4,294,967,295 is 0. However, there is the potential for confusion in this scheme. Suppose two announcements are received from a router, one of which has a sequence number of 1000 and one with a sequence number of 10. The sequence numbers are not contiguous; so is the first announcement newer, or has the sequence number restarted, making the second number higher?

a > b, and (a - b) < n/2
a < b, and (b - a) > n/2
n = 64
and
n/2 = 32.

Given two sequence numbers 48 and 18, 48 is more recent because by the first rule
48 > 18 and (48 18) = 30, and 30 < 32.
Given two sequence numbers 3 and 48, 3 is more recent because by the second rule
3 < 48 and (48 3) = 45, and 45 > 32.
Given two sequence numbers 3 and 18, 18 is more recent because by the first rule
18 > 3 and (18 3) = 15, and 15 < 32.

Although a circular sequence number space seems better than a linear one, an odd series of simultaneous errors in a circular sequence number space can cause a network outage much worse than what sequence number rollover in a linear space causes. This determination is based on a meltdown of the ARPANET on October 27, 1980, when an early link state protocol with a 6-bit circular sequence number space was being used. As a result of that experience, both OSPF and IS-IS use linear sequence numbers.

The last issue with sequence numbers is what happens when a router or protocol restarts and loses all memory of what its last sequence number was. It cannot start again at the beginning, because other routers are likely to have an old announcement from the router in their databases with a higher sequence number, in which case the other routers would reject the new announcement with the lower sequence number. Therefore, another rule is added.

When a link state protocol starts, the router sends its announcement with a sequence number of 1 (or whatever the beginning sequence number is for that protocol). If a neighbor has an announcement from the router in its database with a more recent sequence number, it sends a copy of the announcement to the router. The router can then look at the sequence number and begin numbering its new announcements one higher than that number.

A checksum is performed over the entire contents of a link state announcement with the exception of the age field. Because the age changes as the announcement passes through routers during flooding, including it in the checksum would mean recalculating the checksum every time the age changes.

The age of every announcement is tracked, and if the originating router does not refresh the announcement before the age limit is reached, the announcement is deleted from the database.

An announcement is completely differentiated by its header, only the header needs to be sent during the phases of synchronization in which a neighbor is describing the announcements in its database and when a router is requesting a copy of the announcements it does not have.

The costs assigned to the links do not have to be the same at both ends of the links. The costs are assigned to the router interfaces connecting to the links, and apply to outgoing traffic on those interfaces. This way, traffic flows between two routers across the network can be made asymmetric

All routers in one area must have an identical link state database, rather than all routers in the entire routing domain. The flooding of individual router announcements is limited to the boundaries of an area.

Rule says that all routers within an area must have the same link state database, so area border routers
must maintain separate link state databases for each area to which they are attached. The scope of the trees calculated by the routers internal to an area is the area boundary. Area border routers calculate multiple trees, one for each of its attached areas.

Distance Vector basics


A vector protocol transmits the destination prefix to its directly connected neighbors. Each neighbor, when it receives the route information, modifies the information in such a way that the route information indicates the distance from that router to the originating router. For example, RIP increases the router hop count by one. The modified route is then added into the router's routing table. Only then is the route sent to that router's own directly connected neighbors,

Processing of the update messages involves two steps. First, the advertised hop count is incremented by one. If router B tells router A that a destination prefix is two hops away from itself, router A increments the hop count to three to indicate its own distance to the destination.

If the route table already contains an entry for the processed route, the hop counts are compared. If the newly incremented hop count of the received route is equal to or greater than the hop count of the existing entry, no change is made. But if the hop count of the received route is less than the hop count of the existing entry, the entry is replaced by the new route. In this way, the shortest route to each destination is maintained in the routing table.

  • Each router along a path plays a part in the route calculation. If any router makes a mistake in its calculation, all subsequent routers inherit the mistake.
  • A router cannot update its neighbors about a route until it has performed its own route calculation.

This is significant because the processing at each router takes some finite amount of time.

If the processing time at each router is t, then the time for a route to be processed across three routers is 3t, and across six routers 6t. If the route includes many router hops, the convergence time the time for the last router on the path to correctly enter the route into its routing table can be unacceptably long.

  • If a destination is not directly connected, all a router knows about the destination is what a directly connected neighbor tells it.

The neighbor might be mistaken, or might even give you intentionally incorrect information. It cannot "see" the entire network topology.

Split-horizon - Because vector protocols process updates hop by hop, the updates for a particular destination should always flow Downstream that is, away from the destination. Updates should not flow upstream toward a destination; this can help only in liner networks

In case of an indirect link failure in a mesh network there is a possibility that the downstream routers can be confused about the network topology














  • Imagine a very small time period in which two things happen 

D sends a periodic unsolicited update to C before it processes the update from B.
    C receives and processes the update from B before receiving the update from D.

    Router C has already processed the update from B and knows that 10.1.1.0 is no longer reachable through that neighbor. But then it receives the update from D, claiming it can reach 10.1.1.0 2 hops away. C assumes this information is accurate (remember, a vector protocol only knows what its neighbors tell it), and makes an entry in its routing table indicating that the previously unreachable 10.1.1.0 is now reachable via D, three hops away.

    The new route at C triggers an update to B, which now thinks 10.1.1.0 is reachable from C, four hops away. B updates D, which records the subnet reachable via B, five hops away.

    D then sends this information to C. When C receives this new update, the route is compared with the existing route table entry. The existing entry says that 10.1.1.0 is three hops away, whereas the new route shows the same subnet six hops away. However, both the existing route and the new route came from router D.

    Therefore, C must assume that the new entry describes the same route, and that the distance has increased for some unknown reason. So, C replaces the existing route with the new one. Another consequence of "I only know what my neighbors tell me."

    C then updates B, which increments its route to seven hops and updates D, which increments its route to eight hops and updates C, and so on. Because this circular pattern of updates could go on, theoretically, until the hop count reaches to infinity, the problem is called counting to infinity.

    The ill effects of counting to infinity can be limited by defining a finite value at which a route is considered unreachable. RIP, for example, defines an unreachable route as being 16 hops. In our example, the first router to increment the route to 16 hops marks the route as unreachable and then updates its downstream neighbor that the route is unreachable. Such a procedure eventually stops the counting to infinity loop, but it does not prevent the loop.

    A solution for preventing the counting to infinity problem is a holddown timer. If a router learns a route from a neighbor and then hears from that same neighbor that the route has become unreachable, a holddown timer is set for the route. Until the timer expires, the router will not accept any new information about that destination unless either

    • The information comes from the same neighbor that announced that the route was unreachable; or
    • Another neighbor advertises a route to the destination with a distance that is equal to or less than the distance of the original route.

    SPF explained in a Nutshell

    The SPF algorithm creates a shortest-path tree to all hosts in an area or in the network backbone, with the router that is performing the calculation at the root of that tree. In order for the SPF algorithm to work in the correct manner, all routers in the area should have the same database information. In OSPF, this is performed via the database exchange process

    The three sets that are used in the SPF calculation are:
    1. Unknown
    2. Tentative (TENT)
    3. PATH
    When the SPF algorithm initializes, all nodes, except for the root, which is also referred to as 'self', are placed in the Unknown set or list. The "root" is the router performing the SPF calculation. As SPF continues to run, nodes in the Unknown set are moved to the Tentative set beginning with the nodes directly connected to the root. The Tentative set is also commonly referred to simply as the TENT set or list.

    As the router goes through the SPF calculation, the node in the TENT set or list that is closest (lowest cost) to the root is moved to the PATH or PATHS list or set. This process is repeated until all nodes are in the PATH set and the TENT list is empty and finally the shortest-path tree is built.


    For example the OSPF database at router A would like below in the beginning.

    Node
    Cost
    Next Hop
    A
    0
    Self
    B
    Infinity
    unknown
    C
    Infinity
    unknown
    D
    Infinity
    unknown

    As and when the nodes move from TENT list to the PATH list the cost changes from infinity to an actual cost and the next-hop changes from unknown to a valid next hop Eg - {B,5,B}Node B reachable in cost 5 via B

    Neighbor Table looks like below for each router


    Router
    {neighbor, cost} Pairs
    A
    {B,5} {C,10}
    B
    {A,5} {C,3} {D,8}
    C
    {A,10} {B,3} {D,4}
    D
    {B,8} {C,4}











    Step 1. R1 places itself in the PATH list with a next-hop of itself and a cost of 0. When R1 places itself in the PATH list, R1 is also referred to as the PATH node, in addition to being the SPF root. All other nodes or routers are temporarily placed in the Unknown set and have their cost presently set to infinity

    This means that Router A's databases look like

    PATH List
    TENT List
    {A,0,0}
    (empty)


    Step 2. Take the node just placed on the PATH list, and call it the PATH node. Look at that node's neighbor list. Add each neighbor in this list to the TENT list with a next hop of the PATH node

    Router
    {neighbor, cost} Pairs
    A
    {B,5} {C,10}
    B
    {A,5} {C,3} {D,8}
    C
    {A,10} {B,3} {D,4}
    D
    {B,8} {C,4}
     






    In this example, {B,5,B} and {C,10,C} get added to the TENT list, as reflected.
    PATH List
    TENT List
    {A,0,0}
    {B,5,B}

    {C,10,C}


    Step 3. Find the neighbor in the TENT list with the lowest cost, add that neighbor to the PATH list, and repeat Step 2. If the TENT list is empty, stop.

    {B,5,B} is moved to the PATH list, because that's the shortest path to B. Why does it thinks that B has the shortest cost? - Because {C,10,C} is the only other neighbor of Router A, and the cost to get to C is greater than the cost to get to B, it's impossible to ever have a path to B that has a lower cost than what's already known

    PATH List
    TENT List
    {A,0,0}
    {C,10,C}
    {B,5,B}



    Step 4. Repeat Step 2. Take the node just placed on the PATH list and call it the PATH node. Look at that node's neighbor list. Add each neighbor in this list to the TENT list with a next hop of the PATH node, unless the neighbor is already in the TENT or PATH list with a lower cost. If the node just added to a TENT list already exists in the list, but with a higher cost, replace the higher-cost node with the node currently under consideration.


    Router
    {neighbor, cost} Pairs
    A
    {B,5} {C,10}
    B
    {A,5} {C,3} {D,8}
    C
    {A,10} {B,3} {D,4}
    D
    {B,8} {C,4}










     Router B's neighbors are examined.
    Router B has a link to C with a cost of 3 and a link to D with a cost of 8.

    Router C, with a cost of 5 (to get from "self" to B) + 3 (to get from B to C) = 8 (the total cost from A to C via B) and a next hop of B, is added to the TENT list, as is Router D, with a cost of 5 (the cost to get from the root node to B) + 8 (the cost to get from B to D) = 13 and a next hop of B.

    Because the path to C with a cost of 8 through B is lower than the path to C with a cost of 10 through C, the path to C with a cost of 10 is removed from the TENT list. the PATH and TENT lists at this point.

    PATH List
    TENT List
    {A,0,0}
    {C,10,C}
    {B,5,B}
    {C,8,B}

    {D,13,B}

    Step 5.Find the path in the TENT list with the lowest cost, add that path to the PATH list, and repeat Step 2. If the TENT list is empty, stop.

    The path to C through {C,8,B} is moved from the TENT list to the PATH list, as reflected in
    PATH List
    TENT List
    {A,0,0}
    {D,13,B}
    {B,5,B}

    {C,8,B}


    Step 6. Take the path just placed on the PATH list, and look at that node's neighbor list. For each neighbor in this list, add the path to that neighbor to the TENT list, unless the neighbor is already in the TENT or PATH list with a lower cost. If the node just added to a TENT list already exists in the list, but with a higher cost, replace the higher-cost path with the path currently under consideration.

    Router
    {neighbor, cost} Pairs
    A
    {B,5} {C,10}
    B
    {A,5} {C,3} {D,8}
    C
    {A,10} {B,3} {D,4}
    D
    {B,8} {C,4}





     Under this rule, the path to D through B (which is really B->C->D) with a cost of 12 replaces the path to D through B->D with a cost of 13, as reflected
    PATH List
    TENT List
    {A,0,0}
    {D,13,B}
    {B,5,B}
    {D,12,B}
    {C,8,B}


    Step 7. Find the neighbor in the TENT list with the lowest cost, add that neighbor to the PATH list, and repeat Step 2. If the TENT list is empty, stop.

    The path to D is moved to the PATH list, as reflected
    PATH List
    TENT List
    {A,0,0}

    {B,5,B}

    {C,8,B}

    {D,12,B}


    Step 8. Find the neighbor in the TENT list with the lowest cost, add that neighbor to the PATH list, and repeat Step 2. If the TENT list is empty, stop.

    The TENT list is empty because D has no neighbors that are not already on the PATH list, so you stop. At this point, the PATH list becomes A's routing table, which looks like
    Node
    Cost
    Next Hop
    A
    0
    Self
    B
    5
    B (directly connected)
    C
    8
    B
    D
    12
    B

    So that's how the basic SPF algorithm works. After it's done, the topology (according to Router A's routing table) looks like below: