FRR ISIS Flex Algo: Configuration, Crash & Node SID Issues

by Alex Johnson

Introduction

This article examines a problem encountered when configuring ISIS (Intermediate System to Intermediate System) with Flexible Algorithm (Flex Algo) in FRRouting (FRR). It covers three symptoms: the configuration is accepted at runtime without errors, the additional Node SID (Segment Identifier) labels are not distributed, and the ISIS daemon crashes when FRR is restarted. These details matter to network engineers and administrators who rely on FRR for routing and segment routing.

Problem Description

When an ISIS topology with Flex Algo is configured, the commands are accepted at runtime without errors. However, the additional Node SID labels for the flexible algorithm are not distributed to ISIS peers, so the peers have no labels with which to forward traffic along the Flex Algo paths. A more severe issue appears when FRR is restarted: the ISIS daemon crashes, which disrupts routing and requires manual intervention.

Configuration Details

To better illustrate the problem, consider the following configurations used in the FRR environment:

Configuration Snippet

affinity-map blue bit-position 0
!
router isis LAB
 is-type level-2-only
 mpls-te on
 mpls-te router-address a.a.a.a
!
flex-algo 128
 dataplane sr-mpls
 advertise-definition
 affinity include-any blue
!
segment-routing on
segment-routing prefix a.b.c.d/32 index AB
segment-routing prefix a.b.c.d/32 algorithm 128 index ABC
!
interface ethA
 link-params
 affinity blue
 exit-link-params

In this setup, the affinity name "blue" is mapped to bit position 0, and ISIS is configured with MPLS-TE (Multiprotocol Label Switching Traffic Engineering) enabled. Flex Algo 128 is defined to use the SR-MPLS dataplane (Segment Routing over MPLS), advertises its definition, and carries an include-any constraint on the "blue" affinity. Segment routing is enabled, and the same /32 prefix is given two SIDs: index AB for the default algorithm and index ABC for algorithm 128. The interface ethA is tagged with the "blue" affinity.

Explanation

  • affinity-map blue bit-position 0: Maps the affinity name "blue" to bit position 0 of the administrative-group bitmask, so that links can be tagged with it.
  • router isis LAB: Configures the ISIS router under the area tag "LAB".
  • mpls-te on: Enables MPLS-TE within the ISIS context, allowing for traffic engineering capabilities.
  • flex-algo 128: Defines a flexible algorithm with an ID of 128, which can be used for custom routing policies.
  • dataplane sr-mpls: Specifies that the flexible algorithm should use the SR-MPLS dataplane.
  • advertise-definition: Advertises the flexible algorithm definition to the other routers in the ISIS domain.
  • affinity include-any blue: Restricts the flexible algorithm's path calculation to links that carry at least one of the listed affinities, here "blue".
  • segment-routing on: Enables segment routing globally.
  • segment-routing prefix a.b.c.d/32 algorithm 128 index ABC: Assigns the prefix a second SID index for flexible algorithm 128, in addition to the default-algorithm index AB configured on the line before it.
  • interface ethA: Configures the interface ethA.
  • link-params: Defines link parameters for the interface.
  • affinity blue: Applies the "blue" affinity to the interface.
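
The interaction between the affinity map, the interface affinity, and the include-any constraint can be sketched in a few lines of Python. This is only an illustration of the bitmask idea, not FRR code: the names AFFINITY_MAP, mask_for, and passes_include_any are hypothetical, and the rule shown is the general include-any rule for Flex Algo (a link is kept when it shares at least one bit with the constraint).

```python
# Hypothetical illustration of affinity bits and the include-any rule.
# None of these names exist in FRR; they are made up for this sketch.

AFFINITY_MAP = {"blue": 0}  # mirrors: affinity-map blue bit-position 0


def mask_for(names):
    """Build an admin-group bitmask from a list of affinity names."""
    mask = 0
    for name in names:
        mask |= 1 << AFFINITY_MAP[name]
    return mask


def passes_include_any(link_affinities, include_any):
    """Keep a link if it shares at least one bit with the include-any mask.

    An empty include-any list places no restriction on the link.
    """
    wanted = mask_for(include_any)
    if wanted == 0:
        return True
    return (mask_for(link_affinities) & wanted) != 0


if __name__ == "__main__":
    # ethA carries "blue", so it is usable by flex-algo 128.
    print(passes_include_any(["blue"], ["blue"]))
    # A link with no affinity is pruned from the flex-algo 128 topology.
    print(passes_include_any([], ["blue"]))
```

Under these assumptions, only links tagged "blue" take part in the algorithm 128 topology, which is why the interface needs the affinity applied under link-params.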

ISIS Crash Details

When FRR is restarted, the ISIS daemon crashes, as the following log excerpt shows:

Crash Log

ZEBRA: [V98V0-MTWPF] client 67 says hello and bids fair to announce only isis routes vrf=0
ISIS: [WH5BW-RNTXW][EC 100663326] nb_running_get_entry_worker: failed to find entry [xpath /frr-isisd:isis/instance[area-tag='LAB'][vrf='default']/flex-algos/flex-algo[flex-algo='128']/affinity-include-anies/affinity-include-any[.='blue']]
ISIS: Backtrace for 20 stack frames:
ISIS: [bt 0] /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_backtrace+0x3d) [0x7facb3076e31]
ISIS: [bt 1] /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0xcc6fa) [0x7facb30866fa]
ISIS: [bt 2] /usr/lib/frr/isisd(+0x8e756) [0x55f8a80fb756]
...
ISIS: Received signal 6 at 1762967454 (si_addr 0x670000024e, PC 0x7facb2d10eec); aborting...
...
ZEBRA: [N5M5Y-J5BPG][EC 4043309121] Client 'isis' (session id 0) encountered an error and is shutting down.
...
WATCHFRR: [M297P-SH4GY] restart isisd process 581 exited normally

The log shows the immediate trigger: nb_running_get_entry_worker cannot find the running-configuration entry for the affinity include-any value "blue" under Flex Algo 128 in the instance with area tag "LAB". The daemon then aborts with signal 6 (SIGABRT), and zebra reports that the isis client encountered an error and is shutting down.
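
When comparing crash reports, it can help to pull the failing xpath and its list keys out of the log line. The following is a minimal sketch that assumes the message keeps the exact "failed to find entry [xpath ...]" shape seen in the excerpt; the helper names failing_xpath and xpath_keys are hypothetical.

```python
import re

# Log line copied from the crash excerpt in this article.
LOG_LINE = (
    "ISIS: [WH5BW-RNTXW][EC 100663326] nb_running_get_entry_worker: "
    "failed to find entry [xpath /frr-isisd:isis/instance[area-tag='LAB']"
    "[vrf='default']/flex-algos/flex-algo[flex-algo='128']"
    "/affinity-include-anies/affinity-include-any[.='blue']]"
)

# Everything between "[xpath " and the final closing bracket of the line.
XPATH_RE = re.compile(r"failed to find entry \[xpath (?P<xpath>.+)\]\s*$")
# Predicates of the form [key='value'] inside the xpath.
KEY_RE = re.compile(r"\[([^\[\]=]+)='([^']*)'\]")


def failing_xpath(line):
    """Return the xpath from a 'failed to find entry' line, or None."""
    match = XPATH_RE.search(line)
    return match.group("xpath") if match else None


def xpath_keys(xpath):
    """Return the [key='value'] predicates of an xpath as a dict."""
    return dict(KEY_RE.findall(xpath))


if __name__ == "__main__":
    xpath = failing_xpath(LOG_LINE)
    print(xpath)
    print(xpath_keys(xpath))
```

For the line above, the extracted keys identify the instance (area-tag LAB, vrf default), the algorithm (128), and the affinity name (blue, stored under the "." key because it is a leaf-list value).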

Impact and Expected Behavior

Both symptoms affect network stability and functionality. Segment routing depends on these labels being distributed for correct path computation and traffic forwarding. The expected behavior is that FRR/ISIS restarts cleanly with the given runtime configuration, and that the additional Node SID labels are distributed correctly among ISIS peers. Instead, the labels are not distributed and the ISIS daemon aborts on restart.
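
To make "additional Node SID labels" concrete: in SR-MPLS, a prefix SID index is turned into an MPLS label by adding it to the start of the Segment Routing Global Block (SRGB). The sketch below assumes an SRGB of 16000 to 23999 and uses two made-up indexes in place of the article's AB and ABC placeholders; substitute the block and indexes actually configured.

```python
# Sketch only: the SRGB bounds and both indexes are assumptions for
# illustration, not values taken from the configuration in this article.
SRGB_START = 16000
SRGB_END = 23999


def sid_label(index, srgb_start=SRGB_START, srgb_end=SRGB_END):
    """Return the MPLS label for a prefix-SID index inside the SRGB."""
    label = srgb_start + index
    if index < 0 or label > srgb_end:
        raise ValueError(f"index {index} falls outside the SRGB")
    return label


if __name__ == "__main__":
    default_algo_index = 10   # hypothetical stand-in for "index AB"
    flex_algo_index = 110     # hypothetical stand-in for "index ABC"
    print(sid_label(default_algo_index))  # label for the default algorithm
    print(sid_label(flex_algo_index))     # extra label for algorithm 128
```

With these assumed values, peers would be expected to learn two labels for the same /32: one for the default algorithm and one for algorithm 128. The second one is the label that is missing in the reported behavior.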

Steps to Reproduce

To reproduce this issue, the following steps can be taken:

  1. Configure FRR with ISIS, enabling MPLS-TE and segment routing.
  2. Define an affinity map (e.g., "blue") and apply it to an interface.
  3. Configure a Flex Algo with SR-MPLS dataplane and include the defined affinity map.
  4. Define a segment routing prefix associated with the Flex Algo.
  5. Restart the FRR instance and observe the ISIS daemon crash.

Example Configuration

affinity-map blue bit-position 0
!
router isis LAB
 mpls-te on
!
flex-algo 128
 dataplane sr-mpls
 advertise-definition
 affinity include-any blue
!
segment-routing on
segment-routing prefix x.x.x.x/32 algorithm 128 index XXX
!
interface ethX
 link-params
 affinity blue
 exit-link-params
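
Before attributing a failure to the daemon, it is worth ruling out a simple typo in the names the configuration refers to. The following sketch checks that every affinity name has a matching affinity-map and that every algorithm used by a prefix SID has a flex-algo block. It is a plain text check written for this article, not an FRR tool, and the address, index, and interface name in CONFIG are hypothetical values filled in for the placeholders.

```python
import re

# Example configuration with hypothetical values in place of the
# x.x.x.x, XXX and ethX placeholders.
CONFIG = """\
affinity-map blue bit-position 0
!
router isis LAB
 mpls-te on
!
flex-algo 128
 dataplane sr-mpls
 advertise-definition
 affinity include-any blue
!
segment-routing on
segment-routing prefix 192.0.2.1/32 algorithm 128 index 110
!
interface eth0
 link-params
 affinity blue
 exit-link-params
"""

MAP_RE = re.compile(r"affinity-map (\S+) bit-position \d+")
ALGO_RE = re.compile(r"flex-algo (\d+)")
AFFINITY_RE = re.compile(
    r"affinity (?:include-any |include-all |exclude-any )?(.+)"
)
SID_RE = re.compile(r"segment-routing prefix \S+ algorithm (\d+) index \d+")


def check_references(config_text):
    """Return a list of dangling affinity or flex-algo references."""
    lines = [ln.strip() for ln in config_text.splitlines()]
    maps = {m.group(1) for m in map(MAP_RE.fullmatch, lines) if m}
    algos = {m.group(1) for m in map(ALGO_RE.fullmatch, lines) if m}
    problems = []
    for ln in lines:
        m = AFFINITY_RE.fullmatch(ln)
        if m:
            for name in m.group(1).split():
                if name not in maps:
                    problems.append(f"affinity '{name}' has no affinity-map")
        m = SID_RE.fullmatch(ln)
        if m and m.group(1) not in algos:
            problems.append(f"prefix SID uses undefined flex-algo {m.group(1)}")
    return problems


if __name__ == "__main__":
    print(check_references(CONFIG))
```

The example configuration has no dangling references, so the check returns an empty list for it. A clean result here only rules out naming mistakes; it says nothing about the crash itself.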

Version Information

The issue has been observed in FRR versions 10.1.1 and 10.4.1, indicating that it is not specific to a single release but rather a persistent problem across multiple versions.

Conclusion

The problems encountered with ISIS SR Flex Algo configurations in FRR highlight the importance of thorough testing and validation when deploying advanced network features. A configuration that is accepted without the expected Node SID distribution, combined with an ISIS daemon crash on restart, poses a significant challenge for network administrators. Addressing these issues requires a detailed investigation of the FRR codebase, focusing on the interaction between the ISIS, Flex Algo, and segment routing modules.

For more in-depth information on FRRouting and ISIS, refer to the official FRR Documentation.