Datapath Healing

2022 Apr 14

Concept of datapath healing process in NSM

Datapath healing

Description

Datapath healing is conceptually simple. The client uses the existing connection to verify that the connection is still alive: it pings over the datapath at a configured interval, and as soon as a check fails it requests a new connection.
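The check-and-heal cycle described above can be sketched as a small loop. This is only an illustration of the idea, not the NSM implementation: `monitor`, `simulate`, and the fake check are hypothetical names, and the intervals are arbitrary.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// monitor runs check once per interval and gives each check at most timeout
// to finish. It returns true as soon as a check fails (a new connection
// should be requested) and false if ctx is cancelled first.
func monitor(ctx context.Context, interval, timeout time.Duration, check func(context.Context) bool) bool {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
			checkCtx, cancel := context.WithTimeout(ctx, timeout)
			alive := check(checkCtx)
			cancel()
			if !alive {
				return true
			}
		}
	}
}

// simulate runs monitor against a fake check that succeeds okChecks times
// and then fails. It returns how many checks ran and whether healing was
// triggered.
func simulate(okChecks int) (int, bool) {
	calls := 0
	check := func(context.Context) bool {
		calls++
		return calls <= okChecks
	}
	heal := monitor(context.Background(), 5*time.Millisecond, 2*time.Millisecond, check)
	return calls, heal
}

func main() {
	calls, heal := simulate(3)
	fmt.Printf("checks run: %d, request new connection: %v\n", calls, heal)
}
```

In this sketch the fourth check is the first one to fail, so the loop stops there and signals that a new connection is needed.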

Parameters

These parameters configure datapath healing on the NSC side.

  • NSM_LIVENESS_CHECK_ENABLED enables or disables the datapath healing feature.
  • NSM_LIVENESS_CHECK_INTERVAL sets the interval between pings.
  • NSM_LIVENESS_CHECK_TIMEOUT sets the timeout for each ping.
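As a rough illustration of how these three settings might be read, the sketch below parses them from the environment. It assumes the interval and timeout are written as Go duration strings (for example `500ms` or `2s`); the `livenessConfig` type, the `parseLiveness` helper, and the fallback values are hypothetical and are not taken from the NSC source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// livenessConfig mirrors the three settings listed above.
type livenessConfig struct {
	Enabled  bool
	Interval time.Duration
	Timeout  time.Duration
}

// parseLiveness reads the settings through lookup (os.Getenv in real use).
// Unset variables keep the fallback values below, which are illustrative
// assumptions only.
func parseLiveness(lookup func(string) string) (livenessConfig, error) {
	cfg := livenessConfig{Enabled: true, Interval: 200 * time.Millisecond, Timeout: time.Second}
	var err error
	if v := lookup("NSM_LIVENESS_CHECK_ENABLED"); v != "" {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return cfg, fmt.Errorf("NSM_LIVENESS_CHECK_ENABLED: %w", err)
		}
	}
	if v := lookup("NSM_LIVENESS_CHECK_INTERVAL"); v != "" {
		if cfg.Interval, err = time.ParseDuration(v); err != nil {
			return cfg, fmt.Errorf("NSM_LIVENESS_CHECK_INTERVAL: %w", err)
		}
	}
	if v := lookup("NSM_LIVENESS_CHECK_TIMEOUT"); v != "" {
		if cfg.Timeout, err = time.ParseDuration(v); err != nil {
			return cfg, fmt.Errorf("NSM_LIVENESS_CHECK_TIMEOUT: %w", err)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := parseLiveness(os.Getenv)
	if err != nil {
		fmt.Println("invalid liveness settings:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the parsing logic easy to exercise with a plain map.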

Benefits

  • Network Health: Regular liveness checks help monitor the health and availability of a network or a specific vWire. Consistently successful ping responses indicate that the network or device is functioning correctly. Any disruption or failure in ping responses prompts the NSC to ask NSM for a new connection.
  • Server Uptime: By checking an endpoint’s interface at regular intervals, the NSC makes sure that the endpoint is online and operational. If the endpoint is not online, the NSC immediately requests a new endpoint.
  • Intrusion Detection: Regular liveness checks can be part of a broader security monitoring strategy. Unexpected changes in liveness check response patterns can help reveal unauthorized changes or intrusions.
  • Firewall and ACL Testing: Liveness checks can be used to test the effectiveness of firewalls and Access Control Lists (ACLs). If the NSC doesn’t come up, you can verify whether the configured rules are working as intended.

Add datapath healing for a custom dataplane framework

NSM uses the following contract for the liveness check function.

// LivenessCheck - function that returns true if conn is 'live' and false otherwise
type LivenessCheck func(deadlineCtx context.Context, conn *networkservice.Connection) bool
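One way to honor the deadline carried by `deadlineCtx` is to run the actual probe in a goroutine and give up when the context expires. The sketch below shows that pattern under stated assumptions: `Connection` is a local stand-in for `*networkservice.Connection` so the example compiles on its own, and `withDeadline`, `checkWithin`, and the probes are hypothetical helpers, not part of the NSM SDK.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Connection is a local stand-in for *networkservice.Connection.
type Connection struct{ ID string }

// LivenessCheck has the same shape as the contract above, using the stand-in type.
type LivenessCheck func(deadlineCtx context.Context, conn *Connection) bool

// withDeadline wraps a blocking probe so the resulting check reports
// "not live" if the probe has not answered before deadlineCtx expires.
func withDeadline(probe func(conn *Connection) bool) LivenessCheck {
	return func(deadlineCtx context.Context, conn *Connection) bool {
		result := make(chan bool, 1) // buffered: the goroutine never blocks on send
		go func() { result <- probe(conn) }()
		select {
		case ok := <-result:
			return ok
		case <-deadlineCtx.Done():
			return false
		}
	}
}

// checkWithin runs check with a deadline of timeoutMs milliseconds.
func checkWithin(check LivenessCheck, conn *Connection, timeoutMs int) bool {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return check(ctx, conn)
}

func main() {
	conn := &Connection{ID: "my-nsc"}
	fast := withDeadline(func(*Connection) bool { return true })
	slow := withDeadline(func(*Connection) bool {
		time.Sleep(200 * time.Millisecond) // simulates a probe that hangs
		return true
	})
	fmt.Println("fast probe live:", checkWithin(fast, conn, 50))
	fmt.Println("slow probe live:", checkWithin(slow, conn, 50))
}
```

Here the fast probe is reported as live, while the slow probe exceeds its 50 ms deadline and is treated as a failed check.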

To use a custom liveness check function, simply implement it and pass it to the heal client:

// MyLivenessCheck reports whether conn is still alive.
// deadlineCtx bounds how long the check may take.
func MyLivenessCheck(deadlineCtx context.Context, conn *networkservice.Connection) bool {
	// TODO: check the Ethernet Context
	return true
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// NOTE: logger and u are assumed to be defined elsewhere;
	// their setup is omitted here for brevity.

	// ********************************************************************************
	// Get config from environment
	// ********************************************************************************
	c := &config.Config{}
	if err := envconfig.Usage("nsm", c); err != nil {
		logger.Fatal(err)
	}
	if err := envconfig.Process("nsm", c); err != nil {
		logger.Fatalf("error processing rootConf from env: %+v", err)
	}

	request := &networkservice.NetworkServiceRequest{
		Connection: &networkservice.Connection{
			Id:             "my-nsc",
			NetworkService: c.NetworkService(),
			Labels:         c.Labels(),
		},
		MechanismPreferences: []*networkservice.Mechanism{
			u.Mechanism(),
		},
	}

	// Pass the custom liveness check to the heal client.
	nsmClient := client.NewClient(
		ctx,
		client.WithClientURL(&c.ConnectTo),
		client.WithName(c.Name),
		client.WithAuthorizeClient(authorize.NewClient()),
		client.WithHealClient(heal.NewClient(ctx, heal.WithLivenessCheck(MyLivenessCheck))),
	)

	// ********************************************************************************
	// Initiate connections
	// ********************************************************************************

	if _, err := nsmClient.Request(ctx, request); err != nil {
		logger.Fatalf("failed to request connection: %+v", err)
	}
}

See the kernel liveness check implementation for an example.

References