Internet-Draft | Fault Management in MPTCP | June 2022 |
Kang, et al. | Expires 18 December 2022 | [Page] |
This document presents a mechanism for fault management during a MPTCP session. It is used to convey subflow failure information from client to server by other subflow running normally. It includes: 1) a new Fault Announce Option for describing subflow failure, 2) implementation and interoperability of this option during a MPTCP session when one subflow suffers a failure. In fact, the server is able to determine network problems accurately based on these fault information reported from multiple clients for their connections.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 18 December 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
During data transmission in a MPTCP session, subflows may encounter some problems, for example, port failure on one endpoint, network failure, or middlebox working abnormally. Current MPTCP protocol does not provide exchanges between client and server when a fault happens on a subflow which will cause transmission failure or delay.¶
RFC8684 [RFC8684] introduces TCP RST Reason (MP_TCPRST) option to signal reasons for sending a RST on a subflow which can help an implementation decide whether to attempt later reconnection. TCP RST Reason (MP_TCPRST) option only reports the reason for a specific subflow that has been determined to be closed later. This solution does not cover the case of abnormal termination of one ongoing subflow.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
This document proposes a fault announce mechanism with a new option that can be used to deliver failure information of abnormal subflow between client and server via another subflow in the MPTCP session that works properly. The flow is illustrated in Figure 1.¶
The Fault Announce option is carried on SYN, ACK or data packets.¶
Client may detect a local fault, for example, local port or network card failure, or an error in local protocol processing. In this way, the client can determine the fault cause.¶
Client may actively detect subflow failure by a detecting task to determine the fault cause. For example, the client may deploy a detection task using a bidirectional forwarding detection (BFD) to determine whether the subflow is faulty.¶
Client may send an ICMP request to server and determine the exceptions by the duration of a response. Specifically, if the client cannot receive a response within a preset time, it means that this subflow is not working properly.¶
Another way for client to determine the fault reason is ICMP error report. Client may receive an ICMP error report from a third device (e.g., middlebox on the faulty subflow), in which indicates the fault cause.¶
A new Fault Announce option is defined to describe the fault in detail occurring on one subflow. If it is set, the faulty subflow is identified by its source address ID (SrcAddressID) and destination address ID (SrcAddressID). The mapping between IP addresses and addresses IDs should be created on both client and server through the process of ADD_ADDR defined in RFC8684 [RFC8684] and RFC6824 [RFC6824].¶
The format of the Fault Announce option (FAULT_ANNOUNCE) is depicted in Figure 2:¶
A new subtype should be allocated to indicate Fault Announce option.¶
"Cause" is an 8-bit field to describe the reason code for which causes the subflow to malfunction. Client detects the fault and determines the cause. Following values (partially mapped to the Exception Code in ICMP error report) are defined in this document:¶
"SrcAddressID" is used to identify source address ID for the faulty subflow.¶
"DestAddressID" is used to identify destination address ID for the faulty subflow.¶
In some actual scenarios, it is the middlebox failure that causes blocking of one subflow. So client should report to server the information of the faulty middlebox by Fault Announce option so that the server can quickly locate it. The information of a faulty middlebox may include:¶
Middlebox IP: The IP address of the faulty middlebox.¶
IP protocol version: The IP protocol version adopted by the faulty middlebox, i.e. IPv4 or IPv6. Server can use it to parse the field of "Middlebox IP address".¶
Flag 'A': If "Middlebox IP address" is optional, this flag should be defined to indicate whether the field of "Middlebox IP address" is carried in Fault Announce option.¶
In some possible implementations, faults are classified into transient fault and non-transitory fault. So a field of "fault type" may be added to identify the type (transient fault or non-transitory fault) for subsequent processing.¶
IANA is requested to assign a MPTCP option subtype for the Fault Announce option.¶
Fault Announce option is neither encrypted nor authenticated, so on-path attackers and middleboxes could remove, add or modify this option on observed Multipath TCP connections.¶