Stateless Best Effort Multicast Using MRH

Futurewei, Boston, MA, USA, Huaimo.chen@futurewei.com
Futurewei, 2386 Panoramic Circle, Apopka, FL 32703, USA, +1-508-333-2270, d3e3e3@gmail.com
Futurewei, michael.mcbride@futurewei.com
Casa Systems, USA, yfan@casa-systems.com
Verizon, 13101 Columbia Pike, Silver Spring, MD 20904, USA, 301 502-1347, gyan.s.mishra@verizon.com
China Mobile, liuyisong@chinamobile.com
China Telecom, Beiqijia Town, Changping District, Beijing 102209, China, wangaj3@chinatelecom.cn
IBM Corporation, USA, xufeng.liu.ietf@gmail.com
Fujitsu, USA, liulei.kddi@gmail.com

This document describes stateless best effort Multicast
along the shortest paths to the egress nodes of
a P2MP Path/Tree.
The multicast data packet is encapsulated in an IPv6
Multicast Routing Header (MRH). The MRH contains the
egress nodes represented by the indexes of the nodes and
flexible bit strings for the nodes.
The packet is delivered to each of the egress nodes
along the shortest path.
There is no state stored in the core of the network.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in
when, and only when, they appear in all
capitals, as shown here.

The potential egress nodes and transit nodes in a network
are numbered or indexed
from 1 to the number of the nodes.
shows an example network having nodes
PE1 to PE10 and P1 to P5, where
PE1 to PE10 are edge nodes (i.e., potential egress nodes)
and P1 to P5 are transit nodes.
In this example, these nodes have node indexes
1 to 10 and 11 to 15 respectively.
The number labeling a link is the cost of the link. For example,
5 on the link between P5 and PE4 is the cost of the link.
The cost of a link without a numeric label is 1.
The P2MP path/tree from ingress PE1 towards egresses PE2 to PE6
(i.e., PE2, PE3, PE4, PE5 and PE6)
in is represented
by the indexes of the egress nodes of the P2MP path.
The indexes of PE2 to PE6 are 2 to 6 (i.e., 2, 3, 4, 5 and 6)
respectively.
The indexes are represented by a flexible bit string or
the indexes directly.
The more efficient of the two representations is used.
That is, if using the indexes directly is more efficient,
the indexes are used directly; otherwise,
the flexible bit string is used.
A controller such as a PCE can have the information
about the node indexes, and send the P2MP path to
the ingress of the path. After receiving a data packet from traffic source CE1,
ingress PE1 encapsulates the packet in an MRH with
the P2MP path represented by the indexes.
The packet is transmitted along the shortest
path to each of the egresses.
This document describes the encoding of a P2MP Path/Tree
using the indexes of the egress nodes of the tree
and specifies the procedure/behavior of the nodes along the
shortest paths to the egresses.

The following acronyms are used in this document:
CE: Customer edge/equipment.
MRH: Multicast Routing Header.
P2MP: Point-to-Multipoint.
PE: Provider Edge.

This section describes two basic encodings of a P2MP tree
(i.e., the egress nodes of the tree):
flexible bitstring and explicit nodeindex.
We encode the tree more efficiently
using flexible bitstring and/or explicit nodeindex.
A flexible bitstring has four fields:
B flag with value 1, start index (StartIndex), size of bitstring (S-BitString) in bytes, and bitstring (BitString),
where each bit with value 1 indicates a node index equal to
StartIndex plus the bit number.
Note that the bit number is counted from left to right and from 0.

For example,
the P2MP path/tree from ingress PE1 to egresses
PE2 - PE6 (i.e., PE2, PE3, PE4, PE5 and PE6) in
is represented in
using a flexible bitstring.
The indexes of egress nodes PE2 to PE6 are represented by four fields:
B flag with value of 1,
StartIndex of 15 bits with value of 2,
the S-BitString byte with value of 1, and
the BitString of 1 byte (i.e., 8 bits) with value 0b11111000.
S-BitString = 1 indicates BitString occupies 1 byte.
BitString = 0b11111000
combined with StartIndex = 2 indicates five node indexes 2, 3, 4, 5 and 6.
BitString's first bit (bit 0) with value 1 indicates the first
node index 2 equal to 2 + 0;
the BitString's second bit (bit 1) with value 1 indicates the second
node index 3 equal to 2 + 1, and so on.
In this case, the encoding of the P2MP tree uses 4 bytes.
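The 4-byte result above can be reproduced with a minimal sketch. It assumes network byte order, with the B flag in the most significant bit of a 16-bit word that also carries the 15-bit StartIndex; the function names are hypothetical and are not defined by this document.

```python
import struct

def encode_flexible_bitstring(indexes):
    """Pack node indexes as B=1 | StartIndex (15 bits) | S-BitString | BitString.

    Assumed layout: big-endian, B flag in the top bit of the first 16-bit word.
    """
    start = min(indexes)
    span = max(indexes) - start + 1            # number of bits needed
    size = (span + 7) // 8                     # S-BitString: BitString length in bytes
    if start >= 0x8000 or size > 0xFF:
        raise ValueError("StartIndex or BitString size out of range")
    bits = bytearray(size)
    for idx in indexes:
        bit = idx - start                      # bit number, counted from the left, from 0
        bits[bit // 8] |= 0x80 >> (bit % 8)
    return struct.pack("!HB", 0x8000 | start, size) + bytes(bits)

def decode_flexible_bitstring(data):
    """Recover the node indexes from one flexible bitstring."""
    first_word, size = struct.unpack("!HB", data[:3])
    if not first_word & 0x8000:
        raise ValueError("B flag is not 1")
    start = first_word & 0x7FFF
    bitstring = data[3:3 + size]
    return [start + i for i in range(size * 8)
            if bitstring[i // 8] & (0x80 >> (i % 8))]

encoded = encode_flexible_bitstring([2, 3, 4, 5, 6])
print(encoded.hex(), len(encoded))             # 800201f8 4
print(decode_flexible_bitstring(encoded))      # [2, 3, 4, 5, 6]
```

The last byte, 0xf8, is the BitString 0b11111000 from the example.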
An explicit nodeindex has two fields:
B flag with value of 0 and
node index (NodeIndex) representing a node index directly/explicitly.
Suppose that the indexes of egress nodes PE2 to PE6
of the P2MP tree in
are 2 to 6 respectively.
illustrates the encoding of
the tree (i.e., PE2, PE3, PE4, PE5 and PE6) using explicit nodeindex.
The node index of PE2 is represented by B = 0 and
NodeIndex of 15 bits with value of 2 (i.e., NodeIndex = 2);
the node index of PE3 is represented by B = 0 and
NodeIndex of 15 bits with value of 3 (i.e., NodeIndex = 3);
and so on.
In this case, the encoding of the P2MP tree uses 10 bytes.
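As a rough illustration of the 10-byte figure, the sketch below packs each node index into one 16-bit word whose top bit (the B flag) is 0. Network byte order and the function name are assumptions made for this example only.

```python
import struct

def encode_explicit_indexes(indexes):
    """Pack each node index as B=0 | NodeIndex (15 bits), two bytes per index."""
    out = b""
    for idx in indexes:
        if not 0 < idx < 0x8000:
            raise ValueError("NodeIndex must fit in 15 bits and be non-zero")
        out += struct.pack("!H", idx)          # top bit stays 0, so B = 0
    return out

explicit = encode_explicit_indexes([2, 3, 4, 5, 6])
print(explicit.hex(), len(explicit))           # 00020003000400050006 10
```

Each egress costs two bytes, so the size grows linearly with the number of egress nodes.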
In this example, using a flexible bitstring (4 bytes) is more efficient than
using explicit node indexes (10 bytes).
In general, we encode a tree more efficiently
using flexible bitstring and/or explicit nodeindex. That is,
we encode some egress nodes of the tree using flexible bitstring
and the others using explicit nodeindex.
For the tree from PE1 towards PE2 to PE6 in
,
if PE2 to PE6 have their indexes 2 to 6 respectively,
we encode the tree using flexible bitstring as shown in
.
Using flexible bitstring to encode the tree is more efficient than
using explicit nodeindex as shown in
.
If PE2 to PE6 have their indexes
102, 503, 904, 905 and 906 respectively,
we encode the tree using flexible bitstring and explicit nodeindex
as shown in .
We encode egress nodes PE2 and PE3 of the tree using
two explicit nodeindexes.
PE2's index 102 is represented by the first explicit nodeindex with B = 0 and
NodeIndex of 15 bits with value of 102 (i.e., NodeIndex = 102).
PE3's index 503 is represented by the second explicit nodeindex with B = 0 and
NodeIndex of 15 bits with value of 503 (i.e., NodeIndex = 503).
We encode egress nodes PE4 - PE6 (i.e., PE4, PE5 and PE6) of
the tree using a flexible bitstring.
The indexes of PE4 to PE6 are represented
by the flexible bitstring with four fields:
B flag of 1 bit with value of 1,
StartIndex of 15 bits with value of 904,
S-BitString of 8 bits with value of 1, and
BitString of 1 byte (i.e., 8 bits) with value 0b11100000.
S-BitString = 1 indicates BitString occupies 1 byte.
BitString = 0b11100000
combined with StartIndex = 904 indicates that the indexes of PE4 to PE6
are 904 to 906 respectively.
The BitString's first bit (i.e., bit 0) = 1 indicates PE4's index
904 (i.e., 904 = 904 + 0);
the BitString's second bit (i.e., bit 1) = 1 indicates PE5's index
905 (i.e., 905 = 904 + 1); and
the BitString's third bit (i.e., bit 2) = 1 indicates PE6's index
906 (i.e., 906 = 904 + 2).
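A decoder for such a mixed encoding could look like the sketch below. It assumes the elements are laid out back to back in network byte order and that the B flag is the top bit of each element's first 16-bit word; `decode_tree` is a hypothetical name. A NodeIndex of 0 is skipped, following the convention used later in this document for cleared entries.

```python
import struct

def decode_tree(data):
    """Walk a byte string of explicit nodeindexes and flexible bitstrings."""
    indexes, pos = [], 0
    while pos < len(data):
        (word,) = struct.unpack_from("!H", data, pos)
        pos += 2
        if word & 0x8000:                      # B = 1: flexible bitstring
            start = word & 0x7FFF              # StartIndex
            size = data[pos]                   # S-BitString, in bytes
            bitstring = data[pos + 1:pos + 1 + size]
            pos += 1 + size
            for bit in range(size * 8):
                if bitstring[bit // 8] & (0x80 >> (bit % 8)):
                    indexes.append(start + bit)
        elif word:                             # B = 0: explicit NodeIndex (0 = cleared)
            indexes.append(word)
    return indexes

# NodeIndex 102, NodeIndex 503, then B=1/StartIndex 904, S-BitString 1, BitString 0b11100000
tree = bytes.fromhex("0066" "01f7" "8388" "01" "e0")
print(len(tree), decode_tree(tree))            # 8 [102, 503, 904, 905, 906]
```

Under these assumptions the mixed encoding takes 8 bytes, against 10 bytes for five explicit nodeindexes.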
Every node in a network has a Node Index IPv6 Forwarding Table (NIFT).
The table has a row for the index of each egress node.
The row contains the index of the egress node,
the IPv6 address and the index
of the next hop on the shortest path to the egress node, and
node index bit mask (BM) of the same next hop node (BM-SNH).
This table indicates the shortest IGP path to each egress, i.e.,
the next hop of the shortest path to each egress.
This is similar to a unicast forwarding table but organized
by exact match node index rather than longest match IP
address or the like.
shows an example Node Index IPv6 Forwarding Table of PE1 in
.
The table has 10 rows/entries of node index,
next hop IPv6 address, next hop index, and
BM of the same next hop.
The next hop to PE1 itself is NULL.
The next hop to each of PE2 to PE9 is P1.
The next hop to PE10 is PE10.
Note: The information such as port number or interface
used to forward a packet
to the next hop is not shown in the figure,
which is the same as the corresponding information in
the forwarding table (FIB) of PE1.
For example, the second row/entry contains
node index 2 of egress PE2, next hop node P1's IPv6 address,
next hop node P1's index 11, and the same next hop P1's bit mask (BM-SNH)
0b0111111110 indicating node indexes 2 to 9 of PE2 to PE9
have the same next hop P1.
The tenth row/entry contains
node index 10 of egress PE10,
next hop node PE10's IPv6 address,
next hop node PE10's index 10, and the same next hop PE10's bit mask (BM-SNH)
0b0000000001 indicating node index 10 of PE10
has the same next hop PE10.
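One way to picture how the BM-SNH column relates to the next hop column is the sketch below, which derives the masks for PE1's table from a map of egress index to next hop index. The dictionary layout, the string form of the mask (leftmost character is node index 1) and the name `build_nift` are illustrative assumptions; the next hop IPv6 address column is omitted, and the BM-SNH for the node's own row is left as None because the text does not specify it.

```python
def build_nift(next_hops, num_egress):
    """Map egress index -> (next hop index, BM-SNH string).

    next_hops maps egress index -> next hop index, with None for the node itself.
    """
    nift = {}
    for egress, nh in next_hops.items():
        if nh is None:                         # the node itself: next hop is NULL
            nift[egress] = (None, None)
            continue
        bm_snh = "".join(
            "1" if next_hops.get(i) == nh else "0"
            for i in range(1, num_egress + 1)
        )
        nift[egress] = (nh, bm_snh)
    return nift

# PE1's view: PE2 to PE9 via P1 (index 11), PE10 directly (index 10)
pe1_next_hops = {1: None, **{i: 11 for i in range(2, 10)}, 10: 10}
pe1_nift = build_nift(pe1_next_hops, 10)
print(pe1_nift[2])                             # (11, '0111111110')
print(pe1_nift[10])                            # (10, '0000000001')
```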
shows an example Node Index IPv6 Forwarding Table of P1 in
.
The table has 10 rows/entries of node index,
next hop IPv6 address, next hop node index, and
BM of the same next hop (BM-SNH).
For example, since the next hop to PE1 and PE10 is PE1,
the first row/entry contains node index 1 of PE1,
next hop PE1's IPv6 address, next hop PE1's index 1,
and the same next hop PE1's bit mask (BM-SNH) 0b1000000001 indicating
node indexes 1 and 10 of PE1 and PE10 have the same next hop PE1.
The next hop to PE2 and PE3 is P2.
The second row/entry contains node index 2 of egress PE2,
next hop P2's IPv6 address, next hop P2's index 12,
and the same next hop P2's bit mask (BM-SNH) 0b0110000000 indicating
node indexes 2 and 3 of PE2 and PE3 have the same next hop P2.
The next hop to PE4 - PE7 is P5.
The fourth row/entry contains node index 4 of egress PE4,
next hop P5's IPv6 address, next hop P5's index 15,
and the same next hop P5's bit mask (BM-SNH) 0b0001111000 indicating
node indexes 4 to 7 of PE4 to PE7 have the same next hop P5.
shows
a Multicast Routing Header (MRH) in an IPv6 packet.
The IPv6 packet has an IPv6 header with a destination address
(DA) and source address (SA) of IPv6,
a routing header with Routing type (TBD) indicating MRH and
an IP multicast datagram.
The routing header is indicated by the Next Header in the IPv6 header.
The format of the MRH is shown in
.
The MRH has the following fields:
Next Header: The type of the header after the MRH.
Either another extension header or the type of
IP multicast datagram in the packet.
Hdr Ext Len: Its value indicates the length of the
MRH in a unit of 64 bits (i.e., 8 bytes) excluding the first
8 bytes.
Routing Type: Its value TBD
indicates that the routing header is a Multicast Routing
Header (MRH).
Version: The Version of the MRH.
This document specifies Version one.
Flags: No flag is defined yet.
SL: Its value points to
the sub-tree (the start of the subtree).
SE: Its value indicates
the end of the sub-tree.
Sub-tree: Its value encodes the egress nodes of
the sub-tree. A node index MUST NOT occur more than once.
The node indexes in sub-tree are ordered.

For the P2MP path/tree from PE1 via P1 to
PE2, PE3, PE4, PE5 and PE6 as shown
in ,
we select and use the encoding of the tree by
flexible bitstring as illustrated
in .
For an IP multicast datagram/packet to be transmitted by
the P2MP path/tree,
PE1 constructs an IPv6 packet for each sub-tree of the tree and
sends the packet containing a MRH and the IP multicast
datagram/packet to the next hop along the sub-tree.

The number of sub-trees from PE1 is the number of different
next hop nodes from PE1 to the egress nodes (i.e., PE2 to PE6).
PE1 gets the next hops to the egress nodes
using its Node Index IPv6 Forwarding Table as shown in
with the node indexes of the egress nodes, which are 2, 3, 4, 5 and 6.
The next hops are the same,
which are P1. Thus, there is one sub-tree from PE1
via P1 towards PE2 to PE6.

PE1 sets DA of the IPv6 packet to
P1's IPv6 address (P1's IPv6 for short) and
SA of the packet to PE1's IPv6 address (PE1's IPv6 for short).
PE1 builds the MRH based on the encoding of the tree
by including the sub-tree from P1, setting SL to 4 as a pointer
to the sub-tree, and setting SE to 4,
which is the size of the sub-tree and indicates the end of the sub-tree.
shows the packet to be sent to P1, which is received by P1.
After receiving the IPv6 packet from PE1, P1 determines
whether the packet's next header is a MRH through checking if
the next header is a routing header, and if so,
whether the routing type in the routing
header is TBD for MRH.
When the next header is the MRH, P1 duplicates the packet
for each sub-tree from P1 and
sends the packet copy with an updated MRH to the next hop
along the sub-tree. P1 gets the next hops to the egress nodes
using its Node Index IPv6 Forwarding Table as shown in
with the node indexes of the egress nodes, which are 2, 3, 4, 5 and 6.
PE2 and PE3 have the same next hop P2 according to the table.
PE4 to PE6 have the same next hop P5.

There are 2 sub-trees from P1. One sub-tree is from P1
via next hop P2 to PE2 and PE3.
The other is from P1 via next hop P5 to PE4, PE5 and PE6.
P1 duplicates the packet for each of these two sub-trees
and sends the packet copy to the next hop along the sub-tree.

P1 sets the DA of one packet copy to P2's IPv6 address.
P1 updates the MRH based on the encoding of the tree in
by logically ANDing the BitString of 8 bits
with the corresponding 8 bits
(i.e., bits 2 to 9) in BM-SNH of PE2 (or PE3)
(i.e., removing the egress nodes PE4 to PE6, which are not on the sub-tree
from P2 to PE2 and PE3).
shows the IPv6 packet to be sent to P2, which is received by P2.

P1 sets the DA of the other packet copy to P5's IPv6 address.
P1 updates the MRH based on the encoding of the tree in
by logically ANDing the BitString of 8 bits
with the corresponding 8 bits
(i.e., bits 2 to 9) in BM-SNH of PE4 (or PE5 or PE6)
(i.e., removing the egress nodes PE2 and PE3, which are not
on the sub-tree from P5 to PE4, PE5 and PE6).
shows the IPv6 packet to be sent to P5, which is received by P5.

After receiving the IPv6 packet from P1, P5 determines
whether the packet's next header is an MRH.
When the next header is an MRH, P5 duplicates the packet
for each sub-tree from P5 and
sends the packet copy with an updated MRH to the next hop
along the sub-tree. P5 gets the next hops to the egress nodes using its
Node Index IPv6 Table with the node indexes of the egress nodes,
which are 4, 5 and 6. PE4, PE5 and PE6 have the same next hop
P4 according to the table. P5 sets the DA of the packet copy to P4's IPv6 address.
P5 updates the MRH based on the encoding of the tree in
.
shows the packet to be sent to P4, which is received by P4.
After receiving the IPv6 packet from P5, P4 determines
whether the packet's next header is an MRH.
When the next header is the MRH, P4 duplicates the packet
for each sub-tree from P4 and
sends the packet copy with an updated MRH to the next hop
along the sub-tree. P4 gets the next hops to the egress nodes using its
Node Index IPv6 Table with the node indexes of the egress nodes,
which are 4, 5 and 6.
The next hops to PE4, PE5 and PE6 are PE4, PE5 and PE6 themselves
according to the table.

P4 sends the copy with MRH containing SL = 0 to each of
PE4, PE5 and PE6.
The packet received by PE4 is shown in
.

When a leaf/egress such as PE4 receives an IPv6 packet
with MRH having SL = 0, the leaf/egress sends the IP multicast
packet to the multicast layer of the leaf/egress.

This section describes the procedure at
the ingress of a P2MP path/tree, and
the BE multicast forwarding procedure which can be used
at every node (i.e., ingress, transit and egress) of the tree.

In one implementation, for a packet to be transported by a P2MP
Path/tree, the ingress of the tree duplicates the packet
for each sub-tree of the tree branching from the ingress,
encapsulates the packet copy in a MRH containing the sub-tree and
sends the encapsulated packet copy to the next hop node along
the sub-tree. For example, there is one sub-tree branching from the ingress
of the tree from ingress PE1 via next hop node P1 towards PE2 to PE6
in .
The sub-tree is from ingress PE1 via next hop node P1 towards PE2 to PE6.
Ingress PE1 sends P1 the packet as illustrated
in .
In another implementation, for a packet to be transported by a P2MP
Path/tree, the ingress of the tree encapsulates the packet in a MRH
containing the tree and "sends" the encapsulated packet to the ingress
itself through calling the BE multicast forwarding procedure of the ingress
as shown in .
This procedure duplicates the encapsulated packet for each sub-tree of
the tree branching from the ingress and sends the copy to the next hop
node along the sub-tree.
For example, suppose that there is a P2MP path/tree
from ingress PE1 to egresses PE2, PE3, PE4, PE5 and PE10
in .
There are two sub-trees branching from the ingress PE1 of the tree.
One is from ingress PE1 via next hop node P1 towards PE2 to PE5;
the other is from ingress PE1 to egress PE10.
For a packet to be transported by the tree,
ingress PE1 encapsulates the packet in a MRH containing the tree and
calls the BE multicast forwarding procedure of PE1.
The procedure duplicates the encapsulated packet for each of these
two sub-trees branching from PE1 and sends the copy to
the next hop node along the sub-tree.
When receiving an IPv6 packet with an MRH containing
a tree/sub-tree, a node duplicates the packet for each sub-tree
branching from the node and sends the packet copy with an updated
MRH to the next hop along the sub-tree.
The number of sub-trees branching from the node is the number of
different next hop nodes from the node to the egress nodes of
the tree.
The node determines the different next hops to the egress nodes
using the Node Index Forwarding Table of the node with the node
indexes of the egress nodes.
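Counting the sub-trees amounts to grouping the egress node indexes by next hop. The sketch below does this for P1 and the tree towards PE2 to PE6, using the next hops given for P1's table in this document. The flat dictionary and the function name are simplifications chosen for the example.

```python
from collections import defaultdict

def subtrees_by_next_hop(egress_indexes, next_hop_of):
    """Group egress node indexes by the index of their next hop."""
    groups = defaultdict(list)
    for idx in egress_indexes:
        groups[next_hop_of[idx]].append(idx)
    return dict(groups)

# P1's next hops: PE1 and PE10 via PE1 (1), PE2 and PE3 via P2 (12), PE4 to PE7 via P5 (15)
p1_next_hop_of = {1: 1, 10: 1, 2: 12, 3: 12, 4: 15, 5: 15, 6: 15, 7: 15}
subtrees = subtrees_by_next_hop([2, 3, 4, 5, 6], p1_next_hop_of)
print(len(subtrees), subtrees)                 # 2 {12: [2, 3], 15: [4, 5, 6]}
```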
shows a
BE Multicast Forwarding Procedure.
The execution of the procedure for an IPv6 packet with an MRH
at a node duplicates the packet for each sub-tree branching
from the node and sends the packet copy with an updated MRH to
the next hop along the sub-tree.
Initially, Pkt-p is the IPv6 packet received by node N.
At step 1, the procedure checks whether the tree from N in Pkt-p's MRH
has no egress node index.
If the tree does not have any egress node index,
the procedure discards Pkt-p and returns;
otherwise (i.e., the tree has some egress node indexes), the procedure
proceeds to the next step (i.e., step 2).
SL and SE in the MRH indicate the start and end of the tree from N
respectively.
If each NodeIndex and BitString in the tree are zeros, the tree
does not have any egress node index. In one option, for a
flexible bitstring with a StartIndex and a BitString, when the
BitString becomes zeros, the StartIndex is set to zero (0). In this
case, if each NodeIndex and StartIndex in the tree are zeros,
the tree does not have any egress node index.
At step 2, the procedure finds the first egress node index J in
the tree from N in Pkt-p's MRH. J is the first node index represented
by a NodeIndex with value J or represented indirectly by a
flexible bitstring.
At step 3, the procedure checks if node index J is the index of
node N itself. If so, the procedure duplicates Pkt-p to Pkt-c,
decapsulates the packet copy (i.e., Pkt-c),
sends the decapsulated packet copy (i.e., IP multicast datagram/packet)
to the IP multicast forwarding module, clears node index J in the tree
from N in Pkt-p's MRH, and goes to step 1;
otherwise (i.e., node index J is not the node index of N),
the procedure proceeds to the next step (i.e., step 4).
Clearing node index J in the tree is setting NodeIndex
to 0 when node index J is represented by NodeIndex with
value J, or setting the bit for the node index J to 0 in the
BitString when node index J is represented by the
BitString.
At step 4, the procedure gets the next hop IPv6 address
(NH-IPv6 for short) and the BM-SNH from Node Index Forwarding Table of
N using node index J as the "index" into the table.
At step 5, the procedure duplicates Pkt-p to Pkt-c,
removes the egress node indexes from the tree from N in packet copy's MRH
(i.e., Pkt-c's MRH) that do not have the same next hop as node index J,
sets DA of the packet copy to NH-IPv6, and sends the copy to DA
(i.e., the next hop).
Removing the egress node indexes from the tree
that do not have the same next hop as node index J is
logically ANDing each BitString with the BM-SNH's bits corresponding
to the BitString (i.e., BitString = BitString AND BM-SNH's bits
corresponding to BitString), and setting each NodeIndex to 0 when
node index in NodeIndex does not have the same next hop as node index
J.
At step 6, the procedure removes the egress node indexes having
the same next hop as node index J from the tree from N in Pkt-p's MRH,
and then goes to step 1.
Removing the egress node indexes from the tree that
have the same next hop as node index J is logically ANDing each
BitString with INVERSE of the BM-SNH's bits corresponding to the
BitString (i.e., BitString = BitString AND ~BM-SNH's bits
corresponding to BitString), and setting each NodeIndex field to 0
when node index in the field has the same next hop as node index J.
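The two mask operations in steps 5 and 6 can be sketched for a single BitString as follows. The BM-SNH is held as a string whose leftmost character stands for node index 1, which is a representation chosen for this example, and both function names are hypothetical. The values are those of P1 handling the tree with StartIndex = 2 and BitString = 0b11111000.

```python
def bm_slice(bm_snh, start_index, nbits):
    """BM-SNH bits for node indexes start_index .. start_index + nbits - 1, as an int."""
    value = 0
    for idx in range(start_index, start_index + nbits):
        bit = 1 if 1 <= idx <= len(bm_snh) and bm_snh[idx - 1] == "1" else 0
        value = (value << 1) | bit
    return value

def split_bitstring(bitstring, start_index, bm_snh, nbits=8):
    """Return (bits for the copy sent to the next hop, bits left in Pkt-p)."""
    mask = bm_slice(bm_snh, start_index, nbits)
    keep = bitstring & mask                          # step 5: BitString AND BM-SNH bits
    rest = bitstring & ~mask & ((1 << nbits) - 1)    # step 6: BitString AND ~BM-SNH bits
    return keep, rest

# First egress is PE2 (next hop P2), then PE4 (next hop P5)
keep_p2, rest = split_bitstring(0b11111000, 2, "0110000000")
keep_p5, rest_after = split_bitstring(rest, 2, "0001111000")
print(f"{keep_p2:08b} {rest:08b} {keep_p5:08b} {rest_after:08b}")
# 11000000 00111000 00111000 00000000
```

Once the remaining BitString is all zeros, the tree in Pkt-p has no egress node index left and step 1 ends the procedure.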
After or while changing the tree in the MRH, each of steps 3, 5 and 6
also updates SL and SE to indicate the start and end of the
tree/sub-tree in the MRH respectively,
wherein the updated SL points to the first flexible bitstring
with a bit having value 1 or the first NodeIndex with a value
greater than 0,
and the updated SE is the size of the tree/sub-tree from the
start pointed by the updated SL to the last flexible bitstring
with a bit having value 1 or the last NodeIndex with a value
greater than 0.
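The overall control flow of steps 1 to 6 can be sketched at the level of sets of node indexes, leaving out the byte-level encoding and the SL and SE updates. The sketch assumes that the "first" egress node index is the smallest remaining one (the node indexes in a sub-tree are ordered) and models each table row as a next hop index plus the set of indexes sharing that next hop; these modelling choices and the function name are not part of the procedure itself.

```python
def be_multicast_forward(node_index, tree, nift):
    """Return the list of actions a node takes for one received packet.

    tree: set of egress node indexes carried in the MRH.
    nift: egress index -> (next hop index, set of indexes with that same next hop).
    """
    remaining = set(tree)
    actions = []
    while remaining:                              # step 1: stop when the tree is empty
        j = min(remaining)                        # step 2: first egress node index J
        if j == node_index:                       # step 3: J is this node
            actions.append(("deliver", node_index, [j]))
            remaining.discard(j)
            continue
        next_hop, same_next_hop = nift[j]         # step 4: table lookup by J
        subtree = remaining & same_next_hop       # step 5: keep indexes sharing J's next hop
        actions.append(("send", next_hop, sorted(subtree)))
        remaining -= same_next_hop                # step 6: remove them from Pkt-p
    return actions

# Ingress PE1 with the tree towards PE2, PE3, PE4, PE5 and PE10
shared_p1 = set(range(2, 10))                     # PE2 to PE9 share next hop P1 (index 11)
pe1_nift = {i: (11, shared_p1) for i in shared_p1}
pe1_nift[10] = (10, {10})
actions = be_multicast_forward(1, {2, 3, 4, 5, 10}, pe1_nift)
print(actions)   # [('send', 11, [2, 3, 4, 5]), ('send', 10, [10])]
```

This reproduces the two sub-trees branching from PE1 in the earlier example: one via P1 towards PE2 to PE5 and one to PE10.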
For general IPv6 and IPv6 extension header security
considerations, see .
More TBD

IANA is requested to assign
a new Routing Type in the subregistry
"Routing Types" under registry
"Internet Protocol Version 6 (IPv6) Parameters"
as follows:
TBD