IP Multicast HLD #1808

Open
wants to merge 6 commits into
base: master

philo-micas
Contributor

This HLD introduces the IPMC data plane implementation in SONiC, including database changes, Linux multicast route listening and handling, and orchagent IPMC support.

@philo-micas philo-micas mentioned this pull request Sep 14, 2024
| sai_ipmc_group_api->create_ipmc_group_member | SAI_RPF_GROUP_MEMBER_ATTR_RPF_GROUP_ID</br>SAI_RPF_GROUP_MEMBER_ATTR_RPF_INTERFACE_ID |
| sai_rpf_group_api->create_ipmc_group | NULL |
| sai_ipmc_group_api->create_ipmc_group_member | SAI_IPMC_GROUP_MEMBER_ATTR_IPMC_GROUP_ID</br>SAI_IPMC_GROUP_MEMBER_ATTR_IPMC_OUTPUT_ID |
| sai_ipmc_api->create_ipmc_entry | SAI_IPMC_ENTRY_ATTR_OUTPUT_GROUP_ID</br>SAI_IPMC_ENTRY_ATTR_PACKET_ACTION</br>SAI_IPMC_ENTRY_ATTR_RPF_GROUP_ID |

Collaborator

Please include SAI_IPMC_ENTRY_ATTR_COUNTER_ID


- Run the IP multicast packet forwarding command to show the configuration of the multicast packet forwarding function on the interface

Collaborator

It would be helpful to have a CLI to display the IPMC entries and SAI_IPMC_ENTRY_ATTR_COUNTER_ID


- Reads IP multicast routing messages from the kernel and listens for IP multicast routing changes in the kernel
- Based on netlink messages, the source IP address, destination IP address, inbound interface member, and outbound member of the IP multicast route are resolved and written into the APPL DB
- During a warm reboot, compares the kernel multicast routes with the APPL DB data and updates the warm reboot data accordingly
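
The warm-reboot comparison in the last bullet can be pictured as a dictionary diff. The following is a minimal sketch, not the HLD's implementation: the function name `reconcile` and the in-memory dictionaries are hypothetical stand-ins for the kernel route dump and the APPL DB contents.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical in-memory model: (vrf, source, group) identifies one route,
# and the value is (inbound interface, set of outbound interfaces).
RouteKey = Tuple[str, str, str]
RouteVal = Tuple[str, FrozenSet[str]]


def reconcile(
    kernel: Dict[RouteKey, RouteVal],
    appl_db: Dict[RouteKey, RouteVal],
) -> Tuple[Dict[RouteKey, RouteVal], Set[RouteKey]]:
    """Return (entries to write, keys to delete) so APPL DB matches the kernel."""
    # Routes that are new in the kernel or whose interfaces changed.
    to_set = {k: v for k, v in kernel.items() if appl_db.get(k) != v}
    # Routes left in APPL DB that the kernel no longer has.
    to_del = set(appl_db) - set(kernel)
    return to_set, to_del


kernel_routes = {
    ("default", "10.0.0.1", "232.1.1.1"): ("Ethernet0", frozenset({"Ethernet4"})),
    ("default", "10.0.0.2", "232.1.1.2"): ("Ethernet0", frozenset({"Ethernet8"})),
}
appl_db_routes = {
    ("default", "10.0.0.2", "232.1.1.2"): ("Ethernet0", frozenset({"Ethernet4"})),
    ("default", "10.0.0.3", "232.1.1.3"): ("Ethernet0", frozenset({"Ethernet4"})),
}
to_set, to_del = reconcile(kernel_routes, appl_db_routes)
```

Unchanged routes appear in neither result, so they would not be reprogrammed after the reboot.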

Collaborator

Need a mechanism to propagate the IPMC counters in the hardware (SAI_IPMC_ENTRY_ATTR_COUNTER_ID) to the kernel/FRR.

Contributor Author

> Need a mechanism to propagate the IPMC counters in the hardware (SAI_IPMC_ENTRY_ATTR_COUNTER_ID) to the kernel/FRR.

In the SAI definition, SAI_IPMC_ENTRY_ATTR_COUNTER_ID means 'packet hits count', which I understand as the statistic for IPMC hit packets.
Why do we need to pass IPMC packet statistics to kernel/FRR? It seems that there is no use case for this at the moment.

IP multicast is a network communication technique that allows a single sender to send packets to multiple destinations without sending a separate copy to each destination. This saves bandwidth and improves network efficiency. Depending on how group members select sources, multicast service models are divided into ASM (Any-Source Multicast) and SSM (Source-Specific Multicast). To ensure efficient transmission of multicast data, IP multicast uses the RPF (Reverse Path Forwarding) mechanism.

Routers use protocols such as IGMP, PIM, and MSDP to dynamically build and maintain a multicast forwarding table, which records the membership of each multicast group and the corresponding outbound interfaces.

Collaborator

@venkatmahalingam venkatmahalingam Sep 17, 2024

Can you request a slot to review this HLD in the Routing WG? FYI @eddieruan-alibaba
A few questions from the community meeting today.

  1. Why didn't we explore the fpmsyncd path for programming the multicast routes? We learned that pimd programs the kernel directly. What if we have a BGP IPv4 multicast configuration? Won't there be an MRTM (part of Zebra) to consolidate multicast routes from different protocols (e.g. pimd, BGP, static, etc.)? We need to discuss this with the FRR folks and decide.

Contributor

Can you describe your use case as well?

I second @venkatmahalingam's comment. It is better to explore the fpmsyncd approach and program mroutes directly from pimd instead of via the Linux kernel, for the following two reasons.

  1. Scale and performance
  2. Feature velocity

Both considerations are related to your use case and roadmap.

Contributor Author

No problem. Next, we will request a Routing WG review of the HLD. As for choosing between fpmsyncd and the kernel, both solutions have significant advantages, so it's hard for me to decide :) Let's leave this discussion until after we have talked to the Routing WG and seen their feedback.

; Store IP multicast routing data

key = MROUTE_TABLE:vrf_name|source_ip|dest_ip
; APPL DB usually uses ':' as the separator, but '|' is chosen here.

Collaborator

Can we have key prefixes e.g MROUTE_TABLE:vrf_name:src-<source_ip>:dst-<dest_ip> to get rid of this issue?

Contributor Author

> Can we have key prefixes e.g MROUTE_TABLE:vrf_name:src-<source_ip>:dst-<dest_ip> to get rid of this issue?

Great suggestion, but we still need to parse src- and dst-. Maybe using @src-ip would be a bit more concise?
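
To make the separator question concrete, here is a minimal sketch of building and parsing the `MROUTE_TABLE:vrf_name|source_ip|dest_ip` key from the schema under review. The helper names `make_mroute_key` and `parse_mroute_key` are hypothetical and not part of the HLD; the sketch only illustrates why an IPv6 address makes a ':'-separated key ambiguous while '|' stays unambiguous.

```python
import ipaddress

TABLE = "MROUTE_TABLE"  # table name taken from the schema under review


def make_mroute_key(vrf: str, source_ip: str, dest_ip: str) -> str:
    """Build 'MROUTE_TABLE:vrf|source|dest' (hypothetical helper)."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(dest_ip)
    if not dst.is_multicast:
        raise ValueError(f"{dest_ip} is not a multicast group address")
    return f"{TABLE}:{vrf}|{src}|{dst}"


def parse_mroute_key(key: str):
    """Split a key back into (vrf, source, dest)."""
    # Only the first ':' separates the table name; later ':' characters
    # belong to IPv6 addresses.
    table, _, rest = key.partition(":")
    if table != TABLE:
        raise ValueError(f"unexpected table name in key: {key!r}")
    vrf, src, dst = rest.split("|")
    return vrf, str(ipaddress.ip_address(src)), str(ipaddress.ip_address(dst))


v6_key = make_mroute_key("Vrf1", "2001:db8::1", "ff3e::1234")
v4_key = make_mroute_key("default", "10.0.0.1", "232.1.1.1")

# With ':' as the only separator, the IPv6 fields can no longer be told apart.
ambiguous_parts = "MROUTE_TABLE:Vrf1:2001:db8::1:ff3e::1234".split(":")
```

Prefixes such as `src-` and `dst-` would remove the ambiguity as well, at the cost of the extra parsing step mentioned in the reply above.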


#### mgmanager

Follow the nexthop group design, and manage the Group and Members separately:

Collaborator

@venkatmahalingam venkatmahalingam Sep 17, 2024

We don't generally follow manager naming convention to handle routes from ASIC, can we rename it to mroutegrouporch or combine this functionality part of mrouteorch?

Contributor Author

> We don't generally follow manager naming convention to handle routes from ASIC, can we rename it to mroutegrouporch or combine this functionality part of mrouteorch?

  1. 'combine this functionality part of mrouteorch' is not a good solution. Judging by the nexthop group example, the combined code would become very difficult to maintain as development continues, due to code redundancy.
  2. Since the mroute group doesn't have an APPL DB entry, naming it with 'orch' would give us an mroutegrouporch that does not use doTask to generate the group, which feels a bit odd.
  3. So we used 'manager' to indicate the management of the mroute group.


The frr-pimd/pim6d daemon process is introduced in the FRR Container, in order to learn IP multicast routes under the PIM protocol and install IP multicast routes to the kernel.

The data plane design uses the Linux kernel as the multicast route source for three reasons: first, the fpm component does not currently support IP multicast routing; second, reading from the kernel supports as many IP multicast protocols as possible; third, the implementation and support of the FRR container are beyond the design scope of this document.

Similar to fpmsyncd, for IPMC can we consider using ipmcfpmsyncd instead of mroutesyncd ?
This channel can also be used to get multicast enabled interface updates from kernel along with mroute entries.

Contributor Author

> Similar to fpmsyncd, for IPMC can we consider using ipmcfpmsyncd instead of mroutesyncd ? This channel can also be used to get multicast enabled interface updates from kernel along with mroute entries.

The decision between using fpmsyncd or the kernel will be made after further discussion with the Routing WG. We will update the HLD accordingly.

ECMP routing already has an elegant nexthop group design and implementation, so IP multicast routing follows that design and uses mgmanager to manage the ipmc group and rpf group.

The following diagram summarizes the key structure of IPMC functionality in SONiC:

The Linux kernel notifies multicast interface creation or deletion using RTM_NEWNETCONF and RTM_DELNETCONF messages.
Given this, can the same path used for mroute updates also be applied for multicast-enabled interfaces?

Contributor Author

> The Linux kernel notifies multicast interface creation or deletion using RTM_NEWNETCONF and RTM_DELNETCONF messages. Given this, can the same path used for mroute updates also be applied for multicast-enabled interfaces?

Since the current interface configurations are all passed through config DB, reading the multicast-enabled setting separately from the kernel doesn't seem necessary.


- Listens to MROUTE_TABLE in the APPL DB and queries the corresponding rpf group and ipmc group
- Associates the corresponding rpf group and ipmc group IDs, and invokes the SAI API to create IP multicast routes.

Please take care of defining CoPP rules and multicast control and data packets trapping behaviour when multicast feature is enabled.

```text
RTM_NEWROUTE
RTM_DELROUTE
```
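
As a rough illustration of how a listener might pick multicast route updates out of these two message types, the sketch below decodes the fixed `nlmsghdr` and `rtmsg` headers from raw bytes. The numeric constants follow the Linux `rtnetlink.h` definitions; the function name `classify_mroute_msg` is hypothetical, and opening the netlink socket and parsing route attributes are left out.

```python
import struct

# Values from linux/rtnetlink.h.
RTM_NEWROUTE = 24
RTM_DELROUTE = 25
RTN_MULTICAST = 5

# struct nlmsghdr: len (u32), type (u16), flags (u16), seq (u32), pid (u32)
NLMSGHDR = struct.Struct("=IHHII")
# struct rtmsg: family, dst_len, src_len, tos, table, protocol, scope,
# type (u8 each), then flags (u32)
RTMSG = struct.Struct("=BBBBBBBBI")


def classify_mroute_msg(msg: bytes):
    """Return 'add', 'del', or None for messages that are not multicast routes."""
    if len(msg) < NLMSGHDR.size + RTMSG.size:
        return None  # too short to hold both fixed headers
    _length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(msg, 0)
    if msg_type not in (RTM_NEWROUTE, RTM_DELROUTE):
        return None
    fields = RTMSG.unpack_from(msg, NLMSGHDR.size)
    rtm_type = fields[7]
    if rtm_type != RTN_MULTICAST:
        return None  # unicast and other route types are ignored here
    return "add" if msg_type == RTM_NEWROUTE else "del"


def build_msg(msg_type: int, rtm_type: int) -> bytes:
    """Assemble a synthetic header-only message for demonstration."""
    body = RTMSG.pack(128, 32, 32, 0, 254, 17, 0, rtm_type, 0)
    return NLMSGHDR.pack(NLMSGHDR.size + len(body), msg_type, 0, 1, 0) + body


new_mroute = build_msg(RTM_NEWROUTE, RTN_MULTICAST)
del_mroute = build_msg(RTM_DELROUTE, RTN_MULTICAST)
unicast_route = build_msg(RTM_NEWROUTE, 1)  # RTN_UNICAST
```

The source, group, and interface details would come from the route attributes that follow these headers, which this sketch does not decode.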

Please update the netlink messages used for multicast-enabled interfaces.


- The MgManager exposes group query and management interfaces
- Internally, RpfMember and RpfGroup manage reference counting and exception handling, and invoke the SAI API to generate the corresponding group ID

Can you please add details on how IPMC groups and RPF groups shared b/w multicast routes ?

Contributor Author

> Can you please add details on how IPMC groups and RPF groups shared b/w multicast routes ?

Yes, it will be revised in the next commit.
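
Until the HLD describes the sharing model, one possible reading of "reference counting" is sketched below. This is an assumption, not the author's design: the class name `GroupManager`, the `acquire`/`release` methods, and keying groups by their member set are all hypothetical, and the integer IDs stand in for object IDs that a SAI create call would return.

```python
import itertools
from typing import Dict, FrozenSet, Iterable, List


class GroupManager:
    """Hypothetical ref-counted store: routes with the same members share one group."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        # member set -> [group_id, reference count]
        self._groups: Dict[FrozenSet[str], List[int]] = {}

    def acquire(self, members: Iterable[str]) -> int:
        """Return the group ID for this member set, creating it on first use."""
        key = frozenset(members)
        entry = self._groups.get(key)
        if entry is None:
            # A real implementation would create the group and its members
            # through SAI at this point.
            entry = [next(self._next_id), 0]
            self._groups[key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, members: Iterable[str]) -> bool:
        """Drop one reference; return True when the group was removed."""
        key = frozenset(members)
        entry = self._groups[key]
        entry[1] -= 1
        if entry[1] == 0:
            # Last route is gone, so the group could be removed from hardware.
            del self._groups[key]
            return True
        return False


mgr = GroupManager()
first = mgr.acquire(["Ethernet0", "Ethernet4"])
second = mgr.acquire(["Ethernet4", "Ethernet0"])  # same members, same group
removed_early = mgr.release(["Ethernet0", "Ethernet4"])
removed_last = mgr.release(["Ethernet0", "Ethernet4"])
```

Keying on the member set means member order does not matter, and a group is only removed when the last multicast route that uses it goes away.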

Description: A table was added to store IP multicast routing data. One entry corresponds to one multicast route.

Schema:

APP_DB schema is missing for INTF_TABLE to store multicast forwarding on interface.

Contributor Author

> APP_DB schema is missing for INTF_TABLE to store multicast forwarding on interface.

Yes, it will be revised in the next commit.


#### Scalability and performance requirements

In terms of capacity, the SAI API does not provide a capacity query interface for IP multicast routes and multicast member groups as it does for ECMP routes. Therefore, this design does not restrict the relevant capacity. However, the device must continue to run normally when the capacity is exceeded.

SAI switch attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY is available to query capacity of IP multicast routes, please include.

Contributor Author

> SAI switch attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY is available to query capacity of IP multicast routes, please include.

OK, it will be revised in the next commit. The focus here is on the number of groups and the number of members within each group.

| ---------------------------------------------------- | -------------- |
| sai_router_intfs_api->set_router_interface_attribute | SAI_ROUTER_INTERFACE_ATTR_V4_MCAST_ENABLE</br>SAI_ROUTER_INTERFACE_ATTR_V6_MCAST_ENABLE |
| sai_rpf_group_api->create_rpf_group | NULL |
| sai_ipmc_group_api->create_ipmc_group_member | SAI_RPF_GROUP_MEMBER_ATTR_RPF_GROUP_ID</br>SAI_RPF_GROUP_MEMBER_ATTR_RPF_INTERFACE_ID |

typo ? it should be sai_rpf_group_api->create_rpf_group_member

Contributor Author

> typo ? it should be sai_rpf_group_api->create_rpf_group_member

Yes, it will be revised in the next commit.

5 participants