绕过conntrack,使用eBPF增强 IPVS优化K8s网络性能
access • Major modes • Iptables • IPVS Iptables mode • How it works • DNAT at PREROUTING chain • SNAT at POSTROUTING chain • Pros • Iptables is widely adopted in popular Linux distributions • Cons difficult to debug IPVS mode • Services are organized in hash table • IPVS DNAT • conntrack/iptables SNAT • Pros • O(1) time complexity in control/data plane • Stably runs for two decades • Support rich ingress/egress packets IPVS bypass conntrack • Why IPVS depends on conntrack? • Iptables/conntrack SNAT • How IPVS bypasses conntrack? • Ingress • Move IPVS Netfilter hook from local-in to PREROUTING0 码力 | 24 页 | 1.90 MB | 1 年前3腾讯云 Kubernetes 高性能网络技术揭秘——使用 eBPF 增强 IPVS 优化 K8s 网络性能-范建明
NodePort 提供集群外部的访问 iptables mode • 在netfilter pre-routing阶段做DNAT • 在netfilter post-routing阶段做SNAT • 每个service 添加一条或多条rules。使用数组管理rules。 • 仅支持随机的调度算法 • kube-proxy代码实现比较简单 • iptables 在linux iptables rule 不容易调试 IPVS mode • 使用hashtable 管理service • IPVS 仅仅提供了DNAT,还需要借用 iptables+conntrack 做SNAT • 控制面和数据面算法复杂度都是O(1) • 经历了二十多年的运行,比较稳定成熟 • 支持多种调度算法 优势 IPVS mode 不足之处 • 没有绕过conntrack,由此带来了性能开销 编译成eBPF中间代码 • 注入内核 • 挂载到network traffic control • 报文激发eBPF代码 技术创新点一 • IPVS 对conntrack的功能依赖 • Iptables SNAT • 具体如何绕过conntrack? • 进报文 • 将处理请求的钩子从nf local-in 前移到nf pre-routing • skb的路由指针是NULL • 处理分片 •0 码力 | 27 页 | 1.19 MB | 9 月前3Cilium v1.8 Documentation
resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. Excluding the lines for global { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" } }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" }0 码力 | 1124 页 | 21.33 MB | 1 年前3Cilium v1.9 Documentation
resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. Excluding the lines for eni=true }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" } "/etc/cni/net.d/calico-kubeconfig" } }, { "type": "portmap", "snat": true, "capabilities": {"portMappings": true} }, { "type": "cilium-cni"0 码力 | 1263 页 | 18.62 MB | 1 年前3Cilium v1.10 Documentation
resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. Excluding the lines for eni.enabled=true }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" } "/etc/cni/net.d/calico-kubeconfig" } }, { "type": "portmap", "snat": true, "capabilities": {"portMappings": true} }, { "type": "cilium-cni"0 码力 | 1307 页 | 19.26 MB | 1 年前3Cilium v1.11 Documentation
resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. To set up Cilium overlay mode default). 2. Flush iptables rules added by VPC CNI iptables -t nat -F AWS-SNAT-CHAIN-0 \\ && iptables -t nat -F AWS-SNAT-CHAIN-1 \\ && iptables -t nat -F AWS-CONNMARK-CHAIN-0 \\ && iptables -t }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" }0 码力 | 1373 页 | 19.37 MB | 1 年前3Cilium的网络加速秘诀
driver kube-proxy DNAT kube-proxy SNAT worker node nodePort request backend endpoint tc eBPF NAT XDP eBPF NAT DSR 加速南北向 nodePort 访问 传统的 nodePort 转发,伴随着 SNAT的发生。而 Cilium 为 nodePort 提供了 native redirect_neigh step1 client -> node1 : nodePort step3 client -> pod2 : targetPort native DSR DNAT and No SNAT step4 pod2:targetPort -> client step6 node2 : nodePort -> client client step5 node2 : nodePort0 码力 | 14 页 | 11.97 MB | 1 年前3Cilium v1.7 Documentation
resources outside the cluster (e.g., VMs in the VPC or AWS managed services) is masqueraded (i.e., SNAT) by Cilium to use the VPC IP address of the Kubernetes worker node. Excluding the lines for global }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" } }, { "type": "portmap", "capabilities": {"portMappings": true}, "snat": true }, { "name": "cilium", "type": "cilium-cni" }0 码力 | 885 页 | 12.41 MB | 1 年前3CentOS 7 操作命令-基础篇1.2
REJECT DROP nat 地址转换 PREROUTING OUTPUT POSTROUTING 地址转换发生在路由之前 DNAT 转换由系统生成的包 地址转换发生在路由之后 SNAT DNAT REDIRECT SNAT security ...... ③iptables 命令工具 本地进程 93 iptables 语法: iptables -t filter 操作 链名 匹配规则 -j -A INPUT -i ens33 -m mac --mac-source 00:04:0d:33:33:21 -j DROP SNAT #iptables -t nat -A POSTROUTING -o ens37 -s 192.68.1.0/24 -j SNAT --to 200.1.1.2 或者转换为出接口的 IP(pppoe 拨号时) #iptables -t nat -A POSTROUTING0 码力 | 115 页 | 8.68 MB | 1 年前3运维上海2017-Kubernetes 在大规模场景下的service性能优化实战 - 杜军
��� Ø DR�Tunneling�����������IPVS Director Ø NAT���������������IPVS Director - �������� DNAT���SNAT DR Tunneling NAT IPVS����� • �VVIP ü dummy�G # ip link add dev dummy0 type dummy # ip addr add @Huawei���1.9�beta • ��ClusterIP�NodePort�External IP�Load Balancer�OnlyLocalNode… • ��iptables�SNAT����� Kubernetes�Service�� Iptables��Service���� ��iptables������� IPVS��Service���� Iptables0 码力 | 38 页 | 3.39 MB | 1 年前3
共 25 条
- 1
- 2
- 3