GPU Resource Management On JDOS
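GPU Resource Management On JDOS Liang Yongqing (梁永清) liangyongqing1@jd.com Services provided: 1. GPU containers for experiments 2. Machine learning training service based on Kubeflow 3. Model management and model serving service Experiment Training Serving All are container-based; GPU physical machines are not provided directly to business teams GPU experiments JDOS regular container service
11 pages | 13.40 MB | 1 year ago

Node Operator: Kubernetes Node Management Made Simple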
Node Operator: Kubernetes Node Management Made Simple Chen Jun (陈俊, Joe), Ant Financial Agenda • Background and Motivation • Introduction of Operators • Node-Operator • Advanced Topic: • Upgrade Master & Node Components reliably • Canary Rollout • Master & Node Component Versions Management Motivation: Work Order Deployment Worker Order • Upgrade Nodes Versions • Upgrade Node 10.10 Complicated architecture Work order deployment system cannot meet the requirements of resource management. Operator Observe Action Analyze • Observe: watch desired resource and actual resource
18 pages | 11.70 MB | 1 year ago

QCon Beijing 2018: "Kubernetes: Future-Oriented Development and Deployment" - Michael Chen
Very manual, no fault tolerance, hard to scale, etc • Scheduling, provisioning, and resource management of multiple containers – Docker, Mesos → Kubernetes Support – AWS, Azure, Google → Kubernetes ContainerImage2 Replicas: 2 Kubernetes 101 at the Highest Level • Container Cluster = “Desired State Management” – Kubernetes Cluster Services (w/API) • Node = Container Host w/agent called “Kubelet” • Application Contains all state known about cluster • Kubernetes Front-end Control Plane • Provides RESTful interface • Returns state objects as JSON • Provides core control loops for platform • Watches shared state
42 pages | 10.97 MB | 1 year ago

01. An Analysis of K8s Extension Features
Kubernetes User Interface | Application Catalog | Monitoring | Logging Management Plane Infrastructure Services - Policy Management - Cluster Operations - User Management - Lifecycle Management Infrastructure • Support for extensible admission controllers • Pluggable cloud providers • Container runtime interface (CRI) enhancements © 2017 Rancher Labs, Inc. CustomResourceDefinition (CRD) • What CRD provides
12 pages | 1.08 MB | 1 year ago

Go Programming Pattern in Kubernetes Philosophy
patterns of Kubernetes (Controller, codegen etc) • Write your own Controller • gRPC based interface design in Kubernetes (CRI as example) • For Kubernetes users: • Effective pattern of programming hood • Internal systems or commercial software Kubernetes • The container orchestration and management project created by Google • Successor of Google Borg/Omega system • One of the most popular Pattern 3: gRPC based Interface • Decouple Kubernetes from external dependencies • kubelet -> gRPC -> dockershim -> dockerd • go2idl: gogoprotobuf based protobuf gen CRI Management kubelet Workloads
29 pages | 2.12 MB | 1 year ago

VMware SIG Deep Dive into Kubernetes Scheduling
placement options, for both control plane and worker nodes. 2 levels of scheduling and resource management are active. Currently no automatic scheduling integration occurs, that is, Kubernetes is not to solve potential issues with CPU and memory intensive workloads Kubernetes default resource management How it works Extending the functionality of Kubernetes Using vSphere DRS with Kubernetes High pre-container era Active discussions regarding Kubernetes enhancements going on now in Resource Management Working Group – please join in • See Issue #49964 14 Using a NUMA aware hypervisor to solve
28 pages | 1.85 MB | 1 year ago

Putting an Invisible Shield on Kubernetes Secrets
complicated! ✓ User access management => raw and extensive! ✓ Secrets management => crucial! • Financial-grade security [1] KubeCon China 2018: Node Operator: Kubernetes Node Management Made Simple - Joe Chen extensions-webhook: /mutating-secret • Annotation: /storage-transform-disable=• Emergency management • High Availability guarantee • KMS • API server & kms-plugin • Cron job backup for KEKs (from com/occlum/occlum Occlum: SGX Dev Made Easy Occlum: Major Features Occlum: Container-Inspired Interface Demo • The purpose of this demo is to • Demonstrate TEE Transparency w/ Occlum’s Golang support
33 pages | 20.81 MB | 1 year ago

A First Look at Amazon Elastic Kubernetes Service (EKS)
Confidential Amazon VPC CNI plugin Elastic network interface Secondary IPs: 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.2 Elastic network interface 10.0.0.20 10.0.0.22 Secondary IPs: 10.0.0.20 10.0 Outbound Traffic SNAT EKS worker node Primary elastic network interface Pod Secondary elastic network interface Pod – 100.64.0.200 © 2019, Amazon Web Services, Inc. or its Affiliates Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential AWS Identity and Access Management (IAM) authentication Kubectl 3) Authorizes AWS identity with RBAC K8s API 1) Passes AWS identity 2)
39 pages | 1.83 MB | 1 year ago

Building a Standard, Extensible Cloud-Native Application Management Platform on Kubernetes - Sun Jianbo (孙健波), Zhou Zhengxi (周正喜)
KubeVela = OAM Kubernetes Runtime + Capability Center + UI (Cli + Dashboard) KubeVela • User interface layer - CLI/Dashboard/Appfile • KubeVela core - OAM Kubernetes Runtime to provide application controllers to implement core capabilities such as webservice, route and rollout etc. - Capability Management Cloud-native application management: thousand-member DingTalk discussion group. A user-friendly, highly extensible, standardized application management engine is coming soon, stay tuned! https://github.com/oam-dev/kubevela References e/master/cloudnativeto- presentation-20201029/kubevela - Application operations - Route - Scale - Capability management https://github.com/zzxwill/try-cloudnative/tree/master/capabilities - Integration: Dashboard/Cli → OpenAPI
27 pages | 3.60 MB | 9 months ago

VMware SIG Intro to the vSphere Cloud Provider
dependency, the cloud-controller-manager was introduced. CSI provider for vSphere • Container Storage Interface (CSI) is a standard API allowing a storage provider to write just one plugin that will work for Kubernetes project to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management. It provides optional, additive functionality on top of core Kubernetes. Minikube driver for Fusion plug-in mechanism • allows Kubernetes to operate on different storage systems through a standard interface What it does The CSI spec reached 1.0 and has been released as stable/GA with Kubernetes v1.13
12 pages | 425.38 KB | 1 year ago