Bring Your Own Codegen to TVM
subgraphs 1. Implement an operator-level annotator, OR 2. Implement a graph-level annotator. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Option 1: Operator-Level Annotation ● Implement Boolean functions in the template: def conv2d(attrs, args): return is_float32(args) (slide callouts: Relay operator name; operator attributes and args (inputs) can be checked as well; return True/False for this op) ... After Device General Devices (CPU/GPU/FPGA) Mark supported operators or subgraphs 1. Implement extern operator functions, OR 2. Implement a graph annotator
0 credits | 19 pages | 504.69 KB | 5 months ago
TVM Meetup: Quantization
scratch • New Relay passes and TVM schedules required • AlterOpLayout, Graph Fusion etc. require work per operator • No reuse of existing Relay and TVM infrastructure. Option 2 – Lower to a sequence of existing ... 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Lowering of QNN Quantize Operator: fn (%input_data: Tensor[(2, 5), float32]) { qnn.quantize(%input_data, out_dtype="uint8", output_zero_point=127 ... QNN Conv2D Operator • Calculations are different from FP32 Conv2D https://discuss.tvm.ai/t/tf-lite-quantized-conv2d-operator-conversion/2651/8 real_value = scale ...
0 credits | 19 pages | 489.50 KB | 5 months ago
Dynamic Model in TVM
function ● Relax type inference/checking for Any at compilation time ● Register a shape function for operator to check the type and compute the output shape ● Shape function has two modes (op_attrs, input_tensors ... © 2019, Amazon Web Services, Inc. or its Affiliates
0 credits | 24 pages | 417.46 KB | 5 months ago
TVM: Where Are We Going
cuDNN: Offload to heavily optimized DNN operator library. Frameworks. Limitations of Existing Approach: cuDNN, Frameworks. New operator introduced by operator fusion optimization, potential benefit: System, High-level data flow graph and optimizations. Directly generate optimized program for new operator workloads and hardware. Hardware, Frameworks. Why Automation is the Future: Clear winner on ...
0 credits | 31 pages | 22.64 MB | 5 months ago
OpenAI - AI in the Enterprise
help guide your own thinking. Product Note: Operator. Operator is an example of OpenAI’s agentic approach. Leveraging its own virtual browser, Operator can navigate the web, click on buttons, fill ... that previously required human intervention, such as: Automating software testing and QA using Operator to interact with web apps like a real user, flagging any UI issues. Updating systems of record ...
0 credits | 25 pages | 9.48 MB | 5 months ago
Facebook -- TVM AWS Meetup Talk
3400us (baseline), 40us (target) - 85x speedup - Uh oh. Enter, TVM and model co-design - PyTorch operator overhead makes interpreter infeasible - Reduce FLOPs with block-sparsified weight matrices - ...
0 credits | 11 pages | 3.08 MB | 5 months ago
Google "Prompt Engineering v7"
`f` string syntax for string interpolation is more readable and concise than the traditional `+` operator. 4. The code doesn’t handle errors that might occur during the renaming process. It would be better ...
0 credits | 68 pages | 6.50 MB | 6 months ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Although training an MoE model will introduce additional communication overheads, through our operator and communication optimizations, the training for DeepSeek-V2 can attain a relatively high Model ...
0 credits | 52 pages | 1.23 MB | 1 year ago
Trends Artificial Intelligence
(1/25), Amazon (3/25) AI Agent Deployments = AI Incumbent Product Launches Accelerating: OpenAI Operator (1/25 = Research Preview Release), Salesforce Agentforce (10/24 = General Release), Anthropic Claude ...
0 credits | 340 pages | 12.14 MB | 4 months ago
9 results in total