TVM Meetup: Quantization
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
19 pages | 489.50 KB

Quantization Overview
• Represent FP32 numbers with lower-precision INT8 numbers
• Integer number stands as a proxy for the FP32 number (not a downcast)
• Quantized tensor is [...]

[Diagram labels: Framework Pre-quantized Graph; MXNet Parser; TF Parser; QNN Graph (using QNN Dialect); QNN passes; Relay Int8 Graph; Target-independent Relay passes; Target-dependent Relay layout opt; Target-optimized Int8 Relay Graph; Intel x86 schedule; ARM CPU schedule; Nvidia GPU schedule; ARM GPU schedule]

Outline
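The "proxy, not a downcast" point in the Quantization Overview bullets can be illustrated with a small sketch. The excerpt is cut off before it says how a quantized tensor is represented, so the affine scheme below (a real-valued `scale` plus an integer `zero_point`) is an assumption, and the helper names `quantize` and `dequantize` are hypothetical. They are not TVM or QNN API calls.

```python
# Minimal sketch of affine INT8 quantization (assumed scheme, not taken from the slides).
# real_value is approximated by scale * (int_value - zero_point).

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map an FP32 value to its INT8 proxy, saturating at the INT8 range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Recover the approximate FP32 value that the integer stands for."""
    return scale * (q - zero_point)

scale, zero_point = 0.5, 10   # hypothetical quantization parameters

q = quantize(0.26, scale, zero_point)
approx = dequantize(q, scale, zero_point)
downcast = int(0.26)          # a plain cast just discards the fraction

print(q, approx, downcast)    # 11 0.5 0
```

With these parameters the integer 11 carries no meaning on its own; it only represents roughly 0.5 once the scale and zero point are applied, whereas a plain cast of 0.26 yields 0. Values outside the representable range saturate at the INT8 limits.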