TVM Meetup: Quantization
Quantization Overview • Represent FP32 numbers with lower-precision INT8 numbers • Integer number stands as a proxy for FP32 number (not a downcast) • Quantized tensor is Quantization Relay Int8 Graph Framework Pre-quantized Graph MXNet Parser TF Parser QNN Graph Using QNN Dialect QNN passes Target-independent Relay passes Target-optimized Int8 Relay Graph Intel Intel x86 schedule ARM CPU schedule Nvidia GPU schedule ARM GPU schedule Relay Int8 Graph Target-dependent Relay layout opt © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Outline
0 credits | 19 pages | 489.50 KB | 5 months ago
julia 1.10.10
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14)
0 credits | 1692 pages | 6.34 MB | 3 months ago
Julia 1.10.9
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14)
0 credits | 1692 pages | 6.34 MB | 3 months ago
Julia 1.11.5 Documentation
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2007 pages | 6.73 MB | 3 months ago
Julia 1.11.6 Release Notes
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2007 pages | 6.73 MB | 3 months ago
Julia 1.11.4
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2007 pages | 6.73 MB | 3 months ago
PAI & TVM Meetup - Shanghai 20191116
Computing Platform Business Unit • TensorCore AutoCodeGen in TVM • FP16 Mixed-Precision Training on PAI • INT8 Inference on PAI-Blade Computing Platform Business Unit COMPUTING PLATFORM TensorCore AutoCodeGen Background Matching Computing Platform Business Unit shared/global local fp16/int8 40X 1.26X 1.51X 1.30X 1.21X Performance on T4 [chart legend: cuBLAS INT8, TVM INT8, TVM INT4, TVM INT1] cublas baseline (512, 64, 512) (512
0 credits | 26 pages | 5.82 MB | 5 months ago
julia 1.13.0 DEV
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2058 pages | 7.45 MB | 3 months ago
Julia 1.12.0 Beta4
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2057 pages | 7.44 MB | 3 months ago
Julia 1.12.0 Beta3
primitive numeric types: • Integer types: Type Signed? Number of bits Smallest value Largest value Int8 ✓ 8 -2^7 2^7 - 1 UInt8 8 0 2^8 - 1 Int16 ✓ 16 -2^15 2^15 - 1 UInt16 16 0 2^16 - 1 Int32 ✓ 32 -2^31 (-2147483648, 2147483647) julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128] println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]") end Int8: [-128,127] Int16: [-32768,32767] different forms. julia> Int8(127) 127 julia> Int8(128) ERROR: InexactError: trunc(Int8, 128) Stacktrace: [...] julia> Int8(127.0) 127 julia> Int8(3.14) ERROR: InexactError: Int8(3.14) Stacktrace:
0 credits | 2057 pages | 7.44 MB | 3 months ago
16 results in total