Bring Your Own Codegen to TVM
Presenters: Zhi Chen, Cody Yu (Amazon SageMaker Neo, Deep Engine Science). Bring Your Own Codegen to TVM. AWS AI. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ... Considering testing.mobilenet.get_workload(batch_size=1) 3. Partition and build the network with an external codegen: mod = relay.build_extern(mod, "dnnl") 4. Run the inference: exe = relay.create_executor("vm", mod=mod ... System Overview: Relay IR; Graph Annotation with Your Annotator; Graph Partitioning; Your Codegen; LLVM, CUDA, Metal, VTA; Serialized Subgraph Library; Relay Runtime (VM, Graph Runtime, Interpreter)
19 pages | 504.69 KB | 5 months ago

Make Successor Build Systems: World Tour of Build Systems
Will it CMake? ... and linking ...
set_property(GLOBAL PROPERTY JOB_POOLS link=1 compile=16 codegen=16)
set_property(TARGET atarget PROPERTY JOB_POOL_COMPILE compile)
set_property(TARGET atarget PROPERTY JOB_POOL_LINK link)
add_custom_target(protocgen COMMAND protoc --cpp_out=./out server.proto JOB_POOL codegen SOURCES t ) ...
115 pages | 7.02 MB | 5 months ago

Dynamic Model in TVM
... function to compute the type at runtime
● Virtual machine as a new runtime for Relay
● Dynamic codegen (WIP)
○ Kernel dispatch for a single op
○ Graph dispatch for a (sub-)graph
In collaboration with ... © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Dynamic codegen: op dispatch (proposal)
● Goal: support codegen for dynamic shape
● Challenges
○ Single kernel performs poorly across different ... coupled together
Dynamic codegen: kernel dispatch (proposal)
Relay op: conv2d; Default function; FTVMStrategy; A generic function ...
24 pages | 417.46 KB | 5 months ago

Object Introspection: A Revolutionary Memory Profiler for C++ Objects
... bar_vec, ret); }) .consume([&t](auto ret) { return getSizeType(t.foo_str, ret); }); } CodeGen: class & struct: struct Bar { std::string str; }; struct Foo { int a; int ... ret); }); } return tail; } template class std::vector {...}; CodeGen: containers. /* Generated from Debug Info */ static types::st::Unit getSizeType( const Bar& ... { return returnArg .consume([&t](auto ret) { return getSizeType(t.str, ret); }); } ...
62 pages | 2.24 MB | 5 months ago

2021-11-22 - Rust CTCFT - Rust for Linux
Moonshot: rust-analyzer support (e.g. "▶ Run Test | Debug"). ... Codegen quality: minimal source code example 1: struct Example(Option<u32>); impl Drop for Example { ... self.0.take(); } } pub fn example() -> u32 { Example(Some(10u32)).0.take().unwrap() } Codegen quality: output: example::example: pushq %rbx subq $16, %rsp movabsq $42949672961 ... movl $10, %eax retq When unwrap_unchecked is used instead. ... Codegen quality: example 2, minimal source code: use std::ptr::read_volatile; pub unsafe fn test1(ptr: *const ...
53 pages | 332.50 KB | 9 months ago

PAI & TVM Meetup - Shanghai 20191116
... schedule ... Normal schedule: the schedule for CUDA codegen. IR Passes: need to satisfy TensorCore Intrinsics; kind of Auto Tensorization. CUDA CodeGen: IR passes to automatically transform sub-tree to TensorCore Intrinsics. Pattern Matching ... compute_local ... Performance Optimization: same as non-TensorCore CUDA codegen; auto tune tiling sizes; vectorized load/store for higher bandwidth utilization; double buffer ...
26 pages | 5.82 MB | 5 months ago

Julia 1.11.4
... as our C locks) to prevent recursion when doing certain operations (incremental package loading, codegen, etc.). The combination of a lock and this flag can be used to make finalizers safe. 2. A second ... initializes the global jl_root_task struct; and sets jl_current_task to the root task. jl_init_codegen() initializes the LLVM library. jl_init_serializer() initializes 8-bit serialization tags for builtin ... handled by codegen.cpp. Whenever a Julia function is called for the first time with a given set of argument types, type inference will be run on that function. This information is used by the codegen step ...
2007 pages | 6.73 MB | 3 months ago

Julia 1.11.5 Documentation
... as our C locks) to prevent recursion when doing certain operations (incremental package loading, codegen, etc.). The combination of a lock and this flag can be used to make finalizers safe. 2. A second ... initializes the global jl_root_task struct; and sets jl_current_task to the root task. jl_init_codegen() initializes the LLVM library. jl_init_serializer() initializes 8-bit serialization tags for builtin ... handled by codegen.cpp. Whenever a Julia function is called for the first time with a given set of argument types, type inference will be run on that function. This information is used by the codegen step ...
2007 pages | 6.73 MB | 3 months ago

Julia 1.11.6 Release Notes
... as our C locks) to prevent recursion when doing certain operations (incremental package loading, codegen, etc.). The combination of a lock and this flag can be used to make finalizers safe. 2. A second ... initializes the global jl_root_task struct; and sets jl_current_task to the root task. jl_init_codegen() initializes the LLVM library. jl_init_serializer() initializes 8-bit serialization tags for builtin ... handled by codegen.cpp. Whenever a Julia function is called for the first time with a given set of argument types, type inference will be run on that function. This information is used by the codegen step ...
2007 pages | 6.73 MB | 3 months ago

Rust Language Study Notes (Rust 语言学习笔记)
... false # controls the `-C lto` flag, which affects how executables and static libraries are generated
debug-assertions = true # controls whether debug assertions are enabled
codegen-units = 1 # controls the compiler's `-C codegen-units` flag. Note: when `lto = true`, this field is ignored
# release profile, used by `cargo build --release`
[profile ... debug-assertions = false codegen-units = 1
# test profile, used by `cargo test`
[profile.test] opt-level = 0 debug = true rpath = false lto = false debug-assertions = true codegen-units = 1
# benchmark profile, used by `cargo ... debug-assertions = false codegen-units = 1
# doc profile, used by `cargo doc`
[profile.doc] opt-level = 0 debug = true rpath = false lto = false debug-assertions = true codegen-units = 1
5.2.4 feature ...
117 pages | 2.24 MB | 1 year ago