Bridging the Gap: Writing Portable Programs for CPU and GPU
Bridging the Gap: Writing Portable Programs for CPU and GPU using CUDA. Thomas Mejstrik, Sebastian Woblistin. Content: 1 Motivation: Audience etc., Cuda crash course, Quiz time. 2 Patterns: Oldschool afterwards ... Motivation, Patterns, The dark path, Cuda proposal, Thank you. Why write programs for CPU and GPU: Difference CPU/GPU: Algorithms are designed differently, Latency/Throughput, Memory bandwidth, Number of talk ... Why write programs for CPU and GPU: Difference CPU/GPU. Why it makes sense? Library/Framework developers, Embarrassingly parallel algorithms
0 points | 124 pages | 4.10 MB | 5 months ago
How and When You Should Measure CPU Overhead of eBPF Programs
How and When You Should Measure CPU Overhead of eBPF Programs. Bryce Kahle, Datadog, October 28, 2020. Why should I profile eBPF programs? CI variance tracking. Tools: kernel.bpf_stats_enabled kernel ...
0 points | 20 pages | 2.04 MB | 1 year ago
The Zig Programming Language 0.4.0 Documentation
... by this number. You can use @alignOf to find out this value for any type. Alignment depends on the CPU architecture, but is always a power of two, and less than 1 << 29. In Zig, a pointer type has an alignment ... hermit, hurd, wasi, zen, uefi, }; pub const Arch = union(enum) { arm: Arm32, armeb: Arm32, aarch64: Arm64, aarch64_be: Arm64, arc, avr, bpfel, bpfeb, hexagon ... riscv64, sparc, sparcv9, sparcel, s390x, tce, tcele, thumb: Arm32, thumbeb: Arm32, i386, x86_64, xcore, nvptx, nvptx64, le32, le64, amdil ...
0 points | 207 pages | 5.29 MB | 1 year ago
The Zig Programming Language 0.5.0 Documentation
... by this number. You can use @alignOf to find out this value for any type. Alignment depends on the CPU architecture, but is always a power of two, and less than 1 << 29. In Zig, a pointer type has an alignment ... wasi, emscripten, zen, uefi, }; pub const Arch = union(enum) { arm: Arm32, armeb: Arm32, aarch64: Arm64, aarch64_be: Arm64, aarch64_32: Arm64, arc, avr, ... riscv64, sparc, sparcv9, sparcel, s390x, tce, tcele, thumb: Arm32, thumbeb: Arm32, i386, x86_64, xcore, nvptx, nvptx64, le32, le64, amdil ...
0 points | 224 pages | 5.80 MB | 1 year ago
Kotlin 1.2 Language Documentation
Kotlin/Native supports the following platforms: iOS (arm32, arm64, emulator x86_64); MacOS (x86_64); Android (arm32, arm64); Windows (mingw x86_64); Linux (x86_64, arm32, MIPS, MIPS little endian); WebAssembly (wasm32) ... nextPrintTime = startTime var i = 0 while (i < 5) { // computation loop, just wastes CPU // print a message twice a second if (System.currentTimeMillis() >= nextPrintTime) ... Array<String>) = runBlocking { val channel = Channel<Int>() launch { // this might be heavy CPU-consuming computation or async logic, we'll just send five squares for (x in 1..5) channel ...
0 points | 333 pages | 2.22 MB | 1 year ago
Linux Lab v1.3 Manual
... features (e.g. gc-sections) breaks the other kernel features on the main CPU architectures. These scripts use qemu-system-ARCH as the cpu/board simulator; basic boot+function tests have been done for ftrace+perf ... user group, contact WeChat: tinylab, WeChat official account: 泰晓科技 ... If you use it often, please increase disk storage to 100G~200G, memory to 8G, and CPU cores to 4 or more. Currently, all X86_64 systems that support Docker should be able to run Linux ... aarch64/raspi3 ]: ARCH = arm64 CPU ?= cortex-a53 LINUX ?= v5.1 ROOTDEV_LIST := /dev/mmcblk0 /dev/ram0 ROOTDEV ?= /dev/mmcblk0 [ aarch64/virt ]: ARCH = arm64 CPU ?= cortex-a57 LINUX ?= ...
0 points | 66 pages | 1.12 MB | 1 year ago
Linux Lab v1.2 Manual
... features (e.g. gc-sections) breaks the other kernel features on the main CPU architectures. These scripts use qemu-system-ARCH as the cpu/board simulator; basic boot+function tests have been done for ftrace+perf ... unpredictable exceptions ... If you use it often, please increase disk storage to 100G~200G, memory to 8G, and CPU cores to 4 or more. Currently, all X86_64 systems that support Docker should be able to run Linux ... aarch64/raspi3 ]: ARCH = arm64 CPU ?= cortex-a53 LINUX ?= v5.1 ROOTDEV_LIST := /dev/mmcblk0 /dev/ram0 ROOTDEV ?= /dev/mmcblk0 [ aarch64/virt ]: ARCH = arm64 CPU ?= cortex-a57 LINUX ?= ...
0 points | 67 pages | 1.13 MB | 1 year ago
Linux Lab v1.1 Manual
... features (e.g. gc-sections) breaks the other kernel features on the main CPU architectures. These scripts use qemu-system-ARCH as the cpu/board simulator; basic boot+function tests have been done for ftrace+perf ... unpredictable exceptions ... If you use it often, please increase disk storage to 100G~200G, memory to 8G, and CPU cores to 4 or more. Currently, all X86_64 systems that support Docker should be able to run Linux ... aarch64/raspi3 ]: ARCH = arm64 CPU ?= cortex-a53 LINUX ?= v5.1 ROOTDEV_LIST := /dev/mmcblk0 /dev/ram0 ROOTDEV ?= /dev/mmcblk0 [ aarch64/virt ]: ARCH = arm64 CPU ?= cortex-a57 LINUX ?= ...
0 points | 65 pages | 1.12 MB | 1 year ago
Kotlin Language Documentation 1.3
... supports the following platforms: iOS (arm32, arm64, simulator x86_64); MacOS (x86_64); Android (arm32, arm64); Windows (mingw x86_64, x86); Linux (x86_64, arm32, MIPS, MIPS little endian, Raspberry Pi) ... var nextPrintTime = startTime var i = 0 while (i < 5) { // computation loop, just wastes CPU // print a message twice a second if (System.currentTimeMillis() >= nextPrintTime) ... function that was invoked. The unconfined dispatcher is appropriate for coroutines which neither consume CPU time nor update any shared data (like UI) confined to a specific thread. Unconfined vs confined dispatcher ...
0 points | 597 pages | 3.61 MB | 1 year ago
PyArmor Documentation v5.9.5
FATAL ERROR: This OpenCV build doesn't support current CPU / HW configuration. Use OPENCV_DUMP_CONFIG=1 environment variable for details ... cpp:538: error: (-215: Assertion failed) Missing support for required CPU baseline features. Check OpenCV build configuration and required CPU / HW setup. in function 'initialize' One solution is to specify ... for core routines, they're generated in runtime. Besides, the pre-built dynamic library for linux arm32/64 are packed into the source package. Fixed issues: • The module multiprocessing starts new process ...
0 points | 131 pages | 428.65 KB | 1 year ago
1000 results in total