DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI (research@deepseek.com). Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training … Appendix B describes DeepSeek-V2-Lite, a 16B model equipped with MLA and DeepSeekMoE … 1. Introduction: In the past few years, Large Language Models (LLMs) (Anthropic, 2023; Google, 2023; OpenAI, 2022, 2023) have undergone rapid development. 0 码力 | 52 pages | 1.23 MB | 1 year ago
breakthrough large language models (LLMs) that – in effect – found freedom with the November 2022 launch of OpenAI's ChatGPT, with its extremely easy-to-use / speedy user interface. In addition, relatively … [chart: ≈260% annual growth over fifteen years of data used to train AI models] Note: only "notable" language models shown (per Epoch AI: includes a state-of-the-art improvement on a recognized benchmark, >1K …). FLOPs are often used to estimate the computational cost of training or running a model. 0 码力 | 340 pages | 12.14 MB | 4 months ago
… they could offer more and better insights to clients. They started with three model evals: 01 Language translation — measuring the accuracy and quality of translations produced by a model. 02 Summarization — … candidate why this specific job was recommended to them. Indeed uses the data-analysis and natural-language capabilities of GPT-4o mini to shape these "why" statements in its emails and messages to jobseekers … style, and context. Consistent tone and style: for a retailer, that could mean every product description stays true to brand voice; for a law firm, it means properly formatted citations, every time. 0 码力 | 25 pages | 9.48 MB | 5 months ago
Summary; Endnotes. Prompt Engineering, February 2025. Introduction: When thinking about a large language model's input and output, a text prompt (sometimes accompanied by other modalities such as images) … evaluating a prompt's writing style and structure in relation to the task. In the context of natural language processing and LLMs, a prompt is an input provided to the model to generate a response or prediction … such as text summarization, information extraction, question answering, text classification, language or code translation, code generation, and code documentation or reasoning. Please feel free to … 0 码力 | 68 pages | 6.50 MB | 6 months ago
Foundations; Guardrails; Conclusion. Practical guide to building agents — Introduction: Large language models are becoming increasingly capable of handling complex, multi-step tasks. Advances in reasoning … security reviews. 03 Heavy reliance on unstructured data: scenarios that involve interpreting natural language, extracting meaning from documents, or interacting with users conversationally, for example … and prevent redundant definitions. Broadly speaking, agents need three types of tools (Type / Description / Examples): Data — enable agents to retrieve the context and information necessary for executing the workflow … 0 码力 | 34 pages | 7.00 MB | 5 months ago
… groundwork for improved multi-language support for exposing runtime and IRs. Unified Object Protocol: vm::Object — NDArray | Rd | tuple/closure | AST nodes. Cross-language support; easy to introduce … 0 码力 | 16 pages | 1.77 MB | 5 months ago
tvm::runtime::Module: GetFunction(string) -> tvm::runtime::PackedFunc; SaveToBinary/LoadFromBinary — the runtime Module interface and its subclasses. Unified runtime benefit: mod.export_library("mylib.so") gives unified library packaging. Free … k = tvm.reduce_axis((0, 8)); C = tvm.compute((8, 8), lambda y, x: tvm.sum(A[k, y] * B[k], axis=k)) — hardware interface specification by tensor expression; tensorization. VTA: an open and flexible deep learning accelerator … Flexible Deep Learning Acceleration, Moreau et al., IEEE Micro 2019. VTA hardware/software interface (ISA); VTA microarchitecture; VTA simulator — compiler, driver, and hardware design: a full-stack open … 0 码力 | 31 pages | 22.64 MB | 5 months ago
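The tensor-expression line quoted in this result defines C[y, x] = Σₖ A[k, y]·B[k]. A minimal plain-Python sketch of the same reduction semantics (no TVM dependency; `A`, `B`, and `compute_C` are hypothetical stand-ins for the placeholders on the slide — real TVM code would use `tvm.te.placeholder` / `tvm.te.compute`):

```python
# Plain-Python sketch of the reduction in the quoted TVM snippet:
#   C[y, x] = sum over k of A[k, y] * B[k]
# Every column x of row y holds the same dot product, mirroring the
# fact that the lambda's body does not mention x.

def compute_C(A, B, n=8):
    """A: n x n matrix (first index is the reduction axis k), B: length-n vector."""
    return [[sum(A[k][y] * B[k] for k in range(n)) for x in range(n)]
            for y in range(n)]

# With A[k][y] = 1 iff k == y, every entry of row y collapses to B[y].
A = [[1 if col == row else 0 for col in range(8)] for row in range(8)]
B = list(range(8))
C = compute_C(A, B)
```

The point of writing the computation this way in TVM is that the reduction axis `k` is an explicit object, which is what makes tensorization (mapping the inner loop nest onto a hardware intrinsic) possible.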
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. VM bytecode (Instruction — Description): Move — moves data from one register to another; Ret — returns the object in the result register to the caller's … 0 码力 | 24 pages | 417.46 KB | 5 months ago
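The two bytecode entries quoted above can be illustrated with a toy register VM. Opcode names follow the snippet; the register model and the `run` loop are hypothetical stand-ins, not the Relay VM's actual internals:

```python
# Toy register VM for the two instructions in the quoted table:
#   Move: copy data from one register to another
#   Ret:  return the object in the named result register to the caller

def run(instructions, registers):
    for op, *args in instructions:
        if op == "Move":
            src, dst = args
            registers[dst] = registers[src]   # Move src -> dst
        elif op == "Ret":
            (result,) = args
            return registers[result]          # hand the object back
    raise RuntimeError("program ended without Ret")

regs = {"r0": 42, "r1": None}
result = run([("Move", "r0", "r1"), ("Ret", "r1")], regs)
```

A bytecode interpreter like this is what lets dynamic models (shape-dependent control flow, variable-length loops) run without compiling a fixed graph ahead of time.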
… kernel, strides, padding, dilation, layout, out_dtype):  # describe the algorithm with the tensor expression language; return the output operation — how to compute. @autotvm.register_topi_schedule(schedule_conv2d_nchw, … 0 码力 | 12 pages | 1.94 MB | 5 months ago
Essential basics: to understand DeepSeek-R1 in depth, you first need the fundamentals of LLMs, including how they work, their architecture, and how they are trained. In recent years, the rapid development of artificial intelligence (AI) has driven the rise of Large Language Models (LLMs). LLMs play an increasingly important role in natural language processing (NLP) and are widely used for intelligent question answering, text generation, code writing, and machine translation. An LLM is a deep-learning-based AI model whose core goal is … 0 码力 | 11 pages | 2.64 MB | 7 months ago
10 results in total · Page 1