《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
forward model. Figure 3-12: An example of back-translation. The EN=>DE model in this case is facebook/wmt19-en-de and the DE=>EN model is facebook/wmt19-de-en. Table 3-4 shows the performance comparison of regular
0 码力 | 56 pages | 18.93 MB | 1 year ago

keras tutorial
Downloading https://files.pythonhosted.org/packages/a8/76/220ba4420459d9c4c9c9587c6ce607bf56c25b3d3d2de62056efe482dadc/seaborn-0.9.0-py3-none-any.whl (208kB)
100% |████████████████████████████████|
Downloading https://files.pythonhosted.org/packages/c3/8b/af9e0984f5c0df06d3fab0bf396eb09cbf05f8452de4e9502b182f59c33b/matplotlib-3.1.1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64
0 码力 | 98 pages | 1.57 MB | 1 year ago

Deep Learning and PyTorch: A Hands-On Introduction - 38. Convolutional Neural Networks
Animation https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050 Notation Input_channels: Kernel_channels: 2 ch Kernel_size: Stride: Padding: Multi-Kernels
0 码力 | 14 pages | 1.14 MB | 1 year ago

Deep Learning and PyTorch: A Hands-On Introduction - 37. What Is Convolution
Receptive Field https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050 Weight sharing ▪ ~60k parameters ▪ 6 Layers http://yann.lecun.com/exdb/publis/pdf/lecun-89e
0 码力 | 18 pages | 1.14 MB | 1 year ago

Experiment 2: Logistic Regression and Newton's Method
threshold $\epsilon$, i.e. $|L^{+}(\theta) - L(\theta)| \le \epsilon$ (7) Try to solve the logistic regression problem using the gradient descent method with the initialization $\theta = 0$, and answer the following questions: 1. Assume $\epsilon = 10^{-6}$
0 码力 | 4 pages | 196.41 KB | 1 year ago

PyTorch Brand Guidelines
(Digital) Orange Light 1 (Digital) Orange (Digital) Orange (Print) #B92B0F #F05F42 #F2765D #DE3412 Orange (Print) C00, M61, Y72, K00 Pantone 171 C Secondary Colors When designing content
0 码力 | 12 pages | 34.16 MB | 1 year ago

Lecture 3: Logistic Regression
then $p(y \mid x; \theta) = \frac{1}{1 + \exp(-y\theta^{T}x)}$. Assuming the training examples were generated independently, we define the likelihood of the parameters as $L(\theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \prod_{i=1}^{m} (h_\theta(x^{(i)}))^{y^{(i)}} (1 \ldots$
0 码力 | 29 pages | 660.51 KB | 1 year ago

Lecture 7: K-Means
Divisive Clustering. Bisecting K-means: repeat the 2-means algorithm until we have the desired number of clusters. MST-based method: build a minimum spanning tree from the dissimilarity
0 码力 | 46 pages | 9.78 MB | 1 year ago

Machine Learning Course - Wenzhou University - 13 Machine Learning - Artificial Neural Networks
PAPERT, et al. Perceptrons: An Introduction to Computational Geometry[J]. The MIT Press, 1969. [6] DE Rumelhart, Hinton G E, Williams R J. Learning Representations by Back Propagating Errors[J]. Nature
0 码力 | 29 pages | 1.60 MB | 1 year ago

Dive into Deep Learning v2.0
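sequence accuracy, because the model no longer needs to remember the entire sequence before it starts generating a new one.
• Multi-stage designs. Examples include memory networks (Sukhbaatar et al., 2015) and the neural programmer-interpreter (Reed and De Freitas, 2015). They allow statistical modelers to describe iterative approaches to reasoning. These tools allow the internal state of a deep neural network to be modified repeatedly, thereby carrying out subsequent steps in a chain of reasoning, similar to how a processor modifies memory during a computation.
• Another key development was generative adversarial networks

In the previous sections we learned some basic tools for training deep networks and techniques for regularizing them (such as weight decay and dropout). In this section we put that knowledge into practice through a Kaggle competition. The Kaggle house price prediction competition is a good starting point. The dataset was collected by Bart de Cock in 2011 (De Cock, 2011) and covers house prices in Ames, Iowa, from 2006 to 2010. It is fairly generic and does not require a complex model architecture. It is considerably larger than the Boston housing dataset of Harrison and Rubinfeld, and also has more

13. Computer Vision
#@save
d2l.DATA_HUB['banana-detection'] = (
    d2l.DATA_URL + 'banana-detection.zip',
    '5de26c8fce5ccdea9f91267273464dc968d20d72')

13.6.2 Reading the Dataset
Using the read_data_bananas function, we read the banana detection dataset. The dataset includes a CSV file containing the object class labels
0 码力 | 797 pages | 29.45 MB | 1 year ago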
11 results in total