【PyTorch深度学习-龙龙老师】-测试版202112
…voice assistants on phones, intelligent driver assistance in cars, face-recognition payment, and so on. The sections below introduce some mainstream applications of deep learning in three areas: computer vision, natural language processing, and reinforcement learning.
1.4.1 Computer Vision
Image classification is a common classification problem: the network takes image data as input and outputs a probability distribution over the classes for the current sample, and the class with the highest probability is usually taken as the predicted class. Image classification was one of the earliest tasks to which deep learning was successfully applied; classic network models include … 3D video-understanding tasks, which carry information along the time dimension, are receiving more and more attention. Common video-understanding tasks include video classification, action detection, and video subject extraction; commonly used models include C3D, TSN, DOVF, and TS_LSTM. Image generation learns the distribution of real images and samples from the learned distribution to obtain highly realistic generated images. Common generative models include the VAE family and the GAN family; the GAN family in particular has made enormous progress in recent years, and the latest …
    …torchvision                                        # import the vision library
    from matplotlib import pyplot as plt                # plotting utilities
    from utils import plot_image, plot_curve, one_hot   # convenience plotting helpers
    batch_size = 512                                     # batch size
    # training set; the MNIST dataset is downloaded automatically and saved to mnist_data
439 pages | 29.91 MB | 1 year ago
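The code fragment at the end of that excerpt is cut off mid-import. Purely for reference, a minimal torchvision setup that matches its comments (automatic MNIST download, batch size 512) might look like the sketch below; it is not the book's exact listing, and the book's plot_image/one_hot helpers from its local utils module are omitted here.

    import torch
    import torchvision
    from torchvision import transforms

    batch_size = 512
    train_ds = torchvision.datasets.MNIST(
        root='mnist_data', train=True, download=True,
        transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(
        train_ds, batch_size=batch_size, shuffle=True)

    images, labels = next(iter(train_loader))
    print(images.shape, labels.shape)  # torch.Size([512, 1, 28, 28]) torch.Size([512])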
动手学深度学习 v2.0
    import torchvision
    from PIL import Image
    from torch import nn
    from torch.nn import functional as F
    from torch.utils import data
    from torchvision …
…the ImageNet dataset was released and the ImageNet challenge launched: researchers were asked to train models on one million samples in order to distinguish 1,000 different classes of objects. The ImageNet dataset was built by researchers in Stanford professor Fei-Fei Li's group, who used Google Image Search to pre-screen candidate images for each class and Amazon Mechanical Turk to label the relevant class of each picture. This scale was unprecedented. The challenge, known as ImageNet, pushed forward research in computer vision and machine learning, …
…training inputs', ylabel='Sorted testing inputs') …
10.2.4 Parametric Attention Pooling
Nonparametric Nadaraya-Watson kernel regression has the benefit of consistency: given enough data, the model converges to the optimal solution. Nevertheless, we can easily integrate learnable parameters into the attention pooling. For example, slightly differently from (10.2.6), in the following the distance between the query x and the key x_i is multiplied by a learnable parameter w: …
797 pages | 29.45 MB | 1 year ago
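The parametric pooling described at the end of that excerpt (a Gaussian kernel over the query-key distance scaled by a learnable parameter w) can be written down compactly. The sketch below is an illustration of the quoted formula, not code copied from the book, and the class name is invented.

    import torch
    from torch import nn

    class ParametricNWRegression(nn.Module):
        """Nadaraya-Watson kernel regression with a learnable width parameter w."""
        def __init__(self):
            super().__init__()
            self.w = nn.Parameter(torch.rand(1))

        def forward(self, queries, keys, values):
            # queries: (n,); keys and values: (n, m), i.e. m key/value pairs per query.
            diff = queries.unsqueeze(-1) - keys                 # query-key distances
            attn = torch.softmax(-((diff * self.w) ** 2) / 2, dim=-1)
            return (attn * values).sum(dim=-1)                  # attention-weighted average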
PyTorch Brand Guidelines
…the symbol should never be smaller than a minimum of 24 pixels on screen or 10 mm in print. This ensures the consistency and legibility of the symbol. Minimum screen size: 24 px. Minimum print size: 10 mm. …
12 pages | 34.16 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
…the target label is a composite of the inputs that were combined. A combination of a dog with a hamster image (figure 3-5) is assigned a composite [dog, hamster] label! [footnote: a whale's tail fins are called flukes] … problems. Figure 3-5: A mixed composite of a dog (30%) and a hamster (70%). The label assigned to this image is a composite of the two classes in the same proportion. Thus, the model would be expected to predict … a dataset N times the size? What are the constraining factors? An image transformation recomputes the pixel values: rotating a 100x100 RGB image requires at least 100x100x3 (3 channels) computations …
56 pages | 18.93 MB | 1 year ago
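The excerpt is describing mixup-style augmentation, in which two inputs and their one-hot labels are blended in the same proportion. Below is a minimal NumPy sketch of that idea; it is illustrative only, and the function name and the Beta-distributed mixing coefficient are assumptions rather than the book's code.

    import numpy as np

    def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng()):
        """Blend two examples and their one-hot labels in the same proportion."""
        lam = rng.beta(alpha, alpha)        # mixing coefficient, e.g. 0.3 vs. 0.7
        x = lam * x1 + (1.0 - lam) * x2     # pixel-wise blend of the two images
        y = lam * y1 + (1.0 - lam) * y2     # composite label, e.g. [0.3 dog, 0.7 hamster]
        return x, y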
PyTorch Release Notes
…experience. In the container, see /workspace/README.md for information about customizing your PyTorch image. For more information about PyTorch, including tutorials, documentation, and examples, see: ‣ PyTorch … for NGC containers, when you run a container, the following occurs: ‣ The Docker engine loads the image into a container which runs the software. ‣ You define the runtime resources of the container by … in your system depends on the DGX OS version that you installed (for DGX systems), the NGC Cloud Image that was provided by a Cloud Service Provider, or the software that you installed to prepare to run …
365 pages | 2.94 MB | 1 year ago
keras tutorial
…in data-science fields such as robotics, artificial intelligence (AI), audio and video recognition, and image recognition. Artificial neural networks are the core of deep-learning methodologies. Deep learning … keras/keras.json:
    keras.json
    {
        "image_data_format": "channels_last",
        "epsilon": 1e-07,
        "floatx": "float32",
        "backend": "tensorflow"
    }
Here, image_data_format represents the data format. … simply change the backend to "theano" in the keras.json file, as described below:
    keras.json
    {
        "image_data_format": "channels_last",
        "epsilon": 1e-07,
        "floatx": "float32",
        "backend": "theano"
    }
98 pages | 1.57 MB | 1 year ago
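A quick way to confirm which backend and image data format the keras.json settings actually put into effect is the small check below (a sketch assuming the multi-backend Keras that this tutorial describes).

    from keras import backend as K

    print(K.backend())            # e.g. 'tensorflow' or 'theano', per keras.json
    print(K.image_data_format())  # e.g. 'channels_last'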
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
…a high-quality image of a cat. The cat on the right is a lower-quality, compressed image. … Both cat images in figure 2-2 might serve their purpose equally well, but the compressed image is an order of magnitude smaller. The Discrete Cosine Transform (DCT) is a popular algorithm used in the JPEG format for image compression and in the MP3 format for audio. DCT breaks the given input data down into independent components. … transmitting images back to earth. However, transmission costs make it infeasible to send the original image. Can we compress the transmission and decompress it on arrival? If so, what would be the ideal tradeoff …
33 pages | 1.96 MB | 1 year ago
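To make the DCT idea concrete, here is a rough sketch of transform-based compression: keep only a block of the lowest-frequency DCT coefficients of a grayscale image and invert the transform. It is in the spirit of JPEG's DCT step (JPEG proper works on 8x8 blocks with quantization tables), it is not code from the book, and keep_fraction is an invented knob.

    import numpy as np
    from scipy.fft import dctn, idctn

    def dct_compress(img_gray, keep_fraction=0.1):
        """Zero out all but the lowest-frequency DCT coefficients, then reconstruct."""
        coeffs = dctn(img_gray, norm='ortho')        # 2-D DCT of the image
        h, w = coeffs.shape
        mask = np.zeros_like(coeffs)
        mask[: int(h * keep_fraction), : int(w * keep_fraction)] = 1.0
        return idctn(coeffs * mask, norm='ortho')    # lossy reconstruction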
Experiment 6: K-Means
K-Means. November 27, 2018.
1 Description
In this exercise, you will use K-means to compress an image by reducing the number of colors it contains. To begin, download data6.zip and unpack its contents to … Frank Wouters and is used with his permission.
2 Image Representation
The data pack for this exercise contains a 538-pixel by 538-pixel TIFF image named bird large.tiff. It looks like the picture below. In a straightforward 24-bit color representation of this image, each pixel is represented as three 8-bit numbers (ranging from 0 to 255) that specify red, green, and blue intensity values. Our bird …
3 pages | 605.46 KB | 1 year ago
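The exercise asks you to implement K-means yourself. Purely as an illustration of the color-quantization idea described in the excerpt (not the exercise's reference solution), a compact NumPy sketch could look like this, with the image's pixels flattened into an (n, 3) array of RGB values:

    import numpy as np

    def kmeans_quantize(pixels, k=16, iters=50, seed=0):
        """Cluster (n, 3) RGB pixel values into k colors; return quantized pixels."""
        pixels = np.asarray(pixels, dtype=float)
        rng = np.random.default_rng(seed)
        centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
        for _ in range(iters):
            # Assign each pixel to its nearest centroid (squared Euclidean distance).
            dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Move each centroid to the mean of the pixels assigned to it.
            for j in range(k):
                if np.any(labels == j):
                    centroids[j] = pixels[labels == j].mean(axis=0)
        return centroids[labels]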
Keras: 基于 Python 的深度学习库
…tower_3], axis=1)
3.2.7.2 Residual connections on convolutional layers
For more information about residual networks (Residual Network), see Deep Residual Learning for Image Recognition.
    from keras.layers import Conv2D, Input
    # input tensor: a 3-channel 256x256 image
    x = Input(shape=(256, …
…
    vision_model.add(MaxPooling2D((2, 2)))
    vision_model.add(Flatten())
    # Now let's use the vision model to get an output tensor:
    image_input = Input(shape=(224, 224, 3))
    encoded_image = vision_model(image_input)
    # Next, define a language model to encode the question into a vector.
    # Each question is at most 100 words long, with word indices from 1 to …
…
    merged = concatenate([encoded_question, encoded_image])
    # Then train a 1000-word logistic regression model on top:
    output = Dense(1000, activation='softmax')(merged)
    # The final model:
    vqa_model = Model(inputs=[image_input, question_input], outputs=output)
257 pages | 1.19 MB | 1 year ago
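The residual-connection example in that excerpt is cut off right after the Input layer. A minimal sketch of what such a connection looks like in the Keras functional API is shown below; it is an illustration rather than necessarily the document's exact code.

    from keras.layers import Conv2D, Input, add

    x = Input(shape=(256, 256, 3))            # 3-channel 256x256 input image
    y = Conv2D(3, (3, 3), padding='same')(x)  # 3x3 convolution, same number of channels
    z = add([x, y])                           # residual connection: x + y has the same shape as x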
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
…and channel configurations can also be parameterized using hyperparameters. For example, when using image data augmentation with rotation, we can treat the angle of rotation as a hyperparameter. Think of … chapter 3.
    # Dataset image size
    IMG_SIZE = 264

    def resize_image(image, label):
        image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
        image = tf.cast(image, tf.uint8)
        return image, label

    train_ds = train_ds.map(resize_image)
    val_ds = val_ds.map(resize_image)
    test_ds = test_ds.map(resize_image)
Note that the create_model() function here has two additional parameters: learning_rate and dropout_rate …
33 pages | 2.48 MB | 1 year ago
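Following the excerpt's point about treating the rotation angle as a hyperparameter, the sketch below exposes the rotation range of a Keras RandomRotation layer as a tunable value. This is illustrative only; the helper name and the choice of RandomRotation are assumptions, not the chapter's code.

    import tensorflow as tf

    def build_augmenter(rotation_factor=0.1):
        # factor=0.1 rotates by up to +/-10% of a full turn (about +/-36 degrees);
        # rotation_factor is the value a hyperparameter search would tune.
        return tf.keras.layers.RandomRotation(factor=rotation_factor)

    augment = build_augmenter(rotation_factor=0.05)
    batch = tf.random.uniform([8, 264, 264, 3])   # dummy batch at IMG_SIZE = 264
    augmented = augment(batch, training=True)     # apply random rotations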
40 documents in total.