Deep Learning Explained in Plain Language


Below is a learning roadmap that lays out the five core modules you need to master on the way from beginner to expert:

[Figure: learning roadmap of the five core modules]

In what follows, I'll explain each module's core concepts in detail and provide Python code you can run directly.

Module 1: Deep Learning Fundamentals – the Multilayer Perceptron

The foundation of deep learning is the multilayer perceptron (MLP): built from fully connected layers, it is the most basic neural network architecture.

Core Concepts

Forward propagation: the computation that carries data from the input layer to the output layer

Activation functions: introduce nonlinearity so the network can learn complex patterns, e.g. ReLU, Sigmoid, Tanh

Loss functions: measure the gap between the model's predictions and the true values

Backpropagation: the algorithm that adjusts the network's parameters according to the loss (a NumPy sketch follows this list)
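To make these four ideas concrete before reaching for a framework, here is a minimal NumPy sketch of one hidden layer trained by manual backpropagation; the toy data, layer sizes, and learning rate are illustrative assumptions, not from the Keras example below.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 4))                  # 16 samples, 4 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0  # toy binary labels

W1 = rng.normal(scale=0.5, size=(4, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(200):
    # Forward propagation: input -> ReLU hidden layer -> sigmoid output
    h = np.maximum(0, X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Loss function: binary cross-entropy
    loss = -np.mean(y * np.log(p + 1e-7) + (1 - y) * np.log(1 - p + 1e-7))

    # Backpropagation: gradients flow from the output back to each weight
    dz2 = (p - y) / len(X)            # dL/dz2 for sigmoid + cross-entropy
    dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (h > 0)      # ReLU derivative gates the gradient
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)

    # Gradient descent update
    lr = 0.5
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f'final loss: {loss:.4f}')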

Below is complete code for building an MLP with TensorFlow/Keras:



import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
 
# Build the MLP model
def create_mlp_model(input_dim=784, num_classes=10):
    model = models.Sequential([
        layers.Dense(128, activation='relu', input_shape=(input_dim,)),
        layers.Dropout(0.2),  # dropout to reduce overfitting
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation='softmax')  # softmax for multi-class output
    ])
    return model

# Compile and train the model
def train_mlp():
    # Load the MNIST dataset
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    
    # Preprocess: flatten the images and scale pixels to [0, 1]
    x_train = x_train.reshape((60000, 28*28)).astype('float32') / 255
    x_test = x_test.reshape((10000, 28*28)).astype('float32') / 255
    
    # Create the model
    model = create_mlp_model()
    
    # Compile the model
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    # Train the model
    history = model.fit(x_train, y_train,
                        epochs=5,
                        batch_size=32,
                        validation_split=0.2)
    
    # Evaluate on the test set
    test_loss, test_acc = model.evaluate(x_test, y_test)
    print(f'\nTest accuracy: {test_acc:.4f}')
    
    return model, history

# Run
if __name__ == "__main__":
    model, history = train_mlp()

Code Walkthrough

The Keras Sequential API builds the model layer by layer

Dense layers implement full connectivity; Dropout layers reduce overfitting

The Adam optimizer is paired with a cross-entropy loss

Normalizing the data to the [0, 1] range improves training stability

Module 2: Convolutional Neural Networks (CNN)

CNNs specialize in grid-structured data such as images; they extract local features with convolution kernels and benefit from parameter sharing and translation invariance.

Core Concepts

Convolutional layers: extract features with filters; local connectivity and weight sharing keep the parameter count small (a NumPy sketch follows this list)

Pooling layers: reduce dimensionality while keeping the dominant features, adding translation invariance

Classic architectures: LeNet-5, VGG, ResNet, and others
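To show what a convolution actually computes, here is a minimal NumPy sketch of a single-channel 2D convolution (cross-correlation, as in most deep learning libraries); the image values and the 3x3 vertical-edge kernel are illustrative assumptions.

import numpy as np

def conv2d(image, kernel):
    # Slide the same kernel over every location: weight sharing.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output depends only on a local patch: local connectivity.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])  # illustrative vertical-edge detector
print(conv2d(image, kernel).shape)  # (4, 4): a 3x3 kernel shrinks 6x6 to 4x4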

Below is a complete CNN implementation:



import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np  # used by np.argmax in visualize_predictions
import matplotlib.pyplot as plt
 
# Build the CNN model (a LeNet-5-style variant)
def create_cnn_model(input_shape=(28, 28, 1), num_classes=10):
    model = models.Sequential([
        # First convolutional block
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        
        # Second convolutional block
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        
        # Classification head
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

def train_cnn():
    # Load the data
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    
    # Preprocess: add a channel dimension and normalize
    x_train = x_train.reshape((60000, 28, 28, 1)).astype('float32') / 255
    x_test = x_test.reshape((10000, 28, 28, 1)).astype('float32') / 255
    
    # Create the model
    model = create_cnn_model()
    
    # Compile
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    # Train
    history = model.fit(x_train, y_train,
                        epochs=5,
                        batch_size=32,
                        validation_split=0.2)
    
    # Evaluate
    test_loss, test_acc = model.evaluate(x_test, y_test)
    print(f'\nCNN test accuracy: {test_acc:.4f}')
    
    # Visualize a few predictions
    visualize_predictions(model, x_test[:5])
    
    return model, history

def visualize_predictions(model, test_images):
    predictions = model.predict(test_images)
    fig, axes = plt.subplots(1, 5, figsize=(15, 3))
    for i, ax in enumerate(axes):
        ax.imshow(test_images[i].reshape(28, 28), cmap='gray')
        ax.set_title(f'pred: {np.argmax(predictions[i])}')
        ax.axis('off')
    plt.show()

if __name__ == "__main__":
    model, history = train_cnn()

Code Walkthrough

Conv2D layers extract features; MaxPooling2D shrinks the feature maps

Stacking convolution and pooling layers builds a hierarchical feature representation

The Flatten layer unrolls the multi-dimensional feature maps so the dense layers can process them

Module 3: Recurrent Neural Networks (RNN)

RNNs specialize in sequential data; they share parameters along the time dimension and can capture temporal dependencies.

Core Concepts

Unrolling in time: an RNN unrolls across time steps, reusing the same weights at every step (a NumPy sketch follows this list)

Gradient problems: gradients can vanish or explode over long sequences; LSTM and GRU mitigate this with gating mechanisms

Bidirectional RNNs: take both past and future context into account
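Here is a minimal NumPy sketch of that recurrence: the same matrices W, U, and b are applied at every time step, which is exactly the parameter sharing described above. The dimensions are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 8, 16, 5

W = rng.normal(scale=0.1, size=(input_dim, hidden_dim))   # input-to-hidden
U = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden
b = np.zeros(hidden_dim)

x = rng.normal(size=(seq_len, input_dim))  # one sequence of 5 time steps
h = np.zeros(hidden_dim)                   # initial hidden state

for t in range(seq_len):
    # h_t = tanh(x_t W + h_{t-1} U + b): the same weights at every step
    h = np.tanh(x[t] @ W + h @ U + b)

print(h.shape)  # (16,): the final hidden state summarizes the whole sequence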

Below is a complete RNN implementation:



import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
 
# Build the RNN model
def create_rnn_model(vocab_size=10000, embedding_dim=128, rnn_units=64):
    model = models.Sequential([
        # The embedding layer maps integer indices to dense vectors
        layers.Embedding(vocab_size, embedding_dim),
        
        # A simple RNN layer
        layers.SimpleRNN(rnn_units, return_sequences=False),
        
        # Classification head
        layers.Dense(32, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(1, activation='sigmoid')  # binary classification
    ])
    return model

def train_rnn():
    # Load the IMDB movie review dataset
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)
    
    # Text preprocessing: pad sequences to the same length
    x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=200)
    x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=200)
    
    # Create the model
    model = create_rnn_model()
    
    # Compile
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    
    # Train
    history = model.fit(x_train, y_train,
                        epochs=3,
                        batch_size=32,
                        validation_split=0.2)
    
    # Evaluate
    test_loss, test_acc = model.evaluate(x_test, y_test)
    print(f'\nRNN test accuracy: {test_acc:.4f}')
    
    return model, history

# LSTM implementation (a more powerful RNN variant)
def create_lstm_model(vocab_size=10000, embedding_dim=128, lstm_units=64):
    model = models.Sequential([
        layers.Embedding(vocab_size, embedding_dim),
        
        # LSTM layer; its gates mitigate the vanishing-gradient problem
        layers.LSTM(lstm_units),
        
        layers.Dense(32, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(1, activation='sigmoid')
    ])
    return model

if __name__ == "__main__":
    # Train the simple RNN
    print("Training simple RNN...")
    rnn_model, rnn_history = train_rnn()
    
    # Reload the data here: the arrays inside train_rnn() are local to it
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)
    x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=200)
    x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=200)
    
    # Train the LSTM
    print("\nTraining LSTM...")
    lstm_model = create_lstm_model()
    lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    lstm_history = lstm_model.fit(x_train, y_train, epochs=3, batch_size=32, validation_split=0.2)
    
    lstm_test_loss, lstm_test_acc = lstm_model.evaluate(x_test, y_test)
    print(f'LSTM test accuracy: {lstm_test_acc:.4f}')

Code Walkthrough

The Embedding layer converts discrete token indices into continuous vector representations

SimpleRNN processes the sequence; LSTM uses gating to ease vanishing gradients

Padding the sequences ensures a consistent input length

Module 4: Generative Adversarial Networks (GAN)

A GAN consists of a generator and a discriminator, which learn the data distribution through adversarial training.

Core Concepts

Generator: produces fake data from random noise, aiming to "fool" the discriminator

Discriminator: distinguishes real data from generated data

Minimax game: the two networks compete against each other and improve together (the objective is written out after this list)
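For reference, this minimax game is the objective from the original GAN paper (Goodfellow et al., 2014), where G is the generator, D the discriminator, and z the input noise:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

D tries to maximize V while G tries to minimize it; in the code below, both sides are implemented with binary cross-entropy losses.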

Below is a complete GAN implementation:



import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt
 
# Build the generator
def build_generator(noise_dim=100):
    model = models.Sequential([
        layers.Dense(7*7*256, use_bias=False, input_shape=(noise_dim,)),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        
        layers.Reshape((7, 7, 256)),
        
        # Upsampling block 1
        layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        
        # Upsampling block 2
        layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        
        # Output layer
        layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', 
                              use_bias=False, activation='tanh')
    ])
    return model
 
# Build the discriminator
def build_discriminator(input_shape=(28, 28, 1)):
    model = models.Sequential([
        # Downsampling block 1
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', 
                     input_shape=input_shape),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        
        # Downsampling block 2
        layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        
        # Classification head
        layers.Flatten(),
        layers.Dense(1, activation='sigmoid')
    ])
    return model
 
# Assemble the full GAN
class GAN(tf.keras.Model):
    def __init__(self, generator, discriminator):
        super(GAN, self).__init__()
        self.generator = generator
        self.discriminator = discriminator
        
    def compile(self, g_optimizer, d_optimizer, loss_fn):
        super(GAN, self).compile()
        self.g_optimizer = g_optimizer
        self.d_optimizer = d_optimizer
        self.loss_fn = loss_fn
        
    def train_step(self, real_images):
        batch_size = tf.shape(real_images)[0]
        noise_dim = 100
        
        # Sample random noise
        noise = tf.random.normal([batch_size, noise_dim])
        
        # Train the discriminator
        with tf.GradientTape() as d_tape:
            # Discriminator predictions on real images
            d_real_pred = self.discriminator(real_images, training=True)
            # Generate fake images
            fake_images = self.generator(noise, training=True)
            # Discriminator predictions on fake images
            d_fake_pred = self.discriminator(fake_images, training=True)
            
            # Discriminator loss: real images should score 1, fakes 0
            d_real_loss = self.loss_fn(tf.ones_like(d_real_pred), d_real_pred)
            d_fake_loss = self.loss_fn(tf.zeros_like(d_fake_pred), d_fake_pred)
            d_loss = (d_real_loss + d_fake_loss) / 2
            
        # Update the discriminator
        d_gradients = d_tape.gradient(d_loss, self.discriminator.trainable_variables)
        self.d_optimizer.apply_gradients(
            zip(d_gradients, self.discriminator.trainable_variables))
        
        # Train the generator
        with tf.GradientTape() as g_tape:
            fake_images = self.generator(noise, training=True)
            d_fake_pred = self.discriminator(fake_images, training=True)
            # The generator wants the discriminator to label fakes as real
            g_loss = self.loss_fn(tf.ones_like(d_fake_pred), d_fake_pred)
            
        # Update the generator
        g_gradients = g_tape.gradient(g_loss, self.generator.trainable_variables)
        self.g_optimizer.apply_gradients(
            zip(g_gradients, self.generator.trainable_variables))
        
        return {"d_loss": d_loss, "g_loss": g_loss}
 
def train_gan():
    # Load the data
    (x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
    
    # Preprocess
    x_train = x_train.reshape(-1, 28, 28, 1).astype('float32')
    x_train = (x_train - 127.5) / 127.5  # normalize to [-1, 1]
    
    # Create the models
    generator = build_generator()
    discriminator = build_discriminator()
    
    # Compile the discriminator (only needed if training it standalone)
    discriminator.compile(optimizer=tf.keras.optimizers.Adam(0.0002),
                         loss='binary_crossentropy',
                         metrics=['accuracy'])
    
    # Create and compile the GAN
    gan = GAN(generator=generator, discriminator=discriminator)
    gan.compile(g_optimizer=tf.keras.optimizers.Adam(0.0002),
                d_optimizer=tf.keras.optimizers.Adam(0.0002),
                loss_fn=tf.keras.losses.BinaryCrossentropy())
    
    # Train
    history = gan.fit(x_train, epochs=10, batch_size=64)
    
    # Generate sample images
    generate_and_plot_images(generator, 10)
    
    return generator, discriminator, history
 
def generate_and_plot_images(generator, num_images=10):
    noise = tf.random.normal([num_images, 100])
    generated_images = generator(noise, training=False)
    
    plt.figure(figsize=(15, 3))
    for i in range(num_images):
        plt.subplot(1, num_images, i+1)
        plt.imshow(generated_images[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')
    plt.show()
 
if __name__ == "__main__":
    generator, discriminator, history = train_gan()

Code Walkthrough

The generator upsamples with Conv2DTranspose to turn noise into images

The discriminator uses ordinary convolutional layers to tell real from fake images

Adversarial training alternates: hold one network fixed while updating the other

The tanh activation normalizes the generator's output to [-1, 1]

Module 5: Frontier Architectures and End-to-End Projects

Once you have the basic architectures down, you can move on to more cutting-edge models:

Transformer architecture: built on self-attention, with breakthroughs in both NLP and CV (a self-attention sketch follows the pipeline example below)



# Example: using the Hugging Face Transformers library
from transformers import pipeline

# Quickly apply a pretrained model
classifier = pipeline('sentiment-analysis')
result = classifier("I love this movie!")
print(result)
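
To give a feel for what self-attention computes, here is a minimal NumPy sketch of single-head scaled dot-product attention; the dimensions and random projection matrices are illustrative assumptions, not any library's API.

import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8

x = rng.normal(size=(seq_len, d_model))   # token representations
Wq = rng.normal(size=(d_model, d_model))  # query projection
Wk = rng.normal(size=(d_model, d_model))  # key projection
Wv = rng.normal(size=(d_model, d_model))  # value projection

Q, K, V = x @ Wq, x @ Wk, x @ Wv

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
output = weights @ V

print(weights.shape, output.shape)  # (4, 4) attention map, (4, 8) outputs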

ResNet: residual connections fix the vanishing-gradient problem in very deep networks (a minimal residual block sketch follows the snippet below)



# Use a pretrained ResNet
from tensorflow.keras.applications import ResNet50
 
model = ResNet50(weights='imagenet')
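
As a sketch of the residual idea itself, here is a minimal Keras identity block, assuming the input and output channel counts match so the addition is valid; the layer sizes are illustrative, not taken from ResNet50.

import tensorflow as tf
from tensorflow.keras import layers

# A residual (identity) block computes output = F(x) + x.
def residual_block(x, filters=64):
    shortcut = x
    y = layers.Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, (3, 3), padding='same')(y)
    y = layers.Add()([y, shortcut])  # the skip connection
    return layers.Activation('relu')(y)

inputs = tf.keras.Input(shape=(32, 32, 64))
model = tf.keras.Model(inputs, residual_block(inputs))
model.summary()

Because gradients can flow through the shortcut unchanged, very deep stacks of such blocks remain trainable.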
