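Layer 2: series_decomp close reading
Parent layer (Layer 1)
The first step of encoder() is a call to self.decompsition(x).
This document covers the full computation of series_decomp and its internal moving_avg.
1. Position in the parent layer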
```
encoder(x)
  └─ self.decompsition(x)    ← this document
       └─ moving_avg(x) → trend
       └─ x - trend → seasonal
```
2. I/O interface definition
series_decomp
```python
class series_decomp(nn.Module):
    def forward(self, x) -> Tuple[Tensor, Tensor]: ...
```

| | shape (toy) | Meaning |
|---|---|---|
| Input x | (2, 6, 3) = (B, T, C) | Raw time series |
| Output res (seasonal) | (2, 6, 3) | Seasonal component = x - moving_mean |
| Output moving_mean (trend) | (2, 6, 3) | Trend component = moving_avg(x) |
moving_avg (internal submodule of series_decomp)
```python
class moving_avg(nn.Module):
    def __init__(self, kernel_size, stride): ...  # toy: kernel_size=3, stride=1
    def forward(self, x) -> Tensor: ...
```

| | shape (toy) | Meaning |
|---|---|---|
| Input x | (2, 6, 3) | Raw time series |
| Output | (2, 6, 3) | Smoothed trend (same length as the input) |
3. Sequence diagram (concrete layer)
4. Semantic grouping diagram (index layer)
5. Close reading
5.1 Padding both ends (equivalent to ReplicationPad)
```python
# Autoformer_EncDec.py:32-35
front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1)
x = torch.cat([front, x, end], dim=1)
```
Annotated version:
```python
# x: (B, T, C) = (2, 6, 3)
# pad = (kernel_size - 1) // 2 = (3 - 1) // 2 = 1
front = x[:, 0:1, :].repeat(1, 1, 1)
# x[:, 0:1, :] has shape (2, 1, 3): time step 0
# repeat(1, 1, 1) -> (2, 1, 3): one copy, i.e. one padded step on the left
end = x[:, -1:, :].repeat(1, 1, 1)
# x[:, -1:, :] has shape (2, 1, 3): the last time step
# repeat(1, 1, 1) -> (2, 1, 3): one copy, i.e. one padded step on the right
x = torch.cat([front, x, end], dim=1)
# (2, 1, 3) + (2, 6, 3) + (2, 1, 3) -> (2, 8, 3)
```
Why replicate the edge values instead of zero-padding?
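Zero-padding introduces a spurious "trend drop" at the boundaries, because the series appears to fall toward 0. Edge replication keeps the boundary averages close to the real values and introduces far less bias, although the averages at the two ends are still slightly flattened.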
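The difference can be checked on the toy series [1.0, ..., 6.0] used throughout this document. The following is a minimal plain-Python sketch, not the PyTorch code itself; the helper name `moving_average` and its `mode` argument are made up for this illustration, and an odd kernel size is assumed.

```python
def moving_average(series, kernel_size=3, mode="replicate"):
    """Stride-1 moving average that keeps the output as long as the input.

    Hypothetical helper for illustration only; assumes an odd kernel_size.
    """
    pad = (kernel_size - 1) // 2
    if mode == "replicate":
        # Repeat the first and last values, as moving_avg does.
        front, end = [series[0]] * pad, [series[-1]] * pad
    else:
        # mode == "zeros": pad with zeros instead.
        front, end = [0.0] * pad, [0.0] * pad
    padded = front + list(series) + end
    return [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(padded) - kernel_size + 1)
    ]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
zero_trend = moving_average(x, mode="zeros")
rep_trend = moving_average(x, mode="replicate")

print([round(v, 3) for v in zero_trend])  # last value collapses to 3.667
print([round(v, 3) for v in rep_trend])   # last value stays at 5.667
```

With zero padding the last trend value falls to 3.667 even though the series is still rising; with replication it stays at 5.667.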
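Toy values: take feature column 0 of x[0] (batch element 0) as an example:
```
Original x[0, :, 0] = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Front padding: front[0, :, 0] = [1.0]  (copy of the value at step 0)
End padding:   end[0, :, 0]   = [6.0]  (copy of the value at the last step)
x_padded[0, :, 0] = [1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0]
                     ↑ front pad                        ↑ end pad
```
5.2 AvgPool1d (core computation)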
```python
# Autoformer_EncDec.py:28-29
self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0)
# toy: kernel_size=3, stride=1, padding=0
# forward:
x = self.avg(x.permute(0, 2, 1))  # input must be in (B, C, L) layout
x = x.permute(0, 2, 1)
```
AvgPool1d requires its input in (B, C, L) layout, where C is the number of channels and L is the sequence length.
x_padded, however, is in (B, T, C) = (2, 8, 3) layout, so it is first permuted with permute(0, 2, 1) to (2, 3, 8), then pooled, then permuted back.
Output length formula:
```
L_out = floor((L_in + 2×padding - kernel_size) / stride + 1)
      = floor((8 + 0 - 3) / 1 + 1)
      = floor(5 + 1)
      = 6
```
Substituting the toy values: L_in=8, kernel_size=3, stride=1, padding=0 → L_out=6 ✓ (equal to the input length T=6)
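The same formula can be wrapped in a small helper to try other settings. This is a sketch in plain Python; `pooled_length` is a hypothetical name, and it assumes the (kernel_size - 1) // 2 replication padding from 5.1 is applied on both sides before pooling with padding=0.

```python
def pooled_length(seq_len, kernel_size, stride=1):
    """Output length of the moving average for seq_len input time steps.

    Hypothetical helper: applies the manual padding from 5.1, then the
    AvgPool1d length formula with padding=0.
    """
    pad = (kernel_size - 1) // 2   # steps replicated on each side
    l_in = seq_len + 2 * pad       # length after manual padding
    return (l_in - kernel_size) // stride + 1


print(pooled_length(6, 3))   # 6: the toy configuration
print(pooled_length(6, 25))  # 6: another odd kernel keeps the length
print(pooled_length(6, 4))   # 5: an even kernel loses one step
```

With stride=1, an odd kernel_size returns the input length, while an even one returns one step fewer, in which case x and moving_mean in series_decomp would no longer have matching shapes.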
Toy values: take the padded sequence of feature 0, [1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0], as an example:
```
Window 0 (positions 0-2): mean(1.0, 1.0, 2.0) = 1.333
Window 1 (positions 1-3): mean(1.0, 2.0, 3.0) = 2.000
Window 2 (positions 2-4): mean(2.0, 3.0, 4.0) = 3.000
Window 3 (positions 3-5): mean(3.0, 4.0, 5.0) = 4.000
Window 4 (positions 4-6): mean(4.0, 5.0, 6.0) = 5.000
Window 5 (positions 5-7): mean(5.0, 6.0, 6.0) = 5.667
moving_mean[0, :, 0] = [1.333, 2.000, 3.000, 4.000, 5.000, 5.667]
```
5.3 seasonal = x - trend
```python
# Autoformer_EncDec.py:51-53
def forward(self, x):
    moving_mean = self.moving_avg(x)  # (2, 6, 3)
    res = x - moving_mean             # element-wise subtraction
    return res, moving_mean
```
Annotated version:
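```python
moving_mean = self.moving_avg(x)
# moving_mean: (B, T, C) = (2, 6, 3), the moving average at each time step
res = x - moving_mean
# res: (B, T, C) = (2, 6, 3), the residual after removing the trend
# res keeps the fast-varying part of x that the moving average does not capture (seasonality)
```
Toy values (feature 0):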
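```
x[0, :, 0]           = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
moving_mean[0, :, 0] = [1.333, 2.0, 3.0, 4.0, 5.0, 5.667]
seasonal[0, :, 0]    = x - moving_mean
                     = [1.0-1.333, 2.0-2.0, 3.0-3.0, 4.0-4.0, 5.0-5.0, 6.0-5.667]
                     = [-0.333, 0.0, 0.0, 0.0, 0.0, 0.333]
```
Note: for a purely linearly increasing series the seasonal component is almost 0. It is nonzero only at the two ends, where the replicated padding flattens the moving average. Nearly all the information sits in the trend, which is exactly the expected behavior.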
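To see the seasonal branch pick up something non-trivial, the decomposition can be repeated on a series with an oscillation on top of the ramp. The sketch below is plain Python rather than the PyTorch modules; `decompose` is a hypothetical helper that mirrors moving_avg plus the subtraction for a single feature column, an odd kernel size is assumed, and the `wavy` series is made up for illustration.

```python
def decompose(series, kernel_size=3):
    """Return (seasonal, trend) for one feature column.

    Hypothetical single-column mirror of series_decomp: replicate-pad,
    take a stride-1 moving average, then subtract. Assumes odd kernel_size.
    """
    pad = (kernel_size - 1) // 2
    padded = [series[0]] * pad + list(series) + [series[-1]] * pad
    trend = [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(series))
    ]
    seasonal = [x - t for x, t in zip(series, trend)]
    return seasonal, trend


# The toy column from this document.
ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
seasonal, trend = decompose(ramp)
print([round(v, 3) for v in trend])     # [1.333, 2.0, 3.0, 4.0, 5.0, 5.667]
print([round(v, 3) for v in seasonal])  # [-0.333, 0.0, 0.0, 0.0, 0.0, 0.333]

# Made-up series: the same ramp with +1 / -1 alternating on top.
wavy = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
wavy_seasonal, wavy_trend = decompose(wavy)
print([round(v, 3) for v in wavy_seasonal])
```

For the ramp the output reproduces the toy values above. For the made-up oscillating series the seasonal component alternates in sign at every step, so the fast variation ends up in res rather than in moving_mean.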