
Layer 2D: SeriesDecomp Close Reading

Called by: the forecast() main chain ([[02-Layer1-forecast主链]]) and every EncoderLayer / DecoderLayer.
This layer covers the complete logic of moving_avg + series_decomp.


1. Position in the parent layer

forecast():          self.decomp(x_enc)                ← series_decomp
EncoderLayer:        self.decomp1(x), self.decomp2(x)  ← series_decomp
DecoderLayer:        self.decomp1(x), self.decomp2(x), self.decomp3(x) ← series_decomp

2. I/O interface definition

series_decomp.forward(x), using the initial decomposition in forecast() as the example (toy):

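| Tensor | shape | Meaning |
|---|---|---|
| Input x | (2, 12, 5) | Raw time series (B=2, L=12, C=5) |
| Output seasonal | (2, 12, 5) | Seasonal component = x - trend |
| Output trend | (2, 12, 5) | Trend component (moving-average output) |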
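
The interface above can be checked with a small sketch. It assumes PyTorch is installed and re-declares the two classes from section 5.0 so that it runs on its own; the random input is only a placeholder, and kernel_size=3 is the toy value used throughout this note.

```python
import torch
import torch.nn as nn


class moving_avg(nn.Module):
    def __init__(self, kernel_size, stride):
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0)

    def forward(self, x):
        pad = (self.kernel_size - 1) // 2
        front = x[:, 0:1, :].repeat(1, pad, 1)   # replicate first step
        end = x[:, -1:, :].repeat(1, pad, 1)     # replicate last step
        x = torch.cat([front, x, end], dim=1)
        return self.avg(x.permute(0, 2, 1)).permute(0, 2, 1)


class series_decomp(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.moving_avg = moving_avg(kernel_size, stride=1)

    def forward(self, x):
        moving_mean = self.moving_avg(x)
        return x - moving_mean, moving_mean


torch.manual_seed(0)
x = torch.randn(2, 12, 5)                  # placeholder input: B=2, L=12, C=5
decomp = series_decomp(kernel_size=3)      # toy kernel size (assumption)
seasonal, trend = decomp(x)

print(tuple(seasonal.shape), tuple(trend.shape))
# The two outputs should add back up to the input (up to float rounding).
recon_ok = torch.allclose(seasonal + trend, x, atol=1e-6)
print(recon_ok)
```

Both outputs keep the input shape, and seasonal + trend reconstructs x because the seasonal part is defined as the residual.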

3. Sequence diagram (concrete level)


4. Semantic grouping diagram (index level)


5. Step-by-step close reading

5.0 Complete original code

python
import torch
import torch.nn as nn


class moving_avg(nn.Module):
    """Moving-average block that extracts the trend of a time series."""

    def __init__(self, kernel_size, stride):
        super(moving_avg, self).__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0)

    def forward(self, x):
        # x: (B, L, C). Replicate the first/last step (kernel_size - 1) // 2 times per side.
        front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        x = torch.cat([front, x, end], dim=1)
        # AvgPool1d pools over the last axis, so move time there and back.
        x = self.avg(x.permute(0, 2, 1))
        x = x.permute(0, 2, 1)
        return x


class series_decomp(nn.Module):
    """Split a series into (seasonal residual, moving-average trend)."""

    def __init__(self, kernel_size):
        super(series_decomp, self).__init__()
        self.moving_avg = moving_avg(kernel_size, stride=1)

    def forward(self, x):
        moving_mean = self.moving_avg(x)  # trend
        res = x - moving_mean             # seasonal
        return res, moving_mean

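5.1 Big picture

Why Replication Padding instead of Zero Padding?

Values at the edges of a time series are usually nonzero, so Zero Padding drags the moving average at both ends toward zero (spuriously lowering it for positive signals and raising it for negative ones) and pulls the edge estimates away from the true trend. Replication Padding repeats the edge value instead, which keeps the trend estimate at the edges consistent with the interior.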
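
To make the edge effect concrete, here is a minimal pure-Python sketch comparing the two padding schemes on the toy series used later in section 5.2. The helper `moving_average` is a hypothetical name introduced only for this illustration; it is not part of the source code.

```python
def moving_average(padded, k):
    """Stride-1 sliding mean over an already padded list (illustrative helper)."""
    return [sum(padded[i:i + k]) / k for i in range(len(padded) - k + 1)]


x = [2, 4, 3, 5, 6, 4, 7, 5, 8, 6, 9, 7]   # toy series (batch=0, var=0)
k = 3
pad = (k - 1) // 2

zero_padded = [0] * pad + x + [0] * pad         # zero padding
repl_padded = [x[0]] * pad + x + [x[-1]] * pad  # replication padding

trend_zero = moving_average(zero_padded, k)
trend_repl = moving_average(repl_padded, k)

# Only the first and last positions differ; the interior is identical.
print(round(trend_zero[0], 2), round(trend_repl[0], 2))    # 2.0 2.67
print(round(trend_zero[-1], 2), round(trend_repl[-1], 2))  # 5.33 7.67
```

With zero padding the last trend value falls to 5.33 even though the series ends near 7 to 9, while replication padding gives 7.67.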

Shape chain (kernel=3, front/end pad of 1 step each):

x: (2, 12, 5)
→ front: x[:,0:1,:].repeat(1,1,1) = (2, 1, 5)
→ end:   x[:,-1:,:].repeat(1,1,1) = (2, 1, 5)
→ cat:   (2, 14, 5)
→ permute(0,2,1): (2, 5, 14)   ← AvgPool1d expects (B, C, L)
→ AvgPool1d(k=3, s=1, p=0): (2, 5, 12)
  Formula: L_out = floor((L_in - k) / s + 1) = floor((14-3)/1+1) = 12 ✓
→ permute(0,2,1): (2, 12, 5)
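
The length bookkeeping in the chain above can be written as a small helper. This is a sketch; `decomp_out_len` is a hypothetical name, and the kernel sizes 4 and 25 are arbitrary examples chosen to show what happens with an even kernel and a larger odd one.

```python
def decomp_out_len(seq_len, kernel_size, stride=1):
    """Output length of moving_avg: replicate-pad both sides, then AvgPool1d with padding=0."""
    pad = (kernel_size - 1) // 2               # steps replicated on each side
    padded_len = seq_len + 2 * pad
    return (padded_len - kernel_size) // stride + 1


for k in (3, 4, 25):
    print(k, decomp_out_len(12, k))
# 3 12
# 4 11
# 25 12
```

An odd kernel_size preserves the sequence length. An even one drops one step (12 becomes 11 here), in which case `x - moving_mean` in series_decomp.forward would hit a shape mismatch, so the decomposition implicitly assumes an odd kernel.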

5.2 Step 1: Replication Padding

python
front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1)
x = torch.cat([front, x, end], dim=1)

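With kernel_size=3, (3-1)//2 = 1: the front copies step 0 and the end copies step 11, once each.

x[:, 0:1, :] slices rather than indexes, so dim=1 is kept and the result is (2, 1, 5); .repeat(1, 1, 1) repeats once along dim=1, so the shape stays (2, 1, 5).

Toy values (batch=0, var=0): x[0,:,0] = [2, 4, 3, 5, 6, 4, 7, 5, 8, 6, 9, 7]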

front = [2]          ← copy of step 0
end   = [7]          ← copy of step 11
padded: [2, 2, 4, 3, 5, 6, 4, 7, 5, 8, 6, 9, 7, 7]   length=14 ✓
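
The padded toy sequence can be reproduced with a short sketch, assuming PyTorch is installed. It also compares the repeat-and-concatenate approach against `torch.nn.functional.pad` in `replicate` mode. The latter is an alternative shown for comparison only; it is not what the source code does.

```python
import torch
import torch.nn.functional as F

# Toy series as a (B=1, L=12, C=1) tensor.
x = torch.tensor([2., 4., 3., 5., 6., 4., 7., 5., 8., 6., 9., 7.]).view(1, 12, 1)
kernel_size = 3
pad = (kernel_size - 1) // 2

# Approach used in moving_avg.forward: repeat the edge steps, then concatenate along time.
front = x[:, 0:1, :].repeat(1, pad, 1)
end = x[:, -1:, :].repeat(1, pad, 1)
padded_cat = torch.cat([front, x, end], dim=1)           # (1, 14, 1)

# Alternative: F.pad in 'replicate' mode pads the last axis of a 3D tensor,
# so time has to be moved to the last axis first and moved back afterwards.
padded_fpad = F.pad(x.permute(0, 2, 1), (pad, pad), mode="replicate").permute(0, 2, 1)

print(padded_cat[0, :, 0].tolist())
print(torch.equal(padded_cat, padded_fpad))
```

Both routes give the same 14-step sequence; the source keeps the tensor in (B, L, C) layout while padding, which is why it uses repeat and cat.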

5.3 Step 2: computing the moving average with AvgPool1d

python
x = self.avg(x.permute(0, 2, 1))
x = x.permute(0, 2, 1)

AvgPool1d expects input in (B, Channels, Length) layout. The padded x is (B, L, C) = (2, 14, 5): permute(0,2,1) turns it into (B, C=5, L=14) → Pool → (B, 5, 12) → permute(0,2,1) → (2, 12, 5).

Toy values (batch=0, var=0, padded = [2, 2, 4, 3, 5, 6, 4, 7, 5, 8, 6, 9, 7, 7]):

avg[0] = (2+2+4)/3 = 2.67
avg[1] = (2+4+3)/3 = 3.0
avg[2] = (4+3+5)/3 = 4.0
avg[3] = (3+5+6)/3 = 4.67
avg[4] = (5+6+4)/3 = 5.0
avg[5] = (6+4+7)/3 = 5.67
avg[6] = (4+7+5)/3 = 5.33
avg[7] = (7+5+8)/3 = 6.67
avg[8] = (5+8+6)/3 = 6.33
avg[9] = (8+6+9)/3 = 7.67
avg[10]= (6+9+7)/3 = 7.33
avg[11]= (9+7+7)/3 = 7.67
trend[0,:,0] = [2.67, 3.0, 4.0, 4.67, 5.0, 5.67, 5.33, 6.67, 6.33, 7.67, 7.33, 7.67]
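
The hand-computed averages above can be cross-checked against `nn.AvgPool1d` directly. This sketch assumes PyTorch is installed and feeds in the already padded toy sequence as a single-batch, single-channel tensor.

```python
import torch
import torch.nn as nn

padded = torch.tensor([2., 2., 4., 3., 5., 6., 4., 7., 5., 8., 6., 9., 7., 7.])
pool = nn.AvgPool1d(kernel_size=3, stride=1, padding=0)

# AvgPool1d expects (B, C, L); here B=1, C=1, L=14.
trend = pool(padded.view(1, 1, -1)).view(-1)             # length 14 - 3 + 1 = 12

# Values from the hand calculation, rounded to two decimals.
expected = torch.tensor([2.67, 3.0, 4.0, 4.67, 5.0, 5.67,
                         5.33, 6.67, 6.33, 7.67, 7.33, 7.67])

print([round(v, 2) for v in trend.tolist()])
print(torch.allclose(trend, expected, atol=0.006))       # tolerance covers the 2-decimal rounding
```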

5.4 Step 3: seasonal = x - trend

python
def forward(self, x):
    moving_mean = self.moving_avg(x)
    res = x - moving_mean
    return res, moving_mean

Note: series_decomp.forward returns (res, moving_mean): ==seasonal component first, trend second==.
forecast() unpacks it as seasonal_init, trend_init = self.decomp(x_enc) ✓.

Toy values (batch=0, var=0):

x[0,:,0]        = [2,    4,   3,    5,    6,    4,    7,   5,    8,    6,    9,    7]
trend[0,:,0]    = [2.67, 3.0, 4.0,  4.67, 5.0,  5.67, 5.33,6.67, 6.33, 7.67, 7.33, 7.67]
seasonal[0,:,0] = [-0.67,1.0,-1.0,  0.33, 1.0, -1.67, 1.67,-1.67,1.67,-1.67, 1.67,-0.67]

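seasonal oscillates around zero, while trend rises overall from 2.67 to 7.67 but is not strictly monotonic (it dips at 5.67 → 5.33, 6.67 → 6.33, and 7.67 → 7.33) ✓. This is the goal in Autoformer: the seasonal path focuses on periodic patterns and the trend path on the long-term direction.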
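
The whole toy decomposition can be replayed in plain Python as a sanity check. This is a sketch: `decompose` is a hypothetical helper that mirrors the (seasonal, trend) return order of series_decomp.forward for a single 1-D series, not the source implementation.

```python
def decompose(series, k=3):
    """Return (seasonal, trend) for a 1-D list, using replication padding and a stride-1 mean."""
    pad = (k - 1) // 2
    padded = [series[0]] * pad + series + [series[-1]] * pad
    trend = [sum(padded[i:i + k]) / k for i in range(len(series))]
    seasonal = [s - t for s, t in zip(series, trend)]
    return seasonal, trend          # seasonal first, trend second


x = [2, 4, 3, 5, 6, 4, 7, 5, 8, 6, 9, 7]
seasonal, trend = decompose(x)

print([round(v, 2) for v in trend])
print([round(v, 2) for v in seasonal])

# The residual definition guarantees that the two parts add back up to x.
recon = [s + t for s, t in zip(seasonal, trend)]
print(all(abs(r - xi) < 1e-9 for r, xi in zip(recon, x)))
```

For this toy series the seasonal values also sum to zero up to float rounding, which matches the picture of a residual oscillating around the trend.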
