Layer 2C: Decoder close reading
Called from the forecast() main chain ([[02-Layer1-forecast主链]]): `seasonal_part, trend_part = self.decoder(dec_out, enc_out, x_mask=None, cross_mask=None, trend=trend_init)`
1. Position in the parent layer
forecast()
  └─ self.decoder(dec_out, enc_out, trend=trend_init) ← this document
       └─ DecoderLayer[0](x, cross) → see 03C1-Layer3-DecoderLayer

2. I/O interface definition
| Tensor | Shape | Meaning |
|---|---|---|
| Input `x` | (2, 10, 8) | Output of dec_embedding (seasonal path) |
| Input `cross` | (2, 12, 8) | Encoder output (keys and values for cross-attention) |
| Input `trend` | (2, 10, 5) | Initial trend (the trend_init built by forecast) |
| Output `seasonal_part` | (2, 10, 5) | Seasonal component (after the d_model→c_out projection) |
| Output `trend_part` | (2, 10, 5) | Trend component (trend_init plus the accumulated residual_trend) |
3. Sequence diagram (concrete level)
4. Semantic grouping diagram (index level)
5. Step-by-step close reading
5.0 Full original code
```python
import torch.nn as nn


class Decoder(nn.Module):
    def __init__(self, layers, norm_layer=None, projection=None):
        super(Decoder, self).__init__()
        self.layers = nn.ModuleList(layers)
        self.norm = norm_layer
        self.projection = projection

    def forward(self, x, cross, x_mask=None, cross_mask=None, trend=None):
        # Each layer updates the seasonal path and returns a trend increment
        for layer in self.layers:
            x, residual_trend = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask)
            trend = trend + residual_trend

        if self.norm is not None:
            x = self.norm(x)

        # Only the seasonal path is projected from d_model to c_out
        if self.projection is not None:
            x = self.projection(x)
        return x, trend
```

5.1 DecoderLayer loop + trend accumulation
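```python
for layer in self.layers:
    x, residual_trend = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask)
    trend = trend + residual_trend
```

With `d_layers=1` the loop runs only once. DecoderLayer returns two outputs:
- `x` (2, 10, 8): the updated seasonal representation
- `residual_trend` (2, 10, 5): the sum of the trends extracted by this layer's three decomp steps, projected to c_out=5 by a Conv1d

`trend = trend_init + residual_trend`: the trend increment extracted by this layer is added to the initial trend (filled with the historical mean), giving the final `trend_part` (2, 10, 5).
If `d_layers=2` (a tunable hyperparameter), the loop runs twice and each pass adds its own `residual_trend` to `trend`, so the trend is refined progressively.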
→ For the internals of DecoderLayer, see [[03C1-Layer3-DecoderLayer]]
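
The accumulation can be checked in isolation with a stand-in layer. The sketch below is only an illustration, not the real DecoderLayer: `StubLayer` is a hypothetical module that passes `x` through unchanged and returns a constant residual trend of 0.03, and `run_layers` is a hypothetical helper that repeats the loop above without norm or projection. Starting from a `trend_init` of 5.0, one layer should give a value close to 5.03 and two layers a value close to 5.06.

```python
import torch
import torch.nn as nn


class StubLayer(nn.Module):
    """Hypothetical stand-in for DecoderLayer (not the real implementation)."""

    def __init__(self, c_out=5, delta=0.03):
        super().__init__()
        self.c_out = c_out
        self.delta = delta

    def forward(self, x, cross, x_mask=None, cross_mask=None):
        batch, length, _ = x.shape
        # Constant trend increment, already in c_out channels
        residual_trend = torch.full((batch, length, self.c_out), self.delta)
        return x, residual_trend


def run_layers(layers, x, cross, trend):
    # Same loop as in Decoder.forward, without norm and projection
    for layer in layers:
        x, residual_trend = layer(x, cross, x_mask=None, cross_mask=None)
        trend = trend + residual_trend
    return x, trend


x = torch.randn(2, 10, 8)                  # seasonal path, d_model=8
cross = torch.randn(2, 12, 8)              # encoder output
trend_init = torch.full((2, 10, 5), 5.0)   # toy initial trend

_, trend_one = run_layers(nn.ModuleList([StubLayer()]), x, cross, trend_init)
_, trend_two = run_layers(nn.ModuleList([StubLayer(), StubLayer()]), x, cross, trend_init)

print(trend_one[0, 6, 0].item())   # close to 5.03
print(trend_two[0, 6, 0].item())   # close to 5.06
```

Because the loop rebinds `trend` rather than modifying it in place, `trend_init` itself is left untouched.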
5.2 my_Layernorm + Linear projection

```python
if self.norm is not None:
    x = self.norm(x)
if self.projection is not None:
    x = self.projection(x)
return x, trend
```

`my_Layernorm(x)` works the same way as in the Encoder: LayerNorm, then subtract the mean over the time axis, which preserves the seasonal character.
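
The following is a minimal sketch of that behaviour (LayerNorm followed by subtracting the per-channel mean over time), assuming the time axis is dim 1 of a (batch, length, channels) tensor. The class name `SeasonalLayerNorm` is hypothetical; it is meant to mirror the description of my_Layernorm, not to reproduce the original source.

```python
import torch
import torch.nn as nn


class SeasonalLayerNorm(nn.Module):
    """Hypothetical re-creation of the behaviour described for my_Layernorm."""

    def __init__(self, channels):
        super().__init__()
        self.layernorm = nn.LayerNorm(channels)

    def forward(self, x):
        x_hat = self.layernorm(x)                     # normalize over the channel axis
        time_mean = x_hat.mean(dim=1, keepdim=True)   # shape (batch, 1, channels)
        return x_hat - time_mean                      # broadcast over the time axis


norm = SeasonalLayerNorm(8)
x = torch.randn(2, 10, 8) + 3.0   # constant offset makes the effect easier to see
y = norm(x)

print(y.shape)
print(y.mean(dim=1).abs().max().item())   # near zero: no constant level left over time
```

After the subtraction, every channel has approximately zero mean along the time axis, so any constant level is left to the trend path.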
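`self.projection` is an `nn.Linear` from d_model=8 to c_out=5. It projects the seasonal representation from d_model dimensions back to the number of original variables:
- Input `x` (2, 10, 8) → Linear(8→5) → `seasonal_part` (2, 10, 5)

`trend_part` does not go through the projection, because it has already been projected to c_out=5 by the Conv1d inside DecoderLayer. It is returned as is, with shape (2, 10, 5).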
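
The shape bookkeeping for the two outputs can be sketched as follows, assuming d_model=8 and c_out=5 as in the toy configuration. The tensors are random placeholders, and `trend` stands in for the already accumulated trend.

```python
import torch
import torch.nn as nn

d_model, c_out = 8, 5
projection = nn.Linear(d_model, c_out)   # acts on the last axis only

x = torch.randn(2, 10, d_model)          # seasonal representation after the norm
trend = torch.randn(2, 10, c_out)        # placeholder, already in c_out channels

seasonal_part = projection(x)            # (2, 10, 8) -> (2, 10, 5)
trend_part = trend                       # returned unchanged, no projection

print(seasonal_part.shape, trend_part.shape)
```

Both outputs end up with the shape (2, 10, 5), so the caller can combine them element-wise.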
Toy value: `trend_part[0,6,0] = trend_init[0,6,0] + residual_trend[0,6,0] = 5.0 + 0.03 = 5.03`
6. Drill-down subcomponents

| Subcomponent | Responsibility | Lower-level document |
|---|---|---|
| DecoderLayer | 3 stages (masked self-attn + cross-attn + FFN) + 3× decomp + trend routing | [[03C1-Layer3-DecoderLayer]] |