Layer 2B: PastDecomposableMixing (PDM)
1. Position in the parent layer
In `forecast()`, the loop `for i in range(self.layer): enc_out_list = self.pdm_blocks[i](enc_out_list)` runs e_layers=2 times. The input and output shapes are identical in every iteration.
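2. I/O interface definition
| Parameter | Shape | Description |
|---|---|---|
| x_list (input) | [(6,24,8),(6,12,8),(6,6,8)] | Features at three scales |
| Return value | [(6,24,8),(6,12,8),(6,6,8)] | Same shapes; features after cross-scale mixing |

The shapes above are for CI mode.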
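
The three shapes follow from repeatedly shrinking the time axis. Below is a minimal sketch, assuming the toy configuration uses a leading dimension of 6, seq_len = 24, d_model = 8, a downsampling window of 2, and two downsampling steps. These values are inferred from the shapes in the table rather than stated in it, and `multi_scale_shapes` is a hypothetical helper, not part of the model code.

```python
def multi_scale_shapes(batch, seq_len, d_model, window, n_down):
    """Return the (batch, T_i, d_model) shape at each scale i = 0..n_down."""
    return [(batch, seq_len // window**i, d_model) for i in range(n_down + 1)]


# Assumed toy configuration (inferred from the table, not given explicitly).
shapes = multi_scale_shapes(batch=6, seq_len=24, d_model=8, window=2, n_down=2)
print(shapes)  # [(6, 24, 8), (6, 12, 8), (6, 6, 8)]
```

Because PDM returns the same list of shapes it receives, the block can be stacked any number of times without reshaping in between.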
3. Sequence diagram
4. Semantic grouping diagram
5. Step-by-step walkthrough
§5.0 Full original code
```python
import torch.nn as nn

# series_decomp, DFT_series_decomp, MultiScaleSeasonMixing and
# MultiScaleTrendMixing are defined elsewhere in the project.


class PastDecomposableMixing(nn.Module):
    def __init__(self, configs):
        super(PastDecomposableMixing, self).__init__()
        self.seq_len = configs.seq_len
        self.pred_len = configs.pred_len
        self.down_sampling_window = configs.down_sampling_window

        self.layer_norm = nn.LayerNorm(configs.d_model)
        self.dropout = nn.Dropout(configs.dropout)
        self.channel_independence = configs.channel_independence

        # Choose how each scale is split into season and trend.
        if configs.decomp_method == "moving_avg":
            self.decompsition = series_decomp(configs.moving_avg)
        elif configs.decomp_method == "dft_decomp":
            self.decompsition = DFT_series_decomp(configs.top_k)
        else:
            raise ValueError("decompsition is error")

        # Only the channel-dependent (CD) path mixes features before scale mixing.
        if configs.channel_independence == 0:
            self.cross_layer = nn.Sequential(
                nn.Linear(in_features=configs.d_model, out_features=configs.d_ff),
                nn.GELU(),
                nn.Linear(in_features=configs.d_ff, out_features=configs.d_model),
            )

        self.mixing_multi_scale_season = MultiScaleSeasonMixing(configs)
        self.mixing_multi_scale_trend = MultiScaleTrendMixing(configs)

        self.out_cross_layer = nn.Sequential(
            nn.Linear(in_features=configs.d_model, out_features=configs.d_ff),
            nn.GELU(),
            nn.Linear(in_features=configs.d_ff, out_features=configs.d_model),
        )

    def forward(self, x_list):
        # Remember the time length of every scale for the final crop.
        length_list = []
        for x in x_list:
            _, T, _ = x.size()
            length_list.append(T)

        # Decompose to obtain the season and trend
        season_list = []
        trend_list = []
        for x in x_list:
            season, trend = self.decompsition(x)
            if self.channel_independence == 0:
                season = self.cross_layer(season)
                trend = self.cross_layer(trend)
            season_list.append(season.permute(0, 2, 1))
            trend_list.append(trend.permute(0, 2, 1))

        # bottom-up season mixing
        out_season_list = self.mixing_multi_scale_season(season_list)
        # top-down trend mixing
        out_trend_list = self.mixing_multi_scale_trend(trend_list)

        out_list = []
        for ori, out_season, out_trend, length in zip(
            x_list, out_season_list, out_trend_list, length_list
        ):
            out = out_season + out_trend
            if self.channel_independence:
                out = ori + self.out_cross_layer(out)
            out_list.append(out[:, :length, :])
        return out_list
```
§5.1 Macro-level logic
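Design intuition: mixing trend and seasonal components at the same scale makes them interfere with each other. High-frequency noise contaminates the trend, and the trend background blurs the seasonal peaks. PDM first separates the two, then lets each propagate in the direction that suits it best: the seasonal component is compressed from fine to coarse scales (the finest scale carries the richest oscillation information), while the trend is broadcast from coarse to fine scales (the coarse scales have already been smoothed by AvgPool).
In the toy example, series_decomp splits each scale into season and trend, both are permuted to (6, 8, T), mixed across scales, summed, and passed through `ori + out_cross_layer(out)` to produce the output.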
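§5.2 Step 1: record length_list
```python
length_list = []
for x in x_list:
    _, T, _ = x.size()
    length_list.append(T)
```
Shape note: iterate over x_list and record the time length of each scale.
Toy values: length_list = [24, 12, 6]. The lengths are recorded up front as a safeguard. The Linear layers in the subsequent mixing produce outputs whose size equals the downsampled target length, so the shapes should in principle already match. If seq_len is not divisible by the window, however, small mismatches may appear, and `out[:, :length, :]` guards against them. Note that the slice can only truncate an over-long output; it cannot extend a short one.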
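
To make the role of the final crop concrete, here is a small sketch on plain Python lists. It assumes a hypothetical helper `crop_to_length` that mimics `out[:, :length, :]` along the time axis only; the sequences are made-up stand-ins for the mixing output.

```python
def crop_to_length(seq, length):
    """Mimic out[:, :length, :] on a 1-D list: keep at most `length` steps."""
    return seq[:length]


length_list = [24, 12, 6]

same = crop_to_length(list(range(24)), length_list[0])    # already the right size
longer = crop_to_length(list(range(13)), length_list[1])  # one step too long
shorter = crop_to_length(list(range(5)), length_list[2])  # one step too short

print(len(same), len(longer), len(shorter))  # 24 12 5
```

The crop is a no-op when lengths already match and trims any surplus, but a too-short output stays too short, so the safeguard covers only one direction of mismatch.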
§5.3 Step 2: series_decomp decomposition + permute
```python
for x in x_list:
    season, trend = self.decompsition(x)
    if self.channel_independence == 0:
        season = self.cross_layer(season)
        trend = self.cross_layer(trend)
    season_list.append(season.permute(0, 2, 1))
    trend_list.append(trend.permute(0, 2, 1))
```
Shape note: decompsition = series_decomp(moving_avg=3). For each x of shape (6, T, 8), series_decomp applies a 3-point moving average to x to obtain trend and sets season = x - trend; both keep the shape (6, T, 8). `season.permute(0, 2, 1)` turns (6, T, 8) into (6, 8, T) so that the Linear layers in MultiScaleSeasonMixing act on the last (time) dimension. In CI mode the `channel_independence == 0` branch is not executed.
Toy values (scale 0): x has shape (6, 24, 8). After series_decomp, season has shape (6, 24, 8) and trend has shape (6, 24, 8). `season.permute(0, 2, 1)` → (6, 8, 24), which is appended to season_list. After permuting all three scales: season_list = [(6,8,24), (6,8,12), (6,8,6)], trend_list = [(6,8,24), (6,8,12), (6,8,6)].
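
The decomposition itself can be illustrated on a single 1-D series. The following is a minimal pure-Python sketch, assuming an odd kernel size and replicate padding at both ends so that the trend keeps the input length. `moving_avg` and `series_decomp_1d` are hypothetical names for this illustration, not the project's classes, and the input values are made up.

```python
def moving_avg(x, kernel_size):
    """Moving average with edge replication so the output keeps len(x).

    Assumes an odd kernel_size.
    """
    pad = (kernel_size - 1) // 2
    padded = [x[0]] * pad + list(x) + [x[-1]] * pad
    return [sum(padded[i:i + kernel_size]) / kernel_size for i in range(len(x))]


def series_decomp_1d(x, kernel_size=3):
    """Split x into (season, trend) with trend = moving average, season = x - trend."""
    trend = moving_avg(x, kernel_size)
    season = [xi - ti for xi, ti in zip(x, trend)]
    return season, trend


x = [3.0, 6.0, 9.0, 6.0, 3.0]
season, trend = series_decomp_1d(x)
print(trend)   # [4.0, 6.0, 7.0, 6.0, 4.0]
print(season)  # [-1.0, 0.0, 2.0, 0.0, -1.0]
```

By construction season + trend reproduces x exactly, which is why the two streams can be mixed separately and summed again in step 5.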
§5.4 Step 3: MultiScaleSeasonMixing (bottom-up)
```python
out_season_list = self.mixing_multi_scale_season(season_list)
```
Shape note: input season_list = [(6,8,24),(6,8,12),(6,8,6)]. Output out_season_list = [(6,24,8),(6,12,8),(6,6,8)] (already permuted back to the input layout).
Toy values: see [[03B1-Layer3-SeasonMixing]] for the full trace. Result: out_season_list = [(6,24,8),(6,12,8),(6,6,8)].
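
The fine-to-coarse direction can be sketched on 1-D lists, one list per scale. This is only an illustration of the bottom-up pattern under loud assumptions: `downsample_pairs` (averaging adjacent pairs) is a stand-in for the learned Linear layers, the batch and feature dimensions and the permutes are ignored, and both function names are hypothetical.

```python
def downsample_pairs(x):
    """Stand-in for a learned time-axis projection T -> T // 2: average adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def bottom_up_mix(season_list):
    """Push information from the finest scale into each coarser scale in turn."""
    out_high = season_list[0]
    out_list = [out_high]
    for i in range(len(season_list) - 1):
        out_low = season_list[i + 1]
        # Residual-style update: coarse scale + projected finer scale.
        mixed = [a + b for a, b in zip(out_low, downsample_pairs(out_high))]
        out_list.append(mixed)
        out_high = mixed
    return out_list


# Only the finest scale carries signal; watch it reach the coarsest one.
season_list = [[1.0] * 24, [0.0] * 12, [0.0] * 6]
out = bottom_up_mix(season_list)
print([len(s) for s in out])  # [24, 12, 6]
print(out[2][0])              # 1.0
```

The per-scale lengths are preserved, and the coarsest scale ends up containing information that originated at the finest one.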
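§5.5 Step 4: MultiScaleTrendMixing (top-down)
```python
out_trend_list = self.mixing_multi_scale_trend(trend_list)
```
Shape note: input trend_list = [(6,8,24),(6,8,12),(6,8,6)]. Output out_trend_list = [(6,24,8),(6,12,8),(6,6,8)] (already permuted back, and restored from reversed order to the original order). Internally, Linear layers project the coarse-scale trend onto the next finer scale.
Toy values: see [[03B2-Layer3-TrendMixing]] for the full trace. Result: out_trend_list = [(6,24,8),(6,12,8),(6,6,8)].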
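
The coarse-to-fine direction, including the reverse-then-restore ordering, can be sketched the same way. As an assumption for illustration only, `upsample_repeat` (repeating each step) replaces the learned Linear layers, the batch and feature dimensions are dropped, and both function names are hypothetical.

```python
def upsample_repeat(x, factor=2):
    """Stand-in for a learned time-axis projection T -> T * factor: repeat each step."""
    return [v for v in x for _ in range(factor)]


def top_down_mix(trend_list):
    """Push information from the coarsest scale into each finer scale in turn."""
    reversed_list = trend_list[::-1]  # coarsest scale first
    out_low = reversed_list[0]
    out_list = [out_low]
    for i in range(len(reversed_list) - 1):
        out_high = reversed_list[i + 1]
        # Residual-style update: fine scale + projected coarser scale.
        mixed = [a + b for a, b in zip(out_high, upsample_repeat(out_low))]
        out_list.append(mixed)
        out_low = mixed
    return out_list[::-1]  # restore fine-to-coarse order


# Only the coarsest scale carries signal; watch it reach the finest one.
trend_list = [[0.0] * 24, [0.0] * 12, [2.0] * 6]
out = top_down_mix(trend_list)
print([len(t) for t in out])  # [24, 12, 6]
print(out[0][0])              # 2.0
```

Reversing the result at the end is what keeps out_trend_list aligned index by index with x_list and out_season_list in the merge step.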
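§5.6 Step 5: merge + residual connection
```python
for ori, out_season, out_trend, length in zip(
    x_list, out_season_list, out_trend_list, length_list
):
    out = out_season + out_trend
    if self.channel_independence:
        out = ori + self.out_cross_layer(out)
    out_list.append(out[:, :length, :])
```
Shape note (CI mode): taking scale 0 as the example, ori = x_list[0] has shape (6, 24, 8), and out_season and out_trend both have shape (6, 24, 8). `out = out_season + out_trend` → (6, 24, 8) is an element-wise sum that merges the seasonal and trend streams. out_cross_layer (Linear → GELU → Linear, d_model → d_ff → d_model) preserves the shape (6, 24, 8). `ori + out_cross_layer(out)` is a Transformer-style residual connection. The slice `out[:, :24, :]` is a no-op here (length = 24), so the shape stays (6, 24, 8).
Toy values: out_cross_layer applies the two-layer MLP independently to the 8-dimensional vector at every position of the (6, 24, 8) tensor; `out = ori + out_cross_layer(out)` has shape (6, 24, 8). After all three scales are processed, out_list = [(6,24,8),(6,12,8),(6,6,8)], exactly the same shapes as the input x_list.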
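Difference between the CI and CD merge paths
CI mode (channel_independence=1, the default in this note): `out = ori + out_cross_layer(out)`, a residual of the form "input + mixed result", in the style of a Transformer block.
CD mode (channel_independence=0): this residual is skipped and the output is simply `out = out_season + out_trend` (in CD mode, cross_layer was already applied to season and trend themselves during the decomposition in step 2).
Both paths produce an out_list with the same shapes, but the features carry different meanings.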
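
The two merge paths can be compared side by side on a single scale. This is a minimal sketch on 1-D lists: `merge_scale` mirrors the loop body of step 5, while `halve` is a made-up stand-in for the two-layer MLP out_cross_layer, and the numbers are invented for illustration.

```python
def merge_scale(ori, out_season, out_trend, length, channel_independence,
                out_cross_layer):
    """Merge one scale: sum season and trend, add the residual only in CI mode."""
    out = [s + t for s, t in zip(out_season, out_trend)]
    if channel_independence:
        out = [o + c for o, c in zip(ori, out_cross_layer(out))]
    return out[:length]


def halve(seq):
    """Hypothetical stand-in for out_cross_layer (the real one is a learned MLP)."""
    return [0.5 * v for v in seq]


ori, season, trend = [10.0, 10.0], [1.0, 2.0], [3.0, 4.0]
ci = merge_scale(ori, season, trend, 2, 1, halve)
cd = merge_scale(ori, season, trend, 2, 0, halve)
print(ci)  # [12.0, 13.0]
print(cd)  # [4.0, 6.0]
```

In the CI result the original input dominates and the mixed signal acts as a correction, whereas the CD result contains only the mixed season and trend.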
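6. Drill-down sub-components
| Sub-component | Responsibility | Document |
|---|---|---|
| MultiScaleSeasonMixing | Bottom-up seasonal mixing (fine → coarse) | [[03B1-Layer3-SeasonMixing]] |
| MultiScaleTrendMixing | Top-down trend mixing (coarse → fine) | [[03B2-Layer3-TrendMixing]] |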