TimeMixer Debug Parameters
§1 PyCharm Run Configuration
Script path: D:\1sudyta\1ai-self\aistyle\TFB\scripts\run_benchmark.py
Parameters:
--config-path rolling_forecast_config.json
--data-name-list ETTh1.csv
--strategy-args {"horizon": 6}
--model-name time_series_library.TimeMixer
--model-hyper-params {"batch_size":2,"seq_len":24,"horizon":6,"d_model":8,"d_ff":16,"n_heads":4,"e_layers":2,"down_sampling_window":2,"down_sampling_layers":2,"down_sampling_method":"avg","decomp_method":"moving_avg","moving_avg":3,"channel_independence":1,"use_norm":true,"dropout":0.0,"embed":"timeF","top_k":3,"num_epochs":1,"patience":3,"lr":0.001,"loss":"MSE"}
--adapter transformer_adapter
--deterministic full
--gpus 0
--num-workers 1
--timeout 60000
--save-path debug/TimeMixer
Working directory: D:\1sudyta\1ai-self\aistyle\TFB
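Hand-written JSON inside a parameter field is easy to break with a stray quote. As a minimal sketch, the same argument list can be generated from a Python dict with `json.dumps`, so the JSON is always well formed. The helper name `build_args` is hypothetical; the flags and values are copied from the parameter list above.

```python
import json
import shlex

# Hyperparameters copied from the parameter list above.
model_hyper_params = {
    "batch_size": 2, "seq_len": 24, "horizon": 6,
    "d_model": 8, "d_ff": 16, "n_heads": 4, "e_layers": 2,
    "down_sampling_window": 2, "down_sampling_layers": 2,
    "down_sampling_method": "avg", "decomp_method": "moving_avg",
    "moving_avg": 3, "channel_independence": 1, "use_norm": True,
    "dropout": 0.0, "embed": "timeF", "top_k": 3,
    "num_epochs": 1, "patience": 3, "lr": 0.001, "loss": "MSE",
}


def build_args(hyper_params, horizon=6, save_path="debug/TimeMixer"):
    """Return the command-line arguments, one string per argv entry."""
    return [
        "--config-path", "rolling_forecast_config.json",
        "--data-name-list", "ETTh1.csv",
        "--strategy-args", json.dumps({"horizon": horizon}),
        "--model-name", "time_series_library.TimeMixer",
        "--model-hyper-params", json.dumps(hyper_params),
        "--adapter", "transformer_adapter",
        "--deterministic", "full",
        "--gpus", "0",
        "--num-workers", "1",
        "--timeout", "60000",
        "--save-path", save_path,
    ]


args = build_args(model_hyper_params)
# shlex.join quotes each entry for a POSIX shell; other shells and IDE
# parameter fields may need different quoting.
print(shlex.join(args))
```

Keeping the arguments as a list (one string per entry) matches the shape of the `args` array in a VSCode launch configuration, so the list can be dumped with `json.dumps(args)` and pasted there if desired.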
§2 VSCode launch.json
```json
{
    "name": "TimeMixer Debug",
    "type": "debugpy",
    "request": "launch",
    "program": "${workspaceFolder}/scripts/run_benchmark.py",
    "args": [
        "--config-path", "rolling_forecast_config.json",
        "--data-name-list", "ETTh1.csv",
        "--strategy-args", "{\"horizon\": 6}",
        "--model-name", "time_series_library.TimeMixer",
        "--model-hyper-params", "{\"batch_size\":2,\"seq_len\":24,\"horizon\":6,\"d_model\":8,\"d_ff\":16,\"n_heads\":4,\"e_layers\":2,\"down_sampling_window\":2,\"down_sampling_layers\":2,\"down_sampling_method\":\"avg\",\"decomp_method\":\"moving_avg\",\"moving_avg\":3,\"channel_independence\":1,\"use_norm\":true,\"dropout\":0.0,\"embed\":\"timeF\",\"top_k\":3,\"num_epochs\":1,\"patience\":3,\"lr\":0.001,\"loss\":\"MSE\"}",
        "--adapter", "transformer_adapter",
        "--deterministic", "full",
        "--gpus", "0",
        "--num-workers", "1",
        "--timeout", "60000",
        "--save-path", "debug/TimeMixer"
    ],
    "cwd": "${workspaceFolder}",
    "console": "integratedTerminal"
}
```
§3 First Principles for Parameter Selection
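| Parameter | Value | Reason |
|---|---|---|
| data-name-list | ETTh1.csv | N=7 variables; differs from the toy setting enc_in=3 but is still small; after the CI reshape, B×N = 2×7 = 14 |
| batch_size | 2 | Minimal, which makes tensor shapes easy to trace |
| seq_len | 24 | Divisible by window=2 twice: 24→12→6 |
| horizon | 6 | Differs from seq_len (24) and from the middle scale length (12); note that it equals the coarsest scale length T_scale2=6, although the two have different meanings |
| d_model | 8 | Small enough; differs from seq_len, pred_len, and enc_in |
| d_ff | 16 | = d_model × 2; standard ratio |
| e_layers | 2 | Covers the PDM loop (at least 2 iterations) |
| down_sampling_layers | 2 | Produces 3 scales: T=24, 12, 6; same value as e_layers but a different meaning |
| down_sampling_window | 2 | Standard value; AvgPool(2) |
| channel_independence | 1 | CI mode; covers the main path |
| decomp_method | moving_avg | TFB default path |
| moving_avg | 3 | Smallest non-trivial odd kernel (used by series_decomp) |
| num_epochs | 1 | One training epoch is enough to cover the forward path |
| dropout | 0.0 | Removes dropout randomness so that runs are reproducible |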
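Important: down_sampling_layers=2 passed by this script is read correctly by TimeMixer, because the name matches configs.down_sampling_layers in the code. The official TFB scripts, however, pass down_sampling_layer (singular), so the model actually reads the MODEL_HYPER_PARAMS default of 3. When debugging, pass down_sampling_layers (plural) explicitly so that the value takes effect.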
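The singular/plural pitfall above can be reproduced in isolation. The sketch below assumes the user-supplied hyperparameters are merged over a defaults dict with a plain dict merge; that merge, the `DEFAULTS` dict, and both helper names are illustrative stand-ins, not TFB's actual code.

```python
# Hypothetical defaults standing in for the adapter's MODEL_HYPER_PARAMS.
DEFAULTS = {"down_sampling_layers": 3, "down_sampling_window": 2}


def merge_params(defaults, overrides):
    """Plain dict merge: unknown keys are accepted without any warning."""
    return {**defaults, **overrides}


def unknown_keys(defaults, overrides):
    """Return override keys that no default declares (likely typos)."""
    return sorted(set(overrides) - set(defaults))


singular = merge_params(DEFAULTS, {"down_sampling_layer": 2})
plural = merge_params(DEFAULTS, {"down_sampling_layers": 2})

print(singular["down_sampling_layers"])  # 3: the default survives
print(plural["down_sampling_layers"])    # 2: the override takes effect
print(unknown_keys(DEFAULTS, {"down_sampling_layer": 2}))
```

A check like `unknown_keys` is only meaningful when the defaults dict lists every supported key; if some keys are consumed elsewhere, it will report false positives.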
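§4 Parameter Meaning Quick Reference
| Parameter | Access path in code | What it determines |
|---|---|---|
| seq_len | configs.seq_len | Original input length; input dimension of the per-scale Linear layers in SeasonMixing |
| pred_len | configs.pred_len (set automatically from horizon) | Output dimension of the FMM predict_layers |
| enc_in | configs.enc_in (inferred by TFB from the number of data columns) | Number of variables N |
| d_model | configs.d_model | Embedding output dimension; hidden dimension of the Linear layers in PDM |
| d_ff | configs.d_ff | Intermediate dimension of out_cross_layer in PDM |
| down_sampling_window | configs.down_sampling_window | AvgPool kernel size; downsampling factor per step |
| down_sampling_layers | configs.down_sampling_layers | Number of downsampling steps; number of scales = layers + 1 |
| channel_independence | configs.channel_independence | 1 = CI mode, 0 = CD mode |
| moving_avg | configs.moving_avg | Moving-average kernel size of series_decomp |
| use_norm | configs.use_norm | Whether the Normalize layer is active (0 = skip normalization) |
| top_k | configs.top_k | Used by DFT_series_decomp (when decomp_method=dft_decomp) |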
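The relation "number of scales = layers + 1" and the divisibility requirement on seq_len can be checked before launching a run. This is a small sketch; `scale_lengths` is a hypothetical helper, and raising on a non-divisible length is a choice made here for debugging clarity rather than something the model enforces.

```python
def scale_lengths(seq_len, window, layers):
    """Return the sequence length at each scale, finest first.

    Raises ValueError when a length is not divisible by the window,
    so that every scale length stays an exact, easy-to-recognize number.
    """
    lengths = [seq_len]
    for step in range(layers):
        current = lengths[-1]
        if current % window != 0:
            raise ValueError(
                f"length {current} at step {step} is not divisible by {window}"
            )
        lengths.append(current // window)
    return lengths


print(scale_lengths(seq_len=24, window=2, layers=2))  # [24, 12, 6]
```

With the debug settings this yields three scales, matching down_sampling_layers + 1.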
§5 Loop Coverage Check
| Loop / branch | How to cover it | Covered by current parameters? |
|---|---|---|
| e_layers PDM loop | e_layers=2 | ✅ pdm_blocks runs 2 times |
| down_sampling_layers scale loop | down_sampling_layers=2 → 3 scales | ✅ the for loop in forecast runs 3 times |
| CI path (channel_independence==1) | channel_independence=1 | ✅ CI reshape + CI projection |
| CD path (channel_independence==0) | Change to 0 | ❌ not covered by the default debug run |
| decomp_method="moving_avg" path | Current setting | ✅ |
| decomp_method="dft_decomp" path | Change decomp_method | ❌ not covered by default |
| x_mark_enc is not None branch | ETTh1 has timestamps | ✅ takes the mark path |
| FMM CI path | channel_independence=1 | ✅ |
| SeasonMixing multi-scale loop | down_sampling_layers=2 → loop runs 2 times | ✅ |
| TrendMixing multi-scale loop | Same as above | ✅ |
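To reach the two uncovered branches in the table above (CD mode and dft_decomp), one option is to generate a hyperparameter variant per branch combination. The sketch below only builds the JSON strings; `BASE` is a shortened stand-in for the full hyperparameter dict from §1, `coverage_variants` is a hypothetical helper, and whether every combination trains cleanly has not been verified here.

```python
import itertools
import json

# Shortened stand-in for the full --model-hyper-params dict from §1.
BASE = {"batch_size": 2, "seq_len": 24, "horizon": 6, "down_sampling_layers": 2}

# Branch-selecting parameters and the values that reach each branch.
BRANCHES = {
    "channel_independence": [1, 0],
    "decomp_method": ["moving_avg", "dft_decomp"],
}


def coverage_variants(base, branches):
    """Yield one hyperparameter dict per combination of branch values."""
    keys = list(branches)
    for values in itertools.product(*(branches[k] for k in keys)):
        yield {**base, **dict(zip(keys, values))}


variants = list(coverage_variants(BASE, BRANCHES))
for params in variants:
    # Each line is a candidate value for --model-hyper-params.
    print(json.dumps(params))
```

The first variant reproduces the default debug setting (CI with moving_avg); the remaining three each flip at least one branch.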
§6 Shape Tracing Quick Reference
Using ETTh1 (N=7) as the example, with batch_size=2:
| Step | Shape |
|---|---|
| x_enc input | (2, 24, 7) |
| AvgPool → x_list | [(2,24,7), (2,12,7), (2,6,7)] |
| CI reshape | [(14,24,1), (14,12,1), (14,6,1)] |
| enc_embedding → enc_out_list | [(14,24,8), (14,12,8), (14,6,8)] |
| PDM output (shape unchanged) | [(14,24,8), (14,12,8), (14,6,8)] |
| season_list / trend_list inside PDM (already permuted) | each [(14,8,24), (14,8,12), (14,8,6)] |
| FMM output per scale | [(2,6,7), (2,6,7), (2,6,7)] |
| stack+sum → dec_out | (2, 6, 7) |
| denorm → final output | (2, 6, 7) |
For the toy document (enc_in=3), substitute N=3 for N=7 in the shapes above.
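The shape table above follows from a few arithmetic rules, which can be written down as a sketch and compared against what the debugger shows. `trace_shapes` and its dictionary keys are hypothetical names; the function does pure shape arithmetic for the CI path and does not run the model.

```python
def trace_shapes(B=2, T=24, N=7, d_model=8, window=2, layers=2, pred_len=6):
    """Return the expected shape at each step of the CI forward path."""
    # One length per scale; [24, 12, 6] for the default arguments.
    lengths = [T // window**i for i in range(layers + 1)]
    BN = B * N  # CI mode folds the variable axis into the batch axis
    return {
        "x_list": [(B, t, N) for t in lengths],
        "ci_reshape": [(BN, t, 1) for t in lengths],
        "enc_out_list": [(BN, t, d_model) for t in lengths],
        "season_trend_permuted": [(BN, d_model, t) for t in lengths],
        "fmm_out_list": [(B, pred_len, N) for _ in lengths],
        "dec_out": (B, pred_len, N),
    }


for step, shape in trace_shapes().items():
    print(f"{step}: {shape}")
```

Calling `trace_shapes(N=3)` gives the corresponding shapes for a three-variable input, where the folded batch size becomes 2×3 = 6.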
§7 Key Breakpoints
| # | File | Location | Purpose |
|---|---|---|---|
| 1 | adapters_for_transformers.py | first line of _process | Confirm the x_enc/x_dec shapes |
| 2 | TimeMixer.py | first line of forecast | Enter the main chain |
| 3 | TimeMixer.py | AvgPool line in __multi_scale_process_inputs | Verify the shape of x_list at each scale |
| 4 | TimeMixer.py | after the CI reshape in forecast | Confirm the B×N dimension |
| 5 | TimeMixer.py | after the enc_embedding call | Confirm the embedding dimension (Ti→d_model) |
| 6 | PastDecomposableMixing.forward | after season_list.append | Verify the shape after permute (B,d,T) |
| 7 | MultiScaleSeasonMixing.forward | loop body, out_low = out_low + out_low_res | Verify that both sides of the residual addition have the same shape |
| 8 | MultiScaleTrendMixing.forward | before and after out_trend_list.reverse() | Confirm that the order is restored by the reversal |
| 9 | PastDecomposableMixing.forward | out = ori + self.out_cross_layer(out) | Confirm the CI residual path |
| 10 | TimeMixer.future_multi_mixing | dec_out.reshape(B, ...) line | Verify that the CI reshape is undone |
| 11 | TimeMixer.forecast | torch.stack(...).sum(-1) | Confirm that the 3 scales merge → (2, 6, N), with N=3 (toy) or N=7 (ETTh1) |
| 12 | TimeMixer.forecast | normalize_layers[0](dec_out, "denorm") | Confirm the final denorm output |
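Several of the breakpoints above stop on lists of tensors (x_list, season_list, and so on), where the debugger's default display is hard to scan. A small helper evaluated in the debug console can reduce them to shapes. This is a sketch: `shapes_of` is a hypothetical name, it relies only on objects exposing a `.shape` attribute, and `FakeTensor` is a stand-in used so the example runs without PyTorch.

```python
def shapes_of(obj):
    """Recursively replace anything with a .shape attribute by its shape tuple."""
    if hasattr(obj, "shape"):
        return tuple(obj.shape)
    if isinstance(obj, (list, tuple)):
        return [shapes_of(item) for item in obj]
    if isinstance(obj, dict):
        return {key: shapes_of(value) for key, value in obj.items()}
    # Anything else is reported by type name so it is still visible.
    return type(obj).__name__


class FakeTensor:
    """Stand-in for a tensor; only carries a shape."""

    def __init__(self, *shape):
        self.shape = shape


enc_out_list = [FakeTensor(14, 24, 8), FakeTensor(14, 12, 8), FakeTensor(14, 6, 8)]
print(shapes_of(enc_out_list))  # [(14, 24, 8), (14, 12, 8), (14, 6, 8)]
```

In a live session the same call can be used as a watch expression on the real list, for example at breakpoint 5 or 6.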
§8 Comparison with the Autoformer Debug Parameters
| Dimension | Autoformer | TimeMixer |
|---|---|---|
| adapter | transformer_adapter | transformer_adapter |
| dataset | ETTh1.csv (N=7) | ETTh1.csv (N=7) |
| seq_len | 12 | 24 |
| pred_len | 4 | 6 |
| Key parameters | moving_avg=3, e_layers=2, d_layers=1 | down_sampling_layers=2, e_layers=2 |
| CI/CD | No such concept | channel_independence=1 (CI) |
| Model-specific check | AutoCorrelation FFT lag | Length and direction of the PDM multi-scale lists |
| x_dec usage | ✅ used by the Decoder | ❌ not used at all |
| Key shape turning point | enc/dec dimension split | B×N CI reshape |