
TimeMixer Debug Parameters

§1 PyCharm Run Configuration

Script path: D:\1sudyta\1ai-self\aistyle\TFB\scripts\run_benchmark.py

Parameters

--config-path rolling_forecast_config.json
--data-name-list ETTh1.csv
--strategy-args {"horizon": 6}
--model-name time_series_library.TimeMixer
--model-hyper-params {"batch_size":2,"seq_len":24,"horizon":6,"d_model":8,"d_ff":16,"n_heads":4,"e_layers":2,"down_sampling_window":2,"down_sampling_layers":2,"down_sampling_method":"avg","decomp_method":"moving_avg","moving_avg":3,"channel_independence":1,"use_norm":true,"dropout":0.0,"embed":"timeF","top_k":3,"num_epochs":1,"patience":3,"lr":0.001,"loss":"MSE"}
--adapter transformer_adapter
--deterministic full
--gpus 0
--num-workers 1
--timeout 60000
--save-path debug/TimeMixer

Working directory: D:\1sudyta\1ai-self\aistyle\TFB


§2 VSCode launch.json

```json
{
  "name": "TimeMixer Debug",
  "type": "debugpy",
  "request": "launch",
  "program": "${workspaceFolder}/scripts/run_benchmark.py",
  "args": [
    "--config-path", "rolling_forecast_config.json",
    "--data-name-list", "ETTh1.csv",
    "--strategy-args", "{\"horizon\": 6}",
    "--model-name", "time_series_library.TimeMixer",
    "--model-hyper-params", "{\"batch_size\":2,\"seq_len\":24,\"horizon\":6,\"d_model\":8,\"d_ff\":16,\"n_heads\":4,\"e_layers\":2,\"down_sampling_window\":2,\"down_sampling_layers\":2,\"down_sampling_method\":\"avg\",\"decomp_method\":\"moving_avg\",\"moving_avg\":3,\"channel_independence\":1,\"use_norm\":true,\"dropout\":0.0,\"embed\":\"timeF\",\"top_k\":3,\"num_epochs\":1,\"patience\":3,\"lr\":0.001,\"loss\":\"MSE\"}",
    "--adapter", "transformer_adapter",
    "--deterministic", "full",
    "--gpus", "0",
    "--num-workers", "1",
    "--timeout", "60000",
    "--save-path", "debug/TimeMixer"
  ],
  "cwd": "${workspaceFolder}",
  "console": "integratedTerminal"
}
```
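Hand-escaping the JSON strings inside `args` is easy to get wrong. The following is a minimal sketch, using only Python's standard library, that generates the same argument list from plain dicts. The helper name `build_args` is hypothetical and is not part of TFB.

```python
import json

# Hyperparameters copied from the debug configuration in these notes.
MODEL_HYPER_PARAMS = {
    "batch_size": 2, "seq_len": 24, "horizon": 6, "d_model": 8, "d_ff": 16,
    "n_heads": 4, "e_layers": 2, "down_sampling_window": 2,
    "down_sampling_layers": 2, "down_sampling_method": "avg",
    "decomp_method": "moving_avg", "moving_avg": 3, "channel_independence": 1,
    "use_norm": True, "dropout": 0.0, "embed": "timeF", "top_k": 3,
    "num_epochs": 1, "patience": 3, "lr": 0.001, "loss": "MSE",
}


def build_args(hyper_params, horizon=6):
    """Hypothetical helper: flat CLI argument list for run_benchmark.py."""
    return [
        "--config-path", "rolling_forecast_config.json",
        "--data-name-list", "ETTh1.csv",
        "--strategy-args", json.dumps({"horizon": horizon}),
        "--model-name", "time_series_library.TimeMixer",
        "--model-hyper-params", json.dumps(hyper_params, separators=(",", ":")),
        "--adapter", "transformer_adapter",
        "--deterministic", "full",
        "--gpus", "0",
        "--num-workers", "1",
        "--timeout", "60000",
        "--save-path", "debug/TimeMixer",
    ]


args = build_args(MODEL_HYPER_PARAMS)
# Serialising the whole list yields a value that can be pasted into "args".
print(json.dumps(args, indent=2))
```

Because `json.dumps` handles the quoting, changing a hyperparameter only requires editing the dict and regenerating the list.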

§3 First-Principles Parameter Choices

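| Parameter | Value | Reason |
| --- | --- | --- |
| data-name-list | ETTh1.csv | N=7 variables; differs from the toy setting enc_in=3 but is still small; after the CI reshape, B×N=2×7=14 |
| batch_size | 2 | Minimal, which makes tensor shapes easy to trace |
| seq_len | 24 | Divisible by window=2 twice: 24→12→6 |
| horizon | 6 | Differs from seq_len (24) and the scale-1 length (12). Note that it equals the scale-2 length (T_scale2 = 6 = pred_len), although the two have different meanings |
| d_model | 8 | Small enough; differs from seq_len, pred_len, and enc_in |
| d_ff | 16 | = d_model × 2; a standard ratio |
| e_layers | 2 | Covers the PDM loop (at least 2 iterations) |
| down_sampling_layers | 2 | Produces 3 scales: T=24, 12, 6; same value as e_layers but a different meaning |
| down_sampling_window | 2 | Standard value; AvgPool(2) |
| channel_independence | 1 | CI mode, which covers the main path |
| decomp_method | moving_avg | The TFB default path |
| moving_avg | 3 | Smallest non-trivial odd kernel (used by series_decomp) |
| num_epochs | 1 | One epoch is enough to cover the forward path |
| dropout | 0.0 | Disables dropout randomness so that traced values are reproducible |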
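The choices above can be sanity-checked before launching the debugger. The sketch below computes the per-scale lengths and lists dimensions that share a value, which are the ones most easily confused when reading shapes. The function name `scale_lengths` and the label names in `dims` are illustrative, and the strict divisibility check reflects the choice made in these notes rather than a confirmed requirement of the model code.

```python
def scale_lengths(seq_len, window, layers):
    """Return the sequence length at each scale, starting with the original."""
    lengths = [seq_len]
    for _ in range(layers):
        if lengths[-1] % window != 0:
            raise ValueError(f"{lengths[-1]} is not divisible by window={window}")
        lengths.append(lengths[-1] // window)
    return lengths


lengths = scale_lengths(seq_len=24, window=2, layers=2)
print(lengths)  # [24, 12, 6]

# Dimensions taken from the debug configuration; labels are illustrative.
dims = {
    "seq_len": 24, "T_scale1": 12, "T_scale2": 6,
    "pred_len": 6, "d_model": 8, "enc_in": 7,
}

# Pairs of distinct dimensions that happen to share the same value.
clashes = sorted(
    (a, b) for a in dims for b in dims if a < b and dims[a] == dims[b]
)
print(clashes)  # [('T_scale2', 'pred_len')]
```

The single reported clash is the T_scale2 / pred_len coincidence, so a dimension of size 6 needs to be identified by position rather than by value.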

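Important: passing down_sampling_layers=2 (plural) is read correctly by TimeMixer, because it matches configs.down_sampling_layers in the code. The official TFB scripts, however, pass down_sampling_layer (singular), which the model does not read, so it falls back to the MODEL_HYPER_PARAMS default of 3. When debugging, pass down_sampling_layers (plural) explicitly so that the setting takes effect.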
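The silent fallback can be illustrated with a plain dict merge. This is a sketch of the general failure mode only; it assumes that overrides are overlaid on a defaults dict, and does not reproduce TFB's actual loading code. `DEFAULTS`, `merge_params`, and `unknown_keys` are hypothetical names, and only the default value 3 comes from the note above.

```python
# Hypothetical defaults; only down_sampling_layers=3 is taken from the notes.
DEFAULTS = {"down_sampling_layers": 3, "down_sampling_window": 2}


def merge_params(defaults, overrides):
    """Overlay user overrides on the defaults, as a plain dict merge would."""
    return {**defaults, **overrides}


def unknown_keys(defaults, overrides):
    """Keys the user passed that no default declares: likely typos."""
    return sorted(set(overrides) - set(defaults))


singular = merge_params(DEFAULTS, {"down_sampling_layer": 2})
plural = merge_params(DEFAULTS, {"down_sampling_layers": 2})

print(singular["down_sampling_layers"])  # 3: the misspelled key is ignored
print(plural["down_sampling_layers"])    # 2: the override takes effect
print(unknown_keys(DEFAULTS, {"down_sampling_layer": 2}))
```

A check like `unknown_keys` run before training would surface the singular/plural mismatch instead of letting the default win silently.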


§4 Parameter Meaning Quick Reference

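| Parameter | Access path in code | What it determines |
| --- | --- | --- |
| seq_len | configs.seq_len | Original input length; input dimension of the Linear layers in SeasonMixing |
| pred_len | configs.pred_len (set automatically from horizon) | Output dimension of the FMM predict_layers |
| enc_in | configs.enc_in (inferred by TFB from the number of data columns) | Number of variables N |
| d_model | configs.d_model | Embedding output dimension; hidden dimension of the Linear layers in PDM |
| d_ff | configs.d_ff | Intermediate dimension of out_cross_layer in PDM |
| down_sampling_window | configs.down_sampling_window | AvgPool kernel size; downsampling factor per step |
| down_sampling_layers | configs.down_sampling_layers | Number of downsampling steps; number of scales = layers+1 |
| channel_independence | configs.channel_independence | 1 = CI mode, 0 = CD mode |
| moving_avg | configs.moving_avg | Moving-average kernel size of series_decomp |
| use_norm | configs.use_norm | Whether the Normalize layer is active (0 = skip normalization) |
| top_k | configs.top_k | Used by DFT_series_decomp (when decomp_method=dft_decomp) |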

§5 Loop Coverage Verification

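| Loop / branch | How it is covered | Covered by the current parameters? |
| --- | --- | --- |
| e_layers PDM loop | e_layers=2 | ✅ pdm_blocks runs 2 times |
| down_sampling_layers scale loop | down_sampling_layers=2 → 3 scales | ✅ the for loop in forecast runs 3 times |
| CI path (channel_independence==1) | channel_independence=1 | ✅ CI reshape + CI projection |
| CD path (channel_independence==0) | Requires changing it to 0 | ❌ not covered by the default debug run |
| decomp_method="moving_avg" path | Current setting | ✅ |
| decomp_method="dft_decomp" path | Requires changing decomp_method | ❌ not covered by default |
| x_mark_enc is not None branch | ETTh1 has timestamps | ✅ takes the mark path |
| FMM CI path | channel_independence=1 | ✅ |
| SeasonMixing multi-scale loop | down_sampling_layers=2 → loop runs 2 times | ✅ |
| TrendMixing multi-scale loop | Same as above | ✅ |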
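To reach the two uncovered branches (CD mode and dft_decomp), the debug run can be repeated with modified hyperparameters. A minimal sketch that enumerates the four combinations follows; `branch_variants` is a hypothetical helper, and whether every combination runs cleanly in TFB on ETTh1 has not been verified here.

```python
from itertools import product

# The two branch-selecting settings from the default debug configuration.
BASE = {"channel_independence": 1, "decomp_method": "moving_avg"}


def branch_variants(base):
    """Return one override dict per (CI/CD, decomposition method) combination."""
    variants = []
    for ci, decomp in product((1, 0), ("moving_avg", "dft_decomp")):
        variants.append(
            {**base, "channel_independence": ci, "decomp_method": decomp}
        )
    return variants


variants = branch_variants(BASE)
for variant in variants:
    print(variant)
```

Each dict can be merged into the full hyperparameter set and passed through `--model-hyper-params`; the first variant is the default configuration itself.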

§6 Shape Tracing Quick Reference

Using ETTh1 (N=7) as the example, with batch_size=2:

| Step | Shape |
| --- | --- |
| x_enc input | (2, 24, 7) |
| AvgPool → x_list | [(2,24,7), (2,12,7), (2,6,7)] |
| CI reshape (B×N=2×7=14) | [(14,24,1), (14,12,1), (14,6,1)] |
| enc_embedding → enc_out_list | [(14,24,8), (14,12,8), (14,6,8)] |
| PDM output (shape unchanged) | [(14,24,8), (14,12,8), (14,6,8)] |
| season_list / trend_list inside PDM (already permuted) | [(14,8,24), (14,8,12), (14,8,6)] |
| FMM output per scale | [(2,6,7), (2,6,7), (2,6,7)] |
| stack+sum → dec_out | (2, 6, 7) |
| denorm → final output | (2, 6, 7) |

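With the toy document's setting (enc_in=3), B×N=2×3=6: every leading dimension of 14 in the table above becomes 6, and the variable dimension 7 becomes 3.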
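The shapes above follow from simple arithmetic, which can be reproduced without running the model. The sketch below covers the channel-independent path only; `trace_shapes` is a hypothetical helper, and the toy call assumes that everything except enc_in matches the debug configuration.

```python
def trace_shapes(batch, seq_len, n_vars, d_model, pred_len, window, layers):
    """Expected tensor shapes along the CI path, keyed by stage name."""
    lengths = [seq_len // window**i for i in range(layers + 1)]
    bn = batch * n_vars  # CI mode folds the variables into the batch axis
    return {
        "x_list": [(batch, t, n_vars) for t in lengths],
        "ci_reshape": [(bn, t, 1) for t in lengths],
        "enc_out_list": [(bn, t, d_model) for t in lengths],
        "season_list": [(bn, d_model, t) for t in lengths],  # permuted in PDM
        "fmm_out": [(batch, pred_len, n_vars) for _ in lengths],
        "dec_out": (batch, pred_len, n_vars),
    }


etth1 = trace_shapes(batch=2, seq_len=24, n_vars=7, d_model=8,
                     pred_len=6, window=2, layers=2)
# Assumption: the toy setup differs only in the number of variables.
toy = trace_shapes(batch=2, seq_len=24, n_vars=3, d_model=8,
                   pred_len=6, window=2, layers=2)

for stage, shape in etth1.items():
    print(f"{stage:13s} {shape}")
```

Comparing these expected values with the shapes observed at each breakpoint makes a mismatch visible at the first stage where it occurs.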


§7 Key Breakpoints

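| # | File | Location | Purpose |
| --- | --- | --- | --- |
| 1 | adapters_for_transformers.py | First line of `_process` | Confirm the x_enc / x_dec shapes |
| 2 | TimeMixer.py | First line of `forecast` | Enter the main chain |
| 3 | TimeMixer.py | AvgPool line in `__multi_scale_process_inputs` | Verify the shape of x_list at each scale |
| 4 | TimeMixer.py | `forecast`, after the CI reshape | Confirm the B×N dimension |
| 5 | TimeMixer.py | After the `enc_embedding` call | Confirm the Embedding dimension (Ti→d_model) |
| 6 | PastDecomposableMixing.forward | `season_list.append` | Verify the shape after permute: (B,d,T) |
| 7 | MultiScaleSeasonMixing.forward | Loop body, `out_low = out_low + out_low_res` | Verify that both sides of the residual addition have the same shape |
| 8 | MultiScaleTrendMixing.forward | Before and after `out_trend_list.reverse()` | Confirm that the reversed order is restored |
| 9 | PastDecomposableMixing.forward | `out = ori + self.out_cross_layer(out)` | Confirm the CI residual path |
| 10 | TimeMixer.future_multi_mixing | `dec_out.reshape(B, ...)` | Verify that the CI reshape is undone |
| 11 | TimeMixer.forecast | `torch.stack(...).sum(-1)` | Confirm that the 3 scales merge into (2,6,3) for the toy setup or (2,6,7) for ETTh1 |
| 12 | TimeMixer.forecast | `normalize_layers[0](dec_out, "denorm")` | Confirm the final denormalized output |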

§8 Comparison with Autoformer Debug Parameters

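| Aspect | Autoformer | TimeMixer |
| --- | --- | --- |
| adapter | transformer_adapter | transformer_adapter |
| dataset | ETTh1.csv (N=7) | ETTh1.csv (N=7) |
| seq_len | 12 | 24 |
| pred_len | 4 | 6 |
| Key parameters | moving_avg=3, e_layers=2, d_layers=1 | down_sampling_layers=2, e_layers=2 |
| CI/CD | No such concept | channel_independence=1 (CI) |
| Model-specific check | AutoCorrelation FFT lag | Length and direction of the PDM multi-scale lists |
| x_dec usage | ✅ Used by the Decoder | ❌ Not used at all |
| Key shape transition | enc/dec dimension split | B×N CI reshape |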
