FEDformer Debug Parameters

§1 PyCharm Run Configuration

Script path: D:\1sudyta\1ai-self\aistyle\TFB\scripts\run_benchmark.py

Parameters

--config-path rolling_forecast_config.json
--data-name-list ETTh1.csv
--strategy-args {"horizon": 4}
--model-name time_series_library.FEDformer
--model-hyper-params {"batch_size":3,"seq_len":12,"horizon":4,"d_model":16,"d_ff":32,"n_heads":8,"e_layers":2,"d_layers":1,"factor":1,"moving_avg":7,"output_attention":0,"num_epochs":1,"patience":3,"lr":0.001,"loss":"MSE","dropout":0.0,"embed":"timeF"}
--adapter transformer_adapter
--deterministic full
--gpus 0
--num-workers 1
--timeout 60000
--save-path debug/FEDformer

Working directory: D:\1sudyta\1ai-self\aistyle\TFB

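n_heads must be 8

The first dimension of FourierBlock.weights1/2 is hard-coded as the literal 8 rather than taken from the n_heads parameter. If n_heads ≠ 8, the h dimension no longer matches in the compl_mul1d call einsum("bhi,hio->bho"), and the forward pass fails immediately with a shape error. For this model, n_heads is effectively not tunable.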
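The mismatch can be reproduced without loading the model. The sketch below uses NumPy as a stand-in for torch.einsum and assumes the per-mode weight slice has shape (8, d_model // 8, d_model // 8), following the hard-coded literal described above. The helper name fourier_mul_shape is hypothetical.

```python
import numpy as np

HARD_CODED_HEADS = 8  # the literal used for the weight tensor instead of n_heads


def fourier_mul_shape(batch, n_heads, d_model):
    """Return the output shape of einsum("bhi,hio->bho") for one frequency mode."""
    d_keys = d_model // n_heads
    # Query slice for a single mode: (B, H, E)
    x = np.ones((batch, n_heads, d_keys), dtype=np.complex64)
    # Weight slice for a single mode; the first dim is always 8 (assumed layout)
    w = np.ones(
        (HARD_CODED_HEADS, d_model // HARD_CODED_HEADS, d_model // HARD_CODED_HEADS),
        dtype=np.complex64,
    )
    return np.einsum("bhi,hio->bho", x, w).shape


# n_heads=8 matches the hard-coded literal
ok_shape = fourier_mul_shape(batch=3, n_heads=8, d_model=16)

# n_heads=4 gives x of shape (3, 4, 4) against w of shape (8, 2, 2)
try:
    fourier_mul_shape(batch=3, n_heads=4, d_model=16)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(ok_shape, mismatch_raised)
```

In PyTorch the same mismatch would surface as a RuntimeError from torch.einsum rather than a ValueError.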


§2 VSCode launch.json

```json
{
  "name": "FEDformer Debug",
  "type": "debugpy",
  "request": "launch",
  "program": "${workspaceFolder}/scripts/run_benchmark.py",
  "args": [
    "--config-path", "rolling_forecast_config.json",
    "--data-name-list", "ETTh1.csv",
    "--strategy-args", "{\"horizon\": 4}",
    "--model-name", "time_series_library.FEDformer",
    "--model-hyper-params", "{\"batch_size\":3,\"seq_len\":12,\"horizon\":4,\"d_model\":16,\"d_ff\":32,\"n_heads\":8,\"e_layers\":2,\"d_layers\":1,\"factor\":1,\"moving_avg\":7,\"output_attention\":0,\"num_epochs\":1,\"patience\":3,\"lr\":0.001,\"loss\":\"MSE\",\"dropout\":0.0,\"embed\":\"timeF\"}",
    "--adapter", "transformer_adapter",
    "--deterministic", "full",
    "--gpus", "0",
    "--num-workers", "1",
    "--timeout", "60000",
    "--save-path", "debug/FEDformer"
  ],
  "cwd": "${workspaceFolder}",
  "console": "integratedTerminal"
}
```
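Hand-escaping the JSON inside "args" is easy to get wrong. As a convenience, the following sketch builds the same argument list from a Python dict and prints it as a JSON array that can be pasted into the "args" field. It only uses the standard json module; the variable names are illustrative and not part of TFB.

```python
import json

# Hyper-parameters from the configuration above, kept as a plain dict
hyper_params = {
    "batch_size": 3, "seq_len": 12, "horizon": 4,
    "d_model": 16, "d_ff": 32, "n_heads": 8,
    "e_layers": 2, "d_layers": 1, "factor": 1,
    "moving_avg": 7, "output_attention": 0,
    "num_epochs": 1, "patience": 3, "lr": 0.001,
    "loss": "MSE", "dropout": 0.0, "embed": "timeF",
}
strategy_args = {"horizon": hyper_params["horizon"]}  # keep both horizons in sync

args = [
    "--config-path", "rolling_forecast_config.json",
    "--data-name-list", "ETTh1.csv",
    "--strategy-args", json.dumps(strategy_args),
    "--model-name", "time_series_library.FEDformer",
    # Compact separators reproduce the space-free form used above
    "--model-hyper-params", json.dumps(hyper_params, separators=(",", ":")),
    "--adapter", "transformer_adapter",
    "--deterministic", "full",
    "--gpus", "0",
    "--num-workers", "1",
    "--timeout", "60000",
    "--save-path", "debug/FEDformer",
]

# json.dumps escapes the inner quotes, so the output is valid inside launch.json
launch_args_json = json.dumps(args, indent=2)
print(launch_args_json)
```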

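§3 Rationale behind the parameter choices

| Parameter | Value | Reason |
| --- | --- | --- |
| data-name-list | ETTh1.csv | N=5 variables; the toy parameter enc_in=5 is inferred automatically |
| batch_size | 3 | Toy parameter B=3; numerically distinct from other dimensions such as d_keys=2 and modes=4 |
| seq_len | 12 | Differs from d_model, d_keys, and pred_len |
| horizon | 4 | = pred_len; label_len is set automatically to seq_len//2=6 |
| d_model | 16 | Small enough; with n_heads=8, d_keys=E=2 |
| d_ff | 32 | = d_model × 2 |
| n_heads | 8 (required) | The first dimension of FourierBlock.weights1 is hard-coded to 8; changing it triggers an einsum dimension error |
| moving_avg | 7 | Toy parameter; odd, so that the (moving_avg-1)//2 padding on each side of the AvgPool1d input keeps the sequence length unchanged; must be < seq_len=12 |
| e_layers | 2 | Exercises the Encoder loop over two FourierBlock layers |
| d_layers | 1 | A single DecoderLayer, to keep things simple |
| output_attention | 0 | FourierBlock and FourierCrossAttention both return (out, None); the flag has no effect but keeps the interface compatible |
| dropout | 0.0 | Disables dropout randomness |
| num_epochs | 1 | One epoch is enough to step through the forward path |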
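The odd-kernel requirement for moving_avg follows from the padding arithmetic. The sketch below assumes, as described in this note, that series_decomp pads (moving_avg-1)//2 steps at the front and at the end and then applies a stride-1 average pool; the function name is hypothetical.

```python
def decomp_output_len(seq_len, kernel_size):
    """Length after padding (k-1)//2 on each side and a stride-1 average pool."""
    pad = (kernel_size - 1) // 2
    padded_len = seq_len + 2 * pad
    return padded_len - kernel_size + 1


# Odd kernels preserve the length; an even kernel loses one step
lengths = {k: decomp_output_len(12, k) for k in (3, 6, 7)}
print(lengths)
```

With an even kernel the trend would be one step shorter than the input, so subtracting it from x to obtain the seasonal part would fail on shape.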

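Three non-configurable parameters

The following three parameters are Python default arguments of FEDformer.__init__. _init_model() never passes them, so they cannot be changed from the command line.

| Parameter | Hard-coded default | Effective value (seq_len=12) |
| --- | --- | --- |
| version | "fourier" | FourierBlock path (not the Wavelets path) |
| mode_select | "random" | Randomly selects M frequencies (not lowest-frequency-first) |
| modes | 32 | min(32, seq_len//2) = min(32, 6) = 6 (the toy document uses modes=4 as a further simplification) |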
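To see what the clamp does to the default modes=32, here is a small re-implementation of the selection logic. It is a sketch, not the library's code: it assumes the candidate indices are range(seq_len // 2) and that the result is sorted, and it uses the standard random module with an added seed argument for reproducibility.

```python
import random


def get_frequency_modes(seq_len, modes=32, mode_select="random", seed=0):
    """Clamp modes to seq_len // 2, then pick that many frequency indices."""
    modes = min(modes, seq_len // 2)
    if mode_select == "random":
        index = list(range(seq_len // 2))
        random.Random(seed).shuffle(index)  # seed is an addition for this sketch
        index = index[:modes]
    else:
        index = list(range(modes))
    index.sort()
    return index


enc_index = get_frequency_modes(12)  # input length 12
dec_index = get_frequency_modes(10)  # input length 10 (label_len + pred_len)
print(enc_index, dec_index)
```

Under these assumptions, whenever the default 32 exceeds seq_len // 2 the "random" strategy ends up selecting every candidate index, so the toy configuration shows no run-to-run variation in the chosen modes.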

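§4 Parameter quick reference

| Parameter | Where it comes from in code | What it determines |
| --- | --- | --- |
| seq_len | configs.seq_len | Encoder input length; upper bound for FourierBlock index sampling = seq_len//2=6 |
| pred_len | configs.pred_len (set automatically from horizon) | Number of times the mean is repeated in forecast(); zero-padding length of seasonal_init |
| label_len | Inferred automatically = seq_len//2=6 | History segment of dec_input; slice start for trend_init/seasonal_init |
| enc_in | Inferred automatically from the number of data columns | Embedding input channels; c_out output dimension |
| d_model | configs.d_model | DataEmbedding_wo_pos output dimension; AutoCorrelationLayer linear projections |
| n_heads | configs.n_heads | Number of heads in AutoCorrelationLayer; num_heads of FourierCrossAttention |
| moving_avg | configs.moving_avg | AvgPool1d kernel_size in series_decomp; padding = (moving_avg-1)//2=3 |
| modes | Python default 32 (not configurable) | Sampling upper bound in get_frequency_modes; effective value = min(32, seq_len//2) |
| version | Python default "fourier" (not configurable) | Chooses between the FourierBlock and Wavelets paths |
| mode_select | Python default "random" (not configurable) | Frequency selection strategy; "random" adds a regularization effect |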
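The derived quantities in this table can be collected in one place. The dataclass below is a hypothetical helper (DebugParams is not part of TFB); it only encodes the formulas stated in this note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebugParams:
    """Toy debug configuration plus the values derived from it."""
    seq_len: int = 12
    horizon: int = 4
    d_model: int = 16
    n_heads: int = 8
    moving_avg: int = 7
    modes: int = 32  # Python default, not configurable from the CLI

    @property
    def pred_len(self):
        return self.horizon

    @property
    def label_len(self):
        return self.seq_len // 2

    @property
    def dec_len(self):
        # Decoder input length: history segment plus forecast segment
        return self.label_len + self.pred_len

    @property
    def d_keys(self):
        return self.d_model // self.n_heads

    @property
    def decomp_pad(self):
        return (self.moving_avg - 1) // 2

    @property
    def enc_modes(self):
        return min(self.modes, self.seq_len // 2)


p = DebugParams()
print(p.label_len, p.dec_len, p.d_keys, p.decomp_pad, p.enc_modes)
```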

§5 Loop coverage check

| Loop / branch | How it is covered | Covered by the current parameters? |
| --- | --- | --- |
| e_layers=2 Encoder loop | e_layers=2 | ✅ EncoderLayer runs twice (each pass includes FourierBlock + 2×decomp) |
| d_layers=1 Decoder loop | d_layers=1 | ✅ DecoderLayer runs once (FourierBlock + FourierCrossAttention + 3×decomp) |
| FourierBlock, Encoder self-attn | Built into EncoderLayer | ✅ |
| FourierBlock, Decoder self-attn | Built into DecoderLayer | ✅ |
| FourierCrossAttention, Decoder cross-attn | Built into DecoderLayer | ✅ |
| series_decomp padding (front + end) | moving_avg=7 → pad=3 | ✅ |
| trend_init mean fill (pred_len segment) | pred_len=4 > 0 | ✅ |
| seasonal_init zero padding (0,0,0,4) | pred_len=4 | ✅ |
| FourierBlock: for wi, i in enumerate(self.index) | modes=6 (effective value) | ✅ |
| FourierCrossAttention mode sampling (index_q and index_kv sampled separately) | seq_len_q=10, seq_len_kv=12 | ✅ Two different indices |
| forward() → forecast() path | task_name="short_term_forecast" | ✅ |

§6 Shape trace quick reference

ETTh1 (N=5), batch_size=3, seq_len=12, pred_len=4, label_len=6:

| Step | Shape |
| --- | --- |
| x_enc input | (3, 12, 5) |
| series_decomp(x_enc) seasonal | (3, 12, 5) |
| series_decomp(x_enc) trend | (3, 12, 5) |
| x_enc.mean(dim=1) | (3, 5) |
| mean.unsqueeze(1).repeat(1,4,1) | (3, 4, 5) |
| trend_init = cat[trend[:, -6:, :], mean] | (3, 10, 5) |
| seasonal_init = F.pad(seasonal[:, -6:, :], (0,0,0,4)) | (3, 10, 5) |
| enc_embedding output | (3, 12, 16) |
| Encoder output enc_out | (3, 12, 16) |
| dec_embedding output | (3, 10, 16) |
| Decoder seasonal_part | (3, 10, 5) |
| Decoder trend_part | (3, 10, 5) |
| dec_out = trend_part + seasonal_part | (3, 10, 5) |
| [:, -4:, :] slice | (3, 4, 5) |
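The first rows of this trace (decomposition and decoder initialisation) can be checked offline. The sketch below uses NumPy in place of torch and a hand-written moving average with edge replication as a stand-in for series_decomp; it reproduces the shapes only, not the model's numerics.

```python
import numpy as np

B, SEQ_LEN, N = 3, 12, 5
PRED_LEN, LABEL_LEN, KERNEL = 4, 6, 7


def series_decomp(x, kernel_size):
    """Moving-average decomposition with edge replication; returns (seasonal, trend)."""
    pad = (kernel_size - 1) // 2
    front = np.repeat(x[:, :1, :], pad, axis=1)
    end = np.repeat(x[:, -1:, :], pad, axis=1)
    padded = np.concatenate([front, x, end], axis=1)  # (B, L + 2*pad, N)
    length = x.shape[1]
    trend = np.stack(
        [padded[:, t:t + kernel_size, :].mean(axis=1) for t in range(length)],
        axis=1,
    )
    return x - trend, trend


x_enc = np.random.default_rng(0).normal(size=(B, SEQ_LEN, N))
seasonal, trend = series_decomp(x_enc, KERNEL)

# Mean over time, repeated pred_len times: (3, 5) -> (3, 4, 5)
mean = np.repeat(x_enc.mean(axis=1)[:, None, :], PRED_LEN, axis=1)
# Last label_len trend steps followed by the mean fill: (3, 10, 5)
trend_init = np.concatenate([trend[:, -LABEL_LEN:, :], mean], axis=1)
# Last label_len seasonal steps followed by pred_len zeros: (3, 10, 5)
seasonal_init = np.pad(
    seasonal[:, -LABEL_LEN:, :], ((0, 0), (0, PRED_LEN), (0, 0))
)

print(trend.shape, mean.shape, trend_init.shape, seasonal_init.shape)
```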

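Inside FourierBlock (Encoder self-attn, seq_len=12):

| Step | Shape |
| --- | --- |
| q input to FourierBlock | (3, 12, 8, 2) |
| q.permute(0,2,3,1) | (3, 8, 2, 12) |
| rfft(x, dim=-1) | (3, 8, 2, 7) (complex) |
| out_ft (initialised to zeros) | (3, 8, 2, 7) (complex) |
| After compl_mul1d fills M=6 frequencies | (3, 8, 2, 7) (complex, remaining entries zero) |
| irfft(out_ft, n=12) → return value | (3, 8, 2, 12) |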
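The same steps can be replayed with NumPy to confirm the shapes. This is a sketch under two assumptions: the weights have layout (8, E, E, M) with random complex values, and the selected indices are 0..5, which is what min(32, 12//2)=6 modes out of six candidates implies.

```python
import numpy as np

B, L, H, E = 3, 12, 8, 2
index = list(range(L // 2))  # 6 modes: indices 0..5

rng = np.random.default_rng(0)
q = rng.normal(size=(B, L, H, E))
# Assumed weight layout (8, E_in, E_out, M); values are random placeholders
w_shape = (8, E, E, len(index))
weights = rng.normal(size=w_shape) + 1j * rng.normal(size=w_shape)

x = q.transpose(0, 2, 3, 1)                       # (3, 8, 2, 12)
x_ft = np.fft.rfft(x, axis=-1)                    # (3, 8, 2, 7), complex
out_ft = np.zeros((B, H, E, L // 2 + 1), dtype=complex)

# Mode i of the input is written to slot wi of the output
for wi, i in enumerate(index):
    out_ft[..., wi] = np.einsum("bhi,hio->bho", x_ft[..., i], weights[..., wi])

out = np.fft.irfft(out_ft, n=L, axis=-1)          # (3, 8, 2, 12)
print(x_ft.shape, out_ft.shape, out.shape)
```

Only six of the seven rfft bins are filled; the last bin of out_ft stays zero, which matches the "remaining entries zero" remark in the table.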

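Inside FourierCrossAttention (Decoder cross-attn):

| Step | Shape |
| --- | --- |
| q input (from the decoder) | (3, 10, 8, 2) |
| k input (from the encoder) | (3, 12, 8, 2) |
| xq.permute(0,2,3,1) | (3, 8, 2, 10) |
| xk.permute(0,2,3,1) | (3, 8, 2, 12) |
| rfft(xq) | (3, 8, 2, 6) (complex) |
| rfft(xk) | (3, 8, 2, 7) (complex) |
| xq_ft_ (M_q sampled modes) | (3, 8, 2, 5) (allocated size; M_q = min(32, 10//2) = 5) |
| xk_ft_ (M_kv sampled modes) | (3, 8, 2, 6) (allocated size; M_kv = min(32, 12//2) = 6) |
| xqk_ft = compl_mul1d("bhex,bhey->bhxy") | (3, 8, 5, 6) |
| tanh(real) + j·tanh(imag) | (3, 8, 5, 6) |
| xqkv_ft = compl_mul1d("bhxy,bhey->bhex") | (3, 8, 2, 5) |
| xqkvw = compl_mul1d("bhex,heox->bhox") | (3, 8, 2, 5) |
| out_ft (after scatter) | (3, 8, 2, 6) (complex; 10//2+1 = 6 bins) |
| irfft(out_ft/256, n=10) | (3, 8, 2, 10) |

A note on the effective modes value

The toy document uses modes=4 to keep the trace short. In an actual debug run the limit min(32, seq_len//2) is evaluated separately for each input length: M_kv = min(32, 12//2) = 6 on the encoder side and M_q = min(32, 10//2) = 5 on the decoder side, so xqk_ft is (3,8,5,6). To reproduce the toy document's (3,8,4,4) exactly, temporarily patch get_frequency_modes in the source to return index[:4].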
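A NumPy replay of the cross-attention steps makes the two mode counts visible. It applies the rule min(32, seq_len//2) separately to seq_len_q=10 and seq_len_kv=12 and assumes a weight layout of (8, E, E, M_q) with random placeholder values; v is deliberately absent because the steps listed above never read it.

```python
import numpy as np

B, H, E = 3, 8, 2
L_Q, L_KV = 10, 12
index_q = list(range(min(32, L_Q // 2)))    # 5 modes
index_kv = list(range(min(32, L_KV // 2)))  # 6 modes

rng = np.random.default_rng(0)
q = rng.normal(size=(B, L_Q, H, E))
k = rng.normal(size=(B, L_KV, H, E))
w_shape = (8, E, E, len(index_q))           # assumed weight layout
w = rng.normal(size=w_shape) + 1j * rng.normal(size=w_shape)

xq_ft = np.fft.rfft(q.transpose(0, 2, 3, 1), axis=-1)   # (3, 8, 2, 6)
xk_ft = np.fft.rfft(k.transpose(0, 2, 3, 1), axis=-1)   # (3, 8, 2, 7)
xq_ft_ = xq_ft[..., index_q]                            # (3, 8, 2, 5)
xk_ft_ = xk_ft[..., index_kv]                           # (3, 8, 2, 6)

# Frequency-domain "attention": Q x K, tanh on real and imaginary parts
xqk_ft = np.einsum("bhex,bhey->bhxy", xq_ft_, xk_ft_)   # (3, 8, 5, 6)
xqk_ft = np.tanh(xqk_ft.real) + 1j * np.tanh(xqk_ft.imag)
xqkv_ft = np.einsum("bhxy,bhey->bhex", xqk_ft, xk_ft_)  # (3, 8, 2, 5)
xqkvw = np.einsum("bhex,heox->bhox", xqkv_ft, w)        # (3, 8, 2, 5)

# Scatter the M_q modes back into the full spectrum, then invert
out_ft = np.zeros((B, H, E, L_Q // 2 + 1), dtype=complex)
out_ft[..., index_q] = xqkvw
out = np.fft.irfft(out_ft / (16 * 16), n=L_Q, axis=-1)  # (3, 8, 2, 10)
print(xqk_ft.shape, xqkvw.shape, out.shape)
```

The divisor 16 * 16 corresponds to the 256 in the table, assuming it is in_channels × out_channels with d_model=16.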


§7 Key breakpoints

| # | File | Location | Purpose |
| --- | --- | --- | --- |
| 1 | adapters_for_transformers.py | First line of _process | Confirm x_enc/x_dec shapes (3,12,5)/(3,10,5) |
| 2 | FEDformer.py | First line of forecast | Enter the main call chain |
| 3 | FEDformer.py | seasonal_init, trend_init = self.decomp(x_enc) | Verify the two decomp outputs, (3,12,5) each |
| 4 | FEDformer.py | trend_init = torch.cat([...] | Verify trend_init (3,10,5) |
| 5 | FEDformer.py | seasonal_init = F.pad(...) | Verify seasonal_init (3,10,5) |
| 6 | FEDformer.py | enc_out, attns = self.encoder(...) | Verify enc_out (3,12,16) |
| 7 | FourierCorrelation.py | First line of FourierBlock.forward | Verify q shape (3,12,8,2) |
| 8 | FourierCorrelation.py | x_ft = torch.fft.rfft(x, dim=-1) | Verify x_ft shape (3,8,2,7) and that it is a complex tensor |
| 9 | FourierCorrelation.py | for wi, i in enumerate(self.index) | Inspect the length of self.index (the effective number of modes) |
| 10 | FourierCorrelation.py | First line of FourierCrossAttention.forward | Verify q (3,10,8,2) and k (3,12,8,2) |
| 11 | FourierCorrelation.py | xqk_ft = self.compl_mul1d(...) | Verify xqk_ft shape (3,8,M_q,M_kv) |
| 12 | FourierCorrelation.py | xqkv_ft = self.compl_mul1d("bhxy,bhey->bhex",...) | Verify that xv is never referenced (inspect the xv variable at this line) |
| 13 | FourierCorrelation.py | out = torch.fft.irfft(out_ft / ...) | Verify out shape (3,8,2,10) |
| 14 | Autoformer_EncDec.py | Decoder.forward: trend = trend + residual_trend | Verify the trend accumulation (shape stays (3,10,5)) |
| 15 | FEDformer.py | dec_out = trend_part + seasonal_part | Verify the output (3,10,5) |
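When stopped at one of these breakpoints, a tiny helper saves retyping shape comparisons in the debugger's evaluate window. expect_shape is a hypothetical convenience function, not part of TFB; it works with anything exposing a .shape attribute, and the _FakeTensor class exists only so the example runs without torch.

```python
def expect_shape(name, obj, expected):
    """Return True if obj.shape equals expected, else raise with a readable message."""
    actual = tuple(obj.shape)
    if actual != tuple(expected):
        raise AssertionError(f"{name}: expected {tuple(expected)}, got {actual}")
    return True


class _FakeTensor:
    """Stand-in with a .shape attribute so the sketch runs without torch."""

    def __init__(self, *shape):
        self.shape = shape


# Breakpoint 1: encoder and decoder inputs
x_enc = _FakeTensor(3, 12, 5)
x_dec = _FakeTensor(3, 10, 5)
print(expect_shape("x_enc", x_enc, (3, 12, 5)))
print(expect_shape("x_dec", x_dec, (3, 10, 5)))
```

In a real session the call would be, for example, expect_shape("enc_out", enc_out, (3, 12, 16)) at breakpoint 6.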

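§8 Comparison with the Autoformer debug parameters

| Aspect | Autoformer | FEDformer |
| --- | --- | --- |
| adapter | transformer_adapter | transformer_adapter |
| dataset | ETTh1.csv (N=5) | ETTh1.csv (N=5) |
| seq_len | 12 | 12 |
| pred_len | 4 | 4 |
| batch_size | 2 | 3 (toy B=3) |
| d_model | 8 | 16 |
| n_heads | 4 (tunable) | 8 (required) |
| moving_avg | 3 | 7 |
| Key non-configurable parameters | - | version / mode_select / modes |
| Attention mechanism | AutoCorrelation (time-domain cross-correlation) | FourierBlock (frequency-domain linear transform) |
| cross-attn | AutoCorrelation | FourierCrossAttention (Q×K attention in the frequency domain) |
| Is v used? | ✅ Both K and V are used | ❌ v is ignored entirely in cross-attn |
| Shape turning points | AutoCorrelation irfft (2,4,2,12) | FourierBlock irfft (3,8,2,12); FCA scatter→irfft (3,8,2,10) |
| n_heads hard-coding | - | FourierBlock weights dimension fixed at 8 |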
