Layer 1: encoder main chain

The parent layer's (Layer 0) forward() calls encoder(x_enc) by way of forecast().
This document covers the full computation sequence of encoder().

1. Position in the parent layer

DLinear.forward(x_enc, ...)
  └─ forecast(x_enc)
       └─ encoder(x_enc)   ← this document

forecast() is a single line, return self.encoder(x_enc): a transparent passthrough with no computation of its own.

2. I/O interface definition

python

def encoder(self, x) -> Tensor: ...

	shape (toy)	Meaning
Input `x`	`(2, 6, 3)` = `(B, seq_len, enc_in)`	Historical series; each row is one time step
Output	`(2, 2, 3)` = `(B, pred_len, enc_in)`	Forecast; each row is one prediction step

3. Sequence diagram (concrete level)

4. Semantic grouping diagram (index level)

5. Close reading

5.0 Full original code

python

# DLinear.py:64-87
def encoder(self, x):
    seasonal_init, trend_init = self.decompsition(x)
    seasonal_init, trend_init = seasonal_init.permute(0, 2, 1), trend_init.permute(
        0, 2, 1
    )
    if self.individual:
        seasonal_output = torch.zeros(
            [seasonal_init.size(0), seasonal_init.size(1), self.pred_len],
            dtype=seasonal_init.dtype,
        ).to(seasonal_init.device)
        trend_output = torch.zeros(
            [trend_init.size(0), trend_init.size(1), self.pred_len],
            dtype=trend_init.dtype,
        ).to(trend_init.device)
        for i in range(self.channels):
            seasonal_output[:, i, :] = self.Linear_Seasonal[i](
                seasonal_init[:, i, :]
            )
            trend_output[:, i, :] = self.Linear_Trend[i](trend_init[:, i, :])
    else:
        seasonal_output = self.Linear_Seasonal(seasonal_init)
        trend_output = self.Linear_Trend(trend_init)
    x = seasonal_output + trend_output
    return x.permute(0, 2, 1)

5.1 series_decomp (drill down to Layer 2)

python

seasonal_init, trend_init = self.decompsition(x)
# seasonal_init: (B, seq_len, enc_in) = (2, 6, 3)
# trend_init:    (B, seq_len, enc_in) = (2, 6, 3)

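decompsition (sic: the attribute name is misspelled in the source; the intended word is decomposition) applies a moving-average decomposition to x:

trend_init = moving_avg(x): the smoothed trend
seasonal_init = x - trend_init: the residual, treated as the seasonal component

I/O: input (2,6,3) → two tensors of shape (2,6,3). See 03-Layer2-series_decomp for details.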
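
The decomposition step can be sketched as follows. This is a minimal sketch, not the repository's series_decomp: the helper name `decompose`, the kernel size of 3, and the padding that repeats the first and last time step are all assumptions made for illustration (the actual settings belong to 03-Layer2-series_decomp). Under these assumptions, a linear ramp 0..5 yields a seasonal part that is zero everywhere except at the two edges.

```python
import torch
import torch.nn.functional as F


def decompose(x: torch.Tensor, kernel_size: int = 3):
    """Split x of shape (B, seq_len, C) into (seasonal, trend).

    Hypothetical helper: kernel_size=3 and edge-replicating padding are
    assumptions for this sketch, not values taken from the source.
    """
    pad = (kernel_size - 1) // 2
    front = x[:, :1, :].repeat(1, pad, 1)        # repeat the first time step
    end = x[:, -1:, :].repeat(1, pad, 1)         # repeat the last time step
    padded = torch.cat([front, x, end], dim=1)   # (B, seq_len + 2*pad, C)
    # avg_pool1d pools over the last axis, so move time there and back.
    trend = F.avg_pool1d(padded.permute(0, 2, 1), kernel_size, stride=1).permute(0, 2, 1)
    seasonal = x - trend
    return seasonal, trend


x = torch.arange(6.0).view(1, 6, 1).repeat(2, 1, 3)   # (2, 6, 3): a linear ramp 0..5
seasonal_init, trend_init = decompose(x)
print(seasonal_init.shape, trend_init.shape)          # both torch.Size([2, 6, 3])
print(seasonal_init[0, :, 0])                         # nonzero only at the two edges
```

By construction, seasonal_init + trend_init reproduces x exactly, which is why the two branches can later be summed back into a single forecast.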

5.2 permute: why swap the axes?

python

seasonal_init, trend_init = seasonal_init.permute(0, 2, 1), trend_init.permute(0, 2, 1)
# (B, seq_len, enc_in) → (B, enc_in, seq_len)
# toy: (2, 6, 3)       → (2, 3, 6)

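Principle → code mapping:

Paper/principle description	Code implementation	Key reason
"Map seq_len→pred_len linearly, independently for each variable"	`permute(0,2,1)` → `Linear` → `permute(0,2,1)`	`nn.Linear` acts only on the last dimension. With the time axis T in the middle, Linear would be applied along the C axis, which is wrong (and with C=3 ≠ in_features=6 it fails with a shape error); after permute, T is last, so Linear maps T=6 to pred_len=2 as intended

toy values (batch 0, seasonal component):

Before permute: seasonal_init[0] shape=(6,3)
  t=0: [s00, s01, s02]
  t=1: [s10, s11, s12]
  ...

After permute: seasonal_init[0] shape=(3,6)
  var0: [s00, s10, s20, s30, s40, s50]  ← the 6 historical steps of variable 0
  var1: [s01, s11, s21, s31, s41, s51]
  var2: [s02, s12, s22, s32, s42, s52]

permute moves the time axis to the last dimension: each row now holds one variable's full history, so Linear(6→2) acts along time.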
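
The effect of the axis swap can be checked with a short sketch. It uses the toy sizes from this document and random input; the variable names are illustrative only.

```python
import torch
import torch.nn as nn

B, seq_len, enc_in, pred_len = 2, 6, 3, 2
x = torch.randn(B, seq_len, enc_in)        # (2, 6, 3): time axis in the middle
linear = nn.Linear(seq_len, pred_len)      # expects the last dim to be seq_len = 6

# Without permute the last dim is enc_in = 3, which does not match in_features = 6.
try:
    linear(x)
    failed_without_permute = False
except RuntimeError:
    failed_without_permute = True

x_t = x.permute(0, 2, 1)                   # (2, 3, 6): time axis last
y = linear(x_t)                            # (2, 3, 2)
out = y.permute(0, 2, 1)                   # (2, 2, 3) = (B, pred_len, enc_in)

print(failed_without_permute, x_t.shape, y.shape, out.shape)
```

After the permute, x_t[0, 1] is exactly the history of variable 1 in sample 0, that is x[0, :, 1].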

5.3 Linear_Seasonal / Linear_Trend: comparing the two paths

individual=False path (hard-coded in TFB; this is the path actually taken)

python

# DLinear.py:83-85
else:
    seasonal_output = self.Linear_Seasonal(seasonal_init)
    trend_output = self.Linear_Trend(trend_init)

Annotated version:

python

seasonal_output = self.Linear_Seasonal(seasonal_init)
# self.Linear_Seasonal: nn.Linear(in_features=seq_len=6, out_features=pred_len=2, bias=True)
# input:  (B, enc_in, seq_len) = (2, 3, 6)
# output: (B, enc_in, pred_len) = (2, 3, 2)
# nn.Linear acts only on the last dimension; the leading (2, 3) dims are treated as batch
# → 2×3=6 row vectors, each multiplied independently by the same W

trend_output = self.Linear_Trend(trend_init)
# fully symmetric, input: (2,3,6) → output: (2,3,2)

nn.Linear formula:

y = x @ W.T + b
W.shape = (pred_len=2, seq_len=6)
(2, 3, 6) @ (6, 2) = (2, 3, 2)
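
The formula can be verified against nn.Linear directly. The sketch below uses a randomly initialised layer with the toy sizes; it checks only the arithmetic and says nothing about how DLinear initialises its weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(6, 2, bias=True)            # weight: (2, 6), bias: (2,)
x = torch.randn(2, 3, 6)                       # (B, enc_in, seq_len)

y_module = linear(x)
y_manual = x @ linear.weight.T + linear.bias   # (2, 3, 6) @ (6, 2) + (2,) -> (2, 3, 2)

# Same result when the leading dims are first flattened into 6 row vectors.
y_flat = (x.reshape(-1, 6) @ linear.weight.T + linear.bias).reshape(2, 3, 2)

print(linear.weight.shape, y_module.shape)
print(torch.allclose(y_module, y_manual, atol=1e-5))
print(torch.allclose(y_module, y_flat, atol=1e-5))
```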

individual=True path (not taken in TFB, but present in the code)

python

# DLinear.py:69-82
if self.individual:
    seasonal_output = torch.zeros(
        [seasonal_init.size(0), seasonal_init.size(1), self.pred_len],
        dtype=seasonal_init.dtype,
    ).to(seasonal_init.device)
    trend_output = torch.zeros(
        [trend_init.size(0), trend_init.size(1), self.pred_len],
        dtype=trend_init.dtype,
    ).to(trend_init.device)
    for i in range(self.channels):
        seasonal_output[:, i, :] = self.Linear_Seasonal[i](
            seasonal_init[:, i, :]
        )
        trend_output[:, i, :] = self.Linear_Trend[i](trend_init[:, i, :])

Annotated version:

python

# self.Linear_Seasonal: nn.ModuleList of length channels = 3
# Linear_Seasonal[0] / [1] / [2] are independent and share no parameters

seasonal_output = torch.zeros(
    [seasonal_init.size(0), seasonal_init.size(1), self.pred_len],  # [B=2, enc_in=3, pred_len=2]
    dtype=seasonal_init.dtype,
).to(seasonal_init.device)

for i in range(self.channels):   # i = 0, 1, 2
    seasonal_output[:, i, :] = self.Linear_Seasonal[i](seasonal_init[:, i, :])
    # seasonal_init[:, i, :]: variable i across the whole batch → (B, seq_len) = (2, 6)
    # Linear_Seasonal[i]: Linear(6, 2), owned by variable i alone, y = x @ Wᵢ.T + bᵢ
    # output (2, 2) is written back into seasonal_output[:, i, :]
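
A standalone version of this loop, runnable outside the model class, might look like the sketch below. The names `linear_seasonal` and `stacked` are illustrative, and the input is random; only the per-variable loop itself mirrors the source.

```python
import torch
import torch.nn as nn

B, channels, seq_len, pred_len = 2, 3, 6, 2
seasonal_init = torch.randn(B, channels, seq_len)      # already permuted: (2, 3, 6)

# One independent Linear(6, 2) per variable.
linear_seasonal = nn.ModuleList([nn.Linear(seq_len, pred_len) for _ in range(channels)])

seasonal_output = torch.zeros(
    B, channels, pred_len, dtype=seasonal_init.dtype, device=seasonal_init.device
)
for i in range(channels):
    # Variable i's history (B, seq_len) goes through its own layer only.
    seasonal_output[:, i, :] = linear_seasonal[i](seasonal_init[:, i, :])

# Equivalent formulation without a preallocated buffer: stack along the variable axis.
stacked = torch.stack(
    [linear_seasonal[i](seasonal_init[:, i, :]) for i in range(channels)], dim=1
)

print(seasonal_output.shape)                           # torch.Size([2, 3, 2])
print(torch.allclose(seasonal_output, stacked))
```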

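What the two paths mean

individual=False (shared W):

var0 history ──┐
var1 history ──┼── same W (6→2) ──→ each variable's prediction
var2 history ──┘

During training, the losses of var0, var1, and var2 all update the same W. W learns "for any variable, which historical pattern predicts which future": an average rule across variables. This suits datasets whose variables share similar temporal patterns (for example, different meteorological indicators from the same weather station).

individual=True (independent W):

var0 history ── W₀ (6→2) ── var0 prediction
var1 history ── W₁ (6→2) ── var1 prediction
var2 history ── W₂ (6→2) ── var2 prediction

The loss of var0 updates only W₀ and does not affect W₁ or W₂. Each Wᵢ learns the temporal pattern specific to its variable. This suits cases where variables behave very differently (for example, a mix of stock prices, sales, and network traffic), but the parameter count is channels times that of the shared case, so more training data is needed.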
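
The difference in gradient routing can be observed with a small experiment. This is a sketch with stand-in layers and a deterministic toy input, not the model's training loop; the helper `grad_is_zero` is hypothetical.

```python
import torch
import torch.nn as nn

channels, seq_len, pred_len = 3, 6, 2
x = torch.arange(36.0).view(2, channels, seq_len)   # (B, enc_in, seq_len), deterministic

# individual=False: one shared Linear. A loss on var1 alone still updates the shared W.
shared = nn.Linear(seq_len, pred_len)
shared(x)[:, 1, :].sum().backward()
shared_grad_nonzero = bool(shared.weight.grad.abs().sum() > 0)

# individual=True: one Linear per variable. A loss on var0 alone reaches only W_0.
per_var = nn.ModuleList([nn.Linear(seq_len, pred_len) for _ in range(channels)])
out = torch.zeros(2, channels, pred_len)
for i in range(channels):
    out[:, i, :] = per_var[i](x[:, i, :])
out[:, 0, :].sum().backward()


def grad_is_zero(layer: nn.Linear) -> bool:
    # Autograd may leave the gradient unset (None) or fill it with zeros.
    g = layer.weight.grad
    return g is None or bool((g == 0).all())


print(shared_grad_nonzero)                          # True
print([grad_is_zero(layer) for layer in per_var])   # [False, True, True]
```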

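Key differences at a glance:

	individual=False (TFB actual)	individual=True
Number of W matrices	1 shared (Seasonal) + 1 shared (Trend)	channels independent (Seasonal) + channels independent (Trend)
Updates during training	Losses from var0/1/2 all update the same W	The loss from varᵢ updates only Wᵢ
What W learns	An average temporal pattern across variables	A pattern specific to each variable
Weight count (toy, excluding bias)	`2 × (6×2) = 24` (28 with biases)	`2 × 3 × (6×2) = 72` (84 with biases)
Use case	Variables follow similar patterns	Variables follow very different patterns
Used by TFB	✓ hard-coded in the adapter	✗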
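
The toy parameter counts can be reproduced by counting tensor elements. In this sketch the nn.ModuleDict containers and the helper `n_params` are assumptions used only to group the layers; the source stores them as attributes of the model. Each nn.Linear(6, 2) with bias=True holds 6×2 weights plus 2 bias terms.

```python
import torch.nn as nn

seq_len, pred_len, channels = 6, 2, 3


def n_params(module: nn.Module, weights_only: bool = False) -> int:
    """Count parameter elements, optionally skipping bias vectors."""
    return sum(
        p.numel()
        for name, p in module.named_parameters()
        if not (weights_only and name.endswith("bias"))
    )


shared = nn.ModuleDict({
    "seasonal": nn.Linear(seq_len, pred_len),
    "trend": nn.Linear(seq_len, pred_len),
})
individual = nn.ModuleDict({
    "seasonal": nn.ModuleList([nn.Linear(seq_len, pred_len) for _ in range(channels)]),
    "trend": nn.ModuleList([nn.Linear(seq_len, pred_len) for _ in range(channels)]),
})

print(n_params(shared, weights_only=True), n_params(shared))           # 24 28
print(n_params(individual, weights_only=True), n_params(individual))   # 72 84
```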

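Paper/principle description	Code implementation	Key reason
"Apply the same linear projection to all variables"	`individual=False`: one `nn.Linear(seq_len, pred_len)` per branch	`nn.Linear` acts on the last dimension; the B×C row vectors in `(B,C,T)` share the same W
"Learn a separate projection for each variable"	`individual=True`: `nn.ModuleList` of length channels	Each `Linear_Seasonal[i]` receives only `seasonal_init[:,i,:]` = `(B,seq_len)`
Broadcast matrix multiply `(2,3,6)→(2,3,2)`	`y = x @ W.T + b`, W.shape=(pred_len=2, seq_len=6)	Linear treats the first two dimensions as batch, multiplies each of the 6 row vectors by W.T (6×2), and arranges the result back into (2,3,2)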

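Toy numeric trace

individual=False (initial weights W all equal to 1/seq_len = 1/6 ≈ 0.167; bias taken as 0):

seasonal_init[0] shape = (3, 6)  ← enc_in=3 variables, each with seq_len=6 historical steps
  var0: [-0.33,  0.0,  0.0,  0.0,  0.0,  0.33]  ← pure linear-trend series, seasonal component ≈ 0 away from the edges
  var1: [-0.33,  0.0,  0.0,  0.0,  0.0,  0.33]
  var2: [-0.33,  0.0,  0.0,  0.0,  0.0,  0.33]

W = [ [0.167, 0.167, 0.167, 0.167, 0.167, 0.167],  ← weight row for pred_t=0
     [0.167, 0.167, 0.167, 0.167, 0.167, 0.167] ]  ← weight row for pred_t=1

var0: W[0]·var0 = 0.167×(-0.33+0+0+0+0+0.33) = 0.0  → seasonal_output[0,0,:] = [0.0, 0.0]
var1: same as above → [0.0, 0.0]
var2: same as above → [0.0, 0.0]

seasonal_output[0] = [ [0,0],[0,0],[0,0] ] shape=(3,2)

(As expected: for a pure linear-trend series the seasonal component is ≈ 0, so its prediction is ≈ 0 as well; all the information flows through the trend branch.)

Batch-broadcast view: (2,3,6) unrolls into 6 independent row vectors; each is multiplied by the same W.T (6×2), and the results are arranged back into (2,3,2).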
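
The trace above can be replayed in a few lines. The sketch sets the weights and bias by hand to the values stated in the trace (every weight 1/6, bias 0) and uses the exact fractions ±1/3 in place of the rounded ±0.33.

```python
import torch
import torch.nn as nn

seq_len, pred_len = 6, 2
linear_seasonal = nn.Linear(seq_len, pred_len)
with torch.no_grad():
    linear_seasonal.weight.fill_(1.0 / seq_len)    # every entry 1/6, as in the trace
    linear_seasonal.bias.zero_()                   # bias = 0, as in the trace

row = torch.tensor([-1 / 3, 0.0, 0.0, 0.0, 0.0, 1 / 3])
seasonal_init = row.repeat(2, 3, 1)                # (B, enc_in, seq_len) = (2, 3, 6)

seasonal_output = linear_seasonal(seasonal_init)   # (2, 3, 2)
print(seasonal_output.shape)                       # torch.Size([2, 3, 2])
print(torch.allclose(seasonal_output, torch.zeros(2, 3, pred_len), atol=1e-6))  # True
```

Because every weight is 1/6, each output is simply the mean of the six inputs, and the two edge values cancel.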

individual=True (i=0, variable 0):

seasonal_init[:, 0, :] shape = (2, 6)  ← variable 0's history in samples 0 and 1 of the batch
Linear_Seasonal[0].weight.shape = (2, 6)  ← owned by variable 0, fully independent of W₁ and W₂
seasonal_output[:, 0, :] = Linear_Seasonal[0](seasonal_init[:, 0, :])
  output shape = (2, 2), written into seasonal_output[:,0,:]

Linear_Seasonal[1] and [2] each have their own weight and receive their own gradients during training.

5.4 Summing the two branches + permuting back

python

x = seasonal_output + trend_output
# (2, 3, 2) + (2, 3, 2) = (2, 3, 2)
# element-wise sum: each position = seasonal prediction + trend prediction

return x.permute(0, 2, 1)
# (2, 3, 2) → (2, 2, 3) = (B, pred_len, enc_in)

toy values (assuming seasonal_output[0] and trend_output[0] are both (3,2)):

x[0] = seasonal_output[0] + trend_output[0]
After x[0].permute(1,0): shape=(2,3)
  pred_t=0: [x[0,0,0], x[0,1,0], x[0,2,0] ]  ← predictions of the 3 variables at t+1
  pred_t=1: [x[0,0,1], x[0,1,1], x[0,2,1] ]  ← predictions of the 3 variables at t+2

The final return x.permute(0,2,1) has shape (2, 2, 3) = (B, pred_len, enc_in).
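
The index bookkeeping of this last step can be checked with stand-in tensors. The values below are arbitrary placeholders chosen so that each entry is easy to recognise; they are not model outputs.

```python
import torch

B, enc_in, pred_len = 2, 3, 2
seasonal_output = torch.arange(12.0).view(B, enc_in, pred_len)   # stand-in values
trend_output = 100.0 * torch.ones(B, enc_in, pred_len)           # stand-in values

x = seasonal_output + trend_output     # element-wise sum, (2, 3, 2)
out = x.permute(0, 2, 1)               # (2, 2, 3) = (B, pred_len, enc_in)

print(out.shape)     # torch.Size([2, 2, 3])
print(out[0, 0])     # tensor([100., 102., 104.]): the 3 variables at the first predicted step
print(out[0, 1])     # tensor([101., 103., 105.]): the 3 variables at the second predicted step
```

In general out[b, t, c] equals x[b, c, t], so each row of the result is one prediction step across all variables, matching the input layout (B, time, variables).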

6. Drill-down subcomponents

Subcomponent	Responsibility	Document
`self.decompsition(x)`	moving_avg + residual decomposition into seasonal/trend	03-Layer2-series_decomp
