Programming Practice: 2017-12-03

Monday, December 4, 2017

Notes on loss functions

I got completely lost among the loss functions for semantic segmentation, so here is a review.

For regression

torch.nn.L1Loss()
$loss(x, y) = \frac{1}{n} \sum_{i=1}^n |x_i - y_i|$
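Because the error enters linearly, this loss is less sensitive to outliers than MSE. (The sparsity usually associated with "L1" comes from an L1 penalty on the weights, i.e. L1 regularization, not from using L1 as the loss on the residuals.)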
torch.nn.MSELoss
$loss(x, y) = \frac{1}{n} \sum_{i=1}^n |x_i - y_i|^2$
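To check the two regression formulas by hand, here is a minimal plain-Python sketch. The function names `l1_loss` and `mse_loss` are made up for this note and are not the PyTorch API; they only mirror the formulas above on ordinary lists.

```python
def l1_loss(x, y):
    """Mean absolute error: (1/n) * sum_i |x_i - y_i|."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def mse_loss(x, y):
    """Mean squared error: (1/n) * sum_i (x_i - y_i)^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


prediction = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 3.0, 8.0]  # only the last element is off, by 4

print(l1_loss(prediction, target))   # 1.0
print(mse_loss(prediction, target))  # 4.0
```

The single large error contributes 4 to the L1 sum but 16 to the squared sum, which is why MSE reacts more strongly to outliers.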

For classification

We classify into $C$ classes.

torch.nn.CrossEntropyLoss
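A loss that combines LogSoftmax and NLLLoss.
Let $\mathbf{X}$ be a minibatch. Each element $\mathbf{x}$ of it is a $C$-dimensional vector whose $i$-th component is the score for being classified as class $i \in \{0,...,C-1\}$. The scores are converted into probabilities by $softmax$.
Let $class$ be the class that $\mathbf{x}$ should be assigned to, and $p_{class}$ the probability that $\mathbf{x}$ is assigned to $class$. Then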
$\begin{aligned}loss(\mathbf{x}, class) &= - \log \frac{\exp(\mathbf{x}[class])}{\sum_j \exp(\mathbf{x}[j])} =-\log p_{class}\\ &=-\mathbf{x}[class] + \log (\sum_j \exp(\mathbf{x}[j])) \end{aligned}$
In PyTorch, the inputs are
input: a tensor of shape $(N, C)$. Row $j$ is $\mathbf{x}_j$.
target: a tensor of shape $(N)$ holding class indices in $\{0,...,C-1\}$, $(\text{class of } \mathbf{x}_1,...,\text{class of } \mathbf{x}_N)$
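The formula and the input/target layout can be traced with a small plain-Python sketch. `log_softmax` and `cross_entropy` here are hypothetical helper names, not the PyTorch functions; averaging over the mini-batch is an assumption I make to mirror what I understand to be the default behaviour.

```python
import math


def log_softmax(scores):
    """Return log(softmax(scores)), using the log-sum-exp trick for stability."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum_exp for s in scores]


def cross_entropy(batch_scores, targets):
    """Mean over the batch of -log p_class, where p = softmax(scores)."""
    losses = [-log_softmax(scores)[cls]
              for scores, cls in zip(batch_scores, targets)]
    return sum(losses) / len(losses)


# input: N = 2 samples, C = 3 classes (raw scores, not probabilities)
batch_scores = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
# target: one class index per sample
targets = [0, 2]

loss = cross_entropy(batch_scores, targets)
print(loss)
```

For the second sample all scores are equal, so its probability for any class is 1/3 and its loss is $\log 3$.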
torch.nn.NLLLoss
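Almost the same as CrossEntropyLoss. The difference is whether the log-softmax is applied inside the loss: NLLLoss does not apply it, so the input $\mathbf{x}$ must already contain log-probabilities (for example, the output of LogSoftmax).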
$loss(\mathbf{x}, class) = -\mathbf{x}[class]$
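As a quick check of this formula, the sketch below picks out $-\mathbf{x}[class]$ from inputs that are already log-probabilities. `nll_loss` is a made-up name for this note, and averaging over the batch is my assumption.

```python
import math


def nll_loss(batch_log_probs, targets):
    """Mean over the batch of -x[class]; x must hold log-probabilities."""
    picked = [-log_probs[cls]
              for log_probs, cls in zip(batch_log_probs, targets)]
    return sum(picked) / len(picked)


# Log-probabilities for N = 2 samples and C = 2 classes
batch_log_probs = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.2), math.log(0.8)],
]
targets = [0, 1]

loss = nll_loss(batch_log_probs, targets)
print(loss)  # the mean of -log(0.9) and -log(0.8)
```

Feeding raw scores instead of log-probabilities would still run, but the result would no longer be a negative log-likelihood.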
torch.nn.PoissonNLLLoss: omitted.
torch.nn.NLLLoss2d
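The image version of NLLLoss: it computes NLLLoss for every pixel of the input. (In later PyTorch versions NLLLoss2d is deprecated, and NLLLoss itself accepts $(N, C, H, W)$ input.)
input: a tensor of shape $(N, C, H, W)$. Ignoring the mini-batch dimension $N$ for now, let $x^{c}_{i,j}$ be the $(i, j)$ element of channel $c \in \{0,...,C-1\}$.
$x^{c}_{i,j}$ is the score (a log-probability, as with NLLLoss) that pixel $(i, j)$ of the input belongs to class $c$. With $class(ij)$ denoting the true class of that pixel,
$loss(\mathbf{x}, \mathbf{class}) = - \sum_{i,j} x^{class(ij)}_{i,j}$
target is a tensor of shape $(N, H, W)$. Its $(n, h, w)$ element is the class index of pixel $(h, w)$ in the $n$-th sample, not a one-hot flag:
$target_{n, h, w} = class_n(hw) \in \{0,...,C-1\}$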
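The per-pixel indexing is easy to get wrong, so here is a plain-Python sketch for a single image. It assumes, as in PyTorch, that target stores one class index per pixel. `nll_loss_2d` is a hypothetical name, and dividing by the number of pixels is my choice for this sketch (the formula above writes the plain sum).

```python
import math


def nll_loss_2d(log_probs, target):
    """Per-pixel NLL for one image.

    log_probs: nested list of shape (C, H, W) holding log-probabilities.
    target:    nested list of shape (H, W) holding class indices in [0, C-1].
    Returns the mean over pixels of -log_probs[target[i][j]][i][j].
    """
    height, width = len(target), len(target[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            total -= log_probs[target[i][j]][i][j]
    return total / (height * width)


# C = 2 classes, H = 1, W = 2
log_probs = [
    [[math.log(0.7), math.log(0.4)]],  # channel for class 0
    [[math.log(0.3), math.log(0.6)]],  # channel for class 1
]
target = [[0, 1]]  # pixel (0, 0) is class 0, pixel (0, 1) is class 1

loss = nll_loss_2d(log_probs, target)
print(loss)  # the mean of -log(0.7) and -log(0.6)
```

Note that target has no channel dimension: the class index itself selects which channel of the input is read at each pixel.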
torch.nn.KLDivLoss
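A loss based on the KL divergence. target is a probability distribution, so it sums to 1. input is expected to contain log-probabilities, which is why $x_i$ appears without a $\log$ below.
$loss(x, target) = \frac{1}{n} \sum_i target_i (\log (target_i) - x_i)$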
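A small plain-Python sketch of this formula, where the input is given as log-probabilities and the target as probabilities. `kl_div_loss` is a made-up name, and treating $0 \log 0$ as 0 is a convention I assume here.

```python
import math


def kl_div_loss(log_q, p):
    """(1/n) * sum_i p_i * (log p_i - log_q_i), treating 0 * log 0 as 0."""
    total = 0.0
    for lq, pi in zip(log_q, p):
        if pi > 0.0:
            total += pi * (math.log(pi) - lq)
    return total / len(p)


p = [0.5, 0.5]                            # target distribution
log_q = [math.log(0.25), math.log(0.75)]  # input, as log-probabilities

loss = kl_div_loss(log_q, p)
print(loss)

# Identical distributions give zero loss
print(kl_div_loss([math.log(0.5), math.log(0.5)], p))  # 0.0
```

Because of the $\frac{1}{n}$ factor, this is the KL divergence divided by the number of elements rather than the divergence itself.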
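torch.nn.BCELoss, binary cross entropy criterion
$loss(o, t) = -\frac{1}{n} \sum_i \left\{t[i] \log (o[i]) + (1-t[i]) \log(1-o[i])\right\}$
Here $o[i]$ must be a probability in $[0, 1]$ (for example a sigmoid output). Since $\log$ diverges when $o[i]$ saturates at 0 or 1, this is numerically unstable, so BCEWithLogitsLoss is provided.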
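The BCE formula, written out as a plain-Python sketch on lists of probabilities. `bce_loss` is a hypothetical name, not the PyTorch class, and no clamping is done here, which makes the instability visible.

```python
import math


def bce_loss(outputs, targets):
    """-(1/n) * sum_i [t_i * log(o_i) + (1 - t_i) * log(1 - o_i)]."""
    total = 0.0
    for o, t in zip(outputs, targets):
        total += t * math.log(o) + (1.0 - t) * math.log(1.0 - o)
    return -total / len(outputs)


outputs = [0.9, 0.2]  # probabilities, e.g. after a sigmoid
targets = [1.0, 0.0]

loss = bce_loss(outputs, targets)
print(loss)  # the mean of -log(0.9) and -log(0.8)
```

If an output is exactly 0.0 or 1.0, `math.log(0.0)` raises a `ValueError` in this sketch, which is the plain-Python face of the instability mentioned above.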
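torch.nn.BCEWithLogitsLoss
$loss (o, t) = -\frac{1}{n} \sum_i \left\{t[i] \log(sigmoid(o[i])) + (1-t[i]) \log(1- sigmoid(o[i]))\right\}$
Here $o[i]$ is a raw score (logit); the sigmoid is applied inside the loss. Apparently it is used for auto-encoders. Make sure that $0 \leq t[i] \leq 1$ always holds.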
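To see why working on logits helps, the per-element loss can be rearranged algebraically into $\max(o, 0) - o \cdot t + \log(1 + e^{-|o|})$, which never takes the log of 0. The sketch below uses that rearrangement. It is one common way to write it, and I am not claiming it is exactly how PyTorch implements it; `bce_with_logits` is a made-up name.

```python
import math


def bce_with_logits(logits, targets):
    """Mean of max(o, 0) - o * t + log(1 + exp(-|o|)) over all elements."""
    total = 0.0
    for o, t in zip(logits, targets):
        total += max(o, 0.0) - o * t + math.log1p(math.exp(-abs(o)))
    return total / len(logits)


# Moderate logits: agrees with applying sigmoid and then the BCE formula
loss = bce_with_logits([2.0, -1.0], [1.0, 0.0])
print(loss)

# Extreme logit: sigmoid(1000.0) rounds to 1.0 in floating point, so
# log(1 - sigmoid(o)) would fail, but the rearranged form stays finite
extreme = bce_with_logits([1000.0], [0.0])
print(extreme)  # 1000.0
```

The trick is that `exp` is only ever applied to a non-positive number, so it can underflow to 0 but never overflow.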

For semantic segmentation, you end up writing and implementing complicated loss functions yourself...