Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

## Publication date
2018-08-30
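
## 1. Overview
The paper proposes and evaluates a method for embedding a backdoor in a model: feeding in a chosen trigger makes the model misclassify the input as the attacker's intended label, while the model's own accuracy barely drops.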

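## 2. Novelty / differences
Existing adversarial approaches degrade the model's performance, so they are quickly exposed by checking something like the Bayes error rate. This method does not reduce the model's original accuracy, so it goes unnoticed. The visual change to the image is also small.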

## 3. Method
A perturbation mask is created; any image with the mask applied ends up classified under the same label.
![overview](https://user-images.githubusercontent.com/10243885/45195591-5e7d3280-b293-11e8-8386-607b31bf7352.png)
Backdoor images are planted in the training data, and the model is trained to minimize the loss on that data.
![default](https://user-images.githubusercontent.com/10243885/45195787-4c4fc400-b294-11e8-9e7d-d4ea41d99fee.PNG)
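
The planting step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `poison_dataset`, the `rate` parameter, the additive application of the mask, and the relabeling of stamped images to `target_label` are all assumptions made for the example.

```python
import numpy as np

def poison_dataset(images, labels, mask, target_label, rate, seed=0):
    """Stamp `mask` onto a random fraction `rate` of the images and relabel them.

    Hypothetical helper: assumes pixel values in [0, 255] and a mask that
    broadcasts against a single image.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(rate * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    poisoned_images = images.astype(np.float32).copy()
    poisoned_labels = labels.copy()

    # Add the mask and keep pixel values in the valid range.
    poisoned_images[idx] = np.clip(poisoned_images[idx] + mask, 0.0, 255.0)
    # Assumption: backdoor samples carry the attacker's target label.
    poisoned_labels[idx] = target_label
    return poisoned_images, poisoned_labels, idx
```

Training then proceeds on `poisoned_images` and `poisoned_labels` with whatever loss the model already uses.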

There are two kinds of perturbation mask: the static perturbation mask and the adaptive perturbation mask.
The heat map of each mask is shown below (static first).
![heatmap2](https://user-images.githubusercontent.com/10243885/45196051-650ca980-b295-11e8-81af-f1eae05fef97.png)
![heatmap](https://user-images.githubusercontent.com/10243885/45196193-0eec3600-b296-11e8-8824-0e2e953944e0.png)

For the static perturbation mask, the value c of the non-zero heat map elements is a hyperparameter.
Examples with the static mask applied (second row: c=6, third row: c=10):
![img_with_emp](https://user-images.githubusercontent.com/10243885/45196301-86ba6080-b296-11e8-8fec-e855f315ae17.png)
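
As a rough sketch of what a static mask with strength c could look like in code: the grid layout below (one non-zero pixel every `period` pixels) is an assumed pattern chosen for illustration, not the exact pattern from the paper, and `static_mask` and `apply_mask` are hypothetical names. Grayscale images with values in [0, 255] are assumed.

```python
import numpy as np

def static_mask(height, width, c, period=2):
    """Repeating mask: value c on a regular grid, 0 elsewhere (assumed layout)."""
    rows, cols = np.indices((height, width))
    on_grid = (rows % period == 0) & (cols % period == 0)
    return np.where(on_grid, c, 0).astype(np.float32)

def apply_mask(image, mask):
    """Add the mask to a grayscale image and clip to the valid pixel range."""
    return np.clip(image.astype(np.float32) + mask, 0.0, 255.0)
```

A larger c makes the trigger stronger but also easier to see, which is the trade-off the c=6 and c=10 rows illustrate.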

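However, this is a repeating pattern, which is not ideal, so the authors developed the adaptive perturbation mask.
The basic idea: deep learning models are nonlinear and draw decision boundaries that pass very close to the data, so a small perturbation can push a sample across the boundary into another class's region.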

Generated samples:
![img_with_pert](https://user-images.githubusercontent.com/10243885/45196500-7d7dc380-b297-11e8-9680-bdcfc662a6d7.png)
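
The boundary-pushing idea can be illustrated on a toy linear classifier. This is only a sketch of the intuition, not the paper's mask-generation algorithm: the linear scores `W @ x + b`, the function name `push_to_target`, and the `overshoot` and `max_iter` parameters are all assumptions made for the example.

```python
import numpy as np

def push_to_target(x, W, b, target, overshoot=0.02, max_iter=50):
    """Find a small perturbation r so that argmax(W @ (x + r) + b) == target.

    Each iteration takes the shortest L2 step that crosses the boundary
    between the currently predicted class and the target class.
    """
    r = np.zeros_like(x, dtype=np.float64)
    for _ in range(max_iter):
        scores = W @ (x + r) + b
        current = int(np.argmax(scores))
        if current == target:
            break
        # Normal direction of the boundary between `current` and `target`.
        w_diff = W[target] - W[current]
        # Score gap to close; the tiny constant avoids a zero step on ties.
        gap = scores[current] - scores[target] + 1e-6
        step = gap / (np.dot(w_diff, w_diff) + 1e-12) * w_diff
        # Overshoot slightly so the sample lands past the boundary, not on it.
        r = r + (1.0 + overshoot) * step
    return r
```

For a real CNN the boundary is not linear, so a step like this would have to be computed from local gradients and repeated; the toy version only shows why a short, well-aimed step can be enough.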


## 4. 結果
![default](https://user-images.githubusercontent.com/10243885/45196535-b9b12400-b297-11e8-9cf6-8b10f5918e50.PNG)

Summary of BIB (Backdoor Injection Before model training):
![image](https://user-images.githubusercontent.com/10243885/45196635-2d533100-b298-11e8-9220-849ab7f1d4dc.png)

Summary of BID (Backdoor Injection During model updating):
![image](https://user-images.githubusercontent.com/10243885/45196600-0bf24500-b298-11e8-8663-d97077ea1dfe.png)




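## 5. Discussion
The quickest defense is to destroy the pattern by adding random noise or blurring with a Gaussian filter. However, if the attacker also uses blurred images, this defense stops working.
Other defenses are conceivable too, but in the end nothing blocks the attack 100%.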
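
The two preprocessing defenses mentioned above could look roughly like this. It is a minimal NumPy sketch under assumptions: grayscale images in [0, 255], and hypothetical names and defaults (`gaussian_blur`, `add_noise`, `sigma`, `radius`, `std`) that do not come from the paper.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma=1.0, radius=2):
    """Separable Gaussian blur with edge padding; output has the input shape."""
    kernel = gaussian_kernel(sigma, radius)
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    smooth = lambda v: np.convolve(v, kernel, mode="valid")
    along_rows = np.apply_along_axis(smooth, 1, padded)
    return np.apply_along_axis(smooth, 0, along_rows)

def add_noise(image, std, seed=0):
    """Add zero-mean Gaussian noise and clip to the valid pixel range."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, std, size=image.shape)
    return np.clip(noisy, 0.0, 255.0)
```

Either function would be applied to every input before it reaches the model, so that a fine-grained trigger pattern is smeared out or drowned before classification.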

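## 6. Comments
I was impressed: this matches exactly what I wanted to do a while back.
As a defense, wouldn't it work to simply prepare several models and ensemble them? Not sure, though.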


## Paper information / links
https://arxiv.org/abs/1808.10307
