[Paper Review] Contrastive Representation Learning for Electroencephalogram Classification

03 Feb 2022 in Study on Seqclr

Self-supervised learning

Overview
Method
- Channel recombination and preprocessing
  - Channel augmentations
  - Learning algorithm
Appendix
- A. Choosing the sequence-length

Overview

Channel Augmenter
- Time shift
- Masking
- Amplitude scale
- Band-stop filter
- DC shift
- Additive noise
Channel Encoder
Projector
Contrastive loss function

Method

Channel recombination and preprocessing

Channel augmentations

Transformation Ranges

Transformation	min	max
Amplitude scale	0.5	2
Time shift (samples)	-50	50
DC shift (µV)	-10	10
Zero-masking (samples)	0	150
Additive Gaussian noise (σ)	0	0.2
Band-stop filter (5 Hz width) (Hz)	2.8	82.5

Learning algorithm

본 논문에서 제안하는 SeqCLR은 잠재 공간(latent space) 내부에서 대조 함수(contrastive loss)를 통해 하나의 채널로부터 증강된 두 개의 데이터 간의 similarity를 최대화함으로써 특징을 학습하는 방법입니다.

Channel Augmenter
- N 채널의 미니배치를 2N 개의 증강된 채널로 랜덤하게 변환합니다.
- Masking, amplitude scale, band-stop filter, DC shift, additive noise 중에서 랜덤하게 2가지를 적용하여 positive pair를 생성합니다.
Channel Encoder
- Input 채널을 동일한 길이의 4개의 feature 채널로 변환합니다.
- 이 속성을 통해 다른 downstream task에 대해 다른 길이의 시퀀스를 인코딩 할 수 있으며, 구체적으로 감정 인식 과제는 1초 길이의 세그먼트로 정의되고 수면 단계 과제는 30초 길이의 에폭을 고려합니다.
- 이를 위해 Reccurrent encoder 및 Convolutional encoder 두 가지 인코터 아키텍처를 설계하였습니다.
- Recurrent Encoder는 두 개의 recurrent residual units을 사용하며, GRU 장치가 다양한 시간 규모에서 기능을 학습할 수 있도록 하는 다중 규모 입력(채널의 다운샘플링 및 업샘플링 사용)이 있는 순환 인코더입니다.
- Convolution Encoder는 4개의 convolutional residual units을 사용하며, 출력 신호가 입력 신호와 동일한 길이인지 확인하기 위해 reflection paddings(이후 convolution layer의 kernel 사이즈에 해당)사용하는 컨볼루션 인코더입니다.
Projector
- Encoder의 출력을 32-dimensional point로 축소하는 recurrent projection head입니다.
- 다운샘플링 및 양방향 LSTM 장치를 사용하며, 여기서 각 방향의 최종 출력은 연결되어 그 사이에 ReLU 활성화가 있는 dense layers로 들어갑니다.
Contrastive loss function
- NT-Xent (normalized temperature-scaled cross entropy) loss를 사용하였습니다.
- Positive pair인 $x_i$ 및 $x_j$를 포함하는 세트 {$x_k$}가 주어지면 contrastive task는 주어진 $x_i$에 대해 {$x_k$}$_{k≠i}$에서 $x_j$를 식별하는 것을 목표로 합니다.
- $z_i$와 $z_j$가 $x_i$와 $x_j$의 양의 쌍에 대한 projector의 output이라고 가정하면 positive pair에 대한 NT-Xent 손실 항은 다음과 같이 정의됩니다.
\[l_{i,j} = - \log \frac{\exp(sim(z_i,z_j)/τ)}{\sum_{k≠i}^{2N}\exp(sim(z_i,z_k)/τ)}\]
- $sim(υ, ν)$는 $υ$와 $ν$의 cosine similarity이며, $τ$는 temperature parameter입니다.
- 최종 손실은 두 orders($i,j$ 및 $j,i$)의 모든 positive pairs에 대한 $l_{i,j}$의 평균입니다.

Appendix

A. Choosing the sequence-length

다른 EEG 분류 과제에는 다른 길이의 시퀀스가 필요합니다. 예를 들어, 수면 단계를 분류하기 위해서는 30 이상의 긴 시퀀스가 필요하지만 감정 인식 또는 동작 상상 분류와 같은 과제는 1초 혹은 더 짧은 시퀀스로 정의됩니다.

본 논문에서는 encoder가 다양한 과제에 유용한 특징을 학습할 수 있도록 하기 위해 사전 실험을 진행하였습니다. 다양한 길이의 신호에 대해 encoder를 학습하고 감정 인식 및 수면 단계 분류 정확도를 비교하였습니다.

Sequence-length

사전 실험을 통해 20초 길이의 시퀀스가 두 과제 모두에서 잘 수행되는 것이 관찰되었습니다.

[Paper Review] Contrastive Representation Learning for Electroencephalogram Classification

Overview

Method

Channel recombination and preprocessing

Channel augmentations

Learning algorithm

Appendix

A. Choosing the sequence-length

Encanto

Error

Overview

Method

Channel recombination and preprocessing

Channel augmentations

Learning algorithm

Appendix

A. Choosing the sequence-length

Templates (for web app):

Error