caffe => Batch normalization

Introduction

"Normalizes the input to have 0-mean and/or unit (1) variance across the batch.

This layer computes Batch Normalization as described in [1].

[...]

[1] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift." arXiv preprint arXiv:1502.03167 (2015)."
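
To make "0-mean and unit variance across the batch" concrete, here is a minimal Python sketch for the activations of a single channel. It is not Caffe's implementation: the function name `normalize_batch` is hypothetical and the `eps` value is an assumption chosen for illustration.

```python
import math

def normalize_batch(values, eps=1e-5):
    """Normalize a list of activations to zero mean and unit variance.

    `values` holds every activation of one channel across the mini-batch.
    `eps` is a small constant (assumed value) that avoids division by zero.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]

batch = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_batch(batch)
```

After the call, `normalized` has a mean of approximately zero and a variance just under one (slightly below one because of `eps`).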

Parameters

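Parameter	Details
use_global_stats	From rohrbach's post of 2 March 2016 - maybe he knows:
(use_global_stats)	"By default, during training time, the network is computing global mean/variance statistics via a running average, which is then used at test time to allow deterministic outputs for each input. You can manually toggle whether the network is accumulating or using the statistics via the use_global_stats option. IMPORTANT: for this feature to work, you MUST set the learning rate to zero for all three parameter blobs, i.e., param {lr_mult: 0} three times in the layer definition.
(use_global_stats)	This means that by default you don't have to set use_global_stats at all in the prototxt, because batch_norm_layer.cpp sets it as follows: use_global_stats_ = this->phase_ == TEST;"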
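
The difference between accumulating and using the statistics can be sketched in Python as follows. This is a simplified single-channel illustration, not Caffe's code: the class `BatchNormSketch`, its attribute names, and the exact update rule are assumptions made for this example; only the idea of three pieces of state and the `use_global_stats` switch comes from the description above.

```python
import math

class BatchNormSketch:
    """Toy single-channel batch norm with accumulated statistics.

    Hypothetical illustration only; the class, attribute names and the
    update rule are simplified assumptions, not Caffe's actual code.
    """

    def __init__(self, moving_average_fraction=0.999, eps=1e-5):
        self.fraction = moving_average_fraction
        self.eps = eps
        # Three pieces of state, mirroring the three parameter blobs.
        self.mean_sum = 0.0
        self.var_sum = 0.0
        self.scale = 0.0

    def forward(self, values, use_global_stats):
        if use_global_stats:
            # Test time: use the accumulated statistics, so the output for
            # an input does not depend on the rest of the batch.
            factor = 1.0 / self.scale if self.scale != 0 else 0.0
            mean = self.mean_sum * factor
            var = self.var_sum * factor
        else:
            # Training time: use this mini-batch's statistics and fold
            # them into the decaying sums.
            n = len(values)
            mean = sum(values) / n
            var = sum((v - mean) ** 2 for v in values) / n
            self.scale = self.scale * self.fraction + 1.0
            self.mean_sum = self.mean_sum * self.fraction + mean
            self.var_sum = self.var_sum * self.fraction + var
        return [(v - mean) / math.sqrt(var + self.eps) for v in values]


bn = BatchNormSketch()
bn.forward([1.0, 2.0, 3.0, 4.0], use_global_stats=False)  # accumulate
out_a = bn.forward([2.5], use_global_stats=True)
out_b = bn.forward([2.5, 100.0], use_global_stats=True)
```

With `use_global_stats=True`, the input `2.5` maps to the same output whether it is alone or batched with other values, which is the deterministic test-time behavior the quote describes. In this sketch the state is updated in the forward pass rather than by gradient descent, which is consistent with the requirement to set `lr_mult: 0` for all three blobs.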

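Prototxt for training

The following is an example definition for training a BatchNorm layer with channel-wise scale and bias. Typically a BatchNorm layer is inserted between a convolution layer and a rectification layer. In this example, the convolution outputs the blob layerx and the rectification receives the blob layerx-bn.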

layer { bottom: 'layerx' top: 'layerx-bn' name: 'layerx-bn' type: 'BatchNorm'
  batch_norm_param {
    use_global_stats: false  # calculate the mean and variance for each mini-batch
    moving_average_fraction: .999  # doesn't affect training
  }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}
# channel-wise scale and bias are separate
layer { bottom: 'layerx-bn' top: 'layerx-bn' name: 'layerx-bn-scale' type: 'Scale'
  scale_param {
    bias_term: true
    axis: 1      # scale separately for each channel
    num_axes: 1  # ... but not spatially (default)
    filler { type: 'constant' value: 1 }           # initialize scaling to 1
    bias_filler { type: 'constant' value: 0.001 }  # initialize bias
  }
}
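
As a rough picture of what the separate Scale layer contributes, the Python sketch below applies one scale and one bias per channel, shared across all spatial positions. The function name `scale_channels` and the nested-list blob layout are assumptions made for illustration; this is not Caffe's implementation.

```python
def scale_channels(blob, scale, bias):
    """Apply y = scale[c] * x + bias[c] to a blob shaped [N][C][spatial]."""
    return [
        [[scale[c] * x + bias[c] for x in channel]
         for c, channel in enumerate(sample)]
        for sample in blob
    ]

# One sample, two channels, three spatial positions each.
blob = [[[0.0, 1.0, -1.0], [0.5, 0.5, 0.5]]]
scaled = scale_channels(blob, scale=[1.0, 2.0], bias=[0.001, 0.0])
```

Every position in a channel is transformed with the same pair of values, which corresponds to `axis: 1` and `num_axes: 1` in the definition above.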

More information can be found in this thread.

Prototxt for deployment

The main change needed is to switch use_global_stats to true. The layer then uses the accumulated moving average instead of per-batch statistics.

layer { bottom: 'layerx' top: 'layerx-bn' name: 'layerx-bn' type: 'BatchNorm'
  batch_norm_param {
    use_global_stats: true  # use pre-calculated average and variance
  }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}
# channel-wise scale and bias are separate
layer { bottom: 'layerx-bn' top: 'layerx-bn' name: 'layerx-bn-scale' type: 'Scale'
  scale_param {
    bias_term: true
    axis: 1      # scale separately for each channel
    num_axes: 1  # ... but not spatially (default)
  }
}

Modified text is an extract of the original Stack Overflow Documentation

Licensed under CC BY-SA 3.0

Not affiliated with Stack Overflow
