What I Newly Learned

Activation Functions

Zero-Centered Output

The Full Process of Backpropagation

Weight Initialization


Things I Thought About

[From Activation Functions] ReLU

<aside> ReLU is more widely used than sigmoid or Tanh because it avoids the vanishing gradient problem, is cheap to compute, improves generalization through sparse activation, and delivers strong results in practice. For these reasons, ReLU is preferred as the default activation function in many deep learning networks.

</aside>
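A minimal sketch of why ReLU helps with vanishing gradients, using NumPy (my own illustration, not from the lecture): the sigmoid's derivative never exceeds 0.25 and shrinks toward 0 for large |x|, so gradients decay as they flow back through many layers, while ReLU's derivative is exactly 1 for any positive input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

# Compare each activation's gradient at a few pre-activation values.
x = np.array([-4.0, -1.0, 0.5, 4.0])

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)); peaks at 0.25 near x = 0
# and shrinks toward 0 in the saturated regions.
sigmoid_grad = sigmoid(x) * (1.0 - sigmoid(x))

# relu'(x) is exactly 1 for positive inputs and 0 otherwise; the zeros
# are also what produces the sparse activations mentioned above.
relu_grad = (x > 0).astype(float)

print("sigmoid'(x):", sigmoid_grad)
print("relu'(x):   ", relu_grad)
```

Multiplying many sigmoid gradients (each at most 0.25) across layers is what makes early-layer gradients vanish, whereas chains of ReLU gradients on the active path stay at 1.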