CS231N-6&7-Training Neural Networks

Activation functions, Data Preprocessing, Weight Initialization, Batch Normalization, Learning rate, Optimization, condition number, saddle point, SGD with momentum, AdaGrad, RMSProp, Adam, Learning rate decay, Sec