[coursera/ImprovingDL/week2] Optimization algorithms (summary & questions)

summary

2.1 Mini-batch gradient descent
- batch size = m (batch gradient descent): each iteration processes the whole training set, so a single step takes too long
- batch size = 1 (stochastic gradient descent): loses the speedup from vectorization, since each step handles only one example
- batch size in between: mini-batch gradient descent, the usual choice in practice

2.2 Bias correction
At the beginning of an exponentially weighted average, the running estimate starts from v_0 = 0, so the first few values v_t = beta * v_{t-1} + (1 - beta) * theta_t are biased toward zero. Dividing v_t by (1 - beta^t) corrects this early bias; the correction factor approaches 1 as t grows.
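The mini-batch idea above can be sketched as follows. This is a minimal illustration, not the course's assignment code: the function name `random_mini_batches` and the (n_features, m) data layout are assumptions chosen to match the course's column-per-example convention.

```python
import numpy as np

def random_mini_batches(X, Y, batch_size=64, seed=0):
    """Shuffle the training set, then split it into mini-batches.

    X: data of shape (n_features, m); Y: labels of shape (1, m).
    The last batch may be smaller when m is not divisible by batch_size.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    perm = rng.permutation(m)          # random reordering of the m examples
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    batches = []
    for start in range(0, m, batch_size):
        end = start + batch_size
        batches.append((X_shuf[:, start:end], Y_shuf[:, start:end]))
    return batches

# 10 examples with batch_size=4 -> batches of sizes 4, 4, 2
X = np.arange(20, dtype=float).reshape(2, 10)
Y = np.ones((1, 10))
batches = random_mini_batches(X, Y, batch_size=4)
print([b[0].shape[1] for b in batches])  # [4, 4, 2]
```

With batch_size=m this degenerates to batch gradient descent, and with batch_size=1 to SGD, matching the two extremes described above.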
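The effect of bias correction can be seen on a constant signal: without correction the first estimates are far too small, while the corrected values recover the true level immediately. A minimal sketch (the function name is illustrative):

```python
def ewma_with_bias_correction(xs, beta=0.9):
    """Exponentially weighted moving average, raw and bias-corrected.

    v_t = beta * v_{t-1} + (1 - beta) * x_t, starting from v_0 = 0.
    Bias-corrected estimate: v_t / (1 - beta**t).
    """
    v = 0.0
    raw, corrected = [], []
    for t, x in enumerate(xs, start=1):
        v = beta * v + (1 - beta) * x
        raw.append(v)
        corrected.append(v / (1 - beta ** t))
    return raw, corrected

# For a constant input of 10, the raw average starts at 1.0 (biased low),
# while the corrected average is exactly 10.0 from the first step.
raw, corrected = ewma_with_bias_correction([10.0] * 5, beta=0.9)
print(raw[0], corrected[0])  # 1.0 10.0
```

For constant input x, v_t = x * (1 - beta^t), which is exactly why dividing by (1 - beta^t) removes the startup bias.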