Edinburgh Research Explorer

Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Standard

Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. / Jastrzębski, Stanislaw; Kenton, Zachary; Arpit, Devansh; Ballas, Nicolas; Fischer, Asja; Bengio, Yoshua; Storkey, Amos.

Proceedings of 27th International Conference on Artificial Neural Networks. Rhodes, Greece : Springer, Cham, 2018. p. 392-402.

Harvard

Jastrzębski, S, Kenton, Z, Arpit, D, Ballas, N, Fischer, A, Bengio, Y & Storkey, A 2018, Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. in Proceedings of 27th International Conference on Artificial Neural Networks. Springer, Cham, Rhodes, Greece, pp. 392-402, 27th International Conference on Artificial Neural Networks, Rhodes, Greece, 4/10/18. https://doi.org/10.1007/978-3-030-01424-7_39

APA

Jastrzębski, S., Kenton, Z., Arpit, D., Ballas, N., Fischer, A., Bengio, Y., & Storkey, A. (2018). Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. In Proceedings of 27th International Conference on Artificial Neural Networks (pp. 392-402). Rhodes, Greece: Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_39

Vancouver

Jastrzębski S, Kenton Z, Arpit D, Ballas N, Fischer A, Bengio Y et al. Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. In Proceedings of 27th International Conference on Artificial Neural Networks. Rhodes, Greece: Springer, Cham. 2018. p. 392-402. https://doi.org/10.1007/978-3-030-01424-7_39

Author

Jastrzębski, Stanislaw; Kenton, Zachary; Arpit, Devansh; Ballas, Nicolas; Fischer, Asja; Bengio, Yoshua; Storkey, Amos. / Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio. Proceedings of 27th International Conference on Artificial Neural Networks. Rhodes, Greece : Springer, Cham, 2018. pp. 392-402.