Multi-Layer Neural Network Classification of Fashion MNIST Dataset
We implemented a multi-layer neural network to classify images of fashion products from Zalando's Fashion MNIST dataset, with the aim of identifying strategies that improve neural network performance. Our baseline model consisted of two hidden layers of 50 nodes each with tanh activation, followed by a softmax output layer. It used a learning rate of $1.2 \times 10^{-2}$, a momentum coefficient of $\gamma = 0.9$, and no regularization, and was trained with mini-batch stochastic gradient descent (batch size $128$) for up to $100$ epochs with early stopping. The baseline achieved an accuracy of 78.36%. We tested several variations, changing the activation function (sigmoid, ReLU, leaky ReLU), the number of hidden nodes, and the number of hidden layers. That the baseline outperformed these variations suggests that moderate hyperparameter values, neither too low nor too high for the task at hand, yield the best overall performance.
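
As a minimal sketch of the baseline configuration described above, the following NumPy code builds a 784-50-50-10 network with tanh hidden layers and a softmax output, and applies momentum updates with the stated learning rate, momentum coefficient, and batch size. Several details are assumptions rather than the authors' implementation: the weight initialization, the classical form of the momentum update, the cross-entropy loss, and all function names are hypothetical. The data are synthetic stand-ins for a single mini-batch, and early stopping and the epoch loop are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters stated in the abstract.
LAYER_SIZES = [784, 50, 50, 10]  # 28x28 inputs, two hidden layers of 50, 10 classes
LEARNING_RATE = 1.2e-2
MOMENTUM = 0.9
BATCH_SIZE = 128


def init_params(sizes):
    # Assumption: scaled Gaussian weights and zero biases (not specified in the text).
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(params, x):
    # Returns the input plus hidden activations, and the class probabilities.
    activations = [x]
    for w, b in params[:-1]:
        activations.append(np.tanh(activations[-1] @ w + b))
    w, b = params[-1]
    return activations, softmax(activations[-1] @ w + b)


def cross_entropy(probs, y_onehot):
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))


def backward(params, activations, probs, y_onehot):
    # Backpropagation for mean cross-entropy with softmax output and tanh hidden units.
    grads = []
    delta = (probs - y_onehot) / len(y_onehot)
    for i in reversed(range(len(params))):
        a = activations[i]
        grads.append((a.T @ delta, delta.sum(axis=0)))
        if i > 0:
            delta = (delta @ params[i][0].T) * (1.0 - a ** 2)  # tanh derivative
    return grads[::-1]


def sgd_momentum_step(params, velocity, grads):
    # Assumption: classical momentum, v <- gamma * v - lr * grad; w <- w + v.
    new_params, new_velocity = [], []
    for (w, b), (vw, vb), (gw, gb) in zip(params, velocity, grads):
        vw = MOMENTUM * vw - LEARNING_RATE * gw
        vb = MOMENTUM * vb - LEARNING_RATE * gb
        new_params.append((w + vw, b + vb))
        new_velocity.append((vw, vb))
    return new_params, new_velocity


# Synthetic mini-batch standing in for standardized Fashion MNIST images.
x = rng.normal(size=(BATCH_SIZE, 784))
y = np.eye(10)[rng.integers(0, 10, size=BATCH_SIZE)]

params = init_params(LAYER_SIZES)
velocity = [(np.zeros_like(w), np.zeros_like(b)) for w, b in params]

_, probs = forward(params, x)
loss_before = cross_entropy(probs, y)
for _ in range(20):  # a few updates on the same batch, for illustration only
    activations, p = forward(params, x)
    grads = backward(params, activations, p, y)
    params, velocity = sgd_momentum_step(params, velocity, grads)
loss_after = cross_entropy(forward(params, x)[1], y)
```

In a full training run, the inner loop would iterate over shuffled mini-batches of the training set for each epoch, and a held-out validation loss would decide when to stop early. Swapping `np.tanh` and its derivative for another activation, or changing `LAYER_SIZES`, would reproduce the kinds of variations compared against the baseline.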