Buy it, use it, break it, fix it, trash it, change it, mail, upgrade it
Charge it, point it, zoom it, press it, snap it, work it, quick, erase it
Write it, cut it, paste it, save it, load it, check it, quick, rewrite it
Plug it, play it, burn it, rip it, drag it, drop it, zip, unzip it – Daft Punk
Apps, libraries, languages, platforms, cloud services, data services, fullstack systems, game engines, design tools, analytical tools
We are making haste with our Convolutional Neural Network Music Genre Classification Project. For the last couple of weeks, our group has been working on two main tasks: improving the neural network model and improving our dataset acquisition. I myself have been working on implementing various methods of building and training the neural network model. Here are a few key aspects of the model's improvement over the last few weeks:
- Data Normalization and Stratified Splitting: Normalizing the MFCCs per coefficient helped stabilize training. Stratified sampling was then implemented to maintain the class distribution across the splits (see the first sketch after this list).
- Enhanced Model Architecture: Progressively increasing the number of convolutional filters to better extract the features that matter. Adjusting the layer order to convolution, batch normalization, activation, then pooling to reduce the spatial size of the feature maps and the computation required (second sketch below).
- Training Optimization: Using early stopping, a technique that halts training when a monitored metric stops improving, which helps prevent overfitting. Using a learning rate scheduler to control how the optimizer's learning rate changes over time. Applying class weighting to balance the model's attention across all classes, which is particularly useful for handling class imbalance (third sketch below).
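To make the first point concrete, here is a minimal sketch of per-coefficient MFCC normalization and a stratified train/test split. The array shapes, the placeholder data, and the use of scikit-learn's train_test_split are illustrative assumptions rather than our exact pipeline code:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def normalize_mfccs(X):
    """Standardize each MFCC coefficient across samples and time frames."""
    # X is assumed to have shape (num_samples, num_frames, n_mfcc)
    mean = X.mean(axis=(0, 1), keepdims=True)        # one mean per coefficient
    std = X.std(axis=(0, 1), keepdims=True) + 1e-8   # avoid division by zero
    return (X - mean) / std

# Placeholder data standing in for the real MFCC features and genre labels
X = np.random.rand(1000, 130, 13)
y = np.random.randint(0, 10, size=1000)

X = normalize_mfccs(X)

# stratify=y keeps each genre's proportion the same in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```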
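The layer ordering described in the second point might look roughly like the following Keras sketch. The filter counts (32, 64, 128), the input shape, and the dense head are assumptions chosen for illustration, not our final architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(130, 13, 1), num_genres=10):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):               # filters grow as spatial size shrinks
        x = layers.Conv2D(filters, (3, 3), padding="same")(x)
        x = layers.BatchNormalization()(x)      # normalize before the nonlinearity
        x = layers.Activation("relu")(x)
        x = layers.MaxPooling2D((2, 2))(x)      # halve the spatial dimensions
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(num_genres, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```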
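Finally, a sketch of how the training optimizations could be wired together using Keras callbacks and scikit-learn's class weights, reusing build_model() and the split data from the sketches above. The patience values, batch size, and epoch count are illustrative assumptions:

```python
import numpy as np
from tensorflow import keras
from sklearn.utils.class_weight import compute_class_weight

# Reuses build_model(), X_train, and y_train from the previous sketches
model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [
    # Stop once validation loss stops improving and keep the best weights
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                  restore_best_weights=True),
    # Lower the learning rate when validation loss plateaus
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
]

# Weight each genre inversely to its frequency so rare classes are not ignored
weights = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))

model.fit(X_train[..., np.newaxis], y_train,
          validation_split=0.2, epochs=100, batch_size=32,
          callbacks=callbacks, class_weight=class_weight)
```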
Focusing on the neural network model is an in-depth task. Training is largely trial and error, and in terms of technology, Keras has been our primary framework for creating the model; there is a lot to learn from this single framework alone. Next quarter, we will likely start brainstorming pipelining methods to connect the UI we still need to create to the neural network model and the dataset acquisition. While we have not yet discussed the specific technologies we will use, the members who will focus on that part have already made up their minds about what will be most effective for welding the project together into something non-developers can use.