Why did you and your team choose the technologies you did?
We chose these technologies because our project is built around a neural network, so we focused on tools that are widely used today in machine learning and neural network analysis. Machine learning is also an exciting area of Computer Science to investigate because it reaches into numerous industries, and it differs from the programming we were exposed to in previous courses in the program. More specifically, we will use a trained convolutional neural network (CNN) to classify new songs into ten genres. Our primary data set is the GTZAN Genre Collection. Each song will be processed by Librosa, a Python library for audio processing that we will use to convert audio into spectrograms, and the processed data will be stored in MongoDB. There is also an opportunity to process additional song samples from other sources such as YouTube. Our programming language is Python, which we selected because it is the language we know best from previous courses. For the machine learning framework, we are using TensorFlow with Keras. TensorFlow is a widely used open-source library for machine learning and artificial intelligence, and it is a strong match for our project goals because it specializes in the training and inference of deep neural networks. We will also use GitHub for version control, along with the NumPy and scikit-learn libraries.
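To make the audio-to-spectrogram step concrete, here is a minimal sketch of what a magnitude spectrogram is, written with NumPy only. It is a stand-in for illustration, not the Librosa call the project would use in practice. The function name magnitude_spectrogram, the frame and hop sizes, the sample rate, and the synthetic 440 Hz test tone are all assumptions made for this example.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_length=1024, hop_length=512):
    """Return the magnitude of a short-time Fourier transform.

    Hypothetical helper for illustration; the result has shape
    (frequency bins, frames). frame_length and hop_length are assumed values.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length

    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])

    # One real FFT per frame, then transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic one-second tone standing in for a real audio clip (assumption).
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)

# The strongest frequency bin should sit close to the tone's pitch.
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 1024
```

A two-dimensional array like spec is the kind of image-like input a CNN can be trained on, which is why spectrograms pair naturally with this type of network.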
How will your project use them?
Our project will use these technologies in conjunction with one another. First, we will start from an existing data set, the GTZAN Genre Collection, and analyze it using spectrograms as well as metadata from other songs. For training, the input audio will be processed with Librosa, and MongoDB will serve as our primary data store for the processed results. Next, the neural network will be trained on this data set using our chosen machine learning framework, TensorFlow with Keras. Finally, the user will import an audio clip, the clip will be processed and sent to the trained model, and the model will return the predicted genre along with a confidence level for that prediction. As we write the source code, we will also use the NumPy and scikit-learn libraries. Throughout the project, we will make regular commits and review pull requests on GitHub, our version control system.
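The last step, reporting a genre together with a confidence level, could look roughly like the sketch below. It assumes the trained model produces one raw score per genre and that a softmax turns those scores into probabilities. The helper names, the example scores, and the alphabetical label order are assumptions for illustration; the real label order depends on how the training data is encoded.

```python
import numpy as np

# The ten GTZAN genres. Alphabetical order is an assumption for this sketch.
GTZAN_GENRES = [
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def predict_genre(scores):
    """Hypothetical helper: return (genre, confidence) for one clip."""
    probs = softmax(scores)
    best = int(np.argmax(probs))
    return GTZAN_GENRES[best], float(probs[best])


# Made-up scores standing in for the model's output on one audio clip.
example_scores = [0.1, 0.3, 0.2, 0.0, 0.5, 2.5, 0.1, 0.4, 0.2, 0.9]
genre, confidence = predict_genre(example_scores)
print(f"Predicted genre: {genre} (confidence {confidence:.0%})")
```

If the Keras model already ends in a softmax layer, its output can be passed to np.argmax directly and the softmax helper is not needed.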
What are their pros and cons?
One advantage of the technologies we selected is their excellent documentation, which will help with the challenging aspects of the project. For instance, TensorFlow provides a robust set of resources that should help our team become familiar with the software efficiently, including pre-trained models and datasets that we can study for our project. Keras also has extensive, easily accessible documentation, which will assist us with the deep learning models we seek to build and train. Another advantage is that the technologies we are using are open source, which allows us to use any part of the software as we wish. As for the other technologies, we gained experience with them in previous courses, so we have already overcome their learning curve.
As for the cons, a major one is that, despite the extensive documentation and resources that the creators of TensorFlow and Keras provide, these technologies may not be as intuitive as the ones our group has used before. This can also be viewed as a pro, since it gives us the opportunity to become familiar with technologies that are popular in machine learning and artificial intelligence. TensorFlow also has inconsistencies and architectural limitations that could pose a challenge to the consistency we require for genre classification. In addition, TensorFlow's support on Windows is limited, and our program may need regular updates because TensorFlow itself is updated frequently.