Machine Learning – Resources for Beginners

Now it’s the middle of week three, which means that the projects are selected, groups are assigned and we are diving into the most interesting part of the semester. 

Although the list of projects to choose from was quite long, my favorite, and the one I am working on right now, is “Smart Recycling Bin”. The idea is to create a cross-platform mobile app that lets users take a photo of an item and, through machine learning and database querying, find out whether the item is recyclable. If it is, the user can connect to the smart recycling bin (which will be developed by Electrical Engineering students), and the bin will open so the user can drop the item in. In other words, the app functions as an authenticator that grants recyclable items access to the smart recycling bin.

This project interested me because it is a valuable real-life application. It gives me an opportunity to work with an industry sponsor, to practice the mobile development skills I am currently learning in CS492 and, first and foremost, to learn ML, which is a big part of “authenticating” recyclable items.

I was completely new to ML/AI, so I spent the second week researching and learning about it. In this post, I will share some resources that I found useful and that helped me dive into the world of machine learning.

Comprehensive Guide to ML

Prior to starting the project, I had heard about the YouTube series by Andrew Ng, a computer scientist and entrepreneur focusing on machine learning and artificial intelligence.

His Stanford course starts from the very beginning, explaining what machine learning is and what different types (or, to be more precise, models) of learning exist, such as supervised, unsupervised, semi-supervised, and reinforcement learning. While covering these topics he includes historical facts, important figures, and of course great explanations and examples. I found these videos very interesting and engaging, and I will certainly revisit this topic when I get more free time.

https://www.youtube.com/playlist?list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN

Image Categorization

Since I couldn’t watch all 120 videos, each 5 to 15 minutes long, I decided to focus on more project-specific research. In my case, that meant image classification: the task of my model is to identify the object in an image, or in other words to answer the question of what that particular image shows.

What I quickly learned is that a convolutional neural network (CNN) is exactly what I needed, and the best resources I was able to find are as follows:

This YouTube video, which is about 30 minutes long, does a great job of summarizing the approach and gives a simple but practical example of how the model works. There are certainly some repetitive moments, but it really shows the main idea behind this approach.

How it works

2. Hands-on approach  

The TensorFlow tutorial offers hands-on experience. It’s easy to follow, and you can work through it in Google Colab, which requires no additional software.

https://www.tensorflow.org/tutorials/images/cnn

3. Full Course

Last but not least, there is a Coursera course by… you guessed it, Andrew Ng, which you can audit for free. The free version only gives access to the video lectures, but it’s a great in-depth explanation of the CNN model.

https://www.coursera.org/learn/convolutional-neural-networks/home/week/3
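To give a feel for the operation at the heart of a CNN, here is a minimal pure-Python sketch of a single convolution step followed by a ReLU. It is an illustration only: the 4x4 “image” and the hand-picked edge kernel are made up for this example, whereas a real network learns its kernels during training and stacks many such layers.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


def relu(feature_map):
    """Replace negative responses with zero, keeping positive ones."""
    return [[max(0, value) for value in row] for row in feature_map]


# Toy grayscale image (made up): dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

edges = relu(convolve2d(image, vertical_edge))
for row in edges:
    print(row)
```

The feature map is non-zero only in the middle column, which is where the dark-to-bright edge sits. Detecting simple patterns like this, layer after layer, is how a CNN builds up to recognizing whole objects.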

Google Cloud Platform

At this point, I felt I had gathered enough understanding of what I was up against and was eager to start designing my own model. However, I decided to give it one last Google…

In the Cloud Application Development class, we mostly focused on the Datastore module of Google Cloud Platform (GCP), but while we were activating other modules I remembered seeing AI and ML modules. Sure enough, Google has its own machine learning offering that speeds up the process of creating a model: you can use automated training (AutoML) to help train your own model, or you can even use existing models.

Furthermore, Google has a nifty API called the Cloud Vision API that uses existing models pre-trained on thousands of categories to identify and label objects and to recognize text and logos, to name a few features.

That API showed quite a bit of promise, so my team and I decided to use it as our first option. The model and technology will only improve over time, and because the sponsor is looking for a cross-platform application, calling an API gives us flexibility. That flexibility extends beyond platforms: as the list of recyclable items grows, we don’t have to re-train a model to identify new items.
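As a rough sketch of what talking to this API could look like, the snippet below builds the JSON body for a label detection request against the public REST endpoint and pulls the labels out of a response. The helper names build_label_request and extract_labels are hypothetical, the sample response is hand-written to mirror the response shape rather than real API output, and sending the request (including authentication) is deliberately left out.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=10):
    """Build the JSON-serializable body for a LABEL_DETECTION request (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def extract_labels(response_json):
    """Return (description, score) pairs from a parsed response (hypothetical helper)."""
    annotations = response_json["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]


# Placeholder bytes stand in for a real photo taken in the app.
payload = build_label_request(b"fake image bytes for illustration")
body = json.dumps(payload)  # this string would be POSTed to VISION_ENDPOINT

# Hand-written sample in the shape of a response; not real API output.
sample_response = {
    "responses": [
        {"labelAnnotations": [{"description": "Bottle", "score": 0.96}]}
    ]
}
print(extract_labels(sample_response))
```

Because the request is plain JSON over HTTPS, the same call can be made from any mobile platform, which is part of what makes the API attractive for a cross-platform app.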

Too good to be true

The reason I call the Cloud Vision API our first option is that, as we explored the functionality of the model, we learned that it sometimes tries to identify items we are not interested in and therefore gives us results that are inaccurate from the perspective of this project’s scope.

For example:

We can see that the model identifies the object as a bottle (96% probability) and as a plastic bottle (85%). This is great; however, you can already see that the label “glass” has a 74% probability.

Now, let’s see how the model labels a broken glass bottle. Typically you can recycle glass bottles, but not broken ones, as sharp pieces may harm workers. In the example below, we don’t even see a label that identifies the object as a bottle, only “glass” at 78% probability. Even Google has its limitations.
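To show why these two results are a problem for the app, here is a small sketch of a naive decision rule applied to the labels and scores from the two examples above. The label-to-material mapping, the 0.70 threshold, and the function names are assumptions made up for this illustration, not part of our actual app or of the API.

```python
# Hypothetical mapping from label text to a recyclable material.
MATERIAL_LABELS = {
    "plastic bottle": "plastic",
    "glass": "glass",
    "glass bottle": "glass",
}


def candidate_materials(labels, threshold=0.70):
    """Return {material: best score} for known labels at or above the threshold."""
    found = {}
    for description, score in labels:
        material = MATERIAL_LABELS.get(description.lower())
        if material is not None and score >= threshold:
            found[material] = max(score, found.get(material, 0.0))
    return found


def classify(labels, threshold=0.70):
    """Return a material name, 'ambiguous' if several match, or 'unknown' if none do."""
    materials = candidate_materials(labels, threshold)
    if not materials:
        return "unknown"
    if len(materials) > 1:
        return "ambiguous"
    return next(iter(materials))


# Labels and scores as reported in the two examples above.
plastic_bottle_photo = [("bottle", 0.96), ("plastic bottle", 0.85), ("glass", 0.74)]
broken_glass_photo = [("glass", 0.78)]

print(classify(plastic_bottle_photo))  # both plastic and glass pass the threshold
print(classify(broken_glass_photo))    # only glass passes, nothing says "broken"
```

With this rule, the plastic bottle comes back as ambiguous, while the broken bottle looks like ordinary recyclable glass because no label indicates that it is broken. A simple threshold on the labels alone cannot separate these cases.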

Conclusion

As a result of this finding, we are planning to provide a guide window in which users should center the item to be recycled, thereby reducing the amount of unnecessary information in the photo and hopefully improving the results. Alternatively, we are prepared to train our own model if we feel that the API doesn’t give us the desired results.
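One possible way to implement the guide window is to crop the photo to a centered box before sending it for labeling. The sketch below only computes the box; the function name guide_window and the choice of what fraction of the frame to keep are assumptions for illustration, and the right value would have to be tuned by testing.

```python
def guide_window(width, height, fraction=0.6):
    """Return (left, top, right, bottom) of a centered box.

    The box covers `fraction` of the image width and height. The fraction
    is a hypothetical tuning parameter, not a value from the project.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    box_w = round(width * fraction)
    box_h = round(height * fraction)
    left = (width - box_w) // 2
    top = (height - box_h) // 2
    return (left, top, left + box_w, top + box_h)


# Example: keep the central half of a 1000x800 photo in each dimension.
box = guide_window(1000, 800, fraction=0.5)
print(box)
```

The tuple uses the (left, upper, right, lower) order that image libraries such as Pillow accept in Image.crop, so the same box can be drawn as an overlay in the camera view and used to crop the photo afterwards.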

