{"id":22,"date":"2022-01-21T06:37:09","date_gmt":"2022-01-21T06:37:09","guid":{"rendered":"https:\/\/blogs.oregonstate.edu\/mutex42\/?p=22"},"modified":"2022-01-21T06:41:40","modified_gmt":"2022-01-21T06:41:40","slug":"learning-about-machine-learning","status":"publish","type":"post","link":"https:\/\/blogs.oregonstate.edu\/mutex42\/2022\/01\/21\/learning-about-machine-learning\/","title":{"rendered":"Learning about Machine Learning"},"content":{"rendered":"\n<p><p>I\u2019ve spent the past week completing a crash course on Machine Learning.&nbsp;&nbsp;Here\u2019s what I\u2019ve learned.<\/p>\n<br><\/p>\n\n\n\n<p><p>Machine Learning (ML) is a subfield of Artificial Intelligence, alongside other popular areas such as Perception; Deep Learning is itself a subfield of ML.&nbsp;&nbsp;The goal of Machine Learning is to understand the structure of a set of data and to fit it to models that can be understood and utilized by people.&nbsp;&nbsp;ML fits data to models by training on data inputs and using statistical analysis to generate outputs within a specific range.<\/p>\n<br><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"737\" src=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-1024x737.jpg\" alt=\"\" class=\"wp-image-28\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-1024x737.jpg 1024w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-300x216.jpg 300w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-768x553.jpg 768w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-1536x1105.jpg 1536w, 
https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-2048x1474.jpg 2048w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/stephen-dawson-qwtCeJ5cLYs-unsplash-1568x1129.jpg 1568w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Photo by Stephen Dawson on Unsplash https:\/\/unsplash.com\/photos\/qwtCeJ5cLYs<\/figcaption><\/figure>\n\n\n\n<p><p>The data inputs are typically structured as a set of \u2018features\u2019.&nbsp;&nbsp;The output for each data point is normally called its label, or its classification label when the output is a category.&nbsp;&nbsp;For example, if the output labels whether a vehicle is a sedan, SUV, van, motorcycle, etc., its feature set may include datapoints like how many seats it has, how many doors it has, how big the engine is, whether it has 4-wheel drive, etc.<\/p>\n<br><\/p>\n\n\n\n<p><p>There are a variety of statistical models used in Machine Learning, depending on the type of output you want to generate from your data.&nbsp;&nbsp;Although the models vary, the implementation approach is similar across them.&nbsp;&nbsp;A set of input feature data is used to train the statistical model, and this training set also provides the correct output label for each example.&nbsp;&nbsp;The ML program then updates its statistical model, e.g. 
weights for inputs and groupings, until it can generate accurate outputs on the training data.<\/p>\n<br><\/p>\n\n\n\n<p><p>The statistical models in Machine Learning are typically grouped into regression and classification.&nbsp;&nbsp;These differ based on whether you are trying to find a continuous output value (regression) or a discrete label (classification).&nbsp;&nbsp;Regressions are used for predicting continuous outputs, e.g., predicting sales based on demand, where the total sales number will continue to grow as demand increases.&nbsp;&nbsp;Classification uses models called classifiers to find discrete values, e.g., whether a datapoint denotes a sedan or a minivan.<\/p>\n<br><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-1024x683.jpg\" alt=\"\" class=\"wp-image-24\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-1024x683.jpg 1024w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-300x200.jpg 300w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-768x512.jpg 768w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-1536x1024.jpg 1536w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-2048x1365.jpg 2048w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/markus-spiske-Skf7HxARcoc-unsplash-1568x1045.jpg 1568w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Photo by Markus Spiske on 
Unsplash\nhttps:\/\/unsplash.com\/photos\/Skf7HxARcoc<\/figcaption><\/figure>\n\n\n\n<p><p>Regression and Classification models fall under the category of \u2018supervised\u2019 machine learning models.&nbsp;&nbsp;Supervised models learn to map inputs to outputs from a series of input\/output examples.&nbsp;&nbsp;These examples are used to train the statistical model.&nbsp;&nbsp;Common Regression models include linear regression (the well-known best-fit line), among others.&nbsp;&nbsp;Common Classifiers include Decision Trees, Random Forests, Neural Networks, and Na\u00efve Bayes models.&nbsp;&nbsp;Random Forests are actually a set of Decision Tree models whose collective outputs are used to vote on the correct answer.&nbsp;&nbsp;Models like this, which are made up of several models whose outputs are combined to generate a collective output, are called \u2018Ensemble Learning\u2019 techniques.<\/p>\n<br><\/p>\n\n\n\n<p><p>\u2018Unsupervised\u2019 machine learning models identify patterns in input data without reference to labeled outcomes, meaning these models are not trained on a labeled dataset.&nbsp;&nbsp;They draw their own inferences.&nbsp;&nbsp;The two major methods used are clustering and dimensionality reduction.&nbsp;&nbsp;Clustering groups datapoints with similar datapoints and identifies the distinct clusters.&nbsp;&nbsp;This can be used in things like image processing, e.g., to group images of dogs separately from images of cats without being told which is which.&nbsp;&nbsp;Common clustering models include k-means clustering, which assigns each datapoint to the nearest cluster center.&nbsp;&nbsp;Dimensionality reduction models reduce the dimensionality of your feature set down to its key features.&nbsp;&nbsp;The two main methods are feature elimination and feature extraction.&nbsp;&nbsp;A popular feature extraction method is Principal Component Analysis.<\/p>\n<br><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"819\" 
src=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-1024x819.jpg\" alt=\"\" class=\"wp-image-29\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-1024x819.jpg 1024w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-300x240.jpg 300w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-768x614.jpg 768w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-1536x1229.jpg 1536w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-2048x1638.jpg 2048w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clay-banks-no2blvVYoJw-unsplash-1568x1254.jpg 1568w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Photo by Clay Banks on Unsplash https:\/\/unsplash.com\/photos\/no2blvVYoJw<\/figcaption><\/figure>\n\n\n\n<p><p>When actually training and generating your finalized model, there are a few different approaches that are used to improve its predictive accuracy.&nbsp;&nbsp;The first method is called \u2018boosting\u2019, which essentially creates an \u2018Ensemble learning\u2019 model from your chosen classifier.&nbsp;&nbsp;This method trains a sequence of models from your classification method, e.g., by giving more weight to the training examples that earlier models got wrong.&nbsp;&nbsp;Then it takes these collective outputs, weights each of them, and uses them to vote on a \u2018true\u2019 answer for the set of models.&nbsp;&nbsp;Another method is using Genetic Algorithms.&nbsp;&nbsp;A Genetic Algorithm is a training technique that generates a random set of models from your chosen classifier.&nbsp;&nbsp;It then sees which 
of these models are most accurate, and generates \u2018child\u2019 models from these top models as the next \u2018generation\u2019 to test.&nbsp;&nbsp;It does this for multiple generations, which should ultimately produce an accurate \u2018evolved\u2019 model.&nbsp;&nbsp;Children can also be generated with things like trait swapping (crossover) and mutations, which keeps enough variety in the population to find better solutions than the parents offered.<\/p>\n<br><\/p>\n\n\n\n<p>I\u2019ll be deep diving on Neural Networks next and how to implement them in Python from scratch.&nbsp;&nbsp;Stay tuned for more!<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"681\" height=\"1024\" src=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-681x1024.jpg\" alt=\"\" class=\"wp-image-23\" srcset=\"https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-681x1024.jpg 681w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-199x300.jpg 199w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-768x1155.jpg 768w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-1021x1536.jpg 1021w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-1362x2048.jpg 1362w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-1568x2358.jpg 1568w, https:\/\/osu-wams-blogs-uploads.s3.amazonaws.com\/blogs.dir\/5122\/files\/2022\/01\/clarisse-croset-tikpxRBcsA-unsplash-scaled.jpg 1702w\" sizes=\"auto, (max-width: 681px) 100vw, 681px\" \/><figcaption>Photo by Clarisse Croset on 
Unsplash\nhttps:\/\/unsplash.com\/photos\/-tikpxRBcsA<\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>I\u2019ve spent the past week completing a crash course on Machine Learning.&nbsp;&nbsp;Here\u2019s what I\u2019ve learned. Machine Learning (ML) is a subfield of Artificial Intelligence, alongside other popular areas such as Perception; Deep Learning is itself a subfield of ML.&nbsp;&nbsp;The goal of Machine Learning is to understand the structure of a set of data and to fit it to models that can be&hellip; <a class=\"more-link\" href=\"https:\/\/blogs.oregonstate.edu\/mutex42\/2022\/01\/21\/learning-about-machine-learning\/\">Continue reading <span class=\"screen-reader-text\">Learning about Machine Learning<\/span><\/a><\/p>\n","protected":false},"author":11967,"featured_media":25,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-22","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","entry"],"_links":{"self":[{"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/posts\/22","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/users\/11967"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/comments?post=22"}],"version-history":[{"count":3,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/posts\/22\/revisions"}],"predecessor-version":[{"id":32,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/posts\/22\/revisions\/32"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/media\/25"}],"wp:att
achment":[{"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/media?parent=22"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/categories?post=22"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/mutex42\/wp-json\/wp\/v2\/tags?post=22"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}