The first programming language I ever got serious about learning was Python. My first introduction to the language was a class called ‘Introduction to Programming and Problem Solving’. I mainly took the class because it filled a general education requirement, and it had ‘Problem Solving’ in the title, which I thought was pretty cool. Out of all the problems I’ve used Python to solve, my favorite has been web scraping.
I hadn’t thought much about web scraping before, but one day when I walked into class for the analytics program I was taking, the day’s itinerary included web scraping with Python and Beautiful Soup. It didn’t take long for me to fall in love with the idea of automatically traversing websites and pulling information from them. Right after class, I went home and began fiddling around with Beautiful Soup to solve a problem I was facing at the time: I wanted a list of N3-level Japanese vocabulary to study. One way I looked up vocab was by visiting Jisho, an online Japanese dictionary. I took the basics I had learned in class and ended up writing a script that looped through Jisho pages and wrote the vocabulary out to a document I could study from. While the use case isn’t anything fancy or complicated, this was one of the first times I took on a Python project of my own, and I really enjoyed it!
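A minimal sketch of that kind of script might look like the following. It is not the original code: the sample HTML, the `entry`, `word`, and `meaning` class names, and the function names are all made up for illustration and do not reflect Jisho’s real markup. Fetching the pages is left out so the example runs without a network connection.

```python
from bs4 import BeautifulSoup

# Stand-in HTML for two result pages. The tag and class names are
# hypothetical and are NOT Jisho's real markup.
SAMPLE_PAGES = [
    """
    <div class="entry"><span class="word">経験</span><span class="meaning">experience</span></div>
    <div class="entry"><span class="word">複雑</span><span class="meaning">complicated</span></div>
    """,
    """
    <div class="entry"><span class="word">自動</span><span class="meaning">automatic</span></div>
    """,
]


def extract_vocab(html):
    """Return (word, meaning) pairs found in one page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for entry in soup.find_all("div", class_="entry"):
        word = entry.find("span", class_="word")
        meaning = entry.find("span", class_="meaning")
        if word is None or meaning is None:
            continue  # skip entries that are missing either field
        pairs.append((word.get_text(strip=True), meaning.get_text(strip=True)))
    return pairs


def collect_vocab(pages):
    """Loop over every page and gather the vocabulary in page order."""
    vocab = []
    for html in pages:
        vocab.extend(extract_vocab(html))
    return vocab


def write_study_list(vocab, path):
    """Write one tab-separated word/meaning pair per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word, meaning in vocab:
            f.write(f"{word}\t{meaning}\n")
```

Calling `write_study_list(collect_vocab(SAMPLE_PAGES), "n3_vocab.txt")` would produce a small study file. In a real scraper, the strings in `SAMPLE_PAGES` would be replaced by the HTML downloaded from each results page.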
After my first project scraping Japanese vocabulary, I moved on to scraping news websites. I was helping a friend with a project, and he had a list of websites he wanted to grab articles from. Using the Beautiful Soup library, I gathered some basic data for him to play around with. All the experience I gained from my own side projects and from helping my friend paid off when I started building web and file scrapers for my job. Instead of manually going through files to check values, I was able to write a script that walked through files on our network paths and inserted records into a database. The prototype I whipped up in Python has since become part of our application, because the data I gathered showed that we should be capturing those values in our database.
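The file-scraping idea can be sketched roughly as below, using only the standard library. This is an illustration under assumptions, not the script from my job: the `key=value` line format, the `.txt` extension, the `file_values` table, and the function names are all hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import re
import sqlite3
from pathlib import Path

# Hypothetical line format, e.g. "threshold=42"; real files may differ.
VALUE_PATTERN = re.compile(r"^(?P<key>\w+)=(?P<value>.+)$")


def parse_lines(file_name, lines):
    """Yield (file_name, key, value) for each line matching the pattern."""
    for line in lines:
        match = VALUE_PATTERN.match(line.strip())
        if match:
            yield (file_name, match["key"], match["value"])


def insert_rows(conn, rows):
    """Create the table if needed, insert the rows, and return the count."""
    rows = list(rows)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_values "
        "(file_name TEXT, key TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO file_values VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def load_directory(root, conn):
    """Scan every .txt file under root and store the parsed values."""
    inserted = 0
    for path in sorted(Path(root).rglob("*.txt")):
        with open(path, encoding="utf-8") as f:
            inserted += insert_rows(conn, parse_lines(path.name, f))
    return inserted
```

With a connection from `sqlite3.connect("values.db")`, calling `load_directory` on a shared folder would record every matching value along with the file it came from, which is the kind of data that makes it easy to check values in bulk instead of opening files one by one.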
If there’s any takeaway from this, it’s to play around with as many libraries as you can in a language, because you never know which one will just feel right and inspire you to create your own mini project.