It is embarrassing to admit, but I am one of those Computer Science majors not well versed in shell commands. Navigating directories, creating files, and compiling and debugging basic C programs are basically the beginning and end of my knowledge. This is what prompted me to take the evening to explore a little more.
Thus, I set out to write a few simple scripts. I thought it would be neat to add a couple of custom commands to my terminal. The first one prints the definition of a word to the console. I accomplished this by writing a function that accepts the word in question as an argument and requests its definition from dict.org. That function is then aliased through another function so that the argument can be passed to it.
The second command prints my IP address to the console. My computer is a laptop, so its address changes from network to network and this is not completely frivolous. The current network configuration is found through ifconfig. The results are piped to grep, which searches for lines containing “broadcast”. That line is then piped to awk, which selects the IP address portion and displays it.
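To make the grep and awk steps concrete, here is a minimal sketch of the same filtering written in Python rather than shell. The sample text and the choice of the second whitespace-separated field are assumptions modeled on macOS-style ifconfig output, not output captured from the laptop described above.

```python
# Sample text is an assumption, modeled on macOS-style ifconfig output.
SAMPLE_IFCONFIG = """\
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    inet 127.0.0.1 netmask 0xff000000
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255
"""


def find_local_ips(ifconfig_output: str) -> list[str]:
    """Return the address field from every line that mentions 'broadcast'."""
    addresses = []
    for line in ifconfig_output.splitlines():
        # grep step: keep only lines containing the lowercase word "broadcast"
        if "broadcast" not in line:
            continue
        # awk step: split on whitespace and take the second field (assumed layout)
        fields = line.split()
        if len(fields) >= 2:
            addresses.append(fields[1])
    return addresses


if __name__ == "__main__":
    print(find_local_ips(SAMPLE_IFCONFIG))
```

The match is case-sensitive, as grep is by default, so the flags line containing “BROADCAST” is skipped and only the inet line survives.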
I noticed that the commands I had aliased did not persist into a new terminal window as I intended. My default shell is Z shell, so I appended the commands to the .zshrc file in my home directory. This resource script is run every time an interactive shell is started (or restarted), so persistence was achieved. While that all sounds very basic, it was not without complications. I spent a considerable amount of time trying to figure out how to pass the argument for the dictionary word once the command was aliased. I was messing around with string expansion pointlessly until I figured out that a function was necessary. To my understanding, the alias arguments were being passed after the entire curl command, which is why it initially returned only a default response.
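The difference between an alias and a function can be illustrated with a toy model in Python. This is only a simulation of the text substitution involved, not the shell itself, and the command string is a hypothetical stand-in for the real lookup command, which may differ.

```python
def expand_alias(alias_body: str, args: list[str]) -> str:
    """Model an alias: plain text substitution, so arguments land after the whole body."""
    return " ".join([alias_body, *args])


def call_function(template: str, args: list[str]) -> str:
    """Model a shell function: the first argument can be placed anywhere via $1."""
    first = args[0] if args else ""
    return template.replace("$1", first)


if __name__ == "__main__":
    # Hypothetical command text, used only to show where the word ends up.
    print(expand_alias("curl dict://dict.org/d:", ["otter"]))
    print(call_function("curl dict://dict.org/d:$1", ["otter"]))
```

In the alias case the word is separated from the URL by a space, so the request goes out without it. In the function case the word is spliced directly into the URL.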
This little exercise was exciting and a couple of hours melted away pleasantly. It reminded me of the fun I had in Introduction to Operating Systems. Feeling reinvigorated, I’ll have to think of an OS passion project for myself after this quarter. The courses from the program here at Oregon State University have offered a lot of breadth and shown me that I have only seen the tip of the CS iceberg. I enjoy that one project just becomes a stepping stone to the next project.
A small exploration to break up the semester’s work
When I first discovered APIs on the web, my world felt expanded. As my need for different kinds of data increased, however, I started to feel limited by the available APIs I could find. That is how my interest was piqued by web scraping. I was familiar with the concept, but my assumption was that it would be difficult to learn, so I put it off for later. I was wrong. Using pre-existing Python frameworks, it was much more accessible than I imagined, and it unlocked buckets of possibilities for acquiring new information all across the web.
Before even diving into web scraping with Python, there was the question of which framework to start with. Beautiful Soup was the biggest name among the options, which also included Selenium and Scrapy. Selenium has the reputation of requiring more setup and having a steeper learning curve, while Scrapy has more built-in features, such as being able to make requests and to parse HTML more specifically than Beautiful Soup. Ultimately, I opted to start with Scrapy, but I am interested in trying all these options at some point!
The project I coded up, based on a YouTube tutorial, scraped images of otters from Reddit’s otter subreddit. Then, the pixels were mapped to ASCII characters and printed to the console. Scrapy was ridiculously easy to use. After initializing the project from the command line, the setup was simply importing Scrapy into the file and specifying a start URL. The request and the filtering to get the source tags with specific alt text happened in one single command. That was all there was to it.
Overall, I was pleased with the experience, but the results need tweaking. I got some repeat images, which wasn’t desirable. The ASCII art is a little unclear, but I have some direction for what I can change to improve it.
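The pixel-to-character step can be sketched without any image library. The character ramp, the function names, and the tiny hand-written image below are assumptions for illustration; the tutorial’s actual mapping may differ.

```python
# Characters ordered from visually dense to visually sparse; this ramp is an assumption.
ASCII_RAMP = "@%#*+=-:. "


def pixel_to_char(brightness: int, ramp: str = ASCII_RAMP) -> str:
    """Map a grayscale value in 0..255 to one character of the ramp."""
    clamped = max(0, min(255, brightness))
    # Scale 0..255 onto the valid index range 0..len(ramp)-1.
    index = clamped * (len(ramp) - 1) // 255
    return ramp[index]


def image_to_ascii(rows: list[list[int]]) -> str:
    """Convert rows of grayscale pixels into newline-separated ASCII art."""
    return "\n".join("".join(pixel_to_char(p) for p in row) for row in rows)


if __name__ == "__main__":
    # A made-up 2x4 grayscale image standing in for a downloaded otter picture.
    tiny_image = [
        [0, 64, 128, 255],
        [255, 128, 64, 0],
    ]
    print(image_to_ascii(tiny_image))
```

A longer ramp gives finer shading, which is one knob that might help when the output looks unclear.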
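One possible way to handle the repeated images, assuming the scraper collects image URLs into a list before downloading (an assumption, since the project’s structure is not shown here), is to drop duplicates while keeping the original order. The URLs are placeholders.

```python
def unique_in_order(urls: list[str]) -> list[str]:
    """Drop repeated URLs while keeping the first occurrence of each."""
    # dict keys preserve insertion order, so this keeps the first copy of each URL.
    return list(dict.fromkeys(urls))


if __name__ == "__main__":
    scraped = [
        "https://example.com/otter1.jpg",
        "https://example.com/otter2.jpg",
        "https://example.com/otter1.jpg",
    ]
    print(unique_in_order(scraped))
```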
Side Project Importance
I convinced myself that I did not have time for even a small side project. I was right, but I think I made the right call by spending time working on this. Even though web scraping is not related to current school projects, it was refreshing to try something new. I already have ideas for how I can use this technique in future projects that I have planned out and I can’t wait to see what other creative things others are doing with web scraping now that that fire is lit.
At the University of Pennsylvania, 40 panels are on display from the forty-ton ENIAC. The Electronic Numerical Integrator and Computer, announced in 1946 by the New York Times, was funded by the United States military. It was so large that it filled a 30′ by 50′ room. Designed to calculate missile trajectories, it could execute 5,000 instructions per second. For reference, the modern iPhone X weighs 0.3836 pounds and can execute up to 600 billion instructions per second. Despite that difference in computational power, the ENIAC was very impressive for its time and was later used in the development of the hydrogen bomb. Both men and women contributed to the ENIAC; however, mostly only the men were featured in the media at that time.
The term “operator,” applied to the women who programmed the ENIAC, downplays their role and hides the need to see beyond abstractions. They were enrolled in, or had recently graduated from, universities with degrees or coursework in relevant fields, and they were chosen for their competence. The devaluation of women is reflected in society and can be observed surfacing on an individual level. In an interview, Jean Bartik quotes Betty Holberton as saying: “Look like a girl, act like a lady, think like a man, and work like a dog.” Even though this piece of advice for women at the time is said with humor and self-awareness, it is still telling.
Certainly there are many brilliant men who are worth admiring, but what of the many female academic role models? It is important to recognize the accomplishments of women in order to inspire more young minds to visualize themselves contributing to the next technological advancement.
“ENIAC at Penn Engineering.” Penn Engineering, 2017, https://www.seas.upenn.edu/about/history-heritage/eniac/.
Guru, iPhone. “IPhone X as the Sum of Technologies: IGotOffer.” IGotOffer Blog, 5 Nov. 2017, https://igotoffer.com/blog/iphone-x-sum-technologies#:~:text=This%20is%20reportedly%20the%20AI,second%20for%20real%2Dtime%20processing.
Hines, Nickolaus. “The World’s First Computer Was Bigger than a T-Rex and 5 Million Times Weaker than an IPhone.” All That’s Interesting, All That’s Interesting, 9 Mar. 2016, https://allthatsinteresting.com/first-computer.
Jones, Brad, and Luke Larsen. “Long before Gates or Jobs, 6 Women Programmed the First Digital Computer.” Digital Trends, Digital Trends, 1 Mar. 2019, https://www.digitaltrends.com/computing/remembering-eniac-and-the-women-who-programmed-it/.
I watched my father try to log into his Apple account, and the experience was “painful”. He was denied time after time, just when he thought he had recalled the right combination of characters. Fearing he would be locked out of his account after too many tries, he went the “Forgot my password” route.
Sometimes this path is implemented with the user in mind and involves a simple link to reset the password via email. Other times, it is a lengthier process involving multiple security questions. These are questions whose answers you thought could only be one thing, yet against all odds that answer is not working. There must be an easier way.
Characteristics of Strength
Experts recommend that a password be more than ten characters in length and contain a variety of characters: uppercase, lowercase, numbers, and symbols. Passwords should not contain patterns, recognizable words, or anything else that is easy to guess. Yes, I am talking to you if your password is “password123”. It is also advisable that passwords not be reused from site to site. Be ready to remember seventy-plus strings of near-random characters. This is of course an unreasonable expectation, but there is a solution.
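The length and character-variety rules above can be turned into a rough check. This is a minimal sketch with a hypothetical function name. It does not detect patterns or recognizable words, so passing it does not make a password strong.

```python
import string


def password_problems(password: str) -> list[str]:
    """List which of the basic length and variety rules a password breaks."""
    problems = []
    if len(password) <= 10:
        problems.append("should be more than ten characters")
    if not any(c.isupper() for c in password):
        problems.append("missing an uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("missing a lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("missing a number")
    if not any(c in string.punctuation for c in password):
        problems.append("missing a symbol")
    return problems


if __name__ == "__main__":
    print(password_problems("password123"))
```

Note that “password123” is eleven characters, so it clears the length rule and is flagged only for lacking an uppercase letter and a symbol. That is a good reminder that simple rules miss the “recognizable word” problem entirely.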
Password Managers to the Rescue
If you already use a password manager, please take this time to give yourself a pat on the back. Password managers have existed for over ten years. However, there are still many people who are not taking advantage of them. Many, like Bitwarden or KeePass, are free and open source. The idea is that you only need to remember one password and the manager keeps track of the rest. Typically, they can also generate very strong passwords. Passwords are encrypted or hashed so that the plain-text versions are not stored.
Encryption is a two-way process: with the right secret key, the user can decrypt the ciphertext version of their password. Hashing, on the other hand, is deterministic, meaning the algorithm produces the same fixed-length result for the same input. It is considered to be irreversible. Generally, hashing is done many times, sometimes over 10,000 or even 100,000 passes depending on the manager. I hope your interest is piqued about considering a password manager, if you were on the fence. Technology should simplify our lives, not add more stress.
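The properties of hashing described above (same input gives the same output, fixed length, many passes) can be seen with Python’s standard library. This is a minimal sketch using PBKDF2 with SHA-256 as one common choice. The algorithm, the salt, and the iteration count are assumptions for illustration and not a description of how any particular manager works.

```python
import hashlib
import os


def derive_hash(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Hash a password by running HMAC-SHA256 for the given number of passes."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


if __name__ == "__main__":
    # A random salt (an addition not mentioned above) keeps equal passwords
    # from producing equal hashes across different users.
    salt = os.urandom(16)
    short = derive_hash("cat", salt)
    longer = derive_hash("a much longer passphrase than the first one", salt)
    print(len(short), len(longer))            # both are 32 bytes for SHA-256
    print(short == derive_hash("cat", salt))  # same input and salt, same result
```

Raising the iteration count makes each guess slower for an attacker, which is the reason for the tens of thousands of passes.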
When diving into a new skill or unfamiliar task, a natural response is to try to find a reference video. YouTube is a popular site, but there are many others. Perhaps you have felt the thrill of finding the “perfect” video to explain everything you need to know about topic X. It does not matter whether X is a certain library function in a programming language, a completely new framework, or even a step-by-step guide on swing dancing. Surely, by the end of this 20-minute (or 6-hour) video, you can achieve mastery of the topic.
It is possible that, after watching a single online video, the task can be completed with confidence. More often than not, this is not the case. For anyone following along with a multi-hour lesson, it can be disappointing to discover a gap in knowledge when it is time to apply the information to your own project. Most tutorials seem to boil down to the common theme of “just do it”.
Leave Tutorials Behind
Before pouring hours into watching tutorials or previewing documentation, consider that there is an optimal amount of time to spend on preparation. Everyone learns skills differently and at their own pace. However, most people can agree that it is not worth spending unnecessary time on something without a large benefit. There is some satisfaction in having “completed” a tutorial, but not everyone has the luxury of time to devote to that.
Skill of Set up and Go
This is where the skill of gathering just enough information from a tutorial to get started comes into play. Leave the door open for experimentation and testing boundaries. Any further questions that come up can be answered along the way without wasting time on things that are intuitive. When time is precious, hands-on experience will prove more valuable in the long run than the facade of experience from a code-along video. Use your time wisely!