This week we created a timeline that helped our client understand how we will move forward on the project. Creating the timeline was particularly difficult because it is hard to project how well we will do on the early parts of the project. Along with this, it’s hard to see what will need to be worked on later. Issues can crop up that were never considered during planning, and sometimes small functionality issues can cause hard stops in the programming process. As a result, the timeline is very specific for the early stages of development but becomes very vague as it continues.
I also worked on an idea for my section of the team project, which is preprocessing. My section pertains to detecting when a hand is making a new sign, so that the main neural network doesn’t need to constantly parse data. The idea comes from the general architecture of cross-correlation, which is one of the fundamentals of a convolutional neural network. In our case it can be used to detect the difference between two images by simply finding the difference at each pixel and reducing those differences to a single value. Then, if we keep track of how this value compares with a threshold from one frame to the next, we will know when the hand transitions between being still and moving. I don’t know whether this will actually work, but it’s worth testing.
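Here is a minimal sketch of that idea, assuming frames arrive as same-sized NumPy arrays of pixel values. The function names and the threshold of 5.0 are hypothetical placeholders, not anything from our code base, and the real threshold would have to be tuned on actual camera footage.

```python
import numpy as np

def frame_difference(prev_frame: np.ndarray, frame: np.ndarray) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    # Cast to a signed type so uint8 subtraction does not wrap around.
    diff = frame.astype(np.int16) - prev_frame.astype(np.int16)
    return float(np.abs(diff).mean())

def motion_states(frames, threshold: float = 5.0):
    """Label each consecutive pair of frames as 'moving' or 'still'."""
    states = []
    for prev_frame, frame in zip(frames, frames[1:]):
        score = frame_difference(prev_frame, frame)
        states.append("moving" if score > threshold else "still")
    return states

def sign_boundaries(states):
    """Indices where the hand goes from moving to still (a candidate new sign)."""
    return [
        i for i in range(1, len(states))
        if states[i - 1] == "moving" and states[i] == "still"
    ]

# Synthetic example: a blank frame, a sudden change, then blank again.
still = np.zeros((4, 4), dtype=np.uint8)
moved = np.full((4, 4), 50, dtype=np.uint8)
states = motion_states([still, still, moved, still, still])
boundaries = sign_boundaries(states)
```

Only the frames at the reported boundaries would then need to be handed to the main network, which is the whole point of the preprocessing step.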
Finally, our team has really begun work on the project. Many aspects of it, like the video recognition and preprocessing, have been discussed thoroughly. This means that we are finally moving forward and getting a feel for the code base.
Author: thompsoq
Seventh Blog Post
This week we finally got around to working on the code base provided by the previous project group. Some questions have cropped up while looking through the code, such as what the convolutional neural network is composed of. There is a severe lack of documentation clarifying what the ResNet18 is made up of. As such, it is hard to see whether the network has essential components like batch normalization or dropout. These layers are important for preventing overfitting when training the model. Not knowing whether they are included could be to our team’s detriment, as adding them when they are already present in the model could lead to underfitting.
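One way to answer this without documentation is to ask the model itself. The sketch below assumes the network is a standard PyTorch `nn.Module`. The helper names are hypothetical, and the small `toy_model` is only a stand-in, since I can’t reproduce the previous group’s network here.

```python
from collections import Counter

import torch.nn as nn

BATCH_NORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
DROPOUT_TYPES = (nn.Dropout, nn.Dropout2d, nn.Dropout3d)

def count_layer_types(model: nn.Module) -> Counter:
    """Count every submodule (including the model itself) by class name."""
    return Counter(type(m).__name__ for m in model.modules())

def has_regularization_layers(model: nn.Module) -> dict:
    """Report whether any batch-norm or dropout layers are present."""
    modules = list(model.modules())
    return {
        "batch_norm": any(isinstance(m, BATCH_NORM_TYPES) for m in modules),
        "dropout": any(isinstance(m, DROPOUT_TYPES) for m in modules),
    }

# Stand-in model; the project's real network would be passed in instead.
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)
layer_counts = count_layer_types(toy_model)
report = has_regularization_layers(toy_model)
```

Running the same two functions on the inherited network should tell us whether adding more batch normalization or dropout would be redundant.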
We are also beginning to develop our timeline. This is particularly difficult to decide, as it’s hard to tell how long it will take to tune the hyperparameters. Along with this, the model itself could be flawed, meaning it may have to be scrapped if we come to that crossroads. All of this adds up to many things that are beyond our ability to foresee. However, I will be discussing with my teammate Travis what the best option will be moving forward. It shouldn’t really matter in the long run, as a timeline is just a best estimate of how long the work will take, and any roadblocks can be discussed with our sponsor.
Finally, we’ve been discussing what our first steps will be when it comes to implementing changes to the code base. GitHub will be our primary centralized version control, along with PEP 8 and Black to handle the code style. One thing most of our team agrees on is that the lack of consistent style in the previous code needs to change. My most common personal complaint is that random semicolons are added throughout the program. This is unnecessary, and was likely caused by someone who was very used to C-style coding. Although none of this really affects the program’s capabilities, it makes the code harder to read.
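As a rough illustration of how the stray semicolons could be located before cleaning them up, here is a small sketch using Python’s standard `tokenize` module. The function name `find_semicolons` and the sample source are made up for this example. Going through the tokenizer means semicolons inside strings and comments are not reported.

```python
import io
import tokenize

def find_semicolons(source: str):
    """Return (line, column) positions of every ';' operator token."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tok.start
        for tok in tokens
        if tok.type == tokenize.OP and tok.string == ";"
    ]

# Only the semicolon ending the first statement is a real token;
# the ones inside the string and the comment are skipped.
sample = 'x = 1;\nmsg = "a;b"  # comment; ignored\ny = 2\n'
positions = find_semicolons(sample)
```

Line numbers are 1-based and columns are 0-based, matching what `tokenize` reports.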
Sixth Blog Post
This week saw our actual introduction to the project codebase. The GUI setup is apparently auto-generated by an application. It’s an interesting API to learn about, as the GUIs I have worked on in the past have all been created manually. I don’t really know whether this is a better option than the alternative. I can see it being helpful for someone who wants an easier way to create a general front-end interface that does not provide much functionality. However, very specific actions could cause some roadblocks in the future. This is not very important, as much of the front-end interface has already been finished.
Setting up the webcam and the actual program started with some faults. The Setup page didn’t describe exactly which folder the Python virtual environment was supposed to be set up in, which led to some frustration. Along with this, one of the libraries listed in requirements.txt didn’t work, so I had to dig through some online discussions to find a workaround. It’s good to keep this in mind for the future, as the program we create will need to be tested on a clean environment.
The design document has so far been a pretty confusing document to create. The example document helps a bit, but it will take a while for my team to figure out exactly what we want to put down for each of the design concerns and concepts. Usually a group document consists largely of combining the individual documents. This one, however, requires a lot more conversation about what the contents will be.
Fifth Blog Post
This week we started to talk about the actual changes we needed to make in our project. It was confirmed that unfortunately, because of how fundamentally different image recognition is from video recognition, almost the entire neural network will need to be scrapped and created again from scratch. The new network will be much more difficult to create, as the extra dimension to allow for recognition over time means that there will be even more layers of complexity. However, I am somewhat glad we are doing this. Having a fully furnished project handed to us with the most interesting parts already completed would have been a dull task. We get to recreate a network that I have never worked with before, which means that there will be a lot of interesting topics to research.
We also discussed our individual technology reviews and came to the agreement that they were done incorrectly. We each chose separate sections of the project that are pretty dissimilar to one another, meaning each of us would have to research all aspects of the project. Instead, it was decided that the project should be compartmentalized into separate sections, so that each person only needs to research and work on their specific section. This is pretty obvious in hindsight, and I feel like I should have seen it, as I have discussed this before with my mentor at Garmin.
Finally, we got our Intel cameras in the mail. The depth sensing feature using LiDAR is an extremely unique technology. It is undetermined whether it will actually help when it comes to lowering the loss of a neural network. In my opinion, adding depth may actually make the network perform worse. A hand is usually a single-colored object occupying a small range of the three primary colors. However, a depth-based camera would rapidly shift the colors throughout the whole spectrum. As a 2D CNN would have to keep track of these shifts across all three color channels, it would have to be fed more data to learn this pattern. A harder pattern to distinguish means a more difficult time finding a good minimum.
Fourth Blog Post
A lot of this week was spent talking about different aspects of the project and what comes with them. The requirements were mostly already gone over in our previous meeting. Unfortunately, the intern who previously worked on this project was out of office, meaning we could not get good in-depth requirements for what we were specifically supposed to do. Luckily, a lot of conclusions can be drawn from the GitHub code provided. We have created our own requirements that we think should be done along with the project. For example, refactoring the code base would be a good idea. A lot of the program is split into individual scripts that are handled independently. We believe it would be a good idea to merge all of these individual pieces so that we can run them from one main file.
We also had to take our time to divide tasks between us. It was especially difficult, as attempting to split parts of the project up can lead to dependencies between pieces of work. Sometimes a project member may have nothing to do because the sections they need to work on are being held back by someone else. We managed to get something cohesive between the three of us, but we have not yet managed to contact our other two team members. I am not surprised, as they had very short notice of our meeting to discuss these things. This is something that we as a group will need to work on in the future.
As for the project itself, we may be rewriting sections of it in C++, as the client has requested that we port some of it over into an API that is handled in C++. Most of the ideas we have come up with need to be run by the client, but I think our team has a good grasp on where we need to go next.
Third Blog Post
As opposed to last week, we have finally contacted our client and gotten information on what we are specifically supposed to do on the project. Most of the work on the neural network has already been done. We are instead supposed to optimize how well the network handles the recognition process. Currently the project recognizes images with an error rate that is above what is allowable.
Another part of the project which I hadn’t even considered is the inclusion of three-dimensional convolutional neural networks to pattern match a video segment as opposed to a single image. Because some letters in American Sign Language require an aspect of movement, like the letter J, the network would need to analyze multiple images in sequence. Out of everything that has been gone over so far, things like LiDAR (very similar to how GPS signals function) and CNNs, this is what I’m most unfamiliar with. I believe the ramp-up to understanding how we will incorporate a system to recognize movement will be fairly fast, but it will be difficult in and of itself to implement. Support for 3D CNNs in many tensor libraries seems limited, and there are not very many guides that cover the subject.
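To get a feel for what the extra dimension means in practice, here is a small sketch using PyTorch’s `nn.Conv3d` layer. The clip size (16 frames of 64x64 RGB) and the channel counts are arbitrary assumptions for illustration, not values from the project.

```python
import torch
import torch.nn as nn

# Hypothetical video clip: batch of 1, 3 color channels, 16 frames, 64x64 pixels.
# Conv3d expects input shaped (batch, channels, depth/time, height, width).
clip = torch.randn(1, 3, 16, 64, 64)

# A 3x3x3 kernel slides over time as well as height and width.
conv3d = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
clip_features = conv3d(clip)

# For comparison, a 2D convolution only ever sees one frame at a time.
frame = clip[:, :, 0]  # first frame, shape (1, 3, 64, 64)
conv2d = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
frame_features = conv2d(frame)
```

With a padding of 1 and the default stride, the 3D output keeps all 16 time steps, so later layers can still reason about motion across frames, which is what a moving letter like J needs.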
It’s a bit disappointing to pick up directly where someone left off when the most exciting parts have already been set up. From here on out it will essentially just be changing the size of the neural network or changing the hyperparameters. The same goes for the actual LiDAR camera: creating the processing for the images would be really cool, but that is not an aspect of the project we are really working on. This is a minor complaint, though, as there are still a lot of cool things to mess around with.
Second Blog Post
We are supposed to be working with PyTorch on this project. Getting into the headspace of programming in machine learning is a difficult task to accomplish. However, neural nets have a strange degree of “deceptive complexity”. In reality, a high-level understanding of neural networks makes them seem simple by most standards. Mind, the calculus that makes up a neural network still requires a lot of background knowledge to understand. However, basically anyone with a limited understanding of computer science can make a neural network with the tools that are currently provided.
TensorFlow, Keras and PyTorch are prime examples of this, with many of the most difficult features to create (backpropagation and 2D kernel generation) being boiled down to one or two lines of code for the creator. This is the main reason why projects like these are even possible for students. Building a densely connected neural network from the ground up, even a very small one, takes a load of time and knowledge. This isn’t even entertaining the idea of training the models after the network has been created. Instead, a student can simply reach enough understanding to get at least the gist of how neural networks work and then let something like PyTorch do the rest of the work.
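To show how little code this takes, here is a minimal sketch of one training step for a tiny densely connected network in PyTorch. The layer sizes, learning rate, and random data are arbitrary assumptions chosen only to make the example run.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A very small densely connected network: 4 inputs, 8 hidden units, 2 classes.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Random stand-in data: 16 samples with 4 features each, labels 0 or 1.
inputs = torch.randn(16, 4)
targets = torch.randint(0, 2, (16,))

loss = loss_fn(model(inputs), targets)
optimizer.zero_grad()
loss.backward()   # backpropagation through the whole network in one line
optimizer.step()  # update the weights using the computed gradients
```

All of the calculus lives behind `loss.backward()`, which is why a student can get a network training without deriving any gradients by hand.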
As for what has gone well and poorly with the project this week, my teammates are very diligent when it comes to communication and the work done so far, which is always a good change of pace. Unfortunately, our client has not yet responded to us, which is quite worrying. Hopefully, it’s just a minor setback, as it is the weekend.
What could be done better in the course? I think more transparency on how the selection process works would help. I have heard that it was simply a random selection, which didn’t give me a large amount of confidence in the rest of the course. Of course, this is conjecture, as I don’t actually know whether any of that is true. What I enjoy about this course currently is the amount of leeway we get when it comes to oversight on the project. Most things are left to the client and the students, which is nice.
First Blog Post
- What got you started with computers or software?
My dad worked as an IT helper/software developer at Portland university when I was a kid. I thought what he did was really neat, so I took a summer course in 6th grade.
- Current job or internship
I work at Tektronix until December through MECOP. I am currently working with neural networks and oscilloscopes.
- Favorite technologies
Embedded systems and development boards have so far been my favorite technologies to work with. My work at Garmin included both of these and I really enjoyed what I did there.
- Favorite listed projects (in this course) and why
Voice to vision. It includes a lot of machine learning and embedded systems, both topics that I have worked with before.
- Current interests
Unreal Engine. I am currently modding a game that I like using Unreal Engine 3.
- Journey with OSU
I’ve been at OSU for a total of 5 years already, beginning in 2015. I originally started as a CS systems major, but took on a second major in ECE in 2019.
- Kids, pets, hobbies, sports, games, activities, etc
Hobbies: Unreal 3, drawing, biking, hiking, baking, Russian language, working out, video games.