
Datasets, Python scripts, and Halloween

Our project has some peculiarities that are going to make implementing a neural net a little trickier than it might otherwise be. More or less, we are supposed to be using a LiDAR camera to interpret someone signing in American Sign Language and transcribe it. The only problem is that training a neural network takes massive amounts of data, and there simply aren't many people recording themselves signing in front of a LiDAR camera. I'm pretty sure we can't train the neural network on one type of data and then feed it an entirely different kind of data and expect it to classify accurately. That leaves a couple of options: create our own dataset of signing captured with a LiDAR camera, or ditch the LiDAR and try to use regular video. Nobody in our group knows sign language, and building a dataset would require a massive amount of data, probably thousands of hours, which makes the first option unattractive. Even if we did try to learn enough to record a lot of examples, we'd likely end up with a neural network trained to recognize only really bad, amateurish sign language and not much else. The second option still requires finding a dataset…

Which I did! It's called WLASL, or Word-Level American Sign Language. They basically created a massive JSON with thousands of labelled entries pointing to videos of people signing, plus a download script and a preprocessing script. It has a ton of examples, too many in fact.

I talked my way into getting access to the pelican cluster for training, but after pulling the repo, starting a screen session, running the download script, and going to bed, I woke up to a ton of warnings about overfilling my allotted hard drive space on LDAP and a basically unresponsive server account. My solution was to delete everything, fork the repo, modify the download script to grab a smaller subset of the training data, and start from scratch. I figured if we could get a neural net working to a reasonable level with a smaller dataset, we could find a way to download the rest later and increase our accuracy; that way we can at least be somewhat confident in our proposed architecture before going all in.

That was going well after a ton of refactoring, but then I realized the preprocessing script is massively inefficient: it copies videos that don't require preprocessing and leaves the original raw videos in place either way, effectively doubling the space needed for the raw and preprocessed videos. In fact, it actually converts everything to mp4 in a separate directory, then extracts only the frames of interest from those into yet another directory, effectively tripling the required storage. Some people… Anyway, I'm still in the process of fixing that too, but OpenCV is giving me cryptic errors.
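For anyone curious, here's the rough shape of the fix I'm working toward: read the JSON, keep only a small subset of the glosses, and trim each raw video down to just the frames of interest in a single OpenCV pass, writing straight into the output directory instead of making an mp4 copy first. Fair warning, the paths, the subset size, and the JSON field names (gloss / instances / video_id / frame_start / frame_end) are from my memory of the WLASL repo, so treat this as a sketch rather than the actual script.

import json
import os
import cv2  # opencv-python

# These paths and names are my best guesses from the WLASL repo; adjust as needed.
JSON_PATH = "WLASL_v0.3.json"
RAW_DIR = "raw_videos"
OUT_DIR = "trimmed_videos"
NUM_GLOSSES = 100  # only keep the first N words for the smaller training subset

def trim_video(src, dst, frame_start, frame_end):
    # Copy only the frames of interest from src into dst in one pass,
    # with no intermediate mp4 copy and no duplicate of the raw video.
    cap = cv2.VideoCapture(src)
    if not cap.isOpened():
        print("could not open", src)
        return
    fps = cap.get(cv2.CAP_PROP_FPS) or 25
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    cap.set(cv2.CAP_PROP_POS_FRAMES, max(frame_start - 1, 0))  # assuming 1-based frame numbers
    frame_no = frame_start
    while frame_end < 0 or frame_no <= frame_end:  # a frame_end of -1 seems to mean "whole clip"
        ok, frame = cap.read()
        if not ok:
            break
        writer.write(frame)
        frame_no += 1
    cap.release()
    writer.release()

os.makedirs(OUT_DIR, exist_ok=True)
with open(JSON_PATH) as f:
    entries = json.load(f)[:NUM_GLOSSES]  # the smaller subset instead of the whole dataset

for entry in entries:
    for inst in entry["instances"]:
        src = os.path.join(RAW_DIR, inst["video_id"] + ".mp4")
        dst = os.path.join(OUT_DIR, inst["video_id"] + ".mp4")
        if os.path.exists(src):
            trim_video(src, dst, inst.get("frame_start", 1), inst.get("frame_end", -1))

The main point is that each raw video only gets read once and only the trimmed clip gets written, so the storage cost is just the subset we actually keep instead of three copies of everything.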

On top of all of that, it’s Halloween weekend. A wise professor once told me

“Sleep more than you study,
Study more than you party,
Party as much as possible”

– I can’t remember his name, but he was awesome
