Daily Archives: November 26, 2024

Scraping Data and Syncing Up in the MMA Prediction Model Project

The capstone journey has been exciting, and our project, the MMA (Mixed Martial Arts) Prediction Model, is steadily taking shape. We recently completed the first big milestone: scraping raw data from the UFC Stats website. Now we're gearing up for the next phase: preprocessing this raw data into a format suitable for analysis. But today, I want to reflect on the process so far, including the challenges, the tools we're using, and how asynchronous communication through Discord has played a key role in our collaboration.

The Data Scraping Stage

Scraping data was both challenging and rewarding. We used Scrapy, a Python web-scraping framework, to extract fighter statistics, fight outcomes, and other key metrics. Setting up Scrapy on my iMac was surprisingly smooth, thanks to its compatibility with Unix-like systems. However, the process wasn't without its hurdles. One challenge we faced was ensuring the scraper could handle dynamic elements on the UFC Stats site without breaking. After a bit of trial and error (and some helpful documentation), we fine-tuned the scraper to run efficiently and collect clean, structured data.

Preparing for Data Preprocessing

Now that the raw data is in hand, our next step is preprocessing. This stage will involve cleaning data, dealing with missing values, and transforming the dataset into a format our prediction algorithms can digest. It’s a critical step that will set the foundation for accurate predictions. I anticipate some interesting discussions with my teammates as we decide how to handle outliers and structure our data models.
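
To make the plan a little more concrete, here is a rough sketch of what a first cleaning pass could look like. It assumes pandas (we haven't settled on a library yet), hypothetical column names (name, height, weight), "--" as the marker for a missing value, and median imputation as one possible way to fill gaps. All of these are placeholders for decisions the team still has to make.

```python
import pandas as pd


def preprocess_fighters(raw: pd.DataFrame) -> pd.DataFrame:
    """Turn scraped text columns into numeric features (illustrative only)."""
    # Drop rows with a missing or blank fighter name.
    names = raw["name"].fillna("").str.strip()
    df = raw[names != ""].copy()
    df["name"] = names[names != ""]

    # "135 lbs." -> 135; values without digits (e.g. "--") become NaN.
    weight = df["weight"].str.extract(r"(\d+)", expand=False)
    df["weight_lbs"] = pd.to_numeric(weight, errors="coerce")

    # "5' 6\"" -> 66 inches; unparseable heights become NaN.
    parts = df["height"].str.extract(r"(\d+)'\s*(\d+)")
    feet = pd.to_numeric(parts[0], errors="coerce")
    inches = pd.to_numeric(parts[1], errors="coerce")
    df["height_in"] = feet * 12 + inches

    # Assumption: fill remaining gaps with the column median.
    for col in ["weight_lbs", "height_in"]:
        df[col] = df[col].fillna(df[col].median())

    return df[["name", "height_in", "weight_lbs"]].reset_index(drop=True)


if __name__ == "__main__":
    # Made-up rows, not real scraped data.
    raw = pd.DataFrame(
        {
            "name": ["Jane Doe", "John Roe", "", "Alex Poe"],
            "height": ["5' 6\"", "--", "6' 0\"", "6' 2\""],
            "weight": ["135 lbs.", "155 lbs.", "170 lbs.", "--"],
        }
    )
    print(preprocess_fighters(raw))
```

Median imputation is simple but can blur real differences between weight classes, so it is the kind of choice worth discussing alongside how we handle outliers.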

Asynchronous Communication via Discord

One of the biggest takeaways from this project so far is how effectively our team has used asynchronous communication on Discord. With everyone juggling their own schedules (work, school, and other commitments), having a central hub for updates, questions, and discussions has been a game-changer.

Here’s how Discord has worked for us:

  1. Channels for Organization: We created separate channels for different aspects of our project: general, resources, and meeting notes.
  2. Sharing Code via GitHub: Discord complements our GitHub repository beautifully. Whenever someone makes changes or pushes updates, they drop a quick note in Discord so everyone stays on the same page.
  3. Async Flexibility: The asynchronous nature of Discord allows us to contribute on our own schedules. Whether it’s dropping a quick idea or reviewing someone else’s code, we don’t have to coordinate real-time meetings to make progress.

Working on this project has highlighted how technology not only powers our tools but also shapes our collaboration. While we have different skill sets, we’ve come together to create something that feels cohesive and purposeful. Looking ahead, I’m eager to see how our preprocessing stage pans out and how our prediction model begins to take shape. I imagine there will be plenty of debugging, brainstorming, and learning as we move forward, but with the momentum we’ve built and the communication tools we’re using, I’m confident in our team’s ability to deliver.

Thanks for reading, and stay tuned for more updates as we continue our MMA prediction journey!