Data Science Project Reflection: What I Learned From the 2021 NFL Big Data Bowl 🏈

NFL Operations presents the Big Data Bowl 2021

Intro: I Was Pumped for Super Bowl Sunday

Dec 3, 2017. New England Patriots cornerback Stephon Gilmore knocks down a pass to Buffalo Bills wide receiver Zay Jones. In 2018, Gilmore was the #1 ranked cornerback, and New England shadowed receivers more than any other team. Credit: Timothy T. Ludwig-USA TODAY Sports

Data Preparation & Exploratory Analysis

Player Tracking Data Visualization: QB Aaron Rodgers passes complete deep left to Davante Adams, who is pushed out of bounds at the DET 9 for a 30-yard gain. Adams was closely covered by cornerback Darius Slay.

Feature Engineering & Graphical Visualizations using Matplotlib

Legend: Blue = Target receiver of pass; Red = Density of defenders in binned field region; Play Event = Pass_Arrived. C = pass result is complete; I = pass result is incomplete.
Legend: Blue = football; Red = defender; Green = offensive player. QB Aaron Rodgers passes complete deep left to Davante Adams, who is pushed out of bounds at the DET 9 for a 30-yard gain. Adams was shadowed by cornerback Darius Slay.
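The "density of defenders in binned field region" feature from the first legend can be illustrated with a small sketch. This is not the code behind the figures, only one plausible way to compute such a count. It assumes tracking coordinates in yards, with x running 0 to 120 along the field (end zones included) and y running 0 to 53.3 across it; the function name `bin_defenders`, the 10-yard `bin_size`, and the sample positions are all hypothetical.

```python
from collections import Counter
from math import ceil

# Assumed coordinate convention: x along the field, y across it, in yards.
FIELD_LENGTH = 120.0  # includes both end zones
FIELD_WIDTH = 53.3


def bin_defenders(positions, bin_size=10.0):
    """Count defenders in each (x_bin, y_bin) cell of a grid over the field.

    positions: iterable of (x, y) defender coordinates at one play event,
    for example the frame where the pass arrives.
    """
    n_x = ceil(FIELD_LENGTH / bin_size)
    n_y = ceil(FIELD_WIDTH / bin_size)
    counts = Counter()
    for x, y in positions:
        # Clamp so players standing exactly on a boundary land in the last bin.
        x_bin = min(max(int(x // bin_size), 0), n_x - 1)
        y_bin = min(max(int(y // bin_size), 0), n_y - 1)
        counts[(x_bin, y_bin)] += 1
    return counts


# Hypothetical defender positions, not taken from the real play.
defenders = [(35.0, 20.0), (38.5, 24.9), (61.2, 5.0)]
for cell, n in sorted(bin_defenders(defenders).items()):
    print(cell, n)
```

The per-cell counts could then be drawn as a heatmap or fed to a model as features; a smaller `bin_size` gives finer resolution at the cost of sparser cells.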

Getting Started with Matplotlib & Pandas

Matplotlib example for finance: Stock prices over time

Try Exploring Other Data Visualization Libraries

ML Model: XGBoost Trees to Classify Pass Result

Artificial Intelligence (AI) VS Machine Learning (ML)

Towards Data Science: Artificial Intelligence VS Machine Learning VS Deep Learning

Hardware Requirements & Computation Needed for ML Models

How Cloud Service AWS powers Next Gen Stats

Choosing an ML Algorithm From So Many: Why I Chose XGBoost Decision Trees For the NFL Big Data Bowl

KDnuggets: Evolution of tree-based algorithms over the years. Read the paper by Chen and Guestrin (2016) for a more in-depth look at how the XGBoost algorithm works.

XGBoost Python Package Implementation: Summary of Steps

Inputs to XGBoost Decision Trees to classify pass result as complete or incomplete at play event of pass arrived.

Results

XGBoost Improvements For Next Time:

Visual: k-Fold Cross-Validation to measure model performance
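As a rough illustration of the idea in the visual, here is a minimal sketch of how k-fold cross-validation splits a dataset: the samples are divided into k folds, and each fold takes one turn as the validation set while the rest are used for training. The helper name `k_fold_indices` is hypothetical, and the sketch leaves out shuffling and class stratification, both of which a real evaluation would usually want.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds.

    Folds are contiguous and as equal in size as possible; the first
    n_samples % k folds receive one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Every sample lands in a validation set exactly once across the k folds.
for train, validation in k_fold_indices(10, 3):
    print(len(train), validation)
```

Averaging a metric such as accuracy over the k validation folds gives a steadier estimate of model performance than a single train/test split.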

Presenting the Results

D3 is THE Data Visualization Library.

Discussion: Evolving Applications of AI & Analytics in Sports

NFL Next Gen Stats & AWS

Game Strategy for Coaches, Teams, & Players

Amazon’s cloud computing will help Seahawks tackle data for ‘a competitive edge’

Player Health & Safety

Past Winners: NFL 1st & Future Analytics Competition

Sports Betting

My Final Reflection on this Reflection

20-something Associate Product Manager from Philly @ WSJ, Barron’s. Writing about what I’m learning! •Tech, Media, Financial Literacy, Self Development, Health•

Katherine Lordi