Predicting Weather based on Music Listening Patterns
Overview
Report by: Gabriel DiMartino for CPSC 222 Intro to Data Science
For this project, I use my Spotify listening data to predict the weather in Alexandria, VA. I chose this domain because I am interested in what my Spotify data can be used for. If a correlation between listening habits and the weather exists, other factors such as mental health might also be predictable. The source data came as a mix of CSV and JSON. Because the Spotify listening export contains little detail about each song, the Spotipy library (a Python client for the Spotify Web API) was used to add attributes such as danceability and energy for each song. The result was a CSV containing the key attributes for each song. The tables included in these datasets are Alexandria Weather and Spotify API Data.
As mentioned earlier, the Spotify data was collected from both my Spotify listening history and the Spotify Python API. The Alexandria weather data was collected from sources such as the National Weather Service. Combined, the tables contain 1900 instances for June through August, with the weather data repeated to match the multiple songs played on the same date.
The relevant attributes of the Spotify data include:
- Date: Attribute to merge both datasets
- Danceability: Attribute bound between 0 and 1
- Energy: Attribute bound between 0 and 1
- Loudness: Attribute bound between -20 and 0 (in decibels)
- Speechiness: Attribute bound between 0 and 1
- Acousticness: Attribute bound between 0 and 1
- Instrumentalness: Attribute bound between 0 and 1
- Liveness: Attribute bound between 0 and 1
- Valence: Attribute bound between 0 and 1
- Tempo: Attribute measured in beats per minute (BPM)
The relevant attributes of the weather data include:
- Date: Attribute to merge both datasets
- Conditions: 5 classifications for the type of weather
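The merge on Date described above can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: the function name `merge_on_date`, the sample dates, the attribute values, and the condition labels are all made up for the example.

```python
def merge_on_date(songs, weather):
    """Attach the weather condition for a date to every song played on that date."""
    # One weather record per date, looked up by the shared Date attribute.
    conditions_by_date = {row["Date"]: row["Conditions"] for row in weather}
    merged = []
    for song in songs:
        condition = conditions_by_date.get(song["Date"])
        if condition is None:
            continue  # no weather record for this date, so the song is dropped
        # The same condition is repeated for each song on that date.
        merged.append({**song, "Conditions": condition})
    return merged


# Hypothetical sample rows (not taken from the real dataset).
songs = [
    {"Date": "06-01", "Danceability": 0.71, "Energy": 0.64},
    {"Date": "06-01", "Danceability": 0.42, "Energy": 0.35},
    {"Date": "06-02", "Danceability": 0.58, "Energy": 0.80},
    {"Date": "06-03", "Danceability": 0.66, "Energy": 0.51},
]
weather = [
    {"Date": "06-01", "Conditions": "Rain"},
    {"Date": "06-02", "Conditions": "Clear"},
]

merged = merge_on_date(songs, weather)
```

Songs without a matching weather record are dropped here, which mirrors an inner join; keeping them with an empty condition would be an equally reasonable choice.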
From this data, I attempt to classify the type of weather based on my music history. The potential impact of this analysis is the use of my Spotify data to classify other meaningful attributes, such as mental health or current activity. If this methodology can be generalized, potential stakeholders and beneficiaries of this research include psychologists and psychiatrists, who could use similar methods to detect depression or bipolar states.
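Before a classifier can be trained, the Conditions labels need a numeric form. The sketch below shows one possible way to produce both an enumerated encoding and a binary encoding. The helper names, the sample labels, and the choice of "Rain" as the positive class are assumptions for illustration; the project's actual labels and binary split are not stated here.

```python
def enumerate_conditions(labels):
    """Map each distinct label to an integer code, assigned in sorted order."""
    codes = {label: code for code, label in enumerate(sorted(set(labels)))}
    return [codes[label] for label in labels], codes


def binarize_conditions(labels, positive):
    """Return 1 for labels in the `positive` set and 0 for everything else."""
    return [1 if label in positive else 0 for label in labels]


# Hypothetical condition labels (the real dataset has 5 classes).
labels = ["Clear", "Rain", "Overcast", "Clear", "Rain"]

encoded, codes = enumerate_conditions(labels)
binary = binarize_conditions(labels, positive={"Rain"})
```

The enumerated form keeps every class distinct, while the binary form collapses them into a yes/no question, which is usually an easier target to learn from a small dataset.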
Images
Varying Attributes Vs. Weather Conditions
Enumerated Conditions Decision Tree
Binary Conditions Decision Tree
Usage
All projects require Python 3.x to run. Dependencies can be installed using:
pip install -r requirements.txt
If you intend to use the Spotify Web API utility, you must create an apikey.py:
#apikey.py
CLIENT_KEY = 'Your_Key'
SECRET_KEY = 'Your_Secret'
To find this info, please refer to the Spotify API docs on creating a developer account.
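As a rough sketch of how the values in apikey.py might be read and checked before use, the helper below pulls CLIENT_KEY and SECRET_KEY from a module-like object and rejects the placeholder values. The function `read_credentials` and the stand-in object are hypothetical and only illustrate the idea; this is not the project's own loading code.

```python
from types import SimpleNamespace

# Placeholder values from the apikey.py template, treated as "not filled in".
PLACEHOLDERS = {"Your_Key", "Your_Secret", ""}


def read_credentials(module):
    """Return (client_id, client_secret) from an object shaped like apikey.py."""
    client_id = getattr(module, "CLIENT_KEY", "")
    client_secret = getattr(module, "SECRET_KEY", "")
    if client_id in PLACEHOLDERS or client_secret in PLACEHOLDERS:
        raise ValueError("Fill in CLIENT_KEY and SECRET_KEY in apikey.py")
    return client_id, client_secret


# Stand-in for `import apikey`, so the sketch runs without the real file.
fake_apikey = SimpleNamespace(CLIENT_KEY="abc123", SECRET_KEY="def456")
client_id, client_secret = read_credentials(fake_apikey)
```

In a real run you would pass the imported `apikey` module instead of the stand-in, then hand the two values to your Spotify client of choice.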
Credits
This project was developed by Gabe DiMartino for public use and to help with Gonzaga CPSC 222 assignments.