My name is Matthew Lee and I want to thank you for taking the time to visit my data science portfolio. Please explore some of the projects I have done during my undergraduate career at the University of California, Davis.
I graduated from the University of California, Davis with a double major in Applied Statistics and Economics. As an avid basketball fan, I grew up fascinated by simple statistics such as Win-Loss record or Strength of Schedule. This interest led me to pursue my majors, where I could learn more about how people incorporate data into decision-making. After graduation, I will be working in San Francisco as a Risk & Compliance Consultant.
I hope to one day build a career in Data Science and potentially work with the NBA. My specific areas of interest are Predictive Analytics and understanding how travel patterns influence energy and fatigue. I have experience in Data Analysis, particularly with R and Python, and some exposure to SQL.
Here is the code from the final project that my group and I researched. Our dataset comes from a 2013 survey at a university in Slovakia, in which Statistics students asked friends and colleagues about their personalities and habits. We decided to put a spin on the project and treat it like a Consulting assignment from a client. The main area of interest was spending habits and determining how companies can best spend their money to reach their desired demographic.
This was an assignment for my data science course where I used a USDA API key to retrieve the nutritional information for a list of common fruits and vegetables. It required web scraping to find the nutritional data, which I then converted into a digestible dataframe with Pandas.
I was really excited to work on this project since it was my first exposure to SQL, which is commonly used in the workforce. Also, I hope to move to San Francisco after I have saved up enough money, so it was fascinating to learn more about the city. The project also features Mapping, where I plotted the locations of park and noise disturbances across the city of San Francisco.
For the final project in my statistical programming class, my team and I chose to analyze a dataset provided by the UC Irvine repository. This assignment was the culmination of an intensive class that improved my technical expertise in R.
This is one of the reports I wrote on cancer research. The project emphasized the use of ‘ggplot2’ in R and required extensive outside research into the health effects of Hemoglobin, Albumin, and Iron.
Part of our final exam was to perform an in-depth analysis of a dataset provided to us. This report contains a forecast of expected pollution levels for the upcoming 12 months, as well as a 95% interval forecast. All of the analysis was performed in R.
I was an Engagement Manager for a non-profit consulting organization here at UC Davis. This is a copy of our final deliverable for our client Smartz Graphics.