Projects
Explore my data science projects categorised by topic and
technology.
R Specific Projects
- Description: This project involved Exploratory Data
Analysis (EDA) using the Breast Cancer Wisconsin (Diagnostic) dataset
available from the UC
Irvine Machine Learning Repository.
- Technologies: R, corrplot, readODS, dplyr, tidyverse, MASS, ggplot2, and Rtsne
- Description: This project used the UC Irvine Wine Dataset to explore
which variables are most important in determining wine quality.
- Technologies: corrplot, readODS, ISLR, leaps, and glmnet
Python Specific Projects
- Description: This project explored the Palmer Penguins dataset. Its
purpose was to examine how the body weight of penguins varies with
where they lived, their sex, and their species.
- Technologies: Python, Pandas, Matplotlib, and Seaborn
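The kind of comparison described for the penguins project can be sketched with a Pandas group-by. This is a minimal illustration, not the project's actual code: the rows below are made-up toy values, the column names (`species`, `island`, `sex`, `body_mass_g`) are assumed to match the public Palmer Penguins CSV, and `mean_body_mass` is a hypothetical helper.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT real measurements.
# Column names are assumed to follow the public Palmer Penguins CSV.
penguins = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap", "Chinstrap"],
    "island": ["Torgersen", "Biscoe", "Biscoe", "Biscoe", "Dream", "Dream"],
    "sex": ["male", "female", "male", "female", "male", "female"],
    "body_mass_g": [3900.0, 3400.0, 5500.0, 4700.0, 3950.0, 3500.0],
})


def mean_body_mass(df, by):
    """Hypothetical helper: mean body mass per group, ignoring missing weights."""
    return df.dropna(subset=["body_mass_g"]).groupby(by)["body_mass_g"].mean()


by_species = mean_body_mass(penguins, "species")
by_sex = mean_body_mass(penguins, "sex")
by_island = mean_body_mass(penguins, "island")
```

Passing a list such as `["species", "sex"]` as `by` would give the mean for each combination, which is one way to compare conditions jointly.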
- Description: This project used the Titanic dataset, available from
Kaggle and the Data Science Dojo GitHub repository, to discover which
variables, such as Passenger Fare, Number of Siblings/Spouses Aboard,
and Age, were the most important in determining passenger survival.
- Technologies: Python, Pandas, Matplotlib, NumPy, and Seaborn
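One simple way to get a first impression of which numeric variables relate to survival is to rank them by their correlation with the outcome. The sketch below uses that approach purely as an illustration; it is not necessarily the method used in the Titanic project. The rows are made-up toy values, the column names (`Survived`, `Fare`, `SibSp`, `Age`) are assumed to match the Kaggle CSV, and `rank_by_correlation` is a hypothetical helper.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT real passenger records.
# Column names are assumed to follow the Kaggle Titanic CSV.
titanic = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Fare": [7.25, 71.28, 53.10, 8.05, 30.00, 7.90],
    "SibSp": [1, 1, 0, 0, 0, 3],
    "Age": [22.0, 38.0, 35.0, None, 27.0, 2.0],
})


def rank_by_correlation(df, target):
    """Hypothetical helper: Pearson correlation of each column with `target`,
    ordered from the strongest to the weakest absolute correlation."""
    corr = df.corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranking = rank_by_correlation(titanic, "Survived")
print(ranking)
```

Correlation only captures linear relationships between numeric columns, so categorical variables would need encoding or a different technique before they could be compared this way.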
Geospatial Specific Projects
- Description: This project explored regions around the world that
experience high volumes of earthquakes and their proximity to nuclear
power plants. Due to data limitations, a more in-depth analysis of the
results was not possible. Various datasets were used in this project,
including the GEM Science Tools and the United States Geological
Survey Database. The full description of data sources is available in
the overview section.
- Technologies: QGIS and Open Street Map Quick extension