Data Exploration

Sep 18, 2021 | Freelancing

Electric Consumption Data Visualisation

Overview

The UC Irvine Machine Learning Repository is a popular repository for machine learning datasets. We look at the Individual household electric power consumption data set from this repository.

The data consists of measurements of electric power consumption in one household, sampled once per minute over a period of almost 4 years. There are 2,075,259 measurements gathered between December 2006 and November 2010 (47 months). They include several electrical quantities as well as sub-metering values, which measure the energy used by groups of appliances such as the dishwasher, washing machine, and AC.

We want to practise working with dates and times using strptime, and manipulating the data with the dplyr package, to create some visualizations in R. You can find the entire code on GitHub here.

Load packages and read the file.

Let’s take a quick look at the data.
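The loading step might look like the sketch below. It assumes the layout of the UCI download: semicolon-separated fields, a header row, and `?` marking missing values. The three-row `raw_text` sample and the name `power` are stand-ins for illustration; with the real download you would pass the file's path via `file =` instead of `text =`.

```r
# Load dplyr, the package the post names for data manipulation.
library(dplyr)

# A three-row stand-in for the real file (illustrative, not the full data).
raw_text <- "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
17/12/2006;00:10:00;?;?;?;?;?;?;?"

# na.strings = "?" turns the missing-value marker into NA so that the
# measurement columns are read as numeric rather than character.
power <- read.table(text = raw_text, sep = ";", header = TRUE,
                    na.strings = "?", stringsAsFactors = FALSE)

# Quick look at column names, types, and the first few values.
str(power)
```

Declaring the missing-value marker up front avoids a later round of converting character columns to numbers.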

Handle Date and Time

We first want to visualize the overall trend of measurements such as Voltage or Sub_metering over the years. To do that, we convert the Date and Time variables to the POSIXct class, which is how R represents a date and time as a single value and which makes the data easy to manipulate.
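A minimal sketch of the conversion, assuming the dates are stored as day/month/year text and the times as hour:minute:second. The two sample rows, the column name `DateTime`, and the choice of `tz = "UTC"` are assumptions for illustration.

```r
# Two sample rows in the file's day/month/year format (illustrative values).
power <- data.frame(Date = c("16/12/2006", "1/5/2008"),
                    Time = c("17:24:00", "00:00:00"),
                    stringsAsFactors = FALSE)

# strptime() parses the pasted text into POSIXlt; as.POSIXct() converts that
# to POSIXct, which stores each timestamp as a single number and fits
# naturally in a data frame column.
# tz = "UTC" is an assumption; the source does not state a time zone.
power$DateTime <- as.POSIXct(
  strptime(paste(power$Date, power$Time),
           format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
)

class(power$DateTime)
format(power$DateTime, "%Y")  # year as text, handy for grouping
format(power$DateTime, "%m")  # month as text
```

Once the column is POSIXct, `format()` can pull out the year, month, or day for grouping and filtering.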

Reading from sub meters

Let’s compare the readings from the sub meters over the years.

[Figure: readings from the sub meters over the years]

We see that the AC and water heater account for the most electricity usage, followed by the washing machine and refrigerator, and then the kitchen appliances.
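One way to make such a comparison is to total each sub meter per year with dplyr. The sketch below uses made-up readings, and the column labels `Kitchen`, `Laundry`, and `Heater_AC` are assumptions that follow the dataset description for Sub_metering_1 to Sub_metering_3. The plot uses base graphics; the original figure may have been drawn differently.

```r
library(dplyr)

# Made-up readings for illustration only (watt-hours of active energy).
power <- data.frame(
  DateTime = as.POSIXct(c("2007-01-01 00:00:00", "2007-01-01 00:01:00",
                          "2008-01-01 00:00:00", "2008-01-01 00:01:00"),
                        tz = "UTC"),
  Sub_metering_1 = c(0, 1, 0, 2),
  Sub_metering_2 = c(1, 2, 1, 1),
  Sub_metering_3 = c(17, 18, 16, 19)
)

# Total energy per sub meter for each calendar year.
yearly <- power %>%
  mutate(Year = format(DateTime, "%Y")) %>%
  group_by(Year) %>%
  summarise(Kitchen   = sum(Sub_metering_1, na.rm = TRUE),
            Laundry   = sum(Sub_metering_2, na.rm = TRUE),
            Heater_AC = sum(Sub_metering_3, na.rm = TRUE))

# Grouped bars: one group per sub meter, one bar per year.
barplot(as.matrix(yearly[, -1]), beside = TRUE,
        legend.text = yearly$Year,
        ylab = "Active energy (watt-hours)")
```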

Data Comparison

Now we look at data by year and compare the power consumption by month.
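A monthly comparison for a single year can be sketched as a filter followed by a grouped summary. The readings are made up, and the choice of the mean of Global_active_power as the monthly figure is an assumption; the source does not say which summary it plotted.

```r
library(dplyr)

# Made-up readings for illustration only (kilowatts).
power <- data.frame(
  DateTime = as.POSIXct(c("2008-02-01 00:00:00", "2008-02-01 00:01:00",
                          "2008-05-01 00:00:00", "2009-05-01 00:00:00"),
                        tz = "UTC"),
  Global_active_power = c(0.5, 0.7, 2.0, 3.0)
)

# Keep one year, then average the readings within each month.
monthly_2008 <- power %>%
  filter(format(DateTime, "%Y") == "2008") %>%
  mutate(Month = as.integer(format(DateTime, "%m"))) %>%
  group_by(Month) %>%
  summarise(Mean_kW = mean(Global_active_power, na.rm = TRUE))

monthly_2008

# One bar per month, labelled with the abbreviated month name.
barplot(monthly_2008$Mean_kW,
        names.arg = month.abb[monthly_2008$Month],
        ylab = "Mean global active power (kW)")
```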

[Figure: power consumption by month in 2008]

We see that in 2008, power usage appears to be lowest in February and highest in May and October. The bar charts below, which break consumption down by appliance group and month, confirm the same findings. It is not clear, though, what a possible explanation could be. Several latent factors can affect electricity consumption, such as the geographic location of the household or personal circumstances such as the members’ travel plans.

[Figure: bar charts of consumption by appliance group, by month]

Finally, let’s look at sub meter measurements for two different days and see how they compare. The first day of May in 2008 and in 2009 show quite different power usage per appliance. In 2008, the power usage is very low. In 2009, on the other hand, the same day shows high activity in all areas, indicating use of the washing machine, kitchen appliances, and AC.
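Picking out two specific days can be sketched by formatting the timestamp as a date string and filtering on it. The readings below are made up to mirror the pattern described above, and the labels `Kitchen`, `Laundry`, and `Heater_AC` are assumptions following the dataset description of the three sub meters.

```r
library(dplyr)

# Made-up readings for illustration only (watt-hours of active energy).
power <- data.frame(
  DateTime = as.POSIXct(c("2008-05-01 08:00:00", "2008-05-01 20:00:00",
                          "2009-05-01 08:00:00", "2009-05-01 20:00:00",
                          "2009-05-02 08:00:00"), tz = "UTC"),
  Sub_metering_1 = c(0, 0, 30, 1, 5),
  Sub_metering_2 = c(1, 0, 25, 2, 5),
  Sub_metering_3 = c(0, 1, 18, 17, 5)
)

# Keep only the two days of interest and total each sub meter per day.
two_days <- power %>%
  mutate(Day = format(DateTime, "%Y-%m-%d")) %>%
  filter(Day %in% c("2008-05-01", "2009-05-01")) %>%
  group_by(Day) %>%
  summarise(Kitchen   = sum(Sub_metering_1),
            Laundry   = sum(Sub_metering_2),
            Heater_AC = sum(Sub_metering_3))

two_days
```

Comparing the day as text keeps the filter independent of how the time zone is handled when converting to a date.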

[Figure: sub meter readings on 1 May 2008 and 1 May 2009]

Author

"Meet Sonia Sharma, a mathematical mastermind who's always on the lookout for a new challenge. With a passion for all things Python, Sonia has honed her skills and become an expert in the field. But her love for mathematics goes beyond just coding - she's also fascinated by the world of AI and machine learning. In her free time, she trains AI models for fun, constantly pushing the boundaries of what's possible. With her technical prowess and creative spirit, Sonia is always finding new and innovative ways to solve problems. So, if you're looking for a mathematician and Python expert who's ready to take on any challenge, look no further than Dr. Sonia Sharma!"

Sonia Sharma
