#MAD3: Week 3 - Combining Data using SQL and Tableau
During Week 3 of the #MAD Analyst Bootcamp, we taught the students to combine data using both SQL and Tableau. In short, they learned to identify primary and foreign keys, the difference between the various join types (inner, left, right, outer), and how to union data. For every topic that we discuss, we work to provide the students with a real-world case study that allows them to practice the lessons that they just learned.
For the Week 3 project, we provided the students with the following memo from our hypothetical client, Venus Marketing.
From: Venus Marketing
We’ve recently launched a website and would like to have a dashboard created in order to track various metrics. We currently have data for the past two months on daily stats from cities around the world. This dashboard will be rolled out to our regional sites and displayed on the main monitor in our web analytics department. The design of this will matter a great deal as it is the first thing our employees will look at when gaining an understanding of our web metrics. We want to make sure metrics are clearly labeled and that minimal questions are asked when viewing this dashboard at first glance.
We also provided a list of requirements that the client wanted to specifically look at:
Overall: # of visits, # of pageviews, # of downloads
# of visits per Region
# of visits per City
# of Visits per Day
# of Visits per Source
# of Visits per Content Type
Once the students reviewed the requirements, they were tasked with reviewing the data provided by the client. The data was provided to the students in two separate tables.
The first table contained most of the information needed for the requirements, such as the number of visits, source information and content type. Here is a snippet of that file:
The second table contained information on the Locations – Location Code, City, Country, and Region. Here is a snippet of that file:
To successfully complete the Week 3 project, each student was required to:
Combine the two tables.
Explore the data to validate and ensure that it was good enough to answer the required questions.
Create a dashboard that can be provided to the client.
Dive deeper into the data to provide the client with additional insight that can not be found at first glance.
Here are our favorite dashboards from Week 3:
1. Anna Kapp
"September 18th of 2018 was a significant peak in the number of visits per day chart. The spike was seen across all regions. September 18th easily had the highest number of visits per day in every continent. This peak was mostly caused by a sharp spike in the example content type- which had many more visits than any other type of content for that day. This is unusual as generally the content type with the most visits is user. Also, September 18, 2018 was a Tuesday. This is interesting as Wednesday is normally the day of the week that has the most traffic and Saturday generally has the least traffic."
2. Sandy Thongdyyarath
"Using the data we can conclude that user content is the most view with sources from Direct visits of 434.70K. We found that direct is the most popular way to source the website. When we look at page categories index.htm makes up most of the direct categories."
3. Jessie Chen
"'/' and '/index.html' tracked separately. Even though www.example.com/ may be the same page as www.example.com/index.html, they show up as two separate entries in your reports."