Processing and Visualizing Clickstream Data Using R

Learning Analytics and Knowledge Conference 2022
Pre-conference Tutorial

Tutorial Schedule

This is a three-hour tutorial. During Hour 1, attendees will learn how clickstream data is structured and what the various elements in the dataset represent. Attendees will also learn how to prepare the data for processing: formatting timestamps, categorizing URLs by resource type, and checking the data for potential issues.
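As a rough illustration of the Hour 1 preparation steps, the base-R sketch below parses timestamps, assigns each URL a resource type, and runs a simple sanity check. The data frame, its column names, the ISO-style timestamp format, the URL patterns, and the time zone are all assumptions made for this example, not the tutorial's actual dataset.

```r
# Hypothetical clickstream rows; column names and URL patterns are assumptions.
clicks <- data.frame(
  student_id = c(1, 1, 2),
  timestamp  = c("2021-10-04T15:02:11Z",
                 "2021-10-04T15:07:45Z",
                 "2021-10-05T01:30:00Z"),
  url        = c("https://lms.example.edu/courses/101/files/lecture1.mp4",
                 "https://lms.example.edu/courses/101/quizzes/7",
                 "https://lms.example.edu/courses/101/discussion_topics/3"),
  stringsAsFactors = FALSE
)

# Replace the "T" separator and drop the trailing "Z" so the string
# matches a plain "date time" format, then parse it as UTC.
clean_ts <- gsub("Z$", "", gsub("T", " ", clicks$timestamp))
clicks$time_utc <- as.POSIXct(clean_ts, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Calendar day in an assumed local time zone (lubridate's with_tz() is an
# alternative way to shift the displayed time zone).
clicks$local_day <- format(clicks$time_utc, "%Y-%m-%d", tz = "America/Los_Angeles")

# Categorize URLs by resource type with nested ifelse() and grepl().
clicks$resource <- ifelse(grepl("/files/.*\\.mp4$", clicks$url), "video",
                   ifelse(grepl("/quizzes/", clicks$url), "quiz", "other"))

# Basic checks: timestamps that failed to parse, and counts per category.
n_bad_timestamps <- sum(is.na(clicks$time_utc))
resource_counts  <- table(clicks$resource)
```

Rows that land in "other" are worth reviewing by hand, since they often reveal URL patterns the categorization rules have not covered yet.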

Hour 2 will focus on cleaning the clickstream data and constructing measures of engagement. These include general indicators (e.g., total number of clicks, number of clicks per day, number of unique visit days) and indicators by resource type (e.g., total number of video lecture clicks, number of video lecture clicks per day). Attendees will also learn how to generate descriptive diagnostics for inspecting these measures.
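The engagement measures described above could be computed along the following lines. This is a minimal sketch in base R rather than the group_by() and summarise() pipeline the tutorial covers; the data frame and its columns are hypothetical, and "clicks per visit day" is only one possible reading of "number of clicks per day".

```r
# Hypothetical cleaned clickstream: one row per click (names are assumptions).
clicks <- data.frame(
  student_id = c(1, 1, 1, 1, 2, 2),
  day        = as.Date(c("2021-10-04", "2021-10-04", "2021-10-05",
                         "2021-10-07", "2021-10-04", "2021-10-04")),
  resource   = c("video", "quiz", "video", "video", "video", "other"),
  stringsAsFactors = FALSE
)
clicks$n <- 1  # each row counts as one click

# Daily click counts: one row per student and day.
daily <- aggregate(n ~ student_id + day, data = clicks, FUN = sum)

# General indicators per student.
total_clicks <- tapply(clicks$n, clicks$student_id, sum)
visit_days   <- tapply(clicks$day, clicks$student_id,
                       function(d) length(unique(d)))

# Indicator by resource type: video lecture clicks per student.
video        <- clicks[clicks$resource == "video", ]
video_clicks <- tapply(video$n, video$student_id, sum)

# Combine into one row per student; students without video clicks get NA.
ids <- names(total_clicks)
engagement <- data.frame(
  student_id           = as.numeric(ids),
  total_clicks         = as.vector(total_clicks[ids]),
  visit_days           = as.vector(visit_days[ids]),
  video_clicks         = as.vector(video_clicks[ids]),
  clicks_per_visit_day = as.vector(total_clicks[ids] / visit_days[ids])
)

# Descriptive diagnostics for a quick inspection of the measures.
diagnostics <- summary(engagement[, -1])
```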

Hour 3 will focus on building data visualizations of the clickstream measures of engagement. This will include histograms, line graphs, boxplots, and scatterplots.
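To give a sense of the four plot types, here is a small sketch that uses base R graphics instead of the ggplot() calls taught in the tutorial, so that it runs without additional packages. The daily data frame, its values, and the output file name are invented for illustration.

```r
# Hypothetical daily measures for three students over four days.
daily <- data.frame(
  student_id   = rep(1:3, each = 4),
  day          = rep(as.Date("2021-10-04") + 0:3, times = 3),
  n_clicks     = c(5, 0, 2, 7, 1, 3, 0, 0, 9, 4, 6, 2),
  video_clicks = c(2, 0, 1, 3, 0, 1, 0, 0, 4, 2, 3, 1)
)

# Write all plots to a PDF in the session's temporary directory.
plot_file <- file.path(tempdir(), "click_plots.pdf")
pdf(plot_file)

# Histogram: distribution of daily click counts.
hist(daily$n_clicks, main = "Daily clicks", xlab = "Clicks per day")

# Line graph: total clicks across all students over time.
by_day <- aggregate(n_clicks ~ day, data = daily, FUN = sum)
plot(by_day$day, by_day$n_clicks, type = "l",
     xlab = "Day", ylab = "Total clicks")

# Boxplot: spread of daily clicks for each student.
boxplot(n_clicks ~ student_id, data = daily,
        xlab = "Student", ylab = "Clicks per day")

# Scatterplot: video clicks against all clicks.
plot(daily$video_clicks, daily$n_clicks,
     xlab = "Video clicks per day", ylab = "All clicks per day")

dev.off()
```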

Topic                                 Functions

Hour 1
Importing and Inspecting Data         fread(), list.files(), lapply(), bind_rows(), grep()
Timestamps                            gsub(), as.POSIXct(), with_tz()
Populating Dates                      cbind(), complete(), seq.POSIXt(), arrange()

Hour 2
Daily Click Counts                    group_by(), mutate(), summarise(), %>% (pipe)
Daily Click Counts by LMS Categories  ifelse(), grepl(), filter(), select(), mutate(), summarise(), %>% (pipe)
Daily Click Counts by URLs            ifelse(), grepl(), filter(), select(), mutate(), summarise(), %>% (pipe)

Hour 3
Data Inspection, Visualization        ggplot(), filter(), select(), %>% (pipe)
Correlating Clicks, Scatterplots      ggplot(), filter(), select(), %>% (pipe)
Visualizing Clicks Over Time          ggplot(), filter(), select(), %>% (pipe)
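
The "Populating Dates" topic from Hour 1 refers to giving every student a row for every day, including days with no clicks. A base-R sketch of that idea follows; it stands in for the complete() and arrange() approach listed in the table, and the data frame and column names are hypothetical.

```r
# Hypothetical daily counts with gaps on days when a student did not click.
daily <- data.frame(
  student_id = c(1, 1, 2),
  day        = as.Date(c("2021-10-04", "2021-10-07", "2021-10-05")),
  n_clicks   = c(2, 1, 3)
)

# Every calendar day between the first and last observed day.
all_days <- seq(min(daily$day), max(daily$day), by = "day")

# One row for each student and day combination.
grid <- expand.grid(student_id = unique(daily$student_id), day = all_days)

# Left join the observed counts, then treat missing days as zero clicks.
full <- merge(grid, daily, by = c("student_id", "day"), all.x = TRUE)
full$n_clicks[is.na(full$n_clicks)] <- 0

# Sort by student and day.
full <- full[order(full$student_id, full$day), ]
```

Filling in zero-click days before averaging keeps per-day measures from being inflated by counting only the days on which a student was active.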