Processing and Visualizing Clickstream Data Using R
Learning Analytics and Knowledge Conference 2022
Pre-conference Tutorial
Tutorial Schedule
This is a three-hour tutorial. During Hour 1, attendees will learn about the structure of clickstream data and what the various elements in the dataset represent. Attendees will also learn how to prepare the data for processing, such as formatting timestamps, categorizing URLs by resource type, and checking the data for potential issues.
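As a rough illustration of the Hour 1 preparation steps, the sketch below formats timestamps, categorizes URLs by resource type, and runs two basic checks in base R. The data frame, its column names, the timestamp format, and the URL patterns are all hypothetical; real LMS exports will differ.

```r
# Hypothetical clickstream rows (column names and URL patterns are assumptions).
clicks <- data.frame(
  student_id = c("s1", "s1", "s2"),
  url = c("/courses/101/pages/video-lecture-1",
          "/courses/101/quizzes/7",
          "/courses/101/files/syllabus.pdf"),
  timestamp = c("2021-01-04T15:02:11Z",
                "2021-01-04T15:10:45Z",
                "2021-01-05T09:30:00Z"),
  stringsAsFactors = FALSE
)

# Remove the "T" and "Z" markers, then parse the strings as UTC date-times.
clean_ts <- gsub("T", " ", gsub("Z", "", clicks$timestamp))
clicks$time_utc <- as.POSIXct(clean_ts, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Categorize each URL by resource type using pattern matching.
clicks$resource <- ifelse(grepl("video", clicks$url), "video",
                   ifelse(grepl("quizzes", clicks$url), "quiz", "other"))

# Basic checks: timestamps that failed to parse and fully duplicated rows.
n_bad_timestamps <- sum(is.na(clicks$time_utc))
n_duplicate_rows <- sum(duplicated(clicks))
```

Timestamps that do not match the assumed format come back as `NA`, so a nonzero `n_bad_timestamps` is a cue to inspect the raw strings before going further.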
Hour 2 will focus on cleaning the clickstream data and constructing measures of engagement. These include general indicators (e.g., total number of clicks, number of clicks per day, number of unique visit days) and indicators by resource type (e.g., total number of video lecture clicks, number of video lecture clicks per day). Attendees will also learn how to generate descriptive diagnostics to inspect these measures.
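The engagement measures described above could be built along the following lines with dplyr. This is a minimal sketch on made-up data: the column names are hypothetical, and "clicks per day" is taken here to mean clicks per day with at least one click, which is only one of several reasonable definitions.

```r
library(dplyr)

# Hypothetical cleaned clickstream: one row per click.
clicks <- data.frame(
  student_id = c("s1", "s1", "s1", "s2", "s2"),
  date = as.Date(c("2021-01-04", "2021-01-04", "2021-01-06",
                   "2021-01-04", "2021-01-05")),
  resource = c("video", "quiz", "video", "video", "other")
)

# Daily counts: all clicks and video clicks per student per day.
daily <- clicks %>%
  group_by(student_id, date) %>%
  summarise(n_clicks = n(),
            n_video = sum(resource == "video"),
            .groups = "drop")

# Student-level indicators built from the daily counts.
engagement <- daily %>%
  group_by(student_id) %>%
  summarise(total_clicks = sum(n_clicks),
            visit_days = n_distinct(date),
            clicks_per_day = total_clicks / visit_days,  # per active day (assumption)
            video_clicks = sum(n_video),
            .groups = "drop")

# Descriptive diagnostics for one of the measures.
summary(engagement$total_clicks)
```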
Hour 3 will focus on building data visualizations of the clickstream measures of engagement. This will include histograms, line graphs, boxplots, and scatterplots.
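A few of these plot types might look as follows in ggplot2. The sketch uses small hand-typed data frames with hypothetical column names and values, purely to show the shape of each call.

```r
library(ggplot2)

# Hypothetical student-level engagement measures.
engagement <- data.frame(
  student_id = paste0("s", 1:8),
  total_clicks = c(12, 30, 45, 45, 60, 82, 95, 140),
  video_clicks = c(2, 10, 12, 20, 18, 35, 30, 66)
)

# Hypothetical daily click totals for the whole course.
daily <- data.frame(
  date = as.Date("2021-01-04") + 0:4,
  n_clicks = c(120, 95, 60, 140, 80)
)

# Histogram: distribution of total clicks across students.
p_hist <- ggplot(engagement, aes(x = total_clicks)) +
  geom_histogram(bins = 5) +
  labs(x = "Total clicks", y = "Number of students")

# Scatterplot: total clicks against video lecture clicks.
p_scatter <- ggplot(engagement, aes(x = total_clicks, y = video_clicks)) +
  geom_point() +
  labs(x = "Total clicks", y = "Video lecture clicks")

# Line graph: clicks over time.
p_line <- ggplot(daily, aes(x = date, y = n_clicks)) +
  geom_line() +
  labs(x = "Date", y = "Clicks")

# Boxplot: spread of total clicks.
p_box <- ggplot(engagement, aes(y = total_clicks)) +
  geom_boxplot() +
  labs(y = "Total clicks")
```

Each object can be drawn with `print()`, for example `print(p_hist)`.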
Topic | Functions |
---|---|
Hour 1 | |
Importing and Inspecting Data | fread(), list.files(), lapply(), bind_rows(), grep() |
Timestamps | gsub(), as.POSIXct(), with_tz() |
Populating Dates | cbind(), complete(), seq.POSIXt(), arrange() |
Hour 2 | |
Daily Click Counts | group_by(), mutate(), summarise(), piping %>% |
Daily Click Counts by LMS Categories | ifelse(), grepl(), filter(), select(), mutate(), summarise(), piping %>% |
Daily Click Counts by URLs | ifelse(), grepl(), filter(), select(), mutate(), summarise(), piping %>% |
Hour 3 | |
Data Inspection, Visualization | ggplot(), filter(), select(), piping %>% |
Correlating Clicks, Scatterplots | ggplot(), filter(), select(), piping %>% |
Visualizing Clicks Over Time | ggplot(), filter(), select(), piping %>% |