Processing and Visualizing Clickstream Data Using R

Learning Analytics and Knowledge Conference 2022
Pre-conference Tutorial

Tutorial Requirements

To benefit from this workshop, please be aware of the following:

  1. Because the tutorial time is short, we will use Discord as a forum for self-introductions and for discussing learning goals. We will also use the Discord server to respond to general questions and resolve any technical issues. You will receive an invitation to our Discord server upon registering for the tutorial.

  2. You should have the latest version of R installed on your computer. You can check your version by typing version in the console pane and reading the version.string entry. Please check that your version number matches the one displayed here.

version
##                _                           
## platform       x86_64-apple-darwin17.0     
## arch           x86_64                      
## os             darwin17.0                  
## system         x86_64, darwin17.0          
## status                                     
## major          4                           
## minor          0.3                         
## year           2020                        
## month          10                          
## day            10                          
## svn rev        79318                       
## language       R                           
## version.string R version 4.0.3 (2020-10-10)
## nickname       Bunny-Wunnies Freak Out
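
The version check can also be scripted. The following is a minimal sketch using base R's getRversion(); the value 4.0.3 is taken from the output above, and the variable names expected, current, and status are illustrative only.

```r
# Version shown in the tutorial output above.
expected <- "4.0.3"

# getRversion() returns a "package_version" object, so comparisons are made
# component by component rather than as plain text.
current <- getRversion()

if (current == expected) {
  status <- "matches"
} else if (current > expected) {
  status <- "is newer than"
} else {
  status <- "is older than"
}

message("Installed R ", as.character(current), " ", status, " ", expected, ".")
```
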
  1. We recommend familiarity with the RStudio graphical user interface and experience working with R Markdown files.

  2. Please install the following R packages. You do not need prior experience with these packages, but we do recommend basic proficiency with the tidyverse collection of functions.

    install.packages("data.table")
    install.packages("lubridate")
    install.packages("tidyverse")
    install.packages("psych")

  3. Interested attendees should be able to readily understand most of the functions and the general structure of the following lines of R code.

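library(tidyverse)

# counting unique visit days to the LMS
# (assumes dataframe1 has one row per student per day, and that grade
# is constant within each student so it can be carried through count())
dataframe2 <- dataframe1 %>%
  filter(total_clicks > 0) %>%
  dplyr::count(student_id, grade)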

# plotting unique visit days and final course grade
dataframe2 %>% 
  ggplot(aes(x = n, y = grade)) +
  geom_point(size = 2, color = "blue") +
  geom_smooth(method = "lm", se = FALSE, color = "green") +
  labs(x = "UNIQUE LMS VISIT DAYS", y = "FINAL COURSE SCORE")
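
To try the code above end to end, it can be run on a small toy dataset. The sketch below is not the tutorial's data: the values are made up, and the layout (one row per student per day, with grade repeated on each row) is an assumption. It uses count(student_id, grade) as one way to count active days per student while keeping grade available for the plot.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per student per day (values are made up).
dataframe1 <- data.frame(
  student_id   = rep(c("s1", "s2", "s3"), each = 3),
  total_clicks = c(12, 0, 7,   # s1: active on 2 of 3 days
                   3, 5, 9,    # s2: active on 3 of 3 days
                   0, 0, 4),   # s3: active on 1 of 3 days
  grade        = rep(c(78, 91, 64), each = 3)
)

# Keep only days with activity, then count the remaining rows per student.
# Grouping by grade as well carries it into the result.
dataframe2 <- dataframe1 %>%
  filter(total_clicks > 0) %>%
  count(student_id, grade)

# Build the scatter plot with a linear fit; print(p) draws it.
p <- ggplot(dataframe2, aes(x = n, y = grade)) +
  geom_point(size = 2, color = "blue") +
  geom_smooth(method = "lm", se = FALSE, color = "green") +
  labs(x = "UNIQUE LMS VISIT DAYS", y = "FINAL COURSE SCORE")
```

Note that a student with no active days would be removed by filter() and so would not appear in dataframe2 at all.
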

If you do not understand the above code 😭, but would like to learn the skills required for this tutorial 🥲, we recommend you watch our 10-part workshop series, Intro to R for Educational Data Science. It will teach you all of the R code you need to know. 😉
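
Before the session, the package requirement listed above can be checked in one pass. This is a minimal sketch using base R's requireNamespace(); the variable names required, installed, and not_installed are illustrative only, and the install line is left commented out so that nothing is installed without your say-so.

```r
# Packages listed in the tutorial requirements.
required <- c("data.table", "lubridate", "tidyverse", "psych")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Not yet installed: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to install the missing ones
} else {
  message("All required packages are installed.")
}
```
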