Using Cohort Analysis to Track Attendance

So here’s a fun little thing Adam’s been playing around with that he keeps shoving into my face: an analysis done in R (a statistical programming language), with the plot rendered through ggplot2. The TL;DR (too long; didn’t read) version is that this is a really, really good argument for why retention is just as important as getting new students, since it shows in greater detail how students move into and out of your school. This is also why data analysis should be an important part of managing any type of organization.

You’ll have to get all your data into a database before attempting to do this through R (husband o’ mine decided to create a card-swipe sign-in program to make it easier on himself instead of having to manually input data from his sign-in sheets). It’s possible to do it by hand, but that would be way too intensive and possibly take forever. His version is 190 lines of code, but once it’s set up, he can see everything with just a click of a button. You can do it by year, month, day, hour, what have you, as long as you have the time data captured.
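To give a feel for what "by year, month, day, hour" means in practice, here is a minimal sketch of collapsing a sign-in timestamp into whichever interval you want to group by. It is written in Python rather than R, it is not Adam's code, and the `BUCKET_FORMATS` table, the `bucket` function, and the sample swipe are all made up for illustration.

```python
from datetime import datetime

# strftime patterns for each bucket size; the keys are labels invented for this sketch.
BUCKET_FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%d %H:00",
}

def bucket(timestamp, interval):
    """Collapse a sign-in timestamp into the label of the interval it falls in."""
    return timestamp.strftime(BUCKET_FORMATS[interval])

swipe = datetime(2023, 3, 14, 18, 42)  # hypothetical card swipe
for interval in BUCKET_FORMATS:
    print(interval, "->", bucket(swipe, interval))
```

Every sign-in that lands on the same label counts toward the same time interval, which is why the time of each sign-in has to be captured in the first place.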


So, how to read this. When I first saw it, I was a bit confused at his near-maniacal glee at having created a colorful chart that looked like something our kindergartner did during art class. When I finally figured it out, I went, huh, interesting, this might be helpful for other people too.

These cute little graphs are plot renderings of cohort analysis, which is defined by Wikipedia as “a subset of behavioral analytics that takes the data from a given data set (e.g. an e-commerce platform, web application, or online game) and rather than looking at all users as one unit, it breaks them into related groups for analysis. These related groups, or cohorts, usually share common characteristics or experiences within a defined time-span. Cohort analysis allows a company to ‘see patterns clearly across the life-cycle of a customer (or user), rather than slicing across all customers blindly without accounting for the natural cycle that a customer undergoes.’[1] By seeing these patterns of time, a company can adapt and tailor its service to those specific cohorts.”

You’ll notice that at every time interval, a new, darker color is stacked on top of the previous colors, and so on. That new color represents the new, unique participants who walked through your door during that month/year. The colors underneath are the participants from earlier time intervals who decided to continue on and participate in this one. Starting in the same interval is our common characteristic within the defined time-span. So while the top line is the sum of all participants, new and old, a cohort analysis provides an in-depth view of the proportion of old students to new.
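If it helps to see the idea outside of a chart, here is a minimal sketch of how those color bands could be computed. It is in Python rather than R and is not Adam's code. The sign-in records, the student IDs, and the `cohort_counts` function are all hypothetical, and it assumes months are written as "YYYY-MM" so they sort correctly.

```python
from collections import defaultdict

# Hypothetical sign-in records: (participant_id, "YYYY-MM" month of the visit).
sign_ins = [
    ("s1", "2023-01"), ("s2", "2023-01"), ("s3", "2023-01"),
    ("s2", "2023-02"), ("s3", "2023-02"), ("s4", "2023-02"),
    ("s2", "2023-03"), ("s4", "2023-03"), ("s5", "2023-03"),
    ("s2", "2023-03"),  # a repeat visit in the same month counts only once
]

def cohort_counts(records):
    """Return {month: {cohort_month: number of unique participants}}."""
    # A participant's cohort is the first month they ever signed in.
    first_seen = {}
    for pid, month in sorted(records, key=lambda r: r[1]):
        first_seen.setdefault(pid, month)

    # Collect unique participants per (month, cohort).
    active = defaultdict(lambda: defaultdict(set))
    for pid, month in records:
        active[month][first_seen[pid]].add(pid)

    return {
        month: {cohort: len(ids) for cohort, ids in sorted(cohorts.items())}
        for month, cohorts in sorted(active.items())
    }

counts = cohort_counts(sign_ins)
for month, cohorts in counts.items():
    print(month, cohorts, "total:", sum(cohorts.values()))
```

Each inner dictionary is one vertical slice of the chart: one number per color band, with the total being the top line.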

Why is this important? 

  • You can see your retention rate.
  • You can see how many new students you’re getting.
  • You can see which months are most popular for new recruits.
  • You can see which months are most popular for long-term students.
  • You can see which months are most popular overall for training.
  • You can see your month-to-month overall growth trend.
  • You can see your year-to-year overall growth trend.
  • You can see whether the original group of people came back if some left or took a hiatus. The really cool thing about this is that, because every participant has a unique ID, it’s programmed to keep them in whatever group they originally started in. So if Jim, Bob, and Adam were in the initial group in the first month, only Bob and Adam stayed for the second month, and then Jim came back in the third, it puts Jim back into the initial group rather than throwing him into a new group.
  • It’s really helpful for tracking your marketing campaigns because you can see which weeks, months, or years brought in the most new students and match them up with your campaign timings.
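
A couple of the items above (new students per month, retention, and the returning-student case) can be sketched in a few lines. This is a minimal Python sketch, not Adam's R code. The attendance data and the `monthly_summary` function are hypothetical, and "retention" here is assumed to mean the share of last month's attendees who showed up again this month, which is only one of several reasonable definitions.

```python
# Hypothetical attendance: month -> set of participant IDs seen that month.
attendance = {
    "2023-01": {"jim", "bob", "adam"},
    "2023-02": {"bob", "adam", "carla"},
    "2023-03": {"jim", "bob", "adam", "carla", "dee"},
}

def monthly_summary(attendance):
    """Per month: new students, returning students, and retention vs. the prior month."""
    seen_before = set()   # everyone who attended in any earlier month
    previous = None       # last month's attendees
    summary = {}
    for month in sorted(attendance):
        present = attendance[month]
        # Assumed definition: share of last month's attendees who came back.
        retention = None if previous is None else len(present & previous) / len(previous)
        summary[month] = {
            "new": sorted(present - seen_before),
            "returning": sorted(present & seen_before),
            "retention": retention,
        }
        seen_before |= present
        previous = present
    return summary

summary = monthly_summary(attendance)
for month, row in summary.items():
    print(month, row)
```

Because membership is tracked by ID, Jim shows up under "returning" in the third month even though he skipped the second, so he is not counted as a new student.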

Why not just a regular line chart or histogram, you ask? Well, because you can have the same number of students from month to month, but if half of them are new because half of your old students didn’t continue, a plain total doesn’t show what’s truly going on within your organization and might give you a false picture.
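
Here is a tiny made-up example of that false picture, as a minimal Python sketch (not Adam's R code). Both schools and the `new_vs_returning` function are hypothetical. Each school has exactly four students every month, but very different turnover.

```python
def new_vs_returning(months):
    """For each month's set of IDs, count first-timers vs. people seen before."""
    seen, rows = set(), []
    for ids in months:
        rows.append({
            "total": len(ids),
            "new": len(ids - seen),
            "returning": len(ids & seen),
        })
        seen |= ids
    return rows

# Two hypothetical schools, each with 4 students every month.
steady = [{"a", "b", "c", "d"}, {"a", "b", "c", "d"}, {"a", "b", "c", "d"}]
churny = [{"a", "b", "c", "d"}, {"a", "b", "e", "f"}, {"a", "e", "g", "h"}]

for name, school in (("steady", steady), ("churny", churny)):
    print(name, new_vs_returning(school))
```

A line chart of the totals would be flat and identical for both schools. Only the new/returning split shows that the second one is replacing half its students every month.
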

Anyway, this was just one of the cool little things he’s been playing with.

(For those wondering why: he’s currently working on his PhD in Educational Data Analytics, and he uses dojo data to test things. Although if he bounces any more ideas off of me, I might bounce his head off a wall. r/hetalkstoomuch)