Recently I had the opportunity to present data for a program, and I needed to visually communicate how many people were being touched over a period of time. After some time playing around with the data, I figured I should do the opposite of the advice given in TLC’s 1994 hit song Waterfalls and chase them instead of avoiding them.
Let’s say we run a graduate test prep program and want to evaluate how many people we enroll over a period of time (say, 2017). We start the year with some number of students in the program. Over the next year we enroll more individuals and have some leave (either they took the test or left for some other reason), and at the end of the year we arrive at a new current enrollment number.
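To make that bookkeeping concrete, here is a minimal Python sketch of the year's enrollment balance. The numbers are made up for illustration and are not the program's real data, and the function name is just something I picked for this example.

```python
# Hypothetical numbers, for illustration only (not the program's real data).
start_enrollment = 500     # students in the program on January 1
new_enrollments = 900      # students who enrolled during the year
completed_exam = 400       # students who left after taking the exam
left_without_exam = 300    # students who left for some other reason


def end_of_year_enrollment(start, enrolled, completed, left):
    """Closing enrollment = opening enrollment + additions - departures."""
    return start + enrolled - completed - left


end_enrollment = end_of_year_enrollment(
    start_enrollment, new_enrollments, completed_exam, left_without_exam
)
print(end_enrollment)  # 700
```

These four inputs and one output are exactly the five pieces of information the presentation needs to get across.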
Now, say you’re responsible for putting together a presentation on this data, and your audience is interested in seeing growth, the volume of people you touched, and some opportunity for next year (2018). What’s the best way to do that while being clear and concise? There are probably many ways to do this, but let’s walk through my process and why you should go chasing waterfalls.
First off, we could just track the unique number of students we touched in the last year as a basic line graph as seen in the first example below.
We can easily see the trend in each category, but the result is more confusing than helpful. Sure, we can see that enrollment is typically higher than the number of people unsubscribing, but it’s hard to understand how many unique people we are working with and what the growth was like throughout the year (where did we start vs. where did we finish). Something like this may be helpful for data exploration, but it may leave your audience hungry for spaghetti instead of action if you use it in a presentation.
The next thing that came to mind was putting this into an area plot (second example) so we could see the overall number of people being touched throughout the year (e.g., in June it appears we touched a little over 600 people).
However, it’s difficult to read the magnitude of each category, and differentiating between the categories is hard because of the number of colors being used and the fact that our brains just aren’t great at judging proportions over time.
For example, in June it looks like we may have enrolled ~400 people (in blue), but how does that compare to those who completed the exam (in green)? At first it looks like the number who completed is ~200 people, but that is because the green band is being lifted up by those who did not complete the exam (in red).
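The misreading comes from how a stacked area plot is built: the top edge of each band sits at the running total of everything beneath it, not at the band's own value. Here is a rough Python sketch of that effect. The individual June values are hypothetical, chosen only so the band edges land near the ~200 and ~600 readings described above.

```python
from itertools import accumulate

# Hypothetical June values, listed in stacking order (bottom band first).
june = [
    ("did not complete (red)", 120),
    ("completed (green)", 80),
    ("enrolled (blue)", 400),
]

# The top edge of each band is the cumulative sum of the bands up to it.
band_tops = list(accumulate(value for _, value in june))

for (name, value), top in zip(june, band_tops):
    print(f"{name}: top edge at {top}, actual value {value}")
```

With these assumed values the green band tops out at 200 even though it only represents 80 people, which is the kind of mental subtraction the audience should not have to do.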
That is when it hit me: I remembered reading about a similar situation in Cole Knaflic’s book, Storytelling with Data, where she presented the waterfall plot. I figured it was time to give it a whirl.
In the third example, you can see what I was able to put together. Here, we can easily see how many people started in the program at the beginning of the year, the number of new students, the number of people who successfully completed the exam and those who unsubscribed but did not take the exam, and finally the number of people we had in the program at the end of the year. This structure allows the audience to see how successful we were in enrolling new students, the growth over the year, along with some opportunity where we can potentially help those 578 individuals complete the exam in the next year.
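If you want to build something similar yourself rather than use a template, the core of a waterfall is just working out where each floating bar starts and how tall it is. Below is a minimal Python sketch of that calculation. The function name, labels, and numbers are my own hypothetical choices for illustration; this is not the template mentioned in this post, and the figures are not the real data.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, height) for each bar of a waterfall chart.

    start:   the opening total (drawn as a full bar from zero)
    changes: list of (label, signed change); positive adds, negative removes
    """
    bars = [("Start", 0, start)]
    running = start
    for label, delta in changes:
        new_total = running + delta
        # A floating bar spans the old and new running totals,
        # so its bottom is whichever of the two is lower.
        bars.append((label, min(running, new_total), abs(delta)))
        running = new_total
    # The closing total is drawn as a full bar from zero again.
    bars.append(("End", 0, running))
    return bars


# Hypothetical numbers, for illustration only.
bars = waterfall_bars(
    500,
    [("New students", 900), ("Completed exam", -400), ("Left without exam", -300)],
)
for label, bottom, height in bars:
    print(f"{label}: from {bottom} to {bottom + height}")
```

Each tuple can then be handed to whatever bar-plotting tool you use. In matplotlib, for instance, the `bar` function accepts a `bottom` argument, so the heights and bottoms above map onto it directly.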
At the end of the day, this was a crisper rendering, and it helps the audience understand what was going on by answering the questions we posed at the beginning of this post: what is our current enrollment, what was our growth, and how many people did we touch throughout the year? What are your thoughts? Are there any ways you could incorporate waterfalls into your data?
Find more about the Waterfall plot from Cole’s blog here: http://www.storytellingwithdata.com/blog/2011/11/waterfall-chart
Find the template I used for the waterfall plot here (also from Cole):