Tuesday, December 19, 2023

All the Media Content We Cannot See

Like the majority of the electromagnetic spectrum, most of any given high-choice media landscape - be it YouTube, TikTok, or even Netflix - is difficult to see without some kind of aid, and thus easy to forget about. One might argue that the most important stuff - the content that has the most influence on individuals and society, i.e., the popular stuff - is easily visible through Top Ten or "trending" lists on the platform itself or through the articles, podcasts, and conversations of cultural critics. But how much of the entire spectrum of content - or, if you take a human-centered approach to the question, viewing hours - are we observing when we talk about the tall head of the distribution and ignore its long tail?

The answer has implications for how we conceive of the culture we live in. Often, we assume we can get a pretty good sense of a culture by observing what media content it chooses to spend its time with. The topics, values, and aesthetics of popular content have long been thought to reflect and/or shape the preoccupations of the culture. This was all easy enough during the era of mass media when choice was limited, although even then it oversimplified the character of a culture. We look back on the late 1960s in America and think of psychedelia and unrest, but plenty of folks living in that place and time were likely oblivious to such trends. Still, it seems safe to say that you could get at least some idea of what most people living in a certain place and time were thinking and feeling by examining its popular media content.

It's a commonplace that the number of choices for media content has exploded in the past decade or two. Truly understanding how content relates to culture - or trying to derive a sense of culture by examining content - has become trickier. In the era of broadcast TV, it wouldn't take much time for anyone to watch episodes of the 10 most popular TV shows. Out of the total number of viewing hours in a given culture, that might get you, say, 50% of them. The other 50% of the viewing hours would be distributed across less popular programming, so you could make a decent claim to "knowing" a culture by examining 10 popular TV shows. What would a similar approach get you now, if applied to Netflix?

According to recently released data from Netflix, viewers watched a total of roughly 90 billion hours in the first half of 2023. Of those hours, the top ten shows accounted for 4.9 billion - or roughly 5% of the total. Watching episodes of these ten shows, then, wouldn't be a very good way to get an idea of what Netflix viewers, generally, were watching (or, by extension, what they thought or how they felt about anything). It may be that the shows are in some way representative of the larger whole - in terms of their genre, topic, tone, aesthetic, values, etc. - but given the relatively small proportion of the whole they represent, there is reason to suspect that we are missing a lot about this group of people and their preoccupations if we only take into account the most popular content.

But this is where many of us start, and by "us" I mean scholars and researchers as well as cultural critics, content creators seeking to create content that resonates with an audience, or marketers. What other option do we have? 

One alternative would be to take stratified samples from further down the distribution tail, an approach used in this article from The Hollywood Reporter. It's important to note that such an approach requires that the platforms make their data available in such ways as to make this feasible, and in this respect, Netflix has done us a huge favor. It is more difficult to get underneath the trending surface of TikTok or YouTube to try to get even a rough idea of what the rest of it looks like. 

And with YouTube and TikTok, the problem of unaccounted-for content is likely much worse. 

Let's do some back-of-the-envelope* calculations to try to see how little of the content universe we're seeing when we examine, say, the top ten TikTok videos from last year. There are roughly 1.1 billion monthly active TikTok users. The average user spends 95 minutes on the app per day. So, that's a total of roughly 104.5 billion minutes per day, or about 38.1 trillion minutes per year. The most viewed TikTok video of 2023 had 504 million views and is roughly 30 seconds long. Obviously, the next nine had fewer views than this, but I'm finding it difficult to obtain raw view numbers for each video (it's easy to find the number of followers, but plenty of people watch TikTok videos created by users they don't follow). So, let's err on the side of overestimating and say that each video is 1 minute long and is watched 500 million times. By watching the top ten TikTok videos, we are accounting for 5 billion minutes of viewing. What proportion of the total are we seeing?

Before we do the math, it's worth remembering our tendency to fail to see meaningful differences among very small proportions. We can pretty easily tell the difference between 20% of something and 5% of it but fail to differentiate between .1% and .01%, even though the second pair differs by a factor of ten while the first differs only by a factor of four. Often, we just think of anything below 1% of something as "very small," whether it's .5% or .05%. But if we're really trying to know something - a culture, a media diet, etc. - it's important to correct for that bias and recognize just how small the proportion really is.

Even with those generous assumptions, watching the top 10 TikTok videos of 2023 would account for only about .013% (a little over one hundredth of one percent) of all TikTok viewing. Given that the rest of the top 100 videos would have fewer views than the top video, and given that most of those videos are under 1 minute in duration, watching the top 100 videos (a feasible, if time-consuming, task) would account for no more than about .13% of content viewed on TikTok.
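
For anyone who wants to check or update the arithmetic, here is a minimal sketch in Python that recomputes the shares from the inputs stated above. The function and variable names are my own, and it inherits the same rough assumptions as the text: every monthly active user is treated as spending 95 minutes on the app every day of the year, and every top video is treated as 1 minute long with 500 million views (a deliberate overestimate).

```python
# Back-of-the-envelope inputs taken from the post (rough figures, not verified data).
MONTHLY_ACTIVE_USERS = 1.1e9      # roughly 1.1 billion users
MINUTES_PER_USER_PER_DAY = 95     # average daily time on the app
DAYS_PER_YEAR = 365

# Deliberate overestimates for the most popular videos.
VIEWS_PER_TOP_VIDEO = 500e6       # assume every top video gets 500 million views
MINUTES_PER_TOP_VIDEO = 1.0       # assume every top video is 1 minute long


def share_percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


def top_n_minutes(n_videos):
    """Upper-bound estimate of minutes spent watching the top n videos."""
    return n_videos * VIEWS_PER_TOP_VIDEO * MINUTES_PER_TOP_VIDEO


# Total viewing minutes per year across all users.
total = MONTHLY_ACTIVE_USERS * MINUTES_PER_USER_PER_DAY * DAYS_PER_YEAR

top10_pct = share_percent(top_n_minutes(10), total)
top100_pct = share_percent(top_n_minutes(100), total)

print(f"Total minutes per year: {total:.3e}")
print(f"Top 10 share of viewing:  {top10_pct:.4f}%")
print(f"Top 100 share of viewing: {top100_pct:.4f}%")
```

The same `share_percent` helper applied to the Netflix figures, `share_percent(4.9e9, 90e9)`, gives about 5.4%. Swapping in better usage data only requires changing the constants at the top.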

Even if we are studying a particular topic or domain within these high-choice environments - say, political messages or health-related messages - sampling only the most popular videos doesn't get us anywhere near the complete or representative sample that it once did in the low-choice days of mass media. Most viewing is happening outside of the sample, further down the distribution tail. Until we reckon with the vast size of these media environments and the diversity of users' media diets, it's hard to know what we're missing.


*If anyone has more accurate usage data, I would love to see it! I don't have supreme faith in these data, but they're the best I could find right now.