Spotify Wrapped is a great way to reflect on your music consumption for the year, and, of course, show off your impeccable taste to all your friends on the ’gram.

I wondered what it would be like to have a Netflix Wrapped or Hulu Wrapped. Why doesn’t every streaming service have a year-end wrap-up? And what if we could combine the data from all the streaming platforms where we watch TV/movies to get a Mega Media Wrapped Year in Review? Well, with a dash of determination and the help of data privacy laws, it’s possible!

In this article, I share my journey of using the GDPR (General Data Protection Regulation) to wrestle my media data from the clutches of streaming services. We’ll explore the importance of data ownership, navigate the (sometimes absurd) hurdles, and create a personalized, cross-platform media retrospective.

I also made a website where you can see your own Media Streaming Year in Review breakdown and share it with your friends: WhatYouWatched.com

Your streaming 2023 Year in Review! You watched 45 movies and 300 TV episodes across 97 series. For a total of 290 hours (that's 12 days) in front of that screen!

The Data Quest

I set out to create a comprehensive ‘Year in Streaming Media’ spanning Netflix, Hulu, Disney+, and the lot. Little did I know that streaming services aren’t exactly champions of data portability.

It’s best when a company normalizes access to your own data and provides easy tools for downloading it in machine-readable formats, but many don’t. Netflix is the winner here, allowing you to download your watch history as a CSV. Amazon Prime came in a distant second: it has a page with your view history which, with a little patience and JavaScript-fu, can be scraped and parsed. The other services had no convenient way to see my viewing history.
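To give a feel for how little work a CSV export needs, here is a minimal Python sketch of turning it into tidy records. It is not the code behind my site. The column names ("Title", "Date"), the US-style date format, the sample rows, and the "Series: Season: Episode" title convention are all assumptions for illustration, so check them against your own file.

```python
import csv
import io
from datetime import date, datetime

# Made-up sample rows standing in for a Netflix viewing-history export.
# Assumption: the file has "Title" and "Date" columns and M/D/YY dates.
SAMPLE_CSV = '''Title,Date
"Example Show: Season 1: Pilot",3/24/23
"Example Show: Season 1: Second Episode",3/24/23
"Example Movie: A Subtitle",1/2/23
'''


def parse_history(text: str, date_format: str = "%m/%d/%y") -> list[dict]:
    """Turn the CSV text into a list of normalized watch records."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        title = row["Title"]
        # Heuristic (an assumption): three or more ": "-separated parts
        # means "Series: Season: Episode"; anything else counts as a movie.
        parts = title.split(": ")
        is_episode = len(parts) >= 3
        records.append({
            "provider": "Netflix",
            "series": parts[0] if is_episode else None,
            "title": title,
            "kind": "episode" if is_episode else "movie",
            "date": datetime.strptime(row["Date"], date_format).date(),
        })
    return records


records = parse_history(SAMPLE_CSV)
for record in records:
    print(record["date"], record["kind"], record["title"])
```

To use it on a real export, read the file's contents into a string and pass that in place of SAMPLE_CSV. The title heuristic will misclassify some movies with long subtitles, so expect to hand-correct a few rows.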

GDPR: The Knight in Shining Armor

Enter GDPR (or its Californian counterpart, CCPA). These laws grant us the right to request and access all our personal data that a service has stored. Armed with this legal superpower, I embarked on my data liberation mission.

Each streaming service presented its own unique *ahem* ‘adventure.’ Some were surprisingly straightforward, while others involved scavenger hunts through support portals, lengthy wait times, and—the irony!—even providing a photo of my ID to access my own data. 🤦‍♂️

Amazon’s GDPR data was easy to get and very thorough. Even though I had already scraped my data from their “view history” page, I filed a GDPR request anyway. When it came back a day later, it had far more data, ripe for mining.

Image: the list of directories Amazon gives you in your GDPR request: Digital.PrimeVideo.CustomerReactions/, Digital.PrimeVideo.CustomerTitleRelevanceRecommendations/, Digital.PrimeVideo.LocationData/, Digital.PrimeVideo.ParentalControlsAndPin/, Digital.PrimeVideo.ViewCounts.1/, Digital.PrimeVideo.Watchlist/, PrimeVideo.MoviesAnywhere/, PrimeVideo.OutOfAppRecommendationsConsent/, PrimeVideo.TvodOwnership.1/, PrimeVideo.ViewingHistory/, PrimeVideo.WatchEvent.1/

Hulu’s process was also pretty quick. Once I clicked through all the right links, they provided a PDF of all my billing and watch history. Nice for reading, I guess, but not great for ingesting. I found a PDF table-extraction tool called Tabula, which helped me turn it into a CSV.

Disney+ and HBO Max were the worst, sending me through Byzantine forms and third-party systems to formally request my data, a process that can take months. Disney required multiple back-and-forth chats with their support team over two months, and I still haven’t gotten the data! HBO Max didn’t even respond (which is illegal, BTW. Good luck trying to hold them to it though).

Wrangling and Wrapping

After much data wrangling, I transformed the raw data into visualizations using D3.js and Observable notebooks. The results were insightful, and maybe a bit concerning (do I really watch that much TV?).

Most videos watched in a single day: 7. Keep up the hard work! Your top genres: 1. Sci-fi 2. Fantasy 3. Crime. Images of castles and spaceships. Binge alert! You binged The Night Agent, for a total of 6 episodes in a row. Image of a secret agent in a dark hat and glasses.

I massaged the data using D3 and derived some basic stats to build the summary screens. I also used D3 and Observable to create some interactive visualizations of all the shows I’ve watched, color-coded by provider, as well as a stream graph of my watch habits over the whole year.
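The stats themselves are simple once every provider's data is in one shape. My version lives in D3, so treat the following as a rough Python sketch of the same idea rather than the real thing. The event tuples are invented, and "binge" here just means the longest run of consecutive watches from the same series, ignoring how much time passed between them.

```python
from collections import Counter
from datetime import date, datetime
from itertools import groupby

# Hypothetical normalized watch events: (provider, series, start time).
EVENTS = [
    ("Netflix", "Show A", datetime(2023, 3, 24, 18, 0)),
    ("Netflix", "Show A", datetime(2023, 3, 24, 19, 0)),
    ("Netflix", "Show A", datetime(2023, 3, 24, 20, 0)),
    ("Hulu", "Show B", datetime(2023, 3, 24, 21, 0)),
    ("Amazon", "Show A", datetime(2023, 3, 25, 9, 0)),
    ("Hulu", "Show B", datetime(2023, 3, 26, 20, 0)),
]


def busiest_day(events):
    """Return (day, count) for the calendar day with the most watches."""
    counts = Counter(when.date() for _, _, when in events)
    return counts.most_common(1)[0]


def longest_binge(events):
    """Return (series, length) of the longest uninterrupted run of one series."""
    ordered = sorted(events, key=lambda event: event[2])
    best_series, best_length = None, 0
    # groupby only merges adjacent items, which is exactly "in a row".
    for series, run in groupby(ordered, key=lambda event: event[1]):
        length = len(list(run))
        if length > best_length:
            best_series, best_length = series, length
    return best_series, best_length


print(busiest_day(EVENTS))
print(longest_binge(EVENTS))
```

A stricter definition of a binge would also require the episodes to fall within the same evening; that is a matter of splitting each run wherever the gap between start times gets too large.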

It was interesting to correlate the spikes in watch activity with illness, which I also track diligently using an app I made called Moji Mapper.

A graph showing minutes consumed over the year across Netflix, Hulu, and Amazon. It spikes in February (got flu), June (got COVID), and December (you guessed it: sick again. RSV)
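Under the hood, a graph like this only needs minutes summed per month and per provider. Here is a small Python sketch of that aggregation, again with invented rows and a made-up (provider, date, minutes) layout; my actual charts were built in D3 and Observable.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (provider, date watched, minutes watched).
VIEWS = [
    ("Netflix", date(2023, 2, 3), 50),
    ("Hulu", date(2023, 2, 4), 45),
    ("Netflix", date(2023, 2, 20), 120),
    ("Amazon", date(2023, 6, 11), 60),
    ("Netflix", date(2023, 6, 12), 30),
]


def minutes_by_month(views):
    """Sum minutes per 'YYYY-MM' month, broken down by provider."""
    totals = defaultdict(lambda: defaultdict(int))
    for provider, day, minutes in views:
        totals[day.strftime("%Y-%m")][provider] += minutes
    # Convert to plain dicts, with months in chronological order.
    return {month: dict(per_provider) for month, per_provider in sorted(totals.items())}


monthly = minutes_by_month(VIEWS)
for month, per_provider in monthly.items():
    print(month, per_provider)
```

Each month's dictionary becomes one slice of the stream graph, with one layer per provider.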

Your Data is in Another Castle, Behind a Moat with Alligators and Piranhas

This experience underscored the significance of owning our data. Laws like GDPR are empowering, but there’s still a long way to go. The hurdles I encountered highlight the need for smoother processes that respect our rights. We need more companies normalizing data access and portability. 

Promisingly, Meta, Google, Apple, Twitter, and Microsoft have joined the Data Transfer Project, which aims to let you move your data directly between services. We can only hope media streaming services follow suit. Imagine taking your preferences, watch history, and personalized recommendations with you. That’s true data liberation!

I encourage you to take control of your data, whether media-related or beyond. Explore the rights granted to you and see what stories your data can tell.

If you want to generate your own Year in Review stats to share, head over to WhatYouWatched.com

