data analysis – Bad Movie Twins

Bad Movie Twins Bad Movie Data Analysis #2

This is the second in a series where I’ve been breaking down wide-release bad movie data through time. The previous installment can be found here. The conclusion there was that one should split Rotten Tomatoes data at around 1998, since the pre-1998 and post-1998 data (before and after Rotten Tomatoes was established) behaved much differently. At the end I suggested I wanted to start looking at more recent trends in bad movie releases. This analysis focuses on whether studios have become better at recognizing bad films and either not releasing them (or releasing them to Video on Demand (VOD), which to us is equivalent) or dumping them into the classic bad movie dump months (January, February, August, etc.).

Once again, let’s briefly describe the data set. I collected every film released to more than 600 theaters (“wide” according to Box Office Mojo) from Box Office Mojo, as that is one of our qualifying metrics. I had collected the Wikipedia, IMDb, and Rotten Tomatoes links for these films prior to the previous analysis. This analysis ended up being the first step in thinking about a model based on this data, a model that could, eventually, tell us things like “Here is a set of eight films being released this February: which films are most likely to be bad, should we watch one of those films, or should we wait until March?” It could also tell us a “fair yield” for a year’s worth of bad movies, and eventually help identify VOD films which should qualify according to their properties (e.g. “In 2010 this film would have been released to 2000 theaters, but in 2020 it is released successfully to VOD”).

Initially I was curious about whether there was an identifiable trend (outside of general yearly trends) in bad movie releases by month. My initial prior was: I think it makes sense that classic bad movie dumps like January are getting worse in general, and that previous semi-dump months like February and March are getting better, thus reinforcing the monthly differences we’ve seen previously. This was based on the recent discovery that zero wide release films received less than 40% on Rotten Tomatoes during June and July 2018, which makes it extremely likely that 2018 will become the first year since the establishment of Rotten Tomatoes to not release at least 52 films with 40% or lower on Rotten Tomatoes widely (a requirement for the continued existence of BMT for all eternity, naturally).

The first question to be asked though is the above: Are we just seeing fewer bad movies recently? Are these films just being released to VOD (or not released at all)? I split the releases by Rotten Tomatoes score to get a sense of how the groups have been changing in the last 20 years:

RawYearData

So the answer to whether there have actually been a lot fewer bad releases recently is no, I think: the number of bad releases in the past ten years has been rather stable. But there are a few crazy things of note in this data. First, in 2007 there were over 50 wide release films that got below 20%, which is insane. The collapse of the bad movie industry coincides with the financial collapse, which I don’t think is a coincidence: I think that studios making films like Redline with ill-gotten fortunes went out of business in 2008 and simply have not come back. Second, the number of films with Rotten Tomatoes scores above 80% has ballooned. I think that is more likely a case of multiple compounding factors, namely: (1) the MCU and other franchises are now consistently releasing good-to-great films multiple times a year; (2) more consistent wide releases for independent films; and (3) Rotten Tomatoes has become bigger and in general the largest films in a year are getting more and better reviews (as we saw in the last analysis). I do think it is a combination of all of those things. Regardless, we can use these films-per-year numbers to produce adjusted films per year (in order to prevent general year-to-year trends from obscuring the monthly trends I’m interested in):

badMovieAdjustmentFactorSplit
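
For the curious, here is a minimal sketch of one way a yearly adjustment factor like this could be built. It assumes the factor is the all-years mean count for a category divided by that year’s count, which may not be exactly how the plot above was produced. The function name and the counts are made up for illustration.

```python
from collections import defaultdict

def yearly_adjustment_factors(counts):
    """counts maps (year, category) -> number of wide releases.

    Returns (year, category) -> factor, where the factor is the category's
    mean count across all years divided by that year's count (an assumed
    definition). Multiplying a year's monthly counts by its factor puts
    every year on the same overall scale.
    """
    per_category = defaultdict(list)
    for (_year, category), n in counts.items():
        per_category[category].append(n)
    means = {c: sum(v) / len(v) for c, v in per_category.items()}
    return {
        (year, category): (means[category] / n if n else 0.0)
        for (year, category), n in counts.items()
    }

# Made-up counts for a single category, not the real data.
example_counts = {
    (2007, "0-20%"): 50,
    (2008, "0-20%"): 30,
    (2009, "0-20%"): 40,
}
factors = yearly_adjustment_factors(example_counts)
```

A heavy year like the made-up 2007 gets a factor below one, and a light year gets a factor above one.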

Easy enough. I’m generally interested in three things. First, the average number of films released in a given month in each Rotten Tomatoes category. Second, the trend in these same numbers. And finally, the trend in the bad movie share for a given month. With these three plots I think we can get a clear picture of the traditional bad movie dump months, the trend in those months, and the trend in our bad movie probability in order to better inform our BMT Live! choices in the future.

totalFilmSplit

I just wanted to get an idea of good and bad months traditionally. So this is the average films released across all 20 years in each TomatoMeter category. The dotted lines are the average films released across all five categories, and if you draw a line along the category values you can get a general sense of how much a month released good or bad films. Notably January, February, April, and very slightly August generally release bad films. November and December are the big months for good films. So how has this been changing (adjusting for yearly trends in general)?

adjTotalFilmPercentile

Here it is quite interesting. Most months are in general a wash; specifically, May to September really doesn’t have much of a trend. But it looks like January is getting worse, April is getting better, October is getting slightly worse, November is getting a lot better, and December is getting a lot worse. Perhaps November is becoming the main month where Oscar films are being pushed, and December is starting to clear out for larger fish (namely Star Wars), leaving bad Christmas kids’ films? April getting better could also be a product of more summer films getting released in February and March (a la the MCU). Note that these trends are formed using the yearly adjustment factors in the second plot. Interestingly this is getting mighty close to how one would form a Rotten Tomatoes score model, so … that could be coming down the pipe.

All interesting and good things to know. Finally, since it is most important to know which months might be good for BMT I also plotted the “share” of bad movies (the percentage of a year’s worth of bad movies, <40% on Rotten Tomatoes released in a given month) with a trend line:

adjBadFraction

This reinforces some of the things said above: January is, somehow, getting worse with about 12% of the bad movies released in that month; and April and November are both getting much much better in general. Other trends are a little less clear when you look at it this way, specifically with all of the noise it is pretty unclear whether October and December are actually getting worse or not. July is almost definitely a mirage, literally zero bad wide release films came out in July this year so that +41% is going to take a huge hit if I recalculate next year.
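
The share and its trend line are simple enough to sketch. This is a rough illustration, assuming the share is a month’s bad-movie count divided by that year’s total bad-movie count and that the trend is an ordinary least-squares slope across years. The function names and the numbers are hypothetical, not the real data.

```python
def monthly_bad_share(bad_counts):
    """bad_counts maps (year, month) -> number of wide releases under 40%.

    Returns (year, month) -> fraction of that year's bad movies that were
    released in that month. Years with no bad movies are skipped.
    """
    year_totals = {}
    for (year, _month), n in bad_counts.items():
        year_totals[year] = year_totals.get(year, 0) + n
    return {
        (year, month): n / year_totals[year]
        for (year, month), n in bad_counts.items()
        if year_totals[year] > 0
    }

def trend_slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    num = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    den = sum((y - mean_year) ** 2 for y in years)
    return num / den

# Made-up counts for two months over three years.
bad_counts = {
    (2016, 1): 5, (2016, 7): 5,
    (2017, 1): 6, (2017, 7): 4,
    (2018, 1): 7, (2018, 7): 3,
}
shares = monthly_bad_share(bad_counts)
years = [2016, 2017, 2018]
january_slope = trend_slope(years, [shares[(y, 1)] for y in years])
july_slope = trend_slope(years, [shares[(y, 7)] for y in years])
```

In this toy data January’s share climbs every year and July’s falls by the same amount, since the two months split each year between them.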

All of this is super interesting. If I were to try and fashion some rules it would be:

(1) For the first BMT Live! try and get a good January and just run with it, as it is very likely to be the best bet; (2) January-March and August-October are prime time for bad movies and we might want to consider doing two good Lives in each of those spans if/when they become available; (3) It seems likely that April-July are going to be very dry forevermore, so it shouldn’t be surprising when 2018 repeats itself (missing the Spring BMT Live! because nothing became available), see rule number 2.

All good guidelines. In a single sentence: We have to get a little loosey goosey with our BMT Live!s because bad movies do seem to be released predominantly during certain months, and the trend seems to be reinforcing itself.

BMT:CSI:SVU: We’re the Special Victims #2

This is a continuation of the long-term IMDb data analysis using the Internet Archive. Thanks Internet Archive! You can see part one of this series here. Cheers.

‘Ello everyone. A few months back (or a few days with regards to this website) I tried to solve the BMysTery of the mysterious inflection point in IMDb data. Don’t know what I mean? The short rundown is that a lot of movies seem to have two slopes, one for growth before 2011 and one for after. The previous post explored that and came to a (I think) reasonable conclusion. So what is all this about then? Well, I have a ton of data just lying around and something just kicked up and itched my brain. Time for the long story.

You guys know Material Girls right? Hilary and Haylie Duff vehicle, pretty big deal. Well, every time we do a preview for a movie we generate trajectories for both IMDb rating and votes through time. Usually this results in a scream of “WHY?! Why has the rating of this terrible film gone up over time?!” And typically it was left there, because hey, people have different tastes, and maybe it is just kind of a trait of the data. But then Material Girls!

MaterialGirls_RV

First, holy moley that 2011 inflection. Even the rating has an inflection! This was a huge red flag for me. Second, the rating jumps 2.5 points! That is patently absurd. Through all of this I couldn’t help but think maybe …. it was related to this recent blog post by fivethirtyeight. But then I was looking through some of my very old programs and stumbled onto a very prescient comment:

#Look at that variance! Awesome, basically regression to the mean.
#Movies are superlative when they come out
#End up regressing both up and down to the mean

So that’s what this (short) entry will look at: The regression to the mean in IMDb ratings. Something I clearly knew about literally 7 months ago then managed to forget pretty much instantaneously … yeah, I’m an idiot.

First start with a plot of all of the rating data I’ve got:

Ratings

Nonsense. But you can kind of see that things condense as time goes by. But it is all easier if you plot the rating change (over ten years) by the initial rating of the movie. I’ve included a regression and Material Girls is marked out by a blue square:

RatingsPlot

Nice. Pretty much the entirety of the crazy jump in ratings is explained by regression to the mean. Just look at Material Girls. And funny enough the rating at which it crosses over, 6.0, is kind of the cut off point for bad movies as well, which is fun.
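
Here is a small sketch of that regression, for anyone who wants to try it on their own data. It fits rating change against initial rating with ordinary least squares and solves for the rating where the fitted change is zero. The data points are synthetic and deliberately chosen to cross over at 6.0 to mirror the plot; they are not the real ratings.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def crossover_rating(slope, intercept):
    """Initial rating at which the fitted ten-year change is zero."""
    return -intercept / slope

# Synthetic (initial rating, change over ten years) pairs.
initial_ratings = [3.0, 5.0, 6.0, 7.0, 9.0]
rating_changes = [1.5, 0.5, 0.0, -0.5, -1.5]

slope, intercept = fit_line(initial_ratings, rating_changes)
crossover = crossover_rating(slope, intercept)
```

A negative slope is the signature of regression to the mean: low-rated movies drift up and high-rated movies drift down, with the crossover sitting near the long-run average.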

It is interesting, especially looking at the first plot: the rating doesn’t just regress by some exponential, it pretty much follows the voting trajectory. But … yeah, they aren’t that correlated:

RatingsVotesPlot

The rating can’t move without votes, so the rating following the vote trajectory through time is, I think, just a consequence of that inherent underlying connection. And I think that’ll just about do it for that. The regression is interesting, but probably at this point hard to utilize for good. It could be used in tandem with a vote number trajectory predictor to try and predict vote/rating trajectories into the past. But predicting votes is the rub, and something I’ve found rather difficult.

But I declare this BMysTery closed! It wasn’t that hard, I mean, I apparently knew the answer seven months ago, but yeah, bad movie IMDb ratings tend to go up (and the opposite for good movies) over time. It isn’t people waking up and realizing movies are better than their rating, it is just regression to the mean. And Material Girls probably wasn’t brigaded by guys.

BMT:CSI:SVU: We’re the Special Victims #1

Editor’s Note: This analysis was initially performed in October of 2015. The plots have been cleaned up and updated (so that I don’t look like I’m incompetent). At the time we were basically just starting in on the use of historical data to explore the evolution of bad movies through time, an ongoing project. We are, of course, indebted to the Internet Archive which has been diligently collecting this data for years. Cheers.

Welcome to BMT:CSI:SVU (we’re the special victims). This section uses high-tech forensic science (not really) to solve the mysteries of BMT … you could even call this a BMysTery. For the first in this series I naturally decided to present my boldest analysis to date. It asked the question: What really has happened to the voting and ratings through time on IMDb? The initial idea was to start in on trying to predict ultimate vote counts based on an initial vote trajectory … I got waylaid a bit. I started by taking every movie listed on OMDB that has a release year of 2005 (658 movies). I wanted a solid 10 years of samples. I went to the Internet Archive and then took 20 vote/rating samples (the two nearest archived pages on either side of the new year from 2006 to 2015) for every movie that had a valid page (the vote number being greater than 5) prior to January 1, 2006 and after January 1, 2015. Finally, I linearly interpolated vote/rating pairs for each New Year’s Day from 2006 to 2015.

The resulting data set had 471 movies (yeah … I made about 10,000 page calls, sorry Internet Archive), each with approximate vote/rating pairs for 10 data points (New Year’s Day from 2006 to 2015).
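
The interpolation step described above can be sketched like this. It is a minimal illustration: the function name, the snapshot dates, and the vote counts are all made up, and it assumes one archived snapshot on each side of the target date.

```python
from datetime import date

def interpolate_at(target, before, after):
    """Linearly interpolate a value at the target date.

    before and after are (date, value) snapshots on either side of target,
    e.g. the nearest archived pages around New Year's Day.
    """
    (d0, v0), (d1, v1) = before, after
    span = (d1 - d0).days
    if span == 0:
        # Both snapshots fall on the same day, so there is nothing to interpolate.
        return float(v0)
    weight = (target - d0).days / span
    return v0 + weight * (v1 - v0)

# Hypothetical snapshots either side of New Year's Day 2010.
votes_2010 = interpolate_at(
    date(2010, 1, 1),
    before=(date(2009, 12, 22), 1000),
    after=(date(2010, 1, 11), 1200),
)
```

The same call works for ratings; just pass the rating as the value instead of the vote count.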

RatingTime

The rating plot isn’t that interesting: it shows that the ratings have dropped a little over time without much of a trend. This doesn’t jibe with a lot of the individual plots I generated previously, though, which suggested rather strongly that the rating tends to increase with more votes being cast. Instead I ended up finding that there isn’t a correlation between how the rating changes and the current rating or number of votes, something to be investigated further in the future I think.

If you normalize the voting trajectories based on the number of votes on New Years Day 2015 though you get a more interesting result.

VotesNorm

Basically, it looks like the samples are split into two groups: movies that gained most of their votes after 2010 and those that gained most votes before 2010. This is in fact a trend: there is an odd anomaly in IMDb data whereby movies seem to have an inflection point sometime in 2011. This can be more easily seen using the sum of all the votes, and in 2011 the total number of votes all of a sudden starts to increase:

VotesMean

Say what? That doesn’t quite jibe with what we saw in the trajectories before. But it is true, a bunch of movies have either the 2011 inflection visible (red) or they appear to have leveled off since initial release (green, a much more expected trend). Here are the top and bottom 10 ranked by deviation from a linear trendline:

VotesExtrema

So in order to quantify the difference between the two trajectories I note that the mean normalized trajectory is roughly linear:

VotesNormMean

By correcting for this the normalized and corrected vote count trajectories now go from zero to zero. If you sum across the normalized and corrected trajectories then normal trajectories will have a positive value and those with the 2011 Inflection will have a negative value. I called this value S (for sum, inventive I know).
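
Here is a rough sketch of how an S value like this could be computed. It assumes the correction is a straight line drawn through each normalized trajectory’s own first and last points, which is one way to make the corrected trajectory go from zero to zero; the original correction may have used the fitted mean trajectory instead. The function name and vote counts are invented.

```python
def s_value(votes):
    """Sum of a normalized, linearly corrected vote trajectory.

    votes is a list of yearly vote counts, oldest first. Each count is
    divided by the final count, then a straight line joining the first and
    last normalized points is subtracted (an assumed form of the correction),
    so the corrected trajectory starts and ends at zero.
    """
    final = votes[-1]
    normalized = [v / final for v in votes]
    n = len(normalized)
    start = normalized[0]
    corrected = [
        normalized[i] - (start + (1.0 - start) * i / (n - 1))
        for i in range(n)
    ]
    return sum(corrected)

# Made-up trajectories: one that levels off early, one that takes off late.
leveled_off = [600, 800, 900, 950, 1000]
late_inflection = [100, 150, 200, 500, 1000]

s_leveled = s_value(leveled_off)
s_inflected = s_value(late_inflection)
```

The leveled-off trajectory sits above the line and gets a positive S, while the late bloomer sits below it and gets a negative S, matching the sign convention described above.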

The thing I thought was interesting was that if you then plot the S value against the log(votes) from IMDb you see a rather strong correlation between the two:

SCorrelation

Against the rating it is a bit more unclear. I won’t get into the nitty gritty here (mutual information, distance correlation, and partial correlations all support what I’m about to say, by the way; the Pearson correlation is reported above the plot and the data does appear linear, so this is probably sufficient). Basically, I would say rather confidently that whether the 2011 Inflection is present or not is strongly linked to the popularity (number of IMDb votes) of the movie in question. Specifically, more popular movies are more likely to have the inflection.
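
For reference, the Pearson correlation between S and log(votes) can be sketched as below. The numbers are invented purely to show the calculation, and the base-10 log is an assumption (the base does not change the correlation anyway).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data: popular movies get negative S (inflection present).
final_votes = [100, 1000, 10000, 100000]
s_values = [0.5, 0.3, -0.1, -0.6]

log_votes = [math.log10(v) for v in final_votes]
r = pearson(log_votes, s_values)
```

With S defined so that the inflection gives a negative value, a link between popularity and the inflection shows up as a negative correlation.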

This result is probably a strong indicator that a previously held belief about the 2011 inflection is true: the inflection has to do with IMDb expanding their smartphone/internet presence and seeing a sudden influx of new customers in 2011. Why? Because these new customers are more likely to vote on the wildly popular movies than something like Crispin Glover’s directorial debut. So for movies released prior to 2011, the most popular movies are much much more likely to see gains (and thus the inflection).

An alternative theory would perhaps be the international angle. As the international user base grows those users are also much much more likely to vote on the wildly popular movies (which are more likely to be available in foreign languages and released internationally). There are two reasons I think this is less likely. First, the inflection is seen in both international and US vote statistics (scraped from the much less robust Internet Archive data set of the IMDb ratings pages, and normalized by the maximum value in the windowed year average):

NationalVotes

Indeed, looking at the percentage of votes from international users, the (proportional) increase is rather linear in reality, with no inflection:

InternationalPercent

Second, I think there would be a lot more foreign language outliers in that case. If the number of users from, say, Hong Kong increased, then movies from there (with a much smaller number of votes) would have also seen the inflection. But in general I don’t think that was true (although I haven’t looked too closely, I think I would have noticed that).

So that’s it. I declare this BMysTery closed! I think it is definitely due to a sudden influx of new users, probably due to the widespread adoption of smartphones and development of the IMDb app. I should point out it still could be bots, because bots might try and fake out IMDb’s automated purging algorithms by voting on (likely) popular movies. But I don’t really see why that is more likely than my conclusion, which I think makes total sense. Case closed I say!

Jamie’s Peer Review

I agree, particularly since you can find that around that time is when the IMDb apps became available. In June of 2010 (less than a year before the inflection) the Android app launched in conjunction with an IMDb Everywhere initiative, where the company made a concerted effort at expanding their presence on mobile devices. The only thing that is a little curious is that the inflection seems pretty exact. I would wonder what kind of distribution we are talking about for the inflection point. Is it always the same point? Or are we seeing the inflection as a range starting around June 2010? It would be weird if the initiative started in June 2010 and showed no effectiveness for half a year before seeing a dramatic effect all at one time. That would still raise the question as to what specifically caused the dramatic effect… just curious. Probably still related to the initiative though.