One of the fun parts of listening to 5by5 shows is helping pick show titles. Jeremy Mack created the Showbot, which lets members of the 5by5 chat room suggest titles as the shows run live (it also lets anyone vote for their favorite suggestions).

Of course, the hosts have the final say over what title gets used, and, as you can imagine, different hosts have different preferences for show titles. For example, John Siracusa, host of Hypercritical, will only consider things that are actually said on his show as possible titles. Inspired by Kieran Healy’s recent analysis of show durations, I decided to look at naming trends across several 5by5 shows and see how they have changed over time.

The main variable I looked at is the show title’s “originality.” That’s in scare quotes because I’m defining originality very narrowly: here, it is just a reflection of the number of Google hits for the exact title phrase. Since the data span many orders of magnitude, I log-transform the number of hits (adding 0.5 to deal with zeros) and call the result an “originality index.” Note that a numerically smaller index means a more original title, so I have reversed the y-axis on the graphs to put the more original titles at the top.
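As a rough illustration of the transform described above, here is a minimal Python sketch. The function name is my own, and the base of the logarithm is an assumption: the post only says "log-transform", and base 10 is used here because it matches "orders of magnitude" language.

```python
import math


def originality_index(hits: int) -> float:
    """Log-transform a hit count; a smaller value means a more unusual title."""
    if hits < 0:
        raise ValueError("hit count cannot be negative")
    # Adding 0.5 keeps the logarithm defined when a search returns zero hits.
    # Base 10 is an assumption; the post does not state the base.
    return math.log10(hits + 0.5)


# A title with no hits gets a negative index, and the index grows with hits.
for hits in (0, 6, 1_000_000):
    print(hits, round(originality_index(hits), 2))
```

Because the index grows with the number of hits, a plot of it needs a flipped y-axis to put the rarest titles on top, which is what the reversed axis on the graphs does.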

Unfortunately, the number of results reported by Google when you perform a standard search is not very reliable, as described by Randall Munroe of xkcd. Therefore, I used Google’s Custom Search API instead and took the number returned in the totalResults field as a data point. It also allowed me to easily automate the collection of data. There are still issues with this approach: the Custom Search API documentation helpfully notes that

“The totalResults property in the objects above identifies the estimated total number of results for the search, which may not be accurate.”

Another problem with using Google results is that you’d want to exclude results that refer to the episodes themselves (which in principle should not affect the index). As a crude way of trying to do that, I appended -5by5 to all searches, which eliminates many, but not all, episode-related results.
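The query construction described above might look something like the following Python sketch. It is not the script used for this post. The helper names, the placeholder API key, and the placeholder search engine ID are hypothetical, and reading the count from `searchInformation.totalResults` is an assumption about which object in the response was used. The sketch only builds the request URL and parses a response, so it runs without network access.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(title: str, api_key: str, engine_id: str) -> str:
    """Build a request URL for an exact-phrase search that excludes '5by5'."""
    # Quoting the title asks for the exact phrase; "-5by5" drops pages
    # that mention 5by5, a crude filter for episode-related results.
    query = f'"{title}" -5by5'
    params = {"key": api_key, "cx": engine_id, "q": query}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_total_results(response: dict) -> int:
    """Pull the estimated hit count out of a decoded JSON response."""
    # The API reports totalResults as a string, so convert it to an int.
    return int(response["searchInformation"]["totalResults"])


# Placeholder credentials; a real key and engine ID would be needed to fetch.
url = build_search_url("The Wrong Guy", "YOUR_API_KEY", "YOUR_ENGINE_ID")
print(url)

# Hand-written stand-in for a decoded response, not real data.
sample_response = {"searchInformation": {"totalResults": "123"}}
print(extract_total_results(sample_response))
```

Fetching the URL and decoding the JSON body is left out on purpose; any HTTP client would do for that step.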

Clearly the methodology has some flaws. Nevertheless, the results are interesting.

You can see that many shows started out with more commonplace titles, but quickly settled into using rather unusual ones. Hypercritical and The Critical Path have used less original titles than other shows, although this has varied over time. (Hypercritical seems to be experiencing a recent drop, though; maybe the creative juices have all been channeled into a certain Mac OS X review.)

One factor that could affect the originality index is title length; you would probably expect longer titles to be more unusual than shorter ones, all else being equal. We do see that in the data, although the relationship between title length and originality index varies somewhat by show. For example, the indices for Back to Work titles are less dependent on length than those of Hypercritical or Build and Analyze, but Build and Analyze titles tend to be more original than Hypercritical titles of the same length (and both are more original than The Critical Path titles).
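One simple way to put a number on "how much the index depends on length" is a least-squares slope of index against word count, computed separately for each show. The Python sketch below is only an illustration of that idea, not necessarily how the graphs here were fit. The function names are mine, and the sample rows are invented placeholders rather than real episode data.

```python
from collections import defaultdict


def word_count(title: str) -> int:
    """Count whitespace-separated words in a title."""
    return len(title.split())


def slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all titles have the same length; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slopes_by_show(rows: list[tuple[str, str, float]]) -> dict[str, float]:
    """Group (show, title, index) rows by show and fit one slope per show."""
    grouped = defaultdict(lambda: ([], []))
    for show, title, index in rows:
        grouped[show][0].append(word_count(title))
        grouped[show][1].append(index)
    return {show: slope(xs, ys) for show, (xs, ys) in grouped.items()}


# Invented placeholder rows: (show, title, originality index).
rows = [
    ("Show A", "One", 5.0),
    ("Show A", "One Two", 3.0),
    ("Show A", "One Two Three", 1.0),
    ("Show B", "One", 2.0),
    ("Show B", "One Two Three", 1.0),
]
print(slopes_by_show(rows))
```

A more negative slope means the index falls faster as titles get longer, so a show whose slope is close to zero is one where length matters less.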

The Incomparable tends to have longer titles than the other shows, most of which average around four words per title. The Incomparable titles might be longer because that podcast uses the Showbot less often, and the Showbot limits suggestions to 40 characters.

Now, I’ve focused on the originality index here, but that isn’t the only factor that makes a good or apt title. There are clear counterexamples, like Hypercritical #42. This episode uses the fairly common phrase “The Wrong Guy” (index = 5.57) as its title. But the episode is about Walter Isaacson’s biography of Steve Jobs, and “The Wrong Guy” nicely sums up John’s criticism of the book.

On the other hand, “The Bridges of Siracusa County” (index = 0.81) is tough to beat...
