Unresolved Speech?

On the latest episode of the Reconcilable Differences podcast, John Siracusa and Merlin Mann open by discussing the "sing-song" or "uptalk" openings of some YouTube videos. John mentions the LockPickingLawyer introduction as an example; to him, the end of it sounds like it's hanging or unresolved.

I was so intrigued by this that I stopped listening to the episode and started investigating. I went to the most recent episode of the LockPickingLawyer channel and grabbed the first couple seconds of audio.

You can certainly hear what John is talking about. I tried figuring out what the closest pitches were to each word, but my ears are not very well trained. So, I turned to technology instead. I found that there's a pretty cool Python package called crepe that uses a trained neural network model to estimate the fundamental frequencies of a given sound sample (as well as its confidence in each estimate). After a quick installation of the package and its dependencies, I was able to predict the pitch from the clip.

The LockPickingLawyer introduction waveform (above) and predicted pitches (below). I’ve also marked the start of the different words and the pitches of some nearby notes. I’m only showing the higher-confidence pitch predictions, which is why there are some gaps in the lower plot.

You can see how the pitches change with the different words. I also tried to see which pitches on a piano (in equal temperament tuning) the different words were closest to. To me, it seems like it starts with "This" as an F, followed by "is" as a Bb, then down to Eb for "the" and up to F (an octave below the initial "This") and staying there for "lock-pick-ing" before ending on a Db for "lawyer." Below is my attempt to render that in musical notation.

My attempt at approximating the LockPickingLawyer speech cadence with musical notation.
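
As a rough sketch of how the matching to piano notes could be done (not the script used for this post), the function below converts a frequency to the nearest equal-temperament note and reports how far off it is in cents. It assumes A4 = 440 Hz; the `threshold` value and the sample frequencies are made up for illustration and are not taken from the clip.

```python
import math

# Pitch-class names using flats, to match the note names in the text.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name with octave, offset in cents) for a frequency in Hz."""
    # Distance from A4 in semitones; MIDI numbers A4 as note 69.
    midi_exact = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi = round(midi_exact)
    cents = round(100 * (midi_exact - midi))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}", cents


def confident_only(frequencies, confidences, threshold=0.8):
    """Keep pitch estimates whose confidence meets a (hypothetical) cutoff."""
    return [f for f, c in zip(frequencies, confidences) if c >= threshold]


if __name__ == "__main__":
    # Illustrative values only; the middle estimate is dropped as low-confidence.
    for hz in confident_only([174.6, 120.0, 233.1], [0.95, 0.30, 0.90]):
        print(hz, nearest_note(hz))
```

The cents offset is a reminder that speech rarely lands exactly on a piano key, so any note name is an approximation.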

So, can we figure out why it sounds like it is ending in an unresolved way? While I was able to compensate for my lack of ear training with programming before, here's where my lack of musical knowledge feels like it's holding me back. If I had to guess, most of the pitches of this phrase sound like they could fit well into the key of Bb. You start on F ("the 5", or the dominant), then go down a fifth to the tonic Bb (1), then down another fifth to Eb (4). It then steps up to F (an octave below the starting point), but instead of resolving back up to someplace like the tonic Bb, it goes up to Db. Db is a minor third above the tonic Bb, which might suggest to the ear one of the minor modes of Bb. To my ear it sounds pretty unresolved as well, but I don't have a good answer as to why. Maybe it is because it goes from the dominant (F) to somewhere besides the tonic (Bb) (as I think Merlin suggested on the show), but that's just a guess on my part.
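
To make the interval arithmetic concrete, here is a small sketch that measures each note of the phrase in semitones above an assumed tonic of Bb. The note guesses come from the text above; the helper names are hypothetical.

```python
# Semitone position of each pitch class within the octave (C = 0).
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

INTERVAL_NAMES = ["tonic", "minor second", "major second", "minor third",
                  "major third", "perfect fourth", "tritone", "perfect fifth",
                  "minor sixth", "major sixth", "minor seventh", "major seventh"]


def interval_above(tonic, note):
    """Return (semitones, interval name) of `note` measured up from `tonic`."""
    semitones = (PITCH_CLASS[note] - PITCH_CLASS[tonic]) % 12
    return semitones, INTERVAL_NAMES[semitones]


# Word-to-note guesses from the post, measured against an assumed tonic of Bb.
PHRASE = [("This", "F"), ("is", "Bb"), ("the", "Eb"),
          ("lock-pick-ing", "F"), ("lawyer", "Db")]

if __name__ == "__main__":
    for word, note in PHRASE:
        print(word, note, interval_above("Bb", note))
```

Measured this way, the final Db sits three semitones above Bb, which is the minor third discussed above.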

Of course, this tonal sequence is from human speech, not a composed melody. Our musical expectations are built up from a lifetime of listening; how do these expectations translate from particular musical genres to hearing the speech of others? I don't really know, but I am quite interested in learning more about harmonic intervals in general and their relationships to human speech in particular.

Ranking the Butterfield Diet

The Brian Butterfield diet is a very funny sketch from The Peter Serafinowicz Show. I wanted to know what the objectively funniest options for Treat Day were, so I fired up Survey Monkey and asked the members of the Do By Friday Discord to rank all the items from the video. For some reason, thirteen people filled out the survey.

The results have been...increbidle

I averaged the results to get an average rank for each item (circles in the plot below). I've also shown individual responses (small squares). Note that better ranks are toward the right.

Results of an extremely scientific Butterfield Diet survey.
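
For anyone who wants to repeat this level of rigor, the averaging step might look like the sketch below. The rankings are made up for three imaginary respondents and are not the survey data; the function names are hypothetical.

```python
from statistics import mean


def average_ranks(responses):
    """Map each item to its mean rank across respondents (1 = funniest)."""
    items = responses[0].keys()
    return {item: mean(r[item] for r in responses) for item in items}


def ordered_best_first(avg):
    """Items sorted so the lowest (best) average rank comes first."""
    return sorted(avg, key=avg.get)


if __name__ == "__main__":
    # Made-up rankings from three imaginary respondents, not the survey data.
    toy = [
        {"item A": 1, "item B": 2, "item C": 3},
        {"item A": 2, "item B": 1, "item C": 3},
        {"item A": 1, "item B": 3, "item C": 2},
    ]
    print(ordered_best_first(average_ranks(toy)))
```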

I've also grouped the items into arbitrary tiers based on their average rank. Let's go through them together.

C-Tier

C-Tier results

Most people ranked these items in the bottom half, but there's not a whole lot of difference between them.

21. Mystery meat - Kind of a cliched term

20. Artificial bacon - Kind of a thing that exists in the world

19. Chocolate quail’s eggs - A bit surprised to see this so low - maybe the term is too long?

18. Egg ‘n’ ham slabs - Also too long

17. Garlic pudding - I think referring to hummus as garlic pudding is pretty funny, but it’s a less quotable term

16. Sandwich casserole - Mild visual joke

15. 20 cheese omelette - Not that far from actual omelettes

14. Potato grids - I was surprised to see this as low as it is, but the masses have spoken.

B-Tier

B-Tier results

Definitely some clearer separation between the bottom and top of the list.

13. Birthday pie - Amusing word substitution

12. Pasta pillows - This is not a bad name for the thing

11. McFortune cookies - Delightfully unexpected branding

10. During-dinner mints - See, you don’t have to wait until afterwards

9. Pints o’ cream - It’s got a very pleasing rhythm

8. Discount foie gras - Fancy yet inexpensive. Couldn’t possibly be a bad idea. We’re now getting to items that received some #1 votes.

A-Tier

A-Tier results

These were all pretty close to each other. We are getting to some solid jokes here.

7. Pizza (pronounced “pih-zah”) - Pronouncing words incorrectly is fun. Interestingly, this was a pretty split vote, with five votes at 14 or below and five votes putting it in the top four.

6. Fluffy ruffs - Just fun to say. But no #1 votes.

5. Hoisin crispy owl - An absurd concept for a food, and an absurd phrase.

4. Quiches lorraine - I can’t explain why I (and others, apparently) love this one so much. Is it pluralizing the first word? This is a real food; this should not be as funny as it is.

3. Pork cylinders - Pork cylinders is extremely good, but I’m still surprised it took the #3 spot.

S-Tier

S-Tier results

These are both very good and well-loved, but I probably could have put #1 in its own SS-level tier.

2. Large macs - Substituting synonyms in proper names is just fun, you know. Pretty strong support overall, though a couple of monsters ranked it in the lower half.

1. Bonbonbonbons - By far the best one to say out loud. No one ranked it below #4. It got almost half of the #1 votes in the survey. The clear winner of the Brian Butterfield Diet.

Just look at them now

So far, I've ordered everything by average rank from the survey. But I also wondered how the different ranks were sequenced throughout the video.

Sequence of tiers in the video.

While not plotted here, note that there are non-food interjections in the video after "Large macs" and "McFortune cookies."

We start out strong with several A- and B-tier items. There's a stretch of lower-tier items apart from "Large macs" (which finishes a section). The second half starts pretty strong, hitting "bonbonbonbons" two-thirds of the way through the video. We then kind of fade out at the end. I think it's interesting that the two S-tier items are spaced so evenly through the video; there was definitely some thought put into the order here.

What did we learn?

Mostly what we already knew, which is that "bonbonbonbons" is the best. But it's nice to have it confirmed by science (note: not actually science).

A Translated Guide to The Taken King Changes

Steve Lubitz of the Isometric podcast complained in their most recent episode that all the articles about the new Destiny expansion, The Taken King, were written in fairly-incomprehensible Destiny-lingo (Destinese?). I figured I'd take a shot at explaining what's new in the expansion in terms that a general gamer could readily understand.

I am by no means a hardcore Destiny player. Since I don't have a regular group that also plays (and little energy/time to try to find one), I haven't done the serious endgame content like raids or most of the other content that doesn't automatically match you with other players. I do have a character at the highest possible level, but that's about it.

I'm going to start with a summary of what Destiny is to lay the groundwork for understanding what is changing with the expansion. This will try to hit the highlights; it is not intended to be comprehensive.

What is Destiny?

Destiny is a first-person shooter that has some elements similar to massively-multiplayer online games (MMOs, like World of Warcraft). It can be played by yourself, cooperatively with a few other players, or competitively with teams of players facing each other. When you play by yourself or cooperatively, you are completing missions or scenarios against computer opponents (called player-versus-environment modes, or PvE). Competitive modes are called player-versus-player, or PvP.

You can play all this content with the same character (each player account can have up to three characters), and progress with your character carries across the different game modes. The different game modes are represented in the game as different destinations you can travel to. There are also destinations that act as social hubs where you can interact with computer-controlled vendors to get various equipment and items. You can also start and complete various other tasks, like quests, at the social hubs.

Classes

There are three character classes in Destiny - once you make this choice, you cannot change it. The classes are Titan, Warlock, and Hunter. The classes play differently because they all have different specialized abilities as well as somewhat different base strengths and weaknesses. I won't get into the particulars here, though.

Each class also has two subclass options. For example, the Titan can be either a Striker or a Defender. The Striker's signature ability ("special") is smashing the ground to cause an area-of-effect attack. The Defender's special ability is creating a shield around a small area which prevents attacks from going through. You can switch sub-classes at pretty much any time.

All the characters use the same types of weapons, though, so the basic mechanics of jumping around and shooting enemies are pretty similar across the different classes.

You also customize your character by picking one of three species, selecting whether it is male or female, and choosing among various cosmetic options. None of these choices have a direct effect on how the character plays.

Character progression

When you start a new character, it begins at level 1, meaning that you have relatively few abilities available to you at the start of the game and you are relatively weak. Like an MMO (or basically any roleplaying game), you gather experience by killing enemies and completing tasks, which eventually allows you to level up. Leveling up means that you can take on harder enemies - if your level is a few levels below that of your enemies, you will do little damage to them, and they will hit you like a freeway-load of trucks.

Gaining experience also slowly unlocks your various abilities. So, eventually you will be able to throw a grenade, and then activate your super ability, and so on. Later abilities let you do things like add a new effect to your existing super ability, or change the type of grenade you throw.

Also like an MMO, you can enhance your character by using better equipment (armor and weapons). Some equipment can't be used until you reach a particular level, especially early in the game. Better weapons intrinsically do more damage. Better armor provides more intrinsic defense and also improves your other attributes. I won't go into much detail about those attributes, but they basically let you use your abilities (throwing grenades, using the special, and performing a particular type of melee attack) more frequently.

In the current game, you continue to gain experience until you reach level 20. After that, you gain additional levels by equipping items that have an attribute called Light. All the Light from the equipment that you are currently using is added up to determine your level. In the current game (with the second expansion), the highest level you can reach by equipping items with Light is level 34. This mechanic is similar to how in MMOs you can get better tiers of equipment while still remaining at the level cap (maximum available level), but it's a little different in how the gear you wear explicitly determines your level.

Still, experience doesn't go away entirely past Level 20. Your guns and armor pieces also have abilities that can be unlocked by accumulating experience (just like your character's abilities). These abilities are things like faster reload speed for a weapon, or carrying more of a particular type of ammo for an armor piece. The abilities are tied to the particular equipment, so you only receive the benefit when the item is equipped.

Hey, are we ever going to talk about the new expansion?

Sorry it's taking a while, and I hope the terminology has not been too confusing so far. We're just trying to establish the context for understanding what's changed.

Here we go; onto the new stuff.

New sub-classes

Each of the character classes will get a new sub-class that they can switch to, and those sub-classes each have brand-new abilities. They should theoretically open up some new styles of play. This could be refreshing for long-term players, and it could provide some new niches that may be more appealing to certain players than any of the older options. We'll see how they play when the expansion comes out. At a minimum, it adds novelty.

Changes to character progression in The Taken King

In the new expansion, your character's level will be entirely determined by how much experience you've gained; it will no longer depend on the gear that you are wearing. Gear will still determine your amount of Light, which does affect how powerful your character is, but now a character's Light will be the average of the attack/defense values of all your equipped guns and armor. They are also making a couple of other equipment slots that were mostly cosmetic before contribute to this overall average.

In some sense, this makes things more complicated. Before, you could look at one number (your character's level) and have a sense for what kind of content you could handle.

However, this also meant you had to get high-level gear to feel like you made full progress on your character. With the new expansion, you can get to the maximum level just by playing the game, regardless of what gear you happen to get. You still can't do the hardest types of content without better gear, but that is no different than before. And now, when the level cap is increased, you'll be at the same starting point as everyone else, as long as you played enough to get the necessary experience.

Here's an explanation that may or may not help with understanding the difference - this move sort of changes things from a one-dimensional progression to a two-dimensional one. Before, everyone fell somewhere along the same line, with max level at the end of the line. You couldn't get past certain points on that line without getting better gear. Now, everyone can get to the end of the line just by earning experience. Gear matters, though, because it helps you go further in a new direction. It doesn't move you further along the line, but now your character can go deeper instead of just going forward. And you need to go deep enough in this new direction to tackle high-end, hard content.

Again, this is more the model that MMOs use. In those games (or at least in World of Warcraft, which I'm most familiar with), everyone can get to the maximum level by getting experience, but getting better gear allows you to do enough damage (and withstand enough damage) to complete more advanced endgame content.

I heard you can immediately get one character to level 25 in the expansion?

Yes, the expansion gives you a token that can be redeemed to get you to level 25 right away. I don't think this is necessarily the best for a player totally new to Destiny. Destiny slowly unlocks content as you level from 1 to 20. It would probably be overwhelming for a new player to get that all at once.

However, I think the "skip to level 25" feature is good for a few scenarios. If you have friends who play Destiny and you want to start playing with them right away, this could be a good way to catch up quickly. But now you'll be relying on those friends to explain more of the game to you, rather than letting the game itself do that.

The other reason I can imagine using that token is if you've played Destiny more casually and have run through most of the base missions, but haven't quite gotten to level 25 (maybe you're at 18 or 20 or so, and don't know the fastest path to getting the gear that will take you to 25). The token gives you a bit of a boost toward seeing brand-new stuff faster.

Making your weapons better

One complaint players had with earlier expansions is that new weapons made their old favorites obsolete. Coming into Destiny with MMO experience, I was kind of used to that, but people (including me) do get attached to particular equipment that they use a lot and invest a lot of effort into. So, in the expansion, you'll be able to improve an existing weapon by essentially sacrificing a weapon with better stats (along with some other materials, too).

For example, if your favorite rifle has an attack value of 170, and you find a new rifle with an attack value of 180, you can raise the attack value of your favorite to something like 178 if you're willing to destroy the new one in the process (those numbers are made up; I don't know the specific formula for the process). This way, you can keep using weapons you like without them becoming ineffective.

Questing and bounties

Bungie found that people liked doing some of the longer quest chains in the first year: they gave a sense of really accomplishing something and often got players involved in new aspects of the game. For example, I pretty much hated Destiny PvP before I got a quest where I had to get loads of PvP kills to move on - oh, and dying too much made you lose progress. This forced me to play lots of PvP matches, which meant I saw maps over and over again, and learned what kinds of behaviors got me killed faster and which ones let me collect more kills. It honestly just led me to practice PvP enough to get the competence I needed to enjoy and appreciate that type of play. Now, I actually like playing PvP (most of the time).

Anyway, quests are good and feel good to complete, so in The Taken King there will (1) be more of them, (2) be interface improvements that make tracking and turning them in easier, and (3) be better rewards for doing them. Bungie is also using quests (and bounties, which are like shorter, repeatable quests) to give players an avenue toward guaranteed rewards. Usually, you get a new armor piece or gun by either killing an enemy that randomly dropped it, or by being randomly given it at the end of a mission or PvP match. This is entirely unpredictable and undependable, so it can be very frustrating when you don't get anything you need for long stretches of time. Instead, quests in the expansion have guaranteed rewards, so you can embark on a set of activities with the confidence that it will pay off at the end.

Exotics

An Exotic weapon or piece of armor is a rare item that often has specialized abilities and is generally quite powerful relative to other equipment. Because of that, you can have at most one Exotic weapon and one Exotic armor equipped at any time. Players love Exotics and work hard to get them, but they were not sure what would happen to these hard-earned items in the expansion.

In The Taken King, some of these Exotics will have new and improved versions (with a different appearance, as well). If you have the old version of the Exotic, you can automatically buy the new version, too. Yay, better weapons and armor! You can also buy the old versions of the Exotics, as long as you found each one at least once (so now you can get an Exotic back if you dismantled your first one).

Currency

I've mentioned buying things a couple of times. There are a variety of currencies and materials in Destiny that are spent at vendors and used to upgrade equipment. I'm not going to get into all of them, but in the original game there was one major type of currency that was associated with PvE-focused vendors (and earned by PvE activities) and another type that was PvP-focused (and earned by PvP activities). In the expansion, these two are essentially merging into one currency that is used everywhere. This currency will also be shared across all your characters, which is nice; a lot of times before you ended up with a bunch of the old currency stockpiling on your most advanced character, when it would be nice to use that to improve your other, lower-level characters.

Also, in the current game you can only earn 100 of each type of the old currencies in a week, and hold at most 200 at a time. In the expansion, you'll still only be able to hold up to 200 of the new unified currency, but there are no limits on earning it. The more you play, the more you earn, as long as you keep spending it, too.

Storage space

The place you keep gear and materials you aren't using is called the Vault (not to be confused with the raid called the Vault of Glass). In the expansion, you'll have twice as much space to store weapons and more than twice as much space to store armor. People seem super-pumped about this. Like, kind of surprisingly so. At least I am surprised by the degree of enthusiasm for storage space.

Factions

Destiny has several factions that you need to earn reputation with before they will let you buy things from them. The way you earned this was by doing certain kinds of tasks while wearing a certain piece of equipment associated with that faction. Now, you just ask the faction for a badge to earn reputation with them. You can change factions once per week. Simpler!

Wrap-Up

I certainly have not covered all the expansion-related changes, but I hope this gives a more Destiny-naive reader a better sense of what is coming and why Destiny players care about it.

Will these changes help brand-new players? I'm not sure. I think they are making some quest-oriented changes to the early missions, so we will have to see if that early experience is more streamlined and does a better job at teaching the various aspects of Destiny. Destiny is a pretty sprawling game — not so much in the size of the world, exactly, but more in the number of different things you can do (well, relatively different; you're pretty much always shooting something with a gun), and the number of different avenues for improving and customizing your character. Just playing Destiny a lot is pretty much how you get familiar with those things.

However, I do think that these changes will make Destiny more pleasant and more fun for players as they accumulate that experience, and potentially smooth enough of the rough edges that could cause players to lose interest and motivation in the game. I would say that the changes are pretty much geared toward the "casual" Destiny player, but a "casual" Destiny player is still probably a player with quite a high willingness to push through a lot of stuff in their way. We'll have to see how it turns out when it launches — I am certainly looking forward to it.

Parallel Treks

Around the time the podcast Random Trek completed its first 30 episodes, a quantum fissure mysteriously appeared. Naturally, we investigated it using a subspace differential pulse and — to our surprise — we discovered 10,000 parallel timelines which all had their own versions of Random Trek.

After recalibrating the sensor array (as you do), we were able to determine which episodes have been watched so far in these parallel universes. We used these data to address some of the pressing questions from the Random Trek fanbase.

First, did our timeline's Random Trek really have an unusual number of Voyager episodes as it was starting out? For reference, ours had two of its first four episodes from Voyager, six out of its first 20, and seven out of its first 30. We can look at the distributions of the episode selections across all the timelines to see how unusual this was.

As Random Trek's host Scott McNulty might say, spoiler alert: it's not very unusual. About a quarter of the other timelines (2,420 to be precise) had two or more Voyager episodes in their first four episodes. Below, you can see the distributions of the first 20 episodes and first 30 episodes (our timeline's values are marked with red triangles). Our timeline is sitting close to the middle of the Voyager distributions. Perhaps the most unusual thing about ours here appears to be the relatively low number of Deep Space Nine episodes, as only 891 of the other 10,000 timelines had two or fewer DS9 episodes in the first 20, and 923 had four or fewer in the first 30 episodes.

Distributions by series of the first 20 episodes

Distributions by series of the first 30 episodes

One of the other timelines actually had 19 of its first 30 episodes from The Next Generation, which is the most from any one series. Interestingly, all of the 10,000 timelines had at least one TNG episode in the first 30; the same cannot be said for any other series.
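
For those without access to a quantum fissure, the parallel timelines can be approximated with a random number generator. The sketch below is a guess at the approach, not the sensor array's actual firmware: it assumes every episode and film is equally likely to be picked, never repeated, and it uses approximate episode counts per series (the show's own list may count two-parters and films differently).

```python
import random

# Approximate episode counts per series; treat these as assumptions, since
# the podcast's own list (two-parters, films) may be counted differently.
SERIES_SIZES = {"TOS": 79, "TAS": 22, "TNG": 178, "DS9": 176,
                "VOY": 172, "ENT": 98, "Films": 12}

# One entry per episode, labeled by series and index.
ALL_EPISODES = [(series, i) for series, n in SERIES_SIZES.items() for i in range(n)]


def simulate_timelines(n_timelines, n_picks, seed=None):
    """Each timeline draws n_picks distinct episodes uniformly at random."""
    rng = random.Random(seed)
    return [rng.sample(ALL_EPISODES, n_picks) for _ in range(n_timelines)]


def share_with_at_least(timelines, series, first_n, k):
    """Fraction of timelines with >= k episodes of `series` in the first first_n."""
    hits = sum(
        1 for t in timelines
        if sum(1 for s, _ in t[:first_n] if s == series) >= k
    )
    return hits / len(timelines)


if __name__ == "__main__":
    timelines = simulate_timelines(10_000, 30, seed=1)
    # Share of timelines with two or more Voyager episodes in the first four.
    print(share_with_at_least(timelines, "VOY", first_n=4, k=2))
```

Because the episode counts and the random seed are assumptions, the exact tallies will differ from the numbers quoted above, but they should land in the same neighborhood.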

Our timeline appears to have split off from the others early on. While 20 other timelines also had "Tapestry" as the first episode, none of them also had "By Any Other Name" as the second. And our timeline only shares at most 6 episodes with any of the other 10,000 timelines (and it does that with only 16 of them).

How many other timelines share some episodes with ours?

Another unusual aspect of our timeline is that we've already had both Ray Wise episodes ("Who Watches the Watchers" and "Hope and Fear") within the first 30 — only 23 of the parallel timelines have had that happen as well!

Looking at the others, we can also imagine what might have been. 447 of the other timelines have already had a "Wrath of Khan" episode by now, and 11 of those had it as their very first episode. What a start that would be. But still, we dodged a bullet that the six universes with a first episode of "Star Trek: Insurrection" did not. And, unlike ours, the fans in those 11 Khan universes probably don't believe that the random number generator is really random, either...

Graphing The Incomparable

It turns out I’m probably going to have good reasons to need to learn Python in the near future, so I’m finding little projects to practice writing scripts. The Incomparable recently wrapped up the year with a retrospective episode in which some people talked about being on more than in the past, while others mentioned they were on less. I was inspired to see if there was a way to visualize who has been on episodes together and how that has changed over time.

I collected data from the RSS feed and the web site to figure out which people were on which episodes. I could then put that into a matrix with episodes as the rows and people as the columns (with 1’s or 0’s depending on whether they were on the episode or not). Multiplying the transpose of that matrix by itself gives a new matrix that shows how many episodes people had in common. (This is just like what Kieran Healy did in his Paul Revere metadata post.)
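
A minimal sketch of that matrix step, in plain Python with a toy appearance matrix (three episodes, three hypothetical panelists). The function name is an assumption for illustration, and the real data set is of course much larger.

```python
def co_appearances(appearances):
    """Compute A^T A for a 0/1 matrix with episodes as rows, people as columns.

    Entry [i][j] counts episodes shared by person i and person j; the
    diagonal holds each person's total number of episodes.
    """
    n_people = len(appearances[0])
    return [
        [sum(row[i] * row[j] for row in appearances) for j in range(n_people)]
        for i in range(n_people)
    ]


if __name__ == "__main__":
    # Toy data: 3 episodes (rows), 3 hypothetical panelists (columns).
    episodes = [
        [1, 1, 0],
        [1, 0, 1],
        [1, 1, 1],
    ]
    for row in co_appearances(episodes):
        print(row)
```

The off-diagonal entries are the edge weights for a graph of who has appeared with whom.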

I could then use D3.js to visualize the results as a force-directed graph. I calculated the network matrices for subsets of episodes from different timeframes, so you can flip through and see how the graph changes over time. You can see the results here.

Through the Eyes of the Jackals

Example cloud generated from title submissions for The Incomparable episode 108.

With all the great podcasting content produced these days, I often end up wishing for easier ways of finding past episodes. Memory does not always serve well enough, and clicking through episode descriptions can be tedious (and unreliable for podcasts that span many topics in a single show).

Clearly it would be great to have some way of searching or browsing podcast episodes, but it’s equally clear that it’s hard to do without accurate transcriptions.

As I’ve mentioned before (in my only other post so far), live listeners of shows on the 5by5 network can submit title suggestions via Showbot, which was created by Jeremy Mack. Suggestions are almost always quotes from the shows, and during popular shows the submissions are so frequent it seems almost like a live transcript. Of course it’s not—submissions are unsurprisingly biased toward humorous or silly quotes—but I wondered if they could be used in some way.

Basing a search engine on the title suggestions would be frustrating. You would have no way of knowing if results didn't appear because the titles didn’t happen to mention those topics, or if they were truly not a part of a show. However, maybe they could be useful for browsing instead. You’d at least see how an episode was viewed through the eyes of the chatroom jackals.

I took a list of titles recorded by Showbot (the set I had on hand covered episodes from Showbot’s inception at the end of June 2011 through July 2012) and wrote a Perl script to identify the most relevant words for each episode.

After doing a bit of research, it looked like the tf-idf statistic would be a decent measure of the relevance of an individual word to an episode. Tf-idf is the product of two statistics: the term frequency (tf) and the inverse document frequency (idf). The term frequency is just the number of times a word appears in a document (in this case, how many times it appears among all the title submissions for a given episode). The inverse document frequency is calculated by dividing the number of documents (or episodes) by the number of documents the word appears in and then taking the logarithm of that ratio.

You end up with a statistic that is high if the word appears a lot in a document, but that is balanced out if the word is just common and appears in many documents. That way, you don’t end up with “the” and “a” being your top words. You can also avoid that result by manually removing so-called stop words from the text (which I also did).

Getting the word frequency is pretty straightforward in Perl:

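use Lingua::StopWords qw( getStopWords );

# Count how often each non-stopword appears across one episode's title suggestions.
sub count_words_in_titles {
    my @titles = @_;
    local $_;

    # Replace sentence punctuation with spaces, lowercase, and split into tokens.
    my @words = map { s/[.!?,*]/ /g; tokenize(lc($_)) } @titles;

    # Drop English stop words, plus the contraction fragments the tokenizer emits.
    my $stopwords = getStopWords('en');
    @{$stopwords}{qw( 's n't 'll 're 'd 've )} = (1) x 6;
    @words = grep { !$stopwords->{$_} } @words;

    my %word_counts;
    $word_counts{$_}++ for @words;

    return \%word_counts;
}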

I used a tokenize() function taken from the Lingua::BrillTagger CPAN package and based on the Penn Treebank tokenizer sed script. From there you can calculate idf as well:

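# Declared and counted here so the snippet stands alone.
my %idf;                      # word => number of episodes containing it
my $total_episode_count = 0;

for my $show (keys %show_data) {
    for my $ep (@{$show_data{$show}}) {
        $total_episode_count++;
        $ep->{wordcounts} = count_words_in_titles( @{$ep->{titles}} );

        # Track the highest count of any single word in this episode.
        $ep->{maxwordcount} = 0;

        for (keys %{ $ep->{wordcounts} }) {
            $ep->{maxwordcount} = $ep->{wordcounts}{$_} if $ep->{wordcounts}{$_} > $ep->{maxwordcount};

            # Each word is counted once per episode it appears in.
            $idf{$_}++;
        }
    }
}

# Convert document counts to idf: log(total episodes / episodes containing the word).
for (keys %idf) {
    $idf{$_} = $total_episode_count / $idf{$_};
    $idf{$_} = log($idf{$_});
}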

From there, you continue in the same vein, multiplying each word's term frequency by its idf for every episode.
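
For readers who want to see the whole calculation in one place, here is a minimal sketch in Python rather than the Perl used above. It is not the code behind the word clouds: the function name tf_idf, the episode ids, and the sample words are all made up for illustration, and it assumes the words have already been tokenized and stripped of stop words. It uses the raw count as the term frequency and the natural log for idf, as described above.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {episode: {word: tf-idf score}}.

    docs maps a (hypothetical) episode id to its list of tokenized words.
    """
    n_docs = len(docs)

    # Term frequency: raw count of each word within each episode.
    counts = {ep: Counter(words) for ep, words in docs.items()}

    # Document frequency: number of episodes each word appears in.
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())

    # Inverse document frequency: log of (episodes / episodes containing the word).
    idf = {word: math.log(n_docs / df) for word, df in doc_freq.items()}

    return {
        ep: {word: tf * idf[word] for word, tf in word_counts.items()}
        for ep, word_counts in counts.items()
    }


# Made-up example data, not real title submissions.
episodes = {
    "ep1": ["wrong", "guy", "wrong", "book"],
    "ep2": ["book", "review"],
}
scores = tf_idf(episodes)
```

In this toy example, "book" appears in both episodes, so its idf is log(2/2) = 0 and it drops out entirely, while "wrong" appears twice in only one episode and scores highest.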

To visualize the output, I used the Wordle engine to create word clouds in which the size of the word reflected its tf-idf score.

Overall, I’m pleased with the results. I’ve put together a set of word clouds for the episodes of the now-ended, much-missed Hypercritical podcast. Since I’ve listened to many of them, the clouds serve as a nice, quick visual reference that helps me find specific episodes. I also think some of them capture the character of the episode particularly well.

Episodes 67 to 69.

From episode 42: The Wrong Guy. "Textbook!" 

Show Titles

One of the fun parts of listening to 5by5 shows is helping pick show titles. Jeremy Mack created the Showbot, which lets members of the 5by5 chat room suggest titles as the shows run live (it also lets anyone vote for their favorite suggestions).

Of course, the hosts have the final say over what title gets used, and, as you can imagine, different hosts have different preferences for show titles. For example, John Siracusa, host of Hypercritical, will only consider things that are actually said on his show as possible titles. Inspired by Kieran Healy’s recent analysis of show durations, I decided to look at naming trends across several 5by5 shows and see how they have changed over time.

The main variable I looked at is the show title’s “originality.” That’s in scare quotes because I’m defining originality very narrowly -- here, it is just a reflection of the number of Google hits for the exact title phrase. Since the data span many orders of magnitude, I log-transform the number of hits (adding 0.5 to deal with zeros) and call it an “originality index.” Note that numerically smaller indices are more original, so I have reversed the y-axis on the graphs to put the indices that reflect more originality at the top.
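
As a rough sketch of that transformation (in Python; the function name originality_index is hypothetical), the index is just a log of the hit count plus 0.5. The base of the logarithm is an assumption here: base 10 is used because it is consistent with the index values quoted later in this post, but the calculation itself only requires some log transform.

```python
import math


def originality_index(hits):
    """Log-transform a Google hit count; smaller values mean more original titles.

    Assumption: base-10 logarithm. Adding 0.5 keeps zero-hit titles finite.
    """
    return math.log10(hits + 0.5)


# A title with no hits gets a negative index rather than an undefined one.
zero_hit_index = originality_index(0)

# A title with a handful of hits lands just under 1.
six_hit_index = originality_index(6)
```

Under this assumption, a title with 6 hits gets an index of about 0.81, and every tenfold increase in hits adds roughly 1 to the index.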

Unfortunately, the number of results reported by Google when you perform a standard search is not very reliable, as described by Randall Munroe of xkcd. Therefore, I used Google’s Custom Search API instead and took the number returned in the totalResults field as a data point. It also allowed me to easily automate the collection of data. There are still issues with this approach: the Custom Search API documentation helpfully notes that

The totalResults property in the objects above identifies the estimated total number of results for the search, which may not be accurate.

Another problem with using Google results is that you’d want to exclude results that refer to the episodes themselves (which in principle should not affect the index). As a crude way of trying to do that, I appended -5by5 to all searches, which eliminates many, but not all, episode-related results.
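
To make the search step concrete, here is a hedged sketch in Python of how such a query might be assembled and how the count might be read back. It is not the script used for this post. The function names, the placeholder key and engine id, and the sample response are invented, and the response layout (a queries object whose request entry carries totalResults as a string) is an assumption based on the Custom Search JSON API. No network request is made.

```python
import json
from urllib.parse import urlencode

# Endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(title, api_key, engine_id, exclude="5by5"):
    """Build a request URL for the exact title phrase, excluding one term."""
    # Quotes ask for the exact phrase; the leading minus excludes a term.
    query = f'"{title}" -{exclude}'
    params = {"key": api_key, "cx": engine_id, "q": query}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def total_results(response_text):
    """Pull the estimated hit count out of a JSON response body."""
    data = json.loads(response_text)
    # Assumed layout: the count is reported as a string of digits.
    return int(data["queries"]["request"][0]["totalResults"])


# Placeholder credentials and a made-up response, for illustration only.
url = build_search_url("The Wrong Guy", "YOUR_API_KEY", "YOUR_ENGINE_ID")
sample_response = '{"queries": {"request": [{"totalResults": "1234"}]}}'
hits = total_results(sample_response)
```

The resulting count is only an estimate, as the documentation quoted above warns, so it is best treated as an order-of-magnitude figure.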

Clearly the methodology has some flaws. Nevertheless, the results are interesting.

You can see that many shows started out with more commonplace titles, but quickly settled into using rather unusual ones. Hypercritical and The Critical Path have used less original titles than other shows, although this has varied over time (it seems that we are experiencing a recent drop with Hypercritical, though -- maybe the creative juices have all been channeled into a certain Mac OS X review).

One factor that could affect the originality index is title length; you would probably expect longer titles to be more unusual than shorter ones, all else being equal. We do see that in the data, although the relationship between title length and originality index varies somewhat by show. For example, the indices for Back to Work titles are less dependent on the length than those of Hypercritical or Build and Analyze, but Build and Analyze titles tend to be more original than Hypercritical titles of the same length (and both more original than Critical Path titles).

The Incomparable tends to have longer titles than the other shows, most of which average around four words per title. The Incomparable titles might be longer because that podcast uses the Showbot less often, and the Showbot limits suggestions to 40 characters.

Now, I’ve focused on the originality index here, but that isn’t the only factor that makes a good or apt title. There are clear counterexamples, like Hypercritical #42. This episode uses the fairly common phrase “The Wrong Guy” (index = 5.57) as its title. But the episode is about Walter Isaacson’s biography of Steve Jobs, and “The Wrong Guy” nicely sums up John’s criticism of the book.

On the other hand, “The Bridges of Siracusa County” (index = 0.81) is tough to beat...