Dover, Delaware is many things. Home of Delaware’s capital. Home of one of the country’s top HBCUs. Home of the Monster Mile. Once a year, though, Dover transforms from its semi-suburban glory into the Woodlands.
For 3 days each year, Dover welcomes the Firefly Festival and thousands of people with it. At CompassRed, we love how many people come and celebrate in our home state with us. In honor of that, we took a look at the Twitter timelines of 15 of the top music festivals in the world for 2019¹ — including our very own Firefly.
We’ll try to answer a few different questions about these music festivals based on what each of them tweets:
What words and phrases are used most frequently across all festivals?
What words and phrases are used most frequently by individual festivals?
What is unique about each festival?
Before anything else, though, a great place to start to understand the festivals is with the number of followers² that they have on Twitter:
There is a large disparity in the number of followers that each festival has. Ultra Music Festival dwarfs everything else, followed by a long tail. Firefly has the 6th most followers on this list, grouped relatively closely with Rolling Loud and Hangout Music Festival.
Some of these festivals tweet a lot — much more than what you would expect given their follower counts.
We can see that in an even more pronounced way by directly comparing the number of tweets made by each festival to its followers:
Ultra Music Festival and Rolling Loud jump out for very different reasons. Most of the other festivals cluster fairly closely together, with several outliers beyond them.
More interesting than the number of tweets or the number of followers, though, is the actual content of what each music festival tweets. First, we start at a high level and look at words that are used most frequently by all accounts in aggregate³.
Not surprisingly, it appears that most festivals spend time tweeting sales-related words like “tickets” or general festival-related words like “festival” and “weekend”. There are a few unexpected words in here, though, such as “ranger” and “dave”. We see similar patterns when we explore each festival individually.
Once again, we primarily see sales-related words bubble to the top. Our own Firefly Festival uses words like “woodlands” and “summer” extremely frequently⁴.
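The word counts described above can be sketched in a few lines of Python. This is a minimal illustration and not necessarily the tooling used for the charts in this post: the stop-word list, the sample tweets, and the function names `tokenize` and `word_counts` are all assumptions made for the example. Following footnote ³, it removes stop words and the festival's own name, and does no stemming.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (a real one would be much longer).
STOP_WORDS = {"the", "a", "an", "and", "are", "at", "for", "in", "is",
              "of", "on", "to", "your", "now"}

def tokenize(text, extra_stop_words=()):
    """Lowercase a tweet, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    blocked = STOP_WORDS | set(extra_stop_words)
    return [w for w in words if w not in blocked]

def word_counts(tweets, extra_stop_words=()):
    """Count how often each remaining word appears across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet, extra_stop_words))
    return counts

# Made-up tweets for illustration; not taken from any festival's timeline.
sample_tweets = [
    "Tickets are on sale now for the festival!",
    "Get your weekend tickets to the Woodlands",
]

# The festival's own name is treated as an extra stop word.
counts = word_counts(sample_tweets, extra_stop_words={"firefly"})
print(counts.most_common(3))
```

Running `word_counts` once over every festival's tweets gives the aggregate view, and running it once per handle gives the per-festival view.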
We get more clarity about the way that each festival uses Twitter by analyzing phrases, specifically bigrams, that occur in the Tweets. Again, we look at all of the bigrams in aggregate for all festivals.
While we still see many of the same sales words as before, such as “buy tickets” and “single day”, individual marketing campaigns also begin to emerge. Specifically, we see “ranger dave” spike to the front of the list. In addition, we see “happy birthday” percolate to the top.
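Counting bigrams works the same way as counting single words, except that we count adjacent pairs. The sketch below is an assumption about how this could be done, not a record of the exact method behind the charts: it pairs up neighboring words first and then discards any pair that contains a stop word. The stop-word list, sample tweets, and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "at", "for", "in", "is",
              "of", "on", "to", "your"}

def bigrams(text):
    """Return adjacent word pairs from one tweet, skipping pairs with a stop word."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = zip(words, words[1:])
    return [f"{a} {b}" for a, b in pairs
            if a not in STOP_WORDS and b not in STOP_WORDS]

def bigram_counts(tweets):
    """Count every bigram across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(bigrams(tweet))
    return counts

# Made-up tweets for illustration only.
sample_tweets = [
    "Buy tickets today and meet Ranger Dave",
    "Ranger Dave says buy tickets for a single day",
]

pair_counts = bigram_counts(sample_tweets)
print(pair_counts.most_common(2))
```

Pairing before filtering matters: if stop words were removed first, words that were never neighbors in the tweet (such as "today" and "meet" above) would be glued together into a bigram that nobody actually wrote.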
More patterns emerge when we separate commonly used phrases by festival.
It starts to become clear that each festival uses their Twitter differently, though some broad patterns begin to emerge.
Some festivals use their Twitter account to sell tickets in a very direct way by frequently using phrases like “buy tickets”. BottleRock Napa is a great example of this:
Others love to talk about the different events that they host during the festivals. Hangout Festival does this often, using phrases like “thursday kickoff” and “kickoff party” to promote its Kickoff:
Of course, just taking raw counts of words and phrases only tells part of the story. Some words and phrases get used much more frequently by all of these Twitter handles than others, with sales words like “ticket” and generic music words like “festival” being prime examples.
It would be much more interesting to understand what words and phrases each festival uses that are unique to them. We want to give less weight to words that are used by everyone and more weight to words that are unique. We can use a technique called tf-idf (term frequency-inverse document frequency) to find these words and phrases. The tf-idf technique assigns a score to every word and phrase for each Twitter handle. In this case, the higher the score, the more unique the word or phrase is to the Twitter handle that it came from.
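To make the scoring concrete, here is a minimal sketch of one common tf-idf formulation, treating all of a festival's tweets as a single "document". We assume term frequency is a phrase's count divided by the handle's total phrase count, and inverse document frequency is the natural log of the number of handles divided by the number of handles that use the phrase. Other variants exist, and the handle names and counts below are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(term_counts_by_doc):
    """Score each term in each document with tf * idf.

    term_counts_by_doc maps a document name (here, a festival handle)
    to a Counter of term -> raw count.
    """
    n_docs = len(term_counts_by_doc)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for counts in term_counts_by_doc.values():
        doc_freq.update(counts.keys())

    scores = {}
    for doc, counts in term_counts_by_doc.items():
        total = sum(counts.values())
        scores[doc] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        }
    return scores

# Made-up counts for two hypothetical handles; not the data behind this post.
demo_counts = {
    "festival_a": Counter({"buy tickets": 6, "ranger dave": 4}),
    "festival_b": Counter({"buy tickets": 5, "kickoff party": 5}),
}

demo_scores = tf_idf(demo_counts)
top_phrase = max(demo_scores["festival_a"], key=demo_scores["festival_a"].get)
print(top_phrase)
```

In this toy example “buy tickets” is the most frequent phrase for the first handle, but because both handles use it, its idf is log(1) = 0 and its score drops to zero. The phrase only one handle uses rises to the top, which is exactly the re-weighting we want.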
Here’s a look at the top 5 most unique phrases used by each festival’s Twitter.
The text, the fill color, and the length of each bar give an indication of how unique each phrase is to that Twitter handle.
The text is the actual tf-idf score for the phrase. The higher the value, the more unique the phrase is. This score indicates relative uniqueness, but does not mean anything on its own. We can tell from this number, though, that the phrase “nathan zucker” from Bonnaroo is approximately as unique as the phrase “jamcellars ballroom” from BottleRock Napa⁶ because they both have the same score.
Fill color lets us better understand how unique each phrase is compared to all phrases. Not surprisingly, we see that “ranger dave” is dark red, indicating that it is an extremely unique calling card for the Outside Lands Festival.
Bar length lets us better understand how unique each phrase is compared to other phrases used by that Twitter handle. For example, “atlantic campout” and “summer passes” are both more unique than some of the other phrases used by Firefly Festival in their tweets.
With this technique we get a much better understanding of what makes each festival unique. For example, I can quickly tell that Coachella has two weekends and really likes to push their Activities Tent. I could also tell you that Lollapalooza takes place at Grant Park, specifically around Buckingham Fountain, and is a 4-day festival.
Looking more deeply at the top 15 phrases by tf-idf scores for the Firefly Festival, we start to see some of the key components of what makes Firefly what it is.
The idea of camping is a recurring theme. St. Jude feels out of place initially, but on further research we find that St. Jude Children’s Research Hospital was Firefly Festival’s charity of choice for at least 7 years.
Overall, here is what we learned by diving into the Twitter timelines of some top music festivals:
By and large, festivals promote their tickets using Twitter. All festivals frequently use phrases like “buy tickets” or “music festival”.
On an individual level, festivals have different strategies about how they use Twitter. Some almost exclusively use Twitter to sell tickets. Others focus more on the events and activities that will happen at their festivals. The last group uses specific marketing campaigns to promote their festival (shoutout to ranger dave).
When we evaluate words and phrases for how unique they are in the context of the Twitter handles, we can build a reasonably strong profile of each festival. In some cases, that means that themes like location emerge. In others, we get a better understanding of partnerships and activities that make each festival unique.
While we were able to quickly gain an understanding of what top festivals were talking about on Twitter, we could also expand this work into a wide variety of industries. Text analytics and natural language processing techniques offer a window into better understanding documents (of all varieties, not just tweets) and quantifying trends. For example, instead of ingesting data about music festivals, we could ingest data about industry competitors in order to get a better understanding of how they market themselves. These techniques only scratch the surface of what is possible with text analytics, allowing us to systematically gain insights about text that we see around us every day.
¹ When we say “top music festivals”, what we mean is “top music festivals according to this article”. This article lists 25 festivals. We focused on 15, which we got to after eliminating any music festivals that had fewer than 30,000 followers on Twitter, any international festivals, and any festivals that are new this year. In addition, we stuck with the limits of the Twitter API to gather this data, which means that we were limited to the last ~3,200 tweets for each festival.
² As of June 18, 2019.
³ This is after taking out stop words, the name of each festival, and some variants from the tweets. Additional cleaning, such as stemming, was not performed for this exercise.
⁵ Important note: I also love ranger dave. ranger ruth is pretty sweet too.