The inspiration for this post is a joint venture by both me and my husband, and its genesis lies more than 15 years in our past. One of the recurring conversations we have in our relationship (all long-term relationships have these, right?!) is about song lyrics and place names. I think the first time we ever had this conversation was in the late 1990s and was about Baltimore. “Why do so many songs talk about Baltimore?” we asked each other. “And why does it always sound so miserable there?” At the time we were listening to a lot of Lyle Lovett, and Counting Crows was on the radio a lot.

We have continued to have this conversation many, many times over our years together, noticing state and city names in song lyrics and wondering if or why certain places are mentioned more often. Are certain locations mentioned in song lyrics at a higher rate, perhaps at a higher rate relative to their population? I’ve recently realized that I know of pretty good data sets to make a stab at answering this, so let’s go!

Downloading Population Data for U.S. States

For this first blog post, I am only going to look at mentions of state names, so let’s download state population data from the U.S. Census. I use Census data from the American Community Survey for my work, so let’s use the acs package to find the most recent total population estimates for each state. (If you haven’t used the acs package before, you will need to get an API key and run the package’s key-installation function one time to install your key on your system.)

    library(acs)

What do we have here, just to check?

    pop_df %>% …

There we go! We now have a data frame ready to go with the state names and their corresponding populations.

Song Lyrics

For a data set of song lyrics, I am going to use the compilation of Billboard’s Year-End Hot 100 from 1958 to the present put together by Kaylin Walker. Her analysis is wonderful and so fun, and she has the data as well as her code for scraping/analysis on GitHub.

This is a data set of pop lyrics; this means that a) my beloved Lyle Lovett is not in it and b) it is certainly going to be biased in certain ways compared to other genres when it comes to mentions of place names.

    #     Rank Song        Artist                        Year Source state_name
    # 1      1 wooly bully sam the sham and the pharaohs 1965      3 sam
    # 2      1 wooly bully sam the sham and the pharaohs 1965      3 the
    # 3      1 wooly bully sam the sham and the pharaohs 1965      3 sham
    # 4      1 wooly bully sam the sham and the pharaohs 1965      3 miscellaneous
    # 5      1 wooly bully sam the sham and the pharaohs 1965      3 wooly
    # 6      1 wooly bully sam the sham and the pharaohs 1965      3 bully
    # 7      1 wooly bully sam the sham and the pharaohs 1965      3 wooly
    # 8      1 wooly bully sam the sham and the pharaohs 1965      3 bully
    # 9      1 wooly bully sam the sham and the pharaohs 1965      3 sam
    # 10     1 wooly bully sam the sham and the pharaohs 1965      3 the

The variable state_name in this data frame contains all the possible words and bigrams that might be state names in all the lyrics. Now we can use an inner join to find all the state names that are actually there.

    inner_join(tidy_lyrics, pop_df)

    # A tibble: 526 × 7
    #     Rank Song               Artist         Year Source state_name  pop2014
    # 1     12 king of the road   roger miller   1965      1 maine       1328535
    # 2     29 eve of destruction barry mcguire  1965      1 alabama     4817678
    # 3     49 california girls   the beach boys 1965      3 california 38066920
    # 4     49 california girls   the beach boys 1965      3 california 38066920
    # 5     49 california girls   the beach boys 1965      3 california 38066920
    # 6     49 california girls   the beach boys 1965      3 california 38066920
    # 7     49 california girls   the beach boys 1965      3 california 38066920
    # 8     49 california girls   the beach boys 1965      3 california 38066920
    # 9     49 california girls   the beach boys 1965      3 california 38066920
    # 10    49 california girls   the beach boys 1965      3 california 38066920

Let’s only count each state once per song that it is mentioned in.

    tidy_lyrics %>% distinct(Rank, Song, Artist, Year, …)

    #    Rank Song                        Artist                          Year
    # 1    12 king of the road            roger miller                    1965
    # 2    29 eve of destruction          barry mcguire                   1965
    # 3    49 california girls            the beach boys                  1965
    # 4    10 california dreamin          the mamas the papas             1966
    # 5    77 message to michael          dionne warwick                  1966
    # 6    61 california nights           lesley gore                     1967
    # 7     4 sittin on the dock of the bay otis redding                  1968
    # 8    10 tighten up                  archie bell the drells          1968
    # 9    25 get back                    the beatles with billy preston  1969

Now, I am going to use my vast knowledge of pop culture here and suggest that these mentions of New York are referencing New York City, not the state of New York, as lovely as it may be. I’ll keep them in for now, but we should be aware of that. Also, I am a bit surprised the numbers are this low overall; this makes me long for BIGGER DATA.

Let’s calculate a number relative to the population of each state (mentions per million population).

I was a little surprised that Maine was so high, so I checked on those.
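The counting steps described in this post (inner join against the state names, one mention per state per song, then mentions per million population) can be sketched in a few lines of dplyr. This is a minimal illustration, not the code from the original analysis: the data frame names `tidy_lyrics` and `pop_df` and the columns `Rank`, `Song`, `Artist`, `Year`, `state_name`, and `pop2014` follow the output shown in the post, while the toy rows and the names `n_songs`, `rate`, and `state_rates` are assumptions made for the example.

```r
library(dplyr)

# Toy stand-in for the population table; figures copied from the post's output
pop_df <- tibble(
  state_name = c("maine", "alabama", "california"),
  pop2014    = c(1328535, 4817678, 38066920)
)

# Toy stand-in for the tokenized lyrics: one row per word or bigram,
# so a state can appear several times within a single song
tidy_lyrics <- tibble(
  Rank   = c(12, 29, 49, 49, 49, 1),
  Song   = c("king of the road", "eve of destruction", "california girls",
             "california girls", "california girls", "wooly bully"),
  Artist = c("roger miller", "barry mcguire", "the beach boys",
             "the beach boys", "the beach boys",
             "sam the sham and the pharaohs"),
  Year   = c(1965, 1965, 1965, 1965, 1965, 1965),
  state_name = c("maine", "alabama", "california",
                 "california", "california", "wooly")
)

state_rates <- tidy_lyrics %>%
  # keep only tokens that match a state name, and attach the population
  inner_join(pop_df, by = "state_name") %>%
  # count each state at most once per song
  distinct(Rank, Song, Artist, Year, state_name, .keep_all = TRUE) %>%
  # number of songs mentioning each state (n_songs is a hypothetical name)
  count(state_name, pop2014, name = "n_songs") %>%
  # songs per million residents (rate is a hypothetical name)
  mutate(rate = n_songs / (pop2014 / 1e6)) %>%
  arrange(desc(rate))

print(state_rates)
```

With only one song per state in the toy data, the ranking is driven entirely by population, which is the same effect that can push a small state like Maine toward the top of a per-capita list.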