My son got an Amazon Echo for Christmas. We use the Echo mostly to play music. I have setup IFTTT (If This Then That) to save the name of any song we play in a Google Sheet.
Between December 26 and January 1st we played a little over 1000 songs. Most of the time I would just say something like "Alexa, play 80s music" or "Alexa, play 70s music" this is why you might see songs from the same period played in a row
It is no coincidence that a lot of George Michael songs were played, he died on Christmas day. The most played song was requested by my youngest son Nicholas, he loves Demons by Imagine Dragons
I decided to import the Alexa data into SQL Server and run some queries. If you want to follow along, you can get the file here from GitHub: Songs played by Alexa
I exported the Google Sheet to a tab delimited file, I saved this file on my C drive, I created a table and did a BULK INSERT to populate this table with the data from this file
USE tempdb GO CREATE TABLE AlexaSongs(PlayDate varchar(100), SongName varchar(200), Artist varchar(200), Album varchar(200)) GO BULK INSERT AlexaSongs FROM 'c:\Songs played with Alexa.tsv' WITH ( FIELDTERMINATOR =' ', ROWTERMINATOR = '\n' );
The date in the file is not a format that can be converted automatically, it looks like this December 26, 2016 at 09:53AM
I decided to add a date column and then convert that value with T-SQL. I did this by using the REPLACE function and replacing ' at ' with ' '
ALTER TABLE AlexaSongs ADD DatePlayed datetime GO UPDATE AlexaSongs SET DatePlayed = CONVERT(datetime, replace(playdate,' at ',' ')) GO
Now that this is all done, we can run some queries
What is the artist which we played the most?
SELECT Artist, count(SongName) As SongCount FROM AlexaSongs GROUP BY Artist ORDER BY SongCount DESC
Artist | SongCount |
George Michael | 33 |
Nirvana | 32 |
Imagine Dragons | 22 |
Josh Groban | 19 |
Eagles | 17 |
Stone Temple Pilots | 17 |
Mariah Carey | 16 |
Meghan Trainor | 15 |
Simon & Garfunkel | 13 |
Pearl Jam | 12 |
As you can see that is George Michael
How about if we want to know how many unique songs we played by artist?
SELECT Artist, count(DISTINCT SongName) As DistinctSongCount FROM AlexaSongs GROUP BY Artist ORDER BY DistinctSongCount DESC
Artist | DistinctSongCount |
Nirvana | 25 |
Stone Temple Pilots | 16 |
George Michael | 15 |
Eagles | 12 |
Simon & Garfunkel | 12 |
Josh Groban | 12 |
Mariah Carey | 11 |
Michael Bubl+¬ | 9 |
Snoop Dogg | 9 |
Harry Connick Jr. | 9 |
In this case Nirvana wins
How about the 10 most played songs? To answer that question and grab ties, we can use WITH TIES
SELECT TOP 10 WITH TIES Artist, SongName, COUNT(*) As SongCount FROM AlexaSongs GROUP BY Artist,SongName ORDER BY SongCount DESC
Here are the results
Artist | SongName | SongCount |
Imagine Dragons | Radioactive | 12 |
Jason Mraz | I'm Yours | 9 |
Pearl Jam | Yellow Ledbetter | 6 |
Josh Groban | When You Say You Love Me | 5 |
Oasis | Wonderwall (Remastered) | 4 |
House Of Pain | Jump Around [Explicit] | 4 |
Meghan Trainor | Lips Are Movin | 4 |
Imagine Dragons | Round And Round | 4 |
Nirvana | Smells Like Teen Spirit | 4 |
Sir Mix-A-Lot | Baby Got Back [Explicit] | 4 |
George Michael | Careless Whisper | 4 |
George Michael | Faith (Remastered) | 4 |
George Michael | Father Figure | 4 |
George Michael | Freedom! '90 | 4 |
So what other interesting queries can you come up with? How about how many Christmas related songs were there? Would the query look something like this?
SELECT TOP 10 WITH TIES Artist, SongName, COUNT(*) As SongCount FROM AlexaSongs WHERE SongName LIKE '%christmas%' OR SongName LIKE '%xmas%' OR SongName LIKE '%santa%' GROUP BY Artist,SongName ORDER BY SongCount DESC
Maybe you would want to know how many songs you played per day?
SELECT CONVERT(date, DatePlayed) as TheDate, count(*) FROM AlexaSongs GROUP BY CONVERT(date, DatePlayed) ORDER BY TheDate
Or maybe you want to know how many songs with the same title were sung by more than 1 artist?
Is this what the query would look like?
SELECT SongName, count(DISTINCT Artist) As SongCount FROM AlexaSongs GROUP BY SongName HAVING COUNT(*) > 1 ORDER BY SongCount DESC
If you want the song as well as the artist, you can use a windowing function with DENSE_RANK
;WITH cte AS( SELECT Artist, SongName, DENSE_RANK() OVER (PARTITION BY SongName ORDER BY Artist ) AS SongCount FROM AlexaSongs ) SELECT * FROM cte WHERE SongCount > 1
That is all for this post, I will keep collecting this data till next Christmas and hopefully will be able to run some more interesting queries
No comments:
Post a Comment