So the whole of Spotify has been scraped and archived (music files along with metadata), just 300TB… thoughts?
Also if anyone needs a side hosting project…
Also, for anyone interested, here’s the link: Backing up Spotify - Anna’s Blog
Honestly, the wild part isn’t the scrape, it’s that 300 TB is now ‘side-project sized’; the real challenge is bandwidth, cost, and keeping it usable rather than just hoarded
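For a rough sense of scale on the bandwidth and storage side, here is a back-of-the-envelope sketch in Python. The link speeds and drive sizes are assumptions picked for illustration, not figures from the blog post. It treats 1 TB as 10^12 bytes and ignores protocol overhead, redundancy and filesystem overhead.

```python
import math

ARCHIVE_TB = 300  # total size mentioned in the thread


def transfer_days(size_tb: float, link_mbps: float) -> float:
    """Days to move size_tb terabytes over a sustained link of link_mbps megabits/s."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86400


def drives_needed(size_tb: float, drive_tb: float) -> int:
    """Minimum number of drives of drive_tb capacity, with no redundancy."""
    return math.ceil(size_tb / drive_tb)


# Assumed link speeds (Mbit/s) and drive sizes (TB), purely illustrative.
for mbps in (100, 1000):
    print(f"{mbps} Mbit/s: {transfer_days(ARCHIVE_TB, mbps):.1f} days")
for tb in (4, 8):
    print(f"{tb} TB drives: {drives_needed(ARCHIVE_TB, tb)}")
```

Under these assumptions a sustained 1 Gbit/s link needs roughly four weeks, and you would need 38 drives at 8 TB each before any redundancy.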
The quality is not good tbh. The maximum bitrate is 160kbps VBR, and that's only for popular releases. For unpopular releases it's just 75kbps VBR.
That’s because it’s the bitrate of Spotify’s “High” quality setting. So 160kbps OGG is the max they can get without Spotify Premium. I guess they had to use throwaway accounts to complete the archival, so getting Premium wouldn’t have been possible.
Although the quality isn’t that high, it won’t be noticeable to most people and it’s still good enough for archival purposes.
Woah, this is so interesting!
I do think that it may not be at the highest bit-rate possible; but still interesting how small the library is…
Would be interested to see if there are any smaller chunks that can be downloaded for people who don’t have a home datacenter.
Have they released the actual files yet?
Tbh, before the current drive price hike, even normal users including me had a few spare 4-8 TB drives, so it's not that big, relatively speaking, for someone who does such stuff.
I do think that it may not be at the highest bit-rate possible
The details are already mentioned on the page:
For popularity>0, we got close to all tracks on the platform. The quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify, as well as a metadata file with original hashes and checksums).
For popularity=0, we got files representing about half the number of listens (either original or a copy with the same ISRC). The audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.
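To put those two bitrates into file sizes, here is a quick sketch. The 210-second track length is an assumption for illustration, and since both encodings are variable bitrate, real files will deviate from these nominal numbers.

```python
def track_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Nominal audio size in MB (10**6 bytes), ignoring container and metadata overhead."""
    return bitrate_kbps * 1000 / 8 * duration_s / 1e6


DURATION_S = 210  # assumed 3.5 minute track, not a figure from the blog post

vorbis_mb = track_size_mb(160, DURATION_S)  # popularity>0: OGG Vorbis at 160kbit/s
opus_mb = track_size_mb(75, DURATION_S)     # popularity=0: OGG Opus at 75kbit/s

print(f"160 kbit/s: {vorbis_mb:.2f} MB per track")
print(f" 75 kbit/s: {opus_mb:.2f} MB per track")
```

So under that assumption a popularity=0 track takes a bit under half the space of a popularity>0 one.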
Have they released the actual files yet?
The data will be released in different stages on our Torrents page:
Again mentioned on the page, though it would be extraordinary if there are no roadblocks introduced prior to release.
[X] Metadata (Dec 2025)
[ ] Music files (releasing in order of popularity)
[ ] Additional file metadata (torrent paths and checksums)
[ ] Album art
[ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)
The website is hosted in Russia, I think, and they have hosted such stuff before with their other website, so they should be able to pull it off imo.
Yeah, Anna’s Archive is my go-to place for getting any papers or books.
300TB
now I know where the drives are going
Spotify’s library, representing around 37% of all songs but 99.9% of all listens.
Seems it is more than a roadblock.