🎼 SonicVerse: Music Captioning Demo

Welcome to SonicVerse, a multi-task music captioning model that generates natural-language descriptions of input audio clips.

🎵 Captions include music features such as:

  • Genre
  • Mood
  • Instrumentation
  • Vocals
  • Key

📘 Read the Paper

🖥️ Replicate locally

⚠️ Note: You can upload audio of any length, but because of compute limits on Hugging Face Spaces, we recommend keeping clips under 30 seconds unless you have a Pro account or run the model locally.
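
If you want to shorten a clip before uploading it, the following is a minimal sketch using only Python's standard-library `wave` module. It assumes the input is an uncompressed WAV file. The helper name `trim_wav` and the file names are hypothetical and are not part of SonicVerse.

```python
import wave


def trim_wav(src_path: str, dst_path: str, max_seconds: float = 30.0) -> float:
    """Copy at most max_seconds of audio from src_path to dst_path.

    Hypothetical helper (not part of SonicVerse). Returns the duration
    of the written clip in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frame_rate = src.getframerate()
        # Keep the whole clip if it is already shorter than the limit.
        n_frames = min(src.getnframes(), int(max_seconds * frame_rate))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width, and frame rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return n_frames / frame_rate


if __name__ == "__main__":
    import sys

    # Usage: python trim.py input.wav output.wav
    if len(sys.argv) == 3:
        seconds = trim_wav(sys.argv[1], sys.argv[2])
        print(f"Wrote {seconds:.1f} s of audio to {sys.argv[2]}")
```

For compressed formats such as MP3 or FLAC, a tool like ffmpeg would be needed instead; this sketch does not cover those.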