
AI Music Video Tutorial
How to Make an AI Music Video: A Degenerate Santa Christmas Story
Create a humorous Christmas music video about Santa being a degenerate before Christmas. Think smoking, gambling, running from the police… Namaste. 🎅💨
Step 1: Generating the Content
1.1 Create the Music with Suno
- Why Suno? It’s easy to use, and you own the music you create, which is perfect for personal projects. It has a great free tier, but if you want to release the songs I recommend the Pro plan, under which you keep ownership of the music even after you cancel. Here’s an affiliate link if you want to use it: Suno. Bonus: it’s cheaper than Spotify (and you own it).
- Process:
- Experiment with Suno to generate music that fits your theme and vibe.
- Adjust lyrics or create custom ones and feed them into Suno for a personalized touch.
1.2 Generate the video with ComfyUI
- Models Tried:
- Hunyuan (T2V), my favorite for this project
- Workflow:
- Explore different tools to find one that matches your desired video style. I started from the basic Hunyuan workflow example: https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/blob/main/examples/hyvideo_t2v_example_01.json
- Follow the readme on the repository: https://github.com/kijai/ComfyUI-HunyuanVideoWrapper
- At the time of this writing, it’s not on the ComfyUI Manager list.
- Once you’re generating, play around to find what prompts work for you. I found simpler sentences worked pretty well.
1.3 Generate video prompts with ChatGPT/AI
- Input your lyrics and ask ChatGPT to generate scene descriptions. Example prompt: “Create 10 descriptions for scenes of Santa behaving like a degenerate. Use ‘high quality Santa’ as a prefix.”
- Refine until your outputs align with your vision.
- Tell it to use this template and to put the result into a code block; otherwise it treats the template as formatting.
The user will provide lyrics and describe how they want the video to look. For each description, generate concise prompts based on the lyrics, describing one visual scene per prompt. Avoid any additional context or explanations. Ensure the response is formatted as follows and placed inside a code snippet:
```plaintext
positive:<description of the visual scene>
negative:text, watermark
------------
```
Time to Batch
- Use lyrics as prompts for video scenes. Example: “high quality Santa smoking a cigar in a smoky room.” You’ll want to experiment to find what works best for your prompts
- Batch generate scenes for efficiency
- Install https://github.com/ltdrdata/ComfyUI-Inspire-Pack
- Save prompts in ComfyUI\custom_nodes\ComfyUI-Inspire-Pack\prompts
- Store them in .txt files for batch processing.
positive:high quality Santa throwing snowballs at elves, laughing loudly
negative:text, watermark
------------
positive:high quality Santa sneaking extra cookies from the elf kitchen
negative:text, watermark
------------
positive:high quality Santa pouring rum into the hot chocolate pot in the workshop
negative:text, watermark
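If you would rather not hand-edit the prompt file, a small script can build it from a list of scene descriptions and check that every block has the positive/negative shape used above. This is only a minimal sketch: the function names are made up for illustration, the separator simply copies the 12-dash line from the template, and it has not been checked against how the Inspire Pack itself parses the file.

```python
import re
from pathlib import Path

SEPARATOR = "-" * 12          # same separator line as in the template
NEGATIVE = "text, watermark"  # negative prompt used throughout this tutorial


def build_prompt_file(scenes, prefix="high quality Santa", negative=NEGATIVE):
    """Return prompt-file text with one positive/negative pair per scene."""
    blocks = []
    for scene in scenes:
        scene = " ".join(scene.split())  # collapse stray newlines and spaces
        blocks.append(f"positive:{prefix} {scene}\nnegative:{negative}")
    return f"\n{SEPARATOR}\n".join(blocks) + "\n"


def parse_prompt_file(text):
    """Split prompt-file text back into (positive, negative) pairs.

    Raises ValueError if a block does not match the template, which is a
    cheap way to catch a malformed ChatGPT reply before batching.
    """
    pairs = []
    for block in re.split(r"^-{4,}[ \t]*$", text, flags=re.MULTILINE):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        if (len(lines) != 2
                or not lines[0].startswith("positive:")
                or not lines[1].startswith("negative:")):
            raise ValueError(f"malformed prompt block: {block!r}")
        pairs.append((lines[0][len("positive:"):], lines[1][len("negative:"):]))
    return pairs


def write_prompt_file(path, scenes):
    """Write the generated prompt text to a .txt file (path is up to you)."""
    Path(path).write_text(build_prompt_file(scenes), encoding="utf-8")
```

Calling `write_prompt_file("santa_scenes.txt", ["smoking a cigar in a smoky room"])` (a hypothetical file name) produces a file you can drop into the prompts folder mentioned above.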
1.4 (Optional) Video Upscaling
- Attempted but skipped due to time constraints. Next time, this step could further polish the visuals. I will update this post if I find a great option.
Step 2: Editing the Video
2.1 Assemble in CapCut
- Imported all generated clips and the music track into CapCut.
- Challenges:
- Limited AI features for editing.
- Poor drag-and-drop functionality and timeline scaling.
2.2 Trim and Clean
- Added all clips to the timeline to “bulk out” the video.
- Steps:
- Reduce noise and remove nonsensical clips.
- Resize clips if generation settings vary (try to standardize this during the generation phase to save time).
- Pick the best bits and cut quickly for coherence.
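If clips did come out at different sizes, the resize step is just scale-to-fit arithmetic. The sketch below is not a CapCut feature or anything I ran for this video. It only shows the math, assuming a hypothetical 1280x720 target frame and a made-up helper name. It rounds down to even dimensions because many video encoders expect them.

```python
def fit_within(width, height, target_w=1280, target_h=720):
    """Scale (width, height) to fit inside the target frame, keeping aspect ratio.

    Uses integer math only, then rounds each side down to an even number.
    """
    if width <= 0 or height <= 0:
        raise ValueError("clip dimensions must be positive")
    if width * target_h >= height * target_w:
        # Clip is at least as wide as the target shape: width is the limit.
        new_w = target_w
        new_h = height * target_w // width
    else:
        # Clip is taller than the target shape: height is the limit.
        new_h = target_h
        new_w = width * target_h // height
    # Round down to even values, but never below 2 pixels.
    return max(2, new_w // 2 * 2), max(2, new_h // 2 * 2)
```

For example, a square 512x512 clip would be scaled to 720x720 and need padding or cropping on the sides to fill a 1280x720 timeline.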
2.3 Export
- Exported the final video for review.
- Hit a paywall for Pro features (unexpected, but sometimes necessary to save time).
Final Thoughts
Making an AI music video is a creative and iterative process. Tools like Suno, ComfyUI, and ChatGPT simplify generation, but manual refinement in editing tools is still necessary. Next time, I’ll explore better tools for automated editing and upscaling for a smoother workflow.