I’m a great fan of dogfooding - the practice of using your own tools to get things done - so over the weekend I decided to make an AI-generated video and copy test it.
The brief was for a five-minute, meditative video that would be relaxing to put on in the background.
I started by using ChatGPT to brainstorm ideas and structure the video. I described the theme I wanted, and ChatGPT quickly generated a list of possible scenes. Once I had a clearer vision, I asked ChatGPT to help me lay out a basic storyboard, organizing the scenes into a logical flow that felt both dynamic and visually engaging.
I then asked ChatGPT to write a prompt for each scene that I could use with Runway.ML. If you haven’t used Runway before, it’s a generative AI tool that allows you to turn text prompts into video clips. I fed Runway the scene descriptions from ChatGPT, and in just minutes, I had individual video clips that matched my vision. The cost of Runway was around $3 for the five minutes of video I made.
No video is complete without a soundtrack, but rather than pulling out my guitar, I tried AITubo, an AI-powered music generator, to create background music to fit the meditative style. It came up with something quite melodic and peaceful. Cost about $5.
I put it together with FFmpeg, a command-line tool I’m pretty familiar with, and gave it a watch. I liked it, my kids liked it, but what would real people think of it?
All in all, it had taken less than an hour to get my first video ready, and I’d spent less than $10.
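For anyone curious about the FFmpeg step, here is a minimal sketch of one way to join the clips and lay the music over them, using FFmpeg's concat demuxer. The file names are made up for illustration, and this is not necessarily the exact command I ran. It also assumes every clip shares the same codec and resolution, which is what lets the video stream be copied without re-encoding.

```python
from pathlib import Path


def concat_list(clips):
    """Return the text of an FFmpeg concat-demuxer list file, one clip per line."""
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)


def build_ffmpeg_command(list_file, music, output):
    """Build the FFmpeg argument list; pass it to subprocess.run() to execute."""
    return [
        "ffmpeg",
        "-f", "concat", "-safe", "0", "-i", list_file,  # input 0: clips, joined in order
        "-i", music,                                     # input 1: the soundtrack
        "-map", "0:v", "-map", "1:a",                    # video from the clips, audio from the music
        "-c:v", "copy",                                  # copy video as-is (assumes matching clips)
        "-c:a", "aac",                                   # encode the audio as AAC
        "-shortest",                                     # stop when the shorter input ends
        output,
    ]


# Hypothetical file names, for illustration only.
clips = ["scene_01.mp4", "scene_02.mp4", "scene_03.mp4"]
print(concat_list(clips), end="")
print(" ".join(build_ffmpeg_command("clips.txt", "music.mp3", "meditation.mp4")))
```

Write the list text to clips.txt and run the printed command. Nothing here calls FFmpeg itself, so it is safe to try without FFmpeg installed.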
Next stage was to test it with real humans. I used MX8 Labs to field a survey for testing. The process was super efficient:
5 minutes to program the survey.
5 minutes to test and finalize it.
1 hour to gather feedback from 200 respondents at a total cost of $500.
The survey results were available straight away, and it was interesting to see what people thought. I’d gone with a dial test, where consumers move a slider between “looks like AI” and “looks real”, and it was clear from the results that most of the video was compelling:
At 45-55 seconds we’ve got this image:
Then at three minutes we’ve got this one:
Both of these have clear glitches, caught by the wisdom of the crowds.
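To give a feel for how dial-test data can surface moments like these, here is a rough sketch that averages the slider traces second by second and flags the stretches where the crowd leans towards “looks like AI”. The 0 to 100 scale, the threshold of 50, and the sample traces are all assumptions made up for illustration, not the real survey data or the MX8 Labs export format.

```python
from statistics import mean


def flag_weak_seconds(traces, threshold=50):
    """Average all respondents' slider values per second and return the
    seconds whose mean falls below the threshold."""
    per_second = [mean(values) for values in zip(*traces)]
    return [t for t, score in enumerate(per_second) if score < threshold]


def to_ranges(seconds):
    """Collapse sorted second indices into inclusive (start, end) ranges."""
    ranges = []
    for s in seconds:
        if ranges and s == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], s)  # extend the current range
        else:
            ranges.append((s, s))            # start a new range
    return ranges


# Made-up traces: three respondents, six seconds each.
# Assumed scale: 0 = "looks like AI", 100 = "looks real".
traces = [
    [80, 75, 30, 20, 70, 85],
    [90, 80, 40, 35, 75, 80],
    [70, 85, 20, 25, 80, 90],
]
print(to_ranges(flag_weak_seconds(traces)))
```

With this sample data the script prints [(2, 3)], meaning seconds 2 to 3 are where the average dips. With 200 real traces, the same idea would point at the glitchy scenes.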
If I were automating this, I’d build the human review process into the system, and I think I’d still be creating video at well under $200 per minute.
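As a quick sanity check on that figure, here is the back-of-the-envelope arithmetic using the approximate costs quoted above. It assumes the $3 and $5 figures are exact and leaves out anything I didn't itemise, such as a ChatGPT subscription.

```python
VIDEO_MINUTES = 5

# Approximate figures from this post.
production_cost = 3 + 5  # Runway (~$3) + music (~$5)
survey_cost = 500        # 200 respondents


def cost_per_minute(total_cost, minutes):
    """Return the cost in dollars per finished minute of video."""
    return total_cost / minutes


print(cost_per_minute(production_cost, VIDEO_MINUTES))                # production only
print(cost_per_minute(production_cost + survey_cost, VIDEO_MINUTES))  # production + survey
```

That works out to about $1.60 per minute for production alone and about $101.60 per minute with the survey included, so the human testing is nearly all of the cost.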
For those interested in seeing the full video, you can view it here. I should really edit out the scenes that people didn’t like, but then it wouldn’t be an experiment, right?