hey, thanks for sharing about your documentary series. would love to check it out if you don't mind linking it!
we don't yet support that volume of footage (1TB). however, if you'd like to try this at a smaller scale, you can already do it today with the Rough Cut tile: simply prompt it for the moments you're interested in (it can take visual, auditory, timestamp, and script cues) and it will create an initial rough cut or assembly edit for you.
I'd also recommend checking out the new Motion Graphics tile we added for animations. You can also single-point generate motion graphics using the utility on the bottom right of the timeline. Let me know if you have any questions on that.
An additional suggestion for OP, working with large video archives:
- Batch transcode your videos to smaller proxy files, preserving the same file names (to allow easy re-linking to full-quality media later)
- Upload proxies to Mosaic
- Do your agentic rough cut with Mosaic
- Export EDL or NLE project file
- In your NLE, re-link the proxy media to the full-quality video & render locally.
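For the first step, here's a rough sketch of how the batch proxy pass could be scripted. It assumes ffmpeg is installed and on your PATH, that your sources are .mov / .mp4, and that 540p H.264 with AAC audio is an acceptable proxy format. The function names, folder layout, and encoder settings are just illustrative choices, not anything Mosaic requires.

```python
import subprocess
from pathlib import Path

# Assumption: only these containers are handled, since H.264 + AAC
# fits both and the proxy can keep the exact source file name.
VIDEO_EXTENSIONS = {".mov", ".mp4"}


def build_proxy_command(src: Path, proxy_dir: Path, height: int = 540) -> list[str]:
    """Build an ffmpeg command that writes a small proxy with the same file name."""
    dst = proxy_dir / src.name  # same name, different folder, for re-linking later
    return [
        "ffmpeg", "-y",
        "-i", str(src),
        "-vf", f"scale=-2:{height}",  # keep aspect ratio, force an even width
        "-c:v", "libx264", "-preset", "fast", "-crf", "28",
        "-c:a", "aac", "-b:a", "128k",
        str(dst),
    ]


def plan_proxies(source_dir: Path, proxy_dir: Path) -> list[list[str]]:
    """Return one ffmpeg command per video file found directly in source_dir."""
    sources = sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
    return [build_proxy_command(p, proxy_dir) for p in sources]


def run_plan(commands: list[list[str]]) -> None:
    """Run each command in order and stop on the first ffmpeg failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Something like `run_plan(plan_proxies(Path("masters"), Path("proxies")))` would then do the whole batch (the proxies folder has to exist first). Keeping the proxies in a separate folder is what lets the names stay identical to the originals.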
To Mosaic:
I need to look deeper at your project, but support for EDL export (Avid, Premiere, Final Cut compatible, as well as commercial grading and conform software workflows) and upload/management of proxy media could be helpful additional features.
Absolutely - the channel is called "Dolton Documentaries" on YouTube. I'll definitely check out the features you mentioned, and am super excited to see where this goes!
yes! you can upload as many videos as you want (file limits are currently 20GB and 90 minutes per file). then I'd recommend using either the Rough Cut tile or the Montage tile to stitch them all together. In those tiles, you can prompt for particular visual cues to control how the videos are combined. Let me know if you have any questions.
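If you want to sanity-check a batch against those per-file limits before uploading, a tiny pre-flight check could look like the sketch below. It assumes 20GB means 20 × 10^9 bytes and that a file exactly at a limit is allowed; neither detail is confirmed here, and `upload_problems` is a made-up helper, not part of any Mosaic API.

```python
# Assumption: "20GB" is decimal gigabytes and the limits are inclusive.
MAX_BYTES = 20 * 10**9
MAX_SECONDS = 90 * 60


def upload_problems(size_bytes: int, duration_seconds: float) -> list[str]:
    """Return the reasons a file would exceed the stated limits (empty if none)."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 20GB")
    if duration_seconds > MAX_SECONDS:
        problems.append("file is longer than 90 minutes")
    return problems
```

You'd get the size from the filesystem and the duration from whatever tool you already use to inspect media.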
we've done a ton of work to optimize the uploads / downloads / transcoding of videos to handle beefy files using proxies, and also allow you to XML export back to traditional editing tools that can link back to your "heavy" media, but I hear you and I think anything running locally on device is just going to feel faster
it does present its own set of challenges, but something we've thought about
we've actually found that multimodal models are surprisingly good at maintaining temporal context as well
that being said, we also do a bunch of additional processing with more traditional CV / audio analysis to extract this information (both frame-level and temporal) as part of our video understanding
for example, with the mean-motion analysis, you can see how subjects move over a period of time, which can help determine where important things are happening in the video, and that can ultimately lead to better edit placement.
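To give a feel for the general idea (this is not how Mosaic does it, just a toy illustration), one simple way to measure motion over time is frame differencing: take the mean absolute pixel difference between consecutive frames, average those scores over a sliding window, and look for the busiest window. The sketch assumes each frame is a flat list of grayscale pixel values of equal length; all the function names are hypothetical.

```python
def motion_scores(frames: list[list[float]]) -> list[float]:
    """Mean absolute pixel difference for each pair of consecutive frames."""
    scores = []
    for prev, curr in zip(frames, frames[1:]):
        total = sum(abs(a - b) for a, b in zip(prev, curr))
        scores.append(total / len(curr))
    return scores


def windowed_mean(scores: list[float], window: int) -> list[float]:
    """Mean of each run of `window` consecutive scores (no padding at the edges)."""
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def busiest_window(frames: list[list[float]], window: int) -> int:
    """Index of the first frame transition in the window with the most mean motion."""
    means = windowed_mean(motion_scores(frames), window)
    return max(range(len(means)), key=means.__getitem__)
```

A real pipeline would work on decoded video frames and probably track subjects rather than raw pixels, but the windowed average is what turns noisy per-frame differences into "where is stuff happening".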