Show HN: Kimu – Open-Source Video Editor
trykimu.com

I wanted a proper non-linear video editor built for the web. It always annoyed me how there are practically zero functioning web video editors. And here we are :)
Kimu can:

- Work with video, audio & text
- Apply transitions
- Do non-linear editing with z-axis overlays
- Split/trim clips
- Export
- Offer a cute AI agent (coming soon!)
I'm in uni, and I started this project out of sheer annoyance at the lack of good web video editors. It's open source here (https://github.com/robinroy03/videoeditor).
What do y'all think?
So I appreciate the aim here, but for me to trust any video editor, I need to see an example timeline that’s like, 30 minutes long with clips from at least 10 1080p video files and at least one effect on each track.
And for the record I wouldn’t consider that a stress test (a stress test would be more like 3 hours, 100 tracks, 4K and like a dozen precomps that are being reversed or something). That’s just to make sure this thing won’t fall over during casual usage.
You may be inclined to respond that your editor is targeting beginner editors, to which I’d note that beginner editors are MUCH less disciplined than experts when it comes to trimming footage, splitting things up into comps, pre-rendering chunks, using proxies, etc. Beginner editors (I’d know, I used to be one) will dump a 1 hour 4K-HDR iPhone video of a presenter speaking, and a screen recording of presentation slides they accidentally took in 4K60 into your timeline. Being able to demonstrate that you’ve got that level of memory management handled is what separates video editors people can use from mere “good ideas”.
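To make that concrete, here is roughly what that casual-usage timeline looks like as data, as a sketch (every name and type below is invented for illustration; this is not Kimu's actual format):

    // Hypothetical timeline spec for the casual-usage test described above.
    // None of these types come from Kimu; they're invented for illustration.
    interface Clip {
      src: string;      // e.g. "footage-03.mp4", a 1080p source file
      inSec: number;    // trim-in point within the source
      outSec: number;   // trim-out point within the source
      startSec: number; // position on the timeline
    }

    interface Track {
      clips: Clip[];
      effect: string;   // at least one effect per track, per the test
    }

    // 10 distinct 1080p sources, ~3-minute clips laid end to end = ~30 min.
    const casualUsageTimeline: Track[] = Array.from({ length: 10 }, (_, i) => ({
      clips: [{
        src: `footage-${String(i).padStart(2, "0")}.mp4`,
        inSec: 0,
        outSec: 180,
        startSec: i * 180,
      }],
      effect: i % 2 === 0 ? "color-balance" : "gaussian-blur",
    }));

If an editor can scrub, split, and export that without falling over, it passes the casual bar.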
Edit: Another thought, you call your product “Cursor for video editing”, and that’s a valid goal. But bear in mind that a LARGE part of why Cursor is successful is because they didn’t try to build an IDE from scratch. They got to absorb all of the nice UX (not to mention the extensive plugin ecosystem) of VS Code, and then spend their time on AI features. If that’s how you want to spend your time, you definitely don’t want to be building an editor from scratch.
I don't get this attitude. If there is some specific test you need before you'll trust it, what's stopping you from running that test yourself? I could understand your response if it were a paid product, but this doesn't even require you to install anything. Instead of appreciating the work they've put in, you come right in with a negative comment that's not even based on trying it or knowing anything about it.
I’m on my phone at the moment, and didn’t want to judge this product on its iOS Safari performance, as that wouldn’t be fair.
With regard to my attitude: I've run into a lot of people who are trying to build "AI video editors", and many of them don't realize how intense a basic video editor actually is. It's the kind of area where the faster someone gets to the brick wall, the sooner they can start working on getting through it.
Furthermore I think it’s good for them to know that if achieving what I described seems daunting and they want to focus on the AI angle, it’s totally fine to fork an existing mature OSS video editor and just build the AI features on top. That’s what Cursor did for IDEs, and they’re finding a lot of success.
It's OK to wait until you've had a chance to try it before commenting.
They don't seem to me to be focusing on the AI angle at all. They mention one AI feature as coming soon.
Weird critique. Someone builds a quality pair of lightweight scissors, and you critique it like a boastful 5-axis CNC operator.
Like the rest of us who need to edit lots of videos that are only a couple of minutes long, there is a massive gap in the market for something lighter weight. Look at CapCut's success over Adobe: 200+ million active users per month for CapCut.
I would like someone to make either a voice-controlled or gesture-based video editor. I have not seen a single one, for obvious reasons. Voice control would go like this:

"OK, search for the input.mp4 file and drag it in at 2 seconds."
"Now, play at 5x speed from 1 minute." (After a few seconds, you say "stop" and it stops at that frame.)
"Cut it here."
"Now go 4 mins ahead."
"Stop, cut it here."

Imagine a video editor where you don't even need the mouse and keyboard.
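A sketch of how those utterances might map onto editor commands (the command set, names, and parser here are all invented for illustration; a real system would put a speech-recognition model in front of this and match far more loosely):

    // Invented command vocabulary for the voice-editing idea above.
    type Command =
      | { kind: "insert"; file: string; atSec: number }
      | { kind: "play"; speed: number; fromSec: number }
      | { kind: "cut" }                 // cut at the current playhead
      | { kind: "stop" }
      | { kind: "seek"; deltaSec: number };

    // Toy rule-based parser for the sample utterances above.
    function parse(utterance: string): Command | null {
      const u = utterance.toLowerCase();
      let m: RegExpMatchArray | null;
      if ((m = u.match(/(\S+\.mp4).*?at (\d+) seconds?/))) {
        return { kind: "insert", file: m[1], atSec: Number(m[2]) };
      }
      if ((m = u.match(/play at (\d+)x.*?from (\d+) minutes?/))) {
        return { kind: "play", speed: Number(m[1]), fromSec: Number(m[2]) * 60 };
      }
      if ((m = u.match(/go (\d+) mins? ahead/))) {
        return { kind: "seek", deltaSec: Number(m[1]) * 60 };
      }
      if (u.includes("cut")) return { kind: "cut" };
      if (u.includes("stop")) return { kind: "stop" };
      return null;
    }

    // parse("Now go 4 mins ahead") -> { kind: "seek", deltaSec: 240 }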
Oh, we already tried that :) See this demo video I posted to X (https://x.com/_RobinRoy/status/1938676070452207786).
*This is text-based, but I hope you get the point.
Text-based is too slow and involves too much typing; it'll take off only if it's real-time and voice-based.
yeah sure, we'll add it to the roadmap. But do you think "speaking"/"typing" the basic instructions is better than actually doing it through the UI?
I feel like for basic interactions like dragging, etc., it is better if the user does it by hand. AI can handle complicated workflows like removing silences, quickly removing unwanted background elements, etc.
Well, my very simple 3 minutes of playing around with this were enjoyable.
Please add the ability to center images if you upload different-sized images, so the smaller images don't all clump in the upper left corner.
I'll try this with larger video clips later
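For what it's worth, the centering being asked for above is a small amount of layout math; here's a sketch with invented types, not Kimu's actual code:

    // Center a media element on the canvas instead of pinning it top-left.
    // Hypothetical types; not Kimu's actual API.
    function centeredPosition(
      canvas: { width: number; height: number },
      media: { width: number; height: number },
    ): { x: number; y: number } {
      return {
        x: (canvas.width - media.width) / 2,
        y: (canvas.height - media.height) / 2,
      };
    }

    // e.g. a 640x480 image on a 1920x1080 canvas lands at (640, 300).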
sure :)
Try it with larger clips and let me know in Discord / GitHub Discussions.
Another Remotion video editor with vibe-coded features? There are so many of them... A video editor in the long term needs to be mobile-first. A web video editor in the big year of 2025 is not going to move the needle. CapCut has a free desktop app that you must compete with. I think a better idea is building a mobile video editor; it's much harder to vibe-code. To be the "Cursor for video editing," it's a must.
CapCut's TOS change is very unfortunate. They now claim broad licenses over user content. (https://www.isabokelaw.com/blog/capcuts-new-terms-of-service...)
Also, what makes you think it is vibe-coded? Is the app not functioning as you expected? Let me know and I'll fix things ASAP.
Mobile-first is how you degrade computing in the first place.
Sad to see indeed. "Augmented reality" in a rather wide sense, and opportunistic computing ("I have a calculator with me because my phone has calculator software installed, and I keep a phone with me for tasks it's objectively rather good at, say FaceTime-style video calls with the camera switchable between selfie (front) and surroundings (back)"), seem like the only cases that actually deserve mobile-first.
[Plane in the following refers to the image/sensor plane of the camera.]

My understanding is that with both an in-plane and a normal component of translation, together with enough 3D rotation to anchor/calibrate the gyro, the 3D accelerometer's absolute scale can be transferred to the Structure-from-Motion feature/anchor points of a static, fixed-to-earth scene (a swaying skyscraper doesn't count! On-a-train doesn't count!).

In-plane translation alone just gives you parallax, which tells you the distance ratios of the two objects that parallax against one another as you translate in-plane. But once you add plane-normal translation, an absolute translation contributes additively to both objects' distances, letting you recover absolute scale, not just distance ratios.

Of course you'd hope for suitably good features, dense enough in the scene; start out with optical flow or something similar to get a baseline on gyro calibration and (translation/linear) velocity zeroing, and you then have a decent shot at using the SfM point features with very little brute force in the SfM alignment/solution process.
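In rough equation form, the scale argument sketches out like this (a simple pinhole model is assumed; the notation is mine, not from any particular paper):

    % Pinhole camera with focal length f; two scene points at depths Z_1, Z_2.
    % An in-plane translation (baseline) b shifts each image feature by the
    % disparity
    \[ d_i = \frac{f\, b}{Z_i}, \qquad \frac{Z_1}{Z_2} = \frac{d_2}{d_1}, \]
    % so vision alone pins down only depth ratios: the usual SfM scale
    % ambiguity. If the IMU (gyro-anchored for orientation, accelerometer
    % integrated twice) supplies b in metres, absolute depth follows:
    \[ Z_i = \frac{f\, b}{d_i}. \]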
AR interactivity lets you direct the camera operator to collect appropriately dense coverage of the scenario/area before conditions change (illumination, plant growth, people moving furniture, people moving back in to actively use/occupy the space), and then let the software refine the entire capture as a background task. Once sufficient refinement has been done (during which you'd prefer to redirect the interactive AR compute resources to that refining task), you could quickly lock back onto a now-static scene and render the captured version anchored to real-time camera feedback from the real location, practically eliminating the traditionally annoying drift/tracking artifacts. At least in places with enough light for the camera to track non-blurry views of the reference features despite the obviously interactive motion.
...what did i read...
How does it compare to https://omniclip.app/ ?
Kimu offers a better user experience and is more intuitive. We also match Omniclip feature for feature.
Our roadmap includes automated captions, color grading, and the like. We'll be a CapCut alternative on the web.
Sounds great. CapCut is the editor I use the most.
How does it compare with OpenCut?
https://opencut.app
OpenCut cannot even export a video. It's very hyped up, but it's barely functional.
You should prepopulate the timeline with an in-progress edit so people can jump in and start playing with it.
Yeah, thanks for the suggestion. I'll add some sample media.
>A cute AI agent (coming soon!)
Why??? You don't need this just because "AI" is popular right now; it will distract you from the goal of developing a "video editor built for the web". It's really not going to improve the video-editing experience.
Our goal is to make a solid web video editor first and foremost. Then we'll try to make it super accessible.
I'm thinking of including AI features like captions, auto color grading, etc. I get your point about forcing AI; we won't do that.
No AI for AI's sake, but AI for making it more accessible and helping novice users also make cinematic videos.
Captions would be a very useful feature, and one of the top features paid platforms use as a hook for payment. There are enough models that can run client-side to make this good enough for social-media captions, for example.
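A sketch of what client-side captions could look like with transformers.js and a small Whisper model (the model choice, file name, and options below are illustrative, not what Kimu ships):

    import { pipeline } from "@huggingface/transformers";

    // Runs entirely in the browser; the model is downloaded once and cached.
    // "whisper-tiny.en" is an illustrative pick, roughly good enough for
    // social-media captions.
    const transcriber = await pipeline(
      "automatic-speech-recognition",
      "Xenova/whisper-tiny.en",
    );

    // Timestamped chunks map directly onto caption cues.
    const result: any = await transcriber("clip-audio.wav", {
      chunk_length_s: 30,
      return_timestamps: true,
    });

    for (const chunk of result.chunks ?? []) {
      const [start, end] = chunk.timestamp;
      console.log(`${start}s-${end}s: ${chunk.text}`);
    }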
Yeah, I'm working on a PR for it right now. If you like our work, join the Discord :)
So you are spending time shoe-horning "AI" into a web-based video editor, when you could have been creating PRs for actual video-editing functionality.
IMHO captions are not "shoe-horning AI" into video -- they're a critical requirement to be competitive with closed-source editors and a great use case for local models.
IMHO you don't know what a video editor does. This thing doesn't need to compete with closed-source editors; nobody is dropping Premiere for it. The goal should be web-based video editing, not AI captions. There are plenty of video-editing functions not implemented yet, so if this were a serious project about video editing, spending time on "AI" captions seems like a distraction. It sounds like there is no project manager, not a lot of focus, and the devs are following a bandwagon.
Hi, we are adding the top features our users are requesting. The roadmap is built around the needs of the community. We'll eventually add everything you'd expect of a full video editor once the core needs are met.
Most of the community wants auto captions and color grading, so those go first.
So you aren't building a video editor; you're building something else. Got it. Then I'll request that you add a Salesforce replacement to the project. Should I put that demand in /issues?
I think you misunderstood me :)
There is a finite set of features you need to add to make this an "excellent" video editor. Building them will take time, and most people who eventually use this project may never touch some of them.
The best way to guarantee ROI is to do things users need immediately, like captions. Salesforce isn't in the finite set of things a video editor needs, so that request will be rejected.