The best Show HN stories on Hacker News from the past day

Latest posts:

Show HN: Coi – A language that compiles to WASM, beats React/Vue

I usually build web games in C++, but using Emscripten always felt like overkill for what I was doing. I don't need full POSIX emulation or a massive standard library just to render some stuff to a canvas and handle basic UI.

The main thing I wanted to solve was the JS/WASM interop bottleneck. Instead of using the standard glue code for every call, I moved everything to a shared-memory architecture using command and event buffers.

The way it works is that I batch all the instructions in WASM and then send a single "flush" signal to JS. The JS side then reads everything directly out of shared memory in one go. It's way more efficient: I ran a benchmark rendering 10k rectangles on a canvas and the difference was huge. Emscripten hit around 40 FPS, while my setup hit 100 FPS.

But writing DOM logic in C++ is painful, so I built Coi. It's a component-based language that statically analyzes changes at compile time to enable O(1) reactivity. Unlike traditional frameworks, there is no virtual DOM overhead; the compiler maps state changes directly to specific handles in the command buffer.

I recently benchmarked this against React and Vue on a 1,000-row table: Coi came out on top for row creation, row updating, and element swapping because it avoids the "diffing" step entirely and minimizes bridge crossings. Its bundle size was also the smallest of the three.

One of the coolest things about the architecture is how the standard library works. If I want to support a new browser API (like Web Audio or a new Canvas feature), I just add the definition to my WebCC schema file. When I recompile the Coi compiler, the language automatically gains a new standard library function to access that API. There is zero manual wrapping involved.

I'm really proud of how it's coming along. It combines the performance of a custom WASM stack with a syntax that actually feels good to write (for me at least :P). Plus, since the intermediate step is C++, I'm looking into making it work on the server side too, which would allow sharing components across the whole stack.

Example (Coi code):

    component Counter(string label, mut int& value) {
        def add(int i) : void {
            value += i;
        }

        style {
            .counter { display: flex; gap: 12px; align-items: center; }
            button { padding: 8px 16px; cursor: pointer; }
        }

        view {
            <div class="counter">
                <span>{label}: {value}</span>
                <button onclick={add(1)}>+</button>
                <button onclick={add(-1)}>-</button>
            </div>
        }
    }

    component App {
        mut int score = 0;

        style {
            .app { padding: 24px; font-family: system-ui; }
            h1 { color: #1a73e8; }
            .win { color: #34a853; font-weight: bold; }
        }

        view {
            <div class="app">
                <h1>Score: {score}</h1>
                <Counter label="Player" &value={score} />
                <if score >= 10>
                    <p class="win">You win!</p>
                </if>
            </div>
        }
    }

    app {
        root = App;
        title = "My Counter App";
        description = "A simple counter built with Coi";
        lang = "en";
    }

Live demo: https://io-eric.github.io/coi

Coi (the language): https://github.com/io-eric/coi

WebCC: https://github.com/io-eric/webcc

I'd love to hear what you think. It's still far from finished, but it's a side project I'm really excited about :)
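
The post doesn't show the interop layer itself, so here is a minimal, hypothetical TypeScript sketch of the JS side of the command-buffer idea: WASM writes fixed-size records into shared memory, and JS drains them all on a single flush. The record layout (an opcode plus four int32 arguments) and the opcode constant are made up for illustration and are not Coi/WebCC's actual format.

    // Hypothetical layout: each command is [opcode, arg0, arg1, arg2, arg3] as int32s.
    const CMD_FILL_RECT = 1;
    const WORDS_PER_CMD = 5;

    // Shared with the WASM module; slot 0 of the header holds the command count.
    const shared = new SharedArrayBuffer(4 + 4 * WORDS_PER_CMD * 10_000);
    const header = new Int32Array(shared, 0, 1);
    const commands = new Int32Array(shared, 4);

    // Called once per "flush" signal from WASM, instead of once per draw call.
    function flush(ctx: CanvasRenderingContext2D): void {
      const count = Atomics.load(header, 0);
      for (let i = 0; i < count; i++) {
        const base = i * WORDS_PER_CMD;
        switch (commands[base]) {
          case CMD_FILL_RECT:
            ctx.fillRect(commands[base + 1], commands[base + 2],
                         commands[base + 3], commands[base + 4]);
            break;
          // ...other opcodes (set style, create element, attach handler, ...)
        }
      }
      Atomics.store(header, 0, 0); // reset the batch for the next frame
    }

The single flush per frame is what keeps bridge crossings low: the number of JS/WASM boundary hops stays constant no matter how many commands are batched.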

Show HN: S2-lite, an open source Stream Store

S2 was on HN for our intro blog post a year ago (https://news.ycombinator.com/item?id=42480105). S2 started out as a serverless API — think S3, but for streams.

The idea of streams as a cloud storage primitive resonated with a lot of folks, but not having an open source option was a sticking point for adoption – especially for projects that were themselves open source! So we decided to build it: https://github.com/s2-streamstore/s2

s2-lite is MIT-licensed, written in Rust, and uses SlateDB (https://slatedb.io) as its storage engine. SlateDB is an embedded LSM-style key-value database on top of object storage, which made it a great match for delivering the same durability guarantees as s2.dev.

You can specify a bucket and path to run against an object store like AWS S3 — or skip that to run entirely in-memory. (This also makes it a great emulator for dev/test environments.)

Why not just open up the backend of our cloud service? s2.dev has a decoupled architecture with multiple components running in Kubernetes, including our own K8s operator – we made tradeoffs that optimize for operating a thoroughly multi-tenant cloud infra SaaS. With s2-lite, our goal was to ship something dead simple to operate. There is a lot of shared code between the two that now lives in the OSS repo.

A few features remain (notably deletion of resources and records), but s2-lite is substantially ready. Try the Quickstart in the README to stream Star Wars using the s2 CLI!

The key difference between S2 and a Kafka or Redis Streams: supporting tons of durable streams. I have blogged about the landscape in the context of agent sessions (https://s2.dev/blog/agent-sessions#landscape). Kafka and NATS JetStream treat streams as provisioned resources, and the protocols/implementations are oriented around that assumption. Redis Streams and NATS allow for larger numbers of streams, but without proper durability.

The cloud service is completely elastic, but you can also get pretty far with lite despite it being a single-node binary that needs to be scaled vertically. Streams in lite are "just keys" in SlateDB, and cloud object storage is bottomless – although of course there is metadata overhead.

One thing I am excited to improve in s2-lite is pipelining of writes for performance (already supported behind a knob, but it needs upstream interface changes for safety). It's a technique we use extensively in s2.dev. Essentially, when you are dealing with high latencies like S3's, you want to keep data flowing through the pipe between client and storage, rather than going lock-step where you first wait for an acknowledgment and then issue another write. This is why S2 has a session protocol over HTTP/2, in addition to stateless REST.

You can test throughput/latency for lite yourself using the `s2 bench` CLI command. The main factors are: your network quality to the storage bucket region, the latency characteristics of the remote store, SlateDB's flush interval (`SL8_FLUSH_INTERVAL=..ms`), and whether pipelining is enabled (`S2LITE_PIPELINE=true` to taste the future).

I'll be here to get thoughts and feedback, and answer any questions!
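
The lock-step vs. pipelined distinction is easy to see in code. The following is a generic TypeScript sketch, not S2's actual session protocol; `appendRecord` is a stand-in for any durable append that resolves once the write is acknowledged.

    interface StreamRecord { body: string }

    // Stand-in for a durable append; resolves once the record is acknowledged.
    declare function appendRecord(r: StreamRecord): Promise<void>;

    // Lock-step: one round trip per record, so throughput is bounded by 1/latency.
    async function writeLockStep(records: StreamRecord[]): Promise<void> {
      for (const r of records) {
        await appendRecord(r); // wait for the ack before issuing the next write
      }
    }

    // Pipelined: keep up to `window` writes in flight so the pipe between
    // client and storage stays full even when per-write latency is high.
    async function writePipelined(records: StreamRecord[], window = 16): Promise<void> {
      const inFlight: Promise<void>[] = [];
      for (const r of records) {
        inFlight.push(appendRecord(r));
        if (inFlight.length >= window) {
          await inFlight.shift(); // block only on the oldest outstanding ack
        }
      }
      await Promise.all(inFlight);
    }

With high-latency object storage, the pipelined version keeps many acknowledgments outstanding at once, which is the same reason S2 offers a session protocol over HTTP/2 rather than only stateless request/response.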

Show HN: Interactive physics simulations I built while teaching my daughter

I started teaching my daughter physics by showing her how things actually work - plucking guitar strings to explain vibration, mixing paints to understand light, dropping objects to see gravity in action.

She learned so much faster through hands-on exploration than through books or videos. That's when I realized: what if I could recreate these physical experiments as interactive simulations?

Lumen is the result - an interactive physics playground covering sound, light, motion, life, and mechanics. Each module lets you manipulate variables in real time and see/hear the results immediately.

Try it: https://www.projectlumen.app/

Show HN: Zsweep – Play Minesweeper using only Vim motions

Show HN: Bible translated using LLMs from source Greek and Hebrew

Built an auditable AI (Bible) translation pipeline: Hebrew/Greek source packets -> verse JSON with notes, rolling up to chapters, books, and testaments. Final texts are compiled with metrics (TTR, n-grams).

This is the first full-text example as far as I know (the Gen Z Bible doesn't count).

There are hallucinations and issues, but the overall quality surprised me.

LLMs have a lot of promise for translating ancient texts and making them more accessible.

The technology has a lot of benefit for the faithful that I think is only beginning to be explored.
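
The post doesn't say how the compiled-text metrics are computed; as a rough illustration, type-token ratio (TTR) and n-gram counts might be produced along these lines in TypeScript. The tokenizer here is a naive lowercase split, purely for illustration, not the project's actual code.

    // Naive tokenizer for illustration only: lowercase, split on non-letters.
    function tokenize(text: string): string[] {
      return text.toLowerCase().split(/[^a-z']+/).filter(Boolean);
    }

    // Type-token ratio: distinct words divided by total words.
    function typeTokenRatio(text: string): number {
      const tokens = tokenize(text);
      return tokens.length === 0 ? 0 : new Set(tokens).size / tokens.length;
    }

    // Count n-grams, e.g. to compare phrasing repetition across translations.
    function ngramCounts(text: string, n: number): Map<string, number> {
      const tokens = tokenize(text);
      const counts = new Map<string, number>();
      for (let i = 0; i + n <= tokens.length; i++) {
        const gram = tokens.slice(i, i + n).join(" ");
        counts.set(gram, (counts.get(gram) ?? 0) + 1);
      }
      return counts;
    }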

Show HN: BrowserOS – "Claude Cowork" in the browser

Hey HN! We're Nithin and Nikhil, twin brothers building BrowserOS (YC S24). We're an open-source, privacy-first alternative to the AI browsers from big labs.

The big differentiator: on BrowserOS you can use local LLMs or BYOK and run the agent entirely on the client side, so your company/sensitive data stays on your machine!

Today we're launching filesystem access... just like Claude Cowork, our browser agent can read files, write files, and run shell commands! But honestly, we didn't plan for this. It turns out the privacy decision we made 9 months ago accidentally positioned us for this moment.

The architectural bet we made 9 months ago: unlike other AI browsers (ChatGPT Atlas, Perplexity Comet), where the agent loop runs server-side, we decided early on to run our agent entirely on your machine (client side).

But building everything on the client side wasn't smooth. We initially built our agent loop inside a Chrome extension. But we kept hitting walls -- the service worker being single-threaded JS, not having access to Node.js libraries. So we made the hard decision 2 months ago to throw away everything and start from scratch.

In the new architecture, our agent loop sits in a standalone binary that we ship alongside our Chromium. And we use gemini-cli for the agent loop with some tweaks! We wrote a neat adapter to translate between the Gemini format and the Vercel AI SDK format. You can look at our entire codebase here: https://git.new/browseros-agent

How we give the browser access to the filesystem: when Claude Cowork launched, we realized something: because Atlas and Comet run their agent loop server-side, there's no good way for their agent to access your files without uploading them to the server first. But our agent was already local. Adding filesystem access meant just... opening the door (with your permissions, ofc). Our agent can now read and write files just like Claude Code.

What you can actually do today:

a) Organize files in my desktop folder: https://youtu.be/NOZ7xjto6Uc

b) Open the top 5 HN links, extract the details, and write a summary into an HTML file: https://youtu.be/uXvqs_TCmMQ

Where we are now: if you haven't tried us since the last Show HN (https://news.ycombinator.com/item?id=44523409), give us another shot. The new architecture unlocked a ton of new features, and we've grown to 8.5K GitHub stars and 100K+ downloads:

c) You can now build more reliable workflows using an n8n-like graph: https://youtu.be/H_bFfWIevSY

d) You can also use BrowserOS as an MCP server in Cursor or Claude Code: https://youtu.be/5nevh00lckM

We are very bullish on the browser being the right platform for a Claude Cowork-like agent. The browser is the most commonly used app by knowledge workers (emails, docs, spreadsheets, research, etc.). Even Anthropic recognizes this -- for Claude Cowork, they have a janky integration with the browser via a Chrome extension. But owning the entire stack allows us to build differentiated features that wouldn't be possible otherwise. Ex: browser ACLs.

Agents can do dumb or destructive things, so we're adding browser-level guardrails (think IAM for agents): "role(agent): can never click buy" or "role(agent): read-only access on my bank's homepage."

Curious to hear your take on this and the overall thesis.

We'll be in the comments. Thanks for reading!

GitHub: https://github.com/browseros-ai/BrowserOS

Download: https://browseros.com (available for Mac, Windows, Linux!)
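
The ACL rules above are only described in prose; as a rough sketch of what a browser-level guardrail check could look like, here is some hypothetical TypeScript. The rule shape, URLs, and function names are made up and are not BrowserOS's actual API.

    interface AgentAction {
      kind: "click" | "type" | "navigate";
      url: string;         // page the action targets
      targetText?: string; // visible label of the element, if any
    }

    // Hypothetical IAM-style deny rules attached to the agent role.
    interface AclRule {
      matches: (a: AgentAction) => boolean;
      reason: string;
    }

    const agentRules: AclRule[] = [
      {
        // "role(agent): can never click buy"
        matches: (a) => a.kind === "click" && /\b(buy|checkout|purchase)\b/i.test(a.targetText ?? ""),
        reason: "agent may never click purchase buttons",
      },
      {
        // "role(agent): read-only access on my bank's homepage"
        matches: (a) => a.kind !== "navigate" && a.url.startsWith("https://bank.example.com/"),
        reason: "bank pages are read-only for the agent",
      },
    ];

    // Checked before every action the agent wants to perform in the browser.
    function checkAcl(action: AgentAction): { allowed: true } | { allowed: false; reason: string } {
      for (const rule of agentRules) {
        if (rule.matches(action)) return { allowed: false, reason: rule.reason };
      }
      return { allowed: true };
    }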

Show HN: I've been using AI to analyze every supplement on the market

Hey HN! This has been my project for a few years now. I recently brought it back to life after taking a pause to focus on my studies.

My goal with this project is to separate fluff from science when shopping for supplements. I am doing this in 3 steps:

1.) I index every supplement on the market (extract each ingredient, normalize by quantity)

2.) I index every research paper on supplementation (rank every claim by effect type and effect size)

3.) I link data between supplements and research papers

Early last year, I put the project on pause because I ran into a few issues:

Legal: Shady companies are sending C&D letters demanding their products be taken down from the website. It is not something I had the mental capacity to respond to while also going through my studies. Not coincidentally, these are usually brands with big marketing budgets and a poor ingredients-to-price ratio.

Technical: I started this project when the first LLMs came out. I've built extensive internal evals to understand how LLMs are performing. The hallucinations at the time were simply too frequent to pass this data through to visitors. However, I recently re-ran my evals with Opus 4.5 and was very impressed. I am running out of scenarios that I can think of/find where LLMs are bad at interpreting data.

Business: I still haven't figured out how to monetize it or even who the target customer is.

Despite these challenges, I decided to restart my journey.

My mission is to bring transparency (science and price) to the supplement market. My goal is NOT to increase the use of supplements, but rather to help consumers make informed decisions. Oftentimes, supplementation is not necessary, or there are natural ways to supplement (that's my focus this quarter – better education about natural supplementation).

Some things that are helping my cause – Bryan Johnson's journey has drawn a lot more attention to healthy supplementation (Blueprint). Thanks to Bryan's efforts, I've had so many people reach out in recent months to ask about the state of the project – interest I've not had before.

I am excited to restart this journey and to share it with HN. Your comments on how to approach this would be massively appreciated.

Some key areas of the website:

* Example of navigating supplements by ingredient: https://pillser.com/search?q="Vitamin+D"&s=jho4espsuc

* Example of a research paper analyzed using AI: https://pillser.com/research-papers/effect-of-lactobacillus-gasseri-pa-168-bifidobacterium-longum-sp-073-b-bifidum-mf-205-on-common-cold-episodes-a-double-blind-randomized-controlled-trial-767

* Example of looking for very specific strains or ingredients: https://pillser.com/probiotics/bifidobacterium-bifidum

* Example of navigating research by health outcomes: https://pillser.com/health-outcomes/improved-intestinal-barrier-function

* Example of a product listing: https://pillser.com/supplements/pb-8-probiotic-663
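
The "normalize by quantity" step is described only at a high level; here is a hypothetical TypeScript sketch of what normalizing ingredient quantities to a common unit could look like. The ingredient shape and unit table are assumptions for illustration, not the site's actual pipeline.

    interface Ingredient {
      name: string;
      amount: number;
      unit: string;
    }

    // Convert common label units to milligrams so doses can be compared across products.
    const TO_MG: Record<string, number> = { mg: 1, g: 1000, mcg: 0.001 };

    function normalizeToMg(ing: Ingredient): Ingredient | null {
      const factor = TO_MG[ing.unit.toLowerCase()];
      // Units like IU or CFU need per-ingredient conversion rules, skipped here.
      if (factor === undefined) return null;
      return { ...ing, amount: ing.amount * factor, unit: "mg" };
    }

    // Example: "Vitamin C 1 g" and "Vitamin C 500 mg" become directly comparable.
    const a = normalizeToMg({ name: "Vitamin C", amount: 1, unit: "g" });
    const b = normalizeToMg({ name: "Vitamin C", amount: 500, unit: "mg" });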

Show HN: Text-to-video model from scratch (2 brothers, 2 years, 2B params)

Writeup (includes good/bad sample generations): https://www.linum.ai/field-notes/launch-linum-v2

We're Sahil and Manu, two brothers who spent the last 2 years training text-to-video models from scratch. Today we're releasing them under Apache 2.0.

These are 2B param models capable of generating 2-5 seconds of footage at either 360p or 720p. In terms of model size, the closest comparison is Alibaba's Wan 2.1 1.3B. From our testing, we get significantly better motion capture and aesthetics.

We're not claiming to have reached the frontier. For us, this is a stepping stone towards SOTA - proof we can train these models end-to-end ourselves.

Why train a model from scratch?

We shipped our first model in January 2024 (pre-Sora) as a 180p, 1-second GIF bot, bootstrapped off Stable Diffusion XL. Image VAEs don't understand temporal coherence, and without the original training data, you can't smoothly transition between image and video distributions. At some point you're better off starting over.

For v2, we use T5 for text encoding, the Wan 2.1 VAE for compression, and a DiT-variant backbone trained with flow matching. We built our own temporal VAE, but Wan's was smaller with equivalent performance, so we used it to save on embedding costs. (We'll open-source our VAE shortly.)

The bulk of development time went into building curation pipelines that actually work (e.g., hand-labeling aesthetic properties and fine-tuning VLMs to filter at scale).

What works: cartoon/animated styles, food and nature scenes, simple character motion. What doesn't: complex physics, fast motion (e.g., gymnastics, dancing), consistent text.

Why build this when Veo/Sora exist? Products are extensions of the underlying model's capabilities. If users want a feature the model doesn't support (character consistency, camera controls, editing, style mapping, etc.), you're stuck. To build the product we want, we need to update the model itself. That means owning the development process. It's a bet that will take time (and a lot of GPU compute) to pay off, but we think it's the right one.

What's next?

- Post-training for physics/deformations
- Distillation for speed
- Audio capabilities
- Model scaling

We kept a "lab notebook" of all our experiments in Notion. Happy to answer questions about building a model from 0 → 1. Comments and feedback welcome!
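
The post mentions flow matching but doesn't spell out the training objective. For reference, the standard rectified-flow-style formulation (conventions for the direction of t vary, and Linum's exact setup may differ) is:

    x_t = (1 - t)\,x_0 + t\,x_1, \qquad t \sim \mathcal{U}[0,1], \quad x_0 \sim \mathcal{N}(0, I), \quad x_1 \sim p_{\mathrm{data}}

    \mathcal{L}(\theta) = \mathbb{E}_{x_0,\, x_1,\, t}\left[ \left\lVert v_\theta(x_t, t, c) - (x_1 - x_0) \right\rVert^2 \right]

Here x_1 would be a VAE-latent video clip, c the T5 text embedding, and v_theta the DiT backbone predicting the velocity that carries noise toward data; sampling then integrates dx/dt = v_theta from t = 0 to t = 1.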

Show HN: Whosthere: A LAN discovery tool with a modern TUI, written in Go

Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete

Hey HN, we trained and open-sourced a 1.5B model that predicts your next edits, similar to Cursor. You can download the weights here (https://huggingface.co/sweepai/sweep-next-edit-1.5b) or try it in our JetBrains plugin (https://plugins.jetbrains.com/plugin/26860-sweep-ai-autocomplete--coding-agent).

Next-edit autocomplete differs from standard autocomplete by using your recent edits as context when predicting completions. The model is small enough to run locally while outperforming models 4x its size on both speed and accuracy.

We tested against Mercury (Inception), Zeta (Zed), and Instinct (Continue) across five benchmarks: next-edit above/below cursor, tab-to-jump for distant changes, standard FIM, and noisiness. We found exact-match accuracy correlates best with real usability because code is fairly precise and the solution space is small.

Prompt format turned out to matter more than we expected. We ran a genetic algorithm over 30+ diff formats and found that simple `original`/`updated` blocks beat unified diffs. The verbose format is just easier for smaller models to understand.

Training was SFT on ~100k examples from permissively-licensed repos (4hrs on 8xH100), then RL for 2000 steps with tree-sitter parse checking and size regularization. The RL step fixes edge cases SFT can't, like generating code that doesn't parse or overly verbose outputs.

We're open-sourcing the weights so the community can build fast, privacy-preserving autocomplete for any editor. If you're building for VSCode, Neovim, or something else, we'd love to see what you make with it!
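
The post doesn't show the winning prompt format, but the general idea of `original`/`updated` blocks (versus a unified diff) is easy to illustrate. The following is a hypothetical TypeScript sketch; the delimiters and field names are invented and are not Sweep's actual format.

    interface RecentEdit {
      file: string;
      original: string; // snippet before the user's edit
      updated: string;  // the same snippet after the edit
    }

    // Render a recent edit verbosely instead of as a unified diff; the full
    // before/after text is easier for a small model to condition on.
    function renderEdit(e: RecentEdit): string {
      return [
        `file: ${e.file}`,
        "original:",
        e.original,
        "updated:",
        e.updated,
      ].join("\n");
    }

    // Example: the model sees that the user just renamed a variable, which is
    // strong evidence the next edit updates the remaining usages.
    console.log(renderEdit({
      file: "utils.ts",
      original: "const usr = getUser();\nreturn usr.name;",
      updated: "const currentUser = getUser();\nreturn usr.name;",
    }));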

Show HN: isometric.nyc – giant isometric pixel art map of NYC

Hey HN! I wanted to share something I built over the last few weeks: isometric.nyc is a massive isometric pixel art map of NYC, built with nano banana and coding agents.

I didn't write a single line of code.

Of course no-code doesn't mean no-engineering. This project took a lot more manual labor than I'd hoped!

I wrote a deep dive on the workflow and some thoughts about the future of AI coding and creativity:

http://cannoneyed.com/projects/isometric-nyc
