The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: Minimalist library to generate SVG views of scientific data
Just wanted to share with HN a simple/minimal open-source Python library that generates SVG files visualizing two-dimensional data and distributions, in case others find it useful or interesting.<p>I wrote it as a fun project, mostly because I found that the standard libraries in Python generated unnecessarily large SVG files. One nice property is that I can configure the visuals through CSS, which allows me to support dark/light mode browser settings. The graphs are specified as JSON files (the repository includes a few examples).<p>It supports scatterplots, line plots, histograms, and box plots, and I collected examples here: <a href="https://github.com/alefore/mini_svg/blob/main/examples/README.md" rel="nofollow">https://github.com/alefore/mini_svg/blob/main/examples/READM...</a><p>I did this mostly for the graphs in an article on my blog (<a href="https://alejo.ch/3jj" rel="nofollow">https://alejo.ch/3jj</a>).<p>Would love to hear opinions. :-)
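The compact-SVG idea above — styling elements via CSS classes instead of per-element inline attributes, so the file stays small and can follow the browser's color scheme — can be sketched roughly like this. This is a hypothetical illustration, not mini_svg's actual API:

```python
# Hypothetical sketch of the compact-SVG approach (NOT mini_svg's real API):
# one shared <style> rule instead of inline fill/stroke on every element,
# and currentColor so the plot follows the page's dark/light scheme.

def scatter_svg(points, width=200, height=100):
    """Render (x, y) pairs in [0, 1] x [0, 1] as a minimal SVG string."""
    circles = "".join(
        f'<circle class="pt" cx="{x * width:.1f}" cy="{(1 - y) * height:.1f}" r="2"/>'
        for x, y in points
    )
    style = "<style>.pt{fill:currentColor}</style>"  # inherits page text color
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">{style}{circles}</svg>')

svg = scatter_svg([(0.1, 0.2), (0.5, 0.9), (0.8, 0.4)])
print(svg.count("<circle"))  # 3
```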
Show HN: Twitch Roulette – Find live streamers who need views the most
Hey HN, I re-launched an old site someone made back in the day called twitchroulette.net, with a lot of new features and stats, and I'd love for people to check it out. The idea is that you can easily browse the less-visited corners of Twitch and find cool new streamers to say hi to, and maybe make some new friends.<p>I also added some real-time stats and breakdowns per channel, and I think some of the things they show are pretty interesting. Check it out!
Show HN: Open-Source Animal Crossing–Style UI for Claude Code Agents
We posted here on Monday and got some great feedback. We’ve implemented a few of the most requested updates:<p>- iMessage channel support (agents can text people and you can text agents); other channels are simple to extend.
- A built-in browser (agents can navigate and interact with websites)
- Scheduling (run tasks on a timer, via cron, or at a future time)
- Built-in tunneling so that the agents can share local work with you over the internet
- More robust MCP and Skills support so anyone can extend it
- Auto-approval for agent requests<p>If you didn’t see the original:<p>Outworked is a desktop app where Claude Code agents work as a small “team.” You give it a goal, and an orchestrator breaks it into tasks and assigns them across agents.<p>Agents can run in parallel, talk to each other, write code, and now also browse the web and send messages.<p>It runs locally and plugs into your existing Claude Code setup.<p>Would love to hear what we should build next. Thanks again!
Show HN: Fio: 3D World editor/game engine – inspired by Radiant and Hammer
A minimal brush-based CSG editor and game engine with a unified (forward) renderer
inspired by Radiant and Worldcraft/Hammer<p>Compact and lightweight (target: Snapdragon 8CX, OpenGL 3.3)<p>Real-time lighting with stencil shadows without the need for pre-baked compilation
Show HN: Veil – Dark mode PDFs without destroying images, runs in the browser
Hi HN!
Here's a tool I just deployed that renders PDFs in dark mode without destroying the images. Internal and external links stay intact, and I decided to implement export since I'm not a fan of platform lock-in: you can view your dark PDF in your preferred reader, on any device.
It's a side project born from a personal need first and foremost.
When I was working in a factory, reading the books that eventually helped me get out of it, many of my study materials contained images and charts that the dark readers available at the time rendered, to put it mildly, strangely, so I always had to keep the original file open in multitasking alongside.
I hope it can help some of you who have this same need. I think it could be very useful for researchers, but only future adoption will tell.<p>With that premise, I'd like to share the choices that made all of this possible. To do so, I'll walk through the three layers that veil creates from the original PDF:<p>- Layer 1: CSS filter. I use invert(0.86) hue-rotate(180deg) on the main canvas. I use 0.86 instead of 1.0 because I found that full inversion produces a pure black and pure white that are too aggressive for prolonged reading. 0.86 yields a soft dark grey (around #242424, though it depends on the document's white) and a muted white (around #DBDBDB) for the text, which I found to be the most comfortable value for hours of reading.<p>- Layer 2: image protection. A second canvas is positioned on top of the first, this time with no filters.
Through PDF.js's public API getOperatorList(), I walk the PDF's operator list and reconstruct the CTM stack, that is the save, restore and transform operations the PDF uses to position every object on the page. When I encounter a paintImageXObject (opcode 85 in PDF.js v5), the current transformation matrix gives me the exact bounds of the image. At that point I copy those pixels from a clean render onto the overlay. I didn't fork PDF.js because it would have become a maintenance nightmare given the size of the codebase and the frequent updates. Images also receive OCR treatment: text contained in charts and images becomes selectable, just like any other text on the page. At this point we have the text inverted and the images intact. But what if the page is already dark? Maybe the chapter title pages are black with white text? The next layer takes care of that.<p>- Layer 3: already-dark page detection. After rendering, the background brightness is measured by sampling the edges and corners of the page (where you're most likely to find pure background, without text or images in the way). The BT.601 formula is used to calculate perceived brightness by weighting the three color channels as the human eye sees them: green at 58.7%, red at 29.9%, blue at 11.4%. These weights reflect biology: the eye evolved in natural environments where distinguishing shades of green (vegetation, predators in the grass) was a matter of survival, while blue (sky, water) was less critical. If the average luminance falls below 40%, the page is flagged as already dark and the inversion is skipped, returning the original page. Presentation slides with dark backgrounds stay exactly as they are, instead of being inverted into something blinding.<p>Scanned documents are detected automatically and receive OCR via Tesseract.js, making text selectable and copyable even on PDFs that are essentially images.
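The layer-3 detection described above can be sketched as follows — sample border pixels, average their BT.601 perceived brightness, and skip inversion below the threshold. This is an illustrative sketch (in Python for brevity; veil itself is vanilla JS), with hypothetical function names, not veil's actual code:

```python
# Sketch of the already-dark-page idea (hypothetical helpers, not veil's
# code): sample page-border pixels, average their BT.601 perceived
# brightness, and skip inversion when the page is already dark.

def bt601_luminance(r, g, b):
    """Perceived brightness of one RGB pixel in [0, 1] (BT.601 weights)."""
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0

def is_already_dark(border_pixels, threshold=0.40):
    """border_pixels: (r, g, b) tuples sampled from page edges/corners."""
    pixels = list(border_pixels)
    avg = sum(bt601_luminance(*p) for p in pixels) / len(pixels)
    return avg < threshold

# A near-black background is flagged; a white one is not.
print(is_already_dark([(20, 20, 20)] * 8))     # True
print(is_already_dark([(255, 255, 255)] * 8))  # False
```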
Everything runs locally, no framework was used, just vanilla JS, which is why it's an installable PWA that works offline too.<p>Here's the link to the app along with the repository: <a href="https://veil.simoneamico.com" rel="nofollow">https://veil.simoneamico.com</a>
| <a href="https://github.com/simoneamico-ux-dev/veil" rel="nofollow">https://github.com/simoneamico-ux-dev/veil</a><p>I hope veil can make your reading more pleasant. I'm open to any feedback. Thanks everyone
Show HN: Yoink – Spotify to lossless with full metadata, self-hostable, ad-free
Show HN: I put an AI agent on a $7/month VPS with IRC as its transport layer
Show HN: DuckDB community extension for prefiltered HNSW using ACORN-1
Hey folks! As someone doing hybrid search daily and wishing I could have a pgvector-like experience but with actual prefiltered approximate nearest neighbours, I decided to just take a punt on implementing ACORN on a fork of the DuckDB VSS extension. I had to make some changes to (vendored) usearch that I'm thinking of submitting upstream. But this does the business. Approximate nearest neighbours with WHERE prefiltering.<p>Edit: Just to clarify, this has been accepted into the community extensions repo. So you can use it like:<p><pre><code>INSTALL hnsw_acorn FROM community;
LOAD hnsw_acorn;
</code></pre>
Show HN: Robust LLM extractor for websites in TypeScript
We've been building data pipelines that scrape websites and extract structured data for a while now. If you've done this, you know the drill: you write CSS selectors, the site changes its layout, everything breaks at 2am, and you spend your morning rewriting parsers.<p>LLMs seemed like the obvious fix — just throw the HTML at GPT and ask for JSON. Except in practice, it's more painful than that:<p>- Raw HTML is full of nav bars, footers, and tracking junk that eats your token budget. A typical product page is 80% noise.
- LLMs return malformed JSON more often than you'd expect, especially with nested arrays and complex schemas. One bad bracket and your pipeline crashes.
- Relative URLs, markdown-escaped links, tracking parameters — the "small" URL issues compound fast when you're processing thousands of pages.
- You end up writing the same boilerplate: HTML cleanup → markdown conversion → LLM call → JSON parsing → error recovery → schema validation. Over and over.<p>We got tired of rebuilding this stack for every project, so we extracted it into a library.<p>Lightfeed Extractor is a TypeScript library that handles the full pipeline from raw HTML to validated, structured data:<p>- Converts HTML to LLM-ready markdown with main content extraction (strips nav, headers, footers), optional image inclusion, and URL cleaning
- Works with any LangChain-compatible LLM (OpenAI, Gemini, Claude, Ollama, etc.)
- Uses Zod schemas for type-safe extraction with real validation
- Recovers partial data from malformed LLM output instead of failing entirely — if 19 out of 20 products parsed correctly, you get those 19
- Built-in browser automation via Playwright (local, serverless, or remote) with anti-bot patches
- Pairs with our browser agent (@lightfeed/browser-agent) for AI-driven page navigation before extraction<p>We use this ourselves in production at Lightfeed, and it's been solid enough that we decided to open-source it.<p>GitHub: <a href="https://github.com/lightfeed/extractor" rel="nofollow">https://github.com/lightfeed/extractor</a> npm: npm install @lightfeed/extractor. Apache 2.0 licensed.<p>Happy to answer questions or hear feedback.
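The partial-recovery bullet above amounts to salvaging the complete items from a truncated JSON array instead of failing on one bad bracket. A sketch of that idea (in Python for brevity — the library itself is TypeScript, and this is not its implementation):

```python
import json

# Sketch of the partial-recovery idea (illustrative only, not the
# library's code): when an LLM emits a truncated JSON array, salvage
# every complete element instead of failing the whole batch.

def recover_array(text):
    """Parse a JSON array, falling back to the longest valid prefix."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Walk backwards until the remaining prefix closes into a valid array.
    for cut in range(len(text), 0, -1):
        candidate = text[:cut].rstrip().rstrip(",")
        try:
            return json.loads(candidate + "]")
        except json.JSONDecodeError:
            continue
    return []

bad = '[{"name": "a", "price": 1}, {"name": "b", "price": 2}, {"name": "c", "pri'
print(recover_array(bad))  # [{'name': 'a', 'price': 1}, {'name': 'b', 'price': 2}]
```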
Show HN: Optio – Orchestrate AI coding agents in K8s to go from ticket to PR
I think like many of you, I've been jumping between many claude code/codex sessions at a time, managing multiple lines of work and worktrees in multiple repos. I wanted a way to easily manage multiple lines of work and reduce the amount of input I need to give, allowing the agents to remove me as a bottleneck from as much of the process as I can. So I built an orchestration tool for AI coding agents:<p>Optio is an open-source orchestration system that turns tickets into merged pull requests using AI coding agents. You point it at your repos, and it handles the full lifecycle:<p>- Intake — pull tasks from GitHub Issues, Linear, or create them manually<p>- Execution — spin up isolated K8s pods per repo, run Claude Code or Codex in git worktrees<p>- PR monitoring — watch CI checks, review status, and merge readiness every 30s<p>- Self-healing — auto-resume the agent on CI failures, merge conflicts, or reviewer change requests<p>- Completion — squash-merge the PR and close the linked issue<p>The key idea is the feedback loop. Optio doesn't just run an agent and walk away — when CI breaks, it feeds the failure back to the agent. When a reviewer requests changes, the comments become the agent's next prompt. It keeps going until the PR merges or you tell it to stop.<p>Built with Fastify, Next.js, BullMQ, and Drizzle on Postgres. Ships with a Helm chart for production deployment.
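The feedback loop described above — turn every blocker into the agent's next prompt until the PR merges — can be sketched as below. This is a hypothetical illustration with made-up function names (`check_pr`, `run_agent`), not Optio's actual code:

```python
# Hypothetical sketch of an Optio-style feedback loop (not Optio's code):
# poll PR state, and when something blocks the merge, feed the blocker
# back to the coding agent as its next prompt instead of giving up.

def drive_to_merge(check_pr, run_agent, max_rounds=10):
    """check_pr() -> ("merged" | "ci_failed" | "changes_requested", detail);
    run_agent(prompt) resumes the coding agent with that prompt."""
    for _ in range(max_rounds):
        status, detail = check_pr()
        if status == "merged":
            return True
        if status == "ci_failed":
            run_agent(f"CI failed, fix this:\n{detail}")
        elif status == "changes_requested":
            run_agent(f"Reviewer requested changes:\n{detail}")
    return False  # gave up after max_rounds

# Toy run: CI fails once, the agent is re-prompted, then the PR merges.
states = iter([("ci_failed", "test_login broke"), ("merged", "")])
prompts = []
print(drive_to_merge(lambda: next(states), prompts.append))  # True
```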
Show HN: Turbolite – a SQLite VFS serving sub-250ms cold JOIN queries from S3
I built a SQLite VFS in Rust that serves cold queries directly from S3 with sub-second performance, and often much faster.<p>It’s called turbolite. It is experimental, buggy, and may corrupt data. I would not trust it with anything important yet.<p>I wanted to explore whether object storage has gotten fast enough to support embedded databases over cloud storage. Filesystems reward tiny random reads and in-place mutation. S3 rewards fewer requests, bigger transfers, immutable objects, and aggressively parallel operations where bandwidth is often the real constraint. This was explicitly inspired by turbopuffer’s ground-up S3-native design. <a href="https://turbopuffer.com/blog/turbopuffer" rel="nofollow">https://turbopuffer.com/blog/turbopuffer</a><p>The use case I had in mind is lots of mostly-cold SQLite databases (database-per-tenant, database-per-session, or database-per-user architectures) where keeping a separate attached volume for inactive databases feels wasteful. turbolite assumes a single write source and is aimed much more at “many databases with bursty cold reads” than “one hot database.”<p>Instead of doing naive page-at-a-time reads from a raw SQLite file, turbolite introspects SQLite B-trees, stores related pages together in compressed page groups, and keeps a manifest that is the source of truth for where every page lives. Cache misses use seekable zstd frames and S3 range GETs for search queries, so fetching one needed page does not require downloading an entire object.<p>At query time, turbolite can also pass storage operations from the query plan down to the VFS to frontrun downloads for indexes and large scans in the order they will be accessed.<p>You can tune how aggressively turbolite prefetches. For point queries and small joins, it can stay conservative and avoid prefetching whole tables. For scans, it can get much more aggressive.<p>It also groups pages by page type in S3. Interior B-tree pages are bundled separately and loaded eagerly.
Index pages prefetch aggressively. Data pages are stored by table. The goal is to make cold point queries and joins decent, while making scans less awful than naive remote paging would.<p>On a 1M-row / 1.5GB benchmark on EC2 + S3 Express, I’m seeing results like sub-100ms cold point lookups, sub-200ms cold 5-join profile queries, and sub-600ms scans from an empty cache with a 1.5GB database. It’s somewhat slower on normal S3/Tigris.<p>Current limitations are pretty straightforward: it’s single-writer only, and it is still very much a systems experiment rather than production infrastructure.<p>I’d love feedback from people who’ve worked on SQLite-over-network, storage engines, VFSes, or object-storage-backed databases. I’m especially interested in whether the B-tree-aware grouping / manifest / seekable-range-GET direction feels like the right one to keep pushing.
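The manifest / range-GET design above can be sketched as a mapping from page number to a location in an S3 object, with adjacent pages coalesced into one range request. This is an illustrative sketch (in Python, with a hypothetical `plan_range_gets` helper), not turbolite's actual Rust code:

```python
# Sketch of the manifest + range-GET idea (not turbolite's code): the
# manifest maps each SQLite page to (object_key, offset, length), and
# pages that are contiguous in the same object are coalesced into a
# single S3 range request.

def plan_range_gets(manifest, pages):
    """manifest: page -> (key, offset, length). Returns [(key, start, end)]
    byte ranges, merged when pages are contiguous in the same object."""
    locs = sorted(manifest[p] for p in pages)
    ranges = []
    for key, off, length in locs:
        if ranges and ranges[-1][0] == key and ranges[-1][2] == off:
            ranges[-1] = (key, ranges[-1][1], off + length)  # extend range
        else:
            ranges.append((key, off, off + length))
    return ranges

manifest = {1: ("grp-a", 0, 4096), 2: ("grp-a", 4096, 4096), 9: ("grp-b", 0, 4096)}
print(plan_range_gets(manifest, [1, 2, 9]))
# [('grp-a', 0, 8192), ('grp-b', 0, 4096)]
```

Fewer, larger GETs is exactly the trade the post describes: S3 rewards coalesced transfers over many tiny random reads.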
Show HN: A plain-text cognitive architecture for Claude Code
Show HN: AI Roundtable – Let 200 models debate your question
Hey HN! After the Car Wash Test post got quite a big discussion going (400+ comments, <a href="https://news.ycombinator.com/item?id=47128138">https://news.ycombinator.com/item?id=47128138</a>), I spent the past few weeks building a tool so anyone can run these kinds of questions and get structured results. No signup and free to use.<p>You type a question, define answer options, pick up to 50 models at a time from a pool of 200+, and they all answer independently under identical conditions. No system prompt, structured output, same setup for every model.<p>You can also run a debate round where models see each other's reasoning and get a chance to change their minds. A reviewer model then summarizes the full transcript. All models are routed via my startup Opper. Any feedback is welcome!<p>Hope you enjoy it, and would love to hear what you think!
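The protocol above — independent answers under identical conditions, then a debate round where models may revise — reduces to a simple tally. A hypothetical sketch (not the site's actual code):

```python
from collections import Counter

# Sketch of a roundtable tally (illustrative, not the site's code): models
# answer independently, a debate round lets them revise, and we report
# both tallies plus how many models changed their minds.

def tally(initial, revised):
    """initial/revised: dict of model name -> chosen option."""
    changed = sum(1 for m in initial if revised.get(m) != initial[m])
    return Counter(initial.values()), Counter(revised.values()), changed

before = {"gpt": "A", "claude": "A", "gemini": "B"}
after = {"gpt": "A", "claude": "B", "gemini": "B"}
r1, r2, changed = tally(before, after)
print(r1.most_common(1)[0], r2.most_common(1)[0], changed)  # ('A', 2) ('B', 2) 1
```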
Show HN: Gridland: make terminal apps that also run in the browser
Hi everyone,<p>Gridland is a runtime + ShadCN UI registry that makes it possible to build terminal apps that run in the browser as well as the native terminal. This is useful for demoing TUIs so that users know what they're getting before they are invested enough to install them. And, tbh, it's also just super fun!<p>Gridland is the successor to Ink Web (ink-web.dev) which is the same concept, but using Ink + xterm.js. After building Ink Web, we continued experimenting and found that using OpenTUI and a canvas renderer performed better with less flickering and nearly instant load times.<p>We're excited to continue iterating on this. I expect a lot of criticism from the "why does this need to exist" angle, and tbh, it probably doesn't - it's really mostly just for fun, but we still think the demo use case mentioned previously has potential.<p>- Chris + Jess
Show HN: ProofShot – Give AI coding agents eyes to verify the UI they build
I use AI agents to build UI features daily. The thing that kept annoying me: the agent writes code but never sees what it actually looks like in the browser. It can’t tell if the layout is broken or if the console is throwing errors.<p>So I built a CLI that lets the agent open a browser, interact with the page, record what happens, and collect any errors. Then it bundles everything — video, screenshots, logs — into a self-contained HTML file I can review in seconds.<p><pre><code> proofshot start --run "npm run dev" --port 3000
# agent navigates, clicks, takes screenshots
proofshot stop
</code></pre>
It works with whatever agent you use (Claude Code, Cursor, Codex, etc.) — it’s just shell commands. It's packaged as a skill so your AI coding agent knows exactly how it works. It's built on agent-browser from Vercel Labs which is far better and faster than Playwright MCP.<p>It’s not a testing framework. The agent doesn’t decide pass/fail. It just gives me the evidence so I don’t have to open the browser myself every time.<p>Open source and completely free.<p>Website: <a href="https://proofshot.argil.io/" rel="nofollow">https://proofshot.argil.io/</a>