The best Show HN posts on Hacker News from the past day
Latest posts:
Show HN: GNU grep as a PHP extension
Show HN: I rebuilt a 2000s browser strategy game on Cloudflare's edge
I grew up in Germany in the early 2000s playing a browser game called Inselkampf. You built up an island, mined gold and stone, cut down trees for wood, raised armies, sent fleets across an ocean grid, joined alliances and got betrayed by them. Same genre as OGame or Travian. It shut down in 2014 and I never found anything that replaced that feeling of checking in before school to see if your fleet had arrived and your alliance was still alive.<p>I finally built the version I wanted to play. Kampfinsel is live at kampfinsel.com right now with real players on it. It's not a straight copy of the old game. I gave it its own world. No magic, no gunpowder – just ballistas, fire pots, and slow ships crossing huge distances. Three resources: gold, stone, wood. Travel between islands takes hours, not seconds. It's slow on purpose.<p>The whole thing runs on Cloudflare's edge: Workers for the game logic and API, D1 for the database, KV for sessions and caching, R2 for assets, and Durable Objects for per-island state and the tick system (fleet arrivals, combat, resource generation). There's no origin server at all. Making a stateful multiplayer game work inside Workers' CPU limits and D1's consistency model meant some non-obvious choices: resources are calculated on-read from timestamps instead of being ticked into the database, fleet movements live in Durable Object alarms, and combat writes are batched. Those constraints turned out to be helpful.<p>The look is intentionally rough and text-heavy (Hi HN!): server-rendered HTML, tables, a parchment color palette, Unicode icons, no frontend framework, no build step. The only JavaScript is for countdown timers and auto-refresh. I wanted it to feel the way I remember these games looking, not how they actually looked. Honestly, it looks a lot like HN itself – tables, monospace, no chrome. If you like how this site looks, you'll probably feel at home.<p>No signup wall, no premium currency, no pay-to-win.
Feedback very welcome, especially from anyone who played this kind of game back in the day or has opinions on running stateful stuff on Workers + D1 + Durable Objects. I'll be around for the next few hours.
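The compute-on-read trick described above, deriving resource totals from a timestamp instead of ticking them into the database, can be sketched roughly like this (a minimal illustration with invented names and rates, not the actual Kampfinsel code):

```python
def resources_on_read(stored, rate_per_hour, last_ts, now_ts, cap):
    """Derive the current resource total from the last stored value
    and the time elapsed since it was written, capped at storage size."""
    elapsed_hours = (now_ts - last_ts) / 3600.0
    return min(cap, stored + rate_per_hour * elapsed_hours)
```

With this shape, the database row only needs a write when the player actually spends resources, which keeps writes rare and avoids running a global tick inside Workers' CPU limits.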
Show HN: Libretto – Making AI browser automations deterministic
Libretto (<a href="https://libretto.sh" rel="nofollow">https://libretto.sh</a>) is a Skill+CLI that makes it easy for your coding agent to generate deterministic browser automations and debug existing ones. The key shift is going from “give an agent a prompt at runtime and hope it figures things out” to “use coding agents to generate real scripts you can inspect, run, and debug”.<p>Here’s a demo: <a href="https://www.youtube.com/watch?v=0cDpIntmHAM" rel="nofollow">https://www.youtube.com/watch?v=0cDpIntmHAM</a>. Docs start at <a href="https://libretto.sh/docs/get-started/introduction" rel="nofollow">https://libretto.sh/docs/get-started/introduction</a>.<p>We spent a year building and maintaining browser automations for EHR and payer portal integrations at our healthcare startup. Building these automations and debugging failed ones was incredibly time-consuming.<p>There are lots of tools that use runtime AI, like Browseruse and Stagehand, which we tried, but: (1) They’re reliant on custom DOM parsing that's unreliable on older and complicated websites (including all of healthcare). Using a website’s internal network calls is faster and more reliable when possible. (2) They can be expensive, since they rely on lots of AI calls, and for workflows with complicated logic you can’t always rely on caching actions to make sure it will work. (3) They run at runtime, so what the agent is going to do isn't interpretable ahead of time. You kind of hope you prompted it correctly to do the right thing, but legacy workflows are often unintuitive and inconsistent across sites, so you can’t trust an agent to just figure it out at runtime.
(4) They don’t really help you generate new automations or help you debug automation failures.<p>We wanted a way to reliably generate and maintain browser automations in messy, high-stakes environments, without relying on fragile runtime agents.<p>Libretto is different because instead of runtime agents it uses “development-time AI”: scripts are generated ahead of time as actual code you can read and control, not opaque agent behavior at runtime. Instead of a black box, you own the code and can inspect, modify, version, and debug everything.<p>Rather than relying on runtime DOM parsing, Libretto takes a hybrid approach combining Playwright UI automation with direct network/API requests within the browser session for better reliability and bot detection evasion.<p>It records manual user actions to help agents generate and update scripts, supports step-through debugging, has an optional read-only mode to prevent agents from accidentally submitting or modifying data, and generates code that follows all the abstractions and conventions you have already in your coding repo.<p>Would love to hear how others are building and maintaining browser automations in practice, and any feedback on the approach we’ve taken here.
Show HN: Every CEO and CFO change at US public companies, live from SEC
Built this solo. It watches SEC filings for executive and board changes, extracts the data, and shows it in real time. 2,100+ changes in the last 30 days. The comp data is interesting: average new CEO total comp is $8.4M across 284 appointments. The /explore page is fully open, no login needed.
Show HN: Run GUIs as Scripts
Hi there, Zero Stars here.<p>I recently published some new work to Hokusai Pocket, a cross-platform binary built on top of raylib and MRuby that runs GUIs from Ruby scripts.<p>License?<p>MIT!<p>How does it work?<p>The binary is available on the GitHub releases page: <a href="https://github.com/skinnyjames/hokusai-pocket/releases/tag/0.6.1" rel="nofollow">https://github.com/skinnyjames/hokusai-pocket/releases/tag/0...</a><p>You can download the binary on x86 Windows, macOS, or Linux, and run your GUI application with<p>hokusai-pocket run:target="<your_hokusai_app.rb>"<p>For a little bit of a hello world, I started a Photoshop clone:<p><a href="https://github.com/skinnyjames/hokusai_demo_paint" rel="nofollow">https://github.com/skinnyjames/hokusai_demo_paint</a><p>Also a little game:<p><a href="https://github.com/skinnyjames/pocket-squares" rel="nofollow">https://github.com/skinnyjames/pocket-squares</a><p>Docs / Help?<p>The docs are in progress, but the old docs for the CRuby version express some of the basic ideas around the project: <a href="https://hokusai.skinnyjames.net/docs/intro" rel="nofollow">https://hokusai.skinnyjames.net/docs/intro</a><p>(I'm also available to answer questions in between slinging pizza)<p>Deps?<p>Hokusai Pocket currently uses:<p>* libuv for offloading CPU-intensive tasks to a worker pool to prevent blocking the UI thread; I plan to integrate some libuv networking as well.<p>* raylib for backend graphics; I've also built with SDL on arm64 to run applications on my PinePhone.<p>* NativeFileDialog for the lovely integration into the filesystem.<p>* MRuby for running or embedding the scripts.<p>* tree-sitter for the custom template grammar (although templates can also be built with Ruby).<p>Anyway, I hope you get a chance to try it.
If you make something cool AND have Docker installed, you can also publish your work as a single binary:<p>`hokusai-pocket publish:target=<your cool program.rb>`<p>Would love feedback, apps, and help with documentation and more build targets.<p>urs truly,<p>@ ᴗ @
Show HN: Kelet – Root Cause Analysis agent for your LLM apps
I've spent the past few years building 50+ AI agents in prod (some reached 1M+ sessions/day), and the hardest part was never building them — it was figuring out why they fail.<p>AI agents don't crash. They just quietly give wrong answers. You end up scrolling through traces one by one, trying to find a pattern across hundreds of sessions.<p>Kelet automates that investigation. Here's how it works:<p>1. You connect your traces and signals (user feedback, edits, clicks, sentiment, LLM-as-a-judge, etc.)
2. Kelet processes those signals and extracts facts about each session
3. It forms hypotheses about what went wrong in each case
4. It clusters similar hypotheses across sessions and investigates them together
5. It surfaces a root cause with a suggested fix you can review and apply<p>The key insight: individual session failures look random. But when you cluster the hypotheses, failure patterns emerge.<p>The fastest way to integrate is through the Kelet Skill for coding agents — it scans your codebase, discovers where signals should be collected, and sets everything up for you. There are also Python and TypeScript SDKs if you prefer manual setup.<p>It’s currently free during beta. No credit card required.
Docs: <a href="https://kelet.ai/docs/" rel="nofollow">https://kelet.ai/docs/</a><p>I'd love feedback on the approach, especially from anyone running agents in prod. Does automating the manual error analysis sound right?
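The clustering step above (step 4) can be illustrated with a toy greedy clusterer. The real product presumably uses embeddings or an LLM for similarity; plain word-overlap similarity is just a stand-in to show the idea, and the function names are invented:

```python
def similarity(a, b):
    """Jaccard similarity over lowercase word sets -- a toy stand-in
    for embedding- or LLM-based similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def cluster_hypotheses(hypotheses, threshold=0.4):
    """Greedily group hypotheses similar to a cluster's first member."""
    clusters = []
    for h in hypotheses:
        for cluster in clusters:
            if similarity(h, cluster[0]) >= threshold:
                cluster.append(h)
                break
        else:
            clusters.append([h])
    return clusters
```

The point of the post in miniature: two sessions with "retrieval returned stale docs"-style hypotheses land in one cluster and get investigated together, instead of looking like two unrelated random failures.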
Show HN: A memory database that forgets, consolidates, and detects contradiction
Vector databases store memories. They don't manage them. After 10k memories, recall quality degrades because there's no consolidation, no forgetting, no conflict resolution. Your AI agent just gets noisier.<p>YantrikDB is a cognitive memory engine — embed it, run it as a server, or connect via MCP. It thinks about what it stores: consolidation collapses duplicate memories, contradiction detection flags incompatible facts, temporal decay with configurable half-life lets unimportant memories fade like human memory does.<p>Single Rust binary. HTTP + binary wire protocol. 2-voter + 1-witness HA cluster via Docker Compose or Kubernetes. Chaos-tested failover, runtime deadlock detection (parking_lot), per-tenant quotas, Prometheus metrics. Ran a 42-task hardening sprint last week — 1178 core tests, cargo-fuzz targets, CRDT property tests, 5 ops runbooks.<p>Live on a 3-node Proxmox homelab cluster with multiple tenants. Alpha — primary user is me, looking for the second one.
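The temporal decay described above, a configurable half-life after which a memory's weight halves, reduces to one line of math. A minimal sketch (the parameter names are mine, not YantrikDB's actual API):

```python
def decayed_weight(base, age_seconds, half_life_seconds):
    """Exponential decay: the weight halves once per half-life."""
    return base * 0.5 ** (age_seconds / half_life_seconds)

def should_forget(base, age_seconds, half_life_seconds, floor=0.05):
    """A memory whose decayed weight falls below the floor fades out."""
    return decayed_weight(base, age_seconds, half_life_seconds) < floor
```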
Show HN: Plain – The full-stack Python framework designed for humans and agents
Show HN: Kontext CLI – Credential broker for AI coding agents in Go
We built the Kontext CLI because AI coding agents need access to GitHub, Stripe, databases, and dozens of other services — and right now most teams handle this by copy-pasting long-lived API keys into .env files, or the actual chat interface, whilst hoping for the best.<p>The problem isn't just secret sprawl. It's that there's no lineage of access. You don't know which developer launched which agent, what it accessed, or whether it should have been allowed to. The moment you hand raw credentials to a process, you've lost the ability to enforce policy, audit access, or rotate without pain. The credential is the authorization, and that's fundamentally broken when autonomous agents are making hundreds of API calls per session.<p>Kontext takes a different approach. You declare what credentials a project needs in a .env.kontext file:<p><pre><code> GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
LINEAR_TOKEN={{kontext:linear}}
</code></pre>
Then run `kontext start --agent claude`. The CLI authenticates you via OIDC, and for each placeholder: if the service supports OAuth, it exchanges the placeholder for a short-lived access token via RFC 8693 token exchange; for static API keys, the backend injects the credential directly into the agent's runtime environment. Either way, secrets exist only in memory during the session — never written to disk on your machine. Every tool call is streamed for audit as the agent runs.<p>The closest analogy is a Security Token Service (STS): you authenticate once, and the backend mints short-lived, scoped credentials on-the-fly — except unlike a classical STS, we hold the upstream secrets, so nothing long-lived ever reaches the agent. The backend holds your OAuth refresh tokens and API keys; the CLI never sees them. It gets back short-lived access tokens scoped to the session.<p>What the CLI captures for every tool call: what the agent tried to do, what happened, whether it was allowed, and who did it — attributed to a user, session, and org.<p>Install with one command: `brew install kontext-dev/tap/kontext`<p>The CLI is written in Go (~5ms hook overhead per tool call), uses ConnectRPC for backend communication, and stores auth in the system keyring. Works with Claude Code today, Codex support coming soon.<p>We're working on server-side policy enforcement next — the infrastructure for allow/deny decisions on every tool call is already wired, we just need to close the loop so tool calls can also be rejected.<p>We'd love feedback on the approach. Especially curious: how are teams handling credential management for AI agents today? Are you just pasting env vars into the agent chat, or have you found something better?<p>GitHub: <a href="https://github.com/kontext-dev/kontext-cli" rel="nofollow">https://github.com/kontext-dev/kontext-cli</a>
Site: <a href="https://kontext.security" rel="nofollow">https://kontext.security</a>
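The placeholder substitution from the .env.kontext example above can be sketched as follows. This is a toy resolver, not the real CLI: `fetch_token` stands in for the OIDC login plus RFC 8693 exchange against the Kontext backend:

```python
import re

# Matches the {{kontext:service}} placeholder syntax shown above.
PLACEHOLDER = re.compile(r"\{\{kontext:([A-Za-z0-9_-]+)\}\}")

def resolve_env(env_text, fetch_token):
    """Replace {{kontext:service}} placeholders with brokered short-lived
    tokens; pass plain values through untouched."""
    resolved = {}
    for line in env_text.strip().splitlines():
        key, _, value = line.partition("=")
        match = PLACEHOLDER.fullmatch(value.strip())
        resolved[key.strip()] = fetch_token(match.group(1)) if match else value.strip()
    return resolved
```

In the real CLI, the resolved values would live only in the agent's in-memory environment for the session, never written to disk.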
Show HN: LangAlpha – what if Claude Code was built for Wall Street?
Some technical context on what we ran into building this.<p>MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window. And data vendors pack dozens of tools into a single MCP server; the schemas alone can eat 50k+ tokens before the agent does anything useful. So we auto-generate typed Python modules from the MCP schemas at workspace init and upload them into the sandbox. The agent just imports them like a normal library. Only a one-line summary per server stays in the prompt. We have around 80 tools across our servers, and the prompt cost is the same whether a server has 3 tools or 30. This part isn't finance-specific; it works with any MCP server.<p>The other big thing was making research actually persist across sessions. Most agents treat a single deliverable (a PDF, a spreadsheet) as the end goal. In investing that's day one. You update the model when earnings drop, re-run comps when a competitor reports, keep layering new analysis on old. But try doing that across agent sessions: files don't carry over, and you re-paste context every time. So we built everything around workspaces. Each one maps to a persistent sandbox, one per research goal. The agent maintains its own memory file with findings and a file index that gets re-read before every LLM call. Come back a week later, start a new thread, and it picks up where it left off.<p>We also wanted the agent to have real domain context the way Claude Code has codebase context. Portfolio, watchlist, risk tolerance, financial data sources, all injected into every call. Existing AI investing platforms have some of that, but nothing close to what a proper agent harness can do. We wanted both and couldn't find it, so we built it and open-sourced the whole thing.
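The schema-to-module idea described above can be illustrated with a toy emitter that turns an MCP tool's input schema into a typed Python stub. This is a sketch of the concept, not LangAlpha's actual generator; the tool name and schema are invented:

```python
TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}

def emit_stub(tool_name, input_schema):
    """Emit a typed Python function signature from an MCP tool's
    JSON-schema-style input schema."""
    params = ", ".join(
        f"{name}: {TYPE_MAP.get(spec.get('type'), 'dict')}"
        for name, spec in input_schema.get("properties", {}).items()
    )
    return f"def {tool_name}({params}) -> dict: ..."
```

Generated stubs like these go into the sandbox as importable modules, so the agent's prompt only needs the one-line-per-server summary rather than the full schemas.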
Show HN: A social feed with no strangers
Grateful is a gratitude app with a simple social layer.<p>You write a short entry, keep it private or share it to a circle. A circle is a small private group of your own making — family, close friends, whoever you'd actually want to hear from.<p>It shows you the most recent post first. People in the circle can react or leave a comment. There's also a daily notification that sends you something you were grateful for in the past.<p>Try it out on both iOS and Android. Go to grateful.so
Show HN: Turn your favorite YouTube channels into a streaming experience
A minimalist way to watch YouTube with cinematic previews, an immersive interface, and zero distractions. Free, no accounts or subscription needed.
Show HN: Continual Learning with .md
I have a proposal that addresses long-term memory problems for LLMs when new data arrives continuously (cheaply!). The approach involves no code, just two Markdown files.<p>For retrieval, there is a semantic filesystem that makes it easy for LLMs to search using shell commands.<p>It is currently a scrappy v1, but it works better than anything I have tried.<p>Curious for any feedback!
Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours
Hi HN!<p>I’ve always found it hard to explore the Mahābhārata and Rāmāyaṇa online. Most content is either long-form or scattered, and understanding a character like Karna or Bhishma usually means opening multiple tabs.<p>I built <a href="https://www.ithihasas.in/" rel="nofollow">https://www.ithihasas.in/</a> to solve that. It is a simple character explorer that lets you navigate the epics through people and their relationships instead of reading everything linearly.<p>This was also an experiment with Claude CLI. I was able to put together the first version in a couple of hours. It helped a lot with generating structured content and speeding up development, but UX and data consistency still needed manual work.<p>Would love feedback on the UX and whether this way of exploring mythology works for you.