The best Hacker News stories from Show from the past day
Latest posts:
Show HN: I've built a nice home server OS
ohai!<p>I've released Lightwhale 3, which is possibly the easiest way to self-host Docker containers.<p>It's a free, immutable Linux system purpose-built to live-boot straight into a working Docker Engine, eliminating the need for installation, configuration, and maintenance. Its simple design makes it easy to learn, and its low memory footprint should make it especially attractive during these times of RAMageddon.<p>If this has piqued your interest, do check it out, along with its easy-to-follow Getting Started guide.<p>In any event,
have a nice day! =)
Show HN: Atomic – Local-first, AI-augmented personal knowledge base
Hey HN - I first posted about my knowledge base product, Atomic, here around a month ago; since then, a viral tweet by Karpathy has produced a torrent of AI-powered knowledge base projects. Meanwhile, I've been shipping like crazy. Here are some of the new features shipped in the last month:<p>- Rebuilt the iOS app, with an Android app on the way<p>- Expanded both the MCP and internal agent chat toolkit immensely<p>- A custom, CodeMirror6-based markdown editor with Obsidian-style rendering<p>- A dashboard view that provides a daily summary of atoms created or updated in the last day<p>And many bug fixes and improvements across the board. Atomic is MIT licensed. You can download the desktop app, but the true power is unlocked by self-hosting an Atomic server, which any client (web, mobile, or desktop) can connect to from anywhere. You can add content to your knowledge base directly, or via RSS feed, web clipper, mobile share capture, Obsidian sync, or REST API.
Show HN: Browser Harness – Gives LLM freedom to complete any browser task
Hey HN,<p>We got tired of browser frameworks restricting the LLM, so we removed the framework and gave the LLM maximum freedom to do whatever it's trained on. We gave the harness the ability to self-correct and add new tools if the LLM wants (is pre-trained on) that.<p>Our Browser Use library is tens of thousands of lines of deterministic heuristics wrapping Chrome (CDP websocket). Element extractors, click helpers, target management (SUPER painful), watchdogs (crash handling, file downloads, alerts), cross-origin iframes (if you want to click on an element you have to switch the target first, very annoying), etc.<p>Watchdogs specifically are extremely painful but required. If Chrome triggers, for example, a native file popup, the agent is just completely stuck. So the two solutions are to:
1. Code those heuristics and edge cases away one by one and prevent them
2. Give the LLM a tool to handle the edge case<p>As you can imagine, there are crazy amounts of heuristics like this, so you eventually end up with A LOT of tools if you try to go for #2. So you have to make compromises and just code those heuristics away.<p>BUT if the LLM just "knows" CDP well enough to switch the targets when it encounters a cross-origin iframe, dismiss the alert when it appears, or write its own click helpers or upload function, you suddenly don't have to worry about any of those edge cases.<p>Turns out LLMs know CDP pretty well these days. So we bitter-pilled the harness. The concepts that should survive are:
- something that holds and keeps the CDP websocket alive (daemon)
- extremely basic tools (helpers.py)
- skill.md that explains how to use it<p>The new paradigm? SKILL.md + a few Python helpers that can change on the fly.<p>One cool example:
We forgot to implement an upload_file function. Then mid-task the agent wanted to upload a file, so it grepped helpers.py, saw nothing, and wrote the function itself using raw DOM.setFileInputFiles (which we only noticed later in a git diff). This was a really magical moment showing how powerful LLMs have become.<p>Compared to other approaches (Playwright MCP, browser use CLI, agent-browser, chrome devtools MCP): all of them wrap Chrome in a set of predefined functions for the LLM. The worst failure mode is silent. The LLM's click() returns fine, so the LLM thinks it clicked, but on this particular site nothing actually happened. It moves on with a broken model of the world. Browser Harness gives the LLM maximum freedom and perfect context for HOW the tools actually work.<p>Here are a few crazy examples of what Browser Harness can do:
- Plays Stockfish <a href="https://x.com/shawn_pana/status/2046457374467379347" rel="nofollow">https://x.com/shawn_pana/status/2046457374467379347</a>
- Sets a world record in Tetris <a href="https://x.com/shawn_pana/status/2047120626994012442" rel="nofollow">https://x.com/shawn_pana/status/2047120626994012442</a>
- Figures out how to draw a heart with JS <a href="https://x.com/mamagnus00/status/2046486159992480198?s=20" rel="nofollow">https://x.com/mamagnus00/status/2046486159992480198?s=20</a><p>You can super easily install it by telling Claude Code:
`Set up <a href="https://github.com/browser-use/browser-harness" rel="nofollow">https://github.com/browser-use/browser-harness</a> for me.`<p>Repo: <a href="https://github.com/browser-use/browser-harness" rel="nofollow">https://github.com/browser-use/browser-harness</a><p>What would you call this new paradigm? A dialect?
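To make the "LLM writes raw CDP" idea concrete, here is a minimal sketch of the kind of frame involved. The helper names are ours for illustration and this is not Browser Harness's actual code; only the DOM.setFileInputFiles method and its nodeId/files parameters come from the Chrome DevTools Protocol itself.

```python
import json

def cdp_message(msg_id: int, method: str, params: dict) -> str:
    """Serialize a Chrome DevTools Protocol command for the CDP websocket."""
    return json.dumps({"id": msg_id, "method": method, "params": params})

def set_file_input(msg_id: int, node_id: int, paths: list[str]) -> str:
    """Point an <input type=file> element at local files via DOM.setFileInputFiles."""
    return cdp_message(msg_id, "DOM.setFileInputFiles",
                       {"nodeId": node_id, "files": paths})
```

Calling `set_file_input(1, 42, ["/tmp/report.pdf"])` produces the JSON frame you would send over the open CDP websocket; an LLM that knows the protocol can compose such frames without a predefined upload tool.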
Show HN: Gova – The declarative GUI framework for Go
Show HN: How LLMs Work – Interactive visual guide based on Karpathy's lecture
All content is based on Andrej Karpathy's "Intro to Large Language Models" lecture (youtube.com/watch?v=7xTGNNLPyMI). I downloaded the transcript and used Claude Code to generate the entire interactive site from it — a single HTML file. I find it useful to revisit this content from time to time.
Show HN: Ghost Pepper Meet – local meeting transcription and diarization
100% local and private transcription engine for macOS. It captures audio and performs speaker diarization. I was originally building it as its own app, but it can leverage the same local models as my original push-to-talk voice transcription product, so I combined them into one app.
Show HN: Agent Vault – Open-source credential proxy and vault for agents
Hey HN! Today we're launching Agent Vault - an open source HTTP credential proxy and vault for AI agents. Repo is at <a href="https://github.com/Infisical/agent-vault" rel="nofollow">https://github.com/Infisical/agent-vault</a>, and there's an in-depth description at <a href="https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents">https://infisical.com/blog/agent-vault-the-open-source-crede...</a>.<p>We built Agent Vault in response to a question that has been plaguing the industry: How do we give agents secure access to services without them reading any secrets?<p>Most teams building agents have run into this exact problem: They build an agent or agentic system and come to realize at some point that it needs credentials in order to access any services. The issue is that agents, unlike traditional workloads, are non-deterministic and highly prone to prompt injection, and can thus easily be manipulated into leaking the credentials that they need to operate. This is the problem of credential exfiltration (not to be confused with data exfiltration).<p>In response to this, some teams we've seen have implemented basic guardrails and security controls to mitigate this risk in their agentic environments, including using short-lived access tokens. The more advanced teams have started to converge toward a pattern: credential brokering, the idea being to separate agents from their credentials through some form of egress proxy. In this model, the agent makes a request to a proxy that attaches a credential onto it and brokers it through to the target service. This proxy approach is actually used in Anthropic's Managed Agents architecture blog post, which notes that "the harness is never made aware of the credentials."
We've seen similar credential brokering schemes come out from Vercel and in Cloudflare's latest Outbound Workers.<p>Seeing all this made us think: What if we could create a portable credential brokering service that plugs seamlessly into agents' existing workflows in an interface-agnostic way, meaning that agents could continue to work with APIs, CLIs, SDKs, and MCPs without interference while still getting the security of credential brokering?<p>This led to Agent Vault - an open source HTTP credential proxy and vault that we're building for AI agents. You can deploy this as a dedicated service and set up your agent's environment to proxy requests through it. Note that in a full deployment, you do need to lock down the network so that all outbound traffic is forced through Agent Vault.<p>The Agent Vault (AV) implementation has a few interesting design decisions:<p>- Local forward proxy: AV takes an interface-agnostic approach to credential brokering by following a MITM architecture, using an HTTPS_PROXY environment variable set in the agent's environment to redirect traffic through it; this also means that it runs its own CA whose certificate must be configured in the client's trust store.<p>- MITM architecture: Since AV terminates TLS in order to do credential brokering, it's able to inspect traffic and apply rules to it before establishing a new TLS connection upstream. This makes it straightforward to extend AV with firewall-like features applied at this proxy layer.<p>- Portable: AV itself is a single Go binary that bundles a server and the CLI; it can be deployed as a Docker container as well.
In practice, this means that you can self-host AV on your own infrastructure, and it should work more universally than provider-specific approaches like those of Vercel and Cloudflare.<p>While the preliminary design of Agent Vault is a bit clunky to work with, and we wish we'd had more time to smooth out the developer experience, particularly the configuration setup for agents to start proxying requests through it, we figured it would be best to open source the technology and work with the community to make gradual improvements so it works seamlessly across all agentic use cases, since each has its own nuances.<p>All in all, we believe credential brokering is the right next step for how secrets management should be done for agents, and we'd love to hear your views, questions, and feedback!
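As a rough illustration of the brokering idea (not Agent Vault's actual code), the core move at the proxy layer can be sketched as a function that attaches the real credential to an outbound request the agent composed without one. The `broker_request` name and the request/secret shapes are assumptions for this sketch.

```python
def broker_request(request: dict, secrets: dict) -> dict:
    """Attach the real credential at the proxy layer; the agent never sees it.

    `request` is the agent's outbound request (host + headers, no secrets);
    `secrets` maps target hosts to tokens held only by the broker.
    """
    out = dict(request)
    headers = dict(out.get("headers", {}))
    token = secrets.get(out["host"])
    if token:
        # Inject the credential only into the brokered copy going upstream.
        headers["Authorization"] = f"Bearer {token}"
    out["headers"] = headers
    return out
```

The agent-side request never contains the token; only the brokered copy that leaves the proxy does, which is the separation the post describes.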
Show HN: Tolaria – open-source macOS app to manage Markdown knowledge bases
Hey there! I am Luca, I write <a href="https://refactoring.fm/" rel="nofollow">https://refactoring.fm/</a> and I built Tolaria for myself to manage my own knowledge base (10K notes, 300+ articles written over 6 years of newslettering) and to work well with AI.<p>Tolaria is offline-first, file-based, has first-class support for git, and has strong opinions about how you should organize notes (types, relationships, etc.).<p>Let me know your thoughts!
Show HN: Built a daily game where you sort historical events chronologically
Show HN: Honker – Postgres NOTIFY/LISTEN Semantics for SQLite
Show HN: WeTransfer Alternative for Developers
Show HN: Backlit Keyboard API for Python
It currently supports Linux. You can use this package to tinker with many things; for example, you could build a custom notification system that blinks the keyboard backlight when your website goes down. macOS support is underway. I haven't tested Windows yet, since I don't use it anymore.
In the future, if this package gains traction, I'll be happy to make a similar Rust crate for it.
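A minimal sketch of the website-down notification idea described above. The health check is plain stdlib; the blink call is a hypothetical stand-in, since the package's real function names may differ.

```python
import urllib.request

def site_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the site answers with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # DNS failure, connection refused, timeout, etc.
        return False

def notify_if_down(url: str) -> None:
    """Blink the keyboard backlight when the site is unreachable."""
    if not site_is_up(url):
        # blink_backlight(times=3)  # hypothetical call into the package's API
        print(f"{url} is down")
```

You would run `notify_if_down` on a timer (cron, systemd timer, or a sleep loop) and swap the commented line for the package's actual blink function.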
Show HN: Broccoli, one shot coding agent on the cloud
Hi HN — we built Broccoli, an open-source harness for taking coding tasks from Linear, running them in isolated cloud sandboxes, and opening PRs for a human to review.<p>We’re a small team, and our main company supplies voice data. But we kept running into the same problem with coding agents. We’d have a feature request, a refactor, a bug, and some internal tooling work all happening at once, and managing that through local agent sessions meant a lot of context switching, worktree juggling, and laptops left open just so tasks could keep running.<p>So we built Broccoli. Each task gets its own cloud sandbox and is executed end to end independently. Broccoli checks out the repo, uses the context in the ticket, works through an implementation, runs tests and review loops, and opens a PR for someone on the team to inspect.<p>Over the last four weeks, 100% of the PRs from non-developers have been shipped via Broccoli, which is a safer and more efficient route. For developers on the team, this share is around 60%. More complicated features require more back-and-forth design with Codex / Claude Code and get shipped manually using the same set of skills locally.<p>Our implementation uses:<p>1. Webhook deployment: GCP
2. Sandbox: GCP or Blaxel
3. Project management: Linear
4. Code hosting & CI/CD: GitHub<p>Repo: <a href="https://github.com/besimple-oss/broccoli" rel="nofollow">https://github.com/besimple-oss/broccoli</a><p>We believe you should invest in your own coding harness if coding is an essential part of your business. That’s why we decided to open-source it as an alternative to all the cloud coding agents out there. Would love to hear your feedback on this!
Show HN: Ctx – a /resume that works across Claude Code and Codex
ctx is a local SQLite-backed skill for Claude Code and Codex that stores context as a persistent workstream that can be continued across agent sessions. Each workstream can contain multiple sessions, notes, decisions, todos, and resume packs. It essentially functions as a /resume that works across coding agents.<p>Here is a video of how it works: <a href="https://www.loom.com/share/5e558204885e4264a34d2cf6bd488117" rel="nofollow">https://www.loom.com/share/5e558204885e4264a34d2cf6bd488117</a><p>I initially built ctx because I wanted to take a workstream that I started on Claude and continue it from Codex. Since then, I’ve added a few quality-of-life improvements, including the ability to search across previous workstreams, manually delete parts of the context, and branch off existing workstreams. I’ve started using ctx instead of the native ‘/resume’ in Claude/Codex because I often have a lot of sessions going at once, and with the lists that these apps currently give, it’s not always obvious which one is the right one to pick back up. ctx gives me a much clearer way to organize and return to the sessions that actually matter.<p>It’s simple to install: after you clone the repo, one line (./setup.sh) adds the skill to both Claude Code and Codex. After that, you should be able to directly use ctx in your agent as a skill with ‘/ctx [command]’ in Claude and ‘ctx [command]’ in Codex.<p>A few things it does:<p>- Resume an existing workstream from either tool<p>- Pull existing context into a new workstream<p>- Keep stable transcript binding, so once a workstream is linked to a Claude or Codex conversation, it keeps following that exact session instead of drifting to whichever transcript file is newest<p>- Search for relevant workstreams<p>- Branch from existing context to explore different tasks in parallel<p>It’s intentionally local-first: SQLite, no API keys, and no hosted backend.
I built it mainly for myself, but thought it would be cool to share with the HN community.
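A minimal sketch of the kind of schema a SQLite-backed workstream store might use, with one-to-many sessions and typed notes per workstream. Table and column names here are assumptions for illustration, not ctx's actual layout.

```python
import sqlite3

# In-memory DB for the sketch; a real store would use a file path.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE workstreams (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE sessions (
  id INTEGER PRIMARY KEY,
  workstream_id INTEGER NOT NULL REFERENCES workstreams(id),
  agent TEXT CHECK (agent IN ('claude-code', 'codex')),
  transcript_path TEXT  -- stable binding to one conversation file
);
CREATE TABLE notes (
  id INTEGER PRIMARY KEY,
  workstream_id INTEGER NOT NULL REFERENCES workstreams(id),
  kind TEXT CHECK (kind IN ('note', 'decision', 'todo')),
  body TEXT NOT NULL
);
""")
```

Pinning `transcript_path` per session is one way to get the "stable transcript binding" behavior described above: the tool follows the recorded file rather than whichever transcript is newest.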