The best Hacker News stories from Show from the past day

Go back

Latest posts:

Show HN: Real-time dashboard for Claude Code agent teams

This project (Agents Observe) started as an exploration into building automation harnesses around claude code. I needed a way to see exactly what teams of agents were doing in realtime and to filter and search their output.<p>A few interesting learnings from building and using this:<p>- Claude code hooks are blocking - performance degrades rapidly if you have a lot of plugins that use hooks<p>- Hooks provide a lot more useful info than OTEL data<p>- Claude's jsonl files provide the full picture<p>- Lifecycle management of MCP processes started by plugins is a bit kludgy at best<p>The biggest takeaway is how much of a difference it made in claude performance when I switched to background (fire and forget) hooks and removed all other plugins. It's easy to forget how many claude plugins I've installed and how they effect performance.<p>The Agents Observe plugin uses docker to start the API and dashboard service. This is a pattern I'd love to see used more often for security (think Axios hack) reasons. The tricky bit was handling process management across multiple claude instances - the solution was to have the server track active connections then auto shut itself down when not in use. Then the plugin spins it back up when a new session is started.<p>This tool has been incredibly useful for my own daily workflow. Enjoy!

Show HN: Baton – A desktop app for developing with AI agents

Hi,<p>I built this because running multiple Claude Code agents across multiple IDE and terminal windows was getting messy. Like many, I went from working at one thing at the time, to multiple, and it was all changing quite fast.<p>I needed one place to see all my agents and worktrees, seamlessly switch between them, monitor their status and once their done, review their changes. I also wanted to quickly spin up new agents in isolated worktrees whenever an idea came to mind.<p>I've been building Baton from within Baton for a while now, which has been a pretty fun loop. Would love to hear what you think!

Show HN: Sycamore – next gen Rust web UI library using fine-grained reactivity

Show HN: Zerobox – Sandbox any command with file, network, credential controls

I'm excited to introduce Zerobox, a cross-platform, single binary process sandboxing CLI written in Rust. It uses the sandboxing crates from the OpenAI Codex repo and adds additional functionalities like secret injection, SDK, etc.<p>Watch the demo: <a href="https://www.youtube.com/watch?v=wZiPm9BOPCg" rel="nofollow">https://www.youtube.com/watch?v=wZiPm9BOPCg</a><p>Zerobox follows the same sandboxing policy as Deno which is deny by default. The only operation that the command can run is reading files, all writes and network I/O are blocked by default. No VMs, no Docker, no remote servers.<p>Want to block reads to /etc?<p><pre><code> zerobox --deny-read=/etc -- cat /etc/passwd cat: /etc/passwd: Operation not permitted </code></pre> How it works:<p>Zerobox wraps any commands/programs, runs an MITM proxy and uses the native sandboxing solutions on each operating system (e.g BubbleWrap on Linux) to run the given process in a sandbox. The MITM proxy has two jobs: blocking network calls and injecting credentials at the network level.<p>Think of it this way, I want to inject "Bearer OPENAI_API_KEY" but I don't want my sandboxed command to know about it, Zerobox does that by replacing "OPENAI_API_KEY" with a placeholder, then replaces it when the actual outbound network call is made, see this example:<p><pre><code> zerobox --secret OPENAI_API_KEY=$OPENAI_API_KEY --secret-host OPENAI_API_KEY=api.openai.com -- bun agent.ts </code></pre> Zerobox is different than other sandboxing solutions in the sense that it would allow you to easily sandbox any commands locally and it works the same on all platforms. I've been exploring different sandboxing solutions, including Firecracker VMs locally, and this is the closest I was able to get when it comes to sandboxing commands locally.<p>The next thing I'm exploring is `zerobox claude` or `zerobox openclaw` which would wrap the entire agent and preload the correct policy profiles.<p>I'd love to hear your feedback, especially if you are running AI Agents (e.g. OpenClaw), MCPs, AI Tools locally.

Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs

Show HN: CLI to order groceries via reverse-engineered REWE API (Haskell)

I just had the best time learning about the REWE (German supermarket chain) API, how they use mTLS and what the workflows are. Also `mitmproxy2swagger`[1] is a great tool to create OpenAPI spec automatically.<p>And then 2026 feels like the perfect time writing Haskell. The code is handwritten, but whenever I got stuck with the build system or was just not getting the types right, I could fall back to ask AI to unblock me. It was never that smooth before.<p>Finally the best side projects are the ones you actually use and this one will be used for all my future grocery shopping.<p>[1]<a href="https://github.com/alufers/mitmproxy2swagger" rel="nofollow">https://github.com/alufers/mitmproxy2swagger</a>

Show HN: Loreline, narrative language transpiled via Haxe: C++/C#/JS/Java/Py/Lua

Show HN: I turned a sketch into a 3D-print pegboard for my kid with an AI agent

We have pegboards and plywood all over our apartment, and I had an idea to make a tiny pegboard for my kid, Oli. So I naturally cut the wood, drilled in the holes, sat down at the computer to open Fusion 360 and spend an hour or two drawing the pieces by hand.<p>Then I looked at the rough sketch Oli and I had made together, took a photo of it, pasted it into Codex, and gave it just two dimensions: the holes are 40mm apart and the pegs are 8mm wide.<p>To my surprise, 5 minutes later my 3D printer was heating up and printing the first set.<p>I ran it a few times to tune the dimensions for ideal fit, but I am posting the final result as a repository in case anyone else wants to print one, tweak it, or have fun with it too. I am already printing another one to hang on our front door instead of a wreath, so people visiting us have something fun and intriguing to play with while they knock.<p>This is also going onto my list of weird uses of AI from the last few months.

Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs

Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs

Show HN: Postgres extension for BM25 relevance-ranked full-text search

Last summer we faced a conundrum at my company, Tiger Data, a Postgres cloud vendor whose main business is in timeseries data. We were trying to grow our business towards emerging AI-centric workloads and wanted to provide a state-of-the-art hybrid search stack in Postgres. We'd already built pgvectorscale in house with the goal of scaling semantic search beyond pgvector's main memory limitations. We just needed a scalable ranked keyword search solution too.<p>The problem: core Postgres doesn't provide this; the leading Postgres BM25 extension, ParadeDB, is guarded behind AGPL; developing our own extension appeared daunting. We'd need a small team of sharp engineers and 6-12 months, I figured. And we'd probably still fall short of the performance of a mature system like Parade/Tantivy.<p>Or would we? I'd be experimenting long enough with AI-boosted development at that point to realize that with the latest tools (Claude Code + Opus) and an experienced hand (I've been working in database systems internals for 25 years now), the old time estimates pretty much go out the window.<p>I told our CTO I thought I could solo the project in one quarter. This raised some eyebrows.<p>It did take a little more time than that (two quarters), and we got some real help from the community (amazing!) after open-sourcing the pre-release. But I'm thrilled/exhausted today to share that pg_textsearch v1.0 is freely available via open source (Postgres license), on Tiger Data cloud, and hopefully soon, a hyperscalar near you:<p><a href="https://github.com/timescale/pg_textsearch" rel="nofollow">https://github.com/timescale/pg_textsearch</a><p>In the blog post accompanying the release, I overview the architecture and present benchmark results using MS-MARCO. To my surprise, we were not only able to meet Parade/Tantivy's query performance, but exceed it substantially, measuring a 4.7x advantage on query throughput at scale:<p><a href="https://www.tigerdata.com/blog/pg-textsearch-bm25-full-text-search-postgres" rel="nofollow">https://www.tigerdata.com/blog/pg-textsearch-bm25-full-text-...</a><p>It's exciting (and, to be honest, a little unnerving) to see a field I've spent so much time toiling in change so quickly in ways that enable us to be more ambitious in our technical objectives. Technical moats are moats no longer.<p>The benchmark scripts and methodology are available in the github repo. Happy to answer any questions in the thread.<p>Thanks,<p>TJ (tj@tigerdata.com)

Show HN: Postgres extension for BM25 relevance-ranked full-text search

Last summer we faced a conundrum at my company, Tiger Data, a Postgres cloud vendor whose main business is in timeseries data. We were trying to grow our business towards emerging AI-centric workloads and wanted to provide a state-of-the-art hybrid search stack in Postgres. We'd already built pgvectorscale in house with the goal of scaling semantic search beyond pgvector's main memory limitations. We just needed a scalable ranked keyword search solution too.<p>The problem: core Postgres doesn't provide this; the leading Postgres BM25 extension, ParadeDB, is guarded behind AGPL; developing our own extension appeared daunting. We'd need a small team of sharp engineers and 6-12 months, I figured. And we'd probably still fall short of the performance of a mature system like Parade/Tantivy.<p>Or would we? I'd be experimenting long enough with AI-boosted development at that point to realize that with the latest tools (Claude Code + Opus) and an experienced hand (I've been working in database systems internals for 25 years now), the old time estimates pretty much go out the window.<p>I told our CTO I thought I could solo the project in one quarter. This raised some eyebrows.<p>It did take a little more time than that (two quarters), and we got some real help from the community (amazing!) after open-sourcing the pre-release. But I'm thrilled/exhausted today to share that pg_textsearch v1.0 is freely available via open source (Postgres license), on Tiger Data cloud, and hopefully soon, a hyperscalar near you:<p><a href="https://github.com/timescale/pg_textsearch" rel="nofollow">https://github.com/timescale/pg_textsearch</a><p>In the blog post accompanying the release, I overview the architecture and present benchmark results using MS-MARCO. To my surprise, we were not only able to meet Parade/Tantivy's query performance, but exceed it substantially, measuring a 4.7x advantage on query throughput at scale:<p><a href="https://www.tigerdata.com/blog/pg-textsearch-bm25-full-text-search-postgres" rel="nofollow">https://www.tigerdata.com/blog/pg-textsearch-bm25-full-text-...</a><p>It's exciting (and, to be honest, a little unnerving) to see a field I've spent so much time toiling in change so quickly in ways that enable us to be more ambitious in our technical objectives. Technical moats are moats no longer.<p>The benchmark scripts and methodology are available in the github repo. Happy to answer any questions in the thread.<p>Thanks,<p>TJ (tj@tigerdata.com)

Show HN: Forkrun – NUMA-aware shell parallelizer (50×–400× faster than parallel)

forkrun is the culmination of a 10-year-long journey focused on "how to make shell parallelization fast". What started as a standard "fork jobs in a loop" has turned into a lock-free, CAS-retry-loop-free, SIMD-accelerated, self-tuning, NUMA aware shell-based stream parallelization engine that is (mostly) a drop-in replacement for xargs -P and GNU parallel.<p>On my 14-core/28-thread i9-7940x, forkrun achieves:<p>* 200,000+ batch dispatches/sec (vs ~500 for GNU Parallel)<p>* ~95–99% CPU utilization across all 28 logical cores, even when the workload is non-existant (bash no-ops / `:`) (vs ~6% for GNU Parallel). These benchmarks are intentionally worst-case (near-zero work per task) because they measure the capability of the parallelization framework itself, not how much work an external tool can do.<p>* Typically 50×–400× faster on real high-frequency low-latency workloads (vs GNU Parallel)<p>A few of the techniques that make this possible:<p>* Born-local NUMA: stdin is splice()'d into a shared memfd, then pages are placed on the target NUMA node via set_mempolicy(MPOL_BIND) before any worker touches them, making the memfd NUMA-spliced. Each numa node only claims work that is <i>already</i> born-local on its node. Stealing from other nodes is permitted under some conditions when no local work exists.<p>* SIMD scanning: per-node indexers/scanners use AVX2/NEON to find line boundaries (delimiters) at speeds approaching memory bandwidth, and publish byte-offsets and line-counts into per-node lock-free rings.<p>* Lock-free claiming: workers claim batches with a single atomic_fetch_add — no locks, no CAS retry loops; contention is reduced to a single atomic on one cache line.<p>* Memory management: a background thread uses fallocate(PUNCH_HOLE) to reclaim space without breaking the logical offset system.<p>…and that’s just the surface. The implementation uses many additional systems-level techniques (phase-aware tail handling, adaptive batching, early-flush detection, etc.) to eliminate overhead, increase throughput and reduce latency at every stage.<p>In its fastest (-b) mode (fixed-size batches, minimal processing), it can exceed 1B lines/sec.<p>forkrun ships as a single bash file with an embedded, self-extracting C extension — no Perl, no Python, no install, full native support for parallelizing arbitrary shell functions. The binary is built in public GitHub Actions so you can trace it back to CI (see the GitHub "Blame" on the line containing the base64 embeddings). Trying it is literally two commands:<p><pre><code> . frun.bash frun shell_func_or_cmd < inputs </code></pre> For benchmarking scripts and results, see the BENCHMARKS dir in the GitHub repo<p>For an architecture deep-dive, see the DOCS dir in the GitHub repo<p>Happy to answer questions.

Show HN: Forkrun – NUMA-aware shell parallelizer (50×–400× faster than parallel)

forkrun is the culmination of a 10-year-long journey focused on "how to make shell parallelization fast". What started as a standard "fork jobs in a loop" has turned into a lock-free, CAS-retry-loop-free, SIMD-accelerated, self-tuning, NUMA aware shell-based stream parallelization engine that is (mostly) a drop-in replacement for xargs -P and GNU parallel.<p>On my 14-core/28-thread i9-7940x, forkrun achieves:<p>* 200,000+ batch dispatches/sec (vs ~500 for GNU Parallel)<p>* ~95–99% CPU utilization across all 28 logical cores, even when the workload is non-existant (bash no-ops / `:`) (vs ~6% for GNU Parallel). These benchmarks are intentionally worst-case (near-zero work per task) because they measure the capability of the parallelization framework itself, not how much work an external tool can do.<p>* Typically 50×–400× faster on real high-frequency low-latency workloads (vs GNU Parallel)<p>A few of the techniques that make this possible:<p>* Born-local NUMA: stdin is splice()'d into a shared memfd, then pages are placed on the target NUMA node via set_mempolicy(MPOL_BIND) before any worker touches them, making the memfd NUMA-spliced. Each numa node only claims work that is <i>already</i> born-local on its node. Stealing from other nodes is permitted under some conditions when no local work exists.<p>* SIMD scanning: per-node indexers/scanners use AVX2/NEON to find line boundaries (delimiters) at speeds approaching memory bandwidth, and publish byte-offsets and line-counts into per-node lock-free rings.<p>* Lock-free claiming: workers claim batches with a single atomic_fetch_add — no locks, no CAS retry loops; contention is reduced to a single atomic on one cache line.<p>* Memory management: a background thread uses fallocate(PUNCH_HOLE) to reclaim space without breaking the logical offset system.<p>…and that’s just the surface. The implementation uses many additional systems-level techniques (phase-aware tail handling, adaptive batching, early-flush detection, etc.) to eliminate overhead, increase throughput and reduce latency at every stage.<p>In its fastest (-b) mode (fixed-size batches, minimal processing), it can exceed 1B lines/sec.<p>forkrun ships as a single bash file with an embedded, self-extracting C extension — no Perl, no Python, no install, full native support for parallelizing arbitrary shell functions. The binary is built in public GitHub Actions so you can trace it back to CI (see the GitHub "Blame" on the line containing the base64 embeddings). Trying it is literally two commands:<p><pre><code> . frun.bash frun shell_func_or_cmd < inputs </code></pre> For benchmarking scripts and results, see the BENCHMARKS dir in the GitHub repo<p>For an architecture deep-dive, see the DOCS dir in the GitHub repo<p>Happy to answer questions.

Show HN: The Alphabetical Clock

Show HN: Public transit systems as data – lines, stations, railcars, and history

Show HN: Coasts – Containerized Hosts for Agents

Hi HN - We've been working on Coasts (“containerized hosts”) to make it so you can run multiple localhost instances, and multiple docker-compose runtimes, across git worktrees on the same computer. Here’s a demo: <a href="https://www.youtube.com/watch?v=yRiySdGQZZA" rel="nofollow">https://www.youtube.com/watch?v=yRiySdGQZZA</a>. There are also videos in our docs that give a good conceptual overview: <a href="https://coasts.dev/docs/learn-coasts-videos">https://coasts.dev/docs/learn-coasts-videos</a>.<p>Agents can make code changes in different worktrees in isolation, but it's hard for them to test their changes without multiple localhost runtimes that are isolated and scoped to those worktrees as well. You can do it up to a point with port hacking tricks, but it becomes impractical when you have a complex docker-compose with many services and multiple volumes.<p>We started playing with Codex and Conductor in the beginning of this year and had to come up with a bunch of hacky workarounds to give the agents access to isolated runtimes. After bastardizing our own docker-compose setup, we came up with Coasts as a way for agents to have their own runtimes without having to change your original docker-compose.<p>A containerized host (from now on we’ll just say “coast” for short) is a representation of your project's runtime, like a devcontainer but without the IDE stuff—it’s just focused on the runtime. You create a Coastfile at your project root and usually point to your project's docker-compose from there. When you run `coast build` next to the Coastfile you will get a build (essentially a docker image) that can be used to spin up multiple Docker-in-Docker runtimes of your project.<p>Once you have a coast running, you can then do things like assign it to a worktree, with `coast assign dev-1 -w worktree-1`. The coast will then point at the worktree-1 root.<p>Under the hood the host project root and any external worktree directories are Docker-bind-mounted into the container at creation time but the /workspace dir, where we run the services of the coast from, is a separate Linux bind mount that we create inside the running container. When switching worktrees we basically just do umount -l /workspace, mount --bind <path_to_worktree_root>, mount --make-rshared /workspace inside of the running coast. The rshared flag sets up mount propagation so that when we remount /workspace, the change flows down to the inner Docker daemon's containers.<p>The main idea is that the agents can continue to work host-side but then run exec commands against a specific coast instance if they need to test runtime changes or access runtime logs. This makes it so that we are harness agnostic and create interoperability around any agent or agent harness that runs host-side.<p>Each coast comes with its own set of dynamic ports: you define the ports you wish to expose back to the host machine in the Coastfile. You're also able to "checkout" a coast. When you do that, socat binds the canonical ports of your coast (e.g. web 3000, db 5432) to the host machine. This is useful if you have hard coded ports in your project or need to do something like test webhooks.<p>In your Coastfile you point to all the locations on your host-machine where you store your worktrees for your project (e.g. ~/.codex/worktrees). When an agent runs `coast lookup` from a host-side worktree directory, it is able to find the name of the coast instance it is running on, so it can do things like call `coast exec dev-1 make tests`. If your agent needs to do things like test with Playwright it can so that host-side by using the dynamic port of your frontend.<p>You can also configure volume topologies, omit services and volumes that your agent doesn't need, as well as share certain services host-side so you don't add overhead to each coast instance. You can also do things like define strategies for how each service should behave after a worktree assignment change (e.g. none, hot, restart, rebuild). This helps you optimize switching worktrees so you don't have to do a whole docker-compose down and up cycle every time.<p>We'd love to answer any questions and get your feedback!

Show HN: Coasts – Containerized Hosts for Agents

Hi HN - We've been working on Coasts (“containerized hosts”) to make it so you can run multiple localhost instances, and multiple docker-compose runtimes, across git worktrees on the same computer. Here’s a demo: <a href="https://www.youtube.com/watch?v=yRiySdGQZZA" rel="nofollow">https://www.youtube.com/watch?v=yRiySdGQZZA</a>. There are also videos in our docs that give a good conceptual overview: <a href="https://coasts.dev/docs/learn-coasts-videos">https://coasts.dev/docs/learn-coasts-videos</a>.<p>Agents can make code changes in different worktrees in isolation, but it's hard for them to test their changes without multiple localhost runtimes that are isolated and scoped to those worktrees as well. You can do it up to a point with port hacking tricks, but it becomes impractical when you have a complex docker-compose with many services and multiple volumes.<p>We started playing with Codex and Conductor in the beginning of this year and had to come up with a bunch of hacky workarounds to give the agents access to isolated runtimes. After bastardizing our own docker-compose setup, we came up with Coasts as a way for agents to have their own runtimes without having to change your original docker-compose.<p>A containerized host (from now on we’ll just say “coast” for short) is a representation of your project's runtime, like a devcontainer but without the IDE stuff—it’s just focused on the runtime. You create a Coastfile at your project root and usually point to your project's docker-compose from there. When you run `coast build` next to the Coastfile you will get a build (essentially a docker image) that can be used to spin up multiple Docker-in-Docker runtimes of your project.<p>Once you have a coast running, you can then do things like assign it to a worktree, with `coast assign dev-1 -w worktree-1`. The coast will then point at the worktree-1 root.<p>Under the hood the host project root and any external worktree directories are Docker-bind-mounted into the container at creation time but the /workspace dir, where we run the services of the coast from, is a separate Linux bind mount that we create inside the running container. When switching worktrees we basically just do umount -l /workspace, mount --bind <path_to_worktree_root>, mount --make-rshared /workspace inside of the running coast. The rshared flag sets up mount propagation so that when we remount /workspace, the change flows down to the inner Docker daemon's containers.<p>The main idea is that the agents can continue to work host-side but then run exec commands against a specific coast instance if they need to test runtime changes or access runtime logs. This makes it so that we are harness agnostic and create interoperability around any agent or agent harness that runs host-side.<p>Each coast comes with its own set of dynamic ports: you define the ports you wish to expose back to the host machine in the Coastfile. You're also able to "checkout" a coast. When you do that, socat binds the canonical ports of your coast (e.g. web 3000, db 5432) to the host machine. This is useful if you have hard coded ports in your project or need to do something like test webhooks.<p>In your Coastfile you point to all the locations on your host-machine where you store your worktrees for your project (e.g. ~/.codex/worktrees). When an agent runs `coast lookup` from a host-side worktree directory, it is able to find the name of the coast instance it is running on, so it can do things like call `coast exec dev-1 make tests`. If your agent needs to do things like test with Playwright it can so that host-side by using the dynamic port of your frontend.<p>You can also configure volume topologies, omit services and volumes that your agent doesn't need, as well as share certain services host-side so you don't add overhead to each coast instance. You can also do things like define strategies for how each service should behave after a worktree assignment change (e.g. none, hot, restart, rebuild). This helps you optimize switching worktrees so you don't have to do a whole docker-compose down and up cycle every time.<p>We'd love to answer any questions and get your feedback!

Show HN: I made a "programming language" looking for feedback

Show HN: Crazierl – An Erlang Operating System

Crazierl is an experimental/hobby operating system based around BEAM.<p>I've linked the browser based demo; I don’t recommend using a phone; it does work, slowly, on the phones I tested, but it’s very awkward to use. You can share a link with a hashtag with your friends and click the consent checkbox, and it (should) link up into dist and I’ve also included a chat application you can start with chat:start(). (quit chat with /quit, or use the shell menu with ctrl-g to switch between shells etc).<p>The browser demo relies on the v86 javascript x86 virtual machine. You can also run Crazierl on a real x86 system, but I’ve had mixed luck on modern systems, it uses some esoteric legacy VGA features and support for that isn’t getting better.<p>Crazierl is fairly limited: 32-bit x86, BIOS boot, only two NIC drivers virtio-net and realtek 8168. But it's got enough to become part of an Erlang dist cluster. It also supports SMP, but it’s crashy with high core counts in qemu; there’s almost certainly several concurrency bugs in the kernel. There's also a lot of excess tcp debug spew (sorry).<p>Source code is available (Apache) <a href="https://github.com/russor/crazierl/" rel="nofollow">https://github.com/russor/crazierl/</a>

1 2 3 ... 959 960 961 >