The best Hacker News stories from Show from the past day

Latest posts:

Show HN: Opbox – CRDT based sync for text files on disk

Hi! I’m one of the founders of s2.dev, and recently have been hacking on opbox, which is an open-source daemon that turns directories of text files (code, markdown, etc.) into collaborative, multi-player workspaces.

This started as a bit of an intellectual curiosity, to see if it was possible to do real-time sync at the filesystem level (i.e., in an editor-agnostic way).

The idea is pretty simple:

- Opbox workspaces are roughly analogous to git repositories (and can be used alongside existing git repos, to share live changes between commits)
- When the opbox daemon is running in a workspace (ob start), it listens for local filesystem events within its directory (writes, deletes, new files), and translates them into operations (the titular “op”) on shadow CRDT documents (Yrs) corresponding to each text file (as well as one doc for the namespace as a whole, which handles paths)
- These shadow CRDT docs are maintained in a workspace-local sqlite db (Turso)
- The ops, which represent diffs on a corresponding CRDT document, are then appended to a durable stream (S2) that acts as a shared journal for all sync participants
- Opbox also reads from that journal, receiving ops from other participants, which are then used to update the local documents, first in the db, then by materializing them into actual files on disk

This has worked surprisingly well for sharing things like Obsidian graphs in real-time.

It’s most helpful in cases where you want the ability to edit local files from arbitrary editors, but still collaborate live. The experience is best from editors where you can configure an aggressive autosave policy, and where edits to an open file are reflected in the editor in a timely way.

To gain confidence in the correctness of the core opbox flows (particularly all the nuances around bidirectional sync) I invested in wiring up deterministic simulation testing using the turmoil library, which has been incredibly helpful (see the opbox-sim crate in the repo).
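The journal flow described in this post can be sketched in a few lines of Python. This is only a toy under loud assumptions: the names `Op`, `Journal`, and `Participant` are hypothetical, the journal is an in-memory list rather than an S2 stream, and each op carries the whole file text with "last op in the journal wins" semantics instead of the Yrs CRDT diffs opbox actually uses. It shows only the append-then-replay shape of the sync loop.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    """One journal entry. Hypothetical: real ops are CRDT diffs, not whole files."""
    author: str
    path: str
    text: str


@dataclass
class Journal:
    """Stand-in for the shared durable stream: an append-only list of ops."""
    ops: list = field(default_factory=list)

    def append(self, op: Op) -> None:
        self.ops.append(op)

    def read_from(self, cursor: int) -> list:
        return self.ops[cursor:]


@dataclass
class Participant:
    name: str
    journal: Journal
    files: dict = field(default_factory=dict)  # path -> text, stands in for files on disk
    cursor: int = 0                            # next journal position to read

    def local_write(self, path: str, text: str) -> None:
        """Simulate a filesystem event: keep the edit locally and publish it as an op."""
        self.files[path] = text
        self.journal.append(Op(self.name, path, text))

    def pull(self) -> None:
        """Replay unseen ops in journal order (own ops included) so replicas converge."""
        for op in self.journal.read_from(self.cursor):
            self.files[op.path] = op.text
            self.cursor += 1


journal = Journal()
alice = Participant("alice", journal)
bob = Participant("bob", journal)

alice.local_write("notes.md", "hello")
bob.pull()  # bob now sees alice's file

# Two concurrent edits: journal order decides the winner in this toy model.
bob.local_write("notes.md", "hello from bob")
alice.local_write("notes.md", "hello from alice")
alice.pull()
bob.pull()
print(alice.files == bob.files)
```

Whole-file last-writer-wins discards one of the two concurrent edits, which is exactly the loss a text CRDT is meant to avoid by merging both.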

Show HN: Morph Reflexes – Multi-head classifiers for agent traces

The most common failures for production agents are behavioral: looping, reasoning leakage, user frustration, and more. Using a frontier model like GPT or Sonnet to judge every turn is too expensive and slow to run at scale.

To solve this, we built Reflexes: semantic signals from agent traces, served fast and cheap over API. It is built on custom kernels and a custom inference engine forked from vLLM.

Under the hood, it is a small LLM architected around multi-head inference. Small models need to be trained for specific tasks, but running 50 separate small models on the same input for 50 tasks makes no sense.

How it works: We use a modern LLM with hybrid attention and remove the decode step. We built an inference engine that lets 99% of prefill compute be reused from reflex to reflex, similar in spirit to 2019-era BERT/HYDRA and older multiple-head techniques. One shared backbone reads the trace once, then many heads classify different signals. The engine reuses the same KV cache and compute across all reflexes, giving us sub-30ms inference with less than 0.1% overhead for each additional reflex.

We took the same high-level idea and did the hard work to make it work with a modern architecture and attention. We can run inference in under 30ms and serve the full request in under 90ms. Whether you run 4 reflexes or 100, the extra overhead is less than 2ms.

Why does optimizing this matter?

If you’re even a medium-sized startup, you’re dealing with tens of thousands of agent runs and millions of turns. If you want to track things like user frustration rates over time, frontier LLM-as-judge does not scale.

I built a similar stack at Tesla. When ML engineers needed to sample data across petabytes for signals like `is_camera_obfuscated=true`, along with 200 other things, they needed to 1) spin them up quickly and 2) run them at scale efficiently.

What it is not: A dashboard. 99% of dashboards go unused. Reflexes is 100% API-first and made for devs who want to use it to trigger their own stuff.

Vibetrain a custom reflex in our dashboard, and/or let it self-improve in production: https://www.morphllm.com/dashboard/reflex

Docs: https://docs.morphllm.com/sdk/components/reflexes/index

I’d love feedback from people running agents in prod: what sorts of things do you wish you could track over time across 100% of turns but can't right now?

TLDR: semantic signals from agent traces, super fast, cheap via API

Show HN: CLI that helps AI agents avoid vulnerable dependencies

deptrust is a CLI that checks package versions for known vulnerabilities across npm, PyPI, crates.io, Go modules, RubyGems, NuGet, Maven, Packagist, pub.dev, CocoaPods, Hex.pm, Hackage, GitHub Actions, and more.

It runs locally as a CLI and as an MCP server. It calls public package registry and OSV APIs directly; there is no hosted deptrust service.

I built this because AI coding agents kept suggesting outdated or vulnerable package versions. I kept having to manually tell tools like Claude and Codex to use newer, safer versions.

deptrust gives the agent a quick way to verify whether a dependency version has known vulnerabilities before it installs or recommends it.

You can install it with:

1. pnpx @clidey/deptrust@latest install
2. brew install clidey/tap/deptrust
3. Or directly with Go: go install github.com/clidey/deptrust/cmd/deptrust@latest
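For readers curious what a version check against the public OSV API looks like, here is a minimal Python sketch. It is not deptrust's implementation (deptrust installs via Go tooling and its internals are not shown in the post); the function names are hypothetical. The endpoint and request shape follow the public OSV `v1/query` API, and pagination via `next_page_token` is ignored for brevity.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Build the JSON body for a single package-version lookup."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def known_vulnerability_ids(name: str, ecosystem: str, version: str,
                            timeout: float = 10.0) -> list:
    """Return the IDs OSV reports for this exact version (network call)."""
    body = json.dumps(build_osv_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OSV returns an empty object when nothing is known for the version.
    return [vuln["id"] for vuln in payload.get("vulns", [])]


if __name__ == "__main__":
    # Example lookup; the result depends on the live OSV database.
    print(known_vulnerability_ids("jinja2", "PyPI", "2.4.1"))
```

An agent-facing tool would typically run a check like this before pinning a version and fall back to a newer release when the list is non-empty.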

Show HN: Classify mechanical faults using Contrastive Language-Audio Pretraining

Show HN: A statically typed, cross-platform, easily bootstrappable build system

Show HN: I measured the half-life of 41,301 Show HN launches. It's 7 hours

I scraped every Show HN from the last 12 months (41,301 posts) plus the full comment tree of every launch with 10+ comments, ~100k comment timestamps, all from the Algolia HN API.

The median launch gets 2 points and 0 comments. For launches that do get traction, half the comments they'll ever get arrive within 7.2 hours and 90% within 26, and the top decile decays on the same clock as everyone else.

Vote timestamps aren't public, so comment timing is the attention proxy; caveats are in the post. Everything reproduces from the repo with one command (https://github.com/jonnonz1/hn-attention-cliff), and every number in the post maps to a named function. Keen to hear where the methodology falls short.
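The "half of all comments arrive within N hours" measurement can be sketched for a single launch as follows. This is an assumption about the general shape of the calculation, not the repo's actual code: the function name is hypothetical, and how the post pools or aggregates across launches is not stated here.

```python
import math


def hours_to_fraction(launch_ts: float, comment_ts: list, fraction: float) -> float:
    """Hours after launch by which `fraction` of the launch's comments had arrived.

    Timestamps are Unix seconds. Hypothetical helper, not taken from the repo.
    """
    if not comment_ts:
        raise ValueError("launch has no comments")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    delays = sorted((t - launch_ts) / 3600 for t in comment_ts)
    needed = math.ceil(fraction * len(delays))  # how many comments must have arrived
    return delays[needed - 1]


# Ten comments arriving 1..9 hours after launch, plus one straggler at 40 hours.
launch = 0
comments = [h * 3600 for h in (1, 2, 3, 4, 5, 6, 7, 8, 9, 40)]
print(hours_to_fraction(launch, comments, 0.5))
print(hours_to_fraction(launch, comments, 0.9))
```

The straggler at 40 hours barely moves either number, which is why a quantile of arrival times is a sturdier attention proxy than the time of the last comment.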

Show HN: Mcpsnoop – Wireshark for MCP (transparent proxy and live TUI)

Show HN: Inkwell – An RSS reader for e-ink devices

Show HN: ctx – Search the coding agent history already on your machine

Coding agents don't have long-term memory.

But you do have months of full-fidelity agent transcripts stored on your machine.

A simple solution that goes a long way: ingest those transcripts and logs into a structured SQLite database, then search them with ranked text match. Everything is fully local and doesn't require anything fancy like a graph database or hosted memory service.

This is the idea behind ctx, a Rust CLI that handles the ingestion and searching.

We give our agents a skill that tells them to reference past sessions before working in an area. Usually we do this through an "Agent History Research Subagent" whose job is just to prepare a short brief covering any relevant history before the task begins.

A real example: sometimes our test suite runs would fail because disk was full on the runner. The correct approach was to run the cleanup runbook, but the root cause of the failure was not clear to the agents, so they would think it was a test regression and go down the wrong rabbit hole debugging. When the agent searched history, it realized this failure had been encountered before and found the right workaround immediately. That got the agent onto the right cleanup path, and later we improved the log output so the same failure would be clearer next time. It's a boring story, but it's real agent productivity.

Another nice use case is quickly generating session transcripts for sharing. You can exclude the noisy intermediate messages, so the transcript shows the important parts of the session more cleanly. Try attaching a session transcript to your next PR so your teammate and their agent can review the provenance and prompting behind the change.

If you're up for an additional challenge, ask your agent to "exhaustively review all agent history in this repo and find where the SDLC is struggling or isn't agent-native". Using past sessions to recursively improve the agentic SDLC is a loop that we're using a lot today.

If you try it out, please let us know what you think!
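The "SQLite plus ranked text match" approach can be sketched with Python's built-in `sqlite3` module and the FTS5 extension. This is only an illustration, not ctx's code (ctx is a Rust CLI): the table name, columns, and helper functions are hypothetical, and it assumes the local SQLite build includes FTS5, which most do.

```python
import sqlite3


def build_index(transcripts):
    """Load (session_id, text) pairs into an in-memory FTS5 table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, body)")
    conn.executemany(
        "INSERT INTO sessions (session_id, body) VALUES (?, ?)", transcripts
    )
    return conn


def search(conn, query, limit=5):
    """Return session ids matching the query, best BM25 match first."""
    rows = conn.execute(
        "SELECT session_id FROM sessions "
        "WHERE sessions MATCH ? ORDER BY bm25(sessions) LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


conn = build_index([
    ("s1", "test suite failed: no space left on device, disk full on runner; ran cleanup runbook"),
    ("s2", "refactored auth module and updated docs"),
    ("s3", "added a disk usage dashboard"),
])
print(search(conn, "disk full"))
```

FTS5 treats space-separated terms as an implicit AND, so the query above matches only the session that mentions both words; `bm25()` returns lower values for better matches, which is why plain ascending order ranks the best hit first.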

Show HN: Pieces – Social network for people

Hey HN, long-time lurker, first-time poster. I built a social network called PIECES. Last year, after I had to get off IG and Substack, I built a private blog, and I have now decided to productize it. It's here now. It has a dedicated web experience + iOS/Android.

Would love it if you tried it out!

Show HN: Bramble – Local-first password manager

I'm currently working on Bramble, an open source password manager with P2P cross-device sync. Initially I released the Chrome extension, but recently I also published the Android app, and iOS is pending Apple's approval. Besides that, the latest version also includes passkey storage for all platforms!

About Bramble:

It aims to be as feature-rich as all the popular password managers and a replacement for cloud-based providers. I don't think we need to store our data in the cloud and be at the whims of companies raising their prices every year. There's always a breach, and then we find out that some fields aren't encrypted, metadata is visible, and so on. I'm frustrated with this and the increasing lack of transparency during these breaches.

The P2P sync in Bramble uses a Nostr relay (which can be self-hosted) to keep your devices in sync. The relay just introduces the devices to each other; the data then flows directly over WebRTC, so there's no vault server and no cloud copy of your passwords anywhere. What leaves your device is end-to-end encrypted and your devices authenticate each other directly, so a snooping or MITM relay gets practically nothing.

Crypto is all done in Rust so I can control exactly how key material lives and dies in memory (secrets get zeroed out, no GC leaving copies lying around). In Chromium it's a wasm module; on mobile it's native builds bridged over via uniffi.

Android app:

I'm still deciding whether to publish the app on the Play Store or simply provide the signed APK, which users can sideload. The reason is Google's plan to lock down Android and take away ownership from its users. Read more about it here: https://keepandroidopen.com/

The app uses no Play APIs whatsoever and runs perfectly on GrapheneOS, where I actually did all my testing.

Questions, feedback, feature requests - all welcome!

TL;DR: I dislike private-equity and venture-funded companies messing with our security, so I created my own password manager, which is local-first, free, open source and as transparent as it gets.

Show HN: zkGolf – Competitive optimization of formally verified circuits

Zero-Knowledge Proofs (ZKPs) let an untrusted prover show that a computation was executed correctly without revealing the inputs to the verifier. However, to prove anything, the computation first has to be expressed as a circuit: a system of polynomial equations (constraints) over a finite field. Circuits are the assembly language of zk, and every constraint costs prover (and sometimes verifier) time, so production circuits are aggressively hand-optimized.

Over the last few months, we have been experimenting with writing formal specifications instead and letting LLMs produce the circuits, as long as they could prove that their implementation was correct. It started with SHA-256: we hand-wrote a specification in Lean for SHA-256 compression, and then we asked LLMs to write the circuit, targeting R1CS arithmetization and large fields.

It took a few hours of work for Opus 4.7, and some light steering in the right direction, but in the end the model came up with a reasonable implementation. We then asked the LLM to aggressively optimize the circuits by driving down a cost metric of the circuit (number of constraints). We immediately got very promising results, just by asking it to come up with optimization ideas, implement them and prove that the new circuit still satisfies soundness and completeness. Sometimes it came up with unsound optimizations. However, since it could not prove them, it backtracked and got itself back onto the right approach.

The result was a (non-deterministic) circuit beating the current, human-optimized state of the art for SHA-256 compression. This experience led us to create "zk.golf", an open competition to produce optimized, formally verified circuits, to lower the bar for the use of ZKPs and make their application more efficient.

Come play (https://zk.golf/llms.txt) and learn about formal verification.
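To make "a circuit is a system of constraints over a finite field" concrete, here is a small Python sketch of an R1CS check. It is a textbook-style toy, not zk.golf's format or the SHA-256 circuit: the prime `P = 97` is deliberately tiny (the post targets large fields), and the example statement, "I know x with x^3 + x + 5 = 35", along with all names, is chosen only for illustration. Each constraint is a triple of rows (a, b, c) that must satisfy (a·w) * (b·w) = (c·w) mod P for the witness vector w.

```python
P = 97  # toy prime field; real systems use primes of roughly 255 bits


def dot(row, witness):
    """Inner product of a constraint row with the witness, reduced mod P."""
    return sum(a * b for a, b in zip(row, witness)) % P


def satisfies(constraints, witness):
    """Check (a.w) * (b.w) == (c.w) mod P for every constraint."""
    return all(
        dot(a, witness) * dot(b, witness) % P == dot(c, witness)
        for a, b, c in constraints
    )


# Witness layout: [1, x, out, v1, v2]
CUBIC = [
    ([0, 1, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 0, 1, 0]),  # x * x = v1
    ([0, 0, 0, 1, 0], [0, 1, 0, 0, 0], [0, 0, 0, 0, 1]),  # v1 * x = v2
    ([5, 1, 0, 0, 1], [1, 0, 0, 0, 0], [0, 0, 1, 0, 0]),  # (5 + x + v2) * 1 = out
]


def make_witness(x):
    """Fill in the intermediate values an honest prover would compute."""
    v1 = x * x % P
    v2 = v1 * x % P
    out = (v2 + x + 5) % P
    return [1, x, out, v1, v2]


w = make_witness(3)
print(satisfies(CUBIC, w))  # honest witness
w[2] = 36                   # claim a wrong output
print(satisfies(CUBIC, w))  # the third constraint now fails
print(len(CUBIC))           # the cost metric: number of constraints
```

In this framing, "golfing" a circuit means finding a constraint system with fewer rows that still accepts exactly the same valid statements, which is why each optimization needs a soundness and completeness proof.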

Show HN: QUALITY.md – open format/specification, agent skill, and CLI

Hello all, I created QUALITY.md to help build a holistic quality evaluation process for my projects. Turns out it's also ideal for loop engineering. I'm hoping this provides a valuable contribution to the conversation around quality and craft and having AI help us in the effort. I hope to shift the mindset from reactive review and repair to proactive care.

Give it a go. I look forward to your thoughts/comments/feedback!

Website: https://getquality.md
GitHub: https://github.com/qualitymd/quality.md

Show HN: Claudoro, Pomodoro timer embedded in the Claude Code statusline

3 weeks ago I had a nasty accident and fractured my vertebrae. As I lay in bed I needed something to take my mind off it all, so I built "Claudoro".

Claudoro is a pomodoro timer built right into the Claude Code status line, and it can also be controlled directly from Claude Code and the CLI. A few years ago I built "pymodoro", which was great, but recently I felt I needed something embedded in the tools I actually use, and I also wanted something flexible that I could tweak and nudge.

Anyway, I hope it is useful to you, and I'd love some feedback on how to improve it.

Thank you...!

PS: this is a write-up all about how it works etc.: https://benemson.com/blog/agents/claudoro-pomodoro-timer-claude-code
