The best Hacker News stories from Show from the past day
Latest posts:
Show HN: Hormuz Havoc, a satirical game that got overrun by AI bots in 24 hours
I built a satirical browser game to share with friends (Hormuz Havoc: you play an American president managing a crisis in the Middle East, only "loosely" inspired by current events). I had good fun making this, but that's not necessarily the interesting part.<p>The interesting part was that within a few hours of sharing it with my friends, some of them set about trying to overrun the leaderboard by launching a swarm of AI bots to learn the game and figure out how to get the highest score. This set off a game of cat-and-mouse as they found vulnerabilities and I tried patching them.<p>Within hours of sharing, someone used the Claude browser extension to read game.js directly. Large parts of the scoring formula, action effect values, and bonus thresholds were sitting in client-side JavaScript. This was trivial enough that even a human could've found it, but a human would've still had to play the game, whereas the AI bot optimised directly against the scoring formula. As a result, the first AI scored 2.5x higher than the best human player without really playing the game.<p>Straightforward fix: moved the entire game engine server-side. The client is now a dumb terminal: it sends an action ID and receives a rendered state. No scoring logic, bonus thresholds, or action effects exist in the browser. The live score display uses a deliberately different formula as misdirection.<p>This made bot-enabled hacks harder to find, so the subsequent bots tried brute-forcing the game, gaming the RNG functions, and other methods.<p>But the next winning bot found a vulnerability where the same signed session token could be replayed. It would play turn N, observe a bad random event, replay the same token for turn N, get a different RNG outcome, and keep the best one, effectively branching from a single game state to cherry-pick lucky outcomes across 30 turns. 
It managed to 1.5x the previous bot's high score.<p>The bot's own description: "The key optimisation was token replay. Because the backend let the same signed state be replayed, I could branch from one exact game state repeatedly and continue from the luckiest high-value outcome each turn."<p>Fix here: consume a turn nonce atomically before any randomness is generated.<p>The current state is that the leaderboard is now split into human and AI-assisted. I think the capability of AI bots has flatlined a bit now. Perhaps Claude Mythos might be able to discover the next hackable exploit ¯\_(ツ)_/¯<p>Happy to go deeper on any of the above - or just enjoy the game! Feel free to try your own AI-powered leaderboard attempt too!
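A minimal sketch of the replay fix described above ("consume a turn nonce atomically before any randomness is generated"). This is not the game's actual backend: the names `TurnNonceStore`, `sign`, and `play_turn`, the in-memory set, and the HMAC token layout are all assumptions made for illustration. A real server would more likely keep nonces in a database and rely on an atomic delete or a unique constraint there.

```python
import hashlib
import hmac
import random
import secrets
import threading

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side signing key


class TurnNonceStore:
    """In-memory stand-in for wherever unspent turn nonces are stored."""

    def __init__(self):
        self._lock = threading.Lock()
        self._unspent = set()

    def issue(self) -> str:
        """Create a fresh single-use nonce for the next turn."""
        nonce = secrets.token_hex(16)
        with self._lock:
            self._unspent.add(nonce)
        return nonce

    def consume(self, nonce: str) -> bool:
        """Atomically spend a nonce; returns False if it was already spent."""
        with self._lock:
            if nonce in self._unspent:
                self._unspent.remove(nonce)
                return True
            return False


def sign(session_id: str, turn: int, nonce: str) -> str:
    """Sign the (session, turn, nonce) triple so the client cannot forge it."""
    msg = f"{session_id}:{turn}:{nonce}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def play_turn(store, session_id, turn, nonce, signature, action_id):
    # 1. Reject tokens the server did not issue.
    if not hmac.compare_digest(sign(session_id, turn, nonce), signature):
        raise PermissionError("bad signature")
    # 2. Spend the nonce BEFORE any randomness, so a replayed token
    #    can never observe a second RNG outcome for the same turn.
    if not store.consume(nonce):
        raise PermissionError("token already used")
    # 3. Only now roll the random event and hand out the next token.
    roll = random.random()
    next_nonce = store.issue()
    return {
        "turn": turn + 1,
        "action": action_id,
        "roll": roll,
        "nonce": next_nonce,
        "signature": sign(session_id, turn + 1, next_nonce),
    }
```

The ordering is the point: a valid signature alone proves only that the server issued the token, not that it is unused, so the check-and-remove has to happen in one atomic step ahead of the RNG call.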
Show HN: Eve – Managed OpenClaw for work
Eve is an AI agent harness that runs in an isolated Linux sandbox (2 vCPUs, 4GB RAM, 10GB disk) with a real filesystem, headless Chromium, code execution, and connectors to 1000+ services.<p>You give it a task and it works in the background until it's done.<p>I built this because I wanted OpenClaw without the self-hosting, pointed at actual day-to-day work. I’m thinking less personal assistant and more helpful colleague.<p>Here’s a short demo video: <a href="https://www.loom.com/share/00d11bdbe804478e8817710f5f53ac61" rel="nofollow">https://www.loom.com/share/00d11bdbe804478e8817710f5f53ac61</a><p>The main interface is a web app where you can watch work happen in real time (agents spawning, files being written, use of the CLI). There's also an iMessage integration so you can fire a task asynchronously, put your phone down, and get a reply when it's finished.<p>Under the hood, there's an orchestrator (Claude Opus 4.6) that routes to the right domain-specific model for each subtask: browsing, coding, research, and media generation.<p>For complex tasks it spins up parallel sub-agents that coordinate through the shared filesystem. They have persistent memory across sessions so context compounds over time.<p>I’ve packaged it with a bunch of pre-installed skills so it can execute in a variety of job roles (sales, marketing, finance) at runtime.<p>Here are a few things Eve has helped me with in the last couple of days:<p>- Edit this demo video with a voiceover of Garry: <a href="https://www.youtube.com/watch?v=S4oD7H3cAQ0" rel="nofollow">https://www.youtube.com/watch?v=S4oD7H3cAQ0</a><p>- Do my tax returns<p>- Build HN as if it were the year 2030: <a href="https://api.eve.new/api/sites/hackernews-2030/#/" rel="nofollow">https://api.eve.new/api/sites/hackernews-2030/#/</a><p>AMA on the architecture and lmk your thoughts :)<p>P.S. I've given every new user $100 worth of credits to try it.
Show HN: Pardonned.com – A searchable database of US Pardons
<a href="https://pardonned.com" rel="nofollow">https://pardonned.com</a><p>Inspired by the videos of Liz Oyer, I wanted to be able to verify her claims and just look up all the pardons more easily.<p>Tech Stack:
Playwright - to scrape the DOJ website
SQLite - local database
Astro 6 - build a static website from the SQLite db<p>All code is open source and available on GitHub.
Show HN: Druids – Build your own software factory
Hi HN!<p>Druids (<a href="https://github.com/fulcrumresearch/druids" rel="nofollow">https://github.com/fulcrumresearch/druids</a>) is an open-source library for structuring and running multi-agent coding workflows. Druids makes it easy to do this by abstracting away all the VM infrastructure, agent provisioning, and communication. You can watch our demo video here (<a href="https://www.youtube.com/watch?v=EVJqW-tvSy4" rel="nofollow">https://www.youtube.com/watch?v=EVJqW-tvSy4</a>) to see what it looks like.<p>At a high level:<p>- Users can write Python programs that define what roles the agents take on and how they interact with each other.<p>- A program is made of events - clear state transitions that the agents or clients can call to modify state. Each event gets exposed as an agent tool.<p>- Druids provisions full VMs so that the agents can run continuously and communicate effectively.<p>We made Druids because we were making lots of internal coding tools using agents and found it annoying to have to rearrange the wiring every time.<p>As we were building Druids, we realized a lot of our internal tools were easier to express as an event-driven architecture – separating deterministic control flow from agent behavior – and this design also made it possible to have many agents work reliably.<p>We had issues with scaling the number of concurrent agents within a run, so we decided to have each program run in an isolated sandbox program runtime, kind of the same way you run a Modal function. 
Each agent then calls the runtime with an agent token; the runtime checks who can talk to whom or send files across VMs, and then applies the tool call.<p>Our early users have found the library useful for:<p>- running many agents to do performance optimization<p>- building custom automated software pipelines for e.g. code review, pentesting, and large-scale migrations<p>We've heard that the frontier labs have the infrastructure to quickly spin up 100 agents and have them coordinate with each other smoothly in various ways. We're hoping that Druids can be a starting point to make that infrastructure more accessible.
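To make the event model above more concrete, here is a small self-contained sketch of "events as state transitions exposed as tools, gated by an agent token". It deliberately does not use the Druids API: `Program`, `event`, `call`, the token strings, and the two example events are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Program:
    """Hypothetical program: shared state plus named events that modify it."""
    state: dict = field(default_factory=dict)
    events: Dict[str, Callable] = field(default_factory=dict)
    # Maps an agent token to the set of event names that agent may call.
    permissions: Dict[str, Set[str]] = field(default_factory=dict)

    def event(self, func: Callable) -> Callable:
        """Decorator: register a function as an event (and thus a tool)."""
        self.events[func.__name__] = func
        return func

    def call(self, agent_token: str, name: str, **kwargs):
        """Runtime entry point: check the token, then apply the tool call."""
        if name not in self.permissions.get(agent_token, set()):
            raise PermissionError(f"token may not call {name!r}")
        return self.events[name](self.state, **kwargs)


program = Program()


@program.event
def submit_patch(state, diff: str):
    # Deterministic transition: any submitted patch must be reviewed.
    state["patch"] = diff
    state["status"] = "needs_review"
    return state["status"]


@program.event
def approve(state):
    # Control flow lives here, not in the agent's prompt.
    if state.get("status") != "needs_review":
        raise ValueError("nothing to review")
    state["status"] = "approved"
    return state["status"]


program.permissions = {
    "coder-token": {"submit_patch"},
    "reviewer-token": {"approve"},
}
```

In this sketch the agents decide when to call a tool, while the allowed transitions and permissions stay in ordinary deterministic code, which is the separation the post describes.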
Show HN: A WYSIWYG word processor in Python
Hi all,<p>Finding a good data structure for a word processor is a difficult problem. My notebook diaries on the problem go back 25 years, to when I was frustrated with using Word for my diploma thesis - it was slow and unstable at that time. I ended up getting pretty hooked on the problem.<p>Right now I’m taking a professional break and decided to finally use the time to push these ideas further and build MiniWord, a WYSIWYG word processor in Python.<p>My goal is to have a native, non-HTML-based editor that stays simple, fast, and hackable. So far I am focusing on getting the fundamentals right. What is working so far:<p>- Real WYSIWYG editing (no HTML layer, no embedded browser) with styles, images and tables.<p>- Clean, simple file format (human-readable, diff-friendly, git-friendly, AI-friendly)<p>- Markdown support<p>- Support for Python plugins<p><i>Things that I found:</i><p>- B-tree structures are perfect for holding rich text data<p>- A simple text-based file format is incredibly useful: you can diff documents, version them, and even process them with AI tools quite naturally<p><i>What I’d love feedback on:</i><p>- Where do you see real use cases for something like this?<p>- What would be missing for you to take it seriously as a tool or platform?<p>- What kinds of plugins or extensions would actually be worth building?<p>Happy to hear any thoughts, positive or critical.
Greetings
Show HN: Keeper – embedded secret store for Go (help me break it)
Keeper is an embeddable secret store (Argon2id, XChaCha20-Poly1305 by default). Four security levels, audit chains, crash-safe rotation. Vault is overkill for most use cases. This is for when you get paranoid about env and need encrypted local storage that doesn't suck. No security through obscurity, hence this request. It's still early, so now's the best time to find weird edge cases, race conditions, memory leaks, crypto misuse, anything that breaks. The README has a full security model breakdown if you want to get adversarial.
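The post does not say how Keeper's audit chains work, so the following is only a generic illustration of one common reading of the term: a hash-chained, append-only log in which each entry commits to the previous one. It is written in Python rather than Go, uses only the standard library, and the function names and entry fields are assumptions. An unkeyed chain like this detects in-place edits, but not a full rewrite by someone who can recompute every hash. Guarding against that would need a keyed MAC or an externally anchored head hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(chain: list, action: str, key_name: str) -> None:
    """Append an audit entry that commits to the hash of the previous one."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"action": action, "key": key_name, "prev": prev_hash}
    chain.append({**body, "hash": _digest(body)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        body = {"action": entry["action"], "key": entry["key"], "prev": entry["prev"]}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```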
Show HN: FluidCAD – Parametric CAD with JavaScript
Hello HN users,<p>This is a CAD-by-code project I have been working on in my free time for more than a year now.<p>I built it with 3 goals in mind:<p>- It should be familiar to CAD designers who have used other programs. Same workflow, same terminology.<p>- Reduce the mental effort required to create models as much as possible. This is achieved by:<p><pre><code> - Providing live rendering and visual guidance as you type.
- Allowing the user to reference existing edges/faces in the scene instead of having to calculate everything.
- Providing interactive mouse helpers for features that are hard to write in code. Only 3 interactive modes for now: edge trimming, sketch region extrude, Bezier curve drawing.
- Implicit coding whenever possible, e.g. there are sensible defaults for most parameters. The program will automatically fuse intersecting objects together so you do not have to worry about what object needs to be fused with what.</code></pre>
- It should be reasonably fast: The scene objects are cached and only the updated objects are re-computed.<p>I think I have achieved these goals to a good extent. The program is still in its early stages and there are many features I want to add or rewrite, but I think it is already usable for simple models.<p>Update to add more details:
This is based on the Opencascade.js WASM binding, so you get all the good things that come with any B-rep kernel: fillets, chamfers, STEP import and export...<p>The scene is a webview, but the editing is in your local file. You use your own editor and the environment you are familiar with.<p>One important feature that I think makes this stand out among other code-based CAD software is the ability to transform features, not just shapes. More here:
<a href="https://fluidcad.io/docs/guides/patterns" rel="nofollow">https://fluidcad.io/docs/guides/patterns</a>
You can see it in action in the lantern example:
<a href="https://fluidcad.io/docs/tutorials/lantern" rel="nofollow">https://fluidcad.io/docs/tutorials/lantern</a>
Show HN: Marimo pair – Reactive Python notebooks as environments for agents
Hi HN! We're excited to share marimo pair [1] [2], a toolkit that drops AI agents into a running marimo notebook [3] session. This lets agents use marimo as working memory and a reactive Python runtime, while also making it easy for humans and agents to collaborate on computational research and data work.<p>GitHub repo: <a href="https://github.com/marimo-team/marimo-pair" rel="nofollow">https://github.com/marimo-team/marimo-pair</a><p>Demo: <a href="https://www.youtube.com/watch?v=6uaqtchDnoc" rel="nofollow">https://www.youtube.com/watch?v=6uaqtchDnoc</a><p>marimo pair is implemented as an agent skill. Connect your agent of choice to a running notebook with:<p>/marimo-pair pair with me on my_notebook.py<p>The agent can do anything a human can do with marimo and more. For example, it can obtain feedback by running code in an ephemeral scratchpad (inspect variables, run code against the program state, read outputs). If it wants to persist state, the agent can add cells, delete them, and install packages (marimo records these actions in the associated notebook, which is just a Python file). The agent can even manipulate marimo's user interface — for fun, try asking your agent to greet you from within a pair session.<p>The agent effects all actions by running Python code in the marimo kernel. Under the hood, the marimo pair skill explains how to discover and create marimo sessions, and how to control them using a semi-private interface we call code mode.<p>Code mode lets models treat marimo as a REPL that extends their context windows, similar to recursive language models (RLMs). But unlike traditional REPLs, the marimo "REPL" incrementally builds a reproducible Python program, because marimo notebooks are dataflow graphs with well-defined execution semantics. 
As it uses code mode, the agent is kept on track by marimo's guardrails, which include the elimination of hidden state: run a cell and dependent cells are run automatically, delete a cell and its variables are scrubbed from memory.<p>By giving models full control over a stateful reactive programming environment, rather than a collection of ephemeral scripts, marimo pair makes agents active participants in research and data work. In our early experimentation [4], we've found that marimo pair accelerates data exploration, makes it easy to steer agents while testing research hypotheses, and can serve as a backend for RLMs, yielding a notebook as an executable trace of how the model answered a query. We even use marimo pair to find and fix bugs in itself and marimo [5]. In these examples the notebook is not only a computational substrate but also a canvas for collaboration between humans and agents, and an executable, literate artifact comprised of prose, code, and visuals.<p>marimo pair is early and experimental. We would love your thoughts.<p>[1] <a href="https://github.com/marimo-team/marimo-pair" rel="nofollow">https://github.com/marimo-team/marimo-pair</a><p>[2] <a href="https://marimo.io/blog/marimo-pair" rel="nofollow">https://marimo.io/blog/marimo-pair</a><p>[3] <a href="https://github.com/marimo-team/marimo" rel="nofollow">https://github.com/marimo-team/marimo</a><p>[4] <a href="https://www.youtube.com/watch?v=VKvjPJeNRPk" rel="nofollow">https://www.youtube.com/watch?v=VKvjPJeNRPk</a><p>[5] <a href="https://github.com/manzt/dotfiles/blob/main/.claude/skills/marimo-dev/SKILL.md" rel="nofollow">https://github.com/manzt/dotfiles/blob/main/.claude/skills/m...</a>
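The guardrail described above (run a cell and its dependents re-run, delete a cell and its variables are scrubbed) can be illustrated with a toy dataflow graph. This is not marimo's implementation or API: the `Notebook` class, the explicit `reads`/`write` declarations, and the one-variable-per-cell restriction are simplifying assumptions, and the sketch assumes the dependency graph is acyclic.

```python
class Notebook:
    """Toy reactive notebook; each cell reads some variables and writes one."""

    def __init__(self):
        self.cells = {}    # cell name -> (reads, write, func)
        self.globals = {}  # shared variable namespace

    def add_cell(self, name, reads, write, func):
        """Add or replace a cell, then run it and everything downstream."""
        self.cells[name] = (tuple(reads), write, func)
        self.run(name)

    def _downstream(self, name):
        """`name` plus all transitive dependents, in a valid run order."""
        order, seen = [], set()

        def visit(cell):
            if cell in seen:
                return
            seen.add(cell)
            written = self.cells[cell][1]
            for other, (reads, _, _) in self.cells.items():
                if other != cell and written in reads:
                    visit(other)
            order.append(cell)  # post-order

        visit(name)
        return order[::-1]  # reverse post-order is a topological order

    def run(self, name):
        for cell in self._downstream(name):
            reads, write, func = self.cells[cell]
            if any(r not in self.globals for r in reads):
                continue  # an input is undefined, so skip this cell
            self.globals[write] = func(*(self.globals[r] for r in reads))

    def delete_cell(self, name):
        """Remove a cell and scrub the variable it defined."""
        _, write, _ = self.cells.pop(name)
        self.globals.pop(write, None)


nb = Notebook()
nb.add_cell("a", reads=[], write="x", func=lambda: 2)
nb.add_cell("b", reads=["x"], write="y", func=lambda x: x * 10)
nb.add_cell("c", reads=["y"], write="z", func=lambda y: y + 1)

# Editing cell "a" re-runs "b" and "c" automatically.
nb.add_cell("a", reads=[], write="x", func=lambda: 5)
# nb.globals is now {"x": 5, "y": 50, "z": 51}
```

Unlike this sketch, a real reactive notebook also has to decide what happens to cells that depended on a deleted variable. Here they simply keep their last computed values.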
Show HN: A (marginally) useful x86-64 ELF executable in 301 bytes
Show HN: TUI-use: Let AI agents control interactive terminal programs
Show HN: Moon simulator game, ray-casting
Did this a few years ago. Seems apropos. Sources and more here: <a href="https://github.com/EngineersNeedArt/Mooncraft2000" rel="nofollow">https://github.com/EngineersNeedArt/Mooncraft2000</a>
Show HN: I pipe free sports streams into Jellyfin – no ads, just HLS
Show HN: 41 years sea surface temperature anomalies