The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: Run GUIs as Scripts
Hi there, Zero Stars here.

I recently published some new work to Hokusai Pocket, a cross-platform binary built on top of raylib and MRuby that runs GUIs from Ruby scripts.

License? MIT!

How does it work? The binary is available on the GitHub releases page: https://github.com/skinnyjames/hokusai-pocket/releases/tag/0.6.1

You can download the binary for x86 Windows, macOS, or Linux, and run your GUI application with:

`hokusai-pocket run:target="<your_hokusai_app.rb>"`

As a hello world of sorts, I started a Photoshop clone: https://github.com/skinnyjames/hokusai_demo_paint

And a little game: https://github.com/skinnyjames/pocket-squares

Docs / help? The docs are in progress, but the old docs for the CRuby version cover the basic ideas behind the project: https://hokusai.skinnyjames.net/docs/intro (I'm also available to answer questions in between slinging pizza.)

Deps? Hokusai Pocket currently uses:

* libuv, for offloading CPU-intensive tasks to a worker pool so they don't block the UI thread; I plan to integrate some libuv networking as well.
* raylib, for backend graphics. I've also built with SDL on arm64 to run applications on my PinePhone.
* NativeFileDialog, for filesystem integration.
* MRuby, for running or embedding the scripts.
* tree-sitter, for the custom template grammar (although templates can also be built in Ruby).

Anyway, I hope you get a chance to try it.

If you make something cool AND have Docker installed, you can also publish your work as a single binary:

`hokusai-pocket publish:target=<your cool program.rb>`

Would love feedback, apps, and help with documentation and more build targets.

Yours truly,

@ ᴗ @
Show HN: Kelet – Root Cause Analysis agent for your LLM apps
I've spent the past few years building 50+ AI agents in prod (some reached 1M+ sessions/day), and the hardest part was never building them; it was figuring out why they fail.

AI agents don't crash. They just quietly give wrong answers. You end up scrolling through traces one by one, trying to find a pattern across hundreds of sessions.

Kelet automates that investigation. Here's how it works:

1. You connect your traces and signals (user feedback, edits, clicks, sentiment, LLM-as-a-judge, etc.)
2. Kelet processes those signals and extracts facts about each session
3. It forms hypotheses about what went wrong in each case
4. It clusters similar hypotheses across sessions and investigates them together
5. It surfaces a root cause with a suggested fix you can review and apply

The key insight: individual session failures look random, but when you cluster the hypotheses, failure patterns emerge.

The fastest way to integrate is through the Kelet Skill for coding agents; it scans your codebase, discovers where signals should be collected, and sets everything up for you. There are also Python and TypeScript SDKs if you prefer manual setup.

It's currently free during beta. No credit card required.
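The clustering in step 4 is the load-bearing part of the pipeline above. Here is a minimal sketch of the idea using simple token overlap as the similarity measure; every name here is illustrative, and the real Kelet presumably uses something far richer (embeddings, LLM judgment) than Jaccard similarity:

```python
def jaccard(a: set, b: set) -> float:
    """Token-set overlap between two hypothesis strings."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_hypotheses(hypotheses: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy single-pass clustering: each hypothesis joins the first
    cluster whose representative token set is similar enough,
    otherwise it starts a new cluster."""
    clusters: list[list[str]] = []
    reps: list[set] = []
    for h in hypotheses:
        tokens = set(h.lower().split())
        for i, rep in enumerate(reps):
            if jaccard(tokens, rep) >= threshold:
                clusters[i].append(h)
                rep |= tokens  # widen the representative token set
                break
        else:
            clusters.append([h])
            reps.append(tokens)
    # the largest cluster is the leading root-cause candidate
    return sorted(clusters, key=len, reverse=True)

sessions = [
    "retriever returned stale documents",
    "retriever returned stale documents for pricing query",
    "model ignored the system prompt",
]
print(cluster_hypotheses(sessions)[0])
```

The point of the sketch is the shape of the computation: failures that look unrelated session-by-session collapse into a small number of clusters once you compare hypotheses rather than raw traces.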
Docs: https://kelet.ai/docs/

I'd love feedback on the approach, especially from anyone running agents in prod. Does automating the manual error analysis sound right?
Show HN: A memory database that forgets, consolidates, and detects contradiction
Vector databases store memories; they don't manage them. After 10k memories, recall quality degrades because there's no consolidation, no forgetting, no conflict resolution. Your AI agent just gets noisier.

YantrikDB is a cognitive memory engine: embed it, run it as a server, or connect via MCP. It thinks about what it stores: consolidation collapses duplicate memories, contradiction detection flags incompatible facts, and temporal decay with a configurable half-life lets unimportant memories fade the way human memory does.

Single Rust binary. HTTP + binary wire protocol. 2-voter + 1-witness HA cluster via Docker Compose or Kubernetes. Chaos-tested failover, runtime deadlock detection (parking_lot), per-tenant quotas, Prometheus metrics. Ran a 42-task hardening sprint last week: 1178 core tests, cargo-fuzz targets, CRDT property tests, 5 ops runbooks.

Live on a 3-node Proxmox homelab cluster with multiple tenants. Alpha; the primary user is me, and I'm looking for the second one.
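The half-life decay mechanic is easy to sketch. This is an illustration of the concept, not YantrikDB's actual scoring code (the function names and the way recency is blended with similarity are my assumptions):

```python
def decay_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential forgetting: a memory's retrieval weight halves
    every half_life_days, so stale facts drift out of top-k results."""
    return 0.5 ** (age_days / half_life_days)

def score(similarity: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Blend vector similarity with recency, so an old near-duplicate
    loses to a fresh memory of the same fact."""
    return similarity * decay_weight(age_days, half_life_days)

print(decay_weight(0))    # 1.0
print(decay_weight(30))   # 0.5
print(decay_weight(60))   # 0.25
```

With a configurable half-life, "fading like human memory" reduces to one tunable parameter per tenant or per memory class.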
Show HN: Plain – The full-stack Python framework designed for humans and agents
Show HN: Kontext CLI – Credential broker for AI coding agents in Go
We built the Kontext CLI because AI coding agents need access to GitHub, Stripe, databases, and dozens of other services, and right now most teams handle this by copy-pasting long-lived API keys into .env files, or into the actual chat interface, whilst hoping for the best.

The problem isn't just secret sprawl. It's that there's no lineage of access. You don't know which developer launched which agent, what it accessed, or whether it should have been allowed to. The moment you hand raw credentials to a process, you've lost the ability to enforce policy, audit access, or rotate without pain. The credential is the authorization, and that's fundamentally broken when autonomous agents are making hundreds of API calls per session.

Kontext takes a different approach. You declare what credentials a project needs in a .env.kontext file:

```
GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
LINEAR_TOKEN={{kontext:linear}}
```

Then run `kontext start --agent claude`. The CLI authenticates you via OIDC, and for each placeholder: if the service supports OAuth, it exchanges the placeholder for a short-lived access token via RFC 8693 token exchange; for static API keys, the backend injects the credential directly into the agent's runtime environment. Either way, secrets exist only in memory during the session and are never written to disk on your machine. Every tool call is streamed for audit as the agent runs.

The closest analogy is a Security Token Service (STS): you authenticate once, and the backend mints short-lived, scoped credentials on the fly. Unlike a classical STS, though, we hold the upstream secrets, so nothing long-lived ever reaches the agent. The backend holds your OAuth refresh tokens and API keys; the CLI never sees them. It gets back short-lived access tokens scoped to the session.

What the CLI captures for every tool call: what the agent tried to do, what happened, whether it was allowed, and who did it, attributed to a user, session, and org.

Install with one command: `brew install kontext-dev/tap/kontext`

The CLI is written in Go (~5ms hook overhead per tool call), uses ConnectRPC for backend communication, and stores auth in the system keyring. It works with Claude Code today, with Codex support coming soon.

We're working on server-side policy enforcement next. The infrastructure for allow/deny decisions on every tool call is already wired; we just need to close the loop so tool calls can also be rejected.

We'd love feedback on the approach. Especially curious: how are teams handling credential management for AI agents today? Are you just pasting env vars into the agent chat, or have you found something better?

GitHub: https://github.com/kontext-dev/kontext-cli
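Conceptually, the placeholder resolution step works like this sketch. Everything here is illustrative: `resolve_env` and `fake_broker` are made up for the example, and the real CLI performs an OIDC-authenticated token exchange rather than a dictionary lookup:

```python
import re

# matches values of the form {{kontext:<service>}}
PLACEHOLDER = re.compile(r"\{\{kontext:([a-z0-9_-]+)\}\}")

def resolve_env(template: str, broker: dict[str, str]) -> dict[str, str]:
    """Replace each {{kontext:<service>}} placeholder with a short-lived
    credential obtained from the broker; the result lives only in memory
    and is handed to the agent's process environment."""
    env = {}
    for line in template.strip().splitlines():
        key, _, value = line.partition("=")
        m = PLACEHOLDER.fullmatch(value.strip())
        env[key.strip()] = broker[m.group(1)] if m else value.strip()
    return env

template = """
GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
"""
# stand-in for the token-exchange call the real CLI would make
fake_broker = {"github": "ghs_shortlived", "stripe": "rk_shortlived"}
print(resolve_env(template, fake_broker))
```

The design consequence is that the .env.kontext file is safe to commit: it names which credentials a project needs without containing any of them.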
Site: https://kontext.security
Show HN: LangAlpha – what if Claude Code was built for Wall Street?
Some technical context on what we ran into building this.

MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window. And data vendors pack dozens of tools into a single MCP server; the schemas alone can eat 50k+ tokens before the agent does anything useful. So we auto-generate typed Python modules from the MCP schemas at workspace init and upload them into the sandbox. The agent just imports them like a normal library. Only a one-line summary per server stays in the prompt. We have around 80 tools across our servers, and the prompt cost is the same whether a server has 3 tools or 30. This part isn't finance-specific; it works with any MCP server.

The other big thing was making research actually persist across sessions. Most agents treat a single deliverable (a PDF, a spreadsheet) as the end goal. In investing, that's day one. You update the model when earnings drop, re-run comps when a competitor reports, and keep layering new analysis on old. Try doing that across agent sessions, though: files don't carry over, and you re-paste context every time. So we built everything around workspaces. Each one maps to a persistent sandbox, one per research goal. The agent maintains its own memory file with findings and a file index that gets re-read before every LLM call. Come back a week later, start a new thread, and it picks up where it left off.

We also wanted the agent to have real domain context the way Claude Code has codebase context. Portfolio, watchlist, risk tolerance, financial data sources: all injected into every call. Existing AI investing platforms have some of that, but nothing close to what a proper agent harness can do. We wanted both and couldn't find it, so we built it and open-sourced the whole thing.
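The schema-to-module trick described above can be sketched roughly as follows. The schema shape, `_call_mcp`, and all the names here are hypothetical stand-ins, not LangAlpha's actual code; the point is just how generated source replaces per-call schema tokens:

```python
def module_from_schema(server: str, tools: list[dict]) -> str:
    """Emit Python source for one MCP server: one typed function per tool,
    so the agent imports data access like a library instead of carrying
    the full tool schemas in its prompt."""
    lines = [f'"""Auto-generated client for the {server} MCP server."""']
    for tool in tools:
        params = ", ".join(f"{p}: {t}" for p, t in tool["params"].items())
        lines.append(f'def {tool["name"]}({params}) -> dict:')
        lines.append(f'    """{tool["doc"]}"""')
        lines.append(f'    return _call_mcp("{server}", "{tool["name"]}", locals())')
    return "\n".join(lines)

schema = [{"name": "daily_prices",
           "params": {"ticker": "str", "start": "str", "end": "str"},
           "doc": "Daily OHLCV rows for a ticker over a date range."}]
print(module_from_schema("market-data", schema))
```

Generated once at workspace init, this keeps the prompt cost constant per server regardless of how many tools it exposes.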
Show HN: A social feed with no strangers
Grateful is a gratitude app with a simple social layer.

You write a short entry and keep it private or share it to a circle. A circle is a small private group of your own making: family, close friends, whoever you'd actually want to hear from.

It shows you the most recent post first. People in the circle can react or leave a comment. There's also a daily notification that resurfaces something you were grateful for in the past.

Try it out on iOS and Android at grateful.so.
Show HN: Turn your favorite YouTube channels into a streaming experience
A minimalist way to watch YouTube with cinematic previews, an immersive interface, and zero distractions. Free, with no account or subscription needed.
Show HN: Continual Learning with .md
I have a proposal that cheaply addresses long-term memory problems for LLMs when new data arrives continuously. It involves no code, just two Markdown files.

For retrieval, there is a semantic filesystem that makes it easy for LLMs to search using shell commands.

It is currently a scrappy v1, but it works better than anything I have tried.

Curious for any feedback!
Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours
Hi HN!

I've always found it hard to explore the Mahābhārata and Rāmāyaṇa online. Most content is either long-form or scattered, and understanding a character like Karna or Bhishma usually means opening multiple tabs.

I built https://www.ithihasas.in/ to solve that. It is a simple character explorer that lets you navigate the epics through people and their relationships instead of reading everything linearly.

This was also an experiment with the Claude CLI. I was able to put together the first version in a couple of hours. It helped a lot with generating structured content and speeding up development, but the UX and data consistency still needed manual work.

Would love feedback on the UX and on whether this way of exploring mythology works for you.
Show HN: I built a social media management tool in 3 weeks with Claude and Codex
Show HN: Waffle – Native macOS terminal that auto-tiles sessions into a grid
Hi HN. I built Waffle because I kept ending up with 15 terminal windows scattered across three Spaces with no idea what was running where.

Splitting and merging in iTerm kind of works, but it never felt intuitive to me.

With that in mind, I built something to suit my workflow.

Waffle is a native macOS terminal (built on Miguel de Icaza's SwiftTerm) that automatically tiles your sessions into an auto-scaling grid. One session is fullscreen, two are side by side, four form a 2x2 grid, nine form a 3x3. Open a terminal and it joins the grid; close one and the grid rebalances. No splitting, no config.

I've been using it a lot recently, and one thing I've found really useful is that sessions detect which repo they're in and group accordingly. Each project gets a distinct colour. Cmd+[ and Cmd+] flip between groups. If you have three repos open across eight terminals, you can filter to just one project's sessions instantly. Also, no accidentally closing a window with Cmd+W: it shows a confirmation and requires a second Cmd+W to close.

Honestly, if you live in tmux, this probably isn't for you, but it's really helped speed up my workflow.

Other things: it comes with a handful of themes (and supports iTerm themes), bundles JetBrains Mono, and has keyboard shortcuts for everything. Free, no account, opt-in analytics only. macOS 14+.

There's a demo on the landing page if you want to see it in action.
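The grid sizes described above (1 fullscreen, 2 side by side, 4 as 2x2, 9 as 3x3) follow a simple near-square rule. This is my reconstruction of the math, not Waffle's source:

```python
import math

def grid_dims(n: int) -> tuple[int, int]:
    """Smallest near-square grid (rows, cols) that fits n sessions:
    1 -> 1x1 (fullscreen), 2 -> 1x2, 4 -> 2x2, 9 -> 3x3."""
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    return rows, cols

for n in (1, 2, 4, 5, 9):
    print(n, grid_dims(n))
```

The nice property of this rule is that adding or closing a session changes at most one grid dimension, so the rebalance is visually gentle.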
Show HN: Claudraband – Claude Code for the Power User
Hello everyone.

Claudraband wraps a Claude Code TUI in a controlled terminal to enable extended workflows. It uses tmux for visible controlled sessions or xterm.js for headless sessions (a little slower), but everything is mediated by an actual Claude Code TUI.

One example of a workflow I use now is having my current Claude Code interrogate older sessions about certain decisions it made: https://github.com/halfwhey/claudraband?tab=readme-ov-file#self-interrogation

This project provides:

- Resumable non-interactive workflows. Essentially `claude -p` with session support: `cband continue <session-id> 'what was the result of the research?'`
- An HTTP server to remotely control a Claude Code session: `cband serve --port 8123`
- An ACP server to use with alternative frontends such as Zed or Toad (https://github.com/batrachianai/toad): `cband acp --model haiku`
- A TypeScript library so you can integrate these workflows into your own application.

This exists because I was using `tmux send-keys` heavily in a lot of my Claude Code workflows and wanted to streamline it.
Show HN: Oberon System 3 runs natively on Raspberry Pi 3 (with ready SD card)
Show HN: boringBar – a taskbar-style dock replacement for macOS
Hi HN!

I recently switched from a Fedora/GNOME laptop to a MacBook Air. My old setup served me well as a portable workstation, but I've started traveling more while working remotely and needed something with similar performance but better battery life. The main thing I missed was a simple taskbar that shows the windows in the current workspace instead of a Dock that mixes everything together.

I built boringBar so I would not have to use the Dock. It shows only the windows in the current Space, lets you switch Spaces by scrolling on the bar, and adds a desktop switcher so you can jump directly to any Space. You can also hide the system Dock, pin apps, preview windows with thumbnails, and launch apps from a searchable menu (I keep Spotlight disabled because for some reason it uses a lot of system resources on my machine).

I've been dogfooding it for a few months now, and it finally felt polished enough to share.

It's for people who like macOS but want window management to feel a bit more like GNOME, Windows, or a traditional taskbar. It's also for people like me who wanted an easier transition to macOS, especially now that Windows feels increasingly user-hostile.

I'd love feedback on the UX, bugs, and whether this solves the same Dock/Spaces pain for anyone else.

P.S. It might also appeal to people who feel nostalgic for the GNOME 2 desktop of yore. I started my Linux journey with it, and boringBar brings back some of that feeling for me.