The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: Destiny – Claude Code's Fortune Teller skill
Destiny is a Claude Code plugin that gives you a real fortune reading.<p>Type /destiny to see today's destiny!<p>It uses a classical East Asian astrology system. You enter your birthday once; then /destiny gives you today's reading anytime.<p>Two layers, kept honest:<p>1. The numbers (your eight-character birth chart, today's day pillar, the hexagram for the moment, five-element relationships) are computed by a Python script. Same person + same day = identical output. You can verify against any traditional calendar source.<p>2. The prose (today's stars, character sketch, life arc, advice) is written by Claude, applying centuries-old reading conventions to that fixed data. Not an LLM-hallucinated horoscope.<p>If you have fun with it, a star would mean a lot.
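The plugin's actual script isn't shown, but the deterministic layer it describes — mapping a date onto the 60-entry sexagenary (stem-branch) cycle — can be sketched. The anchor date below is a hypothetical calibration point, not the plugin's real one:

```python
from datetime import date

# Ten heavenly stems and twelve earthly branches of the sexagenary cycle.
STEMS = ["Jia", "Yi", "Bing", "Ding", "Wu", "Ji", "Geng", "Xin", "Ren", "Gui"]
BRANCHES = ["Zi", "Chou", "Yin", "Mao", "Chen", "Si", "Wu", "Wei",
            "Shen", "You", "Xu", "Hai"]

# Hypothetical anchor: a date whose day pillar is taken to be Jia-Zi
# (cycle index 0). A real implementation would calibrate this against
# a published traditional calendar.
ANCHOR = date(1984, 2, 2)

def day_pillar(d: date) -> str:
    """Return the stem-branch pair for a given day, deterministically."""
    offset = (d - ANCHOR).days % 60
    return f"{STEMS[offset % 10]}-{BRANCHES[offset % 12]}"
```

Because the pillar is pure arithmetic on the date, the same day always yields the same pair — which is what makes the "numbers" layer verifiable.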
Show HN: Site Mogging
Hi HN,<p>I've been playing around with Cloudflare's Browser Run and Workers AI to create this funny "website vs. website" site.<p>Google's Gemma 4b model is actually quite good at vision.
Show HN: AI CAD Harness
Hi HN, I'm Zach, one of the co-founders of Adam (<a href="https://adam.new">https://adam.new</a>).<p>We've been on HN twice before with text-to-CAD/3D experiments [1][2]. The honest takeaway from those threads: prompt-to-3D model web apps are fun, but serious mechanical engineers don't want a black box that spits out an STL. They want help inside the CAD tool they already use, with full visibility and control over the feature tree.<p>So we built that. Adam is now a harness that integrates directly with your CAD. It reads your parts, understands the existing feature tree, and edits it for you agentically. We are now live in beta on Onshape and Fusion! [3]:<p>Install link for Autodesk Fusion: <a href="https://fusion.adam.new/install">https://fusion.adam.new/install</a><p>Install link for PTC Onshape:
<a href="https://cad.onshape.com/appstore/apps/Design & Documentation/690a8dc864e816c112aa66a0" rel="nofollow">https://cad.onshape.com/appstore/apps/Design & Documenta...</a><p>Things people are using it for today: - "Merge redundant features and clean up my tree" - "Rename every feature so the tree is actually readable" - "Round all internal edges with a 2mm fillet" - “Parametrize my model” - Along with of course, using Adam to generate CAD end-to-end!<p>A few things we care about that aren't obvious from the listing:<p>1. From the start we have always believed in CAD as code as the right abstraction. Our harness leverages Onshape's FeatureScript and Python in Fusion heavily.<p>2. We run an internal CAD benchmark across frontier models. There has been a massive jump in the spatial reasoning capabilities of recent models, particularly GPT 5.5 and Opus 4.7 [4] [5]<p>3. We open-sourced our earlier text-to-CAD work [6]<p>A note on the Anthropic Autodesk connector that shipped a couple days ago [7]: We think it's great for the space and validates the direction.<p>Where Adam is different: - Model-agnostic. We pick whichever frontier model is winning on each task type from our own internal bench, instead of being tied to one lab. 
- We live natively in your CAD apps and are actively building integrations across all programs<p>What would you want an in-CAD agent to do that nothing does today?<p>[1] <a href="https://news.ycombinator.com/item?id=44182206">https://news.ycombinator.com/item?id=44182206</a><p>[2] <a href="https://news.ycombinator.com/item?id=45140921">https://news.ycombinator.com/item?id=45140921</a><p>[3] <a href="https://x.com/adamdotnew/status/2050264512230719980?s=20" rel="nofollow">https://x.com/adamdotnew/status/2050264512230719980?s=20</a><p>[4] <a href="https://x.com/adamdotnew/status/2044859329329893376?s=20" rel="nofollow">https://x.com/adamdotnew/status/2044859329329893376?s=20</a><p>[5] <a href="https://x.com/adamdotnew/status/2047795078912172122?s=20" rel="nofollow">https://x.com/adamdotnew/status/2047795078912172122?s=20</a><p>[6] <a href="https://github.com/Adam-CAD/CADAM" rel="nofollow">https://github.com/Adam-CAD/CADAM</a><p>[7] <a href="https://x.com/claudeai/status/2049143440508616863?s=20" rel="nofollow">https://x.com/claudeai/status/2049143440508616863?s=20</a>
Show HN: Perfect Bluetooth MIDI for Windows
Hi HN, I'm Erwin. I built a small free open-source utility that bridges Bluetooth LE MIDI keyboards into the new Windows MIDI Services stack so any DAW or Web MIDI app can use them as if they were wired.<p>I bought a Roland FP-90X piano partly because it had Bluetooth MIDI. On my Windows 11 PC, pairing succeeded, but my DAW couldn't see the keyboard, and notes I sent from the PC never made the piano sing. After a regrettable number of evenings, I'd separated this into three independent bugs stacked on top of each other.<p>The first one is the famous one: Windows only natively exposes BLE-MIDI through the WinRT API, which almost no DAW polls. So even when pairing succeeds, MIDI apps still don't see the device. The usual workaround is MIDIberry + loopMIDI, but I couldn't get that combination to work reliably in my case, and I wanted a single-app solution. The new Windows MIDI Services stack ships with a feature called loopback endpoints: anything written to one comes out the other, and any winmm/WinRT/WMS app sees them as normal MIDI ports. So the app does WinRT BLE-MIDI in, WMS loopback out. That solved direction one, piano to PC.<p>Direction two, PC to piano, still didn't work. NoteOn writes were getting ATT-acked, but the piano stayed silent. I tried both write modes (some BLE-MIDI firmware silently drops one or the other), poked the proprietary ISSC characteristic. Every variant ATT-acked, every variant produced silence. So the bytes were reaching the piano. Something above the GATT layer was discarding them.<p>After ruling out pairing, encryption, write-mode, and proprietary characteristics, the only obvious lever left was the MIDI channel itself. The FP-90X has a panel setting called Transmit Channel, default 1. Yet it turns out the FP-90X actually receives on channel 4 (and it can't be changed). Notes I sent on channel 1 were being GATT-acked and silently dropped at the synth engine because they weren't on the channel the engine was listening to. 
Zero feedback at any layer. The fix had to live up at the application layer, so I added a Detect button that plays N test notes ascending on each channel from 1 to 16: you count the notes you actually hear, and that number is the receive channel. Saved per BLE MAC, about 75 seconds, done forever per piano.<p>Tech stack: .NET 10, Avalonia for the UI (the BLE/MIDI side is Windows-only but the UI layer is portable), Microsoft.Windows.Devices.Midi2 packages for WMS, Windows.Devices.Midi (WinRT) directly for BLE rather than relying on Korg's older WinMM driver. MIT, single self-contained ~21 MB exe, no installer, no telemetry, no account.<p>I built it for myself and use it with my FP-90X to play through a few apps and Web MIDI sites. Pete from the Microsoft Windows MIDI Services team commented positively on the BLE integration when I shared it on r/synthesizers (<a href="https://www.reddit.com/r/synthesizers/comments/1szvuiq/comment/oj5ew9b/" rel="nofollow">https://www.reddit.com/r/synthesizers/comments/1szvuiq/comme...</a>).<p>Site (with screenshots): <a href="https://mayerwin.github.io/Perfect-Bluetooth-MIDI-For-Windows/" rel="nofollow">https://mayerwin.github.io/Perfect-Bluetooth-MIDI-For-Window...</a><p>Source: <a href="https://github.com/mayerwin/Perfect-Bluetooth-MIDI-For-Windows" rel="nofollow">https://github.com/mayerwin/Perfect-Bluetooth-MIDI-For-Windo...</a><p>Long-form technical writeup with the full debugging story: <a href="https://dev.to/mayerwin/why-your-bluetooth-midi-keyboard-silently-drops-notes-on-windows-2i84" rel="nofollow">https://dev.to/mayerwin/why-your-bluetooth-midi-keyboard-sil...</a><p>Personally tested with my FP-90X only. The BLE side is generic, so other keyboards (WIDI Master, CME, Yamaha MD-BT01, Korg microKey Air, ROLI Seaboard, etc.) should work, but I haven't confirmed individually. Device test reports, issues, and PRs very welcome.
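The Detect trick — play notes on every channel, let the count of audible notes reveal the receive channel — can be sketched. One plausible realization (the app's exact scheme isn't shown) is for channel k to play k ascending notes, so hearing four notes means channel 4:

```python
def detect_sequence(base_note: int = 60) -> list[bytes]:
    """Build a MIDI probe: channel k (1..16) plays k ascending notes,
    so the number of notes you hear equals the receive channel."""
    msgs = []
    for ch in range(1, 17):
        for i in range(ch):
            note = base_note + i
            # MIDI channels are 0-indexed on the wire: status = type | (ch - 1)
            msgs.append(bytes([0x90 | (ch - 1), note, 100]))  # NoteOn, velocity 100
            msgs.append(bytes([0x80 | (ch - 1), note, 0]))    # NoteOff
    return msgs
```

Since the synth engine silently drops NoteOn messages on every channel except the one it listens to, only one channel's run of notes ever sounds — no GATT-level feedback required.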
Show HN: Winpodx – run Windows apps on Linux as native windows
Show HN: GhostBox – Borrow a disposable little machine from the Global Free Tier
I built this because I was always creating machines on GH Actions to test builds on different OSes, and I wanted a tight CLI that could do it. I always saw Actions as a great resource, and ephemeral machines I could do dev work in were a natural way for me to work, so this grew out of that workflow.<p>I didn't expect it to blow up, so it wasn't 100% finished when I posted it. But it should stabilize pretty quickly.<p>Happy to hear what you think and talk about it.
Show HN: WhatCable, a tiny menu bar app for inspecting USB-C cables
USB-C cables can be a mess. One cable charges at 5W, another does 100W and Thunderbolt 4, and they look identical in the drawer.<p>WhatCable sits in your menu bar and reads the cable data your Mac already has access to. Plug in a cable and it tells you in plain English what it can actually do: charging wattage, data speed, display support, Thunderbolt, etc.<p>Built in Swift/SwiftUI. Open source, free, no tracking.<p>GitHub: <a href="https://github.com/darrylmorley/whatcable" rel="nofollow">https://github.com/darrylmorley/whatcable</a>
Show HN: TRiP – a complete transformer engine in C built from scratch just by me
Show HN: I wrote a DOOM clone in my own programming language
Show HN: My retired dad and I made a daily, somewhat difficult, quiz
My dad makes the questions, I made the site.<p>I think the genre and the level of difficulty are suited for HN. Hope you enjoy.<p>(I promise no AI-generated questions; they are all handmade!)
Show HN: Pu.sh – a full coding-agent harness in 400 lines of shell
I was originally just messing with pi-autoresearch. Gave it a sample task to build the most portable coding agent.<p>First cut was 6 KB of shell. Great for one-shots, unusable interactively. I was shocked it actually worked.<p>Started building up -- adding features -- but with a self-imposed rule: no new dependencies, and sub-500 LOC. This thing had to be truly portable. Just sh, curl, awk. System primitives only.<p>Which means I did some genuinely disgusting things in awk, including JSON parsing and the OpenAI
Responses tool loop with reasoning items carried across turns.<p>It's now ~400 lines. In the box: Anthropic + OpenAI, 7 tools (bash, read, write, edit, grep, find, ls),
REPL, auto-compaction, checkpoint/resume, pipe mode, 90 no-API tests. Not in the box: TUI, streaming,
images, OAuth, Windows, dignity.<p>Two honest things:<p>1. I stole/modified the system prompt and the architecture. Pi/Claude/Codex wrote the awk. I cannot read most of this code. This wasn't possible for me a year ago.<p>2. Heavily inspired by Pi (pi.dev) — same 7-tool surface, same exact-text edit model. Credit where it's
due. Pi is awesome -- you should probably use them.<p>The agent loop itself is tiny. Almost everything else in a "real" agent CLI is DX and hardening. You can
probably build your own harness exactly how you like it. Mario Zechner's AI Engineer talk on taking back control of
your tools nudged me here.<p>The name is because it's a .sh file. The other thing it sounds like is, regrettably, also accurate.
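"The agent loop itself is tiny" is the key claim, and it holds up in any language. Here is a sketch of that loop in Python rather than the author's shell, with a stub standing in for the model API call (the real harness talks to Anthropic/OpenAI and exposes seven tools, not one):

```python
import subprocess

def run_bash(cmd: str) -> str:
    """The only tool in this stub: run a shell command, capture output."""
    r = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return r.stdout + r.stderr

def agent_loop(task: str, model) -> str:
    """Minimal tool loop: ask the model; execute any tool call it
    returns; append the result; repeat until it answers in plain text."""
    history = [{"role": "user", "content": task}]
    while True:
        reply = model(history)  # stub standing in for an API call
        if reply.get("tool") == "bash":
            history.append({"role": "tool", "content": run_bash(reply["cmd"])})
        else:
            return reply["content"]
```

Everything else in a production harness — compaction, checkpointing, REPL, streaming — is layered around this dozen-line core, which is exactly the post's point.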
Show HN: A new benchmark for testing LLMs for deterministic outputs
When building workflows that rely on LLMs, we commonly use structured output for programmatic use cases like converting an invoice into rows, meeting transcripts into tickets, or even complex PDFs into database entries.<p>The model may return the schema you want, but with hallucinated values, like `invoice_date` being off by 2 months or the transcript array ordered wrongly. The JSON is valid, but the values are not.<p>Structured output is a big part of using LLMs today, especially when building deterministic workflows.<p>Current structured output benchmarks (e.g., JSONSchemaBench) only validate the pass rate for JSON schema and types, not the actual values within the produced JSON.<p>So we designed the Structured Output Benchmark (SOB), which fixes this by measuring JSON schema pass rate, type conformance, and value accuracy across three modalities: text, image, and audio.<p>For our test set, every record is paired with a JSON Schema and a ground-truth answer that was verified manually against the source context by a human, with an LLM cross-check, so a missing or hallucinated value counts as wrong.<p>Open source is doing pretty well, with GLM 4.7 coming in second, right after GPT 5.4.<p>We noticed the rankings shift across modalities: GLM-4.7 leads text, Gemma-4-31B leads images, Gemini-2.5-Flash leads audio.<p>For example, GPT-5.4 ranks 3rd on text but 9th on images.<p>Model size is not a predictor, either: Qwen3.5-35B and GLM-4.7 beat GPT-5 and Claude-Sonnet-4.6 on Value Accuracy. Phi-4 (14B) beats GPT-5 and GPT-5-mini on text.<p>Structured hallucinations are the hardest bug. Such values are type-correct, schema-valid, and plausible, so they slip through most guardrails. For example, in one audio record, the ground truth is "target_market_age": "15 to 35 years", and a model returns "25 to 35". 
This is invisible without field-level checks.<p>Our goal is to be the best general model for deterministic tasks, and a key aspect of determinism is a controllable and consistent output structure. The first step to making structured output better is to measure it and hold ourselves against the best.
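The benchmark's actual scorer isn't shown, but a field-level value check — the thing schema validation alone misses — can be sketched. Exact matching per leaf is an assumption; a real scorer might normalize dates or tolerate near-misses:

```python
def leaf_items(obj, prefix=""):
    """Flatten nested JSON into (path, value) pairs."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield from leaf_items(v, f"{prefix}{k}.")
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            yield from leaf_items(v, f"{prefix}{i}.")
    else:
        yield prefix.rstrip("."), obj

def value_accuracy(predicted: dict, truth: dict) -> float:
    """Fraction of ground-truth leaves the prediction matches exactly.
    Missing or hallucinated values count as wrong."""
    truth_leaves = dict(leaf_items(truth))
    pred_leaves = dict(leaf_items(predicted))
    if not truth_leaves:
        return 1.0
    hits = sum(pred_leaves.get(p) == v for p, v in truth_leaves.items())
    return hits / len(truth_leaves)
```

A schema-valid but value-wrong answer like the "25 to 35" example above scores 0 here, which is precisely the failure mode schema-only benchmarks wave through.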
Show HN: Adblock-rust Manager – Firefox extension to enable the Brave ad blocker
Firefox 149 ships adblock-rust (Brave's Rust engine, MPL-2.0) completely disabled with no UI. It's controlled by two about:config prefs with no WebExtension API, so you can't touch them programmatically from a standard extension.<p>This extension gives it a UI: ETP toggle (via browser.privacy API, instant), filter list manager with clipboard helpers for the manual about:config steps, and 8 preset lists. You can also add your own if you so desire.
Show HN: Rocky – Rust SQL engine with branches, replay, column lineage
Hi HN, I'm Hugo. I've been building Rocky over the past month, shipping fast in the open. The binary is on GitHub Releases, `dagster-rocky` on PyPI, and the VS Code extension on the Marketplace. I held off on a broader announcement until the trust-system surface was coherent enough to talk about as one thing. The governance waveplan — column classification, per-env masking, 8-field audit trail on every run, `rocky compliance` rollup, role-graph reconciliation, retention policies — landed end-to-end last week in engine-v1.16.0 and rounded out in v1.17.4 (tagged 2026-04-26). That's the milestone I'd been waiting for.<p>The pitch: keep Databricks or Snowflake. Bring Rocky for the DAG. Rocky is a Rust-based control plane for warehouse pipelines. Storage and compute stay with your warehouse. Rocky owns the graph — dependencies, compile-time types, drift, incremental logic, cost, lineage, governance. The things your current stack can't give you because it doesn't own the DAG.<p>A few things I think are interesting:<p>- Branches + replay. `rocky branch create stg` gives you a logical copy of a pipeline's tables (schema-prefix today; native Delta SHALLOW CLONE and Snowflake zero-copy are next). `rocky replay <run_id>` reconstructs which SQL ran against which inputs. Git-grade workflow on a warehouse.<p>- Column-level lineage from the compiler, not a post-hoc graph crawl. The type checker traces columns through joins, CTEs, and windows. VS Code surfaces it inline via LSP.<p>- Governance as a first-class surface. Column classification tags plus per-env masking policies, applied to the warehouse via Unity Catalog (Databricks) or masking policies (Snowflake). 8-field audit trail on every run. `rocky compliance` rollup that CI can gate on. Role-graph reconciliation via SCIM + per-catalog GRANT. Retention policies with a warehouse-side drift probe.<p>- Cost attribution. Every run produces per-model cost (bytes, duration). 
`[budget]` blocks in `rocky.toml`; breaches fire a `budget_breach` hook event.<p>- Compile-time portability + blast radius. Dialect-divergence lint across Databricks / Snowflake / BigQuery / DuckDB (12 constructs). `SELECT *` downstream-impact lint.<p>- Schema-grounded AI. Generated SQL goes through the compiler — AI suggestions type-check before they can land.<p>What Rocky isn't:<p>- Not a warehouse — it's the control plane on top.<p>- Not a Fivetran replacement. `rocky load` handles files (CSV/Parquet/JSONL); for SaaS sources use Fivetran, Airbyte, or warehouse-native CDC.<p>- Not dbt Cloud — no hosted UI, no managed scheduler. First-class Dagster integration if you need orchestration.<p>Adapters: Databricks (GA), Snowflake (Beta), BigQuery (Beta), DuckDB (local dev / playground). Apache 2.0.<p>I'd love feedback on the trust-system framing, the governance surface (particularly classification-to-masking resolution in `rocky compile` and the `rocky compliance` CI gate), the branches/replay design, the cost-attribution primitives, or anything else that catches your eye. Happy to go deep in the thread.