The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: A roguelike game that runs inside Notepad++
Show HN: Chibi, AI that tells you why users churn
Hey HN,

I've been a PM for 3 years, and one of the hardest parts was always understanding why users churn, drop off, and behave the way they do.

Session replays had the answer, but watching hours of them was painful.

I chatted with a bunch of founder friends and PMs, and they had similar troubles.

So I built Chibi, an AI that watches replays and tells you what's broken, confusing, or causing drop-off.

Long term: I'm wondering whether Chibi could evolve into an AI product-manager co-worker that can detect and prioritize issues, think through features, and even run A/B tests.

Tech stack: Elixir + Phoenix, rrweb, and Gemini.

Would love to know what you think :) Happy to answer any questions too.
Show HN: Entropy-Guided Loop – How to make small models reason
TL;DR: A small, vendor-agnostic inference loop that turns token logprobs/perplexity/entropy into one extra reasoning pass for LLMs.

- Captures logprobs/top-k during generation; computes perplexity and token-level entropy.
- Triggers at most one refine when simple thresholds fire; passes a compact "uncertainty report" (uncertain tokens + top-k alternatives + local context) back to the model.
- In our tests on technical Q&A / math / code, a small model recovered much of the "reasoning" quality at ~⅓ the cost while refining ~⅓ of outputs.

I kept seeing "reasoning" models behave like expensive black boxes. Meanwhile, standard inference already computes useful signals both before softmax normalization and after it (logprobs), which we usually throw away. This loop tries the simplest thing you could think of: use those signals to decide when (and where) to think again.

GitHub (notebook + minimal code): https://github.com/monostate/weave-logprobs-reasoning-loop

Paper (short & engineer-made): https://arxiv.org/abs/2509.00079

Blog (more context): https://monostate.ai/blog/entropy-refinement-blog

Requirements: Python and an API that exposes logprobs (tested with OpenAI's non-reasoning 4.1). Set OPENAI_API_KEY, and WEAVE for observability. Run the notebook; it prints metrics and shows which tokens triggered refinement.

- Python, simple loop (no retraining).
- Uses Responses API logprobs/top-k; metrics: perplexity, max token entropy, low-confidence counts.
- Weave for lightweight logging/observability (optional).
- Passing alternatives (not just "this looks uncertain") prevents over-correction.
- A simple OR rule (perplexity / max entropy / low-confidence count) catches complementary failure modes.
- Numbers drift across vendors; keeping the method vendor-agnostic is better than chasing fragile pairings.
- Needs APIs that expose logprobs/top-k.
- Results are indicative, not a leaderboard; the focus is on within-model gains (single pass vs. +loop).
- Thresholds might need light tuning per domain.
- One pass only; not a chain-of-thought replacement.
- Run it on your own models and ideas (e.g., 4o-mini, v3, Llama variants with logprobs) and share logs in a PR for our README on GitHub if you'd like. PRs welcome; I'll credit and link.

Overall, let me know if you find making small models reason like this useful!
Show HN: We built an open-source alternative to expensive pair programming apps
My friend and I grew frustrated with the high cost of existing pair programming tools, and of course with the grainy screens we got when we used Huddle or similar tools.

We believe core developer collaboration shouldn't be locked behind an expensive subscription.

So for the past year we spent our nights and weekends building Hopp, an open-source alternative.

We would love your feedback, and we're here to answer any and all questions.
Show HN: LightCycle, a FOSS game in Rust based on Tron
Show HN: VoiceGecko – System-wide voice-to-text that types anywhere
Show HN: Davia – A community platform to build, share, and edit applications
Hi HN!

I'm Ruben, the founder of Davia (https://davia.ai/). Davia is a platform to build, edit, and share applications, where builders get rewarded based on usage while users can discover and reuse any app's code to fit their own needs.

The problem we're trying to solve: today's AI coding platforms charge you upfront for creation - it's like paying per block modification in no-code solutions. This works for commercial projects because you're aiming to generate revenue that exceeds your creation costs, but it doesn't fit the reality of side projects and community apps that builders share for free. Most side projects require significant SEO effort and marketing to reach people who would actually find them useful, leaving creators with great apps but no discoverability. And for users, when you need a specific tool for your project, you don't want to build it from scratch when someone has probably already created something similar.

Our solution is a YouTube-style platform for apps where builders publish with open code so anyone can view, modify, and reuse applications. Unlike traditional platforms like GitHub (designed for developers as both builders and users) or app stores (where code is closed and can't be modified), we focus on making apps accessible and customizable for everyone.

We verify code quality and prevent spam to maintain platform standards. Builders get rewarded when their original apps are used as foundations for others' projects. Users discover apps, customize them for their needs, and build their own versions through reuse. They can either self-host or use our platform hosting, paying based on team usage when they scale to multiple users on our infrastructure.

You can try it here: https://davia.ai/. We'd love to hear from the HN community, whether you're a vibe-coder or curious about creator economics in the age of AI-assisted development!
Show HN: Blueprint: Fast, Nunjucks-like templating engine for Java 8 and beyond
I love the simplicity, expressiveness, and extensibility of Nunjucks.

But I wasn't able to find anything similar for Java, especially with the same syntax.

So I built one. And it's pretty fast too.

https://github.com/freakynit/Blueprint
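For anyone unfamiliar with the style being ported: Nunjucks closely follows Jinja2's template syntax, so a minimal Jinja2 (Python) sketch shows the flavor of template Blueprint aims at (this illustrates the syntax only, not Blueprint's actual Java API):

    from jinja2 import Template

    # Nunjucks-style {{ expressions }}, {% tags %} and | filters.
    template = Template(
        "Hello {{ user.name }}!\n"
        "{% for item in items %}- {{ item | upper }}\n{% endfor %}"
    )
    print(template.render(user={"name": "Ada"}, items=["one", "two"]))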
Show HN: Fine-tuned Llama 3.2 3B to match 70B models for local transcripts
I wrote a small local tool to transcribe audio notes (Whisper/Parakeet). Code: https://github.com/bilawalriaz/lazy-notes

I wanted to process raw transcripts locally without OpenRouter. Llama 3.2 3B with a prompt was decent but incomplete, so I tried SFT.

I fine-tuned Llama 3.2 3B to clean/analyze dictation and emit structured JSON (title, tags, entities, dates, actions).

Data: 13 real memos → Kimi K2 gold JSON → ~40k synthetic + gold; keys canonicalized. Chutes.ai (5k req/day).

Training: RTX 4090 24GB, ~4h, LoRA (r=128, α=128, dropout=0.05), max seq 2048, bs=16, lr=5e-5, cosine, Unsloth. On a 2070 Super 8GB it was ~8h.

Inference: merged to GGUF, Q4_K_M (llama.cpp); runs in LM Studio.

Evals (100 samples, scored by GLM 4.5 FP8): overall 5.35 (base 3B) → 8.55 (fine-tuned); completeness 4.12 → 7.62; factual 5.24 → 8.57.

Head-to-head (10 samples): ~8.40 vs. Hermes-70B 8.18, Mistral-Small-24B 7.90, Gemma-3-12B 7.76, Qwen3-14B 7.62. Teacher Kimi K2 ~8.82.

Why it works: task specialization + JSON canonicalization reduce variance; the model learns the exact structure and fields.

Lessons: train on completions only; synthetic data is fine for narrow tasks; Llama is straightforward to train.

Dataset pipeline + training script + evals: https://github.com/bilawalriaz/local-notes-transcribe-llm
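For reference, the LoRA setup described above looks roughly like this in Hugging Face PEFT (the author used Unsloth; the model id and target modules here are assumptions, while the hyperparameters mirror the post):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_id = "meta-llama/Llama-3.2-3B-Instruct"  # assumed variant
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    lora = LoraConfig(
        r=128, lora_alpha=128, lora_dropout=0.05,  # as in the post
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()
    # Train with an SFT trainer on completions only: max seq 2048,
    # batch size 16, lr 5e-5, cosine schedule (per the post).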
Show HN: My first Go project, a useless animated bunny sign for your terminal
Hi HN, I wanted to share my very first (insignificant) project written in Go: a little CLI tool that displays messages with an animated bunny holding a sign.

I wanted to learn Go and needed a small, fun project to get my hands dirty with the language and with the process of building and distributing a CLI. I'd built a similar tool in JavaScript before, so I thought porting it would be a great learning exercise.

This was a dive into Go's basics for me, from package structure and CLI flag parsing to building binaries for different platforms (something I never did on my JS projects).

I'm starting to understand why Go is so praised: its standard library is huge compared with other languages. One thing that really impressed me was deciding, at some point in this journey, to build a piece of functionality myself where the original JavaScript project used an external library. With everything the standard library gave me, I thought "why not try to write the function myself?" and it worked! In the JS version I used the Node.js "log-update" package; here I wrote a dedicated package (a minimal sketch of the underlying trick follows below).

I know it's a bit silly, but I could see it being used to add some fun to build scripts, highlight important log messages, or just make a colleague smile. It's easy to install if you have Go set up:

    go install github.com/fsgreco/go-bunny-sign/cmd/bunnysign@latest

Since I'm new to Go, I would genuinely appreciate any feedback on the code, project structure, or Go best practices. The README also lists my planned next steps, like adding tests and setting up CI better.

Thanks for taking a look!
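For the curious, the core trick behind log-update-style in-place rendering is just ANSI escape codes. A minimal sketch of the idea in Python (not the author's Go package; the frames are made up):

    import sys
    import time

    FRAMES = ["(\\_/)\n( . .) hi", "(\\_/)\n( o o) hi!"]

    def render(frame, prev_lines):
        # \x1b[1A moves the cursor up one line; \x1b[2K erases that line.
        # Clearing the previous frame and reprinting creates the animation.
        sys.stdout.write("\x1b[1A\x1b[2K" * prev_lines)
        sys.stdout.write(frame + "\n")
        sys.stdout.flush()
        return frame.count("\n") + 1

    prev = 0
    for _ in range(10):
        for f in FRAMES:
            prev = render(f, prev)
            time.sleep(0.3)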
Show HN: Woomarks, transfer your Pocket links to this app or self-host it
Pocket is shutting down and I really, really liked it. So I built woomarks, an app that lets you save links with a similar UI. It's very minimal, but it does everything I liked from Pocket, and you can bulk import your links and use the app or self-host it.

- Public app that you can test: https://woomarks.com/
- My self-hosted version, where you can see my saves: https://roberto.fyi/bookmarks/
- Repository if you want to self-host: https://github.com/earlyriser/woomarks

You can export your links from Pocket here: https://getpocket.com/export (the last day will be in October 2025).

Features:
- Add/Delete links
- Search
- Tags
- Bookmarklet (useful for a 2-click-save)
- Data reads from:
  - a CSV file on the server (these links are public)
  - local storage in the browser (these links are visible only to the user)
- Local storage saving.
- Import to local storage from csv file
- Export to csv from local storage.
- Export to csv from csv file (useful when links are "deleted" using the app but are really just hidden via a local storage blacklist; see the sketch after this list).
- Export to csv from both places.
- No external libraries.
- Vanilla CSS.
- Vanilla JS.
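To make the blacklist-based export concrete, here's a minimal sketch of the merge logic in Python (woomarks itself is vanilla JS; the column names are assumptions):

    import csv
    import io

    FIELDS = ["url", "title", "tags"]  # assumed CSV columns

    def export_merged(server_csv_text, local_saves, blacklist):
        # Server links whose URL is on the localStorage-style blacklist
        # count as deleted; locally saved links are appended afterwards.
        rows = list(csv.DictReader(io.StringIO(server_csv_text)))
        keep = [r for r in rows if r.get("url") not in blacklist]
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(keep + local_saves)
        return out.getvalue()

    print(export_merged(
        "url,title,tags\nhttps://a.example,A,read\nhttps://b.example,B,later\n",
        [{"url": "https://c.example", "title": "C", "tags": ""}],
        {"https://b.example"},
    ))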