The best Hacker News stories from Show from the past week
Latest posts:
Show HN: My 70 year old grandma is learning to code and made a word game
Show HN: Attaching to a virtual GPU over TCP
We developed a tool to trick your computer into thinking it’s attached to a GPU which actually sits across a network. This allows you to switch the number or type of GPUs you’re using with a single command.
Show HN: LLM-aided OCR – Correcting Tesseract OCR errors with LLMs
Almost exactly 1 year ago, I submitted something to HN about using Llama2 (which had just come out) to improve the output of Tesseract OCR by correcting obvious OCR errors [0]. That was exciting at the time because OpenAI's API calls were still quite expensive for GPT4, and the cost of running it on a book-length PDF would just be prohibitive. In contrast, you could run Llama2 locally on a machine with just a CPU, and it would be extremely slow, but "free" if you had a spare machine lying around.<p>Well, it's amazing how things have changed since then. Not only have models gotten a lot better, but the latest "low tier" offerings from OpenAI (GPT4o-mini) and Anthropic (Claude3-Haiku) are incredibly cheap and incredibly fast. So cheap and fast, in fact, that you can now break the document up into little chunks and submit them to the API concurrently (where each chunk can go through a multi-stage process, in which the output of the first stage is passed into another prompt for the next stage) and assemble it all in a shockingly short amount of time, and for basically a rounding error in terms of cost.<p>My original project had all sorts of complex stuff for detecting hallucinations and incorrect, spurious additions to the text (like "Here is the corrected text" preambles). But the newer models are already good enough to eliminate most of that stuff. And you can get very impressive results with the multi-stage approach. In this case, the first pass asks it to correct OCR errors and to remove line breaks in the middle of a word and things like that. The next stage takes that as the input and asks the model to do things like reformat the text using markdown, to suppress page numbers and repeated page headers, etc. Anyway, I think the samples (which take less than 1-2 minutes to generate) show the power of the approach:<p>Original PDF:
<a href="https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main/160301289-Warren-Buffett-Katharine-Graham-Letter.pdf">https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main...</a><p>Raw OCR Output:
<a href="https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main/160301289-Warren-Buffett-Katharine-Graham-Letter__raw_ocr_output.txt">https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main...</a><p>LLM-Corrected Markdown Output:
<a href="https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main/160301289-Warren-Buffett-Katharine-Graham-Letter_llm_corrected.md">https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main...</a><p>One interesting thing I found was that almost all my attempts to fix/improve things using "classical" methods like regex and other rule-based approaches made everything worse and more brittle, and the real improvements came from adjusting the prompts to make things clearer for the model, and not asking the model to do too much in a single pass (like fixing OCR mistakes AND converting to markdown format).<p>Anyway, this project is very handy if you have some old scanned books from Archive.org or Google Books that you want to read on a Kindle or other ereader device and want things to be re-flowable and clear. It's still not perfect, but I bet within the next year the models will improve enough that it will get closer to 100%. Hope you like it!<p>[0] <a href="https://news.ycombinator.com/item?id=36976333">https://news.ycombinator.com/item?id=36976333</a>
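The chunked, multi-stage flow described in this post can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `call_llm`, `split_into_chunks`, `process_chunk`, and `process_document` are hypothetical names, the prompts are paraphrased from the post, and `call_llm` is a stand-in that only tidies whitespace so the example runs without an API key.

```python
import asyncio


async def call_llm(prompt: str, text: str) -> str:
    """Hypothetical stand-in for a real API call; it only collapses whitespace."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return " ".join(text.split())


def split_into_chunks(text: str, max_chars: int = 2000) -> list:
    """Pack whole paragraphs into chunks of at most roughly max_chars characters."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


async def process_chunk(chunk: str) -> str:
    # Stage 1: correct OCR errors only.
    corrected = await call_llm(
        "Fix OCR errors and rejoin words split across lines.", chunk
    )
    # Stage 2: take the stage 1 output and handle formatting only.
    return await call_llm(
        "Reformat as markdown; drop page numbers and repeated headers.", corrected
    )


async def process_document(text: str, max_concurrency: int = 8) -> str:
    semaphore = asyncio.Semaphore(max_concurrency)  # cap in-flight requests

    async def guarded(chunk: str) -> str:
        async with semaphore:
            return await process_chunk(chunk)

    # gather() returns results in input order, so chunks reassemble correctly.
    results = await asyncio.gather(
        *(guarded(c) for c in split_into_chunks(text))
    )
    return "\n\n".join(results)
```

Keeping each stage to a single task mirrors the observation above that asking the model to do too much in one pass hurt the results.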
Show HN: BudgetFlow – Budget planning using interactive Sankey diagrams
Show HN: I've spent nearly 5y on a web app that creates 3D apartments
Show HN: 1-FPS encrypted screen sharing for introverts
I wanted to show you something I've been hacking on for the last few weeks.<p>I got tired of sharing my screen via Google Meet with its 1-hour limit, Zoom with its 40-minute limit, or a paid Slack subscription. And oftentimes I just needed to share my screen with no audio.<p>So I ended up with my own solution: no registration, low memory, low CPU, low-tech 1 fps encrypted screen sharing. Currently it shares only the main screen (good for laptop users).<p>It's very raw in terms of infrastructure: I'm not counting bytes (yikes!), and everything runs on my own dedicated server. But the service itself has been tested; we've been sharing screens for countless hours. All sessions last for 48 hours, then they are removed along with all remaining info.<p>Every new frame replaces the previous one, and everything is end-to-end encrypted, so even server owners and operators won't be able to see what you are sharing.<p>There is also no tracking, except on the main page, where I use my own analytics. Sessions are not tracked and never will be, and observability is currently not in place.<p>Again, this is a true one-person side project that I hope (though I have serious doubts) I might need to scale to support more users if it gets traction.
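To make the "every new frame replaces the previous one" and 48-hour expiry behavior concrete, here is a rough sketch of what the server side could look like. Everything in it is an assumption for illustration (`FrameStore`, `Session`, and the method names are hypothetical, and the post does not describe the real implementation). Encryption is not shown: the store only ever sees opaque ciphertext produced by the sharing client.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

SESSION_TTL_SECONDS = 48 * 60 * 60  # sessions are removed after 48 hours


@dataclass
class Session:
    created_at: float
    latest_frame: Optional[bytes] = None  # ciphertext only; the server holds no key


class FrameStore:
    """Keeps at most one encrypted frame per session (hypothetical sketch)."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._sessions: Dict[str, Session] = {}

    def create(self, session_id: str) -> None:
        self._sessions[session_id] = Session(created_at=self._clock())

    def put_frame(self, session_id: str, ciphertext: bytes) -> bool:
        session = self._live(session_id)
        if session is None:
            return False
        session.latest_frame = ciphertext  # overwrite; no history is kept
        return True

    def get_frame(self, session_id: str) -> Optional[bytes]:
        session = self._live(session_id)
        return session.latest_frame if session else None

    def _live(self, session_id: str) -> Optional[Session]:
        """Return the session if it exists and has not expired, else None."""
        session = self._sessions.get(session_id)
        if session is None:
            return None
        if self._clock() - session.created_at >= SESSION_TTL_SECONDS:
            del self._sessions[session_id]  # drop the session and its last frame
            return None
        return session
```

Because only the newest frame is stored, memory per session stays bounded by one frame regardless of how long the session runs.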
Show HN: Iso20022.js – Create payments in 3 lines of code
Show HN: Pie Menu – a radial menu for macOS
Hi everyone! I'm Marius Hauken, an indie developer, and I'm excited to share my app: Pie Menu. It offers a fresh way to access your favorite menu bar commands and keyboard shortcuts on macOS. By simply pressing a hotkey you choose during setup, a radial menu appears around your cursor, customized to the current active application. This allows you to quickly select commands without having to remember complex shortcuts across different applications.<p>Pie Menu comes with a library of preprogrammed commands for popular apps, but you can easily add any app on your computer. We've also created an extensive database at <a href="https://www.pie-menu.com/shortcuts" rel="nofollow">https://www.pie-menu.com/shortcuts</a> where you can quickly add shortcuts for different programs. If a command lacks a keyboard shortcut, you can always create one through System Preferences > Keyboard > Application Shortcuts.<p>For now, you can use Apple’s SF Symbols to label your commands, but we plan to include custom symbol sets in the future. You can see and vote on our roadmap at <a href="https://www.pie-menu.com/help/roadmap" rel="nofollow">https://www.pie-menu.com/help/roadmap</a>.<p>I hope you give Pie Menu a try and find it as useful as I intended!
Show HN: Free e-book about WebGPU Programming
I am excited to announce the launch of my e-book on Graphics/WebGPU programming! This project has consumed much of my spare time, during which I developed various tools to facilitate the publishing process, including a code playground and a static site generator that can reference sample code.<p>However, I'm feeling burnt out and ready to call it finished, even though it may not feel completely done. Avoiding another abandoned side project has been my primary motivation in reaching this point.
Show HN: Hanon Pro – piano technique and exercises for the digital age
Show HN: Ell – A command-line interface for LLMs written in Bash
Hi HN!<p>I've created a CLI tool called "ell" that allows you to interact with LLMs directly from your terminal. Designed with the Unix philosophy in mind, ell is simple, modular, and extensible. You can easily pipe input and output to integrate with other tools. Its templates and hook-based plugins enable you to customize and extend its functionality to suit your needs. Check out the README for usage instructions and examples.<p>I developed this tool because existing solutions often felt too heavy, with many dependencies, or they weren't friendly to piping and customization. By contrast, I wrote ell in almost pure Bash with minimal dependencies. Additionally, I found a lack of tools that could read past terminal output as context. Imagine encountering an issue in your terminal and being able to directly ask an LLM for help with a simple command; this is now possible with ell (see the demo video).<p>Known limitations:<p>- To maintain simplicity and efficiency, jq is used for JSON parsing.<p>- curl is required for sending HTTPS requests. If only there were SSL/TLS support in `/dev/tcp/`!<p>- Perl is used to handle terminal escape sequences because regex in Bash does not support lookaround.<p>- Markdown syntax highlighting is not perfect due to the need for streaming output. It relies on a simple state machine instead of a full parser, which may produce incorrect results.<p>- Other known issues are listed in GitHub Issues. Please help add more!<p>I welcome any criticism and suggestions, whether about the idea or the code!
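The state-machine limitation mentioned above can be illustrated with a small sketch. This is not ell's code (ell is written in Bash); it is a Python illustration of the general idea, with the hypothetical name `classify_lines`, showing how a streaming highlighter can tag each line with a single boolean of state instead of parsing the full document.

```python
from typing import Iterable, Iterator, Tuple

FENCE = "`" * 3  # the markdown code fence marker


def classify_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (kind, line) pairs, where kind is 'fence', 'code', or 'text'.

    The only state carried between lines is whether we are inside a fence,
    so output can be styled as soon as each line arrives.
    """
    in_code = False
    for line in lines:
        if line.lstrip().startswith(FENCE):
            in_code = not in_code  # every fence line toggles the state
            yield "fence", line
        elif in_code:
            yield "code", line
        else:
            yield "text", line
```

The trade-off is the one the post describes: a fence nested inside another block, or an indented code block, is misclassified because no full parse tree is available while streaming.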
Show HN: Turn any website into a knowledge base for LLMs
I built this tool because I wanted a way to just take a bunch of URLs or domains, and query their content in RAG applications.<p>It takes away the pain of crawling, extracting content, chunking, vectorizing, and updating periodically.<p>I'm curious to see if it can be useful to others. I meant to launch this six months ago but life got in the way...
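As a rough illustration of the chunking step in such a pipeline, here is a minimal sketch of word-based chunking with overlap. The function name `chunk_words` and its default sizes are assumptions for illustration; the post does not say how the tool actually chunks content.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so context is not cut at a hard boundary."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Each chunk would then be embedded and stored; when a page changes on re-crawl, only its chunks need to be re-vectorized.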
Show HN: Create diagrams of complex data flows in software systems
Hello HN,<p>It has been a while since I contributed to the web, so I decided to get back in shape and publish "something".<p>This app is a POC of "what if diagrams were more dynamic".
I'm a software engineer by trade, and with conventional tools, I often struggle to explain flows of data in complex software systems.<p>I got inspired by video games like The Incredible Machine and Factorio, as in some ways, software systems tend to become Rube Goldberg-esque machines ;)
As a side quest, I also wanted to craft diagrams faster than with text-based tools (e.g. Mermaid), as I am always forgetting their syntax.<p>If you try the app, you will certainly struggle with its UI, especially when crafting flows, as I used all my brain juice on the core idea.
I have cool features in my head for a v1 but today I really wanted to simply show what I've got.<p>You can access the app directly at <a href="https://gg-charts.com" rel="nofollow">https://gg-charts.com</a> and there are some examples in the GitHub README to get you started.<p>Happy to answer questions and humbly receive your honest feedback on this crazy idea!
Show HN: I made a tool to receive alerts when answers change
Hi HN,<p>I've created a tool called Alertfor that scours the open web to find the most relevant and up-to-date answers for complex questions. You can set up alerts to receive continuous updates whenever there are changes or new information becomes available for a given question.<p>I used an agent framework (Autogen + Sibyl) to collect and answer questions, and I schedule a Celery job to run the same query continuously every six hours.<p>I would love to hear your feedback, suggestions, or anything else you’d like to say.<p>Note: I'm submitting this for a second time; I'm not sure if this is against HN policy.
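One simple way to decide whether a re-run of the same query produced a changed answer is to compare fingerprints of the normalized text. The sketch below assumes exact-match hashing, which the post does not confirm; `fingerprint` and `AnswerWatcher` are hypothetical names. Since LLM-generated answers can vary in wording between runs, a real system would likely need a semantic comparison instead.

```python
import hashlib
from typing import Dict


def fingerprint(answer: str) -> str:
    """Hash the answer after lowercasing and collapsing whitespace."""
    normalized = " ".join(answer.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class AnswerWatcher:
    """Remembers the last fingerprint per question (hypothetical sketch)."""

    def __init__(self) -> None:
        self._last: Dict[str, str] = {}

    def check(self, question: str, answer: str) -> bool:
        """Return True when the answer differs from the last one seen.

        The first answer for a question is recorded without alerting.
        """
        new = fingerprint(answer)
        old = self._last.get(question)
        self._last[question] = new
        return old is not None and old != new
```

A scheduled job, such as the six-hourly one described above, would call `check` after each run and send an alert only when it returns True.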
Show HN: A football/soccer pass visualizer made with Three.js
I've been working on a football pass visualiser for the past week.<p>It uses open data from StatsBomb to analyse and visualise passing patterns, allowing users to explore and filter the data by pass distance, team and players.
Show HN: I built an open-source tool to make on-call suck less
Hey HN,<p>I am building an open source platform to make on-call better and less stressful for engineers. We are building a tool that can silence alerts and help with debugging and root cause analysis. We also want to automate tedious parts of being on-call (running runbooks manually, answering questions on Slack, dealing with PagerDuty).
Here is a quick video of how it works: <a href="https://youtu.be/m_K9Dq1kZDw" rel="nofollow">https://youtu.be/m_K9Dq1kZDw</a><p>I hated being on-call for a couple of reasons:<p>* Alert volume: The number of alerts kept increasing over time. It was hard to maintain existing alerts. This would lead to a lot of noisy and unactionable alerts. I have lost count of the number of times I got woken up by an alert that auto-resolved 5 minutes later.<p>* Debugging: Debugging an alert or a customer support ticket would need me to gain context on a service that I might not have worked on before. These companies used many observability tools, which made debugging challenging. There is always time pressure to resolve issues quickly.<p>There were some more tangential issues that used to take up a lot of on-call time:<p>* Support: Answering questions from other teams. A lot of times these questions were repetitive and had been answered before.<p>* Dealing with PagerDuty: These tools are hard to use, e.g. it was hard to schedule an override in PD or set up holiday schedules.<p>I am building an on-call tool that is Slack-native since that has become the de facto tool for on-call engineers.<p>We heard from a lot of engineers that maintaining good alert hygiene is a challenge.<p>To start off, Opslane integrates with Datadog and can classify alerts as actionable or noisy.<p>We analyze your alert history across various signals:<p>1. Alert frequency<p>2. How quickly the alerts have resolved in the past<p>3. Alert priority<p>4. Alert response history<p>Our classification is conservative and it can be tuned as teams get more confidence in the predictions. We want to make sure that you aren't accidentally missing a critical alert.<p>Additionally, we generate a weekly report based on all your alerts to give you a picture of your overall alert hygiene.<p>What’s next?<p>1. Building more integrations (Prometheus, Splunk, Sentry, PagerDuty) to continue making on-call quality of life better<p>2. Help make debugging and root cause analysis easier.<p>3. Runbook automation<p>We’re still pretty early in development and we want to make on-call quality of life better. Any feedback would be much appreciated!
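A conservative classifier over the four signals listed in this post might look something like the sketch below. It is purely illustrative: the post does not describe Opslane's actual model, and the `AlertStats` fields, the `classify` function, and every threshold here are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AlertStats:
    fires_per_week: float             # alert frequency
    median_minutes_to_resolve: float  # how quickly it resolved in the past
    priority: int                     # 1 = most urgent
    acted_on_ratio: float             # share of past firings someone responded to


def classify(stats: AlertStats) -> str:
    """Return 'noisy' only when every signal agrees; otherwise 'actionable'."""
    # Conservative guard: high-priority alerts are never silenced.
    if stats.priority <= 2:
        return "actionable"
    noisy_signals = [
        stats.fires_per_week >= 20,            # fires very often
        stats.median_minutes_to_resolve <= 5,  # usually auto-resolves quickly
        stats.acted_on_ratio <= 0.1,           # almost nobody acts on it
    ]
    return "noisy" if all(noisy_signals) else "actionable"
```

Requiring all signals to agree errs toward paging someone unnecessarily rather than hiding a critical alert; a team could loosen the thresholds as it gains confidence in the predictions.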