The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: Web Audio Spring-Mass Synthesis
Hi, I'm the author of this little Web Audio toy which does physical modeling synthesis using a simple spring-mass system.

My current area of research is in sparse, event-based encodings of musical audio (https://blog.cochlea.xyz/sparse-interpretable-audio-codec-paper.html). I'm very interested in decomposing audio signals into a description of the "system" (e.g., room, instrument, vocal tract, etc.) and a sparse "control signal" which describes how and when energy is injected into that system. This toy was a great way to start learning about physical modeling synthesis, which seems to be the next stop in my research journey. I was also pleasantly surprised at what's possible these days writing custom Audio Worklets!
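Out of curiosity about how small such a synth can be, here is a minimal sketch of a single damped spring-mass voice as a custom Audio Worklet processor. This is an illustration of the general technique, not the author's code, and the stiffness and damping constants are arbitrary assumptions:

    // spring-mass-processor.js, loaded via audioContext.audioWorklet.addModule(...)
    class SpringMassProcessor extends AudioWorkletProcessor {
      constructor() {
        super();
        this.position = 0;
        this.velocity = 0.05; // one initial "pluck" injects energy into the system
      }
      process(inputs, outputs) {
        const out = outputs[0][0];
        const k = 0.01;         // spring stiffness (arbitrary)
        const damping = 0.9995; // per-sample energy loss (arbitrary)
        for (let i = 0; i < out.length; i++) {
          this.velocity += -k * this.position; // Hooke's law with unit mass
          this.velocity *= damping;
          this.position += this.velocity;      // semi-implicit Euler integration
          out[i] = this.position;
        }
        return true; // keep the processor alive
      }
    }
    registerProcessor('spring-mass', SpringMassProcessor);

Connecting it with new AudioWorkletNode(audioContext, 'spring-mass') yields a decaying sine-like tone; chaining many masses with springs between them is where richer timbres start.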
Show HN: Fashion Shopping with Nearest Neighbors
I made this website with my wife in mind; it makes it possible to browse for similar fashion products over many different retailers at once.

The backend is written in Swift, and is hosted on a single Mac Mini. It performs nearest neighbors on the GPU over ~3M product images.

No vector DB, just pure matrix multiplications. Since we aren't just doing approximate nearest neighbors but rather sorting all results by distance, it's possible to show different "variety" levels by changing the stride over the sorted search results.

Nearest neighbors are computed in a latent vector space. The model which produces the vectors is also something I trained in pure Swift.

The underlying data is about 2TB scraped from https://www.shopltk.com/.

All the code is at https://github.com/unixpickle/LTKlassifier
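The "variety via stride" trick is worth spelling out. Here's a minimal TypeScript sketch of the idea (the real backend is Swift on the GPU; the function and parameter names here are hypothetical):

    // Exact nearest-neighbor search by brute force, then stride over the fully
    // sorted results: stride 1 shows the closest matches, larger strides trade
    // similarity for variety.
    function nearestWithVariety(
      query: number[],     // query embedding
      vectors: number[][], // all product embeddings
      count: number,       // how many results to return
      stride: number,      // 1 = most similar, larger = more variety
    ): number[] {
      const dist2 = (a: number[], b: number[]) =>
        a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0);
      const order = vectors
        .map((v, i) => [dist2(query, v), i] as const)
        .sort((a, b) => a[0] - b[0]) // sort *all* results, not approximate
        .map(([, i]) => i);
      const picks: number[] = [];
      for (let i = 0; i < order.length && picks.length < count; i += stride) {
        picks.push(order[i]);
      }
      return picks;
    }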
Show HN: Metacheck – preview how any link appears on social media and chat apps
Hey HN,

I've been an indie hacker for a while, but I haven't had much success with my past projects. Recently, I came across Marc Lou's advice about building free tools just for fun, so I decided to give it a shot.

I built Metacheck, a simple tool that lets you preview how any link will appear on Twitter/X, LinkedIn, WhatsApp, Telegram, and other platforms. No API keys, no setup: just paste a link and see the preview.

Why I built this:

I often ran into issues where social platforms displayed broken or unexpected link previews. Debugging Open Graph meta tags was annoying, so I made a tool to make it easier.

How it works:

- Fetches metadata from any URL
- Parses Open Graph & Twitter Card tags (sketched below)
- Shows real-time previews of how the link will look when shared

Try it out: https://metacheck.appstate.co/
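To make the first two steps concrete, here is a minimal sketch of fetching a page and extracting its Open Graph / Twitter Card tags. This illustrates the approach rather than Metacheck's actual code, and the regex-based parsing is a simplifying assumption (a real implementation would use an HTML parser):

    // Node 18+ (global fetch). Collects og:* and twitter:* meta tags.
    async function fetchPreviewTags(url: string): Promise<Record<string, string>> {
      const html = await (await fetch(url)).text();
      const tags: Record<string, string> = {};
      // Assumes the property/name attribute appears before content, which is
      // common but not guaranteed.
      const re =
        /<meta[^>]+(?:property|name)=["']((?:og|twitter):[^"']+)["'][^>]+content=["']([^"']*)["']/gi;
      for (const m of html.matchAll(re)) tags[m[1]] = m[2];
      return tags; // e.g. { "og:title": "...", "twitter:card": "summary" }
    }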
Show HN: A personal YouTube frontend based on yt-dlp
Show HN: Nash, I made a standalone note with single HTML file
Hello HN,
I hope this post goes through well.
I made a note-taking app that lives in a single HTML file.
It doesn't require an account or any software installation: download the empty file once, and you can edit and read it at any time, online or offline.
It can be shared through messengers such as Telegram, so it's also well suited to sharing long articles with images.
And because it's a static HTML file, you can also host it as a blog.
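For anyone wondering how a single HTML file can "save itself", here is a minimal sketch of the usual trick (the one TiddlyWiki popularized). This is an assumption about the general technique, not Nash's actual code:

    // Serialize the current (edited) page and offer it as a download, so the
    // saved file is a fresh copy of itself with the new content baked in.
    function saveSelf(filename = 'note.html'): void {
      const html = '<!DOCTYPE html>\n' + document.documentElement.outerHTML;
      const blob = new Blob([html], { type: 'text/html' });
      const a = document.createElement('a');
      a.href = URL.createObjectURL(blob);
      a.download = filename;
      a.click();
      URL.revokeObjectURL(a.href);
    }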
Show HN: I'm working on a Chrome extension for viewing EXIF data of images
I started this because similar Chrome extensions were paywalling features and I wanted a free, open source alternative. I'm new to this, so I would appreciate feedback and tips if anyone has some!
Show HN: MCPGod: Fine-grained control over MCP clients, servers, and tools
Hey everyone,
I've wanted an easy way to control which MCP server tools are available to clients. For example, I might want a Gmail server to only expose the read tool (but not send, delete, etc.).

I figured that if I created a CLI for spawning MCP servers, I could intercept stdin, stdout, stderr, etc. and modify what the clients see when they make calls to list tools, resources, and prompts.

Well, it worked!

In the initial version you can easily add a server to Claude with a safe list of tools:

npx -y mcpgod add @modelcontextprotocol/server-everything --client claude --tools=echo,add

Now when you load Claude Desktop, it will only discover the echo and add tools from that server. It's a nice way to keep the agents in line :)

You can check it out here: https://github.com/mcpgod/cli

It will also log everything a client does to ~/mcpgod/logs.

Currently it only supports Claude, but it will be easy to add Cursor, Cline, Windsurf, etc.

With the `tools` command you can list all of a server's tools, and even call a tool directly from the command line, which is pretty fun.

I was thinking it would be nice to create a UI for it to easily enable/disable servers and tools for each client, inspect logs, view analytics, etc.

Thanks for reading!
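The interception idea is simple enough to sketch. This is an illustration of a stdio proxy that filters the tools/list response, not mcpgod's actual implementation, and it ignores framing edge cases (a JSON-RPC line split across chunks):

    import { spawn } from 'node:child_process';

    const allowed = new Set(['echo', 'add']);
    const server = spawn('npx', ['-y', '@modelcontextprotocol/server-everything'], {
      stdio: ['pipe', 'pipe', 'inherit'],
    });

    // Client -> server: forward requests untouched.
    process.stdin.pipe(server.stdin!);

    // Server -> client: rewrite tools/list results down to the allow-list.
    server.stdout!.on('data', (chunk: Buffer) => {
      for (const line of chunk.toString().split('\n')) {
        if (!line.trim()) continue;
        try {
          const msg = JSON.parse(line);
          if (msg.result?.tools) {
            msg.result.tools = msg.result.tools.filter(
              (t: { name: string }) => allowed.has(t.name),
            );
          }
          process.stdout.write(JSON.stringify(msg) + '\n');
        } catch {
          process.stdout.write(line + '\n'); // not a complete JSON line; forward as-is
        }
      }
    });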
Show HN: OCR Benchmark Focusing on Automation
The OCR/document extraction field has seen a lot of action recently, with releases like Mistral OCR, Andrew Ng's agentic document processing, etc. There are also several benchmarks for OCR, but they all test for something slightly different, which makes a good comparison of models very hard.

To give an example, some models like mistral-ocr only try to convert a document to markdown format; you have to use another LLM on top of it to get the final result. Some VLMs directly give structured information like key fields from documents such as invoices, but you then have to either add business rules on top or use an LLM-as-a-judge kind of system to get a sense of which outputs need manual review and which can be accepted. No benchmark attempts to measure the actual rate of automation you can achieve.

We have tried to solve this problem with a benchmark that applies only to documents/use cases where you are looking for automation, and that tries to measure the end-to-end automation level of different models or systems.

We have collected a dataset of documents like invoices that are used in processes where automation is needed (as opposed to more copilot-style use cases where you chat with a document). We have also annotated these documents and published the dataset and repo so the benchmark can be extended.

Here is the writeup: https://nanonets.com/automation-benchmark
Dataset: https://huggingface.co/datasets/nanonets/nn-auto-bench-ds
GitHub: https://github.com/NanoNets/nn-auto-bench

Looking for suggestions on how this benchmark can be improved further.
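One suggestion by example: it would help to pin down exactly what counts as "automated". A minimal sketch of one possible scoring rule (an assumption for illustration, not necessarily nn-auto-bench's metric) is that a document is automated only if every annotated key field is extracted exactly:

    type Doc = {
      predicted: Record<string, string>; // model output, field -> value
      truth: Record<string, string>;     // human annotation
    };

    // Fraction of documents needing zero manual review under exact-match scoring.
    function automationRate(docs: Doc[]): number {
      const automated = docs.filter(({ predicted, truth }) =>
        Object.entries(truth).every(([field, value]) => predicted[field] === value),
      );
      return automated.length / docs.length;
    }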
Show HN: CodeVideo – Two years in the making to build an event-sourced IDE
Hi everyone! I originally created CodeVideo as a little side project using FFmpeg WASM in the browser as an experiment, but it has since grown into my vision for a completely automated software education course production system.

The idea is that you create the educational content once, then can export the course to multiple formats: as a video (of course!), but also as an interactive webpage, a blog post, or even a book, PDF, or PowerPoint. Basically a "create once, ship everywhere" concept.

Things will get more interesting as I incorporate features like spell check (for speech) and abstract syntax tree checking (for code), so you can quite literally check the validity of your software course in real time as you build it.

You can read more about the technical details and history in my Substack launch post:

https://codevideo.substack.com/p/launching-codevideo-after-two-years

And here's the intro video about how to use the studio:

https://youtu.be/4nyuhWF6SS0

EDIT: added a link to the mp4 created in the demo video:

https://coffee-app.sfo2.cdn.digitaloceanspaces.com/codevideo/v3/a5edf4e4-c512-4b62-b7f9-11dbe689440e.mp4

From an intellectual and software standpoint this product has been (and still is) an absolute blast to build, and as always, I've learned a TON along the way. Very excited to get feedback from the Hacker News community, even (maybe especially?) the classic skeptical feedback ;)

As an engineer, I always suck at monetization and things like that. I'm already wondering if the whole token system is too complex and whether a different model would be better. Again, waiting for feedback from everyone. Until then, enjoy the studio!
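The "create once, ship everywhere" idea falls out of event sourcing quite naturally: a course is an ordered log of actions, and every export format is a different replay of the same events. Here is a minimal sketch of that framing, with hypothetical action names rather than CodeVideo's actual data model:

    type Action =
      | { kind: 'speak'; text: string }  // narration
      | { kind: 'type'; code: string };  // code typed into the editor

    // One exporter among many: replay the event log as a blog post, rendering
    // code as indented blocks. A video exporter would replay the same log
    // against an editor renderer and a TTS engine instead.
    function toBlogPost(actions: Action[]): string {
      return actions
        .map(a =>
          a.kind === 'speak'
            ? a.text
            : a.code.split('\n').map(l => '    ' + l).join('\n'),
        )
        .join('\n\n');
    }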