The best Show HN stories from Hacker News from the past day
Latest posts:
Show HN: Dosidicus – A digital pet with a simple neural network
Show HN: Rowboat – Open-source IDE for multi-agent systems
Hi HN! We're Arjun, Ramnique, and Akhilesh, and we are building Rowboat (https://www.rowboatlabs.com/), an AI-assisted IDE for building and managing multi-agent systems. You start with a single agent, then scale up to teams of agents that work together, use MCP tools, and improve over time - all through a chat-based copilot.

Our repo is https://github.com/rowboatlabs/rowboat, docs are at https://docs.rowboatlabs.com/, and there's a demo video here: https://youtu.be/YRTCw9UHRbU

It's becoming clear that real-world agentic systems work best when multiple agents collaborate, rather than having one agent attempt to do everything. This isn't too surprising - it's a bit like how good code consists of multiple functions that each do one thing, rather than cramming everything into one function.

For example, a travel assistant works best when different agents handle specialized tasks: one agent finds the best flights, another optimizes hotel selections, and a third organizes the itinerary. This modular approach makes the system easier to manage, debug, and improve over time.

OpenAI's Agents SDK provides a neat Python library to support this, but building reliable agentic systems requires constant iteration and tweaking - e.g. updating agent instructions (which can quickly get as complex as actual code), connecting tools, testing the system, and incorporating feedback. Rowboat is an AI IDE for all of this. Rowboat is to AI agents what Cursor is to code.

We've taken a code-like approach to agent instructions (prompts). There are special keywords to directly reference other agents, tools, or prompts, which are highlighted in the UI. The copilot is the best way to create and edit these instructions - each change comes with a code-style diff.

You can give agents access to tools by integrating any MCP server or connecting your own functions through a webhook. You can instruct the agents on when to use specific tools via '@mentions' in the agent instructions. To enable quick testing, we added a way to mock tool responses using LLM calls.

The Rowboat playground lets you test and debug the assistants as you build them. You can see agent transfers, tool invocations, and tool responses in real time. The copilot has the context of the chat and can improve the agent instructions based on feedback. For example, you could say 'The agent shouldn't have done x here. Fix this' and the copilot will make that fix.

You can integrate agentic systems built in Rowboat into your application via the HTTP API or the Python SDK ('pip install rowboat'). For example, you can build user-facing chatbots, enterprise workflows, and employee assistants using Rowboat.

We've been working with LLMs since GPT-1 launched in 2018. Most recently, we built Coinbase's support chatbot after our last AI startup was acquired by them.

Rowboat is Apache 2.0 licensed, giving you full freedom to self-host, modify, or extend it however you like.

We're excited to share Rowboat with everyone here. We'd love to hear your thoughts!
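As a rough illustration of the integration path described above, here is a minimal Python sketch of calling a self-hosted, Rowboat-style agent service over HTTP. The route, payload shape, and response fields below are illustrative assumptions, not the documented Rowboat API - see https://docs.rowboatlabs.com/ or the 'rowboat' Python SDK for the real interface.

    # Hypothetical sketch: talking to a self-hosted Rowboat-style instance over HTTP.
    # The route, payload shape, and response fields are illustrative assumptions;
    # consult https://docs.rowboatlabs.com/ for the actual API and SDK.
    import requests

    ROWBOAT_URL = "http://localhost:3000"  # assumed self-hosted instance
    API_KEY = "YOUR_API_KEY"               # assumed auth scheme

    def chat(messages):
        """Send a conversation to the multi-agent assistant and return its reply."""
        resp = requests.post(
            f"{ROWBOAT_URL}/api/v1/chat",  # hypothetical route
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"messages": messages},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        reply = chat([{"role": "user", "content": "Find me a flight to Tokyo next week"}])
        print(reply)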
Show HN: Morphik – Open-source RAG that understands PDF images, runs locally
Hey HN, we're Adi and Arnav. A few months ago, we hit a wall trying to get LLMs to answer questions over research papers and instruction manuals. Everything worked fine until the answer lived inside an image or diagram embedded in the PDF. Even GPT-4o flubbed it (we recently tried o3 with the same, and surprisingly it flubbed it too). Naive RAG pipelines just pulled in some text chunks and ignored the rest.

We took an invention disclosure PDF (https://drive.google.com/file/d/1ySzQgbNZkC5dPLtE3pnnVL2rW_9aTeuG/view?usp=sharing) containing an IRR-vs-frequency graph and asked GPT "From the graph, at what frequency is the IRR maximized?". We originally tried this on GPT-4o, but while writing this we used the new natively multimodal model o4-mini-high. After a 30-second thinking pause, it asked for clarifications, then churned out buggy code, pulled data from the wrong page, and still couldn't answer the question. We wrote up the full story with screenshots here: https://docs.morphik.ai/blogs/gpt-vs-morphik-multimodal

We got frustrated enough to try fixing it ourselves.

We built Morphik to do multimodal retrieval over documents like PDFs, where images and diagrams matter as much as the text.

To do this, we use ColPali-style embeddings, which treat each document page as an image and generate multi-vector representations. These embeddings capture layout, typography, and visual context, allowing retrieval to return a whole table or schematic, not just nearby tokens. Combined with vector search, this retrieves the exact pages with the relevant diagrams and passes them as images to the LLM to get relevant answers. It's able to answer the question with an 8B Llama 3.1 vision model running locally!

Early pharma testers hit our system with queries like "Which EGFR inhibitors at 50 mg showed ≥30% tumor reduction?" We returned the right tables and plots, but still hit a bottleneck: we weren't able to join the dots across multiple reports. So we built a knowledge graph: we tag entities in both text and images, normalize synonyms (Erlotinib → EGFR inhibitor), infer relations (e.g. administered_at, yields_reduction), and stitch everything into a graph. Now a single query can traverse that graph across documents and surface a coherent, cross-document answer along with the correct pages as images.

To illustrate that, and just for fun, we built a graph of 100 of Paul Graham's essays here: https://pggraph.streamlit.app/ You can search for various nodes (e.g. startup, Sam Altman, Paul Graham) and see the corresponding connections. In our system, we create graphs and store the relevant text chunks along with the entities, so on querying, we can extract the relevant entity, search the graph, and pull in the text chunks of all connected nodes, improving cross-document queries.

For longer or multi-turn queries, we added persistent KV caching, which stores intermediate key-value states from the transformer attention layers. Instead of recomputing attention from scratch every time, we reuse the prior key-value states, speeding up repeated queries and letting us handle much longer context windows.

We're open-source under the MIT (Expat) license: https://github.com/morphik-org/morphik-core

Would love to hear your RAG horror stories - what worked, what didn't - and any feedback on Morphik. We're here for it.
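For readers unfamiliar with multi-vector retrieval, here is a small, self-contained sketch of the ColPali-style late-interaction ("MaxSim") scoring idea: each page is represented by many patch embeddings, each query by many token embeddings, and a page is scored by summing, over query tokens, the best-matching patch similarity. This is a generic illustration of the technique on random vectors, not Morphik's actual code.

    # Generic sketch of ColPali-style late-interaction (MaxSim) scoring.
    # Not Morphik's implementation -- just the scoring idea, on random embeddings.
    import numpy as np

    def maxsim_score(query_vecs: np.ndarray, page_vecs: np.ndarray) -> float:
        """Score one page against one query.

        query_vecs: (num_query_tokens, dim) L2-normalized query token embeddings
        page_vecs:  (num_page_patches, dim) L2-normalized page patch embeddings
        For each query token, take its best-matching patch similarity, then sum.
        """
        sims = query_vecs @ page_vecs.T       # (tokens, patches) cosine similarities
        return float(sims.max(axis=1).sum())  # MaxSim: best patch per token, summed

    def rank_pages(query_vecs, pages):
        """Return page indices sorted by MaxSim score, best first."""
        scores = [maxsim_score(query_vecs, p) for p in pages]
        return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        dim = 128
        normalize = lambda x: x / np.linalg.norm(x, axis=-1, keepdims=True)
        query = normalize(rng.normal(size=(12, dim)))                        # 12 query-token vectors
        pages = [normalize(rng.normal(size=(1030, dim))) for _ in range(5)]  # 5 pages of patch vectors
        print(rank_pages(query, pages))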
Show HN: I open-sourced my AI toy company that runs on ESP32 and OpenAI realtime
Hi HN! Last year the project I launched here got a lot of good feedback on creating speech-to-speech AI on the ESP32. Recently I revamped the whole stack, iterated on that feedback, and made our project fully open source - all of the client, hardware, and firmware code.

This GitHub repo turns an ESP32-S3 into a realtime AI speech companion using the OpenAI Realtime API, Arduino WebSockets, Deno Edge Functions, and a full-stack web interface. You can talk to your own custom AI character, and it responds instantly.

I couldn't find a resource that helped set up a reliable, secure WebSocket (WSS) AI speech-to-speech service. While there are several useful text-to-speech (TTS) and speech-to-text (STT) repos out there, I believe none gets speech-to-speech right. OpenAI launched an embedded repo late last year which sets up WebRTC with ESP-IDF. However, it's not beginner friendly and doesn't have a server-side component for business logic.

This repo is an attempt at solving the above pains and creating a great speech-to-speech experience on Arduino with secure WebSockets, using edge servers (Deno/Supabase Edge Functions) for fast global connectivity and low latency.
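To illustrate the server-side pattern the post describes - an edge relay that holds the API key so the ESP32 never stores secrets - here is a conceptual Python sketch of a tiny WSS relay. The actual project uses Deno/Supabase Edge Functions, not Python, and the upstream URL and header are placeholders, not a claim about any specific realtime API.

    # Conceptual sketch (not the project's Deno edge function): a tiny WSS relay
    # that keeps the upstream API key server-side, so the device never holds secrets.
    # Upstream URL and auth header are placeholders -- adapt to the realtime API you use.
    # Requires: pip install "websockets>=14"
    import asyncio
    import os
    import websockets

    UPSTREAM_WSS_URL = os.environ.get("UPSTREAM_WSS_URL", "wss://example.com/realtime")
    API_KEY = os.environ["UPSTREAM_API_KEY"]

    async def relay(device_ws):
        """Bridge one device connection to the upstream realtime API."""
        async with websockets.connect(
            UPSTREAM_WSS_URL,
            additional_headers={"Authorization": f"Bearer {API_KEY}"},
        ) as upstream_ws:
            async def pump(src, dst):
                async for frame in src:   # forward audio/JSON frames verbatim
                    await dst.send(frame)
            await asyncio.gather(pump(device_ws, upstream_ws),
                                 pump(upstream_ws, device_ws))

    async def main():
        # Devices connect here; terminate TLS in front of this in production.
        async with websockets.serve(relay, "0.0.0.0", 8080):
            await asyncio.Future()  # run forever

    if __name__ == "__main__":
        asyncio.run(main())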
Show HN: Keep your PyTorch model in VRAM by hot swapping code
"Is This Tech Dead?" A snarky autopsy engine for your dead frameworks
Hi HN, I built this irony- and data-driven Regret-as-a-Service tool to almost scientifically declare tech deaths. F.
Show HN: Open Codex – OpenAI Codex CLI with open-source LLMs
Hey HN,

I've built Open Codex, a fully local, open-source alternative to OpenAI's Codex CLI.

My initial plan was to fork their project and extend it. I even started doing that. But it turned out their code has several leaky abstractions, which made it hard to override core behavior cleanly. Shortly after, OpenAI introduced breaking changes, and maintaining my customizations on top became increasingly difficult.

So I rewrote the whole thing from scratch in Python. My version is designed to support local LLMs.

Right now, it only works with phi-4-mini (GGUF) via lmstudio-community/Phi-4-mini-instruct-GGUF, but I plan to support more models. Everything is structured to be extendable.

At the moment I only support single-shot mode, but I intend to add interactive (chat) mode, function calling, and more.

You can install it using Homebrew:

    brew tap codingmoh/open-codex
    brew install open-codex

It's also published on PyPI:

    pip install open-codex

Source: https://github.com/codingmoh/open-codex
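For a sense of what single-shot generation against a local GGUF model involves, here is a generic llama-cpp-python sketch. It is not Open Codex's internal code, and the model filename is a placeholder for a downloaded Phi-4-mini GGUF file.

    # Generic single-shot example with a local GGUF model via llama-cpp-python.
    # Not Open Codex internals -- just the underlying idea. Model path is a placeholder.
    # Requires: pip install llama-cpp-python, plus a downloaded Phi-4-mini GGUF file.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./Phi-4-mini-instruct-Q4_K_M.gguf",  # placeholder filename
        n_ctx=4096,
    )

    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "You translate natural language into a single shell command."},
            {"role": "user", "content": "find all files larger than 100MB in the current directory"},
        ],
        max_tokens=128,
        temperature=0.2,
    )
    print(out["choices"][0]["message"]["content"])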
Show HN: Nerdlog – Fast, multi-host TUI log viewer with timeline histogram
For more background and technical details, I wrote this up as well: https://dmitryfrank.com/projects/nerdlog/article
Show HN: Dia, an open-weights TTS model for generating realistic dialogue