The best Hacker News stories from Show from the past week

Latest posts:

Show HN: Advanced Tab Manager for Firefox

Show HN: Langfuse – Open-source observability and analytics for LLM apps

Hi HN! Langfuse is OSS observability and analytics for LLM applications (repo: https://github.com/langfuse/langfuse, 2-min demo: https://langfuse.com/video, try it yourself: https://langfuse.com/demo).

Langfuse makes capturing and viewing LLM calls (execution traces) a breeze. On top of this data, you can analyze the quality, cost, and latency of LLM apps.

When GPT-4 dropped, we started building LLM apps – a lot of them! [1, 2] But they all suffered from the same issue: it's hard to assure quality in 100% of cases, or even to have a clear view of user behavior. Initially, we logged all prompts/completions to our production database to understand what works and what doesn't. We soon realized we needed more context, more data, and better analytics to sustainably improve our apps. So we started building a homegrown tool.

Our first task was to track and view what is going on in production: what user input is provided, how prompt templates or vector DB requests perform, and which steps of an LLM chain fail. We built async SDKs and a slick frontend to render chains in a nested way. It's a good way to look at LLM logic 'natively'. Then we added some basic analytics to understand token usage and quality over time, for the entire project or for single users (pre-built dashboards).

Under the hood, we use the T3 stack (TypeScript, Next.js, Prisma, tRPC, Tailwind, NextAuth), which lets us move fast and makes it easy to contribute to our repo. The SDKs are heavily influenced by the design of the PostHog SDKs [3] for stable implementations of async network requests. Converting OpenAPI specs to boilerplate Python code was a surprisingly inconvenient experience, and we ended up using Fern [4] for it. We're fans of Tailwind + shadcn/ui + tremor.so for speed and flexibility in building tables and dashboards.

Our SDKs run fully asynchronously and make network requests in the background. We did our best to reduce any impact on application performance to a minimum; we never block the main execution path. A sketch of this pattern appears at the end of this post.

We've made two engineering decisions we've felt uncertain about: using a Postgres database and Looker Studio for the analytics MVP. Supabase performs well at our scale and integrates seamlessly into our tech stack, but we will need to move to an OLAP database soon and are debating whether we need to start batching ingestion and whether we can keep using Vercel. Any experience you could share would be helpful!

Integrating Looker Studio got us to our first analytics charts in half a day. As it is not open source and does not work with our UI/UX, we are looking to replace it with an OSS solution for flexibly generating charts and dashboards. We've had a look at Lightdash and would be happy to hear your thoughts.

We're borrowing our OSS business model from PostHog/Supabase, who make it easy to self-host, reserve some features for enterprise (no plans yet), and offer a paid managed cloud service. Right now all of our code is available under a permissive license (MIT).

Next, we're going deep on analytics. For quality specifically, we will build out model-based evaluations and labeling to be able to cluster traces by scores and use cases.

Looking forward to hearing your thoughts and discussion – we'll be in the comments. Thanks!

[1] https://learn-from-ai.com/
[2] https://www.loom.com/share/5c044ca77be44ff7821967834dd70cba
[3] https://posthog.com/docs/libraries
[4] https://buildwithfern.com/
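
The non-blocking ingestion pattern described above looks roughly like this (a hypothetical sketch, not the actual Langfuse SDK API): application code enqueues an event and returns immediately, while a background worker ships events over the network.

```python
# Minimal sketch of a non-blocking telemetry SDK. Hypothetical names
# (TraceClient, log_event); not the actual Langfuse API.
import atexit
import queue
import threading
import requests  # assumes `pip install requests`

class TraceClient:
    def __init__(self, endpoint: str, api_key: str):
        self.endpoint = endpoint
        self.api_key = api_key
        self.events: queue.Queue = queue.Queue()
        # Daemon worker flushes events off the main execution path.
        self.worker = threading.Thread(target=self._flush_loop, daemon=True)
        self.worker.start()
        atexit.register(self.events.join)  # drain pending events on exit

    def log_event(self, event: dict) -> None:
        """Called from application code; enqueues and returns immediately."""
        self.events.put(event)

    def _flush_loop(self) -> None:
        while True:
            event = self.events.get()
            try:
                requests.post(
                    self.endpoint,
                    json=event,
                    headers={"Authorization": f"Bearer {self.api_key}"},
                    timeout=5,
                )
            except requests.RequestException:
                pass  # never let telemetry failures affect the app
            finally:
                self.events.task_done()
```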

Show HN: Going into freshman year, figured I should build an interpreter

Hi all!

I'm going into my freshman year, and figured that the best way to prepare for the intro to programming Racket course would be to implement my own garbage-collected, dynamically typed, functional programming language in C ;)

Anyways... here's the repo: https://github.com/liam-ilan/crumb

I started learning C over the summer, so I still have a whole lot to learn... Any feedback would be greatly appreciated! :D
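
Crumb itself is written in C, but the heart of a tree-walking interpreter is small enough to sketch in a few lines. Here is a hypothetical Python miniature (dynamic typing and closures, with garbage collection left to the host language); this is an illustration of the general technique, not Crumb's actual code.

```python
# Tiny tree-walking evaluator: tagged-tuple AST, environments as dicts.
def evaluate(node, env):
    kind = node[0]
    if kind == "int":            # ("int", 42)
        return node[1]
    if kind == "var":            # ("var", "x") -> lookup in environment
        return env[node[1]]
    if kind == "lambda":         # ("lambda", "x", body) -> closure
        _, param, body = node
        return lambda arg: evaluate(body, {**env, param: arg})
    if kind == "call":           # ("call", fn, arg) -> apply closure
        _, fn, arg = node
        return evaluate(fn, env)(evaluate(arg, env))
    if kind == "add":            # ("add", left, right)
        return evaluate(node[1], env) + evaluate(node[2], env)
    raise ValueError(f"unknown node kind: {kind}")

# ((lambda x -> x + 1) 41) evaluates to 42
ast = ("call", ("lambda", "x", ("add", ("var", "x"), ("int", 1))), ("int", 41))
print(evaluate(ast, {}))  # 42
```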

Show HN: Release AI – Talk to Your Infrastructure

Hello, Hacker News! I'm David, cofounder of Release (YC W20). Introducing Release AI, a tool designed to give users instant access to DevOps expertise without monopolizing the valuable time of our experts. Developed with the developer and engineer community in mind, Release AI takes the power of OpenAI's GPT-4 public LLM and augments it with DevOps knowledge.

In its initial phase, Release AI offers read-only access to both AWS and Kubernetes, so you can engage in insightful conversations with your AWS account and K8s infrastructure effortlessly (a sketch of the read-only tool pattern appears after this post). Looking ahead, our roadmap includes integrating more tools for commonly used systems, which will let you automate an even broader array of your daily tasks.

If you would like more info, check out our YC launch (it has more details and screencasts): https://www.ycombinator.com/launches/JI1-release-ai-talk-to-your-infrastructure

Our quickstart guide: https://docs.release.com/release-ai/quickstart

Sign up and use it: https://beta.release.com/ai/register

Please give it a try! We would love your feedback as we enhance Release AI, so reach out with any feature requests or crazy ideas that Release AI could do for you. Feel free to email me at david@release.com or leave a comment; looking forward to chatting with you.

Join the conversation in our Slack community and discover the future of DevOps with Release AI!
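
As an illustration of what a "read-only" infrastructure tool can look like under the hood (a hypothetical sketch, not Release AI's actual implementation), an LLM with function calling might be handed only describe-style AWS calls, never mutating ones:

```python
# Read-only tool pattern: the LLM may invoke this function, but no
# mutating API (terminate, modify, etc.) is ever exposed to it.
import json
import boto3  # assumes AWS credentials with read-only permissions

def list_running_instances(region: str) -> str:
    """Read-only tool: returns running EC2 instance ids and types as JSON."""
    ec2 = boto3.client("ec2", region_name=region)
    reservations = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]
    instances = [
        {"id": inst["InstanceId"], "type": inst["InstanceType"]}
        for res in reservations
        for inst in res["Instances"]
    ]
    return json.dumps(instances)

# An LLM with function calling would pick this tool for a question like
# "what's running in us-east-1?" and summarize the JSON for the user.
print(list_running_instances("us-east-1"))
```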

Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B

Hi HN,

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset and achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.

The CodeLlama models released yesterday demonstrate impressive performance on HumanEval:

- CodeLlama-34B achieved 48.8% pass@1 on HumanEval
- CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval

We fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code-completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used; both models underwent native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours on 32 A100-80GB GPUs, with a sequence length of 4096 tokens.

Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples. The methodology is:

- For each evaluation example, we randomly sampled three substrings of 50 characters, or used the entire example if it was fewer than 50 characters.
- A match was identified if any sampled substring was a substring of the processed training example.

For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report (a sketch of this check appears after this post).

Presented below are the pass@1 scores we achieved with our fine-tuned models:

- Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval
- Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval

Note on GPT-4: in the official technical report in March, OpenAI reported a pass@1 score of 67% for GPT-4 on HumanEval. Since then, there have been claims of higher scores. However, there hasn't been any concrete evidence of an improvement in the model's coding abilities since then, and those elevated figures lack the rigorous contamination analysis that the official statistic underwent, making them a less reliable comparison. As a result, we consider 67% the pass@1 score for GPT-4.

Download: we are releasing both models on Hugging Face for verifiability and to bolster the open-source community. We welcome independent verification of the results.

Phind-CodeLlama-34B-v1: https://huggingface.co/Phind/Phind-CodeLlama-34B-v1
Phind-CodeLlama-34B-Python-v1: https://huggingface.co/Phind/Phind-CodeLlama-34B-Python-v1

We'd love to hear your thoughts!

Best,
The Phind Team
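
For concreteness, here is a minimal sketch of the decontamination check exactly as described above (our reading of the methodology; OpenAI's implementation in Appendix C may differ in details such as text normalization):

```python
# Substring-sampling decontamination check.
import random

def is_contaminated(eval_example: str, train_example: str,
                    n_samples: int = 3, length: int = 50) -> bool:
    """True if any sampled substring of the eval example appears
    verbatim in the (processed) training example."""
    if len(eval_example) < length:
        substrings = [eval_example]  # use the entire example
    else:
        substrings = [
            eval_example[start:start + length]
            for start in (random.randrange(len(eval_example) - length + 1)
                          for _ in range(n_samples))
        ]
    return any(sub in train_example for sub in substrings)

# A training example is dropped if it matches any evaluation example.
```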

Show HN: Use Code Llama as Drop-In Replacement for Copilot Chat

Hi HN,

Code Llama was released, but we noticed a ton of questions in the main thread about how/where to use it – not just from an API or the terminal, but in your own codebase, as a drop-in replacement for Copilot Chat. Without this, developers don't get much utility from the model.

This matters because benchmarks like HumanEval don't perfectly reflect the quality of responses. There's likely to be a flurry of improvements to coding models in the coming months, and rather than relying on benchmarks to evaluate them, the community will get better feedback from people actually using the models. This means real usage in real, everyday workflows.

We've worked to make this possible with Continue (https://github.com/continuedev/continue) and want to hear what you find to be the real capabilities of Code Llama. Is it on par with GPT-4, does it require fine-tuning, or does it excel at certain tasks?

If you'd like to try Code Llama with Continue, it only takes a few steps to set up (https://continue.dev/docs/walkthroughs/codellama), either locally with Ollama (a quick local smoke test is sketched below), or through TogetherAI's or Replicate's APIs.
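
For example, once Ollama is serving the model locally (after `ollama pull codellama`), a quick smoke test against its HTTP API might look like this (a sketch based on Ollama's documented REST endpoint; not Continue's internals):

```python
# Ask a locally served Code Llama one question via Ollama's HTTP API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codellama",
        "prompt": "Write a Python function that reverses a linked list.",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```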

Show HN: Shimmer – ADHD coaching for adults, now on web

Hi, I’m Chris, one of the co-founders of Shimmer. Last October, following my ADHD diagnosis, I launched Shimmer (https://shimmer.care), one-to-one ADHD coaching for adults. Our HN launch was here: https://news.ycombinator.com/item?id=33468611.

A quick recap before I dive into our new launch: Shimmer is an ADHD coaching service for adults. We took apart the traditionally expensive, inaccessible ADHD coaching offering ($300-600+/session) and redesigned it from first principles. You get matched with one of our expert ADHD coaches, meet weekly over video, and get supported throughout the week via text and learning tools. This solution is special to me personally (and to our community) because it doesn’t just give you “knowledge” or offer another “tool” – our coaches help you set realistic goals, take personalized steps toward them, and keep you accountable.

Today we’re excited to launch our most-requested feature: Web.

Over the past 9 months, we learned (and iterated) a lot with our members and coaches. A few key challenges pointed to the need for a web version:

(1) ADHD “object permanence” challenges (e.g. out of sight, out of mind): we needed to be multi-platform so that when you finish a task or goal or encounter a challenge, whether you’re near your laptop or your phone, you can check it off and ping your coach right away.

(2) Members used reflection modules (e.g. after each task, you’re prompted to reflect on what worked and didn’t work, which informs your coach) more thoroughly than we originally anticipated, and the web allows for deeper reflection and typing.

(3) Overarching coaching goals were often forgotten during the day-to-day, and the web makes it easier to use visual cues to keep goals top of mind for motivation.

(4) Many of our members struggle with phone addiction, and driving them to the mobile app landed them in TikTok/IG, whereas the web app offers a focused environment to get into their “coaching zone”.

Our new web app was designed alongside over 1,200 members and 22 coaches, with countless hours of testing and iterating. We’re excited (but nervous!) to unveil this new version. If you have ADHD (or think you do), we’d love for you to check out our platform and give us critical feedback (or positive reinforcement!). It’s a super streamlined, ADHD-friendly signup process, and in honor of our web launch and back to school/work, the first month is 30% off.

Our pricing: $115/mo for the Essentials plan (15-min weekly sessions), $230/mo for the Standard plan (30-min weekly sessions), $345/mo for the Immersive plan (45-min weekly sessions); all plans are an additional 30% off the first month and HSA/FSA-eligible.

We know these prices are expensive for many people with ADHD, and we’re committed to bringing costs down over time. It’s more affordable than what many people are paying for coaches, but the fact that we’re relying on humans, and not going the “we can automate all this with AI” route, puts a floor on how low the costs can drop. That said, here are some actions we’re taking to drive down costs for those who need it: (1) we offer needs-based scholarships and aim to have 5% of members on them at any time, (2) we often run fully sponsored scholarships with our partners – over 40 full-ride scholarships and 100 group coaching spots have been disbursed alongside the Asian Mental Health Project, the government of Canada, and more, and (3) we have aligned our coaching model with Health & Wellness Coaching, which is expected to be reimbursable in 2024. If you have ideas or expertise here, please reach out to me directly at chris@shimmer.care.

On behalf of our small but mighty and passionate Shimmer team, I’m excited for the Hacker News community to share your thoughts, feedback, and ideas. If you feel comfortable, I’d also love to hear your personal ADHD story and what has worked / hasn’t worked for you.

Co-founders Christal & Vikram

Show HN: Open-source obsidian.md sync server

https://github.com/acheong08/obsidian-sync

Hello HN,

I'm a recent high school graduate and can't afford $8 per month for the official sync service, so I tried my hand at replicating the server.

It's still missing a few features, such as file recovery and history, but the basic sync is working.

To the creators of Obsidian.md: I'm probably violating the TOS, and I'm sorry. I'll take down the repository if asked. It's not ready for production and is highly inefficient; it's not competition, so I hope you'll be lenient.

Show HN: Dataherald AI – Natural Language to SQL Engine

Hi HN community. We are excited to open-source Dataherald’s natural-language-to-SQL engine today (https://github.com/Dataherald/dataherald). This engine allows you to set up an API on top of your structured database that can answer questions in plain English.

GPT-4-class LLMs have gotten remarkably good at writing SQL. However, out-of-the-box LLMs and existing frameworks would not work with our own structured data at the necessary quality level. For example, given the question “what was the average rent in Los Angeles in May 2023?” a reasonable human would either assume the question is about Los Angeles, CA or would confirm the state with the question-asker in a follow-up. However, an LLM translates this to:

select price from rent_prices where city="Los Angeles" AND month="05" AND year="2023"

This pulls data for both Los Angeles, CA and Los Angeles, TX without using any columns to differentiate between the two. You can read more about the challenges of enterprise-level text-to-SQL in this blog post I wrote on the topic: https://medium.com/dataherald/why-enterprise-natural-language-to-sql-is-hard-8849414f41c

Dataherald comes with “batteries included.” It has best-in-class implementations of core components, including, but not limited to: a state-of-the-art NL-to-SQL agent and an LLM-based SQL-accuracy evaluator. The architecture is modular, allowing these components to be easily replaced. It’s easy to set up and use with major data warehouses.

There is a “Context Store” where information (NL-to-SQL examples, schemas, and table descriptions) is used in the LLM prompts to make the engine get better with usage. And we even made it fast!

This version allows you to easily connect to Postgres, Databricks, BigQuery, or Snowflake and set up an API for semantic interactions with your structured data. You can then add business and data context that the engine uses for few-shot prompting (a sketch of this idea appears after this post).

The NL-to-SQL agent in this open-source release was developed by our own Mohammadreza Pourreza, whose DIN-SQL algorithm currently tops the Spider (https://yale-lily.github.io/spider) and Bird (https://bird-bench.github.io/) NL-to-SQL benchmarks. This agent has outperformed the Langchain SQLAgent by anywhere from 12% to 250% (depending on the provided context) in our own internal benchmarking, while being only ~15s slower on average.

Needless to say, this is an early release and the codebase is under swift development. We would love for you to try it out and give us your feedback! And if you are interested in contributing, we’d love to hear from you!
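
Here is a minimal sketch of the few-shot prompting idea (hypothetical table, examples, and prompt format; not Dataherald's actual implementation): verified question/SQL pairs from a context store show the LLM how ambiguous columns, like state, should be handled.

```python
# Assemble a few-shot NL-to-SQL prompt from stored examples and schema.
FEW_SHOT_EXAMPLES = [  # in practice these come from the context store
    ("How many rentals were listed in Los Angeles, CA in May 2023?",
     "SELECT COUNT(*) FROM rent_prices "
     "WHERE city = 'Los Angeles' AND state = 'CA' "
     "AND month = '05' AND year = '2023';"),
]

SCHEMA = "rent_prices(city TEXT, state TEXT, month TEXT, year TEXT, price NUMERIC)"

def build_prompt(question: str) -> str:
    """Schema + verified examples + the new question, ready for the LLM."""
    shots = "\n\n".join(f"Question: {q}\nSQL: {sql}"
                        for q, sql in FEW_SHOT_EXAMPLES)
    return (
        f"Tables:\n{SCHEMA}\n\n"
        f"{shots}\n\n"
        f"Question: {question}\nSQL:"
    )

print(build_prompt("What was the average rent in Los Angeles in May 2023?"))
```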

Show HN: FlakeHub – Discover and publish Nix flakes

Show HN: Just intonation keyboard – play music without knowing music

This is a keyboard in just intonation. It can play the notes a piano can. The big difference from a piano is that all the notes become consonant. At least, when you want to play a dissonant chord, you are clearly opting in to it, because it's clear which notes are dissonant with each other. You won't bump into a dissonant note by mistake.

You can play without knowing any music theory. Hit arbitrary notes with the rhythm you want, and the pitches will work. Not understanding the buttons is fine. Even rolling your elbow around your keyboard is fine.

If you are a musician and press the wrong key while playing a song, it will still fit. It will sound like you made an intelligent, conscious choice to play another note, even though you know in your heart it was an accident. Beginner jazz musicians, rejoice.

It's not an AI making choices for you; it's just a very elegant interface. What makes this possible is several new discoveries in psychoacoustics about how harmony works. While a piano lays out notes in pitch space, this keyboard is able to lay out notes in consonance space. When you play random notes, they tend to be "close together" on the physical keyboard. Distance on the keyboard maps well to distance in consonance space, so those random notes are close together in consonance space and sound good together.

According to Miles Davis, a "wrong" note becomes correct in the right context. If you try to play a wrong note, the purple buttons you press will automatically land you in the right context, even if you don't know what that context is yourself. So you can stumble your way through an improv, and the keyboard will offer the right notes without needing you to think about it.

The harmonic consonance of chords can be read directly off the numbers on the keyboard, which implies that these numbers are a good language to think about music with. It doesn't take years of training, just reading the rules. The key harmony insight you can apply on this keyboard, and not on a piano, is to add frequencies linearly (like 400 Hz + 300 Hz). The reason this matters is that linear combinations of frequencies are a major factor in harmony, in lattice tones. So to see how dissonant or consonant a chord is, you want to check how far it is from a sum or an arithmetic progression. On a piano, to do the same, you'd have to memorize fractional approximations of 2^(N/12), then add and subtract these fractions, which is very difficult. For example, how far is 6/5 + 4/3 from 5/2? Hard to say! But if denominators are cleared, it's easy to compare 36, 40, 45: they're off by 1 from an arithmetic progression. (The sketch below works through this arithmetic.) This also applies to overlapping notes, not just chords. Having all the keys accessible on a piano is very convenient, but this translation layer of 2^(N/12) approximation plus fractional arithmetic makes it hard to see harmony beyond the pairwise ratios.

The subset of playable songs is different from a piano's, which means that songs in your existing piano repertoire will snip off some notes. Hardware thumb keys would fix this, so you could play your existing piano songs in full, plus other songs a piano can't play. I don't have such hardware, so I haven't implemented this. The other way is to have two keyboards and a partner.

The remaining issue is that there is no sheet music in just intonation. Unfortunately, I have had no success in finding piano sheet music in a common, interpretable format. So while I do have a converter from 12-tone equal temperament to just intonation, there are no input files to use it with...
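
To make the fraction arithmetic concrete, here is a small sketch using exact rational arithmetic (my illustration, not code from the project). Clearing denominators over the LCM 30 turns 6/5 and 4/3 into 36 and 40; the sum check against 5/2 (which clears to 75) and the arithmetic-progression check on the chord 36:40:45 both come out off by exactly 1.

```python
# Exact fraction arithmetic for the consonance checks described above.
from fractions import Fraction
from math import lcm

def cleared(ratios):
    """Scale ratios by the LCM of their denominators to get integers."""
    denom = lcm(*(r.denominator for r in ratios))
    return [r * denom for r in ratios], denom

# How far is 6/5 + 4/3 from 5/2? Clear denominators first.
(a, b, c), denom = cleared([Fraction(6, 5), Fraction(4, 3), Fraction(5, 2)])
print(a, b, c, "over", denom)       # 36 40 75 over 30
print("sum check:", a + b - c)      # 76 - 75 -> off by 1

# The chord 36:40:45, checked against an arithmetic progression
# (equal spacing of frequencies, i.e. equal neighboring differences):
x, y, z = 36, 40, 45
print("AP check:", (z - y) - (y - x))  # 5 - 4 -> off by 1
```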

Show HN: A simple, open-source Notion-like avatar generator

Show HN: Rivet – Open-source game server management with Nomad and Rust

Hey HN!

Rivet is an OSS game server management tool that enables game developers to easily deploy their dedicated servers without any infra experience.

We recently open-sourced Rivet after working on it for the past couple of years. I wanted to share some of my favorite things about the experience of building it with the HN community.

My cofounder and I have been building multiplayer games together since middle school, for fun (and not much profit [1]). In high school, I stumbled into building the entire infrastructure powering Krunker.io (acquired by FRVR) and other popular multiplayer web games. After wasting months rebuilding dedicated server infrastructure and DDoS/bot mitigation over and over, we started building Rivet as a side project.

Some interesting tidbits:

- ~99% Rust and a smidgen of Lua.
- Bolt [2] – a cluster dev & management toolchain for highly configurable self-hosted Rivet clusters. It’s way over-engineered.
- The entire repo is usable as a library. Our EE repo uses the OSS repo as a submodule.
- Traefik is used as an edge proxy for low-latency UDP, TCP+TLS, & WSS traffic.
- Apache Traffic Server is under-appreciated as a large-file cache. We use it as an edge Docker pull-through cache to improve cold starts & as a CDN cache to lower our S3 bill.
- ClickHouse is used for analytics & game server logs. It’s so simple, I have nothing more to say.
- Serving Docker images with Apache TS is simpler & cheaper than running a Docker pull-through cache.
- Nebula has been rock solid & easy to operate as our overlay network.
- We use Redis Lua scripts for complex, atomic, in-memory operations (see the sketch after this post).
- Obviously, we love Nix.
- We keep a rough SBOM [3].
- Licensed under Apache 2.0 (OSI-approved). We seriously want people to run & tinker with Rivet themselves. We get a lot of questions about this: [4] [5]

Some HN-flavored FAQ:

> Why not build on top of Agones or Kubernetes?

Nomad is simpler & more flexible than Agones/Kubernetes out of the box, which let us get up and running faster. For example, Nomad natively supports multiple task drivers and edge workloads, and runs as a standalone binary.

> Fly.io migrated off of Nomad; how will you scale?

Nomad can support 2M containers [6]. Some quick math: an average of 8 players per lobby × 2M lobbies × 8 regional clusters = ~128M CCU. That’s well above PUBG’s 3.2M CCU peak.

Roblox’s game servers also run on top of Nomad [7]. We’re in good company.

> Are you affected by the recent Nomad BSL relicensing [8]?

Maybe, see [9].

> How do you compare to $X?

Our core goal is to get developers up and running as fast as possible. We provide extra services like our matchmaker [10], CDN [11], and KV [12] so that shipping a fully-fledged multiplayer game requires only a couple of lines of code.

No other project provides a comparably accessible, OSS, and comprehensive game server manager.

> Do you handle networking logic?

No. We work with existing tools like FishNet, Mirror, NGO, Unreal & Godot replication, and anything else you can run in Docker.

> Is anyone actually using this?

Yes, we’ve been running in closed beta since Jan ‘22 and currently support millions of MAU across many titles.

[1]: https://github.com/rivet-gg/microgravity.io
[2]: https://github.com/rivet-gg/rivet/tree/main/docs/libraries/bolt
[3]: https://github.com/rivet-gg/rivet/blob/main/docs/infrastructure/SBOM.md
[4]: https://github.com/rivet-gg/rivet/blob/main/docs/philosophy/LICENSING.md
[5]: https://github.com/rivet-gg/rivet/blob/main/docs/philosophy/WHY_OPEN_SOURCE.md
[6]: https://www.hashicorp.com/c2m
[7]: https://www.hashicorp.com/case-studies/roblox
[8]: https://www.hashicorp.com/blog/hashicorp-adopts-business-source-license
[9]: https://news.ycombinator.com/item?id=37084825
[10]: https://rivet.gg/docs/matchmaker
[11]: https://rivet.gg/docs/cdn
[12]: https://rivet.gg/docs/kv
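
As an illustration of the Redis Lua pattern mentioned in the tidbits (a hypothetical example, not one of Rivet's actual scripts), here is an atomic check-and-increment for reserving a lobby slot via redis-py; the script runs atomically inside Redis, so the check and the increment cannot race between nodes:

```python
# Atomically reserve a player slot in a lobby with a Redis Lua script.
import redis  # assumes `pip install redis`

r = redis.Redis()

# KEYS[1] = lobby counter key, ARGV[1] = max players.
RESERVE_SLOT = r.register_script("""
local count = tonumber(redis.call('GET', KEYS[1]) or '0')
if count >= tonumber(ARGV[1]) then
    return -1                       -- lobby full
end
return redis.call('INCR', KEYS[1])  -- new player count
""")

slot = RESERVE_SLOT(keys=["lobby:abc123:players"], args=[8])
print("lobby full" if slot == -1 else f"joined as player {slot}")
```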
