The best Show HN stories on Hacker News from the past day
Latest posts:
Show HN: PromptTools – open-source tools for evaluating LLMs and vector DBs
Hey HN! We’re Kevin and Steve. We’re building PromptTools (<a href="https://github.com/hegelai/prompttools">https://github.com/hegelai/prompttools</a>): open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts.<p>Evaluating prompts, LLMs, and vector databases is a painful, time-consuming but necessary part of the product engineering process. Our tools allow engineers to do this in a lot less time.<p>By “evaluating” we mean checking the quality of a model's response for a given use case, which is a combination of testing and benchmarking. As examples:
- For generated JSON, SQL, or Python, you can check that the output is actually JSON, SQL, or executable Python.
- For generated emails, you can use another model to assess the quality of the generated email given some requirements, like whether or not the email is written professionally.
- For a question-answering chatbot, you can check that the actual answer is semantically similar to an expected answer.<p>At Google, Steve worked with HuggingFace and Lightning to support running the newest open-source models on TPUs. He realized that while the open-source community was contributing incredibly powerful models, it wasn’t so easy to discover and evaluate them. It wasn’t clear when you could use Llama or Falcon instead of GPT-4. We began looking for ways to simplify and scale this evaluation process.<p>With PromptTools, you can write a short Python script (as short as 5 lines) to run such checks across models, parameters, and prompts, and pass the results into an evaluation function to get scores. All of this can run on your local machine without sending data to third parties. Then we help you turn those experiments into unit tests and CI/CD that track your model’s performance over time.<p>Today we support all of the major model providers, including OpenAI, Anthropic, Google, HuggingFace, and even LlamaCpp, and vector databases like ChromaDB and Weaviate. You can evaluate responses via semantic similarity, auto-evaluation by a language model, or structured output validations like JSON and Python. We even have a notebook UI for recording manual feedback.<p>Quickstart:<p><pre><code> pip install prompttools
git clone https://github.com/hegelai/prompttools.git
cd prompttools && jupyter notebook examples/notebooks/OpenAIChatExperiment.ipynb
</code></pre>
For detailed instructions, see our documentation at <a href="https://prompttools.readthedocs.io/en/latest/" rel="nofollow noreferrer">https://prompttools.readthedocs.io/en/latest/</a>.<p>We also have a playground UI, built in Streamlit, which is currently in beta: <a href="https://github.com/hegelai/prompttools/tree/main/prompttools/playground">https://github.com/hegelai/prompttools/tree/main/prompttools...</a>. Launch it with:<p><pre><code> pip install prompttools
git clone https://github.com/hegelai/prompttools.git
cd prompttools && streamlit run prompttools/ui/playground.py
</code></pre>
We’d love it if you tried our product out and let us know what you think! We just got started a month ago and we’re eager to get feedback and keep building.
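The structured-output checks described above (is the output actually JSON, is it actually Python) can be sketched with the Python standard library alone. This is a minimal illustration, not the PromptTools API: the function names `is_valid_json`, `is_valid_python`, and `score_responses` are hypothetical, the model outputs are hard-coded, and the Python check only verifies syntax rather than executing anything.

```python
import ast
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return False
    return True


def is_valid_python(text: str) -> bool:
    """Return True if `text` parses as Python source (syntax only, nothing is run)."""
    try:
        ast.parse(text)
    except (SyntaxError, ValueError):
        return False
    return True


def score_responses(responses, check):
    """Return the fraction of responses that pass `check`."""
    if not responses:
        return 0.0
    return sum(1 for r in responses if check(r)) / len(responses)


# Hypothetical model outputs, hard-coded instead of calling a real model.
json_outputs = ['{"name": "Ada"}', "{name: Ada}", "[1, 2, 3]"]
python_outputs = ["print('hi')", "def f(:\n    pass"]

json_score = score_responses(json_outputs, is_valid_json)
python_score = score_responses(python_outputs, is_valid_python)
print(f"valid JSON: {json_score:.2f}, valid Python: {python_score:.2f}")
```

In a real comparison, the hard-coded lists would be replaced by responses collected from each model and prompt combination, with one score per combination.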
Show HN: Magic Loops – Combine LLMs and code to create simple automations
Howdy! We built this as an experiment in personal-programming, combining the best of LLMs and code to help automate tasks around you. I personally use it to track the tides and get notified when certain conditions are met, something that pure LLMs had trouble dealing with and pure code was often too brittle for.<p>We created it after getting frustrated with the inability of LLMs to deal with numbers and the various hoops we had to jump through to make ChatGPT output repeatable.<p>At the core, Magic Loops are just a series of "blocks" (JSON) that can be triggered with different inputs (email, time, webhook), then operate on those inputs using a combination of LLMs and code, and then output those results (email, text, webhook). Under the hood, the LLM calls are using GPT-4 via OpenAI and the code is run in sandboxed (no internet) Docker containers in AWS.<p>You have full control over each step of the loop, but you can also create (or attempt to create) a Magic Loop by simply describing what you want. We use GPT-4 to break that request into feasible steps, and then create a Magic Loop scaffold. Of course, you should still validate the loop before publishing it!<p>We've seen some neat use cases already:<p>- "Text me when the tide is less than 1ft between 7am and 7pm at Fort Funston"<p>- "Summarize an email using this format and forward it to this address"<p>- "Text me every time our store does more than $1000/day in volume on Shopify"<p>- "Take specific data from Cloudflare, format it, and send it to Mixpanel every hour"<p>We hope you enjoy what's essentially an experiment at this point. If folks like the concept, we're thinking about open sourcing it so you can run the loops locally with the code runtimes you wish (rather than in our code runners).<p>Let us know what you think, and more importantly, what you wish to build or automate!<p>Cheers,
Adam & Mihai
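The trigger, blocks, and output structure described above might look roughly like the sketch below, using the tide example. Everything here is an assumption for illustration: the field names in `loop`, the `CODE_BLOCKS` registry, and the `run_loop` helper are hypothetical and are not Magic Loops' real block schema or runtime, and the tide readings are made up.

```python
from datetime import datetime

# Hypothetical loop definition: these field names are illustrative only,
# not Magic Loops' real block schema.
loop = {
    "trigger": {"type": "time", "every": "1h"},
    "blocks": [
        {"type": "code", "name": "low_tide_in_daylight"},
        {"type": "output", "channel": "text"},
    ],
}


def low_tide_in_daylight(reading: dict) -> bool:
    """Code block: exact numeric comparison, the part LLMs tend to handle poorly."""
    when = datetime.fromisoformat(reading["time"])
    return reading["height_ft"] < 1.0 and 7 <= when.hour < 19


# Registry mapping block names to the functions that implement them.
CODE_BLOCKS = {"low_tide_in_daylight": low_tide_in_daylight}


def run_loop(loop: dict, reading: dict) -> list:
    """Run each block in order and return the messages that would be sent."""
    messages = []
    passed = True
    for block in loop["blocks"]:
        if block["type"] == "code":
            passed = CODE_BLOCKS[block["name"]](reading)
        elif block["type"] == "output" and passed:
            messages.append(
                f"[{block['channel']}] Low tide: "
                f"{reading['height_ft']} ft at {reading['time']}"
            )
    return messages


# Made-up readings standing in for a real tide data source.
print(run_loop(loop, {"time": "2023-08-01T09:30:00", "height_ft": 0.4}))
print(run_loop(loop, {"time": "2023-08-01T22:00:00", "height_ft": 0.4}))
```

Keeping the threshold and time-window logic in plain code makes the result repeatable, while an LLM block could still be added for steps like summarizing or formatting the message.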
Show HN: File distribution over DNS: (ab)using DNS as a CDN
Show HN: A Notion-like platform for building interactive models
Hey HN. I wanted to share an update to our previous thread, “Notion with problem solving capabilities”.<p>The Decipad public beta is now live. You can try it for free here: <a href="https://www.decipad.com/" rel="nofollow noreferrer">https://www.decipad.com/</a><p>We started building Decipad to make numbers more expressive and playful. It’s a notebook environment where you can combine text, numbers, data and calculations into a story.<p>Our goal is to help people communicate with numbers more effectively and collaborate across diverse backgrounds. It feels a bit like Notion, but it’s for building interactive models and reports.<p>A few things we’ve been addressing while building Decipad:<p>- A friendly modelling experience: You can express variables and calculations with quasi-natural language and connect them with tables, charts, pivot tables and other widgets.<p>- Unit expression: We built Decipad on a powerful unit system. You can assign labels and units to your data, like `Cost = $5 per month per seat`.<p>- Dimensional Categories: You can express relationships between variables and categories, which makes a model easy to adapt. We wrote about it here: <a href="https://www.decipad.com/blog/breaking-the-grid-overcoming-dimensional-constraints-in-spreadsheets" rel="nofollow noreferrer">https://www.decipad.com/blog/breaking-the-grid-overcoming-di...</a><p>- Connecting Data: You can connect data sources directly to your notebook. Right now, it’s intended for technical users. You can use JS and SQL to run a query.<p>We’re still exploring several areas, like support for large data sets and building more UX interactions on top of our language to make modeling even more approachable and collaborative.<p>We would love to get feedback or any thoughts on our approach.
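To make the unit-expression idea concrete, here is a small Python sketch of how a value like `Cost = $5 per month per seat` could carry its units through a calculation. It is not Decipad's language or implementation: the `Quantity` class is hypothetical, and the seat count and number of months are made-up inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number plus unit exponents, e.g. $ per month is (("$", 1), ("month", -1))."""

    value: float
    units: tuple  # sorted (unit, exponent) pairs

    @classmethod
    def of(cls, value, units):
        """Build a Quantity from a {unit: exponent} dict, dropping cancelled units."""
        return cls(value, tuple(sorted((u, e) for u, e in units.items() if e != 0)))

    def __mul__(self, other):
        # Multiplying quantities adds the exponents of matching units.
        combined = dict(self.units)
        for unit, exp in other.units:
            combined[unit] = combined.get(unit, 0) + exp
        return Quantity.of(self.value * other.value, combined)


# Cost = $5 per month per seat
cost = Quantity.of(5, {"$": 1, "month": -1, "seat": -1})
seats = Quantity.of(12, {"seat": 1})    # made-up team size
months = Quantity.of(6, {"month": 1})   # made-up duration

total = cost * seats * months
print(total)
```

Because "seat" and "month" cancel out, the result is left in plain dollars, which is the kind of bookkeeping a unit system does automatically.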
Show HN: LearnLingo – Converse with an AI-powered language tutor
Hey folks! I'm Callum, and I'm working on a way to practice a new language with an AI-powered tutor.<p>I've always found that the hardest part of learning a new language is finding someone to actually converse with. Even if a partner can be found, the pressure can mean that you are more focused on not making mistakes than on actually learning new grammar or vocabulary.<p>The service that I have been working on lets you practice with a language tutor via online chat messages or through a turn-based voice conversation.<p>I'm working on a number of other features that will be coming out shortly, including a few games for practising pronunciation and listening skills, as well as a plan to release some lesson plans for specific languages later on.<p>Give it a try, and let me know if you have any feedback!
Show HN: Linkwarden – An open source collaborative bookmark manager
Hey there HN!
Meet Linkwarden, a fully self-hostable, open-source collaborative bookmark manager to collect, organize and archive webpages.<p>Please also visit/star our GitHub repo [1].<p>Linkwarden was built using TypeScript and NextJS, backed by a PostgreSQL database for the lighter-weight data. The rest of the data can be stored either on the filesystem or in the cloud on DigitalOcean Spaces/AWS S3. We added the cloud storage option for the Cloud offering [2]: we realized that the preserved webpages (archives) take up space pretty quickly, and S3 was much more efficient for this task. On the front-end we used TailwindCSS for styling and Zustand for state management.<p>You can either use our Cloud offering (with a 14-day free trial) to directly support this project and experience Linkwarden, or self-host it on your own machine for maximum flexibility.<p>Feel free to ask if you have any questions; we'll do our best to answer them.<p>[1]: <a href="https://github.com/linkwarden/linkwarden">https://github.com/linkwarden/linkwarden</a><p>[2]: <a href="https://cloud.linkwarden.app/register" rel="nofollow noreferrer">https://cloud.linkwarden.app/register</a> - Hosted in DigitalOcean's datacenter located here in Toronto, ON.
Show HN: Markwhen: Markdown for Timelines
I've been working on markwhen for a bit as a way to create timelines and calendars from plain text, like markdown.<p>I personally like tools that let you immediately start using them, and I set out to do that here with markwhen.<p>Let me know if you have any questions or feedback!