The best Hacker News stories from Show HN from the past day
Latest posts:
Show HN: LLMs can generate valid JSON 100% of the time
Outlines is a Python library focused on text generation with large language models. Brandon and I are not LLM experts; we started the project a few months ago because we wanted to better understand how the generation process works. Our background is in probabilistic, relational, and symbolic programming.

Recently we came up with a fast way to generate text that matches a regex (https://blog.normalcomputing.ai/posts/2023-07-27-regex-guided-generation/regex-guided-generation.html). The basic idea is simple: every regular expression has an equivalent deterministic finite automaton (DFA) representation. We can turn this DFA into a generative model: in each state we get the list of symbols that correspond to completions partially matching the regular expression. We mask out all other symbols in the logits returned by the language model, sample a new symbol, and move to the next state. The subtlety is that language models work with tokens, not symbols, so we derive a new FSM whose alphabet is the model's vocabulary. We can do this in a single pass over the vocabulary.

Generating the token masks thus requires only a dictionary lookup at each state. Our method blows other libraries like Microsoft's guidance out of the water.

From there it was only a small leap to generating text that follows a JSON schema (https://json-schema.org/) or is parseable into a Pydantic model (https://docs.pydantic.dev/latest/usage/models/). The method works with union types, optional types, nested schemas, arrays, everything. The output is guaranteed to be parseable.

I think it's cool, and I've spent a lot of time over the weekend watching even tiny models output valid JSON. Hope you will too.

I look forward to feedback, bug reports, feature requests, and discussions!

Edit: link to our pre-print explaining the method and how it can be extended to generate text that follows a context-free grammar: https://arxiv.org/abs/2307.09702
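To make the masking idea concrete, here is a self-contained toy sketch in Python. It illustrates the method described above, not the Outlines API; the [0-9]+ DFA, the tiny fake vocabulary, and the greedy token selection are all assumptions made for the example:

    # Toy illustration of regex-guided decoding: a character-level DFA is
    # lifted to a token-level transition table, which then masks the
    # model's logits at each step. Sketch only, not the Outlines code.
    DIGITS = set("0123456789")

    def dfa_next(state, char):
        # DFA for the regex [0-9]+: any digit moves to the accepting
        # state 1; anything else is a dead end (None).
        return 1 if char in DIGITS else None

    def build_token_table(vocab):
        # Map each DFA state to {token_id: next_state}, keeping only
        # tokens whose characters all stay inside the automaton.
        table = {0: {}, 1: {}}
        for state in table:
            for token_id, token in vocab.items():
                s = state
                for ch in token:
                    s = dfa_next(s, ch)
                    if s is None:
                        break
                if s is not None:
                    table[state][token_id] = s
        return table

    def masked_pick(logits, allowed):
        # Greedy stand-in for sampling: choose the highest-logit token
        # among the tokens the current DFA state allows.
        return max(allowed, key=lambda i: logits[i])

    vocab = {0: "4", 1: "2", 2: "42", 3: "hello"}   # fake vocabulary
    logits = [0.1, 0.3, 0.9, 2.5]                   # "hello" scores highest
    table = build_token_table(vocab)

    token = masked_pick(logits, table[0])           # "hello" is masked out
    print(vocab[token])                             # -> "42", matches [0-9]+

The real method samples from the renormalized distribution over the allowed tokens and repeats this at every step, but the per-step cost is exactly the dictionary lookup described above.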
Show HN: Broken Bear, the AI teddy bear that loves your broken self
I made a GPT-based AI chatbot grounded in Carl Rogers's philosophy of radical self-acceptance. Broken Bear is designed to be a kind, comforting, and quietly encouraging friend.
Show HN: NotYetNews – AI-Generated News from the Future
Show HN: I wrote an RDBMS (SQLite clone) from scratch in pure Python
I wrote a relational database management system (RDBMS), an SQLite clone, from scratch in pure Python.
Show HN: Run LLaMa2 in the Browser with Ggml.js
You can now build serverless AI inference web applications with ggml.js's LM backends.
Show HN: Openform – use Google Forms and Google Sheets as a simple database
Show HN: There are over a thousand possible finger arrangements for your hands
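(One way to arrive at that count, assuming each of the ten fingers is treated as independently extended or folded: 2^10 = 1024 arrangements, just over a thousand.)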
Show HN: liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching
Hello Hacker News,

I'm the maintainer of liteLLM, a package that simplifies input/output to the OpenAI, Azure, Cohere, Anthropic, and Hugging Face API endpoints: https://github.com/BerriAI/litellm/

We're open sourcing our implementation of the liteLLM proxy: https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md

TL;DR: it exposes a single API endpoint, /chat/completions, standardizes input/output for 50+ LLM models, and handles logging, error tracking, caching, and streaming.

What can the liteLLM proxy do?

- It's a central place to manage all LLM provider integrations
- Consistent input/output format:
  - Call all models using the OpenAI format: completion(model, messages)
  - Text responses are always available at ['choices'][0]['message']['content']
- Error handling using model fallbacks (if GPT-4 fails, try llama2)
- Logging: log requests, responses, and errors to Supabase, PostHog, Mixpanel, Sentry, or Helicone
- Token usage & spend: track input + completion tokens used, and spend per model
- Caching: an implementation of semantic caching
- Streaming & async support: return generators to stream text responses

You can deploy liteLLM to your own infrastructure using Railway, GCP, AWS, or Azure. A minimal call sketch follows below.

Happy completion()!
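A minimal sketch of the call shape described above; the model name is just an illustrative example:

    # The same completion(model, messages) call works across providers;
    # the model name below is illustrative.
    from litellm import completion

    messages = [{"role": "user", "content": "Hey, how's it going?"}]
    response = completion(model="gpt-3.5-turbo", messages=messages)

    # The text lives at the same path regardless of provider:
    print(response['choices'][0]['message']['content'])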
Show HN: Covert – Rewrite of HashiCorp Vault Using Rust, SQLite and Litestream
Show HN: Pykoi – a Python library for LLM data collection and fine tuning
Hi HN,

pykoi is an open-source Python library for ML scientists. pykoi makes it easier to collect data for LLMs, to use that data for finetuning, and to compare models to each other (e.g. your model pre- and post-finetuning, or your model vs. OpenAI vs. Claude). The library comes from pain points we experienced in LLM development:

1. Collecting feedback data from users isn't as easy as it could be. (The current process usually involves sharing Excel files of annotated responses back and forth, offering no insight into how users actually engage with your models.)

2. RLHF remains complicated to carry out. By *complicated*, we mean it requires a lot of steps, hundreds of configs, lengthy setups, etc.

3. Comparing models to each other *as they're used* (that is, independent of academic metrics) is full of friction. The current approach: spin up a model, ask questions, write the answers down. Repeat for other models, then compare.

At a high level, we think the active learning process should be closed-loop: data collection, fine tuning, and inference all feed from the same system. This library is our first step in that direction.

The project is still very early, but we hope some of it is useful. Note, we're fully open-source and actively adding features!

Website: https://www.cambioml.com/pykoi
GitHub: https://github.com/CambioML/pykoi

We would love your feedback!