The best Hacker News stories from the past day

Latest posts:

How Is LLaMa.cpp Possible?

The OpenTF Manifesto

Things you forgot (or never knew) because of React

Tell HN: t.co is adding a five-second delay to some domains

Go to Twitter and click on a link to any URL on "NYTimes.com" or "threads.net" and you'll see a roughly five-second delay before t.co forwards you to the right address.

Twitter won't ban domains they don't like, but it will waste your time if you visit them.

I've been tracking the NYT delay ever since it was added (8/4, roughly noon Pacific time), and the delay is so consistent it's obviously deliberate.
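
For anyone who wants to reproduce the measurement, a minimal sketch (the shortlink below is a placeholder; substitute a real t.co URL copied from a tweet): time how long t.co takes to answer before issuing its redirect, without following it.

```python
import time

import requests  # third-party: pip install requests

# Placeholder shortlink; replace with a real t.co URL from a tweet.
URL = "https://t.co/XXXXXXXX"

start = time.monotonic()
# Don't follow the redirect: we only want t.co's own response time,
# not the time spent loading the destination page.
resp = requests.get(URL, allow_redirects=False, timeout=30)
elapsed = time.monotonic() - start

print(f"status={resp.status_code} location={resp.headers.get('Location')}")
print(f"t.co answered in {elapsed:.2f}s")
```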

Firefox finally outperforming Google Chrome in SunSpider

We reduced the cost of building Mastodon at Twitter-scale by 100x

Show HN: Little Rat – Chrome extension monitors network calls of all extensions

Hi HN,

I needed a way to monitor network calls made by Chrome extensions, so I made a small extension.

You can install it by dropping the zip or crx into the extensions page. It'll be on the Chrome store whenever/if it gets through review.

Hopefully it's useful to others.

https://github.com/dnakov/little-rat

https://twitter.com/dnak0v

Tech workers remain some of the highest paid in New Zealand

Software Engineering at Google (2020)

Backward Compatibility, Go 1.21, and Go 2

A video game where you are an operating system

Bypassing YouTube video download throttling

Writing about what you learn pushes you to understand topics better

Show HN: LLMs can generate valid JSON 100% of the time

Outlines is a Python library that focuses on text generation with large language models. Brandon and I are not LLM experts; we started the project a few months ago because we wanted to better understand how the generation process works. Our original background is probabilistic, relational, and symbolic programming.

Recently we came up with a fast way to generate text that matches a regex (https://blog.normalcomputing.ai/posts/2023-07-27-regex-guided-generation/regex-guided-generation.html). The basic idea is simple: regular expressions have an equivalent deterministic finite automaton (DFA) representation. We can transform this DFA into a generative model: in each state we get a list of symbols which correspond to completions that partially match the regular expression. We mask the other symbols in the logits returned by a large language model, sample a new symbol, and move to the next state. The subtlety is that language models work with tokens, not symbols, so we derive a new FSM whose alphabet is the model's vocabulary. We can do this in only one pass over the vocabulary.

Generating the token masks thus only requires a dictionary lookup at each state. Our method blows other libraries like Microsoft's guidance out of the water.

From there it was only a small leap to generating text that follows a JSON schema (https://json-schema.org/) or is parseable into a Pydantic model (https://docs.pydantic.dev/latest/usage/models/). The method works with union types, optional types, nested schemas, arrays, everything. The output is guaranteed to be parseable.

I think it's cool, and I've spent a lot of time over the weekend watching even tiny models output valid JSON. Hope you will too.

I look forward to feedback, bug reports, feature requests, and discussions!

Edit: Link to our pre-print explaining the method and how it can be extended to generate text that follows a context-free grammar: https://arxiv.org/abs/2307.09702
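
The masking step is easy to sketch. Below is a minimal, self-contained illustration of the idea, not Outlines' actual API: a hand-built token-level FSM for the regex [0-9]+ over a toy vocabulary, with disallowed logits masked to -inf before sampling. The tables and the dummy uniform-logits "model" are assumptions for illustration; in practice both tables are derived from the regex and the model's tokenizer in one pass over the vocabulary.

```python
import numpy as np

# Toy vocabulary and a hand-built token-level FSM for the regex "[0-9]+".
# In the real method these tables are computed once from the regex's DFA
# and the tokenizer, in a single pass over the vocabulary.
VOCAB = ["0", "1", "7", "a", "<eos>"]
ALLOWED = {            # state -> token ids that keep a partial match alive
    0: [0, 1, 2],      # must start with a digit
    1: [0, 1, 2, 4],   # more digits, or stop
}
NEXT_STATE = {(0, 0): 1, (0, 1): 1, (0, 2): 1,
              (1, 0): 1, (1, 1): 1, (1, 2): 1}

def sample_constrained(logits_fn, max_tokens=8, seed=0):
    """Mask disallowed tokens, sample, advance the FSM state, repeat."""
    rng = np.random.default_rng(seed)
    state, out = 0, []
    for _ in range(max_tokens):
        logits = logits_fn(out)            # stand-in for a real LM forward pass
        mask = np.full_like(logits, -np.inf)
        mask[ALLOWED[state]] = 0.0         # one dictionary lookup per state
        z = logits + mask
        probs = np.exp(z - z.max())
        probs /= probs.sum()
        tok = int(rng.choice(len(VOCAB), p=probs))
        if VOCAB[tok] == "<eos>":
            break
        out.append(tok)
        state = NEXT_STATE[(state, tok)]
    return "".join(VOCAB[t] for t in out)

# A dummy "model" returning uniform logits; every sample still matches [0-9]+.
print(sample_constrained(lambda ctx: np.zeros(len(VOCAB))))
```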

Today I realized I now trust Microsoft more than Google. What is happening?

‘I've got nothing to hide’ and other misunderstandings of privacy (2007)

Toki Pona: an attempted universal language with only ~120 words
