The best Hacker News stories from Show from the past day

Latest posts:

Show HN: BBC “In Our Time”, categorised by Dewey Decimal, heavy lifting by GPT

I'm a big fan of the BBC podcast In Our Time -- and (like most people) I've been playing with the OpenAI APIs.

In Our Time has almost 1,000 episodes on everything from Cleopatra to the evolution of teeth to plasma physics, all still available, so it's my starting point to learn about most topics. But it's not well organised.

So here are the episodes sorted by library code. It's fun to explore.

Web scraping is usually pretty tedious, but I found that I could send the minimised HTML to GPT-3 and get (almost) perfect JSON back: the prompt includes the TypeScript definition.

At the same time I asked for a Dewey classification... and it worked. So I replaced a few days of fiddly work with 3 cents per inference and an overnight data run.

My takeaway is that I'll be using LLMs as function calls way more in the future. This isn't "generative" AI, more "programmatic" AI perhaps?

So I'm interested in what temperature=0 LLM usage looks like (you want it to be pretty deterministic), at scale, and what a language that treats that as a first-class concept might look like.
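
The "LLM as a function call" idea can be sketched roughly as follows. This is a minimal illustration, not the author's code: the `Episode` fields, the prompt wording, and the `complete` callable (a stand-in for whatever model client is used, presumably called with temperature=0) are all assumptions.

```python
import json
from typing import Callable

# Hypothetical TypeScript shape embedded in the prompt; field names are illustrative.
EPISODE_TYPE = """
interface Episode {
  title: string;
  description: string;
  dewey: string;  // Dewey Decimal class as a string
}
"""


def build_prompt(html: str) -> str:
    """Combine the instructions, the type definition, and the minimised HTML."""
    return (
        "Extract the episode from the HTML below. "
        "Reply with JSON only, matching this TypeScript definition:\n"
        f"{EPISODE_TYPE}\nHTML:\n{html}\n"
    )


def extract_episode(html: str, complete: Callable[[str], str]) -> dict:
    """Treat the model as a function: prompt in, validated dict out."""
    data = json.loads(complete(build_prompt(html)))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ("title", "description", "dewey"):
        if not isinstance(data.get(key), str):
            raise ValueError(f"missing or non-string field: {key}")
    return data


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return json.dumps({"title": "Cleopatra", "description": "...", "dewey": "932"})


episode = extract_episode("<h1>Cleopatra</h1>", fake_complete)
print(episode["title"], episode["dewey"])
```

Validating the reply before using it matters because the JSON is only "(almost) perfect"; a failed check is a natural point to retry or log the page for manual review.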

Show HN: ChatGPT Inline Bot on Telegram

Show HN: Regex Derivatives (Brzozowski Derivatives)

A Python sketch of a regex engine in less than 150 lines of code
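
For readers new to the technique, here is a minimal sketch of matching with Brzozowski derivatives. It is not the linked project's code; the class names and the `matches` helper are illustrative assumptions, and no simplification of intermediate expressions is attempted, so terms grow with input length.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regex syntax tree nodes."""


@dataclass(frozen=True)
class Empty(Regex):  # matches no string at all
    pass


@dataclass(frozen=True)
class Eps(Regex):  # matches only the empty string
    pass


@dataclass(frozen=True)
class Char(Regex):  # matches one literal character
    c: str


@dataclass(frozen=True)
class Alt(Regex):  # a | b
    a: Regex
    b: Regex


@dataclass(frozen=True)
class Cat(Regex):  # a followed by b
    a: Regex
    b: Regex


@dataclass(frozen=True)
class Star(Regex):  # zero or more repetitions of r
    r: Regex


def nullable(r: Regex) -> bool:
    """True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Alt):
        return nullable(r.a) or nullable(r.b)
    if isinstance(r, Cat):
        return nullable(r.a) and nullable(r.b)
    raise TypeError(f"unknown node: {r!r}")


def derive(r: Regex, ch: str) -> Regex:
    """Regex for the suffixes left after r consumes the character ch."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.a, ch), derive(r.b, ch))
    if isinstance(r, Cat):
        left = Cat(derive(r.a, ch), r.b)
        return Alt(left, derive(r.b, ch)) if nullable(r.a) else left
    if isinstance(r, Star):
        return Cat(derive(r.r, ch), r)
    raise TypeError(f"unknown node: {r!r}")


def matches(r: Regex, s: str) -> bool:
    """Differentiate by each character, then test for the empty string."""
    for ch in s:
        r = derive(r, ch)
    return nullable(r)


# (a|b)*c
pattern = Cat(Star(Alt(Char("a"), Char("b"))), Char("c"))
print(matches(pattern, "abbac"), matches(pattern, "ab"))
```

A practical engine would simplify after each derivative (for example, collapsing `Alt(Empty, r)` to `r`) to keep the expression small.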

Show HN: Bearer – Open-source code security scanning solution (SAST)

Hi HN,

we’re the co-founders of Bearer, and today we’re launching an open-source alternative to code security solutions such as Snyk Code, SonarQube, or Checkmarx. Essentially, we help security & engineering teams discover, filter, and prioritize security risks and vulnerabilities in their codebase, with a unique approach through sensitive data (PII, PD, PHI).

Our website is at https://www.bearer.com and our GitHub is here: https://github.com/bearer/bearer

We are not originally security experts but have been software developers and engineering leaders for over 15 years now, and we thought we could provide a new perspective on security products with a strong emphasis on the developer experience, something we often found lacking in security tools.

In addition to building a truly developer-friendly security solution, we’ve also heard a lot of teams complaining about how noisy their static code security solutions are. As a result, they often have difficulty triaging the most important issues, and ultimately it’s difficult to remediate them. We believe an important part of the problem lies in the fact that we lack a clear understanding of the real impact of any security issue. Without that understanding, it’s very difficult to ask developers to remediate critical security flaws.

We’ve built a unique approach to this problem, by looking at the impact of security issues through the lens of sensitive data. Interestingly, most security teams’ ultimate responsibility today is to secure that sensitive data and protect their organization from costly data loss and leakage, but until today, that connection has never been made.

In practical terms, we provide a set of rules that assess the variety of ways known code vulnerabilities (CWE) ultimately impact your application security, and we reconcile them with your sensitive data flows. At the time of this writing, Bearer provides over 100 rules.

Here are some examples of what those rules can detect:
- Leakage of sensitive data through cookies, internal loggers, third-party logging services, and into analytics environments.
- Non-filtered user input that can lead to breaches of sensitive information.
- Usage of weak encryption libraries or misuse of encryption algorithms.
- Unencrypted incoming and outgoing communication (HTTP, FTP, SMTP) of sensitive information.
- Hard-coded secrets and tokens.
- And many more, which you can find here: https://docs.bearer.com/reference/rules/

Rules are easily extensible, so you can create your own; everything is YAML-based. For example, some of our early users used this system to detect the leakage of sensitive data in their backup environments or missing application-level encryption of their health data.

I’m sure you are wondering how we can detect sensitive data flows just by looking at the code. Essentially, we also perform static code analysis to detect those. In a nutshell, we look for those sensitive data flows at two levels:
- Analyzing class names, methods, functions, variables, properties, and attributes. We then tie those together into detected data structures, perform variable reconciliation, etc.
- Analyzing data structure definition files such as OpenAPI, SQL, GraphQL, and Protobuf.

Then we pass this over to a classification engine that assesses 120+ data types from sensitive data categories such as Personal Data (PD), Sensitive PD, Personally Identifiable Information (PII), and Personal Health Information (PHI). All of that is documented here: https://docs.bearer.com/explanations/discovery-and-classification/

As we said before, developer experience is key; that’s why you can install Bearer in 15 seconds, from cURL, Homebrew, apt-get, yum, or as a Docker image. Then you run it as a CLI locally, or as part of your CI/CD.

We currently support JavaScript and Ruby stacks, but more will follow shortly!

Please let us know what you think and check out the repo here: https://github.com/Bearer/bearer
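
To make the name-based discovery step concrete, here is a toy sketch of classifying identifiers by the words they contain. It is not Bearer's engine or rule format: the `HINTS` table, its category assignments, and both function names are assumptions chosen only for illustration.

```python
import re

# Hypothetical keyword table. The post describes a real classifier covering
# 120+ data types; these four entries are illustrative only.
HINTS = {
    "email": "PII",
    "ssn": "PII",
    "birth": "PD",
    "diagnosis": "PHI",
}


def split_identifier(name: str) -> list[str]:
    """Split snake_case and camelCase identifiers into lowercase words."""
    snake = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return [part for part in snake.lower().split("_") if part]


def classify(identifiers: list[str]) -> dict[str, str]:
    """Map each identifier to the first sensitive-data category it hints at."""
    found = {}
    for ident in identifiers:
        for word in split_identifier(ident):
            if word in HINTS:
                found[ident] = HINTS[word]
                break
    return found


print(classify(["userEmail", "date_of_birth", "retry_count"]))
```

A real scanner would also need to follow how such values flow through the code, which plain name matching cannot show.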

Show HN: Historical HN Hiring Data

Hi HN! A few days ago I saw a graph[0] that showed the # of job postings on HN was declining. I started wondering what other trends I could glean from the data, so I created this!

You can filter through the top-level comments by keyword; for example, you can filter by "remote" to see the massive spike around March 2020. Another interesting thing I found is that I can compare hiring across cities.

I hope you enjoy it! I made it so that the links to your search are sharable, so if you have some interesting data you should be able to just link the page you're on!

[0] https://rinzewind.org/blog-en/2023/the-tech-downturn-seen-through-hacker-news-comments.html
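
The keyword filter described above could be approximated as below. This is a rough sketch, not the site's implementation: the `(month, text)` input shape, the function name, and the sample comments are all made up for illustration.

```python
from collections import Counter


def monthly_keyword_counts(comments, keyword):
    """Count top-level comments per month whose text mentions the keyword.

    `comments` is an iterable of (month, text) pairs, with month as 'YYYY-MM'.
    Matching is a case-insensitive substring test.
    """
    needle = keyword.lower()
    counts = Counter()
    for month, text in comments:
        if needle in text.lower():
            counts[month] += 1
    return dict(sorted(counts.items()))


# Invented sample data, not real HN postings.
sample = [
    ("2020-02", "Acme | Berlin | Onsite | Backend engineer"),
    ("2020-03", "Acme | REMOTE | Backend engineer"),
    ("2020-03", "Initech | Remote (US) | SRE"),
    ("2020-03", "Globex | London | Onsite"),
]
print(monthly_keyword_counts(sample, "remote"))
```

Comparing cities works the same way: run the function once per city name and plot the resulting series side by side.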

Show HN: I built a better UI for ChatGPT

Show HN: NESFab – Programming language for making NES games

This is a long-running personal project of mine: writing an optimizing compiler from scratch. Everything was done by me, including the lexer/parser, SSA-based IR, high-performance data structures, and code generator.

Originally I wasn't targeting the NES. It started as a scripting language, then it morphed into a C++ replacement, and then finally I turned it into what it is today. The large scope of the project and its colorful history mean it's still a little rough around the edges, but it's now working well enough to post.

Show HN: I made a battery-powered e-ink display that shows my calendar

Show HN: Sorbay – Open-source alternative to Loom

Hey HN, we're excited to introduce Sorbay, an open source alternative to Loom for creating and sharing screen recordings.

With Sorbay, you can easily record your screen, camera, and microphone all at once. It is a complete solution that comes with its own backend service, allowing you to instantly share a link to your recording as soon as it is finished. To make this possible, the video is streamed directly to the backend service as the recording happens.

With both founders based in different countries, we needed a tool to quickly share screen recordings to keep each other up to date or to ask for feedback. Meetings are cool if you need to discuss something deeply, but for almost everything else a quick recording works better.

We had to settle for one of the proprietary solutions because none of the open source tools allowed us to quickly share something with each other. Doing the recording is one aspect, but having the ability to instantly share a link was crucial. Waiting on a 400 MB video upload to Dropbox is just too much of an interruption if you want to quickly share something.

The tipping point for us to actually build this open source tool came via an interaction at one of our day jobs. A third-party provider sent a screen recording full of confidential information and, to make things worse, they had uploaded all of it to a different third-party service. We strongly believe that information like this should stay within a company, ideally on infrastructure that they control themselves. Having a fully integrated open source solution is the best way to go for this.

Our goal with this first public release is to gather feedback. The critical code paths are working, but it is still a bit rough to use. We deliberately cut out all non-essential features, but have a clear roadmap for what we want to release this year.

There are a couple of known issues, like audio glitches, non-working videos in Safari, and crashing binaries, that we hope to fix in the coming weeks. Later this year, we plan on releasing a cloud-hosted version of Sorbay that would let you connect your own S3 storage provider. Additionally, we will be releasing an on-prem option focused on features for enterprises (SSO, RBAC, compliance).

Both the Sorbay client and the backend service are completely open source. For licensing, we chose the AGPLv3 throughout the stack. The client is built with Vue.js on top of Electron. The use of Electron might be a bit controversial here on Hacker News, but given the resources we currently have, it was the only way to get a working client out on all major platforms. The backend service is built with Django. We use Keycloak for authentication and Minio for S3-compatible storage. All of this runs alongside Postgres and Redis in Docker containers managed by Docker Compose.

We invite you to try Sorbay for yourself and join us on our issue trackers[1][2], Slack channel[3] or here on HN.

Thanks for checking out Sorbay!

[1]: https://github.com/sorbayhq/sorbay

[2]: https://github.com/sorbayhq/sorbay-client

[3]: https://join.slack.com/t/slack-oso6527/shared_invite/zt-1qd8gm543-KGdb5gD4WqikZEKEk8sSTA

Show HN: Total.js – Low-code development (Node-RED alternative)

Show HN: Faster FastAPI with simdjson and io_uring on Linux 5.19

A few months ago, I benchmarked FastAPI on an i9 MacBook Pro. I couldn't believe my eyes. A basic REST endpoint to `sum` two integers took 6 milliseconds to evaluate. It is okay if you are targeting a server in another city, but it should be less when your client and server apps are running on the same machine.

FastAPI would have bottlenecked the inference of our lightweight UForm neural networks recently trending on HN under the title "Beating OpenAI CLIP with 100x less data and compute". (Thank you all for the kind words!) So I wrote another library.

It has been a while since I have written networking libraries, so I was eager to try the newer io_uring networking functionality added by Jens Axboe in kernel 5.19. TLDR: It's excellent! We used pre-registered buffers and re-allocated file descriptors from a managed pool. Some other parts, like multi-shot requests, also look intriguing, but we couldn't see a flawless way to integrate them into UJRPC. Maybe next time.

Like a parent with two kids, we tell everyone we love Kernel Bypass and SIMD equally. So I decided to combine the two, potentially producing one of the fastest implementations of the most straightforward RPC protocol: JSON-RPC. Healthy and fun, efficient and simple: what can be better?

By now, you may already guess at least one of the dependencies: `simdjson` by Daniel Lemire, which has become the industry standard. io_uring is generally very fast, even with a single core. Adding more polling threads may only increase congestion. We needed to continue using no more than one thread, but parsing messages may involve more work than just invoking a JSON parser.

JSON-RPC is transport agnostic. The incoming requests can be sent over HTTP, prepended by rows of headers. Those would have to be POSTs and generally contain Content-Length and Content-Type. There is a SIMD-accelerated library for that as well. It is called `picohttpparser`, uses SSE, and is maintained by H2O.

The story doesn't end there. JSON is limited. Passing binary strings is a nightmare. The most common approach is to encode them with base-64. So we took Turbo-Base64 from the PowTurbo project to decode those binary strings.

The core implementation of UJRPC is under 2000 lines of C++. Knowing that those lines connect 3 great libraries with the newest and coolest parts of Linux is enough to put a smile on my face. Most people are more rational, so here is another reason to be cheerful.

- FastAPI throughput: 3'184 rps.
- Python gRPC throughput: 9'849 rps.
- UJRPC throughput:
  - Python server with io_uring: 43'000 rps.
  - C server with POSIX: 79'000 rps.
  - C server with io_uring: 231'000 rps.

Granted, this is not yet your batteries-included server. It can't balance the load, manage threads, spell S in HTTPS, or call parents when you misbehave in school. But at least part of it you shouldn't expect from a web server.

After following the standardization process of executors in C++ for the last N+1 years, we adopted the "bring your runtime" and "bring your thread-pool" policies. HTTPS support, however, is our next primary objective.

---

Of course, it is a pre-production project and must have a lot of bugs. Don't hesitate to report them. We have huge plans for this tiny package and will potentially make it the default transport of UKV: https://github.com/unum-cloud/ukv
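
For readers unfamiliar with the wire format discussed above, the sketch below builds a JSON-RPC 2.0 request, frames it as an HTTP POST with Content-Type and Content-Length, and carries a binary payload as base-64 text. It uses only the Python standard library and is not UJRPC's API; the helper names, the parameter names `a` and `b`, and the `echo` method are assumptions.

```python
import base64
import json


def jsonrpc_request(method, params, request_id):
    """Serialize a JSON-RPC 2.0 request object to UTF-8 bytes."""
    message = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(message).encode("utf-8")


def http_post_frame(body, path="/", host="localhost"):
    """Prepend the HTTP/1.1 request line and headers to a JSON body."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def parse_frame(frame):
    """Naive parser: read Content-Length, then decode that many body bytes."""
    head, _, body = frame.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        key, _, value = line.partition(b":")
        headers[key.strip().lower()] = value.strip()
    length = int(headers[b"content-length"])
    return json.loads(body[:length])


# A `sum` call with two integers, as in the benchmark described above.
frame = http_post_frame(jsonrpc_request("sum", {"a": 2, "b": 3}, 1))
request = parse_frame(frame)

# Binary data has to travel as text; base-64 is the usual workaround.
blob = bytes([0, 255, 16, 32])
encoded = base64.b64encode(blob).decode("ascii")
echo = parse_frame(http_post_frame(jsonrpc_request("echo", {"blob": encoded}, 2)))
decoded = base64.b64decode(echo["params"]["blob"])
print(request["method"], request["params"], decoded == blob)
```

The parsing and decoding steps shown here in plain Python are the ones the post hands to `picohttpparser`, `simdjson`, and Turbo-Base64.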