The best Hacker News stories from Show from the past day

Latest posts:

Show HN: ScratchDB – Open-Source Snowflake on ClickHouse

Hello! For the past year I’ve been working on a fully-managed data warehouse built on ClickHouse. I built this because I was frustrated with how much work was required to run an OLAP database in prod: re-writing my app to do batch inserts, managing clusters, and needing to look up special CREATE TABLE syntax every time I made a change. I found pricing for other warehouses confusing (what is a “credit” exactly?) and worried about getting capacity-planning wrong.<p>I was previously building accounting software for firms with millions of transactions. I desperately needed to move from Postgres to an OLAP database but didn’t know where to start. I eventually built abstractions around ClickHouse: my application code called an insert() function, but in the background I had to stand up Kafka for streaming, bulk loading, DB drivers, and ClickHouse configs, and manage schema changes.<p>This was all a big distraction when all I wanted was to save data and get it back. So I decided to build a better developer experience around it. The software is open-source: <a href="https://github.com/scratchdata/ScratchDB">https://github.com/scratchdata/ScratchDB</a> and the paid offering is a hosted version: <a href="https://www.scratchdb.com/">https://www.scratchdb.com/</a>.<p>It's called “ScratchDB” because the idea is to make it easy to get started from scratch. It’s a massively simpler abstraction on top of ClickHouse.<p>ScratchDB provides two endpoints [1]: one to insert data and another to query. When you send any JSON, it automatically creates tables and columns based on the structure [2]. Because table creation is automated, you can just start sending data and the system will just work [3]. It also means you can use Scratch as any webhook destination without prior setup [4,5]. When you query, just pass SQL as a query param and it returns JSON.<p>It handles streaming and bulk loading data.
When data is inserted, I append it to a file on disk, which is then bulk loaded into ClickHouse. The overall goal is for the platform to automatically handle managing shards and replicas.<p>The whole thing runs on regular servers. Hetzner has become our cloud of choice, along with Backblaze B2 and SQS. It is written in Go. From an architecture perspective I try to keep things simple - I want folks to make economical use of their servers.<p>So far ScratchDB has ingested about 2 TB of data and handled 4,000 requests/second on about $100 worth of monthly server costs.<p>Feel free to download it and play around - if you’re interested in this stuff then I’d love to chat! Really looking for feedback on what is hard about analytical databases and what would make the developer experience easier!<p>[1] <a href="https://scratchdb.com/docs">https://scratchdb.com/docs</a><p>[2] <a href="https://scratchdb.com/blog/flatten-json/">https://scratchdb.com/blog/flatten-json/</a><p>[3] <a href="https://scratchdb.com/blog/scratchdb-email-signups/">https://scratchdb.com/blog/scratchdb-email-signups/</a><p>[4] <a href="https://scratchdb.com/blog/stripe-data-ingest/">https://scratchdb.com/blog/stripe-data-ingest/</a><p>[5] <a href="https://scratchdb.com/blog/shopify-data-ingest/">https://scratchdb.com/blog/shopify-data-ingest/</a>
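
The idea of deriving columns from the shape of incoming JSON can be sketched roughly as follows. This is a minimal illustration in Python, not ScratchDB's actual code (which is written in Go). The function names `flatten` and `infer_columns`, the underscore separator, and the choice to leave lists untouched are all assumptions made for the example; the real flattening rules are described in the linked blog post [2].

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts into one level of column names.

    Assumption: nested keys are joined with `sep`; lists are kept as values.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


def infer_columns(rows: list[dict[str, Any]]) -> list[str]:
    """Return the union of flattened column names, in first-seen order."""
    columns: dict[str, None] = {}
    for row in rows:
        for name in flatten(row):
            columns.setdefault(name, None)
    return list(columns)


event = {"user": {"id": 7, "plan": "pro"}, "amount": 12.5}
print(flatten(event))  # {'user_id': 7, 'user_plan': 'pro', 'amount': 12.5}
```

A loader built on this could compare `infer_columns(batch)` with the columns a table already has and add any that are missing before bulk loading the batch.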

Show HN: Prompt-Engineering Tool: AI-to-AI Testing for LLM

The Spelltest framework simulates conversations between AI 'synthetic users' in an environment to test and refine LLM-based applications. It ensures your app converses with utmost accuracy and relevance. Post-chat, Spelltest assesses responses, providing qualitative and quantitative feedback on performance. Suitable for both chat and completion modes.<p>When to use: - After modifying your prompt. - When your LLM provider updates. - As a CI step for your repo.<p>All feedback and collaborations appreciated!

Show HN: Tonic Validate Metrics – an open-source RAG evaluation metrics package

Hey HN, Joe and Ethan from Tonic.ai here. We just released a new open-source Python package for evaluating the performance of Retrieval Augmented Generation (RAG) systems.<p>Earlier this year, we started developing a RAG-powered app to enable companies to talk to their free-text data safely.<p>During our experimentation, however, we realized that using such a new method meant that there weren’t industry-standard evaluation metrics for measuring RAG performance. We built Tonic Validate Metrics (tvalmetrics, for short) to easily calculate the benchmarks we needed to meet in building our RAG system.<p>We’re sharing this Python package with the hope that it will be as useful for you as it has been for us and become a key part of the toolset you use to build LLM-powered applications. We also made Tonic Validate Metrics open-source so that it can thrive and evolve with your contributions!<p>Please take it for a spin and let us know what you think in the comments.<p>Docs: <a href="https://docs.tonic.ai/validate" rel="nofollow noreferrer">https://docs.tonic.ai/validate</a><p>Repo: <a href="https://github.com/TonicAI/tvalmetrics">https://github.com/TonicAI/tvalmetrics</a><p>Tonic Validate: <a href="https://validate.tonic.ai" rel="nofollow noreferrer">https://validate.tonic.ai</a>

Show HN: Orbital – Dynamically unifying APIs and data with no glue code

Hey HN,<p>I'm excited to share Orbital, a new approach for unifying APIs and data sources!<p>Rather than relying on glue code to bridge endpoints, Orbital leverages annotations in schemas & API specs to build the integration dynamically.<p>The traditional method of crafting glue code often becomes repetitive and burdensome to maintain in the long run.<p>With Orbital, developers embed tags in their existing API specs (OAP, Protobuf, etc), indicating where data can be sourced, and publish these specs to Orbital (which runs self-hosted).<p>Consumers query these tags with our TaxiQL language, and Orbital generates the integration on the fly. That could be merging multiple APIs, blending API and database queries, or enriching event streams to craft custom message payloads.<p>It feels a lot like writing GraphQL, but there are no resolvers to maintain, and producers are free to use a variety of API spec languages.<p>The beauty of using tags over field names is the adaptability. As API developers update and publish their specs, the integration remains seamless and automatically adjusts.<p>Under the hood, the tags (and associated query language) are actually Taxi - an OSS meta-language and toolchain we built (and have shared previously). Orbital is a query engine that executes TaxiQL queries, generating the integration.<p>We've been working on this for a while, and have a number of production deployments. We recently made the move to make the source available on GitHub under a mix of Apache 2 (Open Core) and BuSL.<p>In a nutshell, Orbital excels in Data Composition (covering APIs, DBs), crafting Bespoke Event Streams, and streamlining ELT workloads into databases.<p>Excited to hear your thoughts and feedback!
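
To make the "tags instead of field names" idea concrete, here is a rough Python sketch. It is not Orbital's engine and does not use Taxi or TaxiQL syntax. The `Operation` class, the semantic type names (`Email`, `CustomerId`, `AccountBalance`), and the stubbed services are hypothetical. The assumption is only that each published operation declares which semantic type it accepts and which it returns, so a query engine can search for a chain of calls rather than relying on hand-written glue code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Operation:
    """A published operation tagged with the semantic types it accepts and returns."""
    name: str
    accepts: str
    returns: str
    call: Callable[[object], object]


def plan(operations: list[Operation], have: str, want: str) -> Optional[list[Operation]]:
    """Breadth-first search for a chain of operations leading from `have` to `want`."""
    queue = deque([(have, [])])
    seen = {have}
    while queue:
        current, path = queue.popleft()
        if current == want:
            return path
        for op in operations:
            if op.accepts == current and op.returns not in seen:
                seen.add(op.returns)
                queue.append((op.returns, path + [op]))
    return None


def resolve(operations: list[Operation], have: str, value: object, want: str) -> object:
    """Run the planned chain, feeding each result into the next operation."""
    chain = plan(operations, have, want)
    if chain is None:
        raise LookupError(f"no path from {have} to {want}")
    for op in chain:
        value = op.call(value)
    return value


# Hypothetical services, stubbed with in-memory lookups.
OPERATIONS = [
    Operation("customers.lookup", "Email", "CustomerId", {"a@example.com": 42}.get),
    Operation("billing.balance", "CustomerId", "AccountBalance", {42: 99.5}.get),
]

print(resolve(OPERATIONS, "Email", "a@example.com", "AccountBalance"))  # 99.5
```

Because the plan is computed from the tags at query time, renaming a field in one service's response would not break the consumer in this sketch as long as the semantic type stays the same.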

Show HN: Instant API – Build type-safe web APIs with JavaScript

Hey there HN! I just wrapped up an all-day documentation spree for Instant API, a JavaScript framework for easily building web APIs that implements type safety at the HTTP interface. It uses a function-as-a-service approach combined with a slightly modified JSDoc spec to automatically enforce parameter and schema validation before HTTP requests make it through to your code. You just write comments above your endpoint functions and voila, enforced type safety contracts between your API endpoints and your developers. OpenAPI specifications are generated automatically as a byproduct of simply building your endpoints.<p>This eliminates the need for most schema validation libraries, automates user input sanitization, and prevents your team from developing carpal tunnel syndrome trying to keep your OpenAPI / Swagger specifications up-to-date. We developed it as a side effect of building our own serverless API platform, where it has scaled to handle over 100M requests from users per day. This is an early release, but Instant API has had about six years of consistent development as proprietary software to get to the point it is at today. We have spent the last couple of months modernizing it.<p>We have command line tools that make building, testing, and deploying Instant API to Vercel or AWS a breeze. Additionally, there is a sister package, Instant ORM, that provides a Ruby on Rails-like model management, migration and querying suite for Postgres. Everything you need to build an API from scratch.<p>Finally -- LLMs and chatbots are at the top of the hype cycle right now. We aren't immune, and have developed a number of LLM-integrated tools ourselves.
Instant API comes with first-class Server-Sent Event support for building your own assistants and a few other nifty tools that help with AI integrations and chatbot webhook responses, like executing endpoints as background jobs.<p>This has been a huge labor of love -- it has become our "JavaScript on Rails" suite -- and we hope y'all enjoy it!
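
The general pattern of rejecting bad input before it reaches the endpoint function can be sketched in a few lines. Note that this is a Python analogue of the idea, not Instant API itself: it reads Python type hints where Instant API reads JSDoc-style comments, and the `validated` decorator, `ValidationError`, and `create_user` are hypothetical names introduced for illustration.

```python
import inspect
from functools import wraps


class ValidationError(Exception):
    """Raised when a request parameter is missing or cannot be coerced."""


def validated(handler):
    """Coerce string parameters to the handler's annotated types before calling it.

    Assumption: annotations are simple constructors such as int, float, or str.
    """
    signature = inspect.signature(handler)

    @wraps(handler)
    def wrapper(params: dict[str, str]):
        kwargs = {}
        for name, spec in signature.parameters.items():
            if name not in params:
                if spec.default is inspect.Parameter.empty:
                    raise ValidationError(f"missing parameter: {name}")
                continue  # fall back to the handler's default value
            expected = spec.annotation
            if expected is inspect.Parameter.empty:
                kwargs[name] = params[name]  # untyped: pass through unchanged
                continue
            try:
                kwargs[name] = expected(params[name])
            except (TypeError, ValueError) as exc:
                raise ValidationError(f"{name} must be {expected.__name__}") from exc
        return handler(**kwargs)

    return wrapper


@validated
def create_user(name: str, age: int, newsletter: str = "no"):
    return {"name": name, "age": age, "newsletter": newsletter}


print(create_user({"name": "Ada", "age": "36"}))
# {'name': 'Ada', 'age': 36, 'newsletter': 'no'}
```

Calling `create_user({"name": "Ada", "age": "old"})` raises `ValidationError` inside the wrapper, so the handler body never sees the malformed value.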

Show HN: A note-keeping system on top of Fossil SCM

Show HN: JellyBox – Jellyfin Desktop Client

Hey guys, so I've been working on a native desktop macOS client for Jellyfin servers.<p>Feel free to join and try it out <a href="https://testflight.apple.com/join/LVj8KwAq" rel="nofollow noreferrer">https://testflight.apple.com/join/LVj8KwAq</a>

Show HN: Dlt – Python library to automate the creation of datasets

Hi HN,<p>We're Anna, Adrian, Marcin and Matt, developers of dlt. dlt is an open source library to automatically create datasets out of messy, unstructured data sources. You can use the library to move data from just about anywhere into most well-known SQL and vector stores, data lakes, storage buckets, or local engines like DuckDB. It automates many cumbersome data engineering tasks and can be used by anyone who knows Python.<p>Here’s our GitHub: <a href="https://github.com/dlt-hub/dlt">https://github.com/dlt-hub/dlt</a><p>Here’s our Colab demo: <a href="https://colab.research.google.com/drive/1DhaKW0tiSTHDCVmPjM-eoyL47BJ30xmP" rel="nofollow noreferrer">https://colab.research.google.com/drive/1DhaKW0tiSTHDCVmPjM-...</a><p>— — —<p>In the past we wrote hundreds of Python scripts to fit messy data sources into something that you can work with in Python - a database, Pandas frame or just a Python list. We were solving the same problems and making similar mistakes again and again.<p>This is why we built an easy to use Python library called dlt that will automate most data engineering tasks. It hides the complexities of data loading and automatically generates structured and clean datasets for immediate querying and sharing.<p>— — —<p>At its core, dlt removes the need to create the dataset schemas, react to changing data, generate append or merge statements, and move the data in a transactional and idempotent manner. Those things are automated and can be declared right in the Python code, just by decorating functions.<p>Add the @dlt.resource decorator, give it a few hints, and convert any data into a simple pipeline that creates and updates datasets.<p>dlt gets the details out of your way:<p>1. You do not need to worry about the structure of a database or parquet files<p>dlt will create a nice, typed schema out of your data and will migrate it when the data changes. You can put some data contracts and Pydantic models on top to keep your data clean.<p>2. 
You do not need to write any INSERT/UPDATE or data copy statements<p>dlt will push the data to DuckDB, Weaviate, storage buckets and many popular SQL stores. It will align the data types, file formats, and identifier names automatically<p>3. You do not need to worry when you need to add new data or apply changes.<p>dlt lets you declare how to load the data, how to increment it and will keep the loading state together so they are always in sync.<p>4. You keep the way you develop and test your code<p>Iterate and test quickly on your laptop or in a dev container. Run locally on DuckDB and just swap destination name to go to the cloud - your code, schema and data will stay the same.<p>5. You can work with data on your laptop.<p>Combine dlt with other tools and libraries to process data locally. duckdb, Pandas, Arrow tables and Rust based loading libraries like ConnectorX work nicely with dlt and process data blazingly fast, compared to the cloud.<p>6. You do not need to worry if your pipeline will work when you deploy it.<p>dlt is a minimalistic Python library, requires no backend and works wherever Python works. You can fine-tune it to work in constrained environments like AWS Lambda or run with Airflow, GitHub Actions or Dagster.<p>dlt has an Apache 2.0 license. We plan to make money by offering organizations a paid control plane, where dlt users can track and set policies for what every pipeline does, manage schemas and contracts across the organization, create data catalogues, and share them with team members and customers.
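
Points 1 and 2 above (a schema inferred from the data and migrated when the data changes, with no hand-written INSERT statements) can be illustrated with a standard-library sketch. This is not dlt's implementation and does not use its API. The `load` function, the `SQL_TYPES` mapping, and the use of SQLite as the destination are assumptions chosen to keep the example self-contained.

```python
import sqlite3

# Map Python value types to SQLite column types (an illustrative subset).
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


def load(conn: sqlite3.Connection, table: str, rows: list[dict]) -> None:
    """Create or widen `table` so every key in `rows` has a column, then insert.

    Table and column names are interpolated directly, so this sketch assumes
    they come from trusted input.
    """
    found = conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
    existing = {name for (name,) in found}
    for row in rows:
        if not row:
            continue  # nothing to insert for an empty record
        for column, value in row.items():
            if column in existing:
                continue
            sql_type = SQL_TYPES.get(type(value), "TEXT")
            if not existing:
                # First column ever seen: the table does not exist yet.
                conn.execute(f'CREATE TABLE "{table}" ("{column}" {sql_type})')
            else:
                # New field in the data: migrate the schema in place.
                conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" {sql_type}')
            existing.add(column)
        names = ", ".join(f'"{c}"' for c in row)
        marks = ", ".join("?" for _ in row)
        conn.execute(f'INSERT INTO "{table}" ({names}) VALUES ({marks})', list(row.values()))
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, "users", [{"id": 1, "name": "Ada"}])
load(conn, "users", [{"id": 2, "name": "Lin", "country": "PL"}])  # new field appears
print(conn.execute('SELECT * FROM "users" ORDER BY "id"').fetchall())
# [(1, 'Ada', None), (2, 'Lin', 'PL')]
```

The second call adds the `country` column on the fly, and the earlier row simply reads as NULL for it. A production loader would also need type widening, merge semantics, and load state tracking, which this sketch leaves out.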

Show HN: Fediverser Portal – Bring your subreddits to Lemmy

This is my attempt at helping those who are trying to ditch reddit but have not been satisfied with the content from Lemmy or haven't been able to find the corresponding communities.<p>There are two sides to this project. The first one is that I have set up a Lemmy instance (alien.top) which is mirroring some of the reddit content from subreddits that I wanted to follow <i>with the comments</i>. The difference from most mirroring bots is that, instead of one single bot account mirroring all content, the system creates one account for each reddit user that is being mirrored.<p>The <i>other</i> part of this idea, which I believe is more interesting: reddit users can <i>take over</i> their own mirrored bot account on this Lemmy instance. The instance itself does not use the regular registration process, but instead authenticates via Reddit OAuth. If you log in through the "Portal", it can then grab your subscribed subreddits and (when it can) find the corresponding Lemmy communities and subscribe you to those automatically. At the moment there are not that many Lemmy communities being mirrored because I've been the sole user, but hopefully if more people sign up, it will help to create network effects and more instance admins will be interested in hosting these "fediversed" communities.<p>All of the code is open source (<a href="https://github.com/mushroomlabs/fediverser">https://github.com/mushroomlabs/fediverser</a>) and I'm more than willing to help people set up their own instances if they don't want to use alien.top itself.<p>Questions and any type of feedback are always welcome!

Show HN: ArtistAssistApp – a web app to paint better with ease

Hey HN!<p>I want to show my new open-source project <i>ArtistAssistApp</i>.<p><i>ArtistAssistApp</i> - the web app to paint better with ease.<p>Tools for realistic color mixing based on real paints, tonal value drawing, simplified sketching, and more.<p>Import your own photos, select any desired color directly from the image, and learn how to mix it with your paints. The web app provides a step-by-step guide on how to precisely mix that color from your own paints, using either atomic or optical mixing. Atomic mixing is the physical mixing of colors together, while optical mixing is the result of placing a transparent layer of color over another color (glaze technique).<p>Save instructions on how to mix your favorite colors from the paints you have for quick reference.<p>Smooth your photo to reduce detail and focus on the big shapes and proportions of your subject, and learn how to simplify and abstract your paintings.<p>Use tonal value sketches that capture the light and shadow of your subject to learn how to create contrast and depth in your paintings.<p>Works on desktops, laptops, tablets and smartphones.<p>You can try it at <<a href="https://artistassistapp.com/" rel="nofollow noreferrer">https://artistassistapp.com/</a>>. No login or registration required.<p>The source code is available on GitHub <<a href="https://github.com/eugene-khyst/artistassistapp">https://github.com/eugene-khyst/artistassistapp</a>>.
