The best Hacker News stories from Show from the past day
Latest posts:
Show HN: Pykoi – a Python library for LLM data collection and fine tuning
Hi HN,<p>pykoi is an open-source Python library for ML scientists. pykoi makes it easier to collect data for LLMs, to use that data for finetuning, and to compare models to each other (e.g. your model pre- and post-finetuning, or your model vs OpenAI vs Claude). The library comes from pain points we experienced in LLM development:<p>1. Collecting feedback data from users isn't as easy as it could be. (The current process usually involves sharing Excel files of annotated responses back and forth, offering no insight into how users actually engage with your models.)<p>2. RLHF remains complicated to carry out. By <i>complicated</i>, we mean it requires a lot of steps, hundreds of configs, lengthy setups, etc.<p>3. Comparing models to each other <i>as they're used</i> (that is, independent of academic metrics) is full of friction. The current approach: spin up a model, ask questions, write down the answers. Repeat for other models, then compare.<p>At a high level, we think that the active learning process should be closed-loop: data collection, finetuning, and inference all feed from the same system. This library is our first step in that direction.<p>The project is still very early, but we hope that some of it is useful. Note, we're fully open-source and actively adding features!<p>Website: <a href="https://www.cambioml.com/pykoi">https://www.cambioml.com/pykoi</a>
GitHub: <a href="https://github.com/CambioML/pykoi">https://github.com/CambioML/pykoi</a><p>We would love your feedback!
Show HN: Obl.ong, Free, quality domains for all
Show HN: Pip Imports in Deno
deno_python 0.3.1 adds support for importing Python pip packages directly in JavaScript! Fun and useful, slightly cursed.
Show HN: Tetris, but the blocks are ARM instructions that execute in the browser
OFRAK Tetris is a project I started at work about two weeks ago. It's a web-based game that works on desktop and mobile. I made it for my company to bring to events like DEF CON, and to promote our binary analysis and patching framework called OFRAK.<p>In the game, 32-bit, little-endian ARM assembly instructions fall, and you can modify the operands before executing them on a CPU emulator. There are two segments mapped – one for instructions, and one for data (though both have read, write, and execute permissions). Your score is a four byte signed integer stored at the virtual address pointed to by the R12 register, and the goal is to use the instructions that fall to make the score value in memory as high as possible. When it's game over, you can download your game as an ELF to relive the glory in GDB on your favorite ARM device.<p>The CPU emulator is a version of Unicorn (<a href="https://www.unicorn-engine.org/" rel="nofollow noreferrer">https://www.unicorn-engine.org/</a>) that has been cross-compiled to WebAssembly (<a href="https://alexaltea.github.io/unicorn.js/" rel="nofollow noreferrer">https://alexaltea.github.io/unicorn.js/</a>), so everything on the page runs in the browser without the need for any complicated infrastructure on the back end.<p>Since I've only been working on this for a short period of time leading up to its debut at DEF CON, there are still many more features I'd eventually like to implement. These include adding support for other ISAs besides ARM, adding an instruction reference manual, and lots of little cleanups, bug fixes, and adjustments.<p>My highest score is 509,644,979, but my average is about 131,378.<p>I look forward to feedback, bug reports, feature requests, and strategy discussions!
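The scoring rule described above (a four-byte signed integer, little-endian, stored at the address held in R12) can be sketched in a few lines of Python. This is only an illustration and not the game's code: the flat `memory` buffer, the `base` address of the data segment, and the function name `read_score` are all assumptions made for the example.

```python
import struct


def read_score(memory: bytes, r12: int, base: int = 0) -> int:
    """Decode the 4-byte signed little-endian integer that R12 points to.

    `memory` is a hypothetical snapshot of the data segment and `base` is the
    (assumed) virtual address at which that segment is mapped.
    """
    offset = r12 - base
    if offset < 0 or offset + 4 > len(memory):
        raise ValueError("R12 points outside the mapped data segment")
    # "<i" means little-endian, signed, 32 bits.
    (score,) = struct.unpack_from("<i", memory, offset)
    return score


# The high score quoted in the post, 509,644,979, is 0x1E6090B3,
# which is laid out in memory as the bytes B3 90 60 1E.
segment = bytes.fromhex("b390601e")
print(read_score(segment, r12=0x1000, base=0x1000))
```

Because the value is signed, the largest score that fits is 2,147,483,647; one increment past that wraps around to a negative number.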
Show HN: Hacker News home page spoof
i'd forgotten i'd written this a year and a half ago, and when a friend passed me the link to it just now, it seemed hilarious to me. probably some other people will enjoy it too
Show HN: Retake – Open-Source Hybrid Search for Postgres
Hey HN! We're Phil and Ming, co-founders of Retake (<a href="https://github.com/getretake/retake">https://github.com/getretake/retake</a>). Retake is an open-source tool that adds keyword and semantic (i.e., hybrid) search to databases. We’ve started by extending the capabilities of Postgres with an SDK for lightning-fast queries.<p>We built Retake to fix two issues: keeping vectors in sync with Postgres in real time is difficult, and most vector databases aren’t built for hybrid search.<p>A quick refresher: “keyword search” refers to a technique where results are scored based on the appearance of exact words or terms. “Semantic search” uses vector embeddings to understand the meaning behind those words. Hybrid search combines these two approaches to enhance the precision and relevance of results.<p>To implement semantic or hybrid search today, most organizations run batch jobs that update their search engine or vector database using ETL tools or custom data pipelines. We’ve seen firsthand how time-consuming and costly this can be, as moving vectors often requires re-embedding the entire data source.<p>We’ve also seen how many vector databases lack crucial features of “traditional” search: keyword-based (BM25) search, faceting/aggregations, highlighting, efficient filtering, etc.<p>Here’s how Retake works: our core is built on top of OpenSearch, which acts as a search engine and vector database. We leverage logical-replication-based Change Data Capture (CDC) to stay in sync with Postgres, so documents and vectors are updated incrementally and in real time. Finally, Python and TypeScript SDKs make it easy to integrate Retake into your application. There’s no need to manage separate vector databases and search engines, upload and embed documents, or run expensive reindexing jobs.
All you need to think about is writing search queries.<p>The easiest way to get started with Retake is by running our Docker Compose stack:<p><pre><code> git clone https://github.com/getretake/retake.git
cd retake/docker && docker compose up
</code></pre>
Retake is Apache licensed and our repo is here: <a href="https://github.com/getretake/retake">https://github.com/getretake/retake</a>. For next steps, see our quick start guide: <a href="https://docs.getretake.com/quickstart">https://docs.getretake.com/quickstart</a><p>We’d love your feedback on our solution to hybrid search. Our focus right now is on nailing the basics, but we’d also love to hear what you think we should focus on next.
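To make the "hybrid search combines these two approaches" idea concrete, here is a minimal Python sketch of one common way to merge a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). The post does not say how Retake combines results, so treat the choice of RRF, the function name, and the constant `k=60` as assumptions made for illustration, not as Retake's implementation or API.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Break ties by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from a keyword (BM25-style) query,
# one from a vector-similarity query over the same documents.
keyword_hits = ["a", "b", "c"]
semantic_hits = ["b", "d", "a"]
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Documents that appear near the top of both lists ("b" and "a" here) rise above documents found by only one method, which is the intuition behind hybrid ranking.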
Show HN: Bubblic – end loneliness together using the power of your voice
We have received over 1,000 voice messages from users of our platform.<p>We take privacy seriously, so all data are anonymized and are not sold to anyone.<p>So far, we've had a user who said that 'had it not been for Bubblic, I might not be here today'. This gives us so much drive to carry on with our project!<p>We'd appreciate any feedback you have :)
Show HN: Applite – Clean Homebrew front end app for macOS built with SwiftUI