The best Hacker News stories from Show from the past day
Latest posts:
Show HN: Speeding up LLM inference 2x times (possibly)
Here's a project I've been working on for the last few months.<p>It's a new (I think) algorithm that lets you adjust smoothly, and in real time, how many calculations you'd like to do during LLM inference.<p>It seems that it's possible to do just 20-25% of the weight multiplications instead of all of them, and still get good inference results.<p>I implemented it to run on M1/M2/M3 GPUs. The matrix multiplication approximation itself can be pushed to run 2x faster before the quality of the output collapses.<p>The overall inference speed is only a bit faster than Llama.cpp's, because the rest of the implementation could be better, but with further development I think it can become a new method to speed up inference, in addition to quantization.<p>You could call it ad-hoc model distillation :)<p>You can change the speed / accuracy of a model at will, in real time.<p>Oh, and as a side effect, the data format also lets you choose how much of the model you want to load into memory. You can decide to skip, say, 10%, 20%, or 40% of the least important weights.<p>It's implemented for Mistral, and it was also lightly tested on Mixtral and Llama. It's FP16-only for now, but Q8 is in the works.<p>The algorithm is described here, and the implementation is open source.<p><a href="https://kolinko.github.io/effort/" rel="nofollow">https://kolinko.github.io/effort/</a><p>I know these are bold claims, but I hope they survive scrutiny :)
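To make the "do only a fraction of the multiplications" idea concrete, here is a minimal sketch in plain Python. It is not the author's algorithm: the function name `approx_matvec` and the `effort` parameter are made up for illustration, and this naive version computes every product just to rank them, so it saves no work. It only shows how the output changes when the smallest-magnitude products are dropped; the linked write-up is the place to look for how the real implementation chooses what to skip.

```python
def approx_matvec(weights, vec, effort):
    """Approximate vec @ weights using only the largest-magnitude products.

    weights: list of rows; weights[i][j] multiplies vec[i] into output j.
    effort:  fraction (0..1] of individual multiplications to keep
             (hypothetical parameter, named here for illustration only).
    """
    rows, cols = len(weights), len(weights[0])

    # Score every (i, j) pair by the magnitude of its contribution.
    # NOTE: this already performs all multiplications, so it is a
    # demonstration of the accuracy trade-off, not of the speedup.
    products = [
        (abs(vec[i] * weights[i][j]), i, j)
        for i in range(rows)
        for j in range(cols)
    ]
    products.sort(reverse=True)

    # Keep at least one product, otherwise the requested fraction.
    keep = max(1, round(effort * len(products)))

    out = [0.0] * cols
    for _, i, j in products[:keep]:
        out[j] += vec[i] * weights[i][j]
    return out


if __name__ == "__main__":
    toy_weights = [[1.0, 0.0], [0.0, 2.0]]
    toy_vec = [1.0, 1.0]
    print(approx_matvec(toy_weights, toy_vec, effort=1.0))
    print(approx_matvec(toy_weights, toy_vec, effort=0.25))
```

With `effort=1.0` the result is the exact product; lowering it drops the smallest contributions first, which is the trade-off the post describes tuning in real time.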
Show HN: Search HN for interesting comment sections
I built this tool to help me find interesting discussions on Hacker News. I love reading HN discussions almost more than the articles themselves. However, I found that full text search, although highly performant, is not always good at surfacing interesting discussions on a certain topic -- especially if you don't know what to search for exactly.<p>I built this by scraping the most recent ~6 million posts (that's about 2 years of history) and putting the resulting posts and their vector embeddings into Postgres.<p>Let me know what could be improved, and if you'd like a more detailed writeup of how this was built :)
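As a rough illustration of how embedding-based search differs from full-text search, the sketch below ranks posts by cosine similarity between a query vector and each post's vector. The post does not say which similarity measure, embedding model, or Postgres setup is used, so everything here (the `search` helper, the toy two-dimensional vectors, ranking in memory instead of in the database) is an assumption made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def search(query_vec, posts, top_k=3):
    """Return the titles of the top_k posts closest to query_vec.

    posts: list of (title, embedding) pairs (hypothetical in-memory
    stand-in for rows stored in a database).
    """
    ranked = sorted(
        posts,
        key=lambda post: cosine_similarity(query_vec, post[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:top_k]]


if __name__ == "__main__":
    toy_posts = [
        ("thread about GPUs", [1.0, 0.0]),
        ("thread about gardening", [0.0, 1.0]),
        ("thread about GPU-heated greenhouses", [1.0, 1.0]),
    ]
    print(search([1.0, 0.0], toy_posts, top_k=2))
```

Because the ranking depends on vector closeness instead of exact word matches, a query can surface discussions that never use the query's wording, which is the gap in full-text search the post points out.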
Show HN: YouTube Shorts Redirector
I am neurodivergent and noticed the YouTube Shorts format was hacking my brain into engaging longer than I wanted. I wrote this quick extension to get my time back. If you have suggestions for improvement, I'm all ears. Thank you :)
Show HN: a Rust based CLI tool 'imgcatr' for displaying images
cat for images, written in Rust
Show HN: Render audio waveforms to HTML canvas using WebGPU
Hey HN. I built this quick and dirty component to render audio waveforms using WebGPU. I just published it to NPM.<p>It's my first time using WebGPU and it's been a while since I wrote shaders. Feedback is very welcome!<p>GitHub: <a href="https://github.com/mrkev/webgpu-waveform">https://github.com/mrkev/webgpu-waveform</a>
Examples: <a href="https://aykev.dev/webgpu-waveform" rel="nofollow">https://aykev.dev/webgpu-waveform</a>
Show HN: Term Typer – Learn a language by typing
Hey HN! I'm from Brazil and I created Term Typer to help my little brother learn other languages while practicing his keyboard typing skills. We've found it super helpful and fun. Feel free to try it out and let me know your thoughts and feedback. Thanks a lot!
Show HN: Solo founder launched iOS Development Agency as a Subscription
Hi there! I recently launched my iOS development agency with a subscription model, inspired by DesignJoy founder Brett Williams. My agency is among the first in the US (or in Texas, or in Dallas ;)) to offer iOS development services on a month-to-month basis. The subscription is flexible, allowing for pauses or cancellations at any time. With 10 years of experience in software development and success as a consultant for Fortune 500 companies, I aim to leverage my expertise to help startups and other companies elevate their iOS development.
Show HN: I just made an All-In-One IP Toolbox form builder open-sourced
Show HN: Semantic Search React Component
If you've ever used CTRL+F on websites or documentation, you'll love this functionality.
Show HN: I made a tool which fixes broken JSONs
Show HN: Docker-boot – Run a system from RAM without LiveCD
How often do you screw up the system so badly that you have to reformat the disk (without losing data) to fix it? Well, sometimes I do, and sometimes I can't be bothered to burn a live ISO onto a USB stick. There's initramfs, but it's hardly a pleasant environment, with network configuration and all.<p>My go-to solution has typically been to create a chroot with busybox and a few utilities in /tmp, chroot into it, and then kill services that use the solid drive so that I can unmount it. That's an error-prone process, and sometimes systemd itself uses the disk, so you can't unmount the drive despite killing all the userland but PID 1.<p>This script improves the UX. It uses a Docker image as the chroot base, which is much easier to tailor to your needs, and automagically commits all the atrocities, such as tearing down all the userland processes, including PID 1, and re-spawning the host system from the container filesystem.<p>It also drives libostree and Nix users mad, because it can be used to try out a new DE or even a whole OS without polluting the host filesystem or spawning a virtual machine. The video in the README shows me trying out KDE + SDDM from a host running GNOME + GDM3.
Show HN: Purl – A Simple Tool for Text Processing
Hello HN community,<p>I'm excited to share a new command-line tool I developed called purl, inspired by the simplicity of Perl one-liners for efficient text processing. Purl offers Perl-like regex to simplify text manipulation, is cross-platform (it works equally well on macOS, Linux, etc.), and is quick and easy to install. The tool also supports simple commands such as -replace, -filter, and -exclude, and offers optional color output to enhance readability.<p>Purl is a practical alternative to traditional tools like sed and grep, designed to address some of their common limitations.<p>For more information and to try purl yourself, visit: <a href="https://github.com/catatsuy/purl">https://github.com/catatsuy/purl</a><p>I appreciate any feedback!
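For readers unfamiliar with this style of tool, the sketch below shows what replace, filter, and exclude operations amount to when applied line by line with a regex. It is written in Python and is not purl's implementation or command syntax; the function names are hypothetical, and Python's `re` dialect differs in places from Perl's, so consult the repository for the tool's actual usage.

```python
import re


def filter_lines(lines, pattern):
    """Keep only the lines that match the pattern (grep-like)."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


def exclude_lines(lines, pattern):
    """Drop the lines that match the pattern (inverse of filter)."""
    rx = re.compile(pattern)
    return [line for line in lines if not rx.search(line)]


def replace_in_lines(lines, pattern, replacement):
    """Substitute every match of the pattern in each line (sed-like)."""
    rx = re.compile(pattern)
    return [rx.sub(replacement, line) for line in lines]


if __name__ == "__main__":
    log = ["error: disk full", "info: ok", "error: timeout"]
    print(filter_lines(log, r"^error"))
    print(exclude_lines(log, r"^error"))
    print(replace_in_lines(log, r"error", "ERR"))
```

Each helper is a single pass over the input, which is why the three operations compose naturally in a pipeline.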