The best Hacker News stories from Show HN from the past day
Latest posts:
Show HN: Flutter_compositions: Vue-inspired reactive building blocks for Flutter
Show HN: TabPFN-2.5 – SOTA foundation model for tabular data
I am excited to announce the release of TabPFN-2.5, our tabular foundation model that now scales to datasets of up to 50,000 samples and 2,000 features - a 5x increase over TabPFN v2, published in Nature earlier this year. TabPFN-2.5 delivers state-of-the-art predictions in one forward pass, without hyperparameter tuning, across classification and regression tasks.

What's new in 2.5:
TabPFN-2.5 maintains the core approach of v2 - a pretrained transformer trained on more than a hundred million synthetic datasets to perform in-context learning and output a predictive distribution for the test data. It natively supports missing values, categorical features, text, and numerical features, and is robust to outliers and uninformative features.

The major improvements:

- 5x scale increase: now handles 50,000 samples × 2,000 features (up from 10,000 × 500 in v2)

- SOTA performance: TabPFN-2.5 outperforms tuned tree-based methods and matches the performance of a complex ensemble (AutoGluon 1.4) that itself includes TabPFN v2 and is tuned for 4 hours. Tuning the model improves performance further, outperforming AutoGluon 1.4 on regression tasks.

- Rebuilt API: a new REST interface plus a Python SDK with dedicated fit and predict endpoints, making deployment and integration more developer-friendly

- A distillation engine that converts TabPFN-2.5 into a compact MLP or tree ensemble while preserving accuracy and offering low-latency inference

There are still some limitations. The model is designed for datasets up to 50K samples. It can handle larger datasets, but that hasn't been our focus with TabPFN-2.5. The distillation engine is not yet available through the API, only through licenses (though we do show its performance in the model report).

We're actively working on removing these limitations and intend to release newer models focused on context reasoning, causal inference, graph networks, larger data, and time series.
TabPFN-2.5 is available via API and as a package on Hugging Face. Would love for you to try it and give us your feedback!

Model report: https://priorlabs.ai/technical-reports/tabpfn-2-5-model-report

Package: https://github.com/PriorLabs/TabPFN

Client: https://github.com/PriorLabs/tabpfn-client

Docs: https://docs.priorlabs.ai/quickstart
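For a sense of what "one forward pass, no hyperparameter tuning" looks like in code, here is a minimal sketch using the open-source tabpfn package's scikit-learn-style interface; constructor options and defaults may differ between versions, so treat it as an illustration rather than the canonical quickstart (see the docs link above).

```python
# Minimal sketch: TabPFN used like a scikit-learn estimator.
# Assumes `pip install tabpfn`; exact constructor arguments may vary by version.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()            # no hyperparameter tuning
clf.fit(X_train, y_train)           # "fit" stores the in-context training data
proba = clf.predict_proba(X_test)   # predictive distribution in one forward pass

print("ROC AUC:", roc_auc_score(y_test, proba[:, 1]))
```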
Show HN: See chords as flags – Visual harmony of top composers on musescore
I designed a relative, piano-roll-based music notation. I used 12 colors arranged in a specific way to make visible the main effects and oppositions of Western tonal harmony. The tonic is always white, so a manual annotation/interpretation is required for each MIDI file.

All chords are flags of three to four colors. Minor mode is darker, major mode is lighter. Colors are arranged in thirds.

I sorted the pieces from simple to complex harmony. I also wrote a bit of text to explain what you may see. There's also a corpus of structures: hyperlinks of tags that let you find similar patterns throughout my corpus of 3,000+ popular pieces.

My method makes chord progressions memorizable and instantly visible in the scores. No preparation of Roman numeral analysis or chord-symbol analysis is required. After a bit of training, the chords will stare you right in the eyes.

It's not synesthesia; it's a missing script for tonal music which makes harmonically identical things look the same (or similar).

I've also recorded lectures on my method in Russian (https://www.youtube.com/playlist?list=PLzQrZe3EemP5pVPYMwBJGtiejiN3qtCce). I'm sorry I haven't yet found time to re-record them in English.

I've also sketched a friendlier intro: https://vpavlenko.github.io/d/

Sorry, but this thing won't make any sense if you're color-blind.

It's open-source: https://github.com/vpavlenko/rawl

Earlier context: https://news.ycombinator.com/item?id=39165596

(Back then the colors were less logical, and there was no corpus of 3,000+ annotated pieces yet)
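To illustrate the "relative" part of the notation, here is a tiny sketch that maps a MIDI note to a pitch class relative to the annotated tonic and looks up one of 12 colors. The palette below is hypothetical (the actual colors and their arrangement in thirds live in the rawl repo); only the tonic-relative indexing idea is taken from the description above.

```python
# Hypothetical sketch of tonic-relative coloring; the real palette is in the rawl repo.
# Index 0 is the tonic and is always white; the other 11 entries are placeholders.
PALETTE = [
    "#ffffff",  # 0: tonic (white)
    "#777777", "#cc3333", "#33cc33", "#3333cc", "#cccc33", "#cc33cc",
    "#33cccc", "#aa5500", "#55aa00", "#0055aa", "#aa0055",
]

def note_color(midi_note: int, tonic_pitch_class: int) -> str:
    """Return the color for a note, measured relative to the manually annotated tonic."""
    relative_pc = (midi_note - tonic_pitch_class) % 12
    return PALETTE[relative_pc]

# Example: with C as tonic (pitch class 0), E (MIDI 64) is 4 semitones above -> PALETTE[4].
print(note_color(64, 0))
```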
Show HN: qqqa – A fast, stateless LLM-powered assistant for your shell
I built qqqa as an open-source project because I was tired of bouncing between the shell and ChatGPT / the browser for rather simple commands. It comes with two binaries: qq and qa.

qq means "quick question" - it is read-only, perfect for the commands I always forget.

qa means "quick agent" - it is qq's sibling that can run things, but only after showing its plan and getting approval from the user.

It is built entirely around the Unix philosophy of focused tools, stateless by default - pretty much the opposite of what most coding agents focus on.

Personally I've had the best experience using Groq + gpt-oss-20b, as it feels almost instant (up to 1k tokens/s according to Groq) - but any OpenAI-compatible API will do.

Curious if the HN crowd finds it useful - and of course, AMA.
Show HN: I scraped 3B Goodreads reviews to train a better recommendation model
Hi everyone,

For the past couple of months I've been working on a website with two main features:

- https://book.sv - put in a list of books and get recommendations on what to read next from a model trained on over a billion reviews

- https://book.sv/intersect - put in a list of books and find the users on Goodreads who have read them all (if you don't want to be included in these results, you can opt out here: https://book.sv/remove-my-data)

Technical info is available here: https://book.sv/how-it-works

Note 1: If you only provide one or two books, the model doesn't have a lot to work with and may include a handful of somewhat unrelated popular books in the results. If you want recommendations based on just one book, click the "Similar" button next to the book after adding it to the input book list on the recommendations page.

Note 2: This is uncommon, but if you get an unexpectedly non-English-titled book in the results, it is probably not a mistake and it very likely has an English edition. The "canonical" edition of a book I use for display is whichever one is the most popular, which is usually the English version, but this is not the case for all books, especially those by famous French or Russian authors.
Show HN: Zee – AI that interviews everyone so you only meet the best
Hey HN, I'm Dave, one of the co-founders of Zeeda.

After scaling my last company (Voxpopme) from 0 to $10M ARR and hiring 100+ people, I lived through the worst part of hiring: posting a role on LinkedIn, getting 200+ applications, and drowning in resumes where 80% are completely unqualified.

So we built Zee - an AI that conducts actual 20-30 minute first-round interviews (not just screening) with every applicant, then gives you a shortlist of the 5-10% actually worth your time.

We're also solving stakeholder misalignment: before Zee interviews any applicants, he talks to the key stakeholders about the role to go beyond basic job descriptions.

Would love HN's feedback, especially from fellow founders who've felt this pain.
Show HN: I was in a boring meeting so I made an encyclopedia
Show HN: I got fired so I built a bank statement converter
I recently got fired and decided to channel my energy into something productive. Over two weeks of 16-hour days, I built a tool that converts Australian bank statement PDFs into clean, reliable CSVs, tailored specifically for Aussie banks.

Most Aussie banks only provide statements as a PDF, and generic converters often fail: columns drift, multi-line descriptions break parsing, headers shift. Existing tools don't handle this well, and I wanted a tool that just works.

To get started, I used my own bank statements to build the initial parsers. There was a "duh" moment when I realised how hard it is to get more realistic test data: people don't just hand over their financial ledgers. This solidified my core principle: trust and privacy had to be the absolute top priority.

I initially tried building everything client-side in JavaScript for maximum privacy, but performance and reliability were poor, and exposing the parsers on the front end would have made them easy to copy.

I settled on a middle ground: a Python and FastAPI backend on Google Cloud Run. This lets me balance reliability with a strict privacy architecture. Files are processed in real time and the temp file is deleted immediately after the request completes. There is no persistent storage and no logging of request bodies.

My technical approach is straightforward and focused on reliability (a rough sketch appears at the end of this post):

- I use pdfplumber to extract text, avoiding complex and error-prone OCR.

- I apply a set of bank-specific regex patterns to pinpoint dates, amounts, and descriptions.

- A lookahead heuristic correctly merges multi-line transactions. Each parser is customised to its bank's unique PDF layout quirks.

The project is deliberately focused. Instead of supporting hundreds of banks with mediocre results, I'm concentrating on a small set to get them right. It currently supports CommBank, Westpac, UBank, and ING, with ANZ and NAB next. The whole thing is deployed on Cloudflare Pages and outputs clean CSVs ready for Excel, Google Sheets, Xero, or MYOB.

It was a fun challenge in reverse-engineering messy, real-world data.

Try it out here: https://aussiebankstatements.com

I'd love to hear feedback. If it breaks on your statement, a redacted sample would be a huge help for improving the parser.

I'm also curious to hear how others here have tackled similar messy data extraction challenges.
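As a rough sketch of the pipeline described above (pdfplumber text extraction, a bank-specific regex, and a heuristic that merges multi-line descriptions): the regex and column layout below are hypothetical examples, not the actual parser for any of the supported banks.

```python
# Sketch only: extract "date  description  amount" rows from a statement PDF
# and merge continuation lines into the previous transaction's description.
import csv
import re

import pdfplumber

# Hypothetical pattern: lines of the form "DD/MM/YYYY  description ...  -$1,234.56"
TXN_RE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s+(.+?)\s+(-?\$?[\d,]+\.\d{2})$")


def parse_statement(pdf_path: str, csv_path: str) -> None:
    lines = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            text = page.extract_text() or ""
            lines.extend(text.splitlines())

    rows = []
    for line in lines:
        m = TXN_RE.match(line.strip())
        if m:
            date, desc, amount = m.groups()
            rows.append([date, desc, amount.replace("$", "").replace(",", "")])
        elif rows:
            # Continuation heuristic: a line that doesn't start a new transaction
            # is appended to the previous transaction's description.
            rows[-1][1] += " " + line.strip()

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "description", "amount"])
        writer.writerows(rows)
```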