The best Hacker News stories from Show from the past week

Latest posts:

Show HN: Extract an RSS feed from almost anything

Howdy! RSSfeedASAP scratches my own itch. I run a regional podcasting directory which gets dozens of messy podcast submissions. Often they don't even include an XML file, and being a good Samaritan, I sometimes do the manual work and find it myself. I got tired of that manual work and decided to build a microapp.<p>RSSfeedASAP is that app, and I decided to release it in case someone else finds it useful.

Show HN: Menu Bar Calendar on macOS

Show HN: XRss – RSS Reader and web stack demo

XRss is a simple RSS reader web app built to showcase xtemplate, a new web development tool based on Go's html/template and Caddy server.<p>The entire site UI for XRss comes from <i>a single HTML template file</i>. This index.html includes everything from SQL queries and route definitions and handlers to htmx state transition attributes and tailwindcss classes, and developing it requires <i>zero build steps</i> (amortized).<p>Check out the source, which manages to be at once banal and gnarly: <a href="https://github.com/infogulch/xrss/blob/master/templates/index.html">https://github.com/infogulch/xrss/blob/master/templates/inde...</a><p>xtemplate preloads the whole template structure into memory and builds the router at startup, so responses to matching requests are rendered after a single lookup. Combined with direct queries to SQLite, this makes for a very snappy experience, typically responding in less than 5ms. (Fingers crossed.)<p>There are various places where XRss could be improved (PRs welcome!), but it already delivers on its purpose of demonstrating the plausibility of xtemplate. See the xtemplate readme for an overview of what you can do with it. I think of it as 'PHP but the syntax looks like Go templates'.<p><a href="https://github.com/infogulch/caddy-xtemplate">https://github.com/infogulch/caddy-xtemplate</a><p>Let me know what you think! Does remaking PHP from scratch out of Go templates make me a lunatic? (yes) Is it a good idea anyway? (yes) What kind of web application do you think would be a good fit for a platform like this?
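
The "preload everything, then serve with a single lookup" idea can be illustrated with a minimal sketch. This is not xtemplate's code (xtemplate is built on Go's html/template); it is a Python analogy, and `TEMPLATE_SOURCES`, `build_router`, and `handle` are hypothetical names chosen for illustration.

```python
from string import Template

# Hypothetical route table. In xtemplate the route definitions live inside
# the template files themselves; here they are plain dictionary keys.
TEMPLATE_SOURCES = {
    "GET /": "<h1>$title</h1>",
    "GET /feeds": "<ul><li>$feed</li></ul>",
}

def build_router(sources):
    """Parse every template once at startup and index it by 'METHOD path'."""
    return {route: Template(text) for route, text in sources.items()}

# Built once, before any request is served.
ROUTER = build_router(TEMPLATE_SOURCES)

def handle(method, path, context):
    """Serve a request with a single dictionary lookup, then render."""
    template = ROUTER.get(f"{method} {path}")
    if template is None:
        return 404, "not found"
    return 200, template.substitute(context)
```

Because all parsing happens in `build_router`, the per-request cost in this sketch is one hash lookup plus the render itself.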

Show HN: Shaq, a CLI for Shazam

I mirrored all the code from PyPI to GitHub and analysed it

This is a side project I've been working on for the last few months. I built an automated system to continuously mirror all the code on PyPI to a series of GitHub repositories. Mirroring PyPI code to GitHub enables:<p>1. Scanning of all new Python packages for accidentally published credentials<p>2. A browsable/searchable index of published code with a nice UI<p>3. Large-scale analysis of <i>all</i> published code to see how the language is evolving<p>Using this project, anyone can download the contents of PyPI to their personal machine and analyse every piece of code ever published in a matter of hours.<p>I hope it enables people to do things with the world's largest and oldest corpus of Python code that weren't possible before, and while this is likely totally useless to most people, I think that is kind of cool and unique.
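
As a rough illustration of the first use case, a credential scan can be sketched as a regex pass over file contents. The patterns and the `scan_text` helper are assumptions made for illustration, not the project's actual scanner; a real scanner would need many more patterns and some way to suppress false positives.

```python
import re

# Hypothetical patterns for illustration only.
PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM private key headers.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}

def scan_text(text):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Running `scan_text` over each file of a newly published package would flag candidate leaks for review.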

ChangeDetection, monitor any website change

Show HN: I automated half of my typing

I've been using this for about a year now. I parsed 6 months of my Slack messages, found the most common phrases I use, and generated keyboard shortcuts for them.
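
One way to approximate this workflow is to count word n-grams across messages and derive an abbreviation for each frequent phrase. The sketch below is an assumption about how such a tool could work, not the author's implementation; `common_phrases`, `make_shortcuts`, and the `;`-prefixed abbreviation scheme are hypothetical.

```python
import re
from collections import Counter

def common_phrases(messages, n=3, top=5):
    """Count word n-grams across messages and return the most frequent ones."""
    counts = Counter()
    for message in messages:
        words = re.findall(r"[a-z']+", message.lower())
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top)

def make_shortcuts(phrases):
    """Map an abbreviation (';' plus each word's first letter) to its phrase."""
    shortcuts = {}
    for phrase, _count in phrases:
        key = ";" + "".join(word[0] for word in phrase.split())
        # On a collision, keep the phrase that was ranked first.
        shortcuts.setdefault(key, phrase)
    return shortcuts
```

The resulting mapping could then be loaded into any text-expansion tool.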

Show HN: Advanced Tab Manager for Firefox

Show HN: Langfuse – Open-source observability and analytics for LLM apps

Hi HN! Langfuse is OSS observability and analytics for LLM applications (repo: <a href="https://github.com/langfuse/langfuse">https://github.com/langfuse/langfuse</a>, 2 min demo: <a href="https://langfuse.com/video">https://langfuse.com/video</a>, try it yourself: <a href="https://langfuse.com/demo">https://langfuse.com/demo</a>)<p>Langfuse makes capturing and viewing LLM calls (execution traces) a breeze. On top of this data, you can analyze the quality, cost and latency of LLM apps.<p>When GPT-4 dropped, we started building LLM apps – a lot of them! [1, 2] But they all suffered from the same issue: it’s hard to assure quality in 100% of cases and even to have a clear view of user behavior. Initially, we logged all prompts/completions to our production database to understand what works and what doesn’t. We soon realized we needed more context, more data and better analytics to sustainably improve our apps. So we started building a homegrown tool.<p>Our first task was to track and view what is going on in production: what user input is provided, how prompt templates or vector db requests work, and which steps of an LLM chain fail. We built async SDKs and a slick frontend to render chains in a nested way. It’s a good way to look at LLM logic ‘natively’. Then we added some basic analytics to understand token usage and quality over time for the entire project or single users (pre-built dashboards).<p>Under the hood, we use the T3 stack (Typescript, NextJs, Prisma, tRPC, Tailwind, NextAuth), which allows us to move fast + it means it's easy to contribute to our repo. The SDKs are heavily influenced by the design of the PostHog SDKs [3] for stable implementations of async network requests. It was a surprisingly inconvenient experience to convert OpenAPI specs to boilerplate Python code and we ended up using Fern [4] here. 
We’re fans of Tailwind + shadcn/ui + tremor.so for speed and flexibility in building tables and dashboards fast.<p>Our SDKs run fully asynchronously and make network requests in the background. We did our best to reduce any impact on application performance to a minimum. We never block the main execution path.<p>We've made two engineering decisions we've felt uncertain about: using a Postgres database and using Looker Studio for the analytics MVP. Supabase performs well at our scale and integrates seamlessly into our tech stack. We will need to move to an OLAP database soon and are debating if we need to start batching ingestion and if we can keep using Vercel. Any experience you could share would be helpful!<p>Integrating Looker Studio got us to our first analytics charts in half a day. As it is not open-source and does not work with our UI/UX, we are looking to switch it out for an OSS solution to flexibly generate charts and dashboards. We’ve had a look at Lightdash and would be happy to hear your thoughts.<p>We’re borrowing our OSS business model from PostHog/Supabase, who make it easy to self-host with features reserved for enterprise (no plans yet) and a paid version for managed cloud service. Right now all of our code is available under a permissive license (MIT).<p>Next, we’re going deep on analytics. For quality specifically, we will build out model-based evaluations and labeling to be able to cluster traces by scores and use cases.<p>Looking forward to hearing your thoughts and discussion – we’ll be in the comments. Thanks!<p>[1] <a href="https://learn-from-ai.com/" rel="nofollow noreferrer">https://learn-from-ai.com/</a><p>[2] <a href="https://www.loom.com/share/5c044ca77be44ff7821967834dd70cba" rel="nofollow noreferrer">https://www.loom.com/share/5c044ca77be44ff7821967834dd70cba</a><p>[3] <a href="https://posthog.com/docs/libraries">https://posthog.com/docs/libraries</a><p>[4] <a href="https://buildwithfern.com/">https://buildwithfern.com/</a>
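
The "never block the main execution path" behaviour described in the post can be sketched with a bounded queue and a worker thread. This is not the Langfuse SDK; `BackgroundSender`, `capture`, `flush`, and the `send` callback are hypothetical names, and a thread-based queue is only one of several ways to get this behaviour.

```python
import queue
import threading

class BackgroundSender:
    """Queue events and deliver them on a worker thread (hypothetical sketch)."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # callable that performs the actual network request
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def capture(self, event):
        """Enqueue without blocking; drop the event if the queue is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception:
                pass  # delivery errors must never reach the application
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until everything queued so far has been handed to `send`."""
        self._queue.join()
```

The application calls `capture` on its hot path and `flush` only at shutdown. The trade-off in this sketch is that events are dropped, not retried, when the queue is full or delivery fails.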

Show HN: Going into freshman year, figured I should build an interpreter

Hi all!<p>I'm going into my freshman year, and figured that the best way to prepare for the intro to programming Racket course would be to implement my own garbage-collected, dynamically typed, functional programming language in C ;)<p>Anyways... here's the repo: https://github.com/liam-ilan/crumb<p>I started learning C over the summer, so I still have a whole lot to learn... Any feedback would be greatly appreciated! :D

Show HN: Release AI – Talk to Your Infrastructure

Hello, Hacker News! I'm David, cofounder of Release (YC W20). Introducing Release AI, a tool designed to empower users with instant access to DevOps expertise, all without monopolizing the valuable time of our experts. Developed with the developer and engineer community in mind, Release AI takes the power of OpenAI's cutting-edge GPT-4 public LLM and augments it with DevOps knowledge.<p>In its initial phase, Release AI offers "read-only" access to both AWS and Kubernetes. This means you can engage in insightful conversations with your AWS account and K8s infrastructure effortlessly. Looking ahead, our roadmap includes plans to integrate more tools for commonly used systems. This will enable you to automate an even broader array of your daily tasks.<p>If you would like more info, you can check out our YC launch (it has more details and screencasts): <a href="https://www.ycombinator.com/launches/JI1-release-ai-talk-to-your-infrastructure">https://www.ycombinator.com/launches/JI1-release-ai-talk-to-...</a><p>Our quickstart guide: <a href="https://docs.release.com/release-ai/quickstart">https://docs.release.com/release-ai/quickstart</a><p>Sign up and use it: <a href="https://beta.release.com/ai/register">https://beta.release.com/ai/register</a><p>Please give it a try! We would love your feedback as we enhance Release AI. Reach out to us with any feature requests or crazy ideas for things Release AI could do for you. Feel free to email me at david@release.com or leave a comment; I'm looking forward to chatting with you.<p>Join the conversation in our Slack community and discover the future of DevOps with Release AI!

Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B

Hi HN,<p>We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.<p>The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.<p>- CodeLlama-34B achieved 48.8% pass@1 on HumanEval<p>- CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval<p>We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used — both models underwent a native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.<p>Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples.<p>The methodology is:<p>- For each evaluation example, we randomly sampled three substrings of 50 characters or used the entire example if it was fewer than 50 characters.<p>- A match was identified if any sampled substring was a substring of the processed training example.<p>For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report.<p>Presented below are the pass@1 scores we achieved with our fine-tuned models:<p>- Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval<p>- Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval<p>Note on GPT-4<p>According to the official technical report in March, OpenAI reported a pass@1 score of 67% for GPT-4's performance on HumanEval. Since then, there have been claims reporting higher scores. 
However, it's essential to note that there hasn't been any concrete evidence pointing towards an enhancement in the model's coding abilities since then. It's also crucial to highlight that these elevated figures lack the rigorous contamination analysis that the official statistic underwent, making them less of a reliable comparison. As a result, we consider 67% as the pass@1 score for GPT-4.<p>Download<p>We are releasing both models on Huggingface for verifiability and to bolster the open-source community. We welcome independent verification of results.<p>Phind-CodeLlama-34B-v1: <a href="https://huggingface.co/Phind/Phind-CodeLlama-34B-v1" rel="nofollow noreferrer">https://huggingface.co/Phind/Phind-CodeLlama-34B-v1</a><p>Phind-CodeLlama-34B-Python-v1: <a href="https://huggingface.co/Phind/Phind-CodeLlama-34B-Python-v1" rel="nofollow noreferrer">https://huggingface.co/Phind/Phind-CodeLlama-34B-Python-v1</a><p>We'd love to hear your thoughts!<p>Best,<p>The Phind Team
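
The two decontamination steps listed in the post can be sketched as follows. This is an illustration of the stated procedure, not Phind's or OpenAI's code: the function names are hypothetical, and the "processing" applied to training examples is not specified in the post, so this sketch compares raw strings.

```python
import random

def sample_substrings(example, k=3, length=50, rng=random):
    """Return k random substrings of `length` characters,
    or the whole example if it is shorter than `length`."""
    if len(example) < length:
        return [example]
    starts = [rng.randrange(len(example) - length + 1) for _ in range(k)]
    return [example[s:s + length] for s in starts]

def is_contaminated(eval_example, training_examples, rng=random):
    """True if any sampled substring occurs inside any training example."""
    probes = sample_substrings(eval_example, rng=rng)
    return any(
        probe in train for probe in probes for train in training_examples
    )
```

Because the substrings are sampled at random, a partial overlap between an evaluation example and a training example may or may not be detected on a given run. A verbatim copy is always detected.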

Show HN: Use Code Llama as Drop-In Replacement for Copilot Chat

Hi HN,<p>Code Llama was released, but we noticed a ton of questions in the main thread about how/where to use it: not just from an API or the terminal, but <i>in your own codebase</i> as a drop-in replacement for Copilot Chat. Without this, developers don't get much utility from the model.<p>This concern is also important because benchmarks like HumanEval don't perfectly reflect the quality of responses. There's likely to be a flurry of improvements to coding models in the coming months, and rather than relying on the benchmarks to evaluate them, the community will get better feedback from people actually using the models. This means <i>real</i> usage in <i>real</i>, everyday workflows.<p>We've worked to make this possible with Continue (<a href="https://github.com/continuedev/continue">https://github.com/continuedev/continue</a>) and want to hear what you find to be the real capabilities of Code Llama. Is it on par with GPT-4, does it require fine-tuning, or does it excel at certain tasks?<p>If you’d like to try Code Llama with Continue, it only takes a few steps to set up (<a href="https://continue.dev/docs/walkthroughs/codellama">https://continue.dev/docs/walkthroughs/codellama</a>), either locally with Ollama, or through TogetherAI or Replicate's APIs.

Show HN: Shimmer – ADHD coaching for adults, now on web

Hi, I’m Chris, one of the co-founders of Shimmer. Last October, following my ADHD diagnosis, I launched Shimmer (<a href="https://shimmer.care">https://shimmer.care</a>), one-to-one ADHD Coaching for adults. Our HN launch was here: <a href="https://news.ycombinator.com/item?id=33468611">https://news.ycombinator.com/item?id=33468611</a>.<p>A quick recap before I dive into our new launch: Shimmer is an ADHD coaching service for adults. We took apart the traditionally expensive, inaccessible ADHD coaching offering ($300-600+/session) and redesigned it from first principles. You get matched with one of our expert ADHD coaches, meet weekly over video, and get supported throughout the week via text and with learning tools. This solution is special to me personally (and our community) because it doesn’t just give you “knowledge” or offer another “tool”; our coaches help you set realistic goals, take personalized steps towards them, and keep you accountable.<p>Today we’re excited to launch our most-requested feature: Web.<p>Over the past 9 months, we learned (and iterated) a lot with our members and coaches. A few key challenges pointed to the need for a web version: (1) because of ADHD “object permanence” challenges (e.g. out of sight, out of mind), we needed to be multi-platform so that when you finish a task or goal or encounter a challenge, regardless of whether you’re near your laptop or phone, you can check it off & ping your coach right away, (2) members used reflection modules (e.g. 
after each task, you’re prompted to reflect on what worked and didn’t work, and it informs your coach) more thoroughly than we originally anticipated, and web allows for deeper reflection and typing, (3) overarching coaching goals were often forgotten during the day-to-day, and the web makes it easier to use visual cues to keep goals top of mind for motivation, (4) many of our members struggle with phone addiction, and driving members to the mobile app often landed them in TikTok/IG, whereas the web app offers a focused environment to get in their “coaching zone”.<p>Our new web app was designed alongside over 1,200 members and 22 coaches, with countless hours of testing and iterating. We’re excited (but nervous!) to unveil this new version. If you have ADHD (or think you do), we’d love for you to check out our platform and give us critical feedback (or positive reinforcement!). It’s a super streamlined and ADHD-friendly signup process and in honor of our web launch and back to school/work, the first month is 30% off.<p>Our pricing: $115/mo. for Essentials plan (15-min weekly sessions), $230/mo. for Standard plan (30-min weekly sessions), $345/mo. for Immersive plan (45-min weekly sessions); all plans additional 30% off first month, HSA/FSA-eligible.<p>We know these prices are expensive for many people with ADHD and we’re committed to bringing costs down over time. It’s more affordable than what many people are paying for coaches, but the fact that we’re relying on humans, and not going the “we can automate all this with AI” route, puts a floor on how low the costs can drop. 
That said, here are some actions we’re taking to drive down costs for those who need it: (1) we offer needs-based scholarships and aim to have 5% of members on them at any time, (2) we often run fully sponsored scholarships with our partners—over 40 full ride scholarships and 100 group coaching spots have been disbursed alongside Asian Mental Health Project, government of Canada, and more, and (3) we have aligned our coaching model alongside Health & Wellness Coaching, which is expected to be reimbursed in 2024. If you have ideas or expertise here, please reach out to me directly at chris@shimmer.care.<p>On behalf of our small but mighty & passionate Shimmer team, I’m excited for the Hacker News community to share your thoughts, feedback, and ideas. If you feel comfortable, I’d also love to hear your personal ADHD story and what has worked / hasn’t worked for you.<p>Co-founders Christal & Vikram