The best Hacker News stories from Show from the past day
Latest posts:
Show HN: FlakeHub Cache: Fast, secure, configurable. A new take on Nix caching
Show HN: A fast HNSW implementation in Rust
Show HN: PyKidos, Teach Your Kid Python in the Browser
Show HN: Skyvern – Browser automation using LLMs and computer vision
Hey HN, we're building Skyvern (https://www.skyvern.com), an open-source tool that uses LLMs and computer vision to help companies automate browser-based workflows. You can see some examples here: https://github.com/Skyvern-AI/skyvern#real-world-examples-of-skyvern and there's a demo video at https://github.com/Skyvern-AI/skyvern#demo, along with some instructions on running it locally.

We provide a natural-language API to automate repetitive manual workflows that happen within companies' back offices. You can check out our code and play with Skyvern here: https://github.com/Skyvern-AI/Skyvern

We talked to hundreds of companies about things they do in the background and found that most of them depend on repetitive manual workflows. The breadth of these workflows surprised us: most companies started off doing things manually, and eventually either hired people to scale the manual work or wrote scripts using Selenium-like browser automation libraries.

In these conversations, one common point stood out: scaling is a pain either way. Companies relying on hiring struggled to adjust team sizes with fluctuating demand. Companies using Selenium and similar tools had a different problem: it could take days or even weeks to automate a new workflow, and the automation then required ongoing maintenance, because any change to the underlying websites could invalidate its XPath-based interaction logic.

We felt like there was a way to get the best of both worlds with LLMs. We could use LLMs to reason through a website's layout while preserving the advantage of traditional browser automation: the ability to scale alongside demand. This led us to build Skyvern with a few core functionalities:

1. Skyvern can operate on websites it's never seen before by connecting visible elements with the natural-language instructions provided to us. We use a blend of computer vision and DOM parsing to identify a set of possible actions on a website, and multi-modal LLMs to map the natural-language instructions to the available actions on the page.

2. Skyvern is resistant to website layout changes, as it doesn't depend on any predetermined XPaths or other selectors. If a layout ever changes, we can leverage the methodology in #1 to complete the user-specified goal.

3. Skyvern accepts a blob of information when navigating workflows, basically just a JSON blob of whatever information you want to include, and then we use LLMs to map that to information on the screen. For example: if you're generating a quote from Geico, they commonly ask "Were you eligible to drive at 21?". The answer could be inferred from the driver receiving their license in 2012 and having a birth date in 1996.

The above strategy adapts well to a number of use cases that Skyvern is helping companies with today: (1) Automating materials procurement by searching for products, adding them to a cart, and purchasing them through vendor websites that don't have APIs; (2) Registering accounts, filing forms, and searching for information on government websites (e.g., registering franchise tax information for Delaware C-corps); (3) Generating insurance quotes by completing multi-step dynamic forms on insurance websites; (4) Automating the job application process by mapping user-specified information (such as a resume) to a job posting.

And here are some use cases we're actively looking to expand into: (1) Automating post-checkup data entry with patient data inside medical EHR systems (i.e., submitting billing codes, adding notes, etc.), and (2) Doing customer research ahead of discovery calls by analyzing landing pages and other metadata about a specific business.

We're still very early and would love to get your feedback!
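To make the license example in point 3 concrete, here is a minimal sketch of the arithmetic behind it. The payload field names (`birth_year`, `license_issued_year`) and both helper functions are hypothetical, chosen only for illustration; the post does not describe Skyvern's payload schema, and Skyvern performs this mapping with LLMs rather than with hand-written rules like these.

```python
# Hypothetical payload: field names are illustrative, not Skyvern's schema.
payload = {
    "birth_year": 1996,
    "license_issued_year": 2012,
}


def age_when_licensed(data: dict) -> int:
    """Age the driver reached in the year the license was issued."""
    return data["license_issued_year"] - data["birth_year"]


def eligible_to_drive_at(data: dict, age: int) -> bool:
    """A driver licensed at or before `age` was eligible to drive at `age`."""
    return age_when_licensed(data) <= age


print(age_when_licensed(payload))         # 16
print(eligible_to_drive_at(payload, 21))  # True
```

The form never asks for a birth year or a license year directly; the answer to its question has to be derived from them, which is the kind of inference the post says is delegated to the LLM.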
Show HN: Query Your Sheets with SheetSQL
Hello HN!

I've developed a tool named SheetSQL that allows you to query, join, export, and schedule your queries to Google Sheets through a straightforward SQL interface in the browser.

This tool is a simple, first iteration of the idea, so I'm eager to receive feedback on it. You can contact me at taras [at] sheetsql.io :D. I'd love to find out if it could be useful for folks out there!

As someone working in fintech, I constantly deal with sheets and often find it challenging to perform seemingly simple tasks like JOINs natively. However, I'm familiar with SQL, which inspired the creation of SheetSQL. It's designed to assist those who use sheets daily and have some SQL knowledge, making operations across multiple sheets and worksheets as easy as if they were interacting with a Postgres database.

For those interested in the technical details, the engine powering the queries in the background is DuckDB. Therefore, you can expect support for all syntax from the latest version of DuckDB :)

Cheers,
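As a rough illustration of the kind of cross-sheet JOIN the post describes, the sketch below treats two small in-memory tables as stand-ins for two sheets. It uses Python's built-in `sqlite3` module rather than DuckDB, purely so it runs without extra dependencies; the table names, columns, and data are made up, and how SheetSQL actually exposes sheets as tables is not covered in the post.

```python
import sqlite3

# Two made-up "sheets": one with orders, one with customer details.
orders = [(1, "acme", 120.0), (2, "globex", 75.5), (3, "acme", 30.0)]
customers = [("acme", "US"), ("globex", "DE")]

# sqlite3 is a stand-in here; the post says SheetSQL runs on DuckDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)

# Join the two "sheets" and total the order amounts per country.
rows = conn.execute(
    """
    SELECT c.country, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.name = o.customer
    GROUP BY c.country
    ORDER BY c.country
    """
).fetchall()
conn.close()

print(rows)  # [('DE', 75.5), ('US', 150.0)]
```

The query is plain standard SQL, which is the point of the post: a lookup that takes nested formulas in a spreadsheet becomes a single JOIN.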
Show HN: Creating custom coloring pages from photos. Great for parents/teachers
Making coloring pages out of your own personal photos is a fun and creative way to engage kids (and adults!) with art. Instead of generic coloring pages, you can turn special memories into custom works of art just waiting to be filled in with color.
Show HN: A user-friendly UI for viewing and editing Markdown files
Show HN: Flox 1.0 – Open-source dev env as code with Nix
Hey HN,

I'm Ron Efroni, CEO at Flox, and today we are releasing version 1.0 of our open source CLI, helping folks manage development environments everywhere.

My own experience with development environments began with air-gapped systems, having to actually burn software to a CD to iterate over a very slow and expensive development cycle, sometimes reaching the server rack and realizing I had the wrong disk. Fast forward to today and there are countless alternatives available, backed by incredible compute resources, yet we somehow still find ourselves paying the price of long development cycles. That's why I've been working for over a decade to simplify the development stack so we can spend more time on making 1s and 0s do magical things, and why my co-founder Michael and I started Flox to bring you a solution based on Nix. Today is just the first step on that journey. We hope you'll take a peek at our new release, and we very much look forward to continuing the journey together with you from here!

Introducing Flox 1.0

Flox is a platform that lets developers and operators focus on building fast with reproducible environments that span the enterprise SDLC. Using a declarative framework based on Nix, a package management and configuration tool, Flox allows developers to create environments that contain everything they need to build software.

Why Flox?

Flox behaves a lot like your favorite and familiar package manager, but it allows you to create as many environments as you want on your machine. Each one can contain a different combination of packages.

Environments are portable by default. If you install a package inside one that isn't cross-platform, it's easy to carve out exceptions. It's also easy to write hooks and populate your environment with variables - we designed it to be hackable.

Flox environments run in user space, right where you are. When you type `ls` after activating a Flox environment you will see the same stuff because you're in the same place - even with all those new packages available. No mounting volumes, no proxying ports. No breaking into the toolset you just conjured.

Getting Started: No sign-ups, just one install away. Dive into our GitHub repository (https://github.com/flox/flox) and start exploring.

I'm around all day to answer questions, talk Nix, or just reminisce about simpler times ;).

Lots of open source love, Ron