The best Show HN stories from Hacker News from the past day


Latest posts:

Show HN: Frigade - React SDK for building quality onboarding & activation flows

Hey HN! Christian here, cofounder and CTO of Frigade (<a href="https://www.frigade.com">https://www.frigade.com</a>). Our tool helps product and engineering teams quickly build in-app experiences like getting-started checklists, product tours, account upsells, and NPS surveys. Basically, all the little things that can help boost your product's activation and retention rates, but that you don't really want to spend the time building from scratch.<p>We built React UI components such as <Tour />, <Checklist />, and <Announcement /> and combined them with a web app that you can use to control user targeting, content management, sequencing, and more. These components can help power experiences like onboarding forms, surveys, house ads, and feature announcements.<p>About a year ago, we did our original Launch HN post (<a href="https://news.ycombinator.com/item?id=35246292">https://news.ycombinator.com/item?id=35246292</a>) and got a bunch of great feedback from HN. We also got grilled for having our product and documentation behind a waitlist – whoops! We fixed that, though. We've also built a ton of new functionality since then, and we're dubbing it Frigade 2.0. This includes a free tier so you can check out the product without putting a credit card down, and we recently finished a complete rewrite of our SDK using Radix UI and a theming system built with Emotion.<p>There are other tools in this space, but they tend to be "no-code", which sounds good until you realize that they actually inject bloated scripts that slow your product down and make it hard and annoying to customize. We believe that good old code, not "no-code", is the better fit for this space. It lets you unit test, it works with version control, it uses your design system, it rolls out with your CI pipeline, and so on.<p>We solve this problem by: 1) allowing engineers to fully set up the guardrails for what non-technical teammates can do through a one-time setup; and 2) empowering non-technical teammates to build native experiences that drive better results, while building confidence that they won't mess up your production environment. Ironically, many companies that use our product have churned away from no-code solutions (such as Pendo) because they were too time-consuming to manage, caused too many bugs in production, and simply weren't effective enough.<p>Of course, the upfront cost of setting up Frigade is a bit higher than dropping in a <script /> tag on your website, but once it's set up, non-technical teammates (such as marketers) have full control to create experiences within the guardrails that engineering has created.<p>What do you think about this belief and our approach? We're especially curious to hear from folks who have experience using these no-code tools or have had to build these kinds of experiences from scratch.

Show HN: Open-source, browser-local data exploration using DuckDB-WASM and PRQL

Hey HN! We've built Pretzel, an open-source data exploration and visualization tool that runs fully in the browser and can handle large files (a 200 MB CSV on my 8 GB MacBook Air is snappy). It's also reactive - so if, for example, you change a filter, all the data transform blocks after it re-evaluate automatically. You can try it here: <a href="https://pretzelai.github.io/" rel="nofollow">https://pretzelai.github.io/</a> (a statically hosted webpage) or see a demo video here: <a href="https://www.youtube.com/watch?v=73wNEun_L7w" rel="nofollow">https://www.youtube.com/watch?v=73wNEun_L7w</a><p>You can play with the demo CSV that's pre-loaded (GitHub data of text-editor-adjacent projects) or upload your own CSV/XLSX file. The tool runs fully in-browser—you can disconnect from the internet once the website loads—so feel free to use sensitive data if you like.<p>Here's how it works: you upload a CSV file and then explore your data as a series of successive data transforms and plots. For example, you might: (1) Remove some columns; (2) Apply some filters (remove nulls, remove outliers, restrict the time range, etc.); (3) Do a pivot (i.e., a group-by but fancier); (4) Plot a chart; (5) Download the chart and the transformed data. See screenshot: <a href="https://imgur.com/a/qO4yURI" rel="nofollow">https://imgur.com/a/qO4yURI</a><p>In the UI, each transform step appears as a "Block". You can always see the result of the full transform in a table on the right. The transform blocks are editable - for instance, in the example above, you can go to step 2, change some filters, and the reactivity will take care of re-computing all the cells that follow, including the charts.<p>We wanted Pretzel to run locally in the browser <i>and</i> be extremely performant on large files. So, we parse CSVs with the fastest CSV parser (uDSV: <a href="https://github.com/leeoniya/uDSV">https://github.com/leeoniya/uDSV</a>) and use DuckDB-Wasm (<a href="https://github.com/duckdb/duckdb-wasm">https://github.com/duckdb/duckdb-wasm</a>) to do all the heavy lifting of processing the data. We also wanted to allow for chained data transformations where each new block operates on the result of the previous block. For this, we're using PRQL (<a href="https://prql-lang.org/" rel="nofollow">https://prql-lang.org/</a>) since it maps 1:1 onto chained data transform blocks - each block maps to a chunk of PRQL which, when combined, describes the full data transform chain. (PRQL doesn't support DuckDB's PIVOT statement, though, so we had to make some CTE-based hacks.)<p>There's also an AI block: this is the only (optional) feature that requires an internet connection, but we're working on adding local model support via Ollama. For now, you can use your own OpenAI API key or use an AI server we provide (a GPT-4 proxy; it's loaded with a few credits), specify a transform in plain English, and get back the SQL for the transform, which you can edit.<p>Our roadmap includes allowing API calls to create new columns; support for an SQL block with nice autocomplete features; and a Python block (using Pyodide to run Python in the browser) on the results of the data transforms, much like a Jupyter notebook.<p>There are two of us and we've only spent about a week coding this and fixing major bugs, so there are still some bugs to iron out. We'd <i>love</i> for you to try this and get your feedback!
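The block-to-PRQL mapping described above can be sketched very simply. This is a hypothetical illustration (the function name, table, and fragments are mine, not Pretzel's actual internals): each block contributes one PRQL pipeline step, and the full query is just the steps concatenated in order, which is what makes editing one block and re-running everything downstream cheap.

```python
# Hypothetical sketch: composing chained transform blocks into one PRQL
# pipeline. Each block is a single PRQL step; the full query is simply
# the steps joined in order after a `from` clause.

def compose_prql(table: str, blocks: list[str]) -> str:
    """Join per-block PRQL fragments into one pipeline over `table`."""
    return "\n".join([f"from {table}"] + blocks)

# Example: select two columns, filter nulls, then aggregate. Editing the
# filter block and re-composing is what the reactivity re-runs downstream.
blocks = [
    "select {stars, language}",
    "filter language != null",
    "group language (aggregate {total = sum stars})",
]
print(compose_prql("repos", blocks))
```

The composed string would then be compiled to SQL and handed to DuckDB-Wasm for execution.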

Show HN: Matrix Multiplication with Half the Multiplications

Show HN: FlakeHub Cache: Fast, secure, configurable. A new take on Nix caching

Show HN: A fast HNSW implementation in Rust

Show HN: PyKidos, Teach Your Kid Python in the Browser

Show HN: Skyvern – Browser automation using LLMs and computer vision

Hey HN, we're building Skyvern (<a href="https://www.skyvern.com">https://www.skyvern.com</a>), an open-source tool that uses LLMs and computer vision to help companies automate browser-based workflows. You can see some examples here: <a href="https://github.com/Skyvern-AI/skyvern#real-world-examples-of-skyvern">https://github.com/Skyvern-AI/skyvern#real-world-examples-of...</a> and there's a demo video at <a href="https://github.com/Skyvern-AI/skyvern#demo">https://github.com/Skyvern-AI/skyvern#demo</a>, along with some instructions on running it locally.<p>We provide a natural-language API to automate repetitive manual workflows that happen within companies' back offices. You can check out our code and play with Skyvern here: <a href="https://github.com/Skyvern-AI/Skyvern">https://github.com/Skyvern-AI/Skyvern</a><p>We talked to hundreds of companies about things they do in the background and found that most of them depend on repetitive manual workflows. The breadth of these workflows surprised us – most companies started off doing things manually, and eventually either hired people to scale the manual work or wrote scripts using Selenium-like browser automation libraries.<p>In these conversations, one common point stood out: scaling is a pain either way. Companies relying on hiring struggled to adjust team sizes with fluctuating demand. Companies using Selenium and similar tools had a different problem: it can take days or even weeks to get a new workflow automated, and it then requires ongoing maintenance any time the underlying websites change, because the XPath-based interaction logic suddenly becomes invalid.<p>We felt there was a way to get the best of both worlds with LLMs. We could use LLMs to reason through a website's layout, while preserving the advantage of traditional browser automation: scaling alongside demand. This led us to build Skyvern with a few core functionalities:<p>1. Skyvern can operate on websites it's never seen before by connecting visible elements with the natural language instructions provided to us. We use a blend of computer vision and DOM parsing to identify a set of possible actions on a website, and multi-modal LLMs to map the natural language instructions to the available actions on the page.<p>2. Skyvern is resistant to website layout changes, as it doesn't depend on any predetermined XPaths or other selectors. If a layout ever changes, we can leverage the methodology in #1 to complete the user-specified goal.<p>3. Skyvern accepts a blob of information when navigating workflows—basically just a JSON blob of whatever information you want to put in—and we then use LLMs to map that to information on the screen. For example: if you're generating a quote from Geico, they commonly ask "Were you eligible to drive at 21?". The answer could be inferred from the driver having received their license in 2012 and having a birth year of 1996.<p>The above strategy adapts well to a number of use cases that Skyvern is helping companies with today: (1) Automating materials procurement by searching for, adding to cart, and transacting products through vendor websites that don't have APIs; (2) Registering accounts, filing forms, and searching for information on government websites (e.g., registering franchise tax information for Delaware C-corps); (3) Generating insurance quotes by completing multi-step dynamic forms on insurance websites; (4) Automating the job application process by mapping user-specified information (such as a resume) to a job posting.<p>And here are some use cases we're actively looking to expand into: (1) Automating post-checkup data entry with patient data inside medical EHR systems (i.e., submitting billing codes, adding notes, etc.), and (2) Doing customer research ahead of discovery calls by analyzing landing pages and other metadata about a specific business.<p>We're still very early and would love to get your feedback!
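To make the Geico example above concrete, here is a minimal sketch (pure Python; the function and field names are hypothetical, and Skyvern itself uses LLMs for this mapping rather than hand-written rules) of inferring a form answer from a JSON-style blob of user data:

```python
# Hypothetical sketch: deriving "Were you eligible to drive at 21?" from
# a blob of user data, in the spirit of the example above. The real
# system maps blob fields to on-screen questions with an LLM; fixed
# rules are shown here only to illustrate the kind of inference needed.

def was_eligible_to_drive_at_21(user: dict) -> bool:
    """True if the driver was already licensed by age 21."""
    age_when_licensed = user["license_year"] - user["birth_year"]
    return age_when_licensed <= 21

# Driver licensed in 2012, born in 1996: licensed at 16, so eligible.
user = {"license_year": 2012, "birth_year": 1996}
print(was_eligible_to_drive_at_21(user))  # True
```

The point is that the answer is not stored anywhere in the blob verbatim; it has to be inferred from fields that are.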
