The best Hacker News stories from Show from the past day

Latest posts:

Show HN: EdgeDB 1.0

Show HN: Zingg – open-source entity resolution for single source of truth

Hello HN,

I am Sonal, a data consultant from India. For the past few months (and years!), I have been working on an entity resolution tool to build a single source of truth for customers, suppliers, products and parts. Here is a short demo of Zingg in action: https://www.youtube.com/watch?v=zOabyZxN9b0

As a data consultant, I often struggled to build unified views of core entities on the datalake and the warehouse. Data spread across different systems has variations and inconsistencies, making Customer 360, KYC, AML, segmentation, personalization and other analytics difficult.

As I talked with different clients facing this issue, I searched for existing solutions which I could use or recommend. Unfortunately, most of them were very expensive MDM solutions like Tamr, or CDP solutions like Amperity. There were many open source libraries, but they did not tie well into the datalake/warehouse scenarios we were working with, did not scale, needed a decent bit of programming, or did not generalize. I even tried to build something internally and failed miserably, and that got me hooked :-)

As I dug deeper into the problem, I realized that there were multiple challenges. Data matching, at its very core, becomes a Cartesian join, as you need to compare every pair of records to figure out the matches. With millions of records, this becomes extremely tough to scale. I referred to various research papers and then implemented a blocking algorithm to overcome this. More details at https://docs.zingg.ai/docs/zModels.html

The second challenge was to say which pairs are a match. I wanted to have a machine learning-based approach to handle the different types of entities and the variety of differences in real-world data. But I also felt that non-ML experts should be able to use Zingg easily, so I took the approach of abstracting the feature generation and hyper-parameter tuning for the classifier.

Once I settled on the ML approach, the problem of training data quickly arose, which led me to pick up active learning and build an interactive labeler through which sample records can be marked as matches and non-matches to build training sets quickly. I still feel that we should have an unsupervised approach as well, but I have not yet figured out the right way to do so.

The Zingg repository is hosted at https://github.com/zinggAI/zingg and we have close to 60 members on our Slack (https://join.slack.com/t/zinggai/shared_invite/zt-w7zlcnol-vEuqU9m~Q56kLLUVxRgpOA). We are now two developers working full time on Zingg! I am super happy that early users have been able to use Zingg and push us to build more stuff: model documentation, using pre-existing training data, native Snowflake integration, etc.

I have been an open source consumer all my dev life, and this is the first time I have made a decent contribution. It is my first time trying to build a community as well. Not sure how the future will unfold, but I wanted to reach out to the community here and hear what you think about the problem, the approach, and any ideas or suggestions.

Thanks for reading along, and please do post your thoughts in the comments below.
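To make the Cartesian-join problem and the idea of blocking concrete, here is a minimal sketch in Python. It is not Zingg's blocking model (the post links to the docs for that); the hand-written `blocking_key` function, the `name` and `zip` fields, and the sample records are all assumptions made up for illustration. The point is only that records are compared within a block rather than against every other record.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first letter of the lowercased name plus the zip code.
    # A real system would pick or learn a key suited to its data.
    name = record["name"].strip().lower()
    return (name[:1], record["zip"])


def candidate_pairs(records):
    # Group record indices by blocking key.
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    # Only records that share a block become candidate pairs.
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(ids, 2))
    return pairs


# Made-up sample data.
records = [
    {"name": "Acme Corp", "zip": "10001"},
    {"name": "ACME Corporation", "zip": "10001"},
    {"name": "Zenith Ltd", "zip": "94105"},
    {"name": "acme corp.", "zip": "10001"},
]

pairs = candidate_pairs(records)
all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

With four records the saving is small, but the full comparison grows quadratically with the number of records, while the blocked comparison grows with the size of the individual blocks. The trade-off is that a poorly chosen key can put true matches in different blocks, where they are never compared.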

Show HN: 'Url Checker' available now on F-Droid

Show HN: Jless, a command-line JSON viewer

Hey, Hacker News! Today I'm proud to release jless, a command-line JSON viewer.

jless provides a JSON viewing experience similar to what you see in a browser's network tab in the developer console, but from the comfort of your terminal, with a whole suite of vim-inspired key bindings to easily manipulate your view of the data and full-text regex search. I'm sure many of you have piped together some combination of cat, jq and less before; hopefully jless can replace that usage (hence the name). It supports newline-delimited JSON too, so it can handle any output from jq.

I built jless to solve a problem I kept facing while building plaintextsports.com [1][2]. For the live data I use a lot of public but undocumented APIs, and I was constantly digging through giant JSON files to understand how the data was structured. I tried installing multiple Chrome extensions, but was dissatisfied with all of them. I piped files through jq into less a lot, and that was ok, but not great. The Preview pane in the Network tab of Chrome's dev tools was pretty useful, and I modeled a lot of jless's behavior and appearance off of that, but it didn't fit well into my tmux + vim dev environment, and I couldn't easily use it to inspect files on disk. I wanted that experience, but in my terminal (and with search support).

Once I had built a rudimentary version of jless a few months ago, I immediately started using it whenever I was debugging something, and my usage has only grown as I've added more basic functionality. I've finally added all the features I feel like it needs to be functional, useful, and reliable.

There are definitely more features I want to add: Windows support, some way to filter data with jq filters (a la fx [3]), yanking objects to the clipboard, being able to hide keys entirely, streaming data in so you can peek at the start of a gigantic file, maybe a way to extract a schema from a file (like [4]), and plenty of low-hanging fruit for performance. Support for different hierarchical data formats (YAML, TOML, XML) could be cool someday. I'm sure many people will ask for editing support, but sadly that is not something I plan on adding anytime soon.

I also used this project as a chance to learn Rust (code style and design comments appreciated!), which I had only dabbled with before. For a command-line utility, this felt like an obvious choice: small binaries (~3 MB), instant startup, and great performance without any effort (try searching for a comma in a big file!).

I hope you find it useful!

[1]: https://plaintextsports.com, live sports scores in plain text, no ads, no tracking, no loading
[2]: https://news.ycombinator.com/item?id=26310314
[3]: https://news.ycombinator.com/item?id=29861043
[4]: https://quicktype.io/typescript
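For readers unfamiliar with the stream format mentioned above, here is a small Python sketch of reading several JSON values from a single input, whether they arrive one per line or pretty-printed across several lines. It is not jless's implementation (jless is written in Rust); the function name `iter_json_values` and the sample strings are made up for illustration, and the approach shown is just the standard library's `json.JSONDecoder.raw_decode`.

```python
import json


def iter_json_values(text):
    # Yield each top-level JSON value found in `text`, in order.
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


# One value per line (newline-delimited JSON).
compact = '{"a": 1}\n[2, 3]\n"x"\n'
print(list(iter_json_values(compact)))

# The same idea works when each value is pretty-printed over several lines.
pretty = '{\n  "a": 1\n}\n{\n  "b": 2\n}\n'
print(list(iter_json_values(pretty)))
```

Malformed input raises `json.JSONDecodeError` at the first value that cannot be parsed, so a viewer built on this idea would still need its own error reporting.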

Show HN: Open Policy Registry: a Docker-inspired workflow for OPA policies

Show HN: PgCat, Postgres pooler with sharding, load balancing and failover

Show HN: TinyResume – Create a shareable online resume

Show HN: Never have an SSL certificate expire again

Show HN: An AI program to check videos for NSFW content

Show HN: Tally Forms – A free Typeform alternative

Show HN: Three Magic Words

Here’s a free, fun, novel five-letter word game for the web! It’s a game I originally wrote for the iPhone in 2010, but wasn’t able to finish before my first child was born. When I left my senior web developer job in September 2021 I figured I would postpone looking for work and finish the game before another 11 years passed, and expose myself to new skills doing it (in this case, Swift). I released it on the App Store in December, then turned my attention to doing a web version, when suddenly Wordle was in The NY Times, and then everywhere.

Perhaps foolishly, I plowed ahead and here we are. Like Wordle and some other NY Times word games, there is a single daily puzzle, but like traditional crossword puzzles, it gets harder throughout the week.
