The best Hacker News stories from Show from the past day
Latest posts:
Show HN: A labelling tool to easily extract and label Wikipedia data
Hi HN! I am Maria, solo founder of DataQA (https://dataqa.ai/), a tool to search and label documents for various NLP tasks (e.g. entity extraction and entity linking).

I have worked as a data scientist and ML engineer for the better part of a decade, and over that time I have specialised mainly in applications involving natural language processing (NLP). One of the key questions I have always had at the back of my mind is whether my time was well spent. Whenever I spent more time on feature engineering or trying different models, I wondered whether I would get a better return on investment by simply labelling more data. I created DataQA to improve the exploration and labelling of documents. It is open source and ships with the Elasticsearch text search engine, which I have packaged as a Python package (possibly the topic of a future technical post), as well as a rules-based engine that pre-labels documents using NLP rules. It installs with a single pip command.

One of the key things I wanted to add to DataQA is an integration with Wikipedia. Even though Wikipedia is the largest living repository of human knowledge in the world, I have always found it difficult to process and turn into structured datasets for my specific applications. Since wiki pages are long-form articles, it is important to divide the text into smaller chunks. A lot of the interesting data is also displayed in tables. With DataQA you can now upload a list of Wikipedia page URLs and the tool will extract the articles, process them and even parse the tables, so you can then label any entities you want. You can find a tutorial here: https://towardsdatascience.com/a-labelling-tool-to-easily-extract-and-label-wikipedia-data-63f58e2e76ae?gi=13e9b7f5080c

The open-source version of DataQA currently only supports CSV, but I have an enterprise version with premium features such as labelling of PDFs (with understanding of tables). If you're interested in a free trial, please contact me at contact@dataqa.ai :-).
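
To illustrate the chunking step mentioned in the post, here is a minimal Python sketch that splits a long article into smaller pieces along paragraph boundaries. It is not DataQA's code: the function name `chunk_text`, the `max_chars` parameter and the blank-line paragraph convention are all assumptions made for this example.

```python
def chunk_text(text, max_chars=500):
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: paragraphs are assumed to be separated by blank
    lines. Short paragraphs are merged; oversized ones are hard-split.
    """
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single paragraph longer than the limit is cut into fixed-size slices.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        # Merge the paragraph into the current chunk if it still fits.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraphs rather than at fixed offsets keeps most entity mentions inside a single chunk, which tends to make labelling easier; a real pipeline would probably also respect sentence boundaries.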
Show HN: I made a collaborative ASCII editor
I made a website for drawing ASCII art with other people: https://ascii-collab.app/

It's been online for a little over a year, so there's a fair bit of stuff to browse if you want to look around (so much that I even made a poster: https://ascii-collab.app/poster.png).

There are other websites that do this, such as yourworldoftext, but ascii-collab has some extra features like per-user undo/redo, box selection, and a color highlight mode to see who made particular changes. There are also admin tools so I can remove spam.

The code is open source and available at https://github.com/MartinSStewart/ascii-collab if anyone is interested.

Enjoy!
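
One way to picture per-user undo/redo and the "who made this change" highlight is the Python sketch below. It is not taken from the ascii-collab source; the `SharedCanvas` class, its method names and its conflict rule (an undo is skipped if someone else has since overwritten the cell) are assumptions chosen to keep the example small.

```python
from collections import defaultdict


class SharedCanvas:
    """Hypothetical shared grid with one undo/redo history per user."""

    def __init__(self):
        self.cells = {}  # (x, y) -> (char, user)
        self.undo_stacks = defaultdict(list)
        self.redo_stacks = defaultdict(list)

    def draw(self, user, x, y, char):
        previous = self.cells.get((x, y))
        applied = (char, user)
        self.cells[(x, y)] = applied
        self.undo_stacks[user].append(((x, y), previous, applied))
        # A fresh edit invalidates this user's redo history only.
        self.redo_stacks[user].clear()

    def undo(self, user):
        if not self.undo_stacks[user]:
            return False
        pos, previous, applied = self.undo_stacks[user].pop()
        # Revert only if the cell still holds this user's change.
        if self.cells.get(pos) == applied:
            if previous is None:
                del self.cells[pos]
            else:
                self.cells[pos] = previous
        self.redo_stacks[user].append((pos, previous, applied))
        return True

    def redo(self, user):
        if not self.redo_stacks[user]:
            return False
        pos, previous, applied = self.redo_stacks[user].pop()
        # Reapply only if the cell is in the state the undo left it in.
        if self.cells.get(pos) == previous:
            self.cells[pos] = applied
        self.undo_stacks[user].append((pos, previous, applied))
        return True

    def author_at(self, x, y):
        """Return who last changed a cell, e.g. for a highlight mode."""
        cell = self.cells.get((x, y))
        return cell[1] if cell else None
```

Storing the author alongside each character is what makes both features cheap here: undo can check ownership before reverting, and a highlight mode only needs to look up `author_at` for each visible cell.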
Show HN: Describe SQL using natural language, and execute against real data
I played around with GPT-3 to build this demo. Select a public BigQuery dataset and describe your query in plain English, then edit the generated SQL as needed and execute it.

https://app.tabbydata.com/sql-assistant-demo
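
The post does not say how the demo prompts GPT-3. A common approach for text-to-SQL with a completion model is to put the table schema and the question into the prompt and let the model continue from `SELECT`. The Python sketch below shows only that prompt-building step; the function name `build_sql_prompt`, the prompt layout and the example table are assumptions, and no model is called.

```python
def build_sql_prompt(table, columns, question):
    """Build a completion prompt for a text-to-SQL request.

    Hypothetical layout: the schema and question go in SQL comments,
    and the prompt ends with SELECT so the model continues the query.
    """
    schema = ", ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"-- BigQuery table: {table}({schema})\n"
        f"-- Question: {question}\n"
        "SELECT"
    )


# Illustrative table and columns, not a real dataset.
prompt = build_sql_prompt(
    "my_dataset.baby_names",
    [("name", "STRING"), ("year", "INT64"), ("number", "INT64")],
    "What were the ten most common names in 2000?",
)
print(prompt)
```

Because the generated SQL can be wrong, letting the user review and edit it before execution, as the demo does, is a sensible safeguard.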
Show HN: I made an app that tracks your devices' batteries from your Mac
Show HN: I made a voice activated golf bag that shoots your clubs
Show HN: JWEB (a modern implementation of the CWEB Literate Programming system)