Bookmark
wooorm/franc
https://github.com/wooorm/franc, posted 2014 by peter in development free language nlp opensource python software
Detect the language of text.
Bookmark
TextBlob: Simplified Text Processing — TextBlob 0.5.0 documentation
https://textblob.readthedocs.org/en/latest/, posted 2013 by peter in development free language nlp python software toread
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, translation, and more.
Bookmark
An Efficient Way to Extract the Main Topics from a Sentence | The Tokenizer
thetokenizer.com/2013/05/09/efficient-way-to-extract-the-main-topics-of-a-sentence/, posted 2013 by peter in language nlp python toread
Last week, while working on new features for our product, I had to find a quick and efficient way to extract the main topics/objects from a sentence. Since I’m using Python, I initially thought that it’s going to be a very easy task to achieve with NLTK. However, when I tried its default tools (POS tagger, Parser…), I indeed got quite accurate results, but performance was pretty bad. So I had to find a better way. Like I did in my previous post, I’ll start with the bottom line – Here you can find my code for extracting the main topics/noun phrases from a given sentence. It works fine with real sentences (from a blog/news article). It’s a bit less accurate compared to the default NLTK tools, but it works much faster!
Bookmark
translate.google.com/toolkit, posted 2013 by peter in conversion free language nlp online
Google Translator Toolkit is a powerful and easy-to-use editor that helps translators work faster and better.
Bookmark
Delver - a natural language interface to your app
delver.io/, posted 2013 by peter in development language nlp software toread
Down in the depths of your organisation, you have a treasure-trove of valuable data. But how hard is it for your users to retrieve it? Salvage your data with a natural language interface - ask your app English questions, get clear answers and reports back.
Bookmark
High Scalability - DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing
highscalability.com/blog/2013/1/28/duckduckgo-architecture-1-million-deep-searches-a-day-and-gr.html, posted 2013 by peter in development nlp scalability search
This is an interview with Gabriel Weinberg, founder of Duck Duck Go and general all around startup guru, on what DDG’s architecture looks like in 2012.
Bookmark
BBC News - Phone call translator app to be offered by NTT Docomo
www.bbc.co.uk/news/technology-20004210, posted 2012 by peter in japan language mobile nlp voice
An app offering real-time translations is to allow people in Japan to speak to foreigners over the phone with both parties using their native tongue.
NTT Docomo - the country's biggest mobile network - will initially convert Japanese to English, Mandarin and Korean, with other languages to follow.
Even though the translations are bound to be hilariously bad sometimes, this may still be useful in some situations.
Bookmark
Is Writing Style Sufficient to Deanonymize Material Posted Online? « 33 Bits of Entropy
33bits.org/2012/02/20/is-writing-style-sufficient-to-deanonymize-material-posted-online/, posted 2012 by peter in language nlp privacy science
So what exactly did we achieve? Our research has dramatically increased the number of authors that can be distinguished using writing-style analysis: from about 300 to 100,000. More importantly, the accuracy of our algorithms drops off gently as the number of authors increases, so we can be confident that they will continue to perform well as we scale the problem even further. Our work is therefore the first time that stylometry has been shown to have serious implications for online anonymity.
Bookmark
Pattern | CLiPS
www.clips.ua.ac.be/pages/pattern, posted 2011 by peter in development free nlp python software
Pattern is a web mining module for the Python programming language.
It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics) and data visualization (graph networks).
The module is bundled with 30+ example scripts.
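For a feel of what the "tf-idf + cosine similarity" metrics in that list measure, here is a minimal sketch using only the Python standard library. It does not use Pattern's API; the function names (`tfidf_vectors`, `cosine_similarity`), the whitespace tokenization, and the sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumption: naive lowercase + whitespace tokenization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Weight = relative term frequency * log inverse document frequency.
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
docs = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock markets fell sharply today",
]
vecs = tfidf_vectors(docs)
sim_cats = cosine_similarity(vecs[0], vecs[1])       # share "the" and "cat"
sim_unrelated = cosine_similarity(vecs[0], vecs[2])  # share no terms
```

The two cat sentences share terms and so score above zero, while the unrelated pair scores exactly zero. A real library would add proper tokenization, stop-word handling, and smoothing of the idf term.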
Bookmark
The Easy Way to Extract Useful Text from Arbitrary HTML - AI Depot
ai-depot.com/articles/the-easy-way-to-extract-useful-text-from-arbitrary-html/, posted 2011 by peter in ai development nlp python scraping
This article shows you how to write a relatively simple script to extract text paragraphs from large chunks of HTML code, without knowing its structure or the tags used. It works on news articles and blog pages with worthwhile text content, among others…
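One common heuristic for this kind of structure-agnostic extraction is to keep only lines where most of the characters are text rather than markup. The sketch below illustrates that idea and is not necessarily the article's exact method; the function name `extract_text`, the thresholds, and the sample HTML are assumptions chosen for illustration.

```python
import re
from html import unescape

# Matches any HTML tag; crude, but enough for a density estimate.
TAG_RE = re.compile(r"<[^>]+>")


def extract_text(markup, min_density=0.5, min_chars=40):
    """Keep lines whose text-to-markup ratio suggests real content.

    Assumptions: the HTML has roughly one block per line, and the
    thresholds are illustrative rather than tuned values.
    """
    kept = []
    for line in markup.splitlines():
        line = line.strip()
        if not line:
            continue
        text = unescape(TAG_RE.sub("", line)).strip()
        density = len(text) / len(line)  # share of the line that is text
        if density >= min_density and len(text) >= min_chars:
            kept.append(text)
    return kept


# Hypothetical page fragment: navigation, one real paragraph, an ad.
sample = """<div id="nav"><a href="/">Home</a> <a href="/about">About</a></div>
<p>This paragraph carries the actual article content and is mostly plain text.</p>
<div class="ad"><img src="banner.png" alt="ad"></div>"""

paragraphs = extract_text(sample)
```

On the sample, only the middle paragraph survives: the navigation line is mostly tags, and the ad line has no text at all. Pages that put everything on a single line would need to be split on block-level tags first.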