https://blog.gatunka.com/2009/09/30/japanese-typewriters/, posted Jul '21 by peter in history japan language toread
With several thousand characters to contend with, how were the Japanese able to use typewriters before the advent of digital technology? The answer is the kanji typewriter (和文タイプライター or 邦文タイプライター), which was invented by Kyota Sugimoto in 1915. This invention was deemed so important that it was selected as one of the ten greatest Japanese inventions by the Japanese Patent Office during their 100th anniversary celebrations in 1985. Here are some photos of that first model. (Photos courtesy Canon Semiconductor Equipment.)
Suppose we want to combine a BERT-based named entity recognition (NER) model with a rule-based NER model built on top of spaCy. Although BERT's NER exhibits extremely high performance, it is usually combined with rule-based approaches for practical purposes. In such cases, what often bothers us is that tokens of spaCy and BERT are different, even if the input sentences are the same. For example, let's say the input sentence is "John Johanson 's house"; BERT tokenizes this sentence like
["john", "johan", "##son", "'", "s", "house"] and spaCy tokenizes it like
["John", "Johanson", "'s", "house"]. To combine the outputs, we need to calculate the correspondence between the two different token sequences. This correspondence is the "alignment".
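To make the idea of an alignment concrete, here is a minimal sketch in plain Python. It is not the linked library's implementation: it assumes that lowercasing and stripping the WordPiece "##" prefix is enough normalization, and it matches the two token sequences character by character with the standard library's difflib. The helper names (normalize, char_owners, align) are made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(token):
    # Assumption: lowercasing and dropping the WordPiece "##" prefix
    # is sufficient to make the two tokenizations comparable.
    token = token.lower()
    return token[2:] if token.startswith("##") else token


def char_owners(tokens):
    # Concatenate the normalized tokens and remember, for every
    # character, the index of the token it came from.
    chars, owners = [], []
    for index, token in enumerate(tokens):
        for ch in normalize(token):
            chars.append(ch)
            owners.append(index)
    return "".join(chars), owners


def align(tokens_a, tokens_b):
    # Two tokens are aligned if they share at least one matched character.
    text_a, owners_a = char_owners(tokens_a)
    text_b, owners_b = char_owners(tokens_b)
    a2b = [set() for _ in tokens_a]
    b2a = [set() for _ in tokens_b]
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            i = owners_a[block.a + offset]
            j = owners_b[block.b + offset]
            a2b[i].add(j)
            b2a[j].add(i)
    return [sorted(s) for s in a2b], [sorted(s) for s in b2a]


bert_tokens = ["john", "johan", "##son", "'", "s", "house"]
spacy_tokens = ["John", "Johanson", "'s", "house"]
a2b, b2a = align(bert_tokens, spacy_tokens)
print(a2b)  # [[0], [1], [1], [2], [2], [3]]
print(b2a)  # [[0], [1, 2], [3, 4], [5]]
```

Here a2b[i] lists the spaCy tokens that overlap BERT token i, and b2a[j] lists the BERT tokens that overlap spaCy token j, so "johan" and "##son" both point at "Johanson". Real text with accents or other Unicode differences would likely need more normalization than this sketch applies.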
Free and Open Source Machine Translation API, entirely self-hosted. Unlike other APIs, it doesn't rely on proprietary providers such as Google or Azure to perform translations.
The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.
https://www.audubon.org/news/no-its-not-actually-murder-crows, posted 2020 by peter in language opinion
Now I will concede that certain terms of venery have made the transition from factoid to actual phrase. Pod of whales. Troop of monkeys. Gaggle of geese. Pack of wolves. Those tend to be used for animals that naturally live in small groups, and those are fine. Keep ’em.
They’re not the ones that annoy me. But “murder of crows,” and the like—the ones that people giggle over despite no actual instance of anyone using the term to refer to a flock of crows maybe ever in history—those need to go.
Accuracy is part of the reason. Bandwidth is another. Why use our limited brain space on fake animal facts when there are so many interesting things that are actually true? Wombats don’t form wisdoms, but they poop cubes. Did you know that? Cubes! You’ll blow them away at bar trivia with that one.
https://soranews24.com/2020/06/21/11-different-ways-to-say-father-in-japanese/, posted 2020 by peter in culture japan language
Well, actually, there are a ton of different ways to say “father” in Japanese, and what better day to take a look at them than today?
"Today" being yesterday, the third Sunday in June, or Father's Day (父の日).
Japanese language exercises aimed at school children but also great for non-native learners like me. For me it didn't work in Firefox, which is my preferred browser, but this could possibly be because of my paranoid privacy-enhancing browser extensions.
When Europe lost Latin as a shared communication tool, it was a new Babel Tower: Europeans couldn't understand each other any longer except within the boundaries of their national states. Not surprisingly, people who don't understand each other tend to resort to war to sort out conflicts. But Europeans also tried to replace Latin with some non-verbal tools: one was music. It is a long story that needs to be told from the beginning.
What should have been a heart-wrenching meeting with plans to make changes for the future instead ended up boiling down to one poor choice of words on behalf of the superintendent.
At some point during the conversation with the boy’s father, the superintendent asked: Omae mo hogoshakai ni kuru ka? (お前も保護者会に来るか), roughly "Are you coming to the parents' meeting too?"