A macOS text service for morphological analysis and in situ marking of Russian syllabic stress

Building on my earlier explorations of the UDAR project, I’ve created a macOS Service-like method for in-situ marking of syllabic stress in arbitrary Russian text. The following video shows it in action:

The Keyboard Maestro macro is simple; we execute the following script, bracketed by Copy and Paste:

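import re

import udar
import xerox

# Read the selected text that the preceding Copy step placed on the clipboard
rawText = xerox.paste()
# Analyze the text and disambiguate readings before marking stress
doc1 = udar.Document(rawText, disambiguate=True)
searchText = doc1.stressed()
# Remove the stray space left before commas in the stressed output
result = re.sub(r' ,', ",", searchText)
# Put the result back on the clipboard for the following Paste step
xerox.copy(result)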

This presumes that udar and its prerequisites have already been installed, of course.
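
The `re.sub` call in the script can be tried on its own, without udar installed. This is only a minimal sketch: it assumes, as the regex implies, that the stressed output leaves a space before commas. The function name `tidy_commas` and the sample string are hypothetical.

```python
import re


def tidy_commas(text: str) -> str:
    """Remove the stray space that precedes a comma."""
    return re.sub(r" ,", ",", text)


# Hypothetical sample resembling stressed output with a detached comma
sample = "Приве́т , мир"
print(tidy_commas(sample))
```

Only the space-comma pair is rewritten; stress marks and all other characters pass through unchanged.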

So why not build out this idea as an actual macOS text service? In theory, it should be possible, maybe even ideal. The obstacle is Python version management, for which I use pyenv. Because some of the UDAR dependencies will not run under my current system Python version of 3.8.5, I use 3.7.4 under pyenv, and it appears that the correct Python version does not run under whatever environment the service launches. Someone with deeper system knowledge could undoubtedly figure it out, but instead I accomplished the same effect via a Keyboard Maestro macro.

Solzhenitsyn on the folly of looking for good/evil dichotomies

Постепенно открылось мне, что линия, разделяющая добро и зло, проходит не между государствами, не между классами, не между партиями, — она проходит через каждое человеческое сердце — и черезо все человеческие сердца. Линия эта подвижна, она колеблется в нас с годами. Даже в сердце, объятом злом, она удерживает маленький плацдарм добра. Даже в наидобрейшем сердце — неискоренённый уголок зла.

Alexander Solzhenitsyn, The Gulag Archipelago

My rough translation to English:

“Gradually it was revealed to me that the line separating good from evil passes not between States, nor between classes or parties. It passes through every human heart. The line shifts; it oscillates in us with the years. Even in a heart overwhelmed by evil, it retains a small bridgehead of good. Even in the kindest of hearts, there is a corner of evil not uprooted.”

Solzhenitsyn is Luke Skywalker on the forest moon confronting Vader: “I know there is still good in you.”

On not minding what happens

Over-involvement in the future must be our most maladaptive trait. Back in the 1970s in Ojai, when Jiddu Krishnamurti drew enormous crowds to his extemporaneous talks, he touched on the liberation that comes from releasing the pointless hold on the future.

“Do you want to know what my secret is? You see, I don’t mind what happens.”

Jiddu Krishnamurti, lecture, Ojai, California, USA; late 1970s

That’s it. Of all the teachings from the broad wisdom traditions, his one secret was not minding what happens.

We're all imposters

Reading Oliver Burkeman’s last advice column in a decade-long series in The Guardian, I was struck by his advice on imposter syndrome:

“The solution to imposter syndrome is to see that you are one…Humanity is divided into two: on the one hand, those who are improvising their way through life, patching solutions together and putting out fires as they go, but deluding themselves otherwise; and on the other, those doing exactly the same, except that they know it.”

The Buddha was a list-maker

Beginning with “The Four Noble Truths”, “The Noble Eightfold Path”, and so on, the Buddha was a list-maker. I recently found a wonderful book, now out of print but freely available as a PDF. Written by David Snyder, Ph.D., it is called “The Complete Book of Buddha’s Lists - Explained”. Snyder does a brilliant job of reinterpreting these lists and framing them in the context of what the social sciences say about how we function individually and in groups.

Beginning to experiment with Stanza for natural language processing

After installing Stanza as a dependency of UDAR, which I recently described, I decided to play around with what it can do.

Installation

The installation is straightforward and is documented on the Stanza getting started page. First,

sudo pip3 install stanza

Then install a model. For this example, I installed the Russian model:

#!/usr/local/bin/python3
import stanza
stanza.download('ru')

Usage

Part-of-speech (POS) and morphological analysis

Here’s a quick example of POS analysis for Russian.

Automated marking of Russian syllabic stress

One of the challenges that Russian learners face is the placement of syllabic stress, an essential determinant of pronunciation. Although most pedagogical texts for students have marks indicating stress, practically no texts intended for native speakers do. The placement of stress is inferred from memory and context. I was delighted to discover Dr. Robert Reynolds’ work on natural language processing of Russian text to mark stress based on grammatical analysis of the text.

sed matching whitespace on macOS

sed is such a useful pattern-matching and substitution tool for work on the command line. But there’s a little quirk on macOS that will trip you up. It tripped me up. On most platforms, \s is the character class for whitespace. It’s ubiquitous in regexes. But on macOS, it doesn’t work. In fact, it silently fails. Consider this bash one-liner which looks like it should work but doesn’t: # should print I am corrupt (W.
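
One portable way to match whitespace is the POSIX bracket class `[[:space:]]`, which both BSD sed (as shipped with macOS) and GNU sed accept. The following is a minimal sketch, wrapped in Python only to keep the examples here in one language. The function name `squeeze_whitespace` and the sample string are hypothetical, and it assumes a `sed` binary on the PATH.

```python
import subprocess


def squeeze_whitespace(text: str) -> str:
    """Replace each run of whitespace within a line with one underscore, using sed."""
    completed = subprocess.run(
        # Basic-regex form of "one or more whitespace characters";
        # [[:space:]] is the POSIX class, used here instead of \s
        ["sed", "s/[[:space:]][[:space:]]*/_/g"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


print(squeeze_whitespace("I am\tcorrupt\n"), end="")
```

Because sed works line by line, the trailing newline is not part of the matched text and is left in place.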

Partitioning a large directory into subdirectories by size

Since I’m not fond of carrying around all my photos on a cell phone where they’re perpetually at risk of loss, I periodically dump the image and video files to a drive on my desktop for later burning to optical disc. Saving these images in archival form is a hedge in case my existing backup system fails someday. I’m using Blu-ray optical discs to archive these image and video files, and each stores 25 GB of data.
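
The partitioning problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily the approach the post goes on to describe: it uses a first-fit-decreasing heuristic, takes 25 GB to mean 25 * 10**9 bytes, and works on a list of (name, size) pairs rather than touching the file system. The names `partition_by_size` and `DISC_CAPACITY` are hypothetical.

```python
from typing import List, Tuple

# Assumed capacity of one disc in bytes; real usable capacity may differ
DISC_CAPACITY = 25 * 10**9


def partition_by_size(
    files: List[Tuple[str, int]], capacity: int = DISC_CAPACITY
) -> List[List[str]]:
    """Group file names into bins whose total size does not exceed capacity.

    Uses first-fit decreasing: take the largest files first and place each
    one in the first bin that still has room, opening a new bin if none does.
    """
    bins: List[List[str]] = []
    free: List[int] = []  # remaining room in each bin, parallel to bins
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} does not fit on a single disc")
        for i, room in enumerate(free):
            if size <= room:
                bins[i].append(name)
                free[i] -= size
                break
        else:
            # No existing bin had room, so start a new one
            bins.append([name])
            free.append(capacity - size)
    return bins


# Toy sizes with a small capacity so the grouping is easy to follow
demo = [("a.mov", 6), ("b.jpg", 5), ("c.jpg", 4), ("d.mov", 3)]
print(partition_by_size(demo, capacity=10))
```

Each inner list would correspond to one subdirectory to burn. A real script would gather sizes with something like `os.path.getsize` and then move the files into place.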

More chorus repetition macros for Audacity

In a previous post I described macros to support certain tasks in generating source material for L2 chorus repetition practice. Today, I’ll describe two other macros that automate this practice by slowing the playback speed of the repetition.

Background

I’ve described the rationale for chorus repetition practice in previous posts. The technique I describe here is to slow the sentence playback speed to give the learner time to build speed by practicing slower repetitions.