Regex to match a cloze

Anki and some other platforms use a particular format to signify cloze deletions in flashcard text. It has a format like any of the following:

  • {{c1::dog::}}
  • {{c2::dog::domestic canine}}

Here’s a regular expression that matches the content of cloze deletions in an arbitrary string, keeping only the main clozed word (in this case dog.)


To see it in action, here it is in action in a Python script:

import re

def stripCloze(searchText):
    return re.sub(r'{{c\d::(.*?)(::[^:]+)?}}', r"\1", searchText)

print(stripCloze("The {{c1::passengers::tourist riders}} spotted a breaching {{c2::whale}}."))

It should return The passengers spotted a breaching whale.

Removing stress marks from Russian text

Previously, I wrote about adding syllabic stress marks to Russian text. Here’s a method for doing the opposite - that is, removing such marks (ударение) from Russian text.

Although there may well be a more sophisticated approach, regex is well-suited to this task. The problem is that

def string_replace(dict,text):
   sorted_dict = {k: dict[k] for k in sorted(dict)}
   for n in sorted_dict.keys():
      text = text.replace(n,dict[n])
   return text

dict = { "а́" : "а", "е́" : "е", "о́" : "о", "у́" : "у",
      "я́" : "я", "ю́" : "ю", "ы́" : "ы", "и́" : "и",
      "ё́" : "ё", "А́" : "А", "Е́" : "Е", "О́" : "О",
      "У́" : "У", "Я́" : "Я", "Ю́" : "Ю", "Ы́" : "Ы",
      "И́" : "И", "Э́" : "Э", "э́" : "э"
print(string_replace(dict, "Существи́тельные в шве́дском обычно де́лятся на пять склоне́ний."))

This should print: Существительные в шведском обычно делятся на пять склонений.

"Delete any app that makes money off your attention."

Listening to Cal Newport interviewed on a recent podcast, something he said resonated. I’m probably paraphrasing, but a key piece of advice was: “Delete any app that makes money off your attention." Seems like really good advice. A smartphone is a collection of tools embedded in a tool. Use it like a tool and not an entertainment device and you’ll be find. For a while, in an effort to pry myself loose from the psychic hold of the smartphone I went back to using some kind of old flip phone.

URL-encoding URLs in AppleScript

The AppleScript Safari API is apparently quite finicky and rejects Russian Cyrillic characters when loading URLs. For example, the following URLстоять#Russian throws an error in AppleScript. Instead, Safari requires URL’s of the form whereas Chrome happily consumes whatever comes along. So, we just need to encode the URL thusly: use framework "Foundation" -- encode Cyrillic test as "%D0" type strings on urlEncode(input) tell current application's NSString to set rawUrl to stringWithString_(input) -- 4 is NSUTF8StringEncoding set theEncodedURL to rawUrl's stringByAddingPercentEscapesUsingEncoding:4 return theEncodedURL as Unicode text end urlEncode When researching Russian words for vocabulary study, I use the URL encoding handler to load the appropriate words into several reference sites in sequential Safari tabs.

Consume media outside one's bubble?

That “reality bubbles” contribute heavily to increasing political polarization is well-known. Customized media diets at scale and social media feeds that are tailored to individual proclivities progressively narrow our understanding of perspectives other than our own. Yet, the cures are difficult and uncertain. Often, though, we’re advised to consume media from the other side of the political divide. A sentence from a recent piece in The Atlantic encapsulates why I think this is such a fraught idea:

Свидетельство того или тому?

I was puzzled by this sentence on the BBC Russian Service: Нет свидетельств тому, что на нынешних выборах дело обстоит иначе. ББС Мошенничество на выборах в США? Проверяем факты в речи Трампа It means “There is no evidence that in the current election things are any different." but the puzzle isn’t the meaning, it’s the grammatical case in which the author has placed the demonstrative pronoun то , which is dative here тому .

Escaping "Anki hell" by direct manipulation of the Anki sqlite3 database

There’s a phenomenon that verteran Anki users are familiar with - the so-called “Anki hell” or “ease hell.” Origins of ease hell The descent into ease hell has to do with the way Anki handles correct and incorrect answers when it presents cards for review. Ease is a numerical score associated with every card in the database and represents a valuation of the difficulty of the card. By default, when cards graduate from the learning phase, an ease of 250% is applied to the card.

Typing Russian stress marks on macOS

While Russian text intended for native speakers doesn’t show accented vowel characters to point out the syllabic stress (ударение) , many texts intended for learners often do have these marks. But how to apply these marks when typing? Typically, for Latin keyboards on macOS, you can hold down the key (like long-press on iOS) and a popup dialog will show you options for that character. But in the standard Russian phonetic keyboard it doesn’t work.

Stripping surveillance parameters from Facebook and Google links

While largely opaque to most users, Facebook and Google massage any links that you acquire on their sites to include data used to track you around the web. This script attempts to strip these surveillance parameters from the URL’s. It is by no means all-inclusive. Imaginably, there are links that I haven’t yet encountered and that need to be considered in a future version. So consider this a proof-of-concept. The problem For example, I performed a Google search1 for “Smarties”.

Predictions 2021

Predictions for 2021 Humans are notoriously poor at assigning probabilities to events, even those that are highly relevant to their daily lives. This year I’m making a deliberate attempt to calibrate my prediction abilities by correlating predictions with reality. The judgments of truth of these outcomes will be made on December 31, 2021, although some of the outcomes will have been decided substantially in advance of that. Coronavirus An effective vaccine will be widely available in Canada: 70%.