Process automation in building Anki vocabulary cards

For the last two years, I’ve been working through a 10,000-word Russian vocabulary list ordered by frequency. My goal is to finish the list before the end of 2019. This requires not only stubborn persistence but also an efficient process for collecting the information that goes onto my Anki flash cards.

My manual process has been to work from a Numbers spreadsheet. As I collect information about each word from several websites, I log it in this table.

More JavaScript with Anki

I wrote a piece previously about using JavaScript in Anki cards. Although I haven’t found many uses for this idea, it does come up from time to time, including a recent use case that I’m writing about now.

After downloading a popular French frequency list deck for my daughter to use, I noticed that it omits the gender of nouns in the French prompt. In school, I was always taught to memorize the gender along with the noun. For example, when you memorize the word for law, “loi”, you should memorize it with either the definite article “la” or the indefinite article “une” so that the feminine gender of the noun is inseparable from the noun itself. But this deck has only the noun prompt, and I was afraid that my daughter would fail to memorize the noun’s gender. JavaScript to the rescue.

Since the gender is encoded in a field, we can capitalize on that to insert the right article. My preference is to use the definite articles “le” or “la” where possible. But it gets increasingly complex from there. Nouns that begin with a vowel such as “avocat” require “l’avocat” which obscures the gender. In that case, I’d prefer the indefinite article “un avocat”. Then there’s the “h”. Most words beginning with “h” behave like those with vowels. But some words have h aspiré. With those words, we keep the full definite article without the apostrophe.

So we start with a couple of easy preliminaries, such as detecting vowels:

//	returns true if the character
//	is a vowel, including accented vowels
//	such as the "é" in "école"
function vowelTest(s) {
   return (/^[aeiouàâäéèêëîïôöùûüœ]$/i).test(s);
}

Now we turn our attention to whether a word would need an apostrophe with the definite article. I’m not actually going to use the apostrophe. Instead, we’ll fall back to the indefinite article “un/une” in this case.

// returns true if the word would need
// an apostrophe if used with the
// definite article
function needsApostrophe(str) {
    // compare in lower case so that capitalized
    // words such as "Hollande" are matched too
    var word = str.toLowerCase();
    if(word[0] === 'h') {
        //	h aspiré words that do not need an apostrophe
        var aspire = ["hache","hachisch","haddock","haïku",
            "haillon","haine","hall",
            "halo","halte","hamac",
            "hamburger","hameau","hammam",
            "hampe","hamster","hanche",
            "hand-ball","handicap","hangar",
            "harde","hareng","hargne",
            "haricot","harpail","harpon",
            "hasard","hauteur","havre","hère",
            "hérisson","hernie","héron",
            "héros","herse","hêtre",
            "hiatus","hibou","hic",
            "hickory","hiérarchie","hiéroglyphe",
            "hobby","hollande","homard",
            "hongrie","honte","hoquet",
            "houe","houle","hooligan",
            "houppe","housse","houx",
            "houblot","huche","huguenot"
            ];
        return (aspire.indexOf(word) === -1);
    }
    return vowelTest(word[0]);
}

Now we can wrap this up into a function that adds an article, either definite or indefinite, to the noun:

//	adds either definite or indefinite article
function addArticle(str,genderstr) {
    if( needsApostrophe(str) ) {
        return (genderstr == "nm") ? "un " + str : "une " + str;
    }
    return (genderstr == "nm") ? "le " + str : "la " + str;
}
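
As a quick sanity check, here is a minimal, self-contained sketch of how these functions behave on a few sample words. The helpers are pared-down copies of the ones above, with the h aspiré list cut to three entries for brevity. The feminine code "nf" is my assumption about how the deck labels feminine nouns, since only "nm" appears in the code above.

```javascript
// Pared-down copies of the helpers so that this block runs on its own.
function vowelTest(s) {
    return (/^[aeiou]$/i).test(s);
}

function needsApostrophe(str) {
    if (str[0] == 'h') {
        // abbreviated h aspiré list, for illustration only
        var aspire = ["haricot", "hibou", "honte"];
        return (aspire.indexOf(str) == -1);
    }
    return vowelTest(str[0]);
}

function addArticle(str, genderstr) {
    if (needsApostrophe(str)) {
        return (genderstr == "nm") ? "un " + str : "une " + str;
    }
    return (genderstr == "nm") ? "le " + str : "la " + str;
}

// "nf" is an assumed code for feminine nouns
var samples = [["loi", "nf"], ["avocat", "nm"], ["heure", "nf"], ["hibou", "nm"]];
var results = samples.map(function (pair) {
    return addArticle(pair[0], pair[1]);
});
console.log(results.join(", "));
// prints: la loi, un avocat, une heure, le hibou
```

The consonant-initial noun and the h aspiré noun keep the definite article, while the vowel-initial noun and the ordinary "h" noun fall back to the indefinite article.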

The first step is to make sure that the part of speech field is visible to the script. We do this by inserting it into the card template.

<span id="pos">{{Part of Speech}}</span>

Don’t worry, we’ll hide it in a minute.

Then we can obtain the contents of the field and add the gender-specific article accordingly.

var content = document.getElementById("pos").innerHTML;
var fword = document.getElementsByClassName("frenchwordless")[0].innerHTML;
var artword = addArticle(fword,content);
document.getElementsByClassName("frenchwordless")[0].innerHTML=artword;

And we can hide the gender sentinel field:

document.getElementById("pos").style.visibility = "hidden";
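
The template steps above can also be gathered into a single function. What follows is only a sketch: `decoratePrompt` is a hypothetical name, the "nf" code for feminine nouns is an assumption about the deck, and the function takes the document and the article function as parameters so that it can be tried outside Anki. It leaves the prompt untouched when the field does not hold a noun code, since otherwise a verb or an adjective would pick up an article too.

```javascript
// Hypothetical helper that combines the template steps above.
// doc:       the page's document object
// articleFn: a function (word, gender) -> string, such as addArticle
function decoratePrompt(doc, articleFn) {
    var posEl = doc.getElementById("pos");
    var wordEl = doc.getElementsByClassName("frenchwordless")[0];
    if (!posEl || !wordEl) {
        return false;   // the template lacks the expected elements
    }
    // trim stray whitespace left by the template
    var pos = posEl.innerHTML.trim();
    var word = wordEl.innerHTML.trim();
    // hide the sentinel field whether or not the word is a noun
    posEl.style.visibility = "hidden";
    // "nm" comes from the code above; "nf" is an assumed feminine code
    if (pos !== "nm" && pos !== "nf") {
        return false;
    }
    wordEl.innerHTML = articleFn(word, pos);
    return true;
}
```

In a card template this would be called as `decoratePrompt(document, addArticle);` after the functions above have been defined.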

Ideally, French Anki decks would be constructed in such a way that the gender is embedded in the noun to be memorized, but with a little creative use of JavaScript, we can retool an existing deck on the fly.

An approach to dealing with spurious sensor data in Indigo

Spurious sensor data can wreak havoc in an otherwise finely tuned home automation system. I use temperature data from an Aeotec MultiSensor 6 to monitor the environment in our greenhouse. Living in Canada, I cannot rely solely on passive systems to maintain the temperature, particularly at night. So, using the temperature and humidity measurements transmitted back to the controller over Z-Wave, I control devices inside the greenhouse that heat and humidify the environment.

Follow the intent.

With Trump, the usual advice to “follow the money” doesn’t work because Congress refuses to force him to disclose his conflicts of interest. As enormous and material as those conflicts must be, I’m just going to focus on what I can see with my own eyes: the man’s apparent intent.

In his public life, Donald Trump has never done anything that did not personally and directly benefit him. Most of us, as we go through life, assemble a collection of acts that are variously self-serving and other-serving. This is the way of life. Normal life. With Trump, not so. Even his meager philanthropic acts are tainted with controversy. The man simply cannot act in a sacrificial way. He is incurable.^[In a campaign event in Fort Dodge, Iowa on November 12, 2015, Trump claimed that rival Ben Carson was “pathological” and that “…if you’re pathological, there’s no cure for that, folks, okay? There’s no cure for that.” Since Trump’s own psychopathology is widely questioned, one wonders if he, too, is incurable. Given that narcissistic personality disorder is almost certainly among the potential diagnoses, he probably is incurable.]

They're just paid protesters

In an effort to strip protesters of their legitimacy, Trump and Fox News claim that protesters are simply there because they’re paid by powerful oppositional interests. Never mind that Trump has no evidence for his claim; he has no evidence for practically anything that emerges from his loud mouth. What is more interesting to me is that if money delegitimizes authenticity then presumably we can use this effect to come to additional conclusions.

@realDonaldTrump Russian Twitter bot

Someday, when I have time to burn, I’m going to write a Twitter bot that takes all of Trump’s vacuous tweets and translates them into Russian. It’ll look like this:

There’s something ludicrous about the idea of Trump, who is distractible, impatient, and incurious, being able to learn Russian, an incredibly difficult language.

marking time


marking time,
eyes glazed, pupils constricted
to the head of a pin
from facing the blue white sterile light
for too long
a zombie tribe
numbering in the millions
if not more
waits.

this throng, agitated
in a subdued anesthetized
way,
crowns one of its own
a clown of sorts
knowing little of the past
less of the present
and practically nothing
of the future.
“why not? it could be worse.”

in a strange unreality
a vaudeville show becomes
its own rehearsal,
a dreamish state from which
only an atomic flash
can awaken a person.

13 Random thoughts about Canada after living here for a year.

On January 1, 2016 we packed up all our earthly goods and headed south to Canada. (Yes, it’s true. When you live in Minnesota, it’s possible to move south to Canada. Look at the map!) Having lived here for a little over a year, here are some thoughts about living here, in no particular order:

  1. “Sorry” is more of a greeting than just an apology.
  2. Canadians really are polite; but put them behind the wheel of a car and all bets are off.
  3. Universal healthcare works. Americans love to go on and on about socialized medicine; but I’m here to tell you: it works.
  4. Bumper stickers are rare here.
  5. People don’t really talk politics. Well, they talk about U.S. politics.
  6. Left turn arrows on traffic lights are rare. It makes for interesting moments when the light changes.
  7. The electric utility is called “hydro”, which given the Greek origin of the word makes little sense until you realize that it stands for “hydroelectric.”
  8. Youth music is well-supported - both through private and public funding.
  9. State-church separation is fuzzier. For example, the Catholic school system is taxpayer-funded. But only the Catholic schools. It has something to do with the Canadian Charter (a.k.a. the Constitution). It was apparently some sort of historical compromise in the 1800s.
  10. Don’t order iced tea in Canada. It’s way too sweet.
  11. As a practical matter, you can’t be elected Prime Minister unless you speak both English and French fluently. This is a really good thing.^[How many languages does Donald Trump speak fluently, for example?]
  12. Speaking of politics, campaigns are time-limited to 6 weeks before an election. How cool is that?
  13. Poutine sounds horrible, but it’s actually pretty good.

Serious audio processing on the command line

I’ve written previously about extracting and processing mp3 files from web pages. The use case that I described, obtaining Russian word pronunciations for Anki cards, is basically the same, although I’m now obtaining many of my words from Forvo. However, Forvo doesn’t seem to apply any dynamic range processing or normalization to the audio files. While many of the pronunciation mp3s are excellent as-is, some need post-processing, chiefly because the amplitude is too low. Being lazy by nature, I set out to find a way of improving the audio quality automatically before I insert the mp3 file into my new vocabulary cards.