2018: A year of experiments

New Year’s resolution time is at hand. But not for me; at least not in a traditional sense. I was inspired by David Cain’s experiments. In short, he conducts monthly experiments in self-improvement.

The idea of an experiment is appealing in ways that a resolution is not. A resolution presumes an outcome and relies only on the long application of will to see it through. An experiment on the other hand, makes only a conjecture about the outcome and can be conducted for a shorter period.

Peering into Anki using R

Yet another diversion to keep me from focusing on actually using Anki to learn Russian. I stumbled on the R programming language, a language that focuses on statistical analysis.

Here’s a couple snippets that begin to scratch the surface of what’s possible. Important caveat: I’m an R novice at best. There are probably much better ways of doing some of this…

Counting notes with a particular model type

Here we’ll use R to do what we did previously with Python.

Language word frequencies

Since one of the cornerstones of my approach to learning the Russian language has been to track how many words I’ve learned and their frequencies, I was intrigued by reading the following statistics today:

  • The 15 most frequent words in the language account for 25% of all the words in typical texts.
  • The first 100 words account for 60% of the words appearing in texts.
  • 97% of the words one encounters in a ordinary text will be among the first 4000 most frequent words.

In other words, if you learn the first 4000 words of a language, you’ll be able to understand nearly everything.

Anki database adventures: Counting notes by model type

Continuing my series on accessing the Anki database outside of the Anki application environment, here’s a piece on accessing the note type model. You may wish to start here with the first article on accessing the Anki database. This is geared toward mac OS. (If you’re not on mac OS, then start here instead.)

The note type model

Since notes contain flexible fields in Anki, the model for a note type is in JSON. The best guess definition of the JSON is:

Accessing the Anki database with Python: Working with a specific deck

I previously wrote about accessing the Anki database using Python on mac OS. Extending that post, I’ll show how to work with a specific deck in this short post.

To use a named deck you’ll need its deck ID. Fortunately there’s a built-in method for finding a deck ID by name:

col = Collection(COLLECTION_PATH)
dID = col.decks.id(DECK_NAME)

Now in queries against the cards and notes tables we can apply the deck ID to restrict them to a certain deck. For example, to find all of the cards currently in the learning stage:

Working with the Anki database on mac OS using Python

Not long ago I ran across this post detailing a method for opening and inspecting the Anki database using Python outside the Anki application environment. However, the approach requires linking to the Anki code base which is inaccessible on mac OS since the Python code is packaged into a Mac app on this platform.

The solution I’ve found is inelegant; but just involves downloading the Anki code base to a location on your file system where you can link to it in your code. You can find the Anki code here on github.

Process automation in building Anki vocabulary cards

For the last two years, I’ve been working through a 10,000 word Russian vocabulary ordered by frequency. I have a goal of finishing the list before the end of 2019. This requires not only stubborn persistence but an efficient process of collecting the information that goes onto my Anki flash cards.

My manual process has been to work from a Numbers spreadsheet. As I collect information about each word from several websites, I log it in this table.

More Javascript with Anki

I wrote a piece previously about using JavaScript in Anki cards. Although I haven’t found many uses for employing this idea, it does come up from time-to-time including a recent use-case I’m writing about now.

After downloading a popular French frequency list deck for my daughter to use, I noticed that it omits the gender of nouns in the French prompt. In school, I was always taught to memorize the gender along with the noun. For example, when you memorize the word for law, “loi” you should mermorize it with either the definite article “la” or the indefinite article “une” so that the feminine gender of the noun is inseparable from the noun itself. But this deck has only the noun prompt and I was afraid that my daughter would fail to memorize the noun’s gender. JavaScript to the rescue.