anki

Parsing Russian Wiktionary content using XPath

As readers of this blog know, I’m an avid user of Anki to learn Russian. I have a number of sources for the reference content that goes onto my Anki cards. Notably, I use Wiktionary to get word definitions and the word with the proper syllabic stress marked. (This is an aid to pronunciation for Russian language learners.) Since I’m lazy to the core, I came up with a systematic way of grabbing the stress-marked word from the Wiktionary page using lxml and XPath.

Directly setting an Anki card's interval in the sqlite3 database

It’s always best to let Anki set intervals according to its view of your performance on testing. That said, there are times when directly altering the interval makes sense. For example, to build out a complete representation of the entire Russian National Corpus, I’m forced to enter vocabulary terms that should be obvious to even elementary Russian learners but which aren’t yet in my nearly 24,000-card database. Therefore, I’m entering these cards gradually.
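As a rough illustration of what a direct interval edit could look like, here is a minimal sketch. It assumes the interval is stored in days in an `ivl` column of a `cards` table, and it runs against a simplified in-memory stand-in rather than a real collection. The helper name `set_interval` is hypothetical, and a real collection keeps other scheduling fields (such as the due date) that this sketch does not touch.

```python
import sqlite3

# In-memory stand-in for the collection; only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, ivl INTEGER)")
conn.executemany("INSERT INTO cards VALUES (?, ?)", [(1, 1), (2, 3)])


def set_interval(conn, card_id, days):
    """Set the interval (assumed to be in days) for one card; return rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE cards SET ivl = ? WHERE id = ?", (days, card_id)
        )
    return cur.rowcount


updated = set_interval(conn, 1, 30)
new_ivl = conn.execute("SELECT ivl FROM cards WHERE id = 1").fetchone()[0]
print(updated, new_ivl)
```

The parameterized query keeps the card ID and interval out of the SQL string, which matters if the values come from a script that loops over many cards.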

Regex to match a cloze

Anki and some other platforms use a particular format to signify cloze deletions in flashcard text. It looks like either of the following: {{c1::dog::}} {{c2::dog::domestic canine}} Here’s a regular expression that matches the content of cloze deletions in an arbitrary string, keeping only the main clozed word (in this case, dog): {{c\d::(.*?)(::[^:]+)?}} To see it in action, here it is in a Python script:
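A minimal sketch of how the pattern might be used in Python follows. The braces are escaped here for clarity; otherwise the pattern is the one given above. The sample sentence and the helper name `strip_clozes` are made up for illustration.

```python
import re

# Same pattern as in the text, with the literal braces escaped.
CLOZE_RE = re.compile(r"\{\{c\d::(.*?)(::[^:]+)?\}\}")


def strip_clozes(text):
    """Replace each cloze deletion with its main clozed word."""
    return CLOZE_RE.sub(lambda m: m.group(1), text)


sample = "The {{c1::dog}} chased the {{c2::cat::domestic feline}}."

# Group 1 holds the main clozed word; group 2 (if present) holds the hint.
clozed_words = [m.group(1) for m in CLOZE_RE.finditer(sample)]
stripped = strip_clozes(sample)

print(clozed_words)
print(stripped)
```

Because `\d` matches a single digit, this sketch only handles cloze numbers c0 through c9, as the original pattern does.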

An alternative method for keyboard input switching on macOS

macOS offers a variety of virtual keyboard layouts, which are accessible through System Preferences > Keyboard > Input Sources. Because I spend about half of my time writing in Russian and half in English, rapid switching between keyboard layouts is important. In the Input Sources preference pane, you can optionally choose to use the Caps Lock key to toggle between sources. This almost always works well, with the exception of Anki.

Sunday, September 16, 2018

Regex 101 is a great online regex tester. Speaking of regular expressions, for the past year, I’ve used an automated process for building Anki flash cards. One of the steps in the process is to download Russian word pronunciations from Wiktionary. When Wiktionary began publishing transcoded mp3 files rather than just ogg files, they broke the URL scheme that I relied on to download content. The new regex for this scheme is: (?

Peering into Anki using R

Yet another diversion to keep me from focusing on actually using Anki to learn Russian. I stumbled on the R programming language, a language that focuses on statistical analysis. Here are a couple of snippets that begin to scratch the surface of what’s possible. Important caveat: I’m an R novice at best. There are probably much better ways of doing some of this… Counting notes with a particular model type: here we’ll use R to do what we did previously with Python.

Anki database adventures: Counting notes by model type

Continuing my series on accessing the Anki database outside of the Anki application environment, here’s a piece on accessing the note type model. You may wish to start here with the first article on accessing the Anki database. This is geared toward macOS. (If you’re not on macOS, then start here instead.) The note type model: since notes contain flexible fields in Anki, the model for a note type is in JSON.
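To make the idea concrete, here is a minimal sketch of counting notes by model type. It assumes a schema in which a `col` table holds the models as a JSON object keyed by model ID, and each row in `notes` carries a model ID in `mid`. The data is a made-up in-memory stand-in, and `count_notes_by_model` is a hypothetical helper name.

```python
import json
import sqlite3
from collections import Counter

# In-memory stand-in for the collection; the layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE col (models TEXT)")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, mid INTEGER)")

models = {
    "1001": {"id": 1001, "name": "Basic"},
    "1002": {"id": 1002, "name": "Cloze"},
}
conn.execute("INSERT INTO col VALUES (?)", (json.dumps(models),))
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)", [(1, 1001), (2, 1001), (3, 1002)]
)


def count_notes_by_model(conn):
    """Return a dict mapping model name to the number of notes using it."""
    (models_json,) = conn.execute("SELECT models FROM col").fetchone()
    # JSON keys are strings, so convert them to match the integer mid column.
    names = {int(mid): m["name"] for mid, m in json.loads(models_json).items()}
    counts = Counter()
    for mid, n in conn.execute("SELECT mid, COUNT(*) FROM notes GROUP BY mid"):
        counts[names.get(mid, "unknown")] += n
    return dict(counts)


counts = count_notes_by_model(conn)
print(counts)
```

The grouping is done in SQL and the name lookup in Python, since the model names live inside the JSON rather than in their own table.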

Accessing the Anki database with Python: Working with a specific deck

I previously wrote about accessing the Anki database using Python on macOS. Extending that post, I’ll show how to work with a specific deck in this short post. To use a named deck you’ll need its deck ID. Fortunately there’s a built-in method for finding a deck ID by name:

    col = Collection(COLLECTION_PATH)
    dID = col.decks.id(DECK_NAME)

Now in queries against the cards and notes tables we can apply the deck ID to restrict them to a certain deck.
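A query restricted by deck ID might look something like the sketch below. It uses plain sqlite3 against a simplified in-memory stand-in rather than the Anki `Collection` object, and it assumes that `cards` has `nid` (note ID) and `did` (deck ID) columns and that `notes` has an `sfld` sort-field column. The deck ID 100 and the helper name `notes_in_deck` are placeholders.

```python
import sqlite3

# In-memory stand-in for the collection; only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE notes (id INTEGER PRIMARY KEY, sfld TEXT);
    CREATE TABLE cards (id INTEGER PRIMARY KEY, nid INTEGER, did INTEGER);
    INSERT INTO notes VALUES (1, 'собака'), (2, 'кошка'), (3, 'дом');
    INSERT INTO cards VALUES (10, 1, 100), (11, 2, 100), (12, 3, 200);
    """
)


def notes_in_deck(conn, deck_id):
    """Return (note id, sort field) for every note with a card in the deck."""
    query = """
        SELECT DISTINCT n.id, n.sfld
        FROM notes AS n
        JOIN cards AS c ON c.nid = n.id
        WHERE c.did = ?
        ORDER BY n.id
    """
    return conn.execute(query, (deck_id,)).fetchall()


deck_notes = notes_in_deck(conn, 100)  # 100 stands in for dID
print(deck_notes)
```

The join goes through `cards` because, under this assumed layout, the deck is recorded per card rather than per note.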

Working with the Anki database on mac OS using Python

Not long ago I ran across this post detailing a method for opening and inspecting the Anki database using Python outside the Anki application environment. However, the approach requires linking to the Anki code base, which is inaccessible on macOS since the Python code is packaged into a Mac app on this platform. The solution I’ve found is inelegant, but it just involves downloading the Anki code base to a location on your file system where you can link to it in your code.

Process automation in building Anki vocabulary cards

For the last two years, I’ve been working through a 10,000-word Russian vocabulary list ordered by frequency. I have a goal of finishing the list before the end of 2019. This requires not only stubborn persistence but also an efficient process for collecting the information that goes onto my Anki flash cards. My manual process has been to work from a Numbers spreadsheet. As I collect information about each word from several websites, I log it in this table.