While my Russian Anki deck contains around 27,000 cards, I’m always making more. (There are a lot words in the Russian language!) Over the years, I’ve become more and more efficient with card production but one of the missing pieces was finding a code-readable source of word definitions. There’s no shortage of dictionary sites, but scraping data from any site is complicated by the ways in which front-end developers spread the semantic content across multiple HTML tags arranged in deep and cryptic hierarchies.
It’s possible to use cloze deletion cards within standard Anki note types using the Anki Cloze Anything setup. But additional scripts are required to allow it to function seamlessly in a typical language-learning environment. I’ll show you how to flexibly display a sentence with or without Anki Cloze Anything markup and also not break AwesomeTTS. Anki’s built-in cloze deletion system The built-in cloze deletion feature in Anki is an excellent way for language learners to actively test their recall.
I think this is the last word on fixing Knowclip .apkg files. I’ve developed this in bits and pieces; but hopefully this is the last word on the subject. See my previous articles, here and here, for the details. This issue, again, is that Knowclip gives these notes and cards sequential id values starting at 1. But Anki uses the note.id and the card.id as the creation date. I logged it as an issue on Github, but as of 2021-04-15 no action has been taken.
(N.B. A much-improved version of this script is published in a later post) Fixing the Knowclip note files as I described previously, it turns out, is only half of the fix with the broken .apkg files. You also need to fix the cards table. Why? Same reason. The rows are number sequentially from 1. But since Anki uses the card id field as the date added, the added field is always wrong.
(N.B. A much-improved version of this script is published in a later post) Language learners who want to develop their listening comprehension skills often turn to YouTube for videos that feature native language content. Often these videos have subtitles in the original language. A handful of applications allow users to take these videos along with their subtitles and chop them up into sentence-length bites that are suitable for Anki cards. Once such application is Knowclip.
The Anki add-on AwesomeTTS has been a vital tool for language learners using the Anki application on the desktop. It allows you to have elements of the card read aloud using text-to-speech capabilities. The new developer of the add-on has added a number of voice options, including the Microsoft Azure voices. The neural voices for Russian are quite good. But they have one major issue, syllabic stress marks that are sometimes seen in text intended for language learners cause the Microsoft Azure voices to grossly mispronounce the word.
After developing a rudimentary approach to detecting resistant language learning cards in Anki, I began teasing out individual factors. Once I was able to adjust the number of lapses for the age of the card, I could examine the effect of different factors on the difficulty score that I described previously. Findings Some of the interesting findings from this analysis: Prompt-answer direction - 62% of lapses were in the Russian → English (recognition) direction.
Regardless of how closely you adhere to the 20 rules for formating knowledge, there are cards that seem destined to leechdom. For me part of the problem is that with languages, straight-up vocabulary cards take words out of the rich context in which they exist in the wild. With my maturing collection of Russian decks, I recently started to go through these resistant cards and figure out why they are so difficult.
As readers of this blog know, I’m an avid user of Anki to learn Russian. I have a number of sources for reference content that go onto my Anki cards. Notably, I use Wiktionary to get word definitions and the word with the proper syllabic stress marked. (This is an aid to pronunciation for Russian language learners.) Since I’m lazy to the core, I came up with a system way of grabbing the stress-marked word from the Wiktionary page using lxml and XPath.
It’s always best to let Anki set intervals according to its view of your performance on testing. That said, there are times when directly altering the interval makes sense. For example, to build out a complete representation of the entire Russian National Corpus, I’m forced to enter vocabulary terms that should be obvious to even elementary Russian learners but which aren’t yet in my nearly 24,000 card database. Therefore, I’m entering these cards gradually.