Monday, May 23, 2022

We are all confident idiots - David Dunning (of the Dunning-Kruger effect) in this 2014 article discusses the effect and gives some interesting narrative commentary.

An ignorant mind is precisely not a spotless, empty vessel, but one that’s filled with the clutter of irrelevant or misleading life experiences, theories, facts, intuitions, strategies, algorithms, heuristics, metaphors, and hunches that regrettably have the look and feel of useful and accurate knowledge.

The effect has taken a lot of hits recently over the statistical underpinning in the original paper, but the overconfidence effect is demonstrably alive and well.


An interesting article on frisson, a complex of physical and emotional phenomena that occur on encountering some aesthetic stimulus; in the case of the article, music. I get this. Certain types of passages reliably create this sense of ecstasy and longing and, physically, goosebumps.


A list of PKM systems. That’s “personal knowledge management” systems. I use DEVONthink for storing, linking and synthesizing notes, but I have a weakness for investigating other systems.


Improving the efficiency of Hugo static site deployment to S3. I’m really proud of this solution to a vexing problem. My upload times for this site are now around 45 seconds or less, compared to 10 minutes previously.

Three-line (though non-standard) interlinear glossing

Still thinking about interlinear glossing for my language learning project. The leipzig.js library is great but my use case isn’t really what the author had in mind. I really just need to display a unit consisting of the word as it appears in the text, the lemma for that word form, and (possibly) the part of speech. For academic linguistics purposes, what I have in mind is completely non-standard. The other issue with leipzig.

Splitting text into sentences: Russian edition

Splitting text into sentences is one of those tasks that looks simple but on closer inspection is more difficult than you think. A common approach is to use regular expressions to divide up the text on punctuation marks. But without adding layers of complexity, that method fails on some sentences. This is a method using spaCy.
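To make the failure concrete, here is a minimal sketch of the regular-expression approach, using only the Python standard library. It is not the spaCy method from the post. The function name `naive_split` and the sample text are made up for illustration. The sample uses "г.", the common abbreviation for "город" (city), which ends in a period without ending the sentence.

```python
import re


def naive_split(text: str) -> list[str]:
    """Split after '.', '!' or '?' whenever whitespace follows.

    Hypothetical helper for illustration; it knows nothing about abbreviations.
    """
    return re.split(r"(?<=[.!?])\s+", text.strip())


# Two sentences; the first contains the abbreviation "г." (город).
sample = "Он родился в г. Москве. Сейчас живёт в Казани."

pieces = naive_split(sample)
for piece in pieces:
    print(piece)
```

The text has two sentences, but the pattern also breaks after the abbreviation and returns three pieces. Handling this with regular expressions alone means maintaining a list of abbreviations and other exceptions, which is the extra complexity mentioned above.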

My favourite Cyrillic font

I’ve tried a lot of fonts for Cyrillic. My favourite is Georgia. As a non-native Russian speaker, there’s something about serif fonts, either on-screen or in print, that makes the text so much more legible.