cli

Getting plaintext into Anki fields on macOS: An update

A few years ago, I wrote about my problems with HTML in Anki fields. If you check out that previous post, you’ll get the backstory about my objection. The gist is this: if you copy something from the web, Anki tries to maintain the formatting. Basically, it just pastes the HTML off the clipboard. Supposedly, Anki offers to strip the formatting with Shift-paste, but I’ve pointed out to the developer specific examples where this fails.

Extracting the title of a web page from the command line

I was using a REST API at https://textance.herokuapp.com/title, but it seemed awfully fragile. Sure enough, this morning the entire application is down. It’s also not open source, and I have no idea who actually runs it. Here’s the solution:

    #!/bin/bash
    url=$(pbpaste)
    curl "$url" -so - | pup 'meta[property=og:title] attr{content}'

It does require pup. On macOS, you can install it via brew install pup. There are other ways that use regular expressions and avoid the dependency on pup, but parsing HTML with regex is not such a good idea.

Bash variable scope and pipelines

I alluded to this nuance involving variable scope in my post on automating PDF processing, but I wanted to expand on it a bit. Consider this little snippet:

    i=0
    printf "foo:bar:baz:quux" | grep -oE '[^:]+' | while read -r line; do
        printf "Inner scope: %d - %s\n" "$i" "$line"
        ((i++))
        [ "$i" -eq 3 ] && break
    done
    printf "====\nOuter scope\ni = %d\n" "$i"

If you run this as a script, rather than interactively in the shell, what will i be in the outer scope?
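To make the scoping question concrete, here is a minimal sketch, assuming bash with its default options (in particular, lastpipe is not enabled). The function names are hypothetical. In the first function the while loop is the last stage of a pipeline, so it runs in a subshell and its changes to i are lost. In the second, the loop reads from a process substitution and runs in the current shell.

```shell
#!/bin/bash

# Loop is the last stage of a pipeline: bash runs it in a subshell,
# so the increments never reach the i declared in this function.
count_with_pipeline() {
  local i=0 line
  printf 'foo:bar:baz:quux' | grep -oE '[^:]+' | while read -r line; do
    i=$((i + 1))
  done
  printf '%d\n' "$i"
}

# Loop reads from a process substitution instead: it runs in the
# current shell, so the increments are visible after the loop ends.
count_with_process_substitution() {
  local i=0 line
  while read -r line; do
    i=$((i + 1))
  done < <(printf 'foo:bar:baz:quux' | grep -oE '[^:]+')
  printf '%d\n' "$i"
}
```

Calling count_with_pipeline prints 0, while count_with_process_substitution prints 4, one for each colon-separated field.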

Automating the handling of bank and financial statements

In my perpetual effort to get out of work, I’ve developed a suite of automation tools to help file statements that I download from banks, credit cards and others. While my setup described here is tuned to my specific needs, any of the ideas should be adaptable for your particular circumstances. For the purposes of this post, I’m going to assume you already have Hazel. None of what follows will be of much use to you without it.

Converting Cyrillic UTF-8 text encoded as Latin-1

This may be obvious to some, but recognizing a character encoding at a glance is not always easy. For example, pronunciation files downloaded from Forvo have the following appearance:

pronunciation_ru_оÑ‚бывание.mp3

How can we extract the actual word from this gibberish? Optimally, the filename should reflect the actual word uttered in the pronunciation file, after all.

Step 1 - Extracting the interesting bits

The gibberish begins after the pronunciation_ru_ and ends before the file extension.
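A rough sketch of both pieces, with hypothetical helper names. The first helper trims the fixed prefix and the extension using shell parameter expansion. The second assumes the gibberish is UTF-8 text that was misread as ISO-8859-1 (Latin-1), so converting it back to Latin-1 bytes should recover the original UTF-8. If the garbled name contains characters that exist only in Windows-1252, that encoding may be the one to try instead.

```shell
#!/bin/bash

# Hypothetical helper: keep only the part between the fixed prefix
# and the file extension.
extract_word() {
  local name=$1
  name=${name#pronunciation_ru_}   # drop the leading "pronunciation_ru_"
  name=${name%.*}                  # drop the trailing extension
  printf '%s\n' "$name"
}

# Hypothetical helper: reads garbled text on stdin and re-encodes it as
# Latin-1. Assumption: the input is UTF-8 bytes that were decoded as
# Latin-1, so the resulting bytes are the original UTF-8 text.
fix_mojibake() {
  iconv -f UTF-8 -t ISO-8859-1
}
```

The two can be chained, for example: extract_word "$filename" | fix_mojibake.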

accentchar: a command-line utility to apply Russian stress marks

I’ve written a lot about applying and removing syllabic stress marks in Russian text because I use it a lot when making Anki cards. This iteration is a command line tool for applying the stress mark at a particular character index. The advantage of these little shell tools is that they can be composable, integrating into different tools as the need arises.

    #!/usr/local/bin/zsh
    while getopts i:w: flag
    do
        case "${flag}" in
            i) index=${OPTARG};;
            w) word=${OPTARG};;
        esac
    done
    if [ $word ]; then
        temp=$word
    else
        read temp
    fi
    outword=""
    for (( i=0; i<${#temp}; i++ )); do
        thischar="${temp:$i:1}"
        if [ $i -eq $index ]; then
            thischar=$(echo $thischar | perl -C -pe 's/(.