Converting Cyrillic UTF-8 text encoded as Latin-1
This may be obvious to some, but recognizing a character encoding at a glance is not always easy.
For example, pronunciation files downloaded from Forvo have the following appearance:
pronunciation_ru_оÑбÑвание.mp3
How can we extract the actual word from this gibberish? Ideally, the filename should reflect the actual word uttered in the pronunciation file, after all.
Step 1 - Extracting the interesting bits
The gibberish begins after the pronunciation_ru_ prefix and ends before the file extension. Any regex tool can tease that out.
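
As a rough illustration of this extraction step, here is a minimal Python sketch. The function name extract_garbled and the exact pattern are assumptions made for this example, not something prescribed by Forvo; the pattern simply captures everything between the pronunciation_ru_ prefix and the final dot of the extension.

```python
import re

# Assumed filename layout: "pronunciation_ru_<garbled word>.<extension>".
# The greedy ".+" backtracks to the last dot, so the extension is excluded.
PATTERN = re.compile(r"^pronunciation_ru_(?P<word>.+)\.[^.]+$")


def extract_garbled(filename):
    """Return the part between the prefix and the extension, or None."""
    match = PATTERN.match(filename)
    return match.group("word") if match else None
```

Calling extract_garbled on a filename that does not follow this layout returns None, which makes it easy to skip unrelated files in a directory listing.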