
Scandinavian Characters and Character Set Conversions

If a site's old server used an encoding other than UTF-8, Scandinavian letters may appear garbled in file names and in file contents.


Scandinavian Letters in File Names

If Scandinavian letters are garbled in file names, for example because the old server used a character set other than UTF-8, the file names can be converted with the convmv command. Common old character sets are Windows-1252, ISO-8859-1 and ISO-8859-15. Depending on which one was in use on the old server, run one of the following commands:

convmv -f ISO-8859-15 -t UTF-8 -r *

or

convmv -f windows-1252 -t UTF-8 -r *

Run the command in WordPress’s main directory, /data/wordpress. Because of the -r parameter, the conversion also covers files in all subfolders. By default, the command only shows what the result of the conversion would be and does not rename anything yet. A conversion from the wrong character set can scramble the file names completely, so check the preview carefully. Once you are certain that the conversion is correct, add the --notest parameter to the command to actually perform the renaming.
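Before adding --notest, it can help to see which names are actually affected. The following is a minimal sketch and not part of convmv: the function name list_non_utf8_names is hypothetical, and it assumes a Linux system where find and iconv are available. It prints every path whose name is not valid UTF-8, which are the names a conversion would be expected to change.

```shell
#!/bin/sh
# Hypothetical helper (not part of convmv): print every path under a
# directory whose name is not valid UTF-8.
list_non_utf8_names() {
    dir="${1:-.}"
    find "$dir" -print | while IFS= read -r path; do
        # iconv exits non-zero when its input is not valid UTF-8
        if ! printf '%s' "$path" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$path"
        fi
    done
}
```

It could be called as list_non_utf8_names /data/wordpress. If the list is empty, the file names are probably already in UTF-8. File names that contain newlines are not handled by this sketch.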

Scandinavian Letters Within Text Files

If Scandinavian letters are garbled inside text files, the file content can be converted with the recode command. The file’s current character set can be checked with the file command, for example:

file example.php

The conversion can be done, for example, by running the command:

recode ISO-8859-15..UTF-8 *.php
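Running a conversion twice on the same file garbles it, because content that is already UTF-8 gets encoded again. The following is a minimal sketch of a guarded conversion. It uses iconv instead of recode, which is an assumption made for this example, and the function name to_utf8_once is hypothetical. Files that are already valid UTF-8 are skipped, and a .bak copy is kept of every file that is changed.

```shell
#!/bin/sh
# Hypothetical helper: convert one file to UTF-8 unless it already is
# valid UTF-8. Uses iconv rather than recode (an assumption of this sketch).
to_utf8_once() {
    from="$1"   # assumed source character set, e.g. ISO-8859-15
    file="$2"
    # Already valid UTF-8 (plain ASCII included): leave the file untouched
    if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
        return 0
    fi
    # Keep a backup; if iconv fails, the original can be restored from it
    cp -p "$file" "$file.bak" || return 1
    iconv -f "$from" -t UTF-8 "$file.bak" > "$file"
}
```

It could be applied to all PHP files in a directory with: for f in *.php; do to_utf8_once ISO-8859-15 "$f"; done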