Quick and dirty filename cleanup

Following on from my last post, I want to talk about replacing spaces in file names with underscores. To do this you can use the tr command. According to the man page, tr will “Translate, squeeze, and/or delete characters from standard input, writing to standard output.” This is a small script that I run after bashpodder finishes; it replaces the spaces with underscores in the names of the files in the current directory.

#!/bin/bash
# Rename each file in the current directory, replacing spaces with underscores.
ls | while IFS= read -r FILE
do
    mv -v "$FILE" "$(echo "$FILE" | tr ' ' '_')"
done

Simple but effective… well, that was until I started coming across file names containing characters with an acute, circumflex, umlaut and so on. Transferring these files to my mp3 player caused it problems, so I modified my script to convert the file names to the ASCII character set first and then replace the spaces.

#!/bin/bash
# Strip non-ASCII characters from each file name, then replace spaces with underscores.
ls | while IFS= read -r FILE
do
    mv -v "$FILE" "$(echo "$FILE" | iconv -f utf-8 -t ASCII -c | tr ' ' '_')"
done

The magic is handled by iconv, which the man page describes as “iconv – Convert encoding of given files from one encoding to another”. In the script it converts from UTF-8 encoding to ASCII. The only problem is that iconv will fail with an error if a character has no equivalent in the target encoding. For that reason I use the -c option, which will “Omit invalid characters from output.” So in the case of a file named Über Schön.ogg, the iconv step converts the filename to “ber Schn.ogg” and the tr step converts that to “ber_Schn.ogg”.
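If you want to see what each step of the pipe does before letting it loose on real files, something like the following sketch prints the intermediate results for a single name. The function name show_stages is made up for this example, and it assumes the name you pass in is UTF-8 encoded.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): show what each
# stage of the pipe does to a single file name, without renaming anything.
show_stages() {
    local name=$1 ascii
    # -c drops characters that have no ASCII equivalent. Some iconv versions
    # still return a non-zero exit status when that happens, hence "|| true".
    ascii=$(printf '%s\n' "$name" | iconv -f UTF-8 -t ASCII -c) || true
    printf 'original:    %s\n' "$name"
    printf 'after iconv: %s\n' "$ascii"
    printf 'after tr:    %s\n' "$(printf '%s\n' "$ascii" | tr ' ' '_')"
}
```

Calling show_stages "Über Schön.ogg" should print the original name, then “ber Schn.ogg”, then “ber_Schn.ogg”.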

I think you’ll agree that that is not ideal but then this is supposed to be a quick and dirty fix.
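For a slightly less dirty take on the same idea, here is a sketch that uses a shell glob instead of reading the output of ls, leaves names that are already clean alone, and refuses to overwrite an existing file. The function names clean_name and clean_dir are my own inventions for this example, and the file names are assumed to be UTF-8.

```shell
#!/bin/bash
# Hypothetical helper: strip non-ASCII characters, then turn spaces into underscores.
clean_name() {
    printf '%s\n' "$1" | iconv -f UTF-8 -t ASCII -c | tr ' ' '_'
}

# Hypothetical helper: rename every file in the current directory to its cleaned name.
clean_dir() {
    local file new
    for file in *; do
        [ -e "$file" ] || continue            # empty directory: the glob matched nothing
        new=$(clean_name "$file")
        [ -n "$new" ] || continue             # nothing left after stripping
        if [ "$new" = "$file" ]; then
            continue                          # name is already clean
        fi
        if [ -e "$new" ]; then
            echo "skipping '$file': '$new' already exists" >&2
            continue
        fi
        mv -v -- "$file" "$new"
    done
}
```

To use it, cd into the directory holding the downloaded files and call clean_dir. The result for Über Schön.ogg is still “ber_Schn.ogg”, so the stripped letters are just as lost as before; the only gain is that the loop is harder to trip up.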

1 Response to Quick and dirty filename cleanup

  1. neb says:

    Thank you – this is exactly what I needed. I looked all over for stripping umlauts from a batch of file names.
