Finding the leaves in a tree

I often have deep directory trees where I am only interested in the deepest layers and not the supporting sub directories. An example of this might be a music collection where you have moved all your albums into directories based on artist name, then album, then disk, then songs etc. I wanted a script to be able to display all the deepest directories in a directory tree. Take this tree as an example:

$ find -type d
.
./x
./x/y
./x/y/z
./a
./a/b
./a/b/c
./1
./b
./b/b
./c
./c/c
./c/c/c
./c/c/c/c
./c/c/c/c/c
./c/c/c/c/c/c
./11
./11/2
./11/2/3
./11/1
./11/1/1
./11/1/1/1
./11/1/11
./11/11
./11/11/11
./z

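I put together this script, which loops over the directories listed by the find command and compares each line to the previous one. If the current line does not start with the previous line plus a slash, the previous directory had no sub directories and is printed.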

$ cat twigs
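#!/bin/bash
# Print only the deepest (leaf) directories below the current directory.
# Note: the for loop splits on whitespace, so directory names with spaces break it.
LASTTWIG=""
for LINE in $(find . -type d); do
  if [ -n "${LASTTWIG}" ]; then
    # find lists a directory's children straight after the directory itself,
    # so a line that does not start with "previous/" means previous was a leaf.
    if [[ "${LINE}" != "${LASTTWIG}/"* ]]; then
      echo "${LASTTWIG}"
    fi
  fi
  LASTTWIG="${LINE}"
done
# The final directory never has anything after it, so print it here.
echo "${LASTTWIG}"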

The last echo is necessary to print the final directory, which will always be an end-of-chain directory. The output looks something like this.

[user@localhost x]$ ./twigs
./x/y/z
./a/b/c
./1
./b/b
./c/c/c/c/c/c
./11/2/3
./11/1/1/1
./11/1/11
./11/11/11
./z
[user@localhost x]$

One strange issue I ran across when using a find -type d | while read LINE;do iteration was that the final echo printed nothing. The pipe | runs the while loop in a sub shell, so the value assigned to the variable inside the loop is lost as soon as the loop ends. The post Variable Scope in Bash helped to clear it up.
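One way around the sub shell problem is to feed the loop from process substitution instead of a pipe, so the loop runs in the current shell and the variable survives. This is only a sketch: the function name twigs and its optional directory argument are my own choices for illustration, and it needs bash rather than plain sh.

```shell
#!/bin/bash
# twigs [DIR]: print the leaf directories below DIR (default: current directory).
# Hypothetical rewrite of the script above using while read.
twigs() {
  local last="" line
  # "done < <(find ...)" keeps the loop in the current shell, unlike "find ... | while".
  while IFS= read -r line; do
    if [ -n "$last" ] && [[ "$line" != "$last/"* ]]; then
      printf '%s\n' "$last"
    fi
    last="$line"
  done < <(find "${1:-.}" -type d)
  # $last is still set here because no sub shell was involved.
  if [ -n "$last" ]; then
    printf '%s\n' "$last"
  fi
}
```

Reading whole lines with IFS= read -r also means directory names containing spaces are handled, which the for loop version cannot do.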


DNS working but not resolving

I’ve had a funny situation on a Solaris 10 box where DNS appeared to be working but host names were not resolving. I was able to confirm DNS was working using dig:

$ dig www.google.com

; <<>> DiG 9.3.4 <<>> www.google.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 412
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 7, ADDITIONAL: 3

;; QUESTION SECTION:
;www.google.com.                        IN      A

;; ANSWER SECTION:
www.google.com.         258091  IN      CNAME   www.l.google.com.
www.l.google.com.       289     IN      A       66.249.91.103
www.l.google.com.       289     IN      A       66.249.91.104
www.l.google.com.       289     IN      A       66.249.91.147
www.l.google.com.       289     IN      A       66.249.91.99

;; AUTHORITY SECTION:
l.google.com.           20626   IN      NS      b.l.google.com.
l.google.com.           20626   IN      NS      c.l.google.com.
l.google.com.           20626   IN      NS      d.l.google.com.
l.google.com.           20626   IN      NS      e.l.google.com.
l.google.com.           20626   IN      NS      f.l.google.com.
l.google.com.           20626   IN      NS      g.l.google.com.
l.google.com.           20626   IN      NS      a.l.google.com.

;; ADDITIONAL SECTION:
a.l.google.com.         83697   IN      A       209.85.139.9
b.l.google.com.         86389   IN      A       64.233.179.9
e.l.google.com.         86008   IN      A       209.85.137.9

;; Query time: 2 msec
;; SERVER: 10.94.70.5#53(10.94.70.5)
;; WHEN: Tue Sep  2 09:44:17 2008
;; MSG SIZE  rcvd: 276

And pinging the IP address showed the host was up.

$ ping 66.249.91.103
66.249.91.103 is alive

However, pinging by host name failed.

$ ping www.google.com
ping: unknown host www.google.com

The mystery begins to clear up when we use the getent command. From Wikipedia: “getent is a unix command that helps a user get entries in a number of important text files called databases. This includes the passwd and group databases which store user information – hence getent is a common way to look up user details on Unix. Since getent uses the same name service as the system, getent will show all information, including that gained from network information sources such as LDAP.

The databases it searches in are: passwd, group, hosts, services, protocols, or networks.”

$ getent hosts www.google.com
$

The point here is that dig bypasses the hosts file and the Name Service Switch and goes directly to the DNS servers, while ping obeys the Name Service Switch settings defined in /etc/nsswitch.conf. The solution is simply to edit that file and add the word dns to the line ipnodes:    files so that it reads:

ipnodes:    files dns
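To spot this sort of thing quickly, a small check like the following could list the host lookup entries that do not mention dns. It is a rough sketch only: the function name missing_dns is made up for this example, it only looks at the hosts and ipnodes lines, and it assumes an awk that supports this syntax.

```shell
#!/bin/bash
# missing_dns [FILE]: print the hosts:/ipnodes: keys in an nsswitch.conf-style
# file that do not list "dns" as a source. Defaults to /etc/nsswitch.conf.
missing_dns() {
  awk '$1 == "hosts:" || $1 == "ipnodes:" {
         found = 0
         for (i = 2; i <= NF; i++) if ($i == "dns") found = 1
         if (!found) print $1
       }' "${1:-/etc/nsswitch.conf}"
}
```

If it prints nothing, both lines already include dns, and comparing getent hosts www.google.com with the dig output is the next thing to try.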


Quick and dirty filename cleanup

In a follow-on from my last post, I want to talk about replacing spaces with underscores. To do this you can use the tr command. From the man page, tr will “Translate, squeeze, and/or delete characters from standard input, writing to standard output.” This is a small script that I run after bashpodder finishes to replace spaces with underscores.

#!/bin/bash
ls | while read -r FILE
do
  mv -v "$FILE" "$(echo "${FILE}" | tr ' ' '_')"
done

Simple but effective …. well that was until I started coming across file names that have an acute, circumflex, umlaut etc in them. Transferring these files to my mp3 player caused it problems so I modified my script to convert all the file names to the ASCII character set first and then replace the spaces.

#!/bin/bash
ls | while read -r FILE
do
  mv -v "$FILE" "$(echo "${FILE}" | iconv -f utf-8 -t ASCII -c | tr ' ' '_')"
done

The magic is handled by iconv. From the man page: “iconv – Convert encoding of given files from one encoding to another”. In the script the program converts from utf-8 encoding to ASCII. The only problem is that it fails with an error if a character has no equivalent in the target encoding. For that reason I use the -c option which will “Omit invalid characters from output.” So in the case of a file named Über Schön.ogg the iconv pipe will convert the filename to “ber Schn.ogg” and the tr pipe will convert that to “ber_Schn.ogg”.

I think you’ll agree that that is not ideal but then this is supposed to be a quick and dirty fix.
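A slightly less dirty variant might wrap the same iconv and tr pipeline in a function and skip files whose names would not change, which avoids mv complaining that source and destination are the same file. This is a sketch under my own assumptions: the names clean_name and rename_clean are invented for this example, and it refuses to overwrite an existing file.

```shell
#!/bin/bash
# clean_name NAME: print NAME with non-ASCII characters dropped and spaces
# replaced by underscores (same pipeline as the script above).
clean_name() {
  printf '%s' "$1" | iconv -f utf-8 -t ASCII -c | tr ' ' '_'
}

# rename_clean [DIR]: rename every entry in DIR (default: current directory).
rename_clean() {
  local dir="${1:-.}" path name new
  for path in "$dir"/*; do
    [ -e "$path" ] || continue
    name="${path##*/}"
    new="$(clean_name "$name")"
    # Skip names that are already clean, and never clobber an existing file.
    if [ "$name" != "$new" ] && [ -n "$new" ] && [ ! -e "$dir/$new" ]; then
      mv -v -- "$path" "$dir/$new"
    fi
  done
}
```

Running rename_clean with no arguments in the download directory would then do roughly what the scripts above do, minus the errors for names that were already fine.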


Bash scripting and files with spaces in them

I’ve decided to post this one as it’s one of those things that I come across so often, but each time I forget the solution. The problem arises when you are scripting with the for command and you run up against filenames that have spaces. Unless every use of the variable is quoted correctly, the shell splits the name at each space. Assume we have two files in a directory, one with spaces in its name and one without.

ken@berta:~/x$ ls -1
file-name-one.txt
file name two.txt

Say we want to rename the files using the mv command. We could use the following for loop:

ken@berta:~/x$ for FILE in *.txt;do mv ${FILE} ${FILE}.bak;done
mv: target `two.txt.bak' is not a directory
ken@berta:~/x$ ls -1
file-name-one.txt.bak
file name two.txt

There was no problem with the first file as it has no spaces. For the second, rather than passing the single argument “file name two.txt” to the mv command, the shell split the unquoted ${FILE} into three separate words, using the space character as a delimiter. What this actually expanded to was this:

ken@berta:~/x$ mv file name two.txt file name two.txt.bak
mv: target `two.txt.bak' is not a directory

Here the move command (mv) is trying to move the files named ‘file’, ‘name’, ‘two.txt’, ‘file’ and ‘name’ into a directory called ‘two.txt.bak’. Obviously those files don’t exist, but as the mv command first checks for the existence of the destination directory ‘two.txt.bak’, it gives the error that there is no directory by that name.

I’ve found a few ways to get around this but the one I like the best is to use the while loop instead. The while loop is nice as you can also use it to read in from a text file.

ken@berta:~/x$ ls -1
file-name-one.txt
file name two.txt
ken@berta:~/x$ ls -1 *.txt| while read FILE;do mv -v "${FILE}" "${FILE}".bak ;done
`file-name-one.txt' -> `file-name-one.txt.bak'
`file name two.txt' -> `file name two.txt.bak'
ken@berta:~/x$ ls -1
file-name-one.txt.bak
file name two.txt.bak
ken@berta:~/x$

Here we pipe the output of the ls command into the while loop, which reads one line, and so one file name, at a time. You need to put the double quotes around the variable or it will be split into multiple arguments.
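For what it’s worth, a for loop over a glob also copes with spaces, provided the variable is double quoted wherever it is used; it is the unquoted ${FILE} that gets split, not the list the for command iterates over. A minimal sketch, with the function name backup_txt and its directory argument invented for this example:

```shell
#!/bin/bash
# backup_txt [DIR]: add .bak to every *.txt file in DIR (default: current directory).
backup_txt() {
  local dir="${1:-.}" FILE
  for FILE in "$dir"/*.txt; do
    # If nothing matches, the pattern is left as-is, so skip non-existent names.
    [ -e "${FILE}" ] || continue
    mv -v -- "${FILE}" "${FILE}.bak"
  done
}
```

The while read version is still the one to reach for when the names come from a text file or another command rather than from a glob.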


HPR Episode on dvgrab

The episode I recorded about my last post has been posted on Hacker Public Radio.


Archiving your DV tapes

I was listening to a podcast about archiving your data and I decided it was time to archive my Digital Video tapes. Good thing I did because although the oldest were only four (4) years old they were giving read errors playing them. Not good ! Warning to all – Copy your DV tapes ASAP.

A 60 minute DV tape takes about 15GB of disk space in raw dv format, which means my 100 tape collection would reach 1.5TB. In a year or two it will be possible for me to afford a 13TB drive but for now re-encoding is the only answer. I looked around for tips on the best format to use and decided on H264 as it seems to be widely supported. I found this excellent post by Elias Torres entitled “How do I manage my personal videos?” and I just copied and pasted the mencoder settings into my script. See the post itself for more information on the mencoder settings.

The tool to use for dv capture on the command line is dvgrab which is part of the Kino project. This is a great tool and allows me to dump out each clip to a different file with the start time as the file name.

dvgrab --opendml --size 0 --autosplit -t

The only gripe is that (on my camera) it doesn’t detect the end of the tape, which means the program never terminates and so it’s difficult to build a bash script around it. I did some searches for workarounds and remembered a post from IBM developerWorks, Linux tip: Controlling the duration of scheduled jobs. This shows how you can start a process in the background and make a note of its process ID, go to sleep for a period of time and then wake up and kill the process. As all my tapes are an hour long, stopping the capture process after 70 minutes seemed like a good setting. As for the rest I just copied and pasted.

This is the full script that would transfer a tape to disk and then encode it using mencoder.

#!/bin/bash
runtime=${1:-70m}
mypid=$$
# Run dvgrab in background
dvgrab --opendml --size 0 --autosplit -t &
dvgrabpid=$!
sleep $runtime
kill -s SIGTERM $dvgrabpid
for file in *.avi
do
mencoder $file -ofps 30000/1001 -nosound -ovc x264 -x264encopts pass=1:bitrate=5500:turbo=1:me=umh:nodct_decimate:interlaced:no8x8dct:threads=4:fast_pskip:nobrdo:trellis=0:nr=350:turbo=1:nomixed_refs:noglobal_header:qp_min=10:qp_max=51:nobime:keyint=290:keyint_min=29:frameref=1:bframes=0:nob_adapt:nob_pyramid:noweight_b:subq=5:chroma_me:nocabac:nodeblock -passlogfile ./h264.log -o /dev/null
mencoder $file  -ofps 30000/1001 -oac mp3lame -lameopts abr:br=140:aq=4:vol=1.5:mode=1:highpassfreq=0:lowpassfreq=0 -ovc x264 -x264encopts pass=2:bitrate=5500:me=umh:nodct_decimate:interlaced:no8x8dct:threads=4:fast_pskip:nobrdo:trellis=0:nr=350:nomixed_refs:noglobal_header:qp_min=10:qp_max=51:nobime:keyint=290:keyint_min=29:frameref=1:bframes=0:nob_adapt:nob_pyramid:noweight_b:subq=5:chroma_me:nocabac:nodeblock -passlogfile ./h264.log -o H264-$file
mv $file ${file}.done
done
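The start, sleep and kill part can be pulled out into a small reusable function. This is only a sketch of the same idea: the name run_for is my own, and it simply gives up on the command after the given time without trying anything more forceful than SIGTERM.

```shell
#!/bin/bash
# run_for DURATION COMMAND [ARGS...]
# Start COMMAND in the background, wait DURATION (anything sleep accepts),
# then send it SIGTERM if it is still running.
run_for() {
  local duration="$1"
  shift
  "$@" &
  local pid=$!
  sleep "$duration"
  # kill -0 only checks whether the process still exists.
  if kill -0 "$pid" 2>/dev/null; then
    kill -s TERM "$pid"
  fi
  # Reap the background job so no zombie is left behind.
  wait "$pid" 2>/dev/null
  return 0
}
```

With this in place the capture step would become run_for 70m dvgrab --opendml --size 0 --autosplit -t. On systems with GNU coreutils, the timeout command does much the same job.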

I actually split the transfer and conversion into two different processes. I had my laptop transfer the tapes and move the rawdv files to a big raid0 (yes that’s raid zero) temp directory on my server. The server then processed the files 24×7.

I wrote a looping script on my laptop so that once I pressed a key it would use the program dvcont rewind to rewind the tape. After 5 minutes that would be terminated and then dvgrab would run as above. The only difference was that instead of converting the files I moved them to the server as *.avi.temp first, and once each transfer had finished I renamed the file to just *.avi. That prevented the looping conversion script from starting to encode a partially transferred video.
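The copy under a temporary name and then rename trick looks roughly like this. It is a sketch rather than my actual script: the function name transfer_done_marker and the plain cp are stand-ins, and it assumes the destination directory is on a single filesystem so that the final mv is a quick rename.

```shell
#!/bin/bash
# transfer_done_marker SRC DESTDIR
# Copy SRC into DESTDIR as NAME.temp, and only rename it to NAME once the
# copy has completed, so a watcher globbing for *.avi never sees a partial file.
transfer_done_marker() {
  local src="$1" destdir="$2" name
  name="${src##*/}"
  cp -- "$src" "$destdir/$name.temp" &&
    mv -- "$destdir/$name.temp" "$destdir/$name"
}
```

If the copy fails part way, the file keeps its .temp suffix and the conversion loop simply ignores it.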

 

Edit: Thanks Vladimir


Use a different email alias for every company you deal with.

I got an email from a chap in Dell telling me of a 10% offer on notebooks. This was strange as I had not given them permission to send me promotional emails. To make matters worse, my email address and those of 486 other people were in the ‘to’ field. About half an hour later I got “—–, —– would like to recall the message, ‘— — — — — — — —’”. Then two hours later the message was sent again, this time using the BCC field.

So he made a mistake and tried to do a recall, but that only works in the Microsoft Exchange world. My biggest problem is that he had access to and used my email address even though I did not opt in for any mailings apart from my order. Spammers could potentially use my email address but I’m not that worried about it, as I always use a different email address for every company I deal with. That way I know which company sold my email address to spammers, or had bad security and got its address list stolen.

Originally I had everything that prefixes my domain name coming to one account but now I make an alias for each address I use. That way company@example.com gets redirected to my real account me@example.com. Anything not for a valid email account or a redirect just gets dropped into a spam folder. I check through this now and again just checking the receiver address to see if it’s something I missed.


Are you interested in recording a radio show ?

Podcasting (on demand Internet Radio) is an excellent way to turn many of those boring tasks we all do into productive time. Be it mowing the lawn or suffering your daily commute, if you can listen while doing an activity then you can make that time more productive by listening to the many podcasts that are available. Regardless of the topic there is probably a podcast that you can subscribe to and enjoy.

If you’re interested in technology and find that you would like to record a podcast but don’t want all the trouble of maintaining an RSS feed, website and forums, then you should consider submitting a podcast to Hacker (in the good sense) Public Radio. This is a collection of people who record an audio show of between 5 and 45 minutes on any technology topic. Regular presenters like myself agree to produce one show per month, but you can submit a show at any time at all.

The topics are wide and varied, from hardware or software reviews to beer brewing tips. If one show is not enough time for a topic then you can do an in-depth series. Don’t worry too much about the audio quality as most sins are forgiven if the topic is good. And the topic can be very technical or completely non-technical, like my recent stop smoking episode.

Most important is that you record something and send it in. After all, who’s going to hear it otherwise?


HPR Episode 140 out :: Device Configuration

Yet another in the LPI Certification series. We’ve already covered a lot of this stuff before but there is no harm in reviewing it again.

LPIC topic 1.101.6 — Configure Communication Devices [1]

Weight: 1
Objective:

Candidates should be able to install and configure internal and external communication devices such as modems, ISDN adapters, and DSL switches. This objective includes verification of compatibility requirements (especially important if that modem is a winmodem), necessary hardware settings for internal devices (IRQs, DMAs, I/O ports), and loading and configuring suitable device drivers. It also includes communication device and interface configuration requirements, such as the correct serial port for 115.2 Kbps, and the correct modem settings for outbound PPP connection(s).

Key files, terms, and utilities include:
/proc/dma Direct memory accessing channels in use
/proc/interrupts Interrupts in use
/proc/ioports I/O ports in use
setserial(8) Configure serial port access for an internal modem

Notes from Leading Edge Training Notes released under the GNU Free Documentation License


PCI Cards podcast released

Hacker Public Radio has another in the series on the LPI certification. This was recorded last month but due to a mix-up it only got released today. I’m changing from the IBM documentation to some produced under the GFDL.

GNU Free Documentation License
elpicx Live-CD/DVD
Leading Edge Training Notes

lspci -h|less
lspci -n|less
locate pci.ids | less
less `locate pci.ids | head -1`
lspci | less
lspci -s 00:1d -v |less
less /proc/pci
echo "Read http://www.rt.com/man/pnpdump.8.html"
less /proc/interrupts
less /proc/ioports
less /proc/iomem
less /proc/dma
