August 10th, 2006

How to list just directories in bash

This morning, I was trying to find a way to list just the subdirectories in the current directory, in a bash shell script I was writing. I thought it would be simple, but everything I tried seemed to either take an extraordinarily long time, or felt like an ugly hack.

The first thing I tried was:
find . -type d

But this was extremely slow, because it was recursively searching inside every subdirectory as well. I just wanted a list of subdirectories inside the current directory. I won’t bore you/clutter this post up with any more of my less-than-ideal methods.

What follows are a couple of ways of doing what I was trying to do, which I found in a post (and its comments) on the Ubuntu Blog, “List only the directories”:

ls -l | grep "^d"

This works, but gives a ‘long’ directory listing, when all I wanted was a list of directory names.


find . -type d -maxdepth 1 -mindepth 1

This one was my favorite, since it used the method I originally tried, but it fixed the slowness by using parameters to avoid recursion. It gave me a couple of warnings about the order of the parameters, though, so I changed it to this:
find . -maxdepth 1 -mindepth 1 -type d


ls -d */

This gave me the same output as the ‘find’ method did, but some timing tests showed me that the ‘find’ method was about 2 times faster.
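
Inside a script, a plain glob loop is another option that avoids parsing the output of ‘ls’ or ‘find’. The following is only a sketch: the function name list_subdirs is made up for this example, and like ‘ls -d */’ it skips hidden directories (which ‘find’ would include).

```shell
# list_subdirs [DIR]  (hypothetical helper, not from the original post)
# Prints the names of the immediate subdirectories of DIR (default: .),
# one per line, without recursing into them.
list_subdirs() (
    # The function body runs in a subshell, so the cd does not leak out.
    cd -- "${1:-.}" || exit 1
    for d in */; do
        [ -d "$d" ] || continue       # the glob matched nothing
        [ -L "${d%/}" ] && continue   # skip symlinks to directories, as find -type d does
        printf '%s\n' "${d%/}"        # strip the trailing slash
    done
)

list_subdirs .
```

Since the names come straight from the glob, directory names containing spaces are handled as long as the caller reads the output line by line.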

July 15th, 2006

finding unique ips in access log that have something in common

search for a string in access log, extract only ip address from matching line, sort the list of ip’s and remove duplicates, output the [shortened] list…

# to print only the number of unique ips, append "| wc -l" after "sort -u"
for ip in $(grep -i firefox /cygdrive/w/resin/log/access.log \
| grep -o "^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}" \
| sort -u); \
do echo "$ip"; \
done
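
To try the pipeline without a real access log, here is a small sketch that runs it against a throwaway sample file. The log lines and the helper name unique_ips are invented for illustration, and it assumes a grep that supports -o (GNU grep does).

```shell
# unique_ips PATTERN FILE  (hypothetical helper)
# Prints the unique leading IPv4 addresses of lines matching PATTERN,
# case-insensitively.
unique_ips() {
    grep -i -- "$1" "$2" \
        | grep -o '^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' \
        | sort -u
}

# Made-up sample data: two Firefox clients (one appears twice) and one MSIE client.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [15/Jul/2006:10:00:00 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 Firefox/1.5"
10.0.0.2 - - [15/Jul/2006:10:00:01 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"
10.0.0.1 - - [15/Jul/2006:10:00:02 -0400] "GET /a HTTP/1.1" 200 99 "-" "Mozilla/5.0 Firefox/1.5"
10.0.0.3 - - [15/Jul/2006:10:00:03 -0400] "GET /b HTTP/1.1" 200 99 "-" "Mozilla/5.0 Firefox/1.5"
EOF

unique_ips firefox "$log"
rm -f "$log"
```

With this sample, only the two Firefox addresses are listed, each once.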

July 5th, 2006

Bash history substitution

Anyone who has had an introductory unix course should know about the bash shell’s “history” command, which gives you a numbered list of commands that you’ve run previously.

You can execute one of those commands again by doing:
$ ![number] (for a particular command you’ve seen on the history list)
or
$ !! (for the previous command/last command in the history list)

One thing that I didn’t learn in any class, but did find out about from a co-worker, several years ago, was history substitution. It’s easy to run the previous command with minor changes, by using a caret-delimited substitution expression. Here’s a very simple example:

view a file:
$ cat spugbrap.txt

then, edit that same file:
$ ^cat^vim^

that gets expanded to (and executed as):
$ vim spugbrap.txt

I’ve used this feature countless times since learning about it, but it always suffered from a limitation: If the pattern you’re trying to match occurs multiple times in the previous commandline, this substitution method only replaces the first occurrence of it. So, I recently decided to find out how to substitute multiple occurrences, since I was sure there had to be an easy way. Here’s one way I found:

watch a couple of tomcat log files continuously:
$ tail -F ~/tomcat/logs/stdout_20060704.log ~/tomcat/logs/stderr_20060704.log

The next day, the log file names are different, because they’re date-based. So I want to change 20060704 to 20060705, and it needs to happen twice because the date occurs twice in that commandline. No problem! Assuming the previously executed command was the “tail” commandline, above, simply enter this:
$ !!:gs/20060704/20060705/

that gets expanded to (and executed as):
$ tail -F ~/tomcat/logs/stdout_20060705.log ~/tomcat/logs/stderr_20060705.log

What if my previous command was long, and I need to make multiple substitutions with multiple strings?

previous command:
$ tail -F stdout_20060704.log stderr_20060704.log host-manager.2006-07-04.log catalina.2006-07-04.log admin.2006-07-04.log localhost.2006-07-04.log manager.2006-07-04.log jakarta_service_20060704.log

to change the dates, which occur in two formats:
$ !!:gs/20060704/20060705/:gs/2006-07-04/2006-07-05/

If the command I want to run (with substitutions) was not the previous command, but some other command that appears in the numbered list from running ‘history’, put the history line number between the exclamation point and the colon:
$ !123:gs/oldstr/replacementstr/
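
History expansion is normally only active in interactive shells, so the expressions above can’t simply be pasted into a script. To see the difference between the caret form (first match only) and the :gs form (every match), the same replacements can be imitated with sed. This is only an analogy, not how bash implements it: sed treats the old string as a regular expression while history substitution treats it as a literal string, and the variable names below are made up.

```shell
# A shortened stand-in for the previous command line.
prev='tail -F stdout_20060704.log stderr_20060704.log catalina.2006-07-04.log'

# Like ^20060704^20060705^ : only the first match is replaced (no g flag).
first_only=$(printf '%s\n' "$prev" | sed 's/20060704/20060705/')

# Like !!:gs/20060704/20060705/:gs/2006-07-04/2006-07-05/ : every match is replaced.
all=$(printf '%s\n' "$prev" \
    | sed -e 's/20060704/20060705/g' -e 's/2006-07-04/2006-07-05/g')

printf '%s\n' "$first_only" "$all"
```

In an interactive shell, appending :p to a history expansion (for example !!:gs/20060704/20060705/:p) prints the expanded command without running it, which is a handy way to check a substitution first.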

For more information about this, and other bash history manipulation capabilities, check out:
Bash Features - Using History Interactively

May 4th, 2006

How to list duplicate lines in a text file, with counts next to each unique line

At some point, last year (it’s been in my ‘toblog’ file all this time), I needed to analyze the lines in a text file, removing duplicate lines, while counting how many times each duplicated line occurred within the file, and sorting from most common to least common.

For example, using a text file called ‘dupetest.txt’, containing:

foo bar baz
foo qux corge
spugbrap likes bacon
foo qux corge
spugbrap likes bacon
foo bar baz
oatmeal cookies are good
oatmeal cookies are good
foo bar baz
foo qux corge
foo bar baz

The output I want is:

4 foo bar baz
3 foo qux corge
2 spugbrap likes bacon
2 oatmeal cookies are good

I knew there had to be a simple way of doing this by just stringing together a few unix commands (in cygwin), but finding the right combination of commands took me some effort. Here’s what I came up with:

sort dupetest.txt | uniq -c -d | sort -n -r
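
Note that the -d option makes uniq drop any line that occurs only once, which is why it doesn’t matter that the example file has none. A small sketch to show that, using throwaway data; the helper name count_dupes and the sample lines are made up for this example:

```shell
# count_dupes FILE  (hypothetical helper)
# Lists each duplicated line once, prefixed by its count, most common first.
count_dupes() {
    sort -- "$1" | uniq -c -d | sort -n -r
}

f=$(mktemp)
printf '%s\n' \
    'foo bar baz' 'foo qux corge' 'foo bar baz' \
    'only once' 'foo bar baz' 'foo qux corge' > "$f"

# 'only once' is left out of the result because of -d;
# drop -d from the pipeline to count every distinct line instead.
count_dupes "$f"
rm -f "$f"
```

The first sort is needed because uniq only compares adjacent lines. The counts come out padded with leading spaces, and the width of that padding differs between uniq implementations.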

June 9th, 2005

misc notes on my recent experience with parallelknoppix, clusterknoppix, and fedora

This is far from complete, and could use a lot of detail such as links
to the web sites I mention in here, etc. Maybe I’ll update it later
with that information.

- can’t write to NTFS in Parallel Knoppix
- openmosix terminal server lets slave nodes boot from image stored on master HD, but that requires:
  - slave nodes have network boot capability (PXE)
    - may require BIOS flash upgrade
    - may need to change options in BIOS setup
      - enable network boot/PXE option
      - change boot order to try network first
  - master needs to load drivers for relevant NICs on slaves
    - sometimes challenging to find which driver
      - pcimodules
      - dmesg
      - lspci -v
      - creative google searching
      - drivers in other OSes on that machine may provide clues as to exactly which model number/chipset/etc. a NIC is, so you can then google those numbers in search of info on which driver to use in knoppix.. check hardware properties in windows device manager, etc., and look at driver versions/details.
    - checking all drivers in list requires much more disk space on master
    - GUI driver checkbox list very slow
      - i liked to just edit the terminal server startup script, modifying the regular expression that checks certain checkboxes on the list by default. but that was probably more trouble than it was worth to try and tell someone else to do.
  - if BIOS doesn’t support PXE, download driver in a boot disk image from the etherboot project, at http://rom-o-matic.net, then write that image to a floppy with RAWRITE tool, and boot a slave from that floppy.
  - need to copy cd image to master HD for slaves to remote-mount
- plug master and slaves into one hub/switch, isolated from internet,
etc, and disable any network hardware that is not relevant. this may
not be required, but it simplifies things, so it’s more likely to
work.
- couldn’t get parallelknoppix to use my USB HD with FAT partition for
permanent storage for the terminal server. could use the drive
normally in parallelknoppix, but not for the main purpose i needed
non-NTFS storage for.
- machines with too little memory couldn’t have a large enough ram
disk, so they didn’t want to be master

- tried in vain to get clusterknoppix or parallelknoppix working on my
network at home, attempting with 6 different machines, in various
combinations, spending a total of probably 40 hours on this task. it
pains me to say that I never did get a useful cluster working at home.
luckily, our group was able to get one working between our 4 laptops,
in a matter of only about 4 hours, including recording a 30 minute
video of the process after somewhat perfecting it.

- when a master’s terminal server is running and slaves are connected,
they mount a directory on the master, and they are able to read/write
files anywhere in that directory tree. supposedly this is not
designed to be secure, it’s designed to be quick and easy and used in
environments that are as secure as they need to be.

- ClusterKnoppix
  - hardware support nicer for me (such as mouse buttons)
  - includes CaptiveNTFS for mounting NTFS partitions read/write instead of read-only
  - openmosix viewer seemed to automatically see other ClusterKnoppix machines on the LAN, and automatically clustered them and showed their processor usage in the one viewer window, but that only worked for me two weeks ago. last week when I tried again, I couldn’t get any 2 machines in my house to recognize each other as nodes to cluster.

- challenges I encountered with Fedora:
  - tried to install fedora on several machines, but the only install that really worked out well was on a machine that I dedicated to fedora.
    - I let fedora start with an empty hard disk and partition it automatically, etc. that machine worked out fine, and that’s what I ended up doing my individual assignment on.
    - was able to easily:
      - use sendmail for local email
      - create samba shares that my windows machines could mount
      - read ntfs, and write ntfs as well after i played with the mount command/options
      - set up web server, vnc server, etc
  - one machine took 7 hours to install from the 4 fedora CDs. painful. then, it was not even able to start up after the install. kept hanging during the fedora startup sequence.
  - another machine had the same kind of hanging issue, but did not take nearly as long to install initially.
  - tried obsessively to get fedora to install and boot off of an external USB hard drive.
    - various discussion threads can be found by googling which explain step-by-step how people have accomplished this.
    - it was not simple/straightforward whatsoever, and I never did get it to work, after spending probably 20+ hours on it over the course of several days.
    - getting it to install on the usb drive was the easy part. getting it to boot from it was not.
    - the main reason for trying to do this was because most of my machines only have NTFS partitions, and fedora wasn’t crazy about that fact. so I wanted to be able to install it on FAT or ext3 partitions on the external HD, so I could avoid messing with my work laptop’s hard drive and stuff.
  - managed to corrupt the MBR on my work laptop in the process
    - found useful NT admin password resetter tool, which allowed me to then boot from the windows XP setup CD and go into the Recovery Console, where the FIXMBR command saved me.
  - two other machines at home were disqualified by the fact that they only had NTFS partitions.