May 4th, 2006

How to list duplicate lines in a text file, with counts next to each unique line

At some point, last year (it’s been in my ‘toblog’ file all this time), I needed to analyze the lines in a text file, removing duplicate lines, while counting how many times each duplicated line occurred within the file, and sorting from most common to least common.

For example, using a text file called ‘dupetest.txt’, containing:

foo bar baz
foo qux corge
spugbrap likes bacon
foo qux corge
spugbrap likes bacon
foo bar baz
oatmeal cookies are good
oatmeal cookies are good
foo bar baz
foo qux corge
foo bar baz

The output I want is:

4 foo bar baz
3 foo qux corge
2 spugbrap likes bacon
2 oatmeal cookies are good

I knew there had to be a simple way of doing this by just stringing together a few unix commands (in cygwin), but finding the right combination of commands took me some effort. Here’s what I came up with:

sort dupetest.txt | uniq -c -d | sort -n -r
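
For anyone who wants to try this, here is a minimal sketch that recreates the sample file and runs the same pipeline. The first sort groups identical lines together (uniq only compares adjacent lines), uniq -c -d prints one copy of each repeated line prefixed by its count, and sort -n -r orders the result by that count, largest first. The function name count_dupes is made up for this example. Note that uniq -c pads the count with leading spaces, so the real output is indented compared to the listing above.

```shell
# Recreate the sample file from this post.
cat > dupetest.txt <<'EOF'
foo bar baz
foo qux corge
spugbrap likes bacon
foo qux corge
spugbrap likes bacon
foo bar baz
oatmeal cookies are good
oatmeal cookies are good
foo bar baz
foo qux corge
foo bar baz
EOF

# count_dupes is a hypothetical helper name, not a standard command.
# $1 is the file to analyze.
count_dupes() {
    # sort:       put identical lines next to each other
    # uniq -c -d: keep one copy of each repeated line, prefixed by its count
    # sort -n -r: order numerically by count, highest first
    sort "$1" | uniq -c -d | sort -n -r
}

count_dupes dupetest.txt
```

Dropping the -d should also list the lines that occur only once, each with a count of 1.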

July 14th, 2005

How to reattach to GNU screen sessions in windows (cygwin)

I use a Windows version of the GNU Screen window manager/terminal multiplexer every day, and wanted to share a trick that I’ve learned over time. A lot of people have trouble reattaching to existing screen sessions in the Windows/Cygwin version. You can probably accomplish this in different ways, but this is what I have been doing every day at work for at least 9 months now:

  1. open first bash shell window
  2. run: screen
  3. open another bash shell window
  4. run: screen -x
    NOTE: this window will probably not reattach; at least, it doesn’t for me. If it does, you don’t need the rest of these instructions.
  5. close second bash window (opened in step 3).
  6. open another bash shell window
  7. run: screen -x

At this point, both bash windows are now connected to the same screen session.

Variations on this may work. I’ve done this under a lot of different conditions, including ssh-ing into a box and attaching to the existing screen session there, and using ssh-agent to avoid having to type passwords all day long. More on that later, but for now, I just wanted to get this basic information out there.

The main idea is that the first attempt to attach to an existing screen session will fail, but if you try again in a new window it should work.

November 11th, 2004

bash programmable completion

Last week, I was trying to get my directories all set up in a new development environment, and I ended up with a really deep nesting of directories to get to my actual code. After organizing everything as a standard web application source tree, with Java source code in directories based on the package, and packages named using the standard Sun-recommended naming convention, the path to one of my servlets looked something like this:

/home/spugbrap/projects/appName/src/com/baz/\
bar/foo/servlet/MyServlet.java

I created some soft links to get to specific places within the tree, but sometimes traversing the tree is necessary/useful. Most of the directories in the package hierarchy are *basically* empty except for a single subdirectory, so tab completion should come in handy.

So, to get to that servlet directory from my home directory, I was doing something like this (<TAB> indicates where I attempted filename completion by hitting the tab key):

15:04:52 Thu Nov 11 [~]
$ cd pro<TAB>/app<TAB>/src<TAB>/

Uh oh. Already ran into a problem. When I said the directories in the package hierarchy are *basically* empty, I didn’t mention that each one contains a single subdirectory leading to the next level in the package hierarchy, as well as a CVS directory, since this is all version controlled. So, when I got to the “src” directory, tab completion was not as useful as I had hoped: it had two candidates to choose from instead of one. I needed a way to ignore the CVS directories when performing completion, so I could just do this to get to my final destination:

15:04:52 Thu Nov 11 [~]
$ cd pro<TAB>/app<TAB>/src<TAB>/<TAB>/<TAB>/<TAB>/<TAB>/se<TAB>

So, I googled for a while, finding lots of information on how easy this is to do with “zsh”, or on Mac OS X (I think), but was not finding anything about doing it with “bash”, particularly in cygwin (although I figured that part shouldn’t be relevant).

Eventually, I stumbled across a project that allows you to programmatically tell bash exactly how to complete things, even to the point of completing differently based on the command you’re planning to execute, including completing entries from your ~/.ssh/known_hosts file if your command-line starts with “ssh” when you hit tab. This did exactly what I wanted, and much more, so now my directory navigation is much easier. I still use my soft links most of the time, but when I need to actually traverse the tree manually, I can do it in a minimal number of keystrokes.

Here’s a link to this bash completion project on freshmeat:
http://freshmeat.net/projects/bashcompletion/
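
For the narrow problem of skipping CVS directories, bash's built-in completion hooks may be enough on their own. What follows is a rough sketch, not the project's code: it relies only on the bash builtins complete and compgen and the COMP_WORDS, COMP_CWORD and COMPREPLY variables, and the function name _cd_skip_cvs is invented for this example. It requires bash itself, not plain sh.

```shell
# _cd_skip_cvs is a hypothetical name chosen for this sketch.
# bash calls it when completing arguments to "cd" and reads the
# candidates back from the COMPREPLY array.
_cd_skip_cvs() {
    local cur=${COMP_WORDS[COMP_CWORD]}   # the word being completed
    local dir
    COMPREPLY=()
    # compgen -d prints the directory names that start with $cur.
    while IFS= read -r dir; do
        case "${dir##*/}" in
            CVS) ;;                                   # skip CVS directories
            *)   COMPREPLY[${#COMPREPLY[@]}]=$dir ;;  # keep everything else
        esac
    done < <(compgen -d -- "$cur")
}

# Use the function for "cd". -o filenames tells readline the results are
# file names; -o nospace stops it from appending a space after a match.
complete -o filenames -o nospace -F _cd_skip_cvs cd
```

With this in ~/.bashrc, hitting tab after "cd src/" should complete straight to the one subdirectory that is not CVS.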

Enjoy!
-dave