March 27th, 2008

Recursively grep for a substring, open all results in TextPad with cursor positioned appropriately

I’ve been using Ext-JS on a new project recently. It’s pretty neat, and the examples are impressive, but the documentation leaves a lot to be desired. I needed to make a section of a web page collapsible, and it seemed like the Ext.Panel class was the way to do that, but I was having trouble figuring out exactly how to get my existing HTML content into a collapsible Ext.Panel. Almost as a last resort, I ended up grepping my local ext-2.0/examples directory tree to find examples that instantiate Ext.Panel objects:

$ grep -Ri "new Ext.Panel" *
code-display.js: var panel = new Ext.Panel({
core/templates.js: var p = new Ext.Panel({
core/templates.js: var p2 = new Ext.Panel({
feed-viewer/MainPanel.js: this.preview = new Ext.Panel({
feed-viewer/MainPanel.js: tab = new Ext.Panel({
[…]

This was not very useful. I needed to see the whole constructor invocation for each of those cases. So, I decided to grep again, showing just the filenames (using the -l parameter), so I could open all of those files in TextPad. The first part of that (showing just the filenames) was the easy part:

$ grep -Rli "new Ext.Panel" *
code-display.js
core/templates.js
feed-viewer/MainPanel.js
form/combos.js
form/custom.js
[…]

Next, I needed to change those file paths from cygwin/unix-style paths to windows paths, so they could be passed to TextPad on the command-line. Time for a for loop:

$ for f in `grep -Rli "new Ext.Panel" *`; do cygpath -w -a $f; done
c:\api\js\ext-2.0\examples\code-display.js
c:\api\js\ext-2.0\examples\core\templates.js
c:\api\js\ext-2.0\examples\feed-viewer\MainPanel.js
c:\api\js\ext-2.0\examples\form\combos.js
c:\api\js\ext-2.0\examples\form\custom.js
[…]

Okay, so I could have probably built an environment variable as I was looping through and converting these paths, but if I ever wanted to run this on a longer path, with more search results, that command-line could get extremely long.

So, I checked the TextPad help to see if I could pass in the name of a file containing full file paths for TextPad to open. Sure enough:

@filename
Open all the files that are listed, one per line, in the specified file. This overrides the option to load the workspace, specified on the General page of the Preferences dialog box.

You just need to put an at sign (@) before the filename, and TextPad will look at that file to find a list of files to open. So, I decided to create a temporary file, output the filenames found and converted by my set of commands (above) into that temporary file, and then run TextPad, passing the temporary filename preceded by an @ sign.

But wait! I noticed something else in the TextPad help that seemed like a cool idea:

Notes:

  • [...]
  • If the filename to be edited (not printed) is followed by "(<line>[,<col>])", with no intervening spaces, the file will be opened with the cursor at that position. If <line> is a hex number (eg. 0x1a22), a hex view of the file will be created, with the cursor at that address.
eg. TEXTPAD.EXE -ac "Read me.txt"(51,20)
In this example TextPad will start up and open "Read me.txt" at line 51, column 20 and display it in a cascaded window.

So, I decided to figure out a way to put the filenames to open, as well as the row and column number to position the cursor at within each of those files, into the temporary file that I was going to pass to TextPad. I already knew how to get grep to output line numbers (using the -n parameter), so I thought that would be the easy part.

However, it seems that you can’t specify both the -l (show filenames) and -n (show line numbers) parameters on the grep commandline. No, -l does more than simply tell it to show the filename next to each matching line (-H does that). -l tells it to ONLY show the filenames. Here’s the -l parameter definition from the grep man page:

-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.

As far as I could tell, if I wanted line numbers and filenames, I needed to use -n and -H, and deal with the fact that the output would also include the text of the matching line. I also threw in -m 1 to limit the output to only one result per file, since the cursor can only be positioned in one place for each file. I didn’t need the -m previously, because the -l parameter already limited it to one result per file, since it only showed the filenames of each matching file. Here’s what the grep commandline and output looked like, at this point:

$ grep -RHn -m 1 "new Ext.Panel" *
code-display.js:11: var panel = new Ext.Panel({
core/templates.js:30: var p = new Ext.Panel({
feed-viewer/MainPanel.js:10: this.preview = new Ext.Panel({
form/combos.js:49: new Ext.Panel({
form/custom.js:40: var panel = new Ext.Panel({
[…]

At first, I thought the matching line text was just in my way, so I used sed to filter it out, and to replace the colon (:) between the filename and the line number with an open parenthesis, to prepare it for the format TextPad wanted:

$ grep -RHn -m 1 "new Ext.Panel" * | sed -e 's/\(^[^:]\+\):\([0-9]\+\):.*$/\1(\2/g'
code-display.js(11
core/templates.js(30
feed-viewer/MainPanel.js(10
form/combos.js(49
form/custom.js(40
[…]

Next, I needed to get the offsets or column numbers for each matching line number that the previous command returned, to tell TextPad exactly where to put the cursor in each file. At first, I thought I could do this with grep, but the closest grep parameter seemed to be -b:

-b, --byte-offset
Print the byte offset within the input file before each line of output

However, -b gives the absolute byte offset starting from the very beginning of the file, rather than the offset within the matching lines. So, I had to find a different way to get the column offset within each matching line. This is when I realized that having the matching line text returned by my grep command could actually be useful. I figured I could just split that text out and count the characters leading up to the matching string with wc -c, among other things.
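As an aside, the column can also be computed without counting characters by hand. The following is only a sketch of one alternative, not what I ended up using: the function name match_column is made up for illustration, and it relies on awk's index() function, which returns the 1-based position of a fixed string within a line (0 if it is absent). Note that awk processes backslash escapes in values passed with -v, so a search string containing backslashes would need extra care.

```shell
# Print the 1-based column where a fixed string first appears
# in each line read from stdin (0 means "not found").
# $1 = the string to look for (hypothetical helper, for illustration only)
match_column() {
    awk -v needle="$1" '{ print index($0, needle) }'
}

# Example: four leading spaces, then "var panel = ", then the match.
printf '    var panel = new Ext.Panel({\n' | match_column "new Ext.Panel"
```

A tab counts as a single character here, which is the same convention as replacing each tab with one space before counting.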

Anyways, after a lot of trial and error, a lot of re-checking man pages for bash, grep, wc, etc., I ended up with the following set of commands:

textpad $(for g in `for f in \`grep -Rli "new Ext.Panel" *\`; do (grep -Hn -m 1 "new Ext.Panel" $f | sed -e 's/\(^[^:]\+\):\([0-9]\+\):.*$/\1(\2/g'); done`; do echo `cygpath -w -a ${g/\(*/}`\(${g/*\(/},`grep -m 1 "new Ext.Panel" ${g/(*/} | sed -e 's/\t/ /g' -e 's/new Ext.Panel.*$//g' | wc -c`\); done) &

I’m sure this could be done more efficiently, but this was a fun challenge to take on, and I managed to find a way to do what I wanted to do. Feel free to leave a comment if you know a better way of doing this!
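For what it’s worth, here is a rough sketch of how the same idea might be split into more readable steps. It is an untested-on-TextPad illustration, not a drop-in replacement: the function name textpad_targets is made up, the search is case-sensitive and treats the search text as a fixed string (grep -F), and the path is only converted when cygpath is actually available. It prints one entry per matching file in the filename(line,col) form described in the TextPad help quoted above.

```shell
# Print "path(line,col)" for the first line of each file under a
# directory that contains a fixed string.
# $1 = string to search for, $2 = directory to search (default: .)
# textpad_targets is a hypothetical name used only for this sketch.
textpad_targets() {
    needle=$1
    dir=${2:-.}
    grep -RlF -- "$needle" "$dir" | while IFS= read -r f; do
        # awk reports the first matching line number and 1-based column.
        pos=$(awk -v needle="$needle" \
            '(c = index($0, needle)) { print FNR "," c; exit }' "$f")
        # Convert to an absolute Windows path only when cygpath exists.
        if command -v cygpath >/dev/null 2>&1; then
            p=$(cygpath -w -a "$f")
        else
            p=$f
        fi
        if [ -n "$pos" ]; then
            printf '%s(%s)\n' "$p" "$pos"
        fi
    done
}

# Possible usage, mirroring the one-liner above (breaks on paths with spaces):
#   textpad $(textpad_targets "new Ext.Panel") &
```

Reading filenames line by line avoids the nested backticks, and letting awk report both the line and the column means each file is only scanned once after the initial grep.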

April 13th, 2007

launching textpad from cygwin

This is a simple bash function that I use pretty often. It comes in handy when I’m navigating a tree of source code in a cygwin bash shell, and want to edit a file in TextPad. You can just put this line in your .bashrc file, and make sure the directory where TextPad.exe lives is in your $PATH environment variable:

function tp() { textpad "$(cygpath --mixed "$1")" & }

This allows me to do things like:

$ find . -name '*Foo*.java'

which returns results like:

./src/com/spugbrap/foo/bar/TestFooImpl.java
./servlet/com/spugbrap/baz/FooDispatcher.java

Then I can just copy one of those full (but relative) paths to the clipboard, and paste it into a command that looks like this:

$ tp ./servlet/com/spugbrap/baz/FooDispatcher.java

Now, regardless of where the root of this relative path exists on the file system, it will open that file in TextPad.

The only limitation that I run into with this is that it only lets you specify one file to open. It could probably be modified to handle multiple files pretty easily, but this hasn’t bothered me enough to deal with yet.
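A multi-file version might look something like the sketch below. It simply loops over every argument and launches textpad once per file, on the assumption that textpad and cygpath are on the $PATH exactly as in the one-file version. I have not checked how TextPad behaves when it is launched several times in quick succession, so treat this as a starting point.

```shell
# Open every path given on the command line in TextPad.
# Assumes textpad and cygpath are on $PATH, as in the one-file version.
tp() {
    for f in "$@"; do
        # Quoting keeps paths that contain spaces in one piece.
        textpad "$(cygpath --mixed "$f")" &
    done
}

# Possible usage:
#   tp ./src/com/spugbrap/foo/bar/TestFooImpl.java ./servlet/com/spugbrap/baz/FooDispatcher.java
```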

August 10th, 2006

How to list just directories in bash

This morning, I was trying to find a way to list just the subdirectories in the current directory, in a bash shell script I was writing. I thought it would be simple, but everything I tried seemed to either take an extraordinarily long time, or felt like an ugly hack.

The first thing I tried was:
find . -type d

But this was extremely slow, because it was recursively searching inside every subdirectory as well. I just wanted a list of subdirectories inside the current directory. I won’t bore you/clutter this post up with any more of my less-than-ideal methods.

What follows are a couple of ways of doing what I was trying to do, which I found in a post (and its comments) on the Ubuntu Blog, “List only the directories”:

ls -l | grep "^d"

This works, but gives a ‘long’ directory listing, when all I wanted was a list of directory names.


find . -type d -maxdepth 1 -mindepth 1

This one was my favorite, since it used the method I originally tried, but it fixed the slowness by using parameters to avoid recursion. It gave me a couple warnings about the order of the parameters, though, so I changed it to this:
find . -maxdepth 1 -mindepth 1 -type d


ls -d */

This gave me the same output as the ‘find’ method did, but some timing tests showed me that the ‘find’ method was about 2 times faster.
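Since the goal was to use the list inside a shell script, here is a small sketch of a third option that uses only the shell’s own globbing, with no external commands. The function name list_subdirs is made up for this example. Like ‘ls -d */’, it skips hidden directories, and unlike ‘find -type d’ it also lists symlinks that point to directories. I have not timed it against the other two methods.

```shell
# Print the names of the immediate subdirectories of a directory,
# one per line, without the trailing slash or leading path.
# $1 = directory to look in (default: current directory)
# list_subdirs is a hypothetical helper name for this sketch.
list_subdirs() {
    for d in "${1:-.}"/*/; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -d "$d" ] || continue
        d=${d%/}                   # drop the trailing slash
        printf '%s\n' "${d##*/}"   # drop everything up to the last slash
    done
}

# Possible usage in a script:
#   for name in $(list_subdirs .); do echo "found: $name"; done
```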

July 13th, 2006

Generating a random fake name from the commandline

Today I needed to come up with a list of lots of fake names for test data. In the past, I either manually entered well-known fictional character names (e.g. Homer J. Simpson), or used strings of characters (like ‘asdf g. hjkl’ or ‘aaaaaaaaaaaaaa’).

I remembered seeing some sort of test data generator, somewhere, recently, so I googled for “fake name generator test data”.

What I found was http://www.fakenamegenerator.com, which generates realistic test data based on country, name origin, and gender specifications. More info on the site at the end of this post. Here’s a set of commands that I put together to retrieve one fake name from that site:

curl -s -b agreement=Yes "http://www.fakenamegenerator.com" |
  grep -o 'on Google">\([^<]\+\)' |
  sed -e "s/[^>]*>\([^<<]\+\)<.*/\1/g"

When you use the site like a normal human being, the fake data that is generated includes full name, address, email address (usable, provided by an anonymous email service), phone number, mother’s maiden name, date of birth, and credit card number (+ expiration date). Very cool! For a very small fee, you can also order a bulk batch of data, which also includes fake Social Security Numbers.

However, being the penny-pinching and geeky type, I wanted to be able to generate my own list of fake names (without all the other info), for free. The set of commands listed above works right now, from a cygwin bash shell, but will probably break sometime in the future, when the HTML structure of the page changes. Oh well.

Oh yeah, don’t forget to read their terms of service* before using the service… Right now, I could not find anything prohibiting the use of automated tools to generate and retrieve names, but use the above set of commands at your own risk!

* The terms of service page only displays one time, unless you clear/disable your cookies.

July 5th, 2006

Bash history substitution

Anyone who has had an introductory unix course should know about the bash shell’s “history” command, which gives you a numbered list of commands that you’ve run previously.

You can execute one of those commands again by doing:
$ ![number] (for a particular command you’ve seen on the history list)
or
$ !! (for the previous command/last command in the history list)

One thing that I didn’t learn in any class, but did find out about from a co-worker, several years ago, was history substitution. It’s easy to run the previous command with minor changes, by using a caret-delimited substitution expression. Here’s a very simple example:

view a file:
$ cat spugbrap.txt

then, edit that same file:
$ ^cat^vim^

that gets expanded to (and executed as):
$ vim spugbrap.txt

I’ve used this feature countless times since learning about it, but it always suffered from a limitation: if the pattern you’re trying to match occurs multiple times in the previous commandline, this substitution method only replaces the first occurrence of it. So, I recently decided to find out how to substitute multiple occurrences, since I was sure there had to be an easy way. Here’s one way I found:

watch a couple of tomcat log files continuously:
$ tail -F ~/tomcat/logs/stdout_20060704.log ~/tomcat/logs/stderr_20060704.log

The next day, the log file names are different, because they’re date-based, so I want to change 20060704 to 20060705, and it needs to happen twice because the date occurs twice in that commandline. No problem! Assuming the previously executed command was the “tail” commandline, above, simply enter this:
$ !!:gs/20060704/20060705/

that gets expanded to (and executed as):
$ tail -F ~/tomcat/logs/stdout_20060705.log ~/tomcat/logs/stderr_20060705.log

What if my previous command was long, and I need to make multiple substitutions with multiple strings?

previous command:
$ tail -F stdout_20060704.log stderr_20060704.log host-manager.2006-07-04.log catalina.2006-07-04.log admin.2006-07-04.log localhost.2006-07-04.log manager.2006-07-04.log jakarta_service_20060704.log

to change the dates, which occur in two formats:
$ !!:gs/20060704/20060705/:gs/2006-07-04/2006-07-05/

If the command I want to run (with substitutions) was not the previous command, but some other command that appears in the numbered list from running ‘history’, put the history line number between the exclamation point and the colon:
$ !123:gs/oldstr/replacementstr/
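History expansion is normally only active in interactive shells, so none of this works inside a script. If you want to see what a chain of global substitutions would do to a command before running it, the sketch below imitates the effect with sed. The function name rewrite_cmd is made up for this example, and there is one real difference to keep in mind: bash’s history substitution matches plain strings, while sed treats the pattern as a regular expression.

```shell
# Apply one or more sed substitutions, in order, to a command line
# and print the result (the command is NOT executed).
# $1 = the command text; remaining args = sed expressions like s/old/new/g
# rewrite_cmd is a hypothetical helper name for this sketch.
rewrite_cmd() {
    cmd=$1
    shift
    for expr in "$@"; do
        cmd=$(printf '%s\n' "$cmd" | sed -e "$expr")
    done
    printf '%s\n' "$cmd"
}

# Mirrors chaining two :gs modifiers for the two date formats.
rewrite_cmd 'tail -F stdout_20060704.log catalina.2006-07-04.log' \
    's/20060704/20060705/g' 's/2006-07-04/2006-07-05/g'
```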

For more information about this, and other bash history manipulation capabilities, check out:
Bash Features - Using History Interactively