March 27th, 2008

Recursively grep for a substring, open all results in TextPad with cursor positioned appropriately

I’ve recently been using Ext-JS on a new project. It’s pretty neat, and the examples are impressive, but the documentation leaves a lot to be desired. I needed to make a section of a web page collapsible, and it seemed like the Ext.Panel class was the way to do that, but I was having trouble figuring out exactly how to get my existing HTML content into a collapsible Ext.Panel. Almost as a last resort, I ended up grepping my local ext-2.0/examples directory tree to find examples that instantiate Ext.Panel objects:

$ grep -Ri "new Ext.Panel" *
code-display.js: var panel = new Ext.Panel({
core/templates.js: var p = new Ext.Panel({
core/templates.js: var p2 = new Ext.Panel({
feed-viewer/MainPanel.js: this.preview = new Ext.Panel({
feed-viewer/MainPanel.js: tab = new Ext.Panel({
[…]

This was not very useful. I needed to see the whole constructor invocation for each of those cases. So, I decided to grep again, showing just the filenames (using the -l parameter), so I could open all of those files in TextPad. The first part of that (showing just the filenames) was the easy part:

$ grep -Rli "new Ext.Panel" *
code-display.js
core/templates.js
feed-viewer/MainPanel.js
form/combos.js
form/custom.js
[…]

Next, I needed to change those file paths from cygwin/unix-style paths to windows paths, so they could be passed to TextPad on the command-line. Time for a for loop:

$ for f in `grep -Rli "new Ext.Panel" *`; do cygpath -w -a "$f"; done
c:\api\js\ext-2.0\examples\code-display.js
c:\api\js\ext-2.0\examples\core\templates.js
c:\api\js\ext-2.0\examples\feed-viewer\MainPanel.js
c:\api\js\ext-2.0\examples\form\combos.js
c:\api\js\ext-2.0\examples\form\custom.js
[…]

Okay, so I could probably have built up an environment variable as I looped through and converted these paths, but if I ever wanted to run this on a larger directory tree, with more search results, that command line could get extremely long.

So, I checked the TextPad help to see if I could pass in the name of a file containing full file paths for TextPad to open. Sure enough:

@filename
Open all the files that are listed, one per line, in the specified file. This overrides the option to load the workspace, specified on the General page of the Preferences dialog box.

You just need to put an at sign (@) before the filename, and TextPad will look at that file to find a list of files to open. So, I decided to create a temporary file, output the filenames found and converted by my set of commands (above) into that temporary file, and then run TextPad, passing the temporary filename preceded by an @ sign.
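That temporary-file approach could look roughly like the sketch below. The function name write_file_list and its three arguments are made up for illustration; grep, cygpath, and TextPad's @filename option are used as described above.

```shell
# Sketch: write the Windows path of every file under directory $2 whose
# contents match pattern $1 to the list file $3, one path per line.
# (write_file_list is a hypothetical helper name, not part of any tool.)
write_file_list() {
  grep -Rli -- "$1" "$2" | while IFS= read -r f; do
    cygpath -w -a "$f"
  done > "$3"
}

# Possible usage from the examples directory:
#   list=$(mktemp)
#   write_file_list "new Ext.Panel" . "$list"
#   textpad "@$(cygpath -w "$list")" &
```

Reading the grep output line by line, instead of looping over a backtick expansion, also keeps paths containing spaces intact.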

But wait! I noticed something else in the TextPad help that seemed like a cool idea:

Notes:

  • [...]
  • If the filename to be edited (not printed) is followed by “(<line>[,<col>])”, with no intervening spaces, the file will be opened with the cursor at that position. If <line> is a hex number (eg. 0x1a22), a hex view of the file will be created, with the cursor at that address.
    eg. TEXTPAD.EXE -ac "Read me.txt"(51,20)
    In this example TextPad will start up and open "Read me.txt" at line 51, column 20 and display it in a cascaded window.

So, I decided to figure out a way to put the filenames to open, as well as the row and column number to position the cursor at within each of those files, into the temporary file that I was going to pass to TextPad. I already knew how to get grep to output line numbers (using the -n parameter), so I thought that would be the easy part.

However, adding -n (show line numbers) to a grep command line that already uses -l (show filenames) has no effect. -l does more than show the filename next to each matching line (-H does that); it tells grep to show ONLY the filenames. Here’s the -l parameter definition from the grep man page:

-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.

As far as I could tell, if I wanted line numbers and filenames, I needed to use -n and -H, and deal with the fact that the output would also include the text of the matching line. I also threw in -m 1 to limit the output to only one result per file, since the cursor can only be positioned in one place for each file. I didn’t need the -m previously, because the -l parameter already limited it to one result per file, since it only showed the filenames of each matching file. Here’s what the grep commandline and output looked like, at this point:

$ grep -RHn -m 1 "new Ext.Panel" *
code-display.js:11: var panel = new Ext.Panel({
core/templates.js:30: var p = new Ext.Panel({
feed-viewer/MainPanel.js:10: this.preview = new Ext.Panel({
form/combos.js:49: new Ext.Panel({
form/custom.js:40: var panel = new Ext.Panel({
[…]

At first, I thought the matching line text was just in my way, so I used sed to filter it out, and to replace the colon (:) between the filename and the line number with an open parenthesis, to prepare it for the format TextPad wanted:

$ grep -RHn -m 1 "new Ext.Panel" * | sed -e 's/\(^[^:]\+\):\([0-9]\+\):.*$/\1(\2/g'
code-display.js(11
core/templates.js(30
feed-viewer/MainPanel.js(10
form/combos.js(49
form/custom.js(40
[…]

Next, I needed to get the offsets or column numbers for each matching line number that the previous command returned, to tell TextPad exactly where to put the cursor in each file. At first, I thought I could do this with grep, but the closest grep parameter seemed to be -b:

-b, --byte-offset
Print the byte offset within the input file before each line of output

However, -b gives the absolute byte offset starting from the very beginning of the file, rather than the offset within the matching lines. So, I had to find a different way to get the column offset within each matching line. This is when I realized that having the matching line text returned by my grep command could actually be useful. I figured I could just split that text out and count the characters leading up to the matching string with wc -c, among other things.
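The column idea can also be sketched with plain shell parameter expansion instead of wc -c. The function name match_column is hypothetical. It treats the search string as a literal substring and counts a tab as one character, much like replacing each tab with a single space before counting.

```shell
# Sketch: print the 1-based column at which the literal substring $2
# first appears in the line of text $1.
# If $2 does not occur at all, the result is the line length plus one.
match_column() {
  local prefix
  # Strip the first occurrence of the substring and everything after it,
  # leaving only the text that precedes the match.
  prefix=${1%%"$2"*}
  echo $(( ${#prefix} + 1 ))
}

# Possible usage:
#   match_column "    var p = new Ext.Panel({" "new Ext.Panel"
```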

Anyways, after a lot of trial and error, a lot of re-checking man pages for bash, grep, wc, etc., I ended up with the following set of commands:

textpad $(for g in `for f in \`grep -Rli "new Ext.Panel" *\`; do (grep -Hn -m 1 "new Ext.Panel" $f | sed -e 's/\(^[^:]\+\):\([0-9]\+\):.*$/\1(\2/g'); done`; do echo `cygpath -w -a ${g/\(*/}`\(${g/*\(/},`grep -m 1 "new Ext.Panel" ${g/(*/} | sed -e 's/\t/ /g' -e 's/new Ext.Panel.*$//g' | wc -c`\); done) &

I’m sure this could be done more efficiently, but this was a fun challenge to take on, and I managed to find a way to do what I wanted to do. Feel free to leave a comment if you know a better way of doing this!
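One possibly simpler variant, offered as a sketch rather than a drop-in replacement for the command above: let awk turn each "path:line:text" record from grep -Hn into TextPad's "path(line,col)" form in a single pass. The function name to_textpad_args is made up, the search string is treated as a literal substring, and file names containing a colon are assumed not to occur.

```shell
# Sketch: read "path:line:text" records (as printed by grep -Hn) on stdin
# and print "path(line,col)", where col is the 1-based position of the
# literal substring $1 within the matching line's text.
to_textpad_args() {
  awk -v pat="$1" -F: '{
    file = $1
    line = $2
    # Everything after "path:line:" is the text of the matching line.
    text = substr($0, length(file) + length(line) + 3)
    col = index(text, pat)
    if (col > 0) printf "%s(%s,%d)\n", file, line, col
  }'
}

# Possible usage:
#   grep -RHn -m 1 "new Ext.Panel" . | to_textpad_args "new Ext.Panel"
```

The paths printed this way are still Unix-style, so each one would need to go through cygpath before being handed to TextPad.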

September 14th, 2007

Regular expressions for converting code-indentation spaces to tabs in TextPad

I’ve never been a fan of using tabs to indent my code; I prefer spaces. Writing code is an art form, and when you use tabs to indent, you can’t assume that it will still look pretty on someone else’s machine, or in another application, etc., since tab sizes are platform-dependent and–although usually user-definable–the default size (typically 8 spaces, I think) tends to be much larger than my own 2-4 space indentation style.

However, some of my co-workers indent with tabs, and others indent with spaces. I’ve found that it’s easier for me to avoid inadvertently messing up existing code when I just bite the bullet and use tabs.

So, what follows are a couple of search/replace regular expressions I’ve recently used in TextPad, to make some existing code more consistent, by converting the spaces to tabs in certain relevant locations.


1. Regular expressions for aligning inline comments with tabs

Find what:   \;( *) {2}(\t*)\/\/
 
Replace with:   \;\1\t\2//

Customization: The number 2, in curly braces (above), should be replaced with the number of spaces that are used for indentation, in the code you’re running this on. In this case, the code was indented in increments of 2 spaces. Sometimes I deal with code that’s indented in increments of 4 spaces, in which case that 2 would change to a 4.

Manner of execution: Run via Replace All, repeatedly, until there are no more matches.

Recommended scope: I used it on a very specific block of selected code. Keep reading for the specific format.

When I used this on a block of code that consisted of variable declarations and //-style comments, it ended up making the comment blocks all line up nicely. Here is an animated GIF (see footnote 1) of a series of screenshots, showing how repeatedly running this expression transformed the comments into tab-indented comments that lined up:

[Image: animated progression of screenshots showing comments separated from code with spaces, and eventually just tabs (and all lined up nicely)]

I’m not sure how useful this will be, in general, because I think I just kind of lucked out when I got the results that I did. I thought it was worth sharing, though, because it impressed me when it produced results that were better than I had actually envisioned. :)

Two important things to note are:

  • The actual variable declarations (beginning of each line) were already indented with 1 tab.
  • There were a consistent number of spaces between the semicolons ending the variable declarations, and the double slashes starting the comments. While this didn’t help them line up, as spaces, it did make things pretty when the spaces were replaced with tabs.

If you’d like, you can see what my TextPad Replace dialog looked like, so you can see things like which checkboxes are checked, etc. Also, please note that in my TextPad preferences, I have it set to Use POSIX regular expression syntax (as previously mentioned, in another TextPad search/replace expression entry, a couple years ago).
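Outside TextPad, the "Replace All, repeatedly" behaviour of this first expression can be approximated with a sed loop. This is only a sketch, assuming GNU sed (for -E and \t); the function name align_comment_tabs is made up, and its argument is the indentation increment from the Customization note above.

```shell
# Sketch: after a semicolon, repeatedly replace one indentation increment
# of spaces (default 2) in front of a // comment with one tab, until no
# more replacements can be made. Assumes GNU sed.
align_comment_tabs() {
  # :a defines a label; "ta" jumps back to it whenever the s command
  # changed the line, which mimics re-running Replace All.
  sed -E ":a; s|;( *) {${1:-2}}(\t*)//|;\1\t\2//|; ta"
}

# Possible usage (Foo.java is a placeholder file name):
#   align_comment_tabs 2 < Foo.java
```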


2. Regular expressions for changing leading indentation spaces to tabs

Find what:   ^(\t*) {4}
 
Replace with:   \1\t

Customization: As with the first expression, above, the number in curly braces should be changed to the spacing increment used for indentation, in the code you’re running this on.

Manner of execution: Run via Replace All, repeatedly, until there are no more matches.

Recommended scope: It can be used on a block of selected text, the entire active document, or even on all open documents, if you’re brave and somewhat evil. :)
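For whole files, the same leading-indentation conversion can be sketched as a sed loop. This assumes GNU sed (for -E and \t); the function name spaces_to_tabs is hypothetical, and its argument is the number of spaces per indentation level.

```shell
# Sketch: convert each leading group of N spaces (default 4) into one tab,
# looping until the line no longer starts with tabs followed by N spaces.
# Assumes GNU sed.
spaces_to_tabs() {
  # "ta" branches back to label "a" after every successful substitution,
  # so one pass over the input does the work of repeated Replace All runs.
  sed -E ":a; s/^(\t*) {${1:-4}}/\1\t/; ta"
}

# Possible usage (Foo.java is a placeholder file name):
#   spaces_to_tabs 4 < Foo.java > Foo.tabbed.java
```

Spaces that appear after the first non-whitespace character on a line are left alone, as with the TextPad expression.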


Footnotes

1 Sorry for using an animated GIF. I’d rather not have something constantly flashing on the page, because it distracts, and resembles an advertisement. I would have preferred another solution, but I racked my brain for over two hours, trying to decide how to present this series of 9 screenshots. I didn’t want to alienate readers who might read this using a feed reader that doesn’t support JavaScript, and I hate making people click through from the feed to the main site–I strongly prefer full-text RSS feeds! Laying the 9 images out horizontally or vertically either took up too much space, or required making the images too tiny to read. In the end, this animated GIF seemed like the best portable way to show the effect I was trying to show. I won’t make a habit of using them, though. :)


April 13th, 2007

Launching TextPad from Cygwin

This is a simple bash function that I use pretty often. It comes in handy when I’m navigating a tree of source code in a cygwin bash shell, and want to edit a file in TextPad. You can just put this line in your .bashrc file, and make sure the directory where TextPad.exe lives is in your $PATH environment variable:

function tp() { textpad "$(cygpath --mixed "$1")" & }

This allows me to do things like:

$ find . -name '*Foo*.java'

which returns results like:

./src/com/spugbrap/foo/bar/TestFooImpl.java
./servlet/com/spugbrap/baz/FooDispatcher.java

Then I can just copy one of those full (but relative) paths to the clipboard, and paste it into a command that looks like this:

$ tp ./servlet/com/spugbrap/baz/FooDispatcher.java

Now, regardless of where the root of this relative path exists on the file system, it will open that file in TextPad.

The only limitation that I run into with this is that it only lets you specify one file to open. It could probably be modified to handle multiple files pretty easily, but this hasn’t bothered me enough to deal with yet.
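A multi-file version might look something like the sketch below. It keeps the name tp and the cygpath --mixed conversion from above, and assumes TextPad accepts several file names on one command line.

```shell
# Sketch: open any number of files in TextPad from a cygwin shell.
tp() {
  local f
  # Walk the original arguments once; for each one, append its converted
  # path to the end of the argument list and drop the original from the
  # front. When the loop ends, only converted paths remain.
  for f in "$@"; do
    set -- "$@" "$(cygpath --mixed "$f")"
    shift
  done
  textpad "$@" &
}

# Possible usage:
#   tp ./src/com/spugbrap/foo/bar/TestFooImpl.java ./servlet/com/spugbrap/baz/FooDispatcher.java
```

Quoting every expansion keeps file names that contain spaces in one piece.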