Gri: Using System Tools With Gri

11.2: Using System Tools With Gri

Using system tools to manipulate your data makes sense for several reasons. First, you may be familiar with those tools already. Second, learning these tools will help you in all your work.

Why Use The Environment: Introduction
Grep: Search files for patterns
Sed: Serial editor
Awk: Search and manipulate data
Perl: Search and manipulate data

11.2.1: Introduction

Each of the programs listed in the sections below is available for Unix. Some (e.g. Perl and the Awk variant called Gawk) are available on other operating systems as well. Each of these tools is fully documented in Unix manpages, so here I'll just give an indication of a couple of useful techniques you might want to use.

Bear in mind that these tools can do very similar jobs. For example, Awk can do much of what Sed and Grep can do, but also a whole lot more. If you don't know Sed or Grep, I suggest you learn Awk instead. Then again, Perl can also do anything Gawk can do, and a whole lot more! (For one thing, it is easier to work with multiple files in Perl.) In fact, Perl is the most powerful of this list. If you know none of these tools, you might want to learn Perl instead of the others. But Perl is more complicated for simple work than Awk is, so the most reasonable path might be to learn both Awk and Perl.

For simple applications, you will probably want to use these tools in a piped open command, e.g.

open "awk '{print $1, $3/$2}' MyFile |"

which creates a temporary file (automatically erased when Gri finishes) containing the output of the system command that precedes the pipe symbol (`|') (see Open).

(Here and in all the examples of this chapter, it is assumed that the user's input file is named `MyFile'.)
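
Before putting such a command inside a Gri `open', it can help to run it by itself in a shell and look at what Gri would be given to read. The following is a minimal sketch of that check. The three-column contents of `MyFile' are made up for illustration, and the scratch directory is only there so that nothing real gets overwritten.

```shell
# Work in a scratch directory so no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: three whitespace-separated columns.
printf '1 2 4\n2 4 12\n3 5 20\n' > MyFile

# The same command as inside the Gri open "... |" string:
# print column 1, then column 3 divided by column 2.
ratio_output=$(awk '{print $1, $3/$2}' MyFile)
printf '%s\n' "$ratio_output"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

If the two columns printed look right, the same command can be pasted between the double quotes of the `open' command, followed by the pipe symbol.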

For more complicated applications, you may use the Gri `system' command as follows.

system perl >tmp <<"EOF"
open(IN, "MyFile");
while(<IN>) {
    ($x, $y) = split;
    print "$x", " ", cos($y), "\n";
}
EOF
open tmp
read columns x y
draw curve
system rm -f tmp

Here a temporary file, named `tmp', has been used to store the results of the calculation. Note that this file was explicitly removed by the second `system' command. (Many folks, including the author, would prefer to take the Perl script out of the above and make it a standalone executable script, calling it from Gri with the one-line form. But it is just a matter of style.)

11.2.2: Grep

The most common application of Grep is to select lines matching a pattern, or not matching a pattern. Here is how to skip all lines with the word "HEADER" in them:

open "grep -v 'HEADER' MyFile |"
...

Note that Gawk and Perl do this just as easily.
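
As a rough illustration, the sketch below runs the Grep command on made-up data and compares it with one possible Awk equivalent. The sample contents of `MyFile' are hypothetical.

```shell
# Work in a scratch directory so no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data with two header lines mixed in.
printf 'HEADER x y\n1 10\nHEADER more\n2 20\n' > MyFile

# -v inverts the match: keep only lines that do NOT contain HEADER.
grep_output=$(grep -v 'HEADER' MyFile)

# One Awk equivalent: a negated pattern with the default (print) action.
awk_output=$(awk '!/HEADER/' MyFile)

printf '%s\n' "$grep_output"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Both commands leave only the two data lines, so either could be used inside the piped `open'.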

11.2.3: Sed

Sed is normally used for simple changes to files. For example, if you have columnar data which are separated with comma characters instead of whitespace, you could make it Gri compatible by

open "sed -e 's/,/ /g' MyFile |"

where the `-e' flag indicates that the next item is a command to Sed, in this case a command to substitute ("s") a blank character for the comma character, globally ("g"), that is, for every comma on each line of the file and not just the first. See also the overview of Perl.
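
The effect of the "g" flag is easiest to see by trying the command with and without it. This is a small sketch on made-up comma-separated data; the contents of `MyFile' are hypothetical.

```shell
# Work in a scratch directory so no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: three comma-separated columns.
printf '1,10,100\n2,20,200\n' > MyFile

# With g: every comma on each line becomes a blank.
sed_output=$(sed -e 's/,/ /g' MyFile)

# Without g: only the first comma on each line is replaced.
first_only=$(sed -e 's/,/ /' MyFile)

printf '%s\n' "$sed_output"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Only the first form gives whitespace-separated columns throughout, which is what `read columns' needs.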

11.2.4: Awk

Awk is great for one-liners. If your system lacks Awk, you can procure the GNU version, called Gawk, from the web for free. For better or worse, Gawk is not fully compatible with Awk. The good thing is that Gawk is pretty much the same on all operating systems, whereas the installed Awk may not be. I use Gawk instead of Awk, for this reason and because it is normally faster.

The main concept in Awk is of "patterns" and "actions." In the Awk syntax, actions are written in braces following patterns written with no braces. (This will become clear presently.)

Whenever a line in the data file matches the pattern, the action is done.

The default pattern is to match to every line in the file. This is done if no pattern is supplied.

The default action is to print the line, and this is done if no action is supplied.

Here are a few examples. To skip the first 10 lines of a file:

open "awk 'NR>10' MyFile |"
read ...

Here the pattern is that `NR' (a special Awk variable holding the record number, starting with 1 for the first line of the file) exceeds 10. Since nothing was supplied between braces, the default action was taken, which is to print the line.

To plot the cosine of the second column against the first column:

open "awk '{print ($1, cos($2))}' MyFile |"
read columns x y
draw curve

Here no pattern was supplied, so the action was done for every line.

Combining these two forms, then, and supplying both a pattern and an action, here is how one might print the first and eighth columns of the file `MyFile', but only for the first 10 lines of the file:

open "awk 'NR<=10 {print ($1, $8)}' MyFile |"
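
The three forms (pattern only, action only, and pattern with action) can be compared from the shell. The sketch below is an illustration only: the 12-line, two-column `MyFile' is generated on the spot, and to keep the data small it prints columns 2 and 1 for the first 3 lines rather than columns 1 and 8 for the first 10. Note that the Awk program is wrapped in single quotes, so that the shell does not treat `<' as a redirection.

```shell
# Work in a scratch directory so no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 12 lines, line number in column 1, zero in column 2.
awk 'BEGIN { for (i = 1; i <= 12; i++) print i, 0 }' > MyFile

# Pattern only: the default action prints lines whose record number exceeds 10.
pattern_only=$(awk 'NR>10' MyFile)

# Action only: the action runs on every line; show just the first two results.
action_first=$(awk '{print $1, cos($2)}' MyFile | head -n 2)

# Pattern and action: print columns 2 and 1, but only for the first 3 lines.
both=$(awk 'NR<=3 {print $2, $1}' MyFile)

printf '%s\n' "$pattern_only" "$action_first" "$both"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```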

11.2.5: Perl

Perl can do almost anything with your data, since it is a full programming language designed to also emulate several Unix utilities.

In Perl, as in Sed, the commandline switch `-e' indicates that a command is given in the next word of the command line. The commandline switch `-p' tells Perl to print each line after any indicated Perl actions have been done on it. Here, for example, is how one would emulate a Sed replacement of comma by blank:

open "perl -pe 's/,/ /g' MyFile |"

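Perl also has a commandline switch `-a' indicating that lines should be "autosplit" into an array called `@F'. The first element of this array is `$F[0]'. Splitting is normally done on white-space character(s), although this may be changed if desired. When printing computed values rather than the input line itself, use `-n' in place of `-p' (it loops over the lines without printing them), and add `-l' so that each `print' ends with a newline. Note also that `-e' must come last in a group of switches, since Perl takes whatever follows it as the code. Here, for example, is how to take the cosine of the second column of a file, and print this after the first column (`q( )' is Perl's way of writing a single blank as a string without using quote characters):

open "perl -lane 'print join(q( ), $F[0], cos($F[1]))' MyFile |"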