Episode Details

HPR4597: UNIX Curio #2 - fgrep

Published 1 week, 1 day ago

Description

This show has been flagged as Clean by the host.

This series is dedicated to exploring little-known—and occasionally useful—trinkets lurking in the dusty corners of UNIX-like operating systems.

Imagine, if you will, a Jane Austen novel about three sisters. The first is well-known and celebrated by everyone; the second, while slightly smarter and more capable, is significantly less popular; and the third languishes in near-total isolation and obscurity. These three sisters live on any UNIX-like system, and their names are grep , egrep , and fgrep .

We will assume you are already familiar with grep —egrep works pretty much the same, except she handles e xtended regular expression syntax. (When writing shell scripts intended to be portable, be careful to call egrep if your expression uses + , ? , | , or braces as metacharacters. Some versions of GNU grep make no distinction between basic and extended regular expressions, so you may be surprised when your script works on one system but not another.)

But our subject for today is poor, unnoticed fgrep . While the plainest sister of the three, she really doesn't deserve to be ignored. The "f" in her name stands either for f ixed-string or f ast, depending on who you ask. She does not handle regular expressions at all; the pattern she is given is taken literally. This is a great advantage when what you are searching for contains characters having special meaning in a regular expression.

Suppose you have a directory full of PHP scripts and want to find references to an array element called $tokens[0] . You can try grep (note that the single quotes are necessary to prevent the shell from interpreting $tokens as a shell variable):

$ grep '$tokens[0]' *.php

But there is no output. The reason is that the brackets have special significance to grep ; [0] is interpreted as a character class containing only 0. Therefore, this command looks for the string $tokens0 , which is not what we want. We would have to escape the brackets with backslashes to get the correct match (some implementations may require you to escape the dollar sign also):

$ grep '$tokens\[0\]' *.php
parser.php:   $outside[] = $tokens[0];

Instead of fooling with all that escaping (which might get tedious if our pattern contains many special characters), we can just use fgrep instead:

$ fgrep '$tokens[0]' *.php
parser.php:   $outside[] = $tokens[0];

One place where fgrep can be particularly handy is when searching through log files for IP addresses. With ordinary grep , the pattern 43.2.1.0 would match 43.221.0.123, 43.2.110.123, and a bunch of other IP addresses you're not interested in because the dot metacharacter will match any character. To make sure you only matched a literal dot you'd have to escape each one with a backslash or, better yet, use fgrep .

But what about the claim that fgrep is fast? On GNU systems, there is usually one single binary that changes its behavior depending on whether it is called as grep , egrep , or fgrep . (Actually, this is in line with the POSIX standard ¹ , which deprecates egrep and fgrep in favor of a single grep command taking the -E option for using extended regular expr

Episode Details

HPR4597: UNIX Curio #2 - fgrep

Description

Listen Now

Love PodBriefly?