Filters and Regular Expressions - BunksAllowed


Filters and Regular Expressions

Filters are commands that read data, perform operations on that data, and then send the results to the standard output. 

Filters generate different kinds of output, depending on their task.
  • Some filters report information about the input, others output selected parts of it, and still others output the entire input in a modified form.
  • Some filters do only one of these, while others have options that select which kind of output to produce.
You can think of a filter as operating on a stream of data: it receives data and generates modified output. As data passes through the filter, it is analyzed, screened, or modified.
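
The three kinds of output can be seen side by side in a small sketch. The sample words are made up for illustration, and wc, head, and tr are simply convenient stand-ins for each kind of filter.

```shell
# Build a small sample stream; the contents are made up for illustration.
sample=$(printf 'pear\napple\nfig\n')

# 1. Information about the input: wc -l reports the number of lines.
#    (tr -d ' ' strips the padding that some wc versions add.)
line_count=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')

# 2. Selected parts of the input: head -n 1 keeps only the first line.
first_line=$(printf '%s\n' "$sample" | head -n 1)

# 3. The entire input in modified form: tr converts it to upper case.
upper=$(printf '%s\n' "$sample" | tr '[:lower:]' '[:upper:]')

echo "$line_count"
echo "$first_line"
echo "$upper"
```

All three filters read the same stream, and only the kind of output differs.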

The data stream input to a filter consists of a sequence of bytes that can be received from files, devices, or the output of other commands or filters. The filter operates on the data stream, but it does not modify the source of the data. If a filter receives input from a file, the file itself is not modified. Only its data is read and fed into the filter.
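
You can check for yourself that a filter leaves its source alone. The following is a minimal sketch, assuming a throwaway file named fruits (a made-up name) in a temporary directory. It sorts the file and then compares the file against a copy taken beforehand.

```shell
# Work in a temporary directory so nothing of yours is touched.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/fruits"
cp "$workdir/fruits" "$workdir/fruits.orig"

# sort reads the file and writes the sorted lines to standard output.
sorted=$(sort "$workdir/fruits")
echo "$sorted"

# cmp -s exits successfully only if the two files are byte-for-byte identical.
if cmp -s "$workdir/fruits" "$workdir/fruits.orig"; then
    unchanged=yes
else
    unchanged=no
fi
echo "source unchanged: $unchanged"

rm -rf "$workdir"
```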

The output of a filter is usually sent to the standard output. It can then be redirected to another file or device, or piped as input to another utility or filter. All the features of redirection and pipes apply to filters. Often data is read by one filter and its modified output piped into another filter.
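
As a rough illustration of piping one filter into another, the sketch below counts the distinct lines in a made-up stream. The choice of sort, uniq, and wc is only an example; any filters can be chained the same way.

```shell
# sort brings identical lines together, uniq collapses each run of
# identical lines to one, and wc -l counts the lines that remain.
# (tr -d ' ' strips the padding that some wc versions add.)
distinct=$(printf 'fig\napple\nfig\npear\napple\n' | sort | uniq | wc -l | tr -d ' ')
echo "$distinct"
```

Each filter in the chain sees only the output of the one before it.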

Many utilities and filters use patterns to locate and select specific text in your file. Sometimes, you may need to use patterns in a more flexible and powerful way, searching for several different variations on a given pattern. You can include a set of special characters in your pattern to enable a flexible search. A pattern that contains such special characters is called a regular expression. Regular expressions can be used in most filters and utilities that employ pattern searches such as sed, awk, grep, and egrep.
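
A few of these special characters can be tried out with grep. This is a minimal sketch on made-up input. It covers only the period and the two line anchors, not the full set of regular expression characters.

```shell
# Sample lines, invented for this illustration.
text=$(printf 'text file\ntest file\ntext files\ncontext\n')

# '.' matches any single character, so 'te.t file' matches both
# "text file" and "test file".
any_char=$(printf '%s\n' "$text" | grep 'te.t file')

# '^' anchors the match to the start of the line, which excludes "context".
line_start=$(printf '%s\n' "$text" | grep '^text')

# '$' anchors the match to the end of the line, which excludes "text files".
line_end=$(printf '%s\n' "$text" | grep 'file$')

echo "$any_char"
echo "$line_start"
echo "$line_end"
```

The patterns are enclosed in single quotation marks so that the shell passes the special characters to grep unchanged.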

You can save the output of a filter in a file or send it to a printer. To do so, you use redirection or pipes. To save the output of a filter in a file, you redirect it with the redirection operator, >. To send output to the printer, you pipe it to the lpr utility, which then prints it. In the next command, the cat command pipes its output to the lpr command, which then prints it.

$ cat complist | lpr
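
Saving to a file works the same way, with > in place of the pipe. The sketch below assumes a made-up complist file created in a temporary directory, and the name complist.sorted is likewise only an illustration.

```shell
# Create a sample complist in a temporary directory (contents are invented).
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/complist"

# Redirect the sorted output into a file instead of the terminal.
sort "$workdir/complist" > "$workdir/complist.sorted"

# Read the saved result back to confirm what was written.
saved=$(cat "$workdir/complist.sorted")
echo "$saved"

rm -rf "$workdir"
```

Keep in mind that > replaces the contents of the target file if it already exists.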
All filters accept input from the standard input. In fact, the output of one filter can be piped as the input for another filter. Many filters also accept input directly from files, however. Such filters can take filenames as their arguments and read data directly from those files.
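
The different ways of supplying input can be compared directly. In this sketch, which uses a made-up file named fruits, sort receives the same data as a filename argument, through input redirection, and through a pipe.

```shell
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/fruits"

# The filter opens the file itself.
from_argument=$(sort "$workdir/fruits")

# The shell connects the file to the filter's standard input.
from_redirect=$(sort < "$workdir/fruits")

# Another command's output becomes the filter's standard input.
from_pipe=$(cat "$workdir/fruits" | sort)

echo "$from_argument"

rm -rf "$workdir"
```

All three variables end up holding the same sorted lines.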

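The grep and fgrep filters search the contents of files for a pattern and print each line in which it occurs. When more than one file is searched, each line is preceded by the name of the file in which it was found. grep interprets its pattern as a regular expression, whereas fgrep treats it as a fixed string. fgrep was traditionally the filter to use for searching for several patterns at a time, although grep also accepts more than one pattern when each is given with its own -e option. On current systems, fgrep is the older name for grep -F.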
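
A search for more than one fixed string might look like the following sketch. It uses grep -F, the current spelling of fgrep, with one -e option per pattern. The file contents are invented for illustration.

```shell
workdir=$(mktemp -d)
printf 'data in the file.\nnothing here\n' > "$workdir/preface"
printf 'new data\na text file\n' > "$workdir/intro"

# -F treats each pattern as a fixed string rather than a regular
# expression; each -e adds one more pattern to search for.
matches=$(cd "$workdir" && grep -F -e 'data' -e 'text' preface intro)
echo "$matches"

rm -rf "$workdir"
```

A line is printed if it contains any one of the patterns, and because two files are searched, each line is prefixed with its filename.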

The grep filter takes two types of arguments. The first argument is the pattern to be searched for; the second argument is a list of filenames, which are the files to be searched. You enter the filenames on the command line after the pattern. You can also use special characters, such as the asterisk, to generate a file list.

$ grep pattern filenames-list
If you want to include more than one word in the pattern, you enclose the words within single quotation marks. This quotes the spaces between the words. Otherwise, the shell would treat each space as a delimiter between arguments, and grep would interpret the second and later words of the pattern as part of the file list. In the next example, grep searches for the pattern "text file":

$ grep 'text file' preface
A text file in Unix
text files, changing or
If you use more than one file in the file list, grep will output the name of the file before the matching line. In the next example, two files, preface and intro, are searched for the pattern "data". Before each occurrence, the filename is output.

$ grep data preface intro
preface:data in the file.
intro:new data
As mentioned earlier, you can also use shell file expansion characters to generate a list of files to be searched. In the next example, the asterisk file expansion character is used to generate a list of all files in your directory. This is a simple way of searching all of a directory's files for a pattern.

$ grep data *
File expansion characters are also useful for searching a selected set of files. For example, if you want to search all your C program source code files for a particular pattern, you can specify them with *.c. Suppose you have an unintended infinite loop in your program and you need to locate every while loop. The next example searches only those files with a .c extension for the pattern "while" and displays each line that contains it:

$ grep while *.c





