Finding Strings in Text Files Using grep with Regular Expressions

29/12/2020
grep is one of the most popular tools for searching for strings in text files. The name ‘grep’ derives from a command in the now-obsolete Unix line editor ed: the ed command for searching globally through a file for a regular expression and then printing the matching lines was g/re/p, where re was the regular expression to use. Eventually, grep was written as a standalone command to perform this search on a file without using ed.

In this article, we show you how to run advanced string searches using grep with regular expressions, working through 10 hands-on examples. Many of the examples are practical enough to use in your daily work on Linux. The following examples give regexps for commonly searched-for patterns.

Ex 1: Find a Single Character in a Text File

To output lines in the file ‘book’ that contain a ‘$’ character, type:

$ grep '\$' book
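As a minimal sketch of why the ‘$’ needs care, the following script compares an unescaped ‘$’, which grep reads as the end-of-line anchor, with an escaped one. The sample lines are made up for illustration and stand in for the file ‘book’.

```shell
# Hypothetical sample data standing in for the file 'book'.
book=$(mktemp)
printf '%s\n' 'Price: $14.99' 'Out of stock' 'Was $20, now $15' > "$book"

# An unescaped '$' is only an end-of-line anchor, so every line matches.
all_lines=$(grep -c '$' "$book")

# Escaping it makes grep look for a literal dollar sign.
dollar_lines=$(grep '\$' "$book")

echo "Lines matched by '\$' as an anchor: $all_lines"
echo "$dollar_lines"
rm -f "$book"
```

The single quotes matter as well: they stop the shell from interpreting the backslash and the ‘$’ before grep sees them.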

Ex 2: Find a Single String in a Text File

To output lines in the file ‘book’ that contain the string ‘$14.99’, type:

$ grep '\$14\.99' book
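The sketch below shows what can go wrong when the ‘.’ in a price is left unescaped, and one alternative: the ‘-F’ option, which tells grep to treat the pattern as a fixed string and not as a regexp. The sample lines are invented for illustration.

```shell
# Hypothetical sample data standing in for the file 'book'.
book=$(mktemp)
printf '%s\n' 'Paperback: $14.99' 'Item 1234: $14x99 (typo)' 'Hardcover: $24.99' > "$book"

# An unescaped '.' matches any single character, so the typo line matches too.
loose=$(grep -c '\$14.99' "$book")

# Escaping both '$' and '.' restricts the match to the literal string.
strict=$(grep '\$14\.99' "$book")

# -F treats the whole pattern literally, so no escaping is needed.
fixed=$(grep -F '$14.99' "$book")

echo "Lines matched with unescaped dot: $loose"
echo "$strict"
echo "$fixed"
rm -f "$book"
```

For purely literal searches, ‘-F’ is often the less error-prone choice, since there are no metacharacters to remember.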

Ex 3: Find a Single Special Character in a Text File

To output lines in the file ‘book’ that contain a ‘\’ character, type:

$ grep '\\' book

Ex 4: Matching Lines Beginning with Certain Text

Use ‘^’ in a regexp to denote the beginning of a line.

To output all lines in ‘/usr/dict/words’ beginning with ‘pro’, type:

$ grep '^pro' /usr/dict/words

To output all lines in the file ‘book’ that begin with the text ‘in the beginning’, regardless of case, type:

$ grep -i '^in the beginning' book

NOTE: These regexps are quoted with ‘'’ characters because some shells otherwise treat the ‘^’ character as a special “metacharacter”.
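To see the effect of the beginning-of-line anchor, the sketch below runs the same search with and without it, and with ‘-i’. The word list is a made-up stand-in for a dictionary file.

```shell
# Hypothetical word list standing in for a dictionary file.
words=$(mktemp)
printf '%s\n' 'process' 'improve' 'Protein' 'apron' > "$words"

# Anchored: only lines that start with 'pro'.
starts=$(grep '^pro' "$words")

# Anchored and case-insensitive: 'Protein' now matches as well.
starts_i=$(grep -i '^pro' "$words")

# Unanchored: 'pro' anywhere in the line.
anywhere=$(grep -c 'pro' "$words")

echo "$starts"
echo "$starts_i"
echo "Lines containing 'pro' anywhere: $anywhere"
rm -f "$words"
```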

In addition to word and phrase searches, you can use grep to search for complex text patterns called regular expressions. A regular expression—or “regexp”—is a text string of special characters that specifies a set of patterns to match.

Technically speaking, the word or phrase patterns are regular expressions—just very simple ones. In a regular expression, most characters—including letters and numbers—represent themselves. For example, the regexp pattern 1 matches the string ‘1’, and the pattern boy matches the string ‘boy’.

There are a number of reserved characters, called metacharacters, that do not represent themselves in a regular expression; instead, they have special meanings that are used to build complex patterns. These metacharacters are as follows: ., *, [, ], ^, $, and \. They behave the same way in grep across virtually all Linux distributions.
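The following sketch exercises several of these metacharacters against a tiny sample file, so the difference between a metacharacter and its escaped, literal form is visible. The file contents and variable names are invented for illustration.

```shell
# Hypothetical sample lines chosen to show each metacharacter.
sample=$(mktemp)
printf '%s\n' 'cat' 'cot' 'ct' 'coot' 'a.b' 'C:\temp' > "$sample"

# '.' matches any single character: cat, cot.
dot=$(grep -c 'c.t' "$sample")

# '*' matches zero or more of the preceding item: cot, ct, coot.
star=$(grep -c 'co*t' "$sample")

# '[...]' matches any one character from the list: cat, cot.
bracket=$(grep -c 'c[ao]t' "$sample")

# A backslash removes the special meaning: '\.' is a literal dot.
literal_dot=$(grep 'a\.b' "$sample")

# To match a backslash itself, escape it with another backslash.
backslash=$(grep '\\' "$sample")

echo "c.t: $dot, co*t: $star, c[ao]t: $bracket"
echo "$literal_dot"
echo "$backslash"
rm -f "$sample"
```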

Ex 5: Matching Lines Ending with Certain Text

Use ‘$’ as the last character of quoted text to match that text only at the end of a line. To output lines in the file ‘going’ ending with an exclamation point, type:

$ grep '!$' going
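A short sketch of the end-of-line anchor, using made-up lines in place of the file ‘going’: only a line whose last character is ‘!’ matches the anchored pattern, while the unanchored pattern also matches an exclamation point in the middle of a line.

```shell
# Hypothetical sample data standing in for the file 'going'.
going=$(mktemp)
printf '%s\n' 'Here we go!' 'Wait! Not yet.' 'Stop' > "$going"

# Anchored: the '!' must be the last character on the line.
ending=$(grep '!$' "$going")

# Unanchored: a '!' anywhere on the line.
anywhere=$(grep -c '!' "$going")

echo "$ending"
echo "Lines containing '!' anywhere: $anywhere"
rm -f "$going"
```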

Ex 6: Matching Lines of a Certain Length

To match lines of a particular length, use that number of ‘.’ characters between ‘^’ and ‘$’. For example, to match all lines that are two characters (or columns) wide, use ‘^..$’ as the regexp to search for.

To output all lines in ‘/usr/dict/words’ that are exactly three characters wide, type:

$ grep '^...$' /usr/dict/words

For longer lines, it is more useful to use a different construct: ‘^.\{number\}$’, where number is the number of characters to match. Use ‘,’ to specify a range of numbers.

To output all lines in ‘/usr/dict/words’ that are exactly twelve characters wide, type:

$ grep '^.\{12\}$' /usr/dict/words

To output all lines in ‘/usr/dict/words’ that are twenty-two or more characters wide, type:

$ grep '^.\{22,\}$' /usr/dict/words
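The repetition count can be written in two ways, depending on the regexp dialect: with backslashed braces in grep's default basic regular expressions, or with plain braces when the ‘-E’ option selects extended regular expressions. The sketch below shows both on an invented word list.

```shell
# Hypothetical word list standing in for a dictionary file.
words=$(mktemp)
printf '%s\n' 'a' 'cat' 'dog' 'bird' 'hippopotamus' > "$words"

# Exactly three characters, spelled out with dots.
three=$(grep '^...$' "$words")

# Exactly twelve characters, basic regexp syntax (backslashed braces).
twelve_bre=$(grep '^.\{12\}$' "$words")

# The same search in extended regexp syntax (plain braces with -E).
twelve_ere=$(grep -E '^.{12}$' "$words")

# A range: between three and four characters wide.
range=$(grep -E '^.{3,4}$' "$words")

echo "$three"
echo "$twelve_bre"
echo "$twelve_ere"
echo "$range"
rm -f "$words"
```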

Ex 7: Matching Lines That Contain Any of Some Regexps

To match lines that contain any of a number of regexps, specify each of the regexps to search for, separated by alternation operators (‘\|’), as the regexp to search for. Lines containing any of the given regexps will be output.

To output all lines in ‘playboy’ that contain either of the patterns ‘the book’ or ‘cake’, type:

$ grep 'the book\|cake' playboy
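In grep's default basic regular expressions, an unescaped ‘|’ is an ordinary character, and the backslashed form of alternation is a GNU extension. Two more portable ways to express "either pattern" are sketched below: the ‘-E’ option, where a plain ‘|’ is the alternation operator, and repeated ‘-e’ options, one per pattern. The sample lines are made up for illustration.

```shell
# Hypothetical sample data standing in for the file to search.
notes=$(mktemp)
printf '%s\n' 'I read the book twice' 'She baked a cake' 'Nothing here' > "$notes"

# Extended regexps: a plain '|' separates the alternatives.
with_ere=$(grep -E 'the book|cake' "$notes")

# Multiple -e options: a line matches if any one pattern matches.
with_e=$(grep -e 'the book' -e 'cake' "$notes")

echo "$with_ere"
echo "$with_e"
rm -f "$notes"
```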

Ex 8: Matching Lines That Contain All of Some Regexps

To output lines that match all of a number of regexps, use grep to output lines containing the first regexp you want to match, and pipe the output to a grep with the second regexp as an argument. Continue adding pipes to grep searches for all the regexps you want to search for.

To output all lines in ‘playlist’ that contain both patterns ‘the shore’ and ‘sky’, regardless of case, type:

$ grep -i 'the shore' playlist | grep -i sky
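The sketch below applies the piping approach to a few invented lines. Because each grep in the pipeline filters independently, the order in which the two patterns appear on a line does not matter.

```shell
# Hypothetical sample data standing in for the file 'playlist'.
playlist=$(mktemp)
printf '%s\n' \
  'Walking along the shore under a grey Sky' \
  'The Shore at noon' \
  'Blue sky above' \
  'sky over The Shore' > "$playlist"

# The first grep keeps lines with 'the shore'; the second keeps only
# those of them that also contain 'sky'.
both=$(grep -i 'the shore' "$playlist" | grep -i 'sky')

echo "$both"
rm -f "$playlist"
```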

Ex 9: Matching Lines That Only Contain Certain Characters

To match lines that only contain certain characters, use the regexp ‘^[characters]*$’, where characters are the ones to match. To output lines in ‘/usr/dict/words’ that only contain vowels, type:

$ grep -i '^[aeiou]*$' /usr/dict/words

The ‘-i’ option matches characters regardless of case; so, in this example, all vowel characters are matched regardless of case.
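One detail worth noting: because ‘*’ means "zero or more", the pattern also matches empty lines. The sketch below counts matches on an invented word list that includes an empty line, then repeats the search with the extended regexp operator ‘+’ ("one or more") to exclude it.

```shell
# Hypothetical word list; the third entry is an empty line.
words=$(mktemp)
printf '%s\n' 'a' 'I' '' 'oui' 'eau' 'you' 'Audio' > "$words"

# '*' allows zero characters, so the empty line is counted too.
with_star=$(grep -c -i '^[aeiou]*$' "$words")

# With -E, '+' requires at least one vowel on the line.
with_plus=$(grep -c -i -E '^[aeiou]+$' "$words")

echo "Matches with '*': $with_star"
echo "Matches with '+': $with_plus"
rm -f "$words"
```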

Ex 10: Finding Phrases Regardless of Spacing

One way to search for a phrase that might occur with extra spaces between words, or across a line or page break, is to remove all line breaks and extra spaces from the input, and then grep that. To do this, pipe the input to tr with ‘'\r\n:>|-'’ as an argument to the ‘-d’ option (removing all line breaks, along with any ‘:’, ‘>’, ‘|’, and ‘-’ characters, from the input); pipe that to the fmt filter with the ‘-u’ option (outputting the text with uniform spacing); and pipe that to grep with the pattern to search for.

To search across line breaks for the string ‘at the same time as’ in the file ‘docs’, type:

$ cat docs | tr -d '\r\n:>|-' | fmt -u | grep 'at the same time as'
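A caveat worth knowing when removing line breaks: if a line does not end with a space, deleting its line break joins its last word to the first word of the next line. The sketch below is a variation, not the command above: it translates line breaks into spaces with tr, then squeezes repeated spaces with ‘tr -s’ in place of ‘fmt -u’. The sample text is made up for illustration.

```shell
# Hypothetical sample data standing in for the file 'docs'; the phrase
# is split across two lines and contains extra spaces.
docs=$(mktemp)
printf '%s\n' 'It arrived at the   same' 'time as the others.' > "$docs"

# A plain grep finds nothing, since no single line holds the whole phrase.
# (grep exits non-zero when nothing matches, hence the '|| true'.)
plain=$(grep -c 'at the same time as' "$docs" || true)

# Turn each line break into a space, squeeze runs of spaces into one,
# then search the resulting single line.
joined=$(tr '\r\n' '  ' < "$docs" | tr -s ' ' | grep -c 'at the same time as')

echo "Plain grep matches: $plain"
echo "After joining lines: $joined"
rm -f "$docs"
```

Since the whole input becomes one long line, grep reports that single line as the match; adding the ‘-o’ option, where available, prints only the matched phrase.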

Summary

In this article, we reviewed 10 practical examples of using the Linux grep command to search for strings in text files. Along the way, we learned how to use regular expressions with grep to conduct complex searches on text files. By now you should have a better idea of how powerful Linux search tools are.
