Finding strings in text files using grep with regular expressions

29/12/2020
grep is one of the most popular tools for searching and finding strings in a text file. The name ‘grep’ derives from a command in the now-obsolete Unix line editor ed: the ed command for searching globally through a file for a regular expression and then printing the matching lines was g/re/p, where re was the regular expression to search for. Eventually, grep was written as a standalone command to run this search on a file without using ed.

In this article, we show you how to run advanced string searches using grep with regular expressions, through 10 hands-on examples. Many of the examples are practical, meaning you can use them in your daily work on Linux. The following examples show regexps for commonly searched-for patterns.

Ex 1: Find a Single Character in a Text File

To output lines in the file ‘book’ that contain a ‘$’ character, escape the ‘$’ with a backslash so that it is not treated as the end-of-line metacharacter, and type:

$ grep '\$' book

Ex 2: Find a Single String in a Text File

To output lines in the file ‘book’ that contain the string ‘$14.99’, type:

$ grep '\$14\.99' book
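In a regexp, an unescaped ‘.’ matches any single character and an unescaped ‘$’ marks the end of a line, so both are escaped with a backslash when they should be matched literally. The following is a minimal sketch of the difference; the temporary file and its three sample lines are made up for illustration and merely stand in for ‘book’.

```shell
# Hypothetical sample data standing in for the file 'book'.
book=$(mktemp)
printf '%s\n' \
  'Paperback: $14.99' \
  'Hardcover: $14x99 (misprint)' \
  'Page 14.99 of the index' > "$book"

# The '.' is left unescaped, so it matches any character: the misprint matches too.
loose=$(grep '\$14.99' "$book")

# Escaping the '.' restricts the match to a literal period.
strict=$(grep '\$14\.99' "$book")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$book"
```

The third sample line is never printed because it has no ‘$’ in front of the number.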

Ex 3: Find a Single Special Character in a Text File

To output lines in the file ‘book’ that contain a ‘\’ character, type:

$ grep '\\' book
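When the text to find contains several special characters, an alternative to escaping each one is grep's ‘-F’ option, which treats the pattern as a fixed string rather than a regexp. The sketch below shows both approaches side by side; the sample lines are assumptions made up for this illustration.

```shell
# Hypothetical sample data standing in for the file 'book'.
book=$(mktemp)
printf '%s\n' \
  'C:\temp\notes.txt' \
  'Total: $5.00' \
  'Total: 5 dollars' > "$book"

# In a regexp, a literal backslash is written as two backslashes.
backslash_lines=$(grep '\\' "$book")

# With -F the pattern is a fixed string, so '$' and '.' need no escaping.
fixed_lines=$(grep -F '$5.00' "$book")

printf '%s\n' "$backslash_lines" "$fixed_lines"
rm -f "$book"
```

The single quotes still matter with ‘-F’: they stop the shell from trying to expand ‘$5’ before grep ever sees the pattern.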

Ex 4: Matching Lines Beginning with Certain Text

Use ‘^’ in a regexp to denote the beginning of a line.

To output all lines in ‘/usr/dict/words’ beginning with ‘pro’ (on most modern Linux distributions the word list is found at ‘/usr/share/dict/words’ instead), type:

$ grep '^pro' /usr/dict/words

To output all lines in the file ‘book’ that begin with the text ‘in the beginning’, regardless of case, type:

$ grep -i '^in the beginning' book

NOTE: These regexps are enclosed in single quote characters because some shells otherwise treat the ‘^’ character as a special “metacharacter”.
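To see what the line-start anchor changes, the sketch below runs the same search with and without it, and then with ‘-i’. The five-word list is a made-up stand-in for a dictionary file.

```shell
# Hypothetical word list standing in for a dictionary file.
words=$(mktemp)
printf '%s\n' process improve Prose approach profit > "$words"

# Anchored: 'pro' must be the first three characters of the line.
anchored=$(grep '^pro' "$words")

# Unanchored: 'pro' may appear anywhere in the line; -c counts matching lines.
unanchored_count=$(grep -c 'pro' "$words")

# Anchored and case-insensitive: 'Prose' now matches as well.
anchored_nocase=$(grep -i '^pro' "$words")

printf 'anchored:\n%s\n\n' "$anchored"
printf 'lines containing pro anywhere: %s\n\n' "$unanchored_count"
printf 'anchored, ignoring case:\n%s\n' "$anchored_nocase"
rm -f "$words"
```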

In addition to word and phrase searches, you can use grep to search for complex text patterns called regular expressions. A regular expression—or “regexp”—is a text string of special characters that specifies a set of patterns to match.

Technically speaking, the word or phrase patterns are regular expressions—just very simple ones. In a regular expression, most characters—including letters and numbers—represent themselves. For example, the regexp pattern 1 matches the string ‘1’, and the pattern boy matches the string ‘boy’.

There are a number of reserved characters called metacharacters that do not represent themselves in a regular expression; instead, they have special meanings that are used to build complex patterns. These metacharacters are: ., *, [, ], ^, $, and \. They behave the same way in grep on almost all Linux distributions.
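As a small illustration of a few of these metacharacters, the sketch below runs four patterns against a made-up list of short strings (the file contents are assumptions chosen for the demonstration, not part of the original examples).

```shell
# Hypothetical sample lines for trying out individual metacharacters.
sample=$(mktemp)
printf '%s\n' boy bay buy by booooy b.y > "$sample"

# '.' matches any single character.
dot=$(grep '^b.y$' "$sample")

# '*' matches zero or more repetitions of the preceding item (here 'o').
star=$(grep '^bo*y$' "$sample")

# '[...]' matches any one of the characters listed inside the brackets.
bracket=$(grep '^b[ao]y$' "$sample")

# '\' removes the special meaning of the next character.
literal=$(grep '^b\.y$' "$sample")

printf 'dot:\n%s\n\nstar:\n%s\n\nbracket:\n%s\n\nliteral:\n%s\n' \
  "$dot" "$star" "$bracket" "$literal"
rm -f "$sample"
```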

Ex 5: Matching Lines Ending with Certain Text

Use ‘$’ as the last character of quoted text to match that text only at the end of a line. To output lines in the file ‘going’ ending with an exclamation point, type:

$ grep '!$' going
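One thing to watch for with the end-of-line anchor is invisible trailing whitespace, which keeps a line from matching. The sketch below uses made-up sample lines (one of them ends with a space) and shows a hedged variant that tolerates trailing whitespace with the POSIX ‘[[:space:]]’ character class.

```shell
# Hypothetical sample data standing in for the file 'going'.
# Note that the third line ends with a space after the '!'.
going=$(mktemp)
printf '%s\n' \
  'We are going!' \
  'Going! Going! Gone.' \
  'Stop! ' \
  'Hurry up!' > "$going"

# Strict: the '!' must be the very last character of the line.
ending=$(grep '!$' "$going")

# Tolerant: allow any amount of whitespace between the '!' and the end of the line.
tolerant_count=$(grep -c '![[:space:]]*$' "$going")

printf 'strict matches:\n%s\n\n' "$ending"
printf 'lines matched when trailing whitespace is allowed: %s\n' "$tolerant_count"
rm -f "$going"
```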

Ex 6: Matching Lines of a Certain Length

To match lines of a particular length, use that number of ‘.’ characters between ‘^’ and ‘$’. For example, to match all lines that are two characters (or columns) wide, use ‘^..$’ as the regexp to search for.

To output all lines in ‘/usr/dict/words’ that are exactly three characters wide, type:

$ grep '^...$' /usr/dict/words

For longer lines, it is more useful to use a different construct: ‘^.\{number\}$’, where number is the number of characters to match. Use ‘,’ to specify a range of numbers. (With ‘grep -E’, the braces are written without the backslashes.)

To output all lines in ‘/usr/dict/words’ that are exactly twelve characters wide, type:

$ grep '^.\{12\}$' /usr/dict/words

To output all lines in ‘/usr/dict/words’ that are twenty-two or more characters wide, type:

$ grep '^.\{22,\}$' /usr/dict/words
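The length patterns can be tried on a small scale as follows. This is a sketch under stated assumptions: the five-word list is made up, and the examples show the repetition count both in grep's default (basic) syntax, where the braces are backslash-escaped, and in the extended syntax selected with ‘-E’, where they are not.

```shell
# Hypothetical word list with lines of 3, 5, 12, 22 and 2 characters.
words=$(mktemp)
printf '%s\n' cat horse bureaucratic electroencephalographs ox > "$words"

# Exactly three characters, written out as three dots.
three=$(grep '^...$' "$words")

# Exactly twelve characters: basic syntax, then the -E equivalent.
twelve_bre=$(grep '^.\{12\}$' "$words")
twelve_ere=$(grep -E '^.{12}$' "$words")

# A range: between two and five characters.
short=$(grep -E '^.{2,5}$' "$words")

# An open-ended range: twenty-two characters or more.
long=$(grep -E '^.{22,}$' "$words")

printf 'three: %s\ntwelve: %s / %s\nlong: %s\nshort:\n%s\n' \
  "$three" "$twelve_bre" "$twelve_ere" "$long" "$short"
rm -f "$words"
```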

Ex 7: Matching Lines That Contain Any of Some Regexps

To match lines that contain any of a number of regexps, specify each of the regexps to search for between alternation operators as the regexp to search for. With GNU grep's default syntax the operator is written ‘\|’; with ‘grep -E’ it is a plain ‘|’. Lines containing any of the given regexps will be output.

To output all lines in ‘playboy’ that contain either of the patterns ‘the book’ or ‘cake’, type:

$ grep 'the book\|cake' playboy
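Two portable ways of writing the same "any of these" search are the extended syntax (‘-E’) with a plain ‘|’, and repeating the ‘-e’ option once per pattern. The sketch below compares them on made-up sample lines standing in for ‘playboy’.

```shell
# Hypothetical sample data standing in for the file 'playboy'.
playboy=$(mktemp)
printf '%s\n' \
  'He opened the book slowly.' \
  'She baked a cake.' \
  'Nothing to see here.' \
  'The cake sat on the book.' > "$playboy"

# Extended regexp syntax: '|' separates the alternatives.
ere=$(grep -E 'the book|cake' "$playboy")

# One -e option per pattern; a line matching any of them is printed.
multi=$(grep -e 'the book' -e 'cake' "$playboy")

printf '%s\n' "$ere"
rm -f "$playboy"
```

A line that matches more than one of the patterns, such as the last sample line, is still printed only once.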

Ex 8: Matching Lines That Contain All of Some Regexps

To output lines that match all of a number of regexps, use grep to output lines containing the first regexp you want to match, and pipe the output to a grep with the second regexp as an argument. Continue adding pipes to grep searches for all the regexps you want to search for.

To output all lines in ‘playlist’ that contain both of the patterns ‘the shore’ and ‘sky’, regardless of case, type:

$ grep -i 'the shore' playlist | grep -i sky
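The piped form does not care in which order the patterns appear on a line. The sketch below demonstrates this on made-up sample lines standing in for ‘playlist’, and compares it with a single extended regexp that has to spell out both orders explicitly.

```shell
# Hypothetical sample data standing in for the file 'playlist'.
playlist=$(mktemp)
printf '%s\n' \
  'Waves on The Shore under a grey sky' \
  'Walking along the shore' \
  'Blue Sky Blues' \
  'SKY above THE SHORE' > "$playlist"

# Each grep in the pipeline keeps only the lines that also match its own pattern.
both=$(grep -i 'the shore' "$playlist" | grep -i 'sky')

# The same result with one regexp, listing both possible orders.
single=$(grep -iE 'the shore.*sky|sky.*the shore' "$playlist")

printf 'piped:\n%s\n\nsingle regexp:\n%s\n' "$both" "$single"
rm -f "$playlist"
```

With more than two patterns the single-regexp form grows quickly, which is one reason the piped form is usually easier to read.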

Ex 9: Matching Lines That Only Contain Certain Characters

To match lines that only contain certain characters, use the regexp ‘^[characters]*$’, where characters are the ones to match. To output lines in ‘/usr/dict/words’ that only contain vowels, type:

$ grep -i '^[aeiou]*$' /usr/dict/words

The ‘-i’ option matches characters regardless of case; so, in this example, all vowel characters are matched regardless of case.
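Because ‘*’ means "zero or more", this pattern also matches empty lines. If that is not wanted, one option is the extended syntax (‘-E’) with ‘+’, which requires at least one character. The sketch below shows the difference on a made-up list that includes one empty line.

```shell
# Hypothetical word list; the fifth line is deliberately empty.
words=$(mktemp)
printf '%s\n' a Io eau you '' AEIOU > "$words"

# '*' allows zero characters, so the empty line is counted along with the vowel-only lines.
star_count=$(grep -ic '^[aeiou]*$' "$words")

# '+' (extended syntax) requires at least one vowel, so the empty line is skipped.
plus=$(grep -iE '^[aeiou]+$' "$words")

printf 'lines matched with *: %s\n\nlines matched with +:\n%s\n' "$star_count" "$plus"
rm -f "$words"
```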

Ex 10: Finding Phrases Regardless of Spacing

One way to search for a phrase that might occur with extra spaces between words, or across a line or page break, is to remove all linefeeds and extra spaces from the input, and then grep that. To do this, pipe the input to tr with ‘'\r\n:\>\|-'’ as an argument to the ‘-d’ option (removing all line breaks from the input); pipe that to the fmt filter with the ‘-u’ option (outputting the text with uniform spacing); and pipe that to grep with the pattern to search for.

To search across line breaks for the string ‘at the same time as’ in the file ‘docs’, type:

$ cat docs | tr -d '\r\n:\>\|-' | fmt -u | grep 'at the same time as'
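Deleting line breaks outright can join the last word of one line to the first word of the next. A variant sketch, not the pipeline from the example above, is to turn every run of whitespace into a single space before searching. It assumes a tr that pads the shorter second set with its last character (GNU and BSD tr both do) and a grep that supports ‘-o’; the sample text is made up.

```shell
# Hypothetical sample data standing in for the file 'docs':
# the phrase is split across two lines and has extra spaces.
docs=$(mktemp)
printf '%s\n' \
  'The report was filed at the same' \
  'time   as the invoice.' \
  'Nothing else matches here.' > "$docs"

# Searching line by line finds nothing (grep -c prints 0 and exits non-zero).
per_line=$(grep -c 'at the same time as' "$docs" || true)

# Map all whitespace (spaces, tabs, CR, LF) to spaces and squeeze repeats,
# then search the resulting single line; -o prints only the matched text.
found=$(tr -s '[:space:]' ' ' < "$docs" | grep -o 'at the same time as')

printf 'line-by-line matches: %s\nafter normalizing whitespace: %s\n' "$per_line" "$found"
rm -f "$docs"
```

Since the whole input becomes one long line, ‘-o’ (or ‘-c’) keeps the output readable; without it grep would print the entire text.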

Summary

In this article, we reviewed 10 practical examples of using the Linux grep command to search for and find strings in a text file. Along the way, we learned how to use regular expressions with grep to run complex searches on text files. By now you should have a better idea of how powerful the Linux search tools are.
