Regular Expressions in Python

28/12/2020
In this article, we take a brief look at regular expressions in Python. We go through a table that explains what the common pattern characters mean, then work through the built-in functions with examples.

What is a regular expression?

Before we move on to practical examples, we need to know what a regular expression really is. A regular expression is a sequence of characters that defines a search pattern or the expected structure of an input. Imagine entering an email address or password on a website such as Facebook, Twitter or Microsoft. Enter something that goes against their conventions and the site will point out the errors for you. You will not be allowed to go to the next step until your input matches the pattern set in the backend. That pattern, which keeps you from entering additional or irrelevant information, is typically written as a regular expression, or regex for short.
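As a rough sketch of that kind of input check, here is how it might look with Python's re module. The username rule used here (3 to 16 lowercase letters, digits or underscores) and the function name are hypothetical and chosen only for illustration; they are not any real site's policy.

```python
import re

# Hypothetical rule for illustration: 3 to 16 lowercase letters, digits or underscores
USERNAME_PATTERN = re.compile(r"[a-z0-9_]{3,16}")

def is_valid_username(text):
    """Return True only if the whole input fits the pattern."""
    return USERNAME_PATTERN.fullmatch(text) is not None

print(is_valid_username("linux_user42"))  # True
print(is_valid_username("no spaces!"))    # False
```

fullmatch is used so that the entire input has to fit the pattern, not just a part of it.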

Regular Expressions in Python

Regular expressions play the same role in Python as in other programming languages. Python ships with the re module, which provides full support for regular expressions. When an input does not match a pattern, the re functions return None. When the pattern itself is malformed, the module raises the re.error exception, which helps you find and fix the problem.
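A minimal sketch of how the re module reports the two situations: a string that does not match, and a pattern that is not valid. The sample strings are made up for illustration.

```python
import re

# No match is not an error: the function simply returns None
result = re.match(r"\d+", "abc")
print(result)  # None

# A malformed pattern (an unclosed bracket here) raises re.error
try:
    re.compile(r"[a-z")
except re.error as exc:
    print("invalid pattern:", exc)
```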

Regular Expressions patterns

A regular expression pattern is built from a sequence of characters. Except for the metacharacters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. A metacharacter can be matched literally by escaping it with a preceding backslash.
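To illustrate escaping, here is a small sketch using a made-up sample string. It also uses re.escape, a standard helper in the re module that adds the backslashes for you.

```python
import re

text = "Total: 3.50 (approx.)"

# Unescaped, "." matches any character, so the pattern also matches "3x50"
loose = re.search(r"3.50", "3x50") is not None
print(loose)   # True

# Escaped, "\." matches only a literal dot
strict = re.search(r"3\.50", "3x50") is not None
print(strict)  # False
print(re.search(r"3\.50", text).group())  # 3.50

# re.escape backslash-escapes the metacharacters in a literal string
escaped = re.escape("(approx.)")
print(re.search(escaped, text).group())   # (approx.)
```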

The following table lists common patterns and describes what each one matches in Python.

Pattern       Description
[Pp]ython     Match “Python” or “python”
Tub[Ee]       Match “TubE” or “Tube”
[aeiou]       Match any lowercase vowel
[0-9]         Match any digit from 0 to 9
[a-z]         Match any lowercase ASCII letter
[A-Z]         Match any uppercase ASCII letter
[a-zA-Z0-9]   Match any lowercase or uppercase ASCII letter, or a digit from 0 to 9
[^aeiou]      Match any character except a lowercase vowel
[^0-9]        Match any character except a digit
.             Match any character except a newline
\d            Match any digit: [0-9] (in str patterns, other Unicode decimal digits too)
\D            Match any non-digit: [^0-9]
\s            Match a whitespace character
\S            Match a non-whitespace character
\A            Match the beginning of the string
\Z            Match the end of the string
\w            Match a word character
\W            Match a non-word character
[…]           Match any single character in the brackets
[^…]          Match any single character not in the brackets
$             Match the end of the string, or of each line with re.MULTILINE
^             Match the beginning of the string, or of each line with re.MULTILINE
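A few of these patterns can be tried out with a short sketch like the one below. The sample string and variable names are made up for illustration; findall and sub are standard functions of the re module.

```python
import re

sample = "Python 3 and python 2, Tube 42"

names = re.findall(r"[Pp]ython", sample)       # character set for the first letter
digits = re.findall(r"\d", sample)             # single digits
words = re.findall(r"\w+", sample)             # runs of word characters
no_vowels = re.sub(r"[aeiou]", "", "regular")  # remove lowercase vowels
starts = re.search(r"\APython", sample) is not None  # anchored at the start
ends = re.search(r"42\Z", sample) is not None        # anchored at the end

print(names)         # ['Python', 'python']
print(digits)        # ['3', '2', '4', '2']
print(words)         # ['Python', '3', 'and', 'python', '2', 'Tube', '42']
print(no_vowels)     # rglr
print(starts, ends)  # True True
```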

Match and Search Functions in Python

Now we will look at two examples using two functions from the re module: match and search. Both take the same main parameters:

  • Pattern – The regular expression to be matched or searched for.
  • String – The string in which the pattern is matched or searched for.

Before we jump into the examples, here is one more thing you need to know. Both functions return a match object on success, and two of its methods retrieve the matched groups:

  • groups()
  • group(num=0,1,2…)

When match or search succeeds, each parenthesized part of the pattern captures a subgroup of the matched text. group(0) returns the whole match, group(1), group(2) and so on return the individual subgroups, and groups() returns all subgroups as a tuple. The match example in the next section shows this in practice.
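As a quick sketch of how the group numbering works, consider a pattern with two parenthesized groups. The sample string is made up for illustration.

```python
import re

m = re.search(r"(\w+)@(\w+)", "contact: admin@example")

print(m.group(0))  # admin@example  (the whole match)
print(m.group(1))  # admin          (first subgroup)
print(m.group(2))  # example        (second subgroup)
print(m.groups())  # ('admin', 'example')
```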

Match Function (Example)

In the following example, we take a list of strings and use a regular expression that matches two words separated by a non-word character, but only if both words start with the letter ‘a’.

import re
arraylist = ["affection affect", "affection act", "affection Programming"]
for element in arraylist:
    # Two words starting with 'a', separated by one non-word character
    k = re.match(r"(a\w+)\W(a\w+)", element)
    if k:
        print(k.groups())

Output:

('affection', 'affect')
('affection', 'act')

The third element in the list is skipped because it does not match the regex, which requires both words to start with ‘a’.

Search Function (Example)

This function differs from match. search scans through the whole string for the first location where the pattern matches, while match only checks at the beginning of the string. In the following example, search succeeds where match would not.

import re
Input = "DocumentationNew"
# Capture 'ta' and everything that follows it
v = re.search(r"(ta.*)", Input)
if v:
    print("result:", v.group(1))

Output:

result: tationNew

The pattern ‘ta.*’ matches ‘ta’ and everything after it, which gives us ‘tationNew’ from the searched Input “DocumentationNew”.
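To see the difference between the two functions side by side, here is a short sketch that runs both on the same string. The variable names are chosen for illustration.

```python
import re

text = "DocumentationNew"

# match only looks at the start of the string, and the string does not start with 'ta'
matched = re.match(r"(ta.*)", text)
print(matched)  # None

# search scans the string and finds 'ta' further in
found = re.search(r"(ta.*)", text)
print(found.group(1))  # tationNew

# match succeeds when the pattern fits at the very beginning
prefix = re.match(r"Doc", text)
print(prefix.group())  # Doc
```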

Conclusion

Regular expressions are a crucial tool for software developers, and you have now seen how to use them in the Python programming language.
