Regular expression operations in Python
https://docs.python.org/3/library/re.html
re — Regular expression operations
Source code: Lib/re/ This module provides regular expression matching operations similar to those found in Perl. Both patterns and strings to be searched can be Unicode strings ( str) as well as 8-...
docs.python.org
Test site for regular expressions
regex101: build, test, and debug regex
Regular expression tester with syntax highlighting, explanation, cheat sheet for PHP/PCRE, Python, GO, JavaScript, Java, C#/.NET, Rust.
regex101.com
The Zen of Python
Python Easter Egg
$> python -c "import this" # running this prints the text below
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
....
Basic Pattern Expressions
Start the Python interpreter, then enter the commands below in order.
import re
line = "Beautiful is better than ugly."
matches = re.findall("Beautiful", line)
print(matches)
output : ['Beautiful']
matches2 = re.findall("beautiful", line, re.IGNORECASE)
print(matches2)
output : ['Beautiful']
Start or End Matching
zen2 = """Although never is often better than * right * now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea - - let's do more of those!"""
m = re.findall(r"^If", zen2, re.MULTILINE) # ^ : line starts with "If"
m2 = re.findall(r"idea\.", zen2, re.MULTILINE) # \. matches a literal '.', so this finds "idea."
m22 = re.findall(r"idea.*", zen2, re.MULTILINE) # "idea" followed by zero or more of any character (up to the end of the line)
m3 = re.findall(r"idea.$", zen2, re.MULTILINE) # $ : line ends with "idea" plus any one character (this '.' is unescaped)
print(m, m2, m22, m3)
output : ['If', 'If'] ['idea.', 'idea.'] ['idea.', 'idea.', "idea - - let's do more of those!"] ['idea.', 'idea.']
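The examples above all pass re.MULTILINE. As a minimal sketch of what that flag changes, the snippet below runs the same kind of anchored patterns with and without it. The sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text: four short lines.
text = "If one\nIf two\nend idea.\nlast idea."

# Without re.MULTILINE, ^ matches only at the very start of the string
# and $ only at the very end.
start_default = re.findall(r"^If", text)
end_default = re.findall(r"idea\.$", text)
print(start_default, end_default)  # ['If'] ['idea.']

# With re.MULTILINE, ^ and $ also match at the start and end of every line.
start_multi = re.findall(r"^If", text, re.MULTILINE)
end_multi = re.findall(r"idea\.$", text, re.MULTILINE)
print(start_multi, end_multi)  # ['If', 'If'] ['idea.', 'idea.']
```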
Select One and None
import re
string = "Two aa too"
# m = re.findall("t[ow]o", string) #[ow] : ‘o’ or ‘w’
m = re.findall("t[ow]o", string, re.IGNORECASE)
print(m)
output : ['Two', 'too']
m = re.findall("t[^w]o", string, re.IGNORECASE) # ^ inside [] means NOT
print(m)
output : ['too']
Find Numbers
import re
string = "123?45yy7890 hi 999 hello"
m1 = re.findall(r"\d", string) # \d matches a single digit
m2 = re.findall("[0-9]{1,2}", string) # [0-9]: digits 0 through 9; {1,2}: 1 to 2 characters
m3 = re.findall("[1-5]{1,2}", string) # [1-5]: digits 1 through 5; {1,2}: 1 to 2 characters
print("m1=", m1)
output : m1= ['1', '2', '3', '4', '5', '7', '8', '9', '0', '9', '9', '9']
print("m2=", m2)
output : m2= ['12', '3', '45', '78', '90', '99', '9']
print("m3=", m3)
output : m3= ['12', '3', '45']
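Because {1,2} caps each match at two characters, longer runs such as 123 are split into '12' and '3'. If the goal is to pull out whole numbers, one possible approach (a sketch, not part of the original examples) is the + quantifier, which means "one or more" and is greedy.

```python
import re

string = "123?45yy7890 hi 999 hello"

# \d+ takes as many consecutive digits as it can, so each run comes back whole.
numbers = re.findall(r"\d+", string)
print(numbers)  # ['123', '45', '7890', '999']

# findall returns strings; convert them before doing arithmetic.
total = sum(int(n) for n in numbers)
print(total)  # 9057
```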
Compiled Pattern
import re
string = "123?45yy7890 hi 999 hello"
# pattern = re.compile("[0-9]{1,3}") # [0-9]: digits 0 through 9; {1,3}: 1 to 3 characters
pattern = re.compile(r"(\d{1,3})") # 1 to 3 digits, captured as a group
mm = re.findall(pattern, string) # mm is a list
print(mm)
output : ['123', '45', '789', '0', '999']
for m in re.finditer(pattern, string): # m is a Match object; m.groups() returns a tuple
    print(m.groups())
output :
('123',)
('45',)
('789',)
('0',)
('999',)
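A compiled pattern can also be used through its own methods instead of being passed to re.findall() or re.finditer(). The sketch below assumes the same input string, drops the capture group for brevity, and shows what a Match object offers beyond groups(). The names found and spans are illustrative.

```python
import re

string = "123?45yy7890 hi 999 hello"
pattern = re.compile(r"\d{1,3}")

# Compiled patterns expose findall() and finditer() directly.
found = pattern.findall(string)
print(found)  # ['123', '45', '789', '0', '999']

# Each item from finditer() is a re.Match: group() is the matched text,
# span() is the (start, end) position of the match in the string.
spans = []
for m in pattern.finditer(string):
    spans.append(m.span())
    print(m.group(), m.span())
```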
Compiled Pattern (Cont'd)
import re
string = "aaaaaaa<hr>This</hr>"
pattern = re.compile("<(.*)>") # cf. re.compile("<.*>") would return ['<hr>This</hr>']
# The parentheses ( ) mark the part of the match to capture; findall returns only that part.
mm = re.findall(pattern, string)
print(mm)
output : ['hr>This</hr']
for m in re.finditer(pattern, string):
    print(m.groups())
output : ('hr>This</hr',)
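The result 'hr>This</hr' comes from .* being greedy: it runs to the last '>' in the string. If the intent is to capture each tag name separately, two common alternatives are a lazy quantifier (.*?) or a negated character class ([^>]*). This is a sketch with illustrative variable names.

```python
import re

string = "aaaaaaa<hr>This</hr>"

# Greedy: .* extends to the last '>' in the string.
greedy = re.findall(r"<(.*)>", string)
print(greedy)  # ['hr>This</hr']

# Lazy: .*? stops at the first '>' that lets the pattern match.
lazy = re.findall(r"<(.*?)>", string)
print(lazy)  # ['hr', '/hr']

# Negated class: [^>]* can never run past a '>'.
negated = re.findall(r"<([^>]*)>", string)
print(negated)  # ['hr', '/hr']
```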
re.sub() : replace
import re
string = "https://aaa.bbb.com/"
x = re.sub("(http(s)*|:|/)", '', string) # remove "http"/"https", ':' and '/'
print(x)
output : aaa.bbb.com
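The pattern above deletes every ':' and '/' anywhere in the string. As a sketch of a narrower substitution, the version below anchors the scheme to the start of the string and the slash to the end. It also shows a backreference in the replacement text. The variable names and the output format of the second call are assumptions for illustration.

```python
import re

url = "https://aaa.bbb.com/"

# ^https?:// matches the scheme only at the start (s? = optional "s");
# /$ matches a single slash only at the end.
host = re.sub(r"^https?://|/$", "", url)
print(host)  # aaa.bbb.com

# \1 and \2 in the replacement refer to the first and second captured groups.
swapped = re.sub(r"^(https?)://([^/]+)/?$", r"\2 (\1)", url)
print(swapped)  # aaa.bbb.com (https)
```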
Try This
From The Zen of Python, find the sentences that contain 'is better than',
and print in lowercase which thing is better than which.
- Print only the better item
- better > worse
Example:
beautiful > ugly
simple > complex
now > never
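One possible approach to the exercise, sketched on a few lines of the Zen rather than the full text. The sample string, the pattern, and the variable names are assumptions for illustration, and other solutions work equally well.

```python
import re

# A short excerpt used as sample input; the last line should not match.
zen = """Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Readability counts."""

# Two capture groups: the better item and the worse item.
pattern = re.compile(r"^(\w+) is better than (\w+)\.$", re.MULTILINE)

# With two groups, findall() returns a list of (group1, group2) tuples.
pairs = [(better.lower(), worse.lower()) for better, worse in pattern.findall(zen)]

for better, worse in pairs:
    print(f"{better} > {worse}")
```

To print only the better item, print the first element of each pair.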