Extracting room layout and area #33

Open
JKYSTUDY opened this issue Dec 28, 2019 · 0 comments

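Because the data is irregular, extracting fields by index when scraping Beike (贝壳) often produces outliers. I suggest using regular expressions instead: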
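import re

# Layout such as "3室2厅1卫": three pairs of a number followed by one Chinese character
pattern_layout = re.compile(r'[0-9]{1,2}[\u4e00-\u9fa5][0-9]{1,2}[\u4e00-\u9fa5][0-9]{1,2}[\u4e00-\u9fa5]')
# Area such as "89㎡" or "89.5㎡"; the capture group holds the number without the unit
pattern_size = re.compile(r'([0-9]{1,3}(?:\.[0-9]+)?)㎡')
descs = desc2.text.strip().replace("\n", "").replace(" ", "").replace("/", "")
m_layout = pattern_layout.search(descs)
m_size = pattern_size.search(descs)
# search() returns None when nothing matches, so check before calling group()
layout = m_layout.group() if m_layout else None
size = m_size.group(1) if m_size else None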
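
For anyone who wants to try the idea without a live page, here is a minimal self-contained sketch. The helper name `extract_layout_and_size`, the sample strings, and the optional decimal part in the area pattern are my own assumptions, not part of the project; the layout pattern is the one suggested above.

```python
import re
from typing import Optional, Tuple

# Layout pattern taken from the suggestion above, e.g. "3室2厅1卫".
LAYOUT_RE = re.compile(
    r'[0-9]{1,2}[\u4e00-\u9fa5][0-9]{1,2}[\u4e00-\u9fa5][0-9]{1,2}[\u4e00-\u9fa5]'
)
# Assumption: areas may carry a decimal part, e.g. "89.5㎡".
SIZE_RE = re.compile(r'([0-9]{1,3}(?:\.[0-9]+)?)㎡')


def extract_layout_and_size(raw: str) -> Tuple[Optional[str], Optional[float]]:
    """Hypothetical helper: return (layout, area in square metres).

    A field that cannot be found is returned as None instead of raising.
    """
    # Same clean-up as in the snippet above: drop newlines, spaces and slashes.
    descs = raw.strip().replace("\n", "").replace(" ", "").replace("/", "")
    m_layout = LAYOUT_RE.search(descs)
    m_size = SIZE_RE.search(descs)
    layout = m_layout.group() if m_layout else None
    size = float(m_size.group(1)) if m_size else None
    return layout, size


# Made-up sample strings, not real listing data.
print(extract_layout_and_size(" 3室2厅1卫 / 89㎡ / 南 "))
print(extract_layout_and_size("车位 / 南"))
```

Returning None for a missing field lets the caller decide whether to skip the record or log it, rather than crashing on `AttributeError` in the middle of a crawl.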
