Commit 2c99b9f

'Add Wenshu (China Judgments Online) crawler example'
1 parent d52089b commit 2c99b9f

2 files changed: +175 -5 lines changed
Diff for: README.md

+13 -5
@@ -55,7 +55,7 @@

 > **Parameter generation**

-[Pinduoduo](https://github.com/wkunzhi/Python3-Spider/tree/master/【拼多多】登陆参数生成) | [Xiaoniu Online](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【小牛在线】登录参数生成) | [Kaixindai](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【开鑫贷】登陆参数生成) | [Mtime](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【时光网】登陆参数生成) | [Baidu](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】自动登录) | [WeChat Official Account password encryption](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【微信】登录参数生成) | [China Mobile](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【移动】登录参数生成) | [Holike](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【好莱客】参数解析) | [Qinghai Mobile](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【青海移动】登陆参数生成) | [Sina Weibo](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【新浪微博】密码解密) | [Autohome](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【汽车之家】参数解密) | [steam](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【steam】登录) | [Baidu WAP sig generation](https://github.com/wkunzhi/Python3-Spider/tree/master/%E5%85%B6%E4%BB%96%E5%AE%9E%E6%88%98/%E3%80%90%E7%99%BE%E5%BA%A6%E3%80%91wap%E7%AB%AFsig%E7%94%9F%E6%88%90)
+[Pinduoduo](https://github.com/wkunzhi/Python3-Spider/tree/master/【拼多多】登陆参数生成) (no longer works!) | [Xiaoniu Online](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【小牛在线】登录参数生成) | [Kaixindai](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【开鑫贷】登陆参数生成) | [Mtime](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【时光网】登陆参数生成) | [Baidu](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】自动登录) | [WeChat Official Account password encryption](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【微信】登录参数生成) | [China Mobile](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【移动】登录参数生成) | [Holike](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【好莱客】参数解析) | [Qinghai Mobile](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【青海移动】登陆参数生成) | [Sina Weibo](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【新浪微博】密码解密) | [Autohome](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【汽车之家】参数解密) | [steam](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【steam】登录) | [Baidu WAP sig generation](https://github.com/wkunzhi/Python3-Spider/tree/master/%E5%85%B6%E4%BB%96%E5%AE%9E%E6%88%98/%E3%80%90%E7%99%BE%E5%BA%A6%E3%80%91wap%E7%AB%AFsig%E7%94%9F%E6%88%90)


 > **Auto login**
@@ -64,13 +64,10 @@

 > **Other practical examples**

-[Douyin watermark-free video parsing](https://github.com/wkunzhi/Python3-Spider/tree/master/【抖音】无水印视频解析) | [Company business card lookup](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【企业名片】企业查询) | [Baidu password recovery](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】网页找回密码) | [Beauty wallpaper download](https://github.com/wkunzhi/Python3-Spider/tree/master/【双色球】头奖分布) | [Beauty wallpaper download](https://github.com/wkunzhi/Python3-Spider/tree/master/【壁纸】美女壁纸下载器) | [Meituan parsing and token generation](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【美团】数据解析、token生成) | [bilibili video download](https://github.com/wkunzhi/Python3-Spider/tree/master/【bilibili】视频下载) | [51job job search](https://github.com/wkunzhi/Python3-Spider/tree/master/【51Job】查岗位) | [Baidu Translate](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】翻译) | [Meituan nationwide regions](https://github.com/wkunzhi/Python3-Spider/tree/master/各站案例/MeiTuanArea) | [Company business card lookup](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【餐饮】查询信息) | [Express parcel tracking](https://github.com/wkunzhi/Python3-Spider/tree/master/【快递】单号查询) | [Jinyi Cinemas registration](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【金逸电影】自动注册) | [Python crypto library demo](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【Python加密库】Demo) | [Baidu street-snap image download](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度街拍】图片下载) | [JD.com product data scraping](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【京东】商品数据爬取) | [Housing price retrieval](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【房价】房价获取)
+[Wenshu app query API](https://github.com/wkunzhi/Python3-Spider/tree/master/【文书】app查询接口) | [Douyin watermark-free video parsing](https://github.com/wkunzhi/Python3-Spider/tree/master/【抖音】无水印视频解析) | [Company business card lookup](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【企业名片】企业查询) | [Baidu password recovery](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】网页找回密码) | [Beauty wallpaper download](https://github.com/wkunzhi/Python3-Spider/tree/master/【双色球】头奖分布) | [Beauty wallpaper download](https://github.com/wkunzhi/Python3-Spider/tree/master/【壁纸】美女壁纸下载器) | [Meituan parsing and token generation](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【美团】数据解析、token生成) | [bilibili video download](https://github.com/wkunzhi/Python3-Spider/tree/master/【bilibili】视频下载) | [51job job search](https://github.com/wkunzhi/Python3-Spider/tree/master/【51Job】查岗位) | [Baidu Translate](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度】翻译) | [Meituan nationwide regions](https://github.com/wkunzhi/Python3-Spider/tree/master/各站案例/MeiTuanArea) | [Company business card lookup](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【餐饮】查询信息) | [Express parcel tracking](https://github.com/wkunzhi/Python3-Spider/tree/master/【快递】单号查询) | [Jinyi Cinemas registration](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【金逸电影】自动注册) | [Python crypto library demo](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【Python加密库】Demo) | [Baidu street-snap image download](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【百度街拍】图片下载) | [JD.com product data scraping](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【京东】商品数据爬取) | [Housing price retrieval](https://github.com/wkunzhi/Python3-Spider/tree/master/其他实战/【房价】房价获取)



-## Douyin video parser
-![](https://zok-blog.oss-cn-hangzhou.aliyuncs.com/images/20200210/317223745.png)
-

 ## Original tools
 > This toolkit lives in another project of mine; stars are welcome
@@ -101,6 +98,17 @@
 - Open `auto_login_pyppeteer.py` and run it; enter your Taobao account and password to log in automatically


+----
+
+
+## Wenshu app
+
+[Beginner Android reverse engineering - Wenshu app crawler tutorial](https://www.zhangkunzhi.com/index.php/archives/162/)
+
+![](https://static.zhangkunzhi.com/typecho/2020/07/24/603402218498341/1595560337.png)
+
+
+
 ----

 ### Beauty wallpaper downloader

Diff for: 【文书】app查询接口/main.py

+162
@@ -0,0 +1,162 @@
# __author__ = "zok" [email protected]
# Date: 2020/7/24 Python:3.7

import requests
import time
import random
import json
import base64
import pyDes
from datetime import datetime


class TripleDesUtils:

    def encryption(self, data: str, key, iv) -> str:
        """Encrypt with 3DES (CBC mode, PKCS5 padding) and return base64 text.
        """
        _encryption_result = pyDes.triple_des(key, pyDes.CBC, iv, None, pyDes.PAD_PKCS5).encrypt(data)
        _encryption_result = self._base64encode(_encryption_result).decode()
        return _encryption_result

    def decrypt(self, data: str, key, iv) -> str:
        """Decode base64 text, then decrypt it with 3DES (CBC mode, PKCS5 padding).
        """
        data = self._base64decode(data)
        _decrypt_result = pyDes.triple_des(key, pyDes.CBC, iv, None, pyDes.PAD_PKCS5).decrypt(data).decode('utf-8')
        return _decrypt_result

    @staticmethod
    def _base64encode(data):
        try:
            _b64encode_result = base64.b64encode(data)
        except Exception as e:
            raise Exception(f"base64 encode error:{e}")
        return _b64encode_result

    @staticmethod
    def _base64decode(data):
        try:
            _b64decode_result = base64.b64decode(data)
        except Exception as e:
            raise Exception(f"base64 decode error:{e}")
        return _b64decode_result


class WenShu:

    def __init__(self):
        self.js = None

    @staticmethod
    def get_now_data():
        """Current date as YYYYMMDD; also used as the 8-character 3DES IV.
        """
        return datetime.now().strftime('%Y%m%d')

    @staticmethod
    def random_key():
        """Random 24-character alphanumeric string, used as the 3DES key.
        """
        random_str = ''
        base_str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
        length = len(base_str) - 1
        for i in range(24):
            random_str += base_str[random.randint(0, length)]
        return random_str

    @staticmethod
    def make_id():
        """Request id: current time as YYYYMMDDHHMMSS.
        """
        return datetime.now().strftime('%Y%m%d%H%M%S')

    def make_cipher_text(self):
        """Build the ciphertext parameter.
        """
        time_13 = str(int(round(time.time() * 1000)))  # 13-digit millisecond timestamp
        key = self.random_key()
        now = self.get_now_data()
        _str = des3.encryption(time_13, key, now)
        _str = key + now + _str
        # Encode each character as the binary form of its code point, space-separated.
        new_str = ' '.join(bin(ord(ch))[2:] for ch in _str)

        msg = """[key]: {key}\n[now]: {now}\n[_str]: {_str}\n[ciphertext]: {ciphertext}""".format(key=key,
                                                                                              now=now,
                                                                                              _str=_str,
                                                                                              ciphertext=new_str)
        print(msg)

        return new_str

    def make_request(self):
        """Build the plaintext request payload.
        [Adjust the request content to your own needs; capture the app's traffic to study it!]
        """
        info = {
            "id": self.make_id(),  # timestamp: YYYYMMDDHHMMSS
            "command": "queryDoc",  # fixed
            "params": {
                "devid": "41d861ffe5b347d28454dc3f07dd4212",  # device ID
                "devtype": "1",
                "ciphertext": self.make_cipher_text(),
                "pageSize": "20",
                "sortFields": "s50:desc",  # fixed
                "pageNum": "1",
                "queryCondition": [{
                    "key": "s8",
                    "value": "02"
                }]  # keyword + type of text to search
            }
        }
        return info

    def to_index(self):
        url = 'http://wenshuapp.court.gov.cn/appinterface/rest.q4w'
        headers = {
            'Content-Type': 'application/x-www-form-urlencoded',
            'User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 9; MIX 2 MIUI/V11.0.2.0.PDECNXM)',
            'Host': 'wenshuapp.court.gov.cn',
            'Connection': 'Keep-Alive',
            'Accept-Encoding': 'gzip',
        }
        # str() of a dict is its Python repr (single-quoted), not strict JSON.
        txt = str(self.make_request())

        request = base64.b64encode(txt.encode('utf-8')).decode('utf-8')
        data = {
            'request': request
        }
        msg = """[Plaintext request body]: {txt}\n[Encoded request body]: {data}\n[The official server is slow, please be patient]....""".format(txt=txt, data=data)
        print(msg)
        response = requests.post(url, headers=headers, data=data)
        if 'HTTP Status 503' in response.text:
            print('[Server busy] too many crawlers right now, please retry')
            raise SystemExit(1)
        data = json.loads(response.text)
        content = data.get('data').get('content')
        key = data.get('data').get('secretKey')
        iv = self.get_now_data()
        msg = """[Response]: {text}\n[Captured key]: {key}\n[Captured iv]: {iv}\n[Captured content]: {content}""".format(text=response.text,
                                                                                                               key=key, iv=iv,
                                                                                                               content=content)
        print(msg)
        self.parse_html(content, key, iv)

    def parse_html(self, content, key, iv):
        _str = des3.decrypt(content, key, iv)
        print("[Decrypted response]:", _str)


des3 = TripleDesUtils()

if __name__ == '__main__':
    """
    Beginner Android reverse engineering - Wenshu app crawler
    https://www.zhangkunzhi.com/index.php/archives/162/
    """
    ws = WenShu()
    ws.to_index()
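
The `ciphertext` value built by `make_cipher_text()` is the string `key + now + <base64 3DES output>`, with every character written as the binary form of its code point and the pieces joined by spaces. The sketch below reverses that layout so a generated value can be checked by eye. It is an illustration under stated assumptions, not part of the commit: the helper names (`encode_bits`, `decode_bits`, `split_ciphertext`) and the sample values are hypothetical, the lengths 24 and 8 are taken from `random_key()` and `get_now_data()`, and the 3DES step itself is left out so the example needs only the standard library.

```python
import base64

KEY_LEN = 24  # length of the string returned by random_key()
IV_LEN = 8    # length of the YYYYMMDD string returned by get_now_data()


def encode_bits(text: str) -> str:
    """Write each character as binary digits of its code point, space-separated."""
    return ' '.join(bin(ord(ch))[2:] for ch in text)


def decode_bits(ciphertext: str) -> str:
    """Inverse of encode_bits(): parse each binary chunk back into a character."""
    return ''.join(chr(int(chunk, 2)) for chunk in ciphertext.split())


def split_ciphertext(ciphertext: str):
    """Return (key, iv, base64_payload) recovered from a ciphertext string."""
    plain = decode_bits(ciphertext)
    key = plain[:KEY_LEN]
    iv = plain[KEY_LEN:KEY_LEN + IV_LEN]
    payload = plain[KEY_LEN + IV_LEN:]
    return key, iv, payload


if __name__ == '__main__':
    # Hypothetical sample values; nothing here was captured from the real app.
    sample_key = 'A' * KEY_LEN
    sample_iv = '20200724'
    sample_payload = base64.b64encode(b'stand-in for 3DES output').decode()
    ciphertext = encode_bits(sample_key + sample_iv + sample_payload)
    print(split_ciphertext(ciphertext))
```

Running `split_ciphertext()` on the value printed by `make_cipher_text()` should give back the same key and date that the script logs, which is a quick way to confirm the encoding step before sending a request.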
