request的内部结构
Session对象 与 mount
方法
import requests
s = requests.Session()
s.get('https://www.google.com')
查看Session
源码中,初始化的动作
class Session(SessionRedirectMixin):
"""A Requests session.
Provides cookie persistence, connection-pooling, and configuration.
Basic Usage::
>>> import requests
>>> s = requests.Session()
>>> s.get('https://httpbin.org/get')
<Response [200]>
Or as a context manager::
>>> with requests.Session() as s:
... s.get('https://httpbin.org/get')
<Response [200]>
"""
__attrs__ = [
'headers', 'cookies', 'auth', 'proxies', 'hooks', 'params', 'verify',
'cert', 'adapters', 'stream', 'trust_env',
'max_redirects',
]
def __init__(self):
...
# Default connection adapters.
self.adapters = OrderedDict()
self.mount('https://', HTTPAdapter()) # class requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False)
self.mount('http://', HTTPAdapter()) # class requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=0, pool_block=False)
# HTTP适配器所做的只是根据目标URL为不同的请求提供不同的配置
HTTP Adapter
HTTP适配器
根据目标URL, 为不同的请求提供不同的配置
HTTP适配器中的 pool_connections
, ·一个连接池对应一个主机·
pool_connections
– The number of urllib3 connection pools to cache.
一个连接池对应一个主机
HTTP基于TCP协议
。 HTTP连接也是TCP连接
,由五个值的元组标识:
(<protocol>, <src addr>, <src port>, <dest addr>, <dest port>)
假设已经与www.example.com
建立了HTTP / TCP连接
,并假设服务器支持Keep-Alive
,
因为上述的五个值均不变,那么下次您将请求发送到www.example.com/a
或www.example.com/b
时, 可以使用相同的连接
例子:
HTTPAdapter(pool_connections = 1)
已安装到https
,这意味着一次只能保留一个连接池
。
调用s.get('https://www.baidu.com')
之后,缓存的连接池为connectionpool('https://www.baidu.com')
。
当请求s.get('https://www.zhihu.com')
的时候,session
发现它不能使用以前缓存的连接,因为它不是同一主机(一个连接池对应一个主机)。
因此,session
必须创建一个新的连接池或连接(如果需要)。 由于pool_connections = 1
,会话无法同时容纳两个连接池,因此它放弃了旧的连接池
(即connectionpool('https://www.baidu.com')
),
并保留了新的连接池
(即connectionpool('https: //www.zhihu.com”
)。
所以会在在日志中看到三个正在启动新的HTTPS连接
(1)
import requests
from requests.adapters import HTTPAdapter
from http.client import HTTPConnection # py3
HTTPConnection.debuglevel = 1
s = requests.Session()
s.mount('https://', HTTPAdapter(pool_connections=1))
s.get('https://www.baidu.com')
s.get('https://www.zhihu.com')
s.get('https://www.baidu.com')
# INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.baidu.com
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 None
# INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.zhihu.com
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 2621
# INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.baidu.com
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 None
(2)
# 只创建了两次连接,并节省了一个连接建立时间
s = requests.Session()
s.mount('https://', HTTPAdapter(pool_connections=2))
s.get('https://www.baidu.com')
s.get('https://www.zhihu.com')
s.get('https://www.baidu.com')
# 只创建了两次连接,并节省了一个连接建立时间
# INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.baidu.com
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 None
# INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.zhihu.com
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 2623
# DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 None
HTTP适配器中的 pool_maxsize
,保存的可重复使用的连接数 (多线程环境使用
)
只有在多线程
环境中使用Session时,才应该关心pool_maxsize
,例如使用同一Session
从多个线程发出并发请求
。
pool_maxsize
是用于初始化urllib3的HTTPConnectionPool
的参数, 即上面提到的连接池
。
HTTPConnectionPool
是一个容器
,用于收集到特定主机的连接
,
pool_maxsize
是要保存的可重复使用的连接数
如果在一个线程
中运行代码,则既不可能也不需要创建与同一主机的多个连接
,
这是因为请求库被阻塞
,因此HTTP请求
总是一个接一个地发送
。
s = requests.Session()
s.mount('https://', HTTPAdapter(pool_connections=1, pool_maxsize=1))
t1 = Thread(target=thread_get, args=('https://www.zhihu.com',))
t2 = Thread(target=thread_get, args=('https://www.zhihu.com/question/36612174',))
t1.start()
t2.start()
t1.join();t2.join()
t3 = Thread(target=thread_get, args=('https://www.zhihu.com/question/39420364',))
t4 = Thread(target=thread_get, args=('https://www.zhihu.com/question/21362402',))
t3.start();t4.start()
t3.join();t4.join()
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.zhihu.com
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (2): www.zhihu.com
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/36612174 HTTP/1.1" 200 21906
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 2606
WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: www.zhihu.com
# 只允许一个连接
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (3): www.zhihu.com
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/39420364 HTTP/1.1" 200 28739
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/21362402 HTTP/1.1" 200 57556
WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: www.zhihu.com
s = requests.Session()
s.mount('https://', HTTPAdapter(pool_connections=1, pool_maxsize=2))
s.mount('https://baidu.com', HTTPAdapter(pool_connections=1, pool_maxsize=1))
t1 = Thread(target=thread_get, args=('https://www.zhihu.com',))
t2 =Thread(target=thread_get, args=('https://www.zhihu.com/question/36612174',))
t1.start();t2.start()
t1.join();t2.join()
t3 = Thread(target=thread_get, args=('https://www.zhihu.com/question/39420364',))
t4 = Thread(target=thread_get, args=('https://www.zhihu.com/question/21362402',))
t3.start();t4.start()
t3.join();t4.join()
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.zhihu.com
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (2): www.zhihu.com
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/36612174 HTTP/1.1" 200 21906
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 2623
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/39420364 HTTP/1.1" 200 28739
DEBUG:requests.packages.urllib3.connectionpool:"GET /question/21362402 HTTP/1.1" 200 57669
» laike9m – Requests’ secret:pool_connections and pool_maxsize urllib3.connectionpool.HTTPConnectionPool