python - Requests: Explanation of the .text format
I'm using the requests module with Python 2.7 to build a basic web crawler.
source_code = requests.get(url)
plain_text = source_code.text
Now, in the lines of code above, I'm storing the source code of the specified URL, along with other metadata, inside the source_code variable. In source_code.text, what is the .text attribute? It is not a function. I couldn't find anything in the documentation that explains the origin or purpose of .text either.
requests.get() returns a Response object, and it is that object that has the .text attribute. The object is not the 'source code' of the URL; it is an object that lets you access the source code (the body) of the response, as well as other information. The Response.text attribute gives you the body of the response, decoded to unicode.
See the Response Content section of the quickstart documentation:
When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text.
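To make the idea of a header-based guess concrete, here is a minimal sketch that uses only the standard library. It is not Requests' actual implementation; the helper name guess_encoding_from_content_type is hypothetical, and the fallback to ISO-8859-1 for text/* types is an assumption based on the RFC 2616 default.

```python
from email.message import Message


def guess_encoding_from_content_type(content_type):
    """Hypothetical helper: pick a text encoding from a Content-Type value."""
    message = Message()
    message["Content-Type"] = content_type
    # Lowercased charset parameter, or None when the header has none.
    charset = message.get_content_charset()
    if charset is not None:
        return charset
    # Assumed RFC 2616 default for text/* media types without a charset.
    if message.get_content_maintype() == "text":
        return "iso-8859-1"
    # No header-based guess; a detector such as chardet would take over here.
    return None


explicit = guess_encoding_from_content_type("text/html; charset=UTF-8")
defaulted = guess_encoding_from_content_type("text/html")
unknown = guess_encoding_from_content_type("application/octet-stream")
```

Here explicit comes from the charset parameter, defaulted falls back to the text/* default, and unknown stays None because the header gives no hint.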
Further information can be found in the API documentation; see the Response.text entry:
Content of the response, in unicode.
If Response.encoding is None, encoding will be guessed using chardet.
The encoding of the response content is determined based solely on HTTP headers, following RFC 2616 to the letter. If you can take advantage of non-HTTP knowledge to make a better guess at the encoding, you should set r.encoding appropriately before accessing this property.
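As a rough illustration of setting r.encoding, the sketch below builds a Response by hand so that it runs without a network call. Assigning the private _content attribute is a testing shortcut and an assumption of this example, not something you would do with a real response from requests.get().

```python
import requests

# Build a Response by hand so the sketch runs without network access.
# Assigning the private _content attribute is a testing shortcut, not public API.
r = requests.models.Response()
r.status_code = 200
r._content = u"caf\u00e9".encode("latin-1")  # body bytes that are not valid UTF-8
r.encoding = "utf-8"  # pretend the headers declared the wrong charset

garbled = r.text  # decoded with the wrong encoding

r.encoding = "latin-1"  # apply outside knowledge about the real encoding
fixed = r.text  # decoded again, now with the corrected encoding
```

With a real response the pattern is the same: assign r.encoding first, then read r.text.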
You can also use Response.content to access the response body undecoded, as raw bytes.
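The difference between the two attributes can be sketched as follows. The Response is again constructed by hand (an assumption made so the example runs offline); with a live request, requests.get(url) would return an object with the same two attributes.

```python
import requests

# Hand-built Response so the example needs no network access.
# Setting the private _content attribute is a testing shortcut, not public API.
response = requests.models.Response()
response.status_code = 200
response._content = u"caf\u00e9".encode("utf-8")  # raw body bytes
response.encoding = "utf-8"  # normally derived from the HTTP headers

raw_body = response.content   # undecoded bytes
decoded_body = response.text  # unicode string decoded with response.encoding
```

Use .content for binary data such as images, and .text when you want the body as a string.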