python - Requests: Explanation of the .text format


I'm using the requests module along with Python 2.7 to build a basic web crawler.

source_code = requests.get(url)
plain_text = source_code.text

Now, in the above lines of code, I'm storing the source code of the specified URL and other metadata inside the source_code variable. In source_code.text, what exactly is the .text attribute? It is not a function. I couldn't find anything in the documentation that explains the origin or feature of .text either.

requests.get() returns a Response object, and it is that object that has the .text attribute. It is not the 'source code' of the URL; it is an object that lets you access the source code (the body) of the response, as well as other information. The Response.text attribute gives you the body of the response, decoded to unicode.
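One reason .text is written without parentheses is that it is exposed as a property, so reading the attribute runs the decoding step behind the scenes. The sketch below is a stand-in and not the Requests source: the SketchResponse class, the sample bytes, and the UTF-8 fallback are all assumptions made for illustration (the real library guesses an encoding when none is set).

```python
class SketchResponse(object):
    """Hypothetical stand-in for a response object (not part of Requests)."""

    def __init__(self, content, encoding=None):
        self.content = content    # raw body, as bytes
        self.encoding = encoding  # e.g. taken from the Content-Type header

    @property
    def text(self):
        # Simplification: fall back to UTF-8 when no encoding is known.
        encoding = self.encoding or "utf-8"
        return self.content.decode(encoding, "replace")


response = SketchResponse(b"<h1>caf\xc3\xa9</h1>", encoding="utf-8")
plain_text = response.text  # attribute access, no call parentheses
print(type(response.content).__name__)
print(type(plain_text).__name__)
```

Because text is a property here, every read decodes the stored bytes with whatever encoding is set at that moment.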

See the Response Content section of the Quickstart documentation:

"When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text."

Further information can be found in the API documentation; see the Response.text entry:

"Content of the response, in unicode.

If Response.encoding is None, encoding will be guessed using chardet.

The encoding of the response content is determined based solely on HTTP headers, following RFC 2616 to the letter. If you can take advantage of non-HTTP knowledge to make a better guess at the encoding, you should set r.encoding appropriately before accessing this property."
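To see why setting r.encoding can matter, the following sketch decodes the same bytes two ways using only the standard library. The sample bytes are made up for illustration. ISO-8859-1 plays the wrong guess because it is the HTTP default for text content when the headers name no charset.

```python
# Made-up sample: UTF-8 bytes for "cafe" with an acute accent on the e.
raw_body = b"caf\xc3\xa9"

# Decoding with a wrong guess succeeds but produces mojibake.
wrong = raw_body.decode("iso-8859-1")

# Decoding with the right encoding recovers the intended text.
right = raw_body.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

With Requests, the corresponding fix would be to assign r.encoding = 'utf-8' before reading r.text.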

You can also use Response.content to access the response body undecoded, as raw bytes.
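Raw bytes matter most for binary payloads such as images, where decoding to text would corrupt the data. A minimal sketch, with a made-up byte string standing in for what Response.content would hold:

```python
import os
import tempfile

# Stand-in for response.content: the 8-byte PNG file signature.
body = b"\x89PNG\r\n\x1a\n"

# Decoding binary data as text is lossy: the invalid byte gets replaced.
as_text = body.decode("utf-8", "replace")
lossy = as_text.encode("utf-8") != body

# Writing the bytes in binary mode preserves them exactly.
fd, path = tempfile.mkstemp(suffix=".png")
with os.fdopen(fd, "wb") as handle:
    handle.write(body)
with open(path, "rb") as handle:
    round_trip = handle.read()
os.remove(path)

print(lossy)
print(round_trip == body)
```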

