[python] Trying Out Crawling


프로틴형님 2022. 8. 7. 03:34

Trying Out Crawling

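Web Crawler : The word "crawler" literally means "something that crawls" or "a reptile". Here it is software that collects data from web pages.
Web Crawling : the act of using a crawler to extract data from web pages
Parsing : the process of converting raw data into a structured, meaningful form
Parser : a program that performs the parsing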
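To make the difference between raw data and parsed data concrete, here is a minimal sketch that parses a hard-coded HTML string with the standard library's html.parser.HTMLParser. The LinkCollector class and the sample HTML are made up for illustration; the rest of this post uses BeautifulSoup as the parser instead.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Raw, unparsed data: just one long string (made up for this example).
raw_html = '<p>See <a href="/docs">docs</a> and <a href="/blog">blog</a>.</p>'

parser = LinkCollector()
parser.feed(raw_html)

# Parsed data: a list we can actually work with.
print(parser.links)
```

The crawler's job is to obtain raw_html; the parser's job is to turn it into something like parser.links.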

Preparing the libraries

pip install requests
pip install beautifulsoup4

>> Check from a script where the requests library is installed

import requests

print(requests)
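Printing the module works once it has been imported. As an alternative sketch, the standard library's importlib.util.find_spec can report the location without importing the package, and it also tells you when a package is missing. The function name find_install_location is my own, not part of any library.

```python
import importlib.util


def find_install_location(module_name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Check both libraries used in this post.
for name in ("requests", "bs4"):
    location = find_install_location(name)
    print(name, "->", location if location else "not installed")
```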

Sending a request and receiving a response

Reading the HTML code of the Google site

import requests

url = "http://www.google.com"
response = requests.get(url)

print(response.text)
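A real request can hang or fail, so in practice it may help to add a timeout and check the status code. The following is only a sketch under that assumption; fetch_html is a hypothetical helper name and the 10-second timeout is an arbitrary choice.

```python
import requests


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the page's HTML, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx / 5xx status codes.
        response.raise_for_status()
    except requests.RequestException as error:
        print("request failed:", error)
        return None
    return response.text


# Example call (needs network access):
# html = fetch_html("http://www.google.com")
```

requests.RequestException is the base class of the library's exceptions, so a single except clause covers connection errors, timeouts, invalid URLs, and bad status codes.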

The response object has attributes such as text, url, content, encoding, headers, links, ok, and status_code, as well as a json() method.
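To see what a few of these hold, a small sketch like the one below can collect them into a dict. summarize_response is a made-up helper for this post; only the attributes it reads come from requests.

```python
import requests


def summarize_response(response):
    """Hypothetical helper: pick out a few commonly used Response attributes."""
    return {
        "status_code": response.status_code,  # e.g. 200, 404
        "ok": response.ok,                    # True when status_code < 400
        "url": response.url,                  # final URL after redirects
        "encoding": response.encoding,        # encoding used to decode .text
        "content_type": response.headers.get("Content-Type"),
        "size_bytes": len(response.content),  # .content is bytes, .text is str
    }


# Example call (needs network access):
# print(summarize_response(requests.get("http://www.google.com", timeout=10)))
```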

Beautiful Soup

Convert the str into a BeautifulSoup object
=> BeautifulSoup( data , parser )

import requests
from bs4 import BeautifulSoup

url = "http://www.google.com"
response = requests.get(url)

print(BeautifulSoup(response.text, 'html.parser'))
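The same conversion can be tried offline. This sketch feeds a made-up HTML string (not Google's actual page) to BeautifulSoup to show that a plain str goes in and a BeautifulSoup object comes out.

```python
from bs4 import BeautifulSoup

# Made-up HTML, so the example runs without a network request.
html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

soup = BeautifulSoup(html, "html.parser")

print(type(html).__name__)   # the input is a plain str
print(type(soup).__name__)   # the result is a BeautifulSoup object
print(soup.prettify())       # the same document, re-indented as a tree
```

'html.parser' is the parser bundled with Python, so it needs no extra install.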

Getting HTML tags

import requests
from bs4 import BeautifulSoup

url = "http://www.google.com"
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

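print(soup.title)             # the <title> tag itself
print(soup.title.string)      # only the text inside <title>
print(soup.find_all('span'))  # every <span> tag, as a list (find_all is the current name for findAll)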
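Google's markup changes often, so the output above is hard to predict. Here is a sketch on a small made-up page that shows the same calls plus a few closely related ones: get_text(), find(), attribute access, and select(). The HTML and variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Made-up page so the results are predictable.
html = """
<html><head><title>Demo page</title></head>
<body>
  <span class="menu">Home</span>
  <span class="menu">About</span>
  <a href="/login">Sign in</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

print(soup.title)          # the whole tag
print(soup.title.string)   # only the text inside the tag

# find_all returns a list of tags; get_text() extracts each tag's text.
span_texts = [span.get_text() for span in soup.find_all("span")]
print(span_texts)

# find returns the first match (or None); attributes are read like a dict.
link = soup.find("a")
print(link["href"])

# select takes a CSS selector instead of a tag name.
menu_items = soup.select("span.menu")
print(len(menu_items))
```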

이혁규의 개발비책