Scraping the Douban Movie TOP250 and Getting Recommendations by Genre

Time: 2021-07-02 07:46:19

import random

import requests
from bs4 import BeautifulSoup
import lxml  # parser backend used by BeautifulSoup(html, 'lxml')

'''
/top250
/top250?start=25
/top250?start=50&filter=
'''

# NOTE: the site host is missing from the 'Host' headers and from the URLs
# below; fill it in before running.
header1 = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                         '(KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36',
           'Host': ""}  # Chrome
header2 = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
                         " (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362",
           'Host': ""}  # Edge
header3 = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/0101 Firefox/71.0",
           'Host': ""}  # Firefox
header_list = [header1, header2, header3]

datas = {}      # every movie, keyed by name
comedy = {}     # comedy
love = {}       # romance
sci_fi = {}     # science fiction
thriller = {}   # thriller
crime = {}      # crime
animation = {}  # animation

# Genre keyword as it appears on the page -> dict collecting movies of that genre
genre_buckets = {'喜剧': comedy, '爱情': love, '科幻': sci_fi,
                 '惊悚': thriller, '犯罪': crime, '动画': animation}

# The list spans 10 pages of 25 movies each
for i in range(1, 11):
    if i == 1:
        url = "/top250"
    else:
        url = '/top250?start=%d&filter=' % ((i - 1) * 25)
    # Pick a random User-Agent for each request
    header = random.choice(header_list)
    req = requests.get(url, headers=header)
    html = req.text
    bf = BeautifulSoup(html, 'lxml')
    soup = bf.find_all('div', class_='info')
    for item in soup:
        data = {}
        movie_name = item.find('a').find('span').string
        score_str = item.find('div', class_='star').find('span', class_='rating_num').string
        score = float(score_str)
        # Flatten the <p> holding director / cast / year / country / genres
        # into whitespace-separated tokens
        director_str = item.find('div', class_='bd').find('p')
        director_str = str(director_str)
        director_str = director_str.replace(' ', '')
        director_str = director_str.replace('<pclass="">', '')
        director_str = director_str.replace('TimRobbins/...<br/>', '')
        director_str = director_str.replace('</p>', '')
        director_str = director_str.replace('...<br/>', '')
        director_str = director_str.split()
        director = director_str[0]
        starring = director_str[1]
        release_time = director_str[2]
        genre = director_str[-1]
        data['name'] = movie_name
        data['director'] = director[3:]  # drop the 3-character "导演:" (director) label
        data['type'] = genre
        data['time'] = release_time
        data['score'] = score
        datas[movie_name] = data
        # Only movies rated 9.0 or higher are recommended within a genre
        for keyword, bucket in genre_buckets.items():
            if keyword in genre and score >= 9.0:
                bucket[movie_name] = data


def print_top(title, movies, limit=10):
    """Print up to `limit` movies from `movies`, highest score first."""
    print(title + '>' * 10)
    ranked = sorted(movies.items(), key=lambda x: x[1]['score'], reverse=True)
    # chr(12288) is the full-width space, used as padding so CJK text lines up
    tplt = "{0:{2}^10}\t\t\t{1:{2}<10}"
    print(tplt.format("电影名称", "评分", chr(12288)))  # headers: movie title, score
    for _, movie in ranked[:limit]:
        print(tplt.format(movie["name"], movie["score"], chr(12288)))
    print()


print_top("豆瓣评分最高", datas)      # top picks: highest rated on Douban
print_top("喜剧电影推荐", comedy)     # comedy recommendations
print_top("爱情电影推荐", love)       # romance recommendations
print_top("科幻电影推荐", sci_fi)     # science-fiction recommendations
print_top("惊悚电影推荐", thriller)   # thriller recommendations
print_top("犯罪电影推荐", crime)      # crime recommendations
print_top("动画电影推荐", animation)  # animation recommendations
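
The genre filter and score ranking in the script above can be tried without any network access. The following is a minimal sketch, not the author's code: the `recommend` helper and the `sample` records are hypothetical and the titles and scores are made up for illustration. It assumes each record has the same `name`, `type`, and `score` keys the script stores.

```python
from typing import Dict, List, Tuple

# Genre keywords as matched in the script: comedy, romance, sci-fi,
# thriller, crime, animation.
GENRES = ["喜剧", "爱情", "科幻", "惊悚", "犯罪", "动画"]


def recommend(movies: Dict[str, dict], genre: str,
              min_score: float = 9.0, limit: int = 10) -> List[Tuple[str, float]]:
    """Return up to `limit` (name, score) pairs for `genre`, best score first.

    Hypothetical helper: a movie qualifies when the genre keyword occurs in
    its 'type' string and its score is at least `min_score`.
    """
    picked = [m for m in movies.values()
              if genre in m["type"] and m["score"] >= min_score]
    picked.sort(key=lambda m: m["score"], reverse=True)
    return [(m["name"], m["score"]) for m in picked[:limit]]


# Made-up records for illustration only; these are not scraped data.
sample = {
    "Movie A": {"name": "Movie A", "type": "犯罪剧情", "score": 9.7},
    "Movie B": {"name": "Movie B", "type": "喜剧爱情", "score": 9.1},
    "Movie C": {"name": "Movie C", "type": "喜剧", "score": 8.8},
    "Movie D": {"name": "Movie D", "type": "喜剧动画", "score": 9.3},
}

if __name__ == "__main__":
    for genre in GENRES:
        print(genre, recommend(sample, genre))
```

Because a movie's `type` string can contain several genres, one movie may appear in more than one recommendation list, as "Movie B" and "Movie D" do here.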
