Characteristics of each hero's page
Each hero's detail page is https://lol.qq.com/data/info-defail.shtml?id= followed by the hero's id.
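As a quick illustration, such a URL can be assembled like this (a minimal sketch; the hero id 266 in the example call is only a placeholder value):

    # Build a hero's detail-page URL from its id.
    # The id used in the example call (266) is only a placeholder value.
    BASE_URL = 'https://lol.qq.com/data/info-defail.shtml?id='

    def hero_page_url(hero_id):
        return BASE_URL + str(hero_id)

    print(hero_page_url(266))
    # -> https://lol.qq.com/data/info-defail.shtml?id=266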
Characteristics of each hero's skins
Inspecting the page directly with the browser's developer tools shows that the skin images are stored at URLs of the form
https://game.gtimg.cn/images/lol/act/img/skin/big + hero id + skin id + .jpg.
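The skin id part is the skin's index zero-padded to three digits, which is also how the full script below builds the URL. A minimal sketch, with placeholder ids in the example call:

    # Compose a skin image URL: hero id followed by the skin index padded to three digits.
    # The ids in the example call (hero 1, skin 0) are placeholder values.
    IMG_BASE = 'https://game.gtimg.cn/images/lol/act/img/skin/big'

    def skin_image_url(hero_id, skin_index):
        return '{}{}{:03d}.jpg'.format(IMG_BASE, hero_id, skin_index)

    print(skin_image_url(1, 0))
    # -> https://game.gtimg.cn/images/lol/act/img/skin/big1000.jpg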
It also turns out that the URL for fetching a given hero's skin ids is https://game.gtimg.cn/images/lol/act/img/js/hero/ + hero id + .js.
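Despite the .js extension, the file parses as plain JSON. A minimal sketch of reading it, using the same endpoint and the 'skins'/'name' fields that the full script below relies on (the hero id 1 in the example call is only a placeholder):

    # Fetch one hero's skin list; the .js file parses as plain JSON.
    import json
    import requests

    HERO_JS = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'

    def fetch_skin_names(hero_id):
        text = requests.get(HERO_JS.format(hero_id)).text
        return [skin['name'] for skin in json.loads(text)['skins']]

    print(fetch_skin_names(1))  # the id 1 here is a placeholder value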
As for the number of heroes, inspecting the site shows that this data can likewise be obtained by reading a .js file.
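A minimal sketch, using the same hero_list.js endpoint and field names ('hero', 'heroId', 'name') that the full script below uses:

    # Read the hero list (also a JSON-formatted .js file) and build a name -> id mapping.
    import json
    import requests

    HERO_LIST_URL = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'

    hero_list = json.loads(requests.get(HERO_LIST_URL).text)['hero']
    print('number of heroes:', len(hero_list))

    name_to_id = {hero['name']: hero['heroId'] for hero in hero_list}
    print(name_to_id)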
So, to summarize: first fetch all of the hero information, parse it, and store each hero's name and id in a dictionary;
then iterate over those heroes and fetch the corresponding skin information for each one.
Putting all of this together, the full script is as follows.
import json
import os

import requests
from tqdm import tqdm


def lol_spider():
    heros = []
    hero_skins = []
    # The hero list is a .js file that is plain JSON with a 'hero' array.
    url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
    hero_text = requests.get(url).text
    hero_json = json.loads(hero_text)['hero']
    workspace = os.getcwd()
    skin_path = os.path.join(workspace, 'skins')

    # Collect each hero's id and name.
    for hero in hero_json:
        heros.append({'id': hero['heroId'], 'name': hero['name']})

    for hero in heros:
        hero_id = hero['id']
        hero_name = hero['name']
        # One sub-directory per hero; create parent directories as needed.
        dir_name = os.path.join(skin_path, hero_name)
        os.makedirs(dir_name, exist_ok=True)
        os.chdir(dir_name)

        # The per-hero skin list is again a JSON-formatted .js file.
        hero_skin_url = 'https://game.gtimg.cn/images/lol/act/img/js/hero/' + str(hero_id) + '.js'
        skin_text = requests.get(hero_skin_url).text
        skin_list = json.loads(skin_text)['skins']

        hero_skins.clear()
        for skin in skin_list:
            # Strip characters that are not safe in file names.
            hero_skins.append(skin['name'].replace('/', '').replace('\\', '').replace(' ', ''))

        skins_num = len(hero_skins)
        for i in tqdm(range(skins_num), desc='Downloading skins for ' + hero_name):
            # The image URL is the hero id followed by the skin index padded to three digits.
            s = '{:03d}'.format(i)
            skin_url = 'https://game.gtimg.cn/images/lol/act/img/skin/big' + str(hero_id) + s + '.jpg'
            try:
                img = requests.get(skin_url)
            except requests.RequestException:
                continue
            # Some indices have no image; skip anything that does not return a 200.
            if img.status_code == 200:
                with open(hero_skins[i] + '.jpg', 'wb') as f:
                    f.write(img.content)


if __name__ == '__main__':
    lol_spider()
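Two details worth noting about the script: every hero gets its own sub-directory under skins/ (created with os.makedirs so the parent folder is made on the first run), and any skin index whose request fails or does not return HTTP 200 is simply skipped, so only skins that actually have a big splash image end up on disk.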