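Every character listed for sale on Gezi (格子) has a page that stores its data in the same format. Open any character page in a browser, press F12, and search the page source for "charObj", as shown in the figure below.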
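The manual lookup described above can also be scripted. The following is only a minimal sketch: it assumes the page source contains the text charObj followed by an object literal that is strict JSON (no comments), which the original post does not confirm. The function name extract_char_obj and the sample HTML fragment are made up for illustration.

```python
import json


def extract_char_obj(html: str, marker: str = "charObj") -> dict:
    """Return the first JSON object that follows `marker` in the page source."""
    marker_pos = html.find(marker)
    if marker_pos == -1:
        raise ValueError(f"{marker!r} not found in page source")
    brace_pos = html.find("{", marker_pos)
    if brace_pos == -1:
        raise ValueError("no object literal found after the marker")
    # raw_decode parses exactly one JSON value starting at brace_pos and
    # ignores whatever follows it (for example the rest of the script tag).
    obj, _end = json.JSONDecoder().raw_decode(html, brace_pos)
    return obj


# Hypothetical page fragment; the layout of the real page may differ.
sample_html = (
    '<script>var charObj = {"charName": "demo", "level": 119, '
    '"menpai": 11, "items": {"equip": {}}};</script>'
)
char = extract_char_obj(sample_html)
print(char["level"], char["menpai"])
```

If the real page embeds the data differently (for example as an escaped string), the parsing step would need to be adapted.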

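For example, the JSON data below belongs to a high-end Tangmen (唐门) character with an equipment score of about 1.1 million (110W). The data is so clean and well organized that it is a scraper's paradise. What's more, Gezi does not monitor per-IP request volume, so as long as the interval between requests is not too short, you can keep scraping indefinitely.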
{
  "CYLevel": 3, // Changyou VIP level
  "TLLevel": 9, // Tianlong VIP level
  "CLNum": 2,
  "itemShopNum": 0,
  "petShopNum": 0,
  "gemNum3": 0,
  "gemNum4": 0,
  "gemNum5": 0,
  "gemNum6": 28,
  "gemNum7": 38,
  "gemNum8": 7,
  "gemNum9": 0,
  "charName": "遥夜迢迢ˋ",
  "level": 119,
  "sex": 0,
  "menpai": 11, // sect (menpai) ID
  "equipScore": 1113056, // equipment score
  "equipScoreHH": 1113060, // all-time highest equipment score
  "title": "权倾秀阁无双帝",
  "maxHp": 1606102, // HP
  "maxMp": 73041, // MP
  "str": 756, // strength
  "strPlus": 44,
  "spr": 2007, // spirit
  "sprPlus": 345,
  "con": 13574, // constitution
  "conPlus": 755,
  "com": 1479, // composure
  "comPlus": 128,
  "dex": 5358, // agility
  "dexPlus": 347,
  "phyAttack": 28473,
  "phyAttackPlus": 1142,
  "magAttack": 67975,
  "magAttackPlus": 7803,
  "phyDef": 87604,
  "phyDefPlus": 2286,
  "magDef": 39716,
  "magDefPlus": 5009,
  "hit": 139579,
  "hitPlus": 5049,
  "miss": 34911,
  "missPlus": 1731,
  "criticalAtt": 2405,
  "criticalDef": 539,
  "coldAtt": 1231,
  "fireAtt": 1235,
  "lightAtt": 1377,
  "postionAtt": 26287,
  "coldDef": 696,
  "fireDef": 151,
  "lightDef": 216,
  "postionDef": 239,
  "resistColdDef": 46,
  "resistFireDef": 60,
  "resistLightDef": 91,
  "resistPostionDef": 2233,
  "resistColdDefLimit": 0,
  "resistFireDefLimit": 20,
  "resistLightDefLimit": 0,
  "resistPostionDefLimit": 51,
  "qianNeng": 0,
  "equipEnchanceSetAttackLevel": 4,
  "equipEnchanceSetDefenceLevel": 4,
  "xinFaList": [],
  "xiuLianList": [],
  "miFaList": [],
  "upgradeScore": 4410,
  "petList": [],
  "zhenYuanList": [],
  "bkBgBaseInfo": {},
  "items": { // includes every item in the backpack
    "commonItem": {},
    "equip": {},
    "petEquip": {},
    "card": {}
  },
  "miJi": {},
  "chuanCiJianMian": 4395, // piercing damage reduction
  "chuanCiShangHai": 11079, // piercing damage
  "secondGemStatus": 1, // whether the character has 万宝 (Wanbao)
  "SecondGemInfo": {},
  "gemXiuLianScore": 1455, // gem advancement score
  "gemJinJieScore": 1455,
  "infants": [],
  "shenDing": {},
  "yushou": {},
  "zhenyuanInfo": {},
  "tujianInfo": { // new appearance data
    "playerDressInfo": [{ // fashion outfit data
      "dataId": 10126219,
      "num": 1,
      "isBind": 1,
      "pos": 6,
      "takeLevel": 1,
      "name": "倾国两相欢",
      "icon": "Cloth21_5",
      "desc": "",
      "useTimeDesc": "",
      "isMingke": 0,
      "gemAttr": []
    },...],
    "playerExWeaponInfo": [{}], // 幻世 (Huanshi) weapon data
    "playerExRideInfo": [{ // mount data
      "exteriorId": 142,
      "name": "沧澜羽翼",
      "icon": "RideHeader8_7",
      "extType": 1,
      "useTimeDesc": "",
      "speed": 85
    }],
    "playerExHairInfo": [{ // hairstyle data
      "exteriorId": 25,
      "name": "雪羽发",
      "icon": "Woman_hair2_11",
      "extType": 0
    }, ...],
    "playerExFaceInfo": [{ // face shape data
      "exteriorId": 13,
      "name": "灵媚型",
      "icon": "Woman_face1_13",
      "extType": 0
    },...],
    "playerExFutiInfo": [{ // possession (附体) style data
      "exteriorId": 1,
      "name": "默认",
      "icon": "PetPossJian1_6",
      "extType": 3,
      "futiLevelInfo": [{
        "exteriorId": 1001,
        "name": "原始风格",
        "icon": "PetPossJian1_6",
        "extType": 0,
        "level": 1,
        "isJihuo": 1
      }]
    },...],
    "playerExHeadInfo": [{ // avatar data
      "exteriorId": 16,
      "name": "雪萤",
      "icon": "GirlProtagonist_16",
      "extType": 0
    }],
    "playerExFrameInfo": [{}], // avatar frame data
    "infantDressInfo": [{}], // child outfit data
    "infantExHairInfo": [{}],
    "infantClorInfo": [{ "red": 0, "green": 0, "blue": 0 }]
  },
  "version": 1
}
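On top of that, the serial number reveals exactly when the account was listed. For example, 202104101602078709 encodes the listing time of this Tangmen character: 16:02 on April 10, 2021, with the remaining digits giving finer precision down to milliseconds.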
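As a rough illustration of reading the timestamp out of a serial number, the sketch below parses only the first 12 digits (YYYYMMDDHHMM). The helper name listing_time is hypothetical, and the trailing digits are deliberately ignored because their exact layout is not spelled out here.

```python
from datetime import datetime


def listing_time(serial_num: str) -> datetime:
    """Parse the leading YYYYMMDDHHMM digits of a serial number."""
    prefix = serial_num[:12]
    if len(prefix) < 12 or not prefix.isdigit():
        raise ValueError(f"unexpected serial number: {serial_num!r}")
    # Only minute precision is recovered; the remaining digits are skipped.
    return datetime.strptime(prefix, "%Y%m%d%H%M")


print(listing_time("202104101602078709"))  # 2021-04-10 16:02:00
```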
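One more gripe while I'm at it: in the Tieba posts that document account purchases (空降), people screenshot the trade page and blur only the price but not the serial number, which is completely pointless, because Gezi's item links look like this: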
http://tl.cyg.changyou.com/goods/char_detail?serial_num=202104101602078709
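Replace the serial number in the link (the value of serial_num) with any other character's serial number and you land on that item's page. Whether the item has been sold or delisted, the price is still displayed, so hiding the price without also hiding the serial number achieves nothing.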
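A minimal sketch of building these links and fetching pages at a gentle pace is given below. The URL pattern comes from the link above; the 5-second delay, the User-Agent string, the UTF-8 decoding, and the function names are all assumptions chosen for illustration, not values taken from the site.

```python
import time
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://tl.cyg.changyou.com/goods/char_detail"


def detail_url(serial_num: str) -> str:
    """Build the detail-page URL for one serial number."""
    return f"{BASE_URL}?{urlencode({'serial_num': serial_num})}"


def fetch_pages(serial_nums, delay_seconds=5.0, user_agent="Mozilla/5.0"):
    """Yield (serial_num, html) pairs, pausing between requests."""
    for index, serial_num in enumerate(serial_nums):
        if index:
            # Keep the interval between requests from getting too short.
            time.sleep(delay_seconds)
        request = urllib.request.Request(
            detail_url(serial_num),
            headers={"User-Agent": user_agent},  # assumed header value
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            # The page encoding is assumed to be UTF-8.
            yield serial_num, response.read().decode("utf-8", errors="replace")


print(detail_url("202104101602078709"))
```

Writing each fetched page to a file instead of printing it avoids a console truncating long output.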
Finally, here is the address of the advanced account search site that has been developed.
Copyright 摸鱼堡 | Original content unless otherwise noted | This site is licensed under BY-NC-SA
When reposting, please credit the source: http://moyubao.net/coder/1107/
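There's no character info at all. Is it loaded asynchronously? This is so hard for a beginner.
Why can't I scrape the full page source with Python? Only part of it shows up ~.~
It probably isn't loaded asynchronously. Did you set a request header, the User-Agent one? Without it, the request may indeed fail to return the whole page. Keep at it.
It can be scraped directly. The page looked incomplete only because there was so much text that the beginning got cut off in the console; saving it to a txt file showed everything @~@ I had to laugh at myself.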