A Study of a maimai DX Score Import and Lookup Tool


 2024/06/30 


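Preface

I am a maimai newbie with clumsy hands, and I had gotten used to break's one-click score import. Then one day break suddenly went down, and on a whim I decided to build a local equivalent of break so I could import and look up my own scores with one click. As someone who studies reverse engineering, learning web and crawler techniques from scratch is really hard. Why am I not a web person??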

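Packet capture on mobile

Since I had some experience importing gacha data for a certain game, I first tried capturing the official maimai web page with Stream, a packet capture tool on my phone. It seemed to get me banned right away, possibly because WeChat detected it...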

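Packet capture on Windows

Pages opened inside WeChat are rendered by WeChat's built-in browser, where you cannot press F12 to view the source. So we capture the traffic with a packet capture tool and then replay the requests.

The most useful value is the userId

Replay the request with HackBar: after changing the Cookie value, the page displays normally in an ordinary browser.

Then write a simple Python crawler script to dump the source of the useful pages.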
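As a minimal sketch of handling the captured cookie in code, the raw Cookie header can be split into name/value pairs before being reused. The helper name `parse_cookie_header` and the placeholder values are assumptions for illustration; the real `_t` and `userId` values come from your own capture.

```python
def parse_cookie_header(raw):
    """Split a raw Cookie header value into a name -> value dict."""
    cookies = {}
    for part in raw.split(';'):
        part = part.strip()
        if not part or '=' not in part:
            continue  # skip empty fragments such as a trailing ';'
        name, _, value = part.partition('=')
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder values (hypothetical); substitute the ones from your capture
captured = '_t=<token from capture>; userId=<id from capture>;'
cookies = parse_cookie_header(captured)
print(cookies)
```

The resulting dict could be passed to `requests` through its `cookies=` argument instead of hard-coding a `Cookie` header, which makes it easier to swap in fresh values when the old ones expire.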

import requests
session = requests.Session()
url1 = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/?genre=99&diff=0'
url2 = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/?genre=99&diff=1'
url3 = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/?genre=99&diff=2'
url4 = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/?genre=99&diff=3'
url5 = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/?genre=99&diff=4'
headers = {
    'Host': 'maimai.wahlap.com',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 NetType/WIFI MicroMessenger/7.0.20.1781(0x6700143B) WindowsWechat(0x63090b11) XWEB/8555 Flue',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-User': '?1',
    'Sec-Fetch-Dest': 'document',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cookie': '_t=053aec723085f36dd132b8af227c0658; userId=2352886346959880;'  # differs for each user
}

# Map each difficulty name to its page URL
pages = {
    'basic': url1,
    'advance': url2,
    'expert': url3,
    'master': url4,
    'rema': url5,
}

for name, url in pages.items():
    response = session.get(url=url, headers=headers)
    # The page contains this message when the cookie is rejected
    if "登录失败，请重试。" not in response.text:
        # Write the response text to a plain text file
        with open(f'info_{name}.txt', 'w', encoding='utf-8') as file:
            file.write(response.text)
        print(f"{name} table saved")
    else:
        print(f"Login failed, please retry. {name} table not written.")

This dumps the page source for all five difficulty levels.
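The five URLs above differ only in the `diff` query parameter (0 through 4). As a rough sketch, they could be generated instead of written out by hand. The names `BASE`, `DIFFS`, and `build_url` are hypothetical; the mapping of difficulty names to numbers follows the order used in the script above.

```python
from urllib.parse import urlencode

BASE = 'https://maimai.wahlap.com/maimai-mobile/record/musicGenre/search/'

# Difficulty name -> value of the diff query parameter
DIFFS = {'basic': 0, 'advance': 1, 'expert': 2, 'master': 3, 'rema': 4}


def build_url(diff, genre=99):
    """Build the record search URL for one difficulty."""
    return f"{BASE}?{urlencode({'genre': genre, 'diff': diff})}"


urls = {name: build_url(diff) for name, diff in DIFFS.items()}
for name, url in urls.items():
    print(name, url)
```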

Find the data that needs processing, then write a script that uses BeautifulSoup to extract and process the useful fields.

from bs4 import BeautifulSoup

with open('info_rema.txt', 'r', encoding='utf-8') as file:
    data = file.read()

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(data, 'html.parser')

# Extract the fields of a single song entry
def extract_music_info(div):
    level = div.find('div', class_='music_lv_block').text.strip()
    name = div.find('div', class_='music_name_block').text.strip()
    score_block = div.find('div', class_='music_score_block w_112 t_r f_l f_12')
    score = score_block.text.strip() if score_block else 'Not played'
    details_block = div.find('div', class_='music_score_block w_190 t_r f_l f_12')
    details = details_block.text.strip() if details_block else 'Not played'
    return {
        'name': name,
        'score': score,
        'details': details,
        'level': level
    }

# Find all the song entry divs
divs = soup.find_all('div', class_='w_450 m_15 p_r f_0')

# Extract the information from each div
music_info_list = [extract_music_info(div) for div in divs]

# Print the extracted information
for info in music_info_list:
    print(f"Song: {info['name']}")
    print(f"Level: {info['level']}")
    print(f"Result: {info['score']}")
    print(f"Score: {info['details']}")
    print('---')
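For the local lookup part, one possible next step is to save the extracted records to a JSON file and search them by song name. This is only a sketch: `save_records`, `load_records`, and `find_by_name` are hypothetical helpers, and the sample records below are made-up values shaped like the dicts returned by `extract_music_info`.

```python
import json
import os
import tempfile


def save_records(records, path):
    """Write a list of record dicts to a UTF-8 JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_records(path):
    """Read the list of record dicts back from the JSON file."""
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)


def find_by_name(records, keyword):
    """Return records whose name contains keyword, ignoring case."""
    keyword = keyword.casefold()
    return [r for r in records if keyword in r['name'].casefold()]


# Made-up sample data (hypothetical), same keys as extract_music_info
sample = [
    {'name': 'Song A', 'score': '100.5000%', 'details': '1,234 / 1,500', 'level': '13+'},
    {'name': 'Song B', 'score': '97.1234%', 'details': '1,100 / 1,400', 'level': '12'},
]

path = os.path.join(tempfile.gettempdir(), 'scores_sample.json')
save_records(sample, path)
matches = find_by_name(load_records(path), 'song a')
print(matches)
```

In practice `music_info_list` would take the place of `sample`, with one JSON file per difficulty.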