I recently came across some Kuaishou creators whose work is quite good, and for learning and research purposes I decided to look at how to scrape their data. There are a few scraper tools online, but most of them have stopped working or are not open source, so I wrote a small tool myself. First, the result:
The software only needs the author's uid and the request Cookie from the web version, and it then downloads everything automatically. Files are saved to the Download folder under the program's root directory.
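Kuaishou's risk control is fairly aggressive, so the software also has a fallback. It needs some manual help, though: click the hint text in the app to copy the request URL, paste it into your browser, and save the returned JSON to a local file. Then use the app's "parse local JSON" button to parse the file and download. If the returned JSON is very short or contains no data, refresh any Kuaishou page first. That tells Kuaishou's risk control that this is normal browsing, not bot behavior.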
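The "very short or no data" check can also be done in code before handing a saved file to the app. Below is a minimal sketch, assuming the response has the shape { "data": { "list": [...] } }. That shape is inferred from the Data.List properties used in the app's code, so the real property names or casing may differ. SavedJsonCheck and CountWorks are hypothetical names used only for illustration.

```csharp
using System.Text.Json;

public static class SavedJsonCheck
{
    // Returns the number of entries under data.list (an assumed shape),
    // or 0 when the text is not JSON or the expected properties are missing.
    public static int CountWorks(string json)
    {
        try
        {
            using JsonDocument doc = JsonDocument.Parse(json);
            JsonElement root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("data", out JsonElement data)
                && data.ValueKind == JsonValueKind.Object
                && data.TryGetProperty("list", out JsonElement list)
                && list.ValueKind == JsonValueKind.Array)
            {
                return list.GetArrayLength();
            }
        }
        catch (JsonException)
        {
            // Not JSON at all, for example an HTML verification page.
        }
        return 0;
    }
}
```

A result of 0 would be the cue to refresh a Kuaishou page in the browser and save the response again.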
Here is the thinking behind how the whole app was built.
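Open https://live.kuaishou.com/ , search for the nickname of the author you want to scrape at the top of the page, and go to the author's home page. Alternatively, share the author's home page link from the mobile app and paste it into the browser. Once the author's page has loaded, the address bar must show something like https://live.kuaishou.com/profile/xxxxxx . The trailing xxxxxx is the author's user id. Copy it and keep it; it is needed later.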
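Pulling the user id out of a pasted profile link is easy to automate. The following is a small sketch, not part of the original tool; ProfileUrl and TryGetUserId are hypothetical names, and it only assumes that the path looks like /profile/xxxxxx as described above.

```csharp
using System;

public static class ProfileUrl
{
    // Tries to read the user id from a URL such as
    // https://live.kuaishou.com/profile/xxxxxx
    // A query string or a trailing slash is ignored.
    public static bool TryGetUserId(string url, out string userId)
    {
        userId = string.Empty;
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri? uri))
            return false;

        // AbsolutePath excludes the query string, e.g. "/profile/xxxxxx/".
        string[] parts = uri.AbsolutePath.Split('/', StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length >= 2 && parts[0] == "profile")
        {
            userId = parts[1];
            return true;
        }
        return false;
    }
}
```

Because the method does not check the host name, it would accept a link from any site whose path starts with /profile/; adding a host check is a possible refinement.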
Press F12 to open the browser's developer tools. (As I have said before, the developer tools are essential for studying scrapers and well worth learning properly.)
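At the top of the developer tools, select "Network" and then "All", as shown in the screenshot. Find the entry named after the user id in the request list and click it; the request headers appear on the right. One of them is Cookie. Copy its value and keep it. If the entry is not there, refresh the page.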
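The list contains a lot of requests. The one we need is the request the web page uses to show the author's list of works, which starts with public. You can also type public into the filter box at the top left to hide most of the unrelated requests. The response to this request contains the complete JSON for the author's works.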
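You can right-click the request and open it in a new tab; the address bar then shows the complete request URL. Keep this URL, because it is needed later. The count parameter defaults to 12 or 20. When we use it ourselves, we simply max it out and set it to 9999.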
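For reference, here is the same URL assembled in code. The path and parameter names are the ones that appear in the app's source further down; ProfileApi and BuildWorksUrl are hypothetical names for this sketch.

```csharp
using System;

public static class ProfileApi
{
    // Builds the "public" works-list URL for one author.
    // count defaults to 9999, i.e. the "max it out" value from the text.
    public static string BuildWorksUrl(string userId, int count = 9999)
    {
        return "https://live.kuaishou.com/live_api/profile/public"
            + $"?count={count}&pcursor=&principalId={Uri.EscapeDataString(userId)}&hasMore=true";
    }
}
```

Escaping the user id is a precaution in case it ever contains characters that are not URL-safe; for a plain alphanumeric id the result is unchanged.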
Install the Postman Interceptor extension from https://chromewebstore.google.com/detail/postman-interceptor/aicmkgpgakddgnaphhhpliifpcfhicfo . This is another killer tool. Paired with the developer tools, it can in theory handle almost any scraping need.
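Open Postman and click Start Proxy in the bottom-right corner.
With interception on, go back to the author's page in the web version and refresh it. Once the page has finished loading, stop the interception; otherwise the list keeps growing, because it captures every network request on the machine. By now Postman will have captured a large batch of requests. As before, find the public request, or type public in the filter box at the top left to narrow the list down to the one we need.
Click this request link.
Postman then opens a new window containing all the parameters and headers of this request.
Click the code tool on the far right of Postman to generate the code we need. You can choose C#, Python, JavaScript, curl, and so on.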
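Generated code of this kind usually boils down to one GET request carrying the captured headers. The sketch below is not Postman's actual output; it uses HttpClient from the .NET base library instead of RestSharp (which the app in this post uses), and it sets only the Cookie and User-Agent headers, which may or may not be enough for the server. WorksRequest, Create and FetchAsync are hypothetical names.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class WorksRequest
{
    // Builds a GET request that carries the browser's cookie string.
    public static HttpRequestMessage Create(string url, string cookie)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        // TryAddWithoutValidation skips strict header parsing, which
        // values copied from a browser do not always pass.
        request.Headers.TryAddWithoutValidation("Cookie", cookie);
        request.Headers.TryAddWithoutValidation("User-Agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36");
        return request;
    }

    // Sends the request and returns the response body as text.
    public static async Task<string> FetchAsync(HttpClient client, string url, string cookie)
    {
        using HttpRequestMessage request = Create(url, cookie);
        using HttpResponseMessage response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

When the cookie is supplied by hand like this, it is probably safest to create the client as new HttpClient(new HttpClientHandler { UseCookies = false }) so that the handler's own cookie container does not interfere with the header.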
<ui:FluentWindow
    x:Class="KuaishouDownloader.MainWindow"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:local="clr-namespace:KuaishouDownloader"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:ui="http://schemas.lepo.co/wpfui/2022/xaml"
    Title="MainWindow"
    Width="900"
    Height="760"
    ExtendsContentIntoTitleBar="True"
    WindowBackdropType="Mica"
    WindowCornerPreference="Default"
    WindowStartupLocation="CenterScreen"
    mc:Ignorable="d">
    <!-- The window's layout markup is missing from this copy of the post. The only surviving piece is the tail of the uid box hint: "... starts with https://www.kuaishou.com/profile/xxxxxx/; copy the xxxxxx part here" -->
</ui:FluentWindow>
using KuaishouDownloader.Models;
using Newtonsoft.Json;
using RestSharp;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows;
using Wpf.Ui;
using Wpf.Ui.Controls;

namespace KuaishouDownloader
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow
    {
        string downloadFolder = AppContext.BaseDirectory;
        SnackbarService? snackbarService = null;

        public MainWindow()
        {
            InitializeComponent();
            this.Loaded += MainWindow_Loaded;
        }

        private void MainWindow_Loaded(object sender, RoutedEventArgs e)
        {
            snackbarService = new SnackbarService();
            snackbarService.SetSnackbarPresenter(snackbarPresenter);

            // Restore the uid and cookie saved by a previous run, if any.
            if (File.Exists("AppConfig.json"))
            {
                var model = JsonConvert.DeserializeObject<AppConfig>(File.ReadAllText("AppConfig.json"));
                if (model != null)
                {
                    tbUid.Text = model.Uid;
                    tbCookie.Text = model.Cookie;
                }
            }
        }

        private void Theme_Click(object sender, RoutedEventArgs e)
        {
            // Toggle between the light and dark themes and swap the button icon.
            if (Wpf.Ui.Appearance.ApplicationThemeManager.GetAppTheme() == Wpf.Ui.Appearance.ApplicationTheme.Light)
            {
                themeButton.Icon = new SymbolIcon(SymbolRegular.WeatherSunny48);
                Wpf.Ui.Appearance.ApplicationThemeManager.Apply(Wpf.Ui.Appearance.ApplicationTheme.Dark);
            }
            else
            {
                themeButton.Icon = new SymbolIcon(SymbolRegular.WeatherMoon48);
                Wpf.Ui.Appearance.ApplicationThemeManager.Apply(Wpf.Ui.Appearance.ApplicationTheme.Light);
            }
        }

        private async void Download_Click(object sender, RoutedEventArgs e)
        {
            try
            {
                btnDownload.IsEnabled = false;
                btnParseJson.IsEnabled = false;
                if (string.IsNullOrEmpty(tbUid.Text) || string.IsNullOrEmpty(tbCookie.Text))
                {
                    snackbarService?.Show("Notice", "Please enter the uid and cookie", ControlAppearance.Caution, null, TimeSpan.FromSeconds(3));
                    return;
                }

                // Remember the inputs for the next run.
                var json = JsonConvert.SerializeObject(new AppConfig() { Uid = tbUid.Text, Cookie = tbCookie.Text }, Formatting.Indented);
                File.WriteAllText("AppConfig.json", json);

                var options = new RestClientOptions("https://live.kuaishou.com")
                {
                    Timeout = TimeSpan.FromSeconds(15),
                    UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
                };
                var client = new RestClient(options);

                // Headers below were captured from the browser request via Postman.
                var request = new RestRequest($"/live_api/profile/public?count=9999&pcursor=&principalId={tbUid.Text}&hasMore=true", Method.Get);
                request.AddHeader("host", "live.kuaishou.com");
                request.AddHeader("connection", "keep-alive");
                request.AddHeader("cache-control", "max-age=0");
                request.AddHeader("sec-ch-ua", "\"Not)A;Brand\";v=\"99\", \"Google Chrome\";v=\"127\", \"Chromium\";v=\"127\"");
                request.AddHeader("sec-ch-ua-mobile", "?0");
                request.AddHeader("sec-ch-ua-platform", "\"Windows\"");
                request.AddHeader("upgrade-insecure-requests", "1");
                request.AddHeader("accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7");
                request.AddHeader("sec-fetch-site", "none");
                request.AddHeader("sec-fetch-mode", "navigate");
                request.AddHeader("sec-fetch-user", "?1");
                request.AddHeader("sec-fetch-dest", "document");
                request.AddHeader("accept-encoding", "gzip, deflate, br, zstd");
                request.AddHeader("accept-language", "zh,en;q=0.9,zh-CN;q=0.8");
                request.AddHeader("cookie", tbCookie.Text);
                request.AddHeader("x-postman-captr", "9467712");

                RestResponse response = await client.ExecuteAsync(request);
                Debug.WriteLine(response.Content);

                // Content is null when the request itself failed; treat that like an empty result.
                var model = JsonConvert.DeserializeObject<KuaishouModel>(response.Content ?? string.Empty);
                if (model?.Data?.List == null || model.Data.List.Count == 0)
                {
                    snackbarService?.Show("Notice", "Request failed. Kuaishou's risk control may have been triggered; please wait a while and try again.", ControlAppearance.Danger, null, TimeSpan.FromSeconds(3));
                    return;
                }
                await Download(model);
            }
            finally
            {
                btnDownload.IsEnabled = true;
                btnParseJson.IsEnabled = true;
            }
        }

        private async void ParseJson_Click(object sender, RoutedEventArgs e)
        {
            try
            {
                btnDownload.IsEnabled = false;
                btnParseJson.IsEnabled = false;

                // Let the user pick the JSON file saved from the browser.
                var dialog = new Microsoft.Win32.OpenFileDialog();
                dialog.Filter = "JSON files (*.json)|*.json";
                bool? result = dialog.ShowDialog();
                if (result != true)
                {
                    return;
                }

                var model = JsonConvert.DeserializeObject<KuaishouModel>(File.ReadAllText(dialog.FileName));
                if (model?.Data?.List == null || model.Data.List.Count == 0)
                {
                    snackbarService?.Show("Notice", "Not a valid JSON file", ControlAppearance.Caution, null, TimeSpan.FromSeconds(3));
                    return;
                }
                await Download(model);
            }
            finally
            {
                btnDownload.IsEnabled = true;
                btnParseJson.IsEnabled = true;
            }
        }

        private async Task Download(KuaishouModel model)
        {
            // Both callers have already checked that the list exists and is not empty.
            var works = model.Data!.List!;
            progress.Value = 0;
            progress.Minimum = 0;
            progress.Maximum = works.Count;
            snackbarService?.Show("Notice", $"Parsed {works.Count} works, starting download", ControlAppearance.Success, null, TimeSpan.FromSeconds(5));
            imgHeader.Source = new System.Windows.Media.Imaging.BitmapImage(new Uri(works[0]?.Author?.Avatar!));
            tbNickName.Text = works[0]?.Author?.Name;

            // The poster URL is expected to contain the publish time as yyyy/MM/dd/HH.
            string pattern = @"\d{4}/\d{2}/\d{2}/\d{2}";
            var random = new Random();
            for (int i = 0; i < works.Count; i++)
            {
                DateTime dateTime = DateTime.Now;
                string fileNamePrefix = "";
                var item = works[i]!;
                Match match = Regex.Match(item.Poster!, pattern);
                if (match.Success)
                {
                    string[] parts = match.Value.Split('/');
                    dateTime = new DateTime(int.Parse(parts[0]), int.Parse(parts[1]), int.Parse(parts[2]), int.Parse(parts[3]), 0, 0);
                    if (cbAddDate.IsChecked == true)
                        fileNamePrefix = $"{parts[0]}-{parts[1]}-{parts[2]} {parts[3]}-00-00 ";
                }

                // One folder per author: Download/<name>(<id>)
                downloadFolder = Path.Combine(AppContext.BaseDirectory, "Download", item.Author?.Name! + "(" + item.Author?.Id! + ")");
                Directory.CreateDirectory(downloadFolder);

                switch (item.WorkType)
                {
                    case "single":
                    case "vertical":
                    case "multiple":
                        // Image posts: download every image URL.
                        await DownLoadHelper.Download(item.ImgUrls!, dateTime, downloadFolder, fileNamePrefix);
                        break;
                    case "video":
                        await DownLoadHelper.Download(new List<string>() { item.PlayUrl! }, dateTime, downloadFolder, fileNamePrefix);
                        break;
                }

                progress.Value = i + 1;
                tbProgress.Text = $"{i + 1} / {works.Count}";

                // Random pause between works so the traffic looks less like a bot.
                if (cbLongInterval.IsChecked == true)
                    await Task.Delay(random.Next(5000, 10000));
                else
                    await Task.Delay(random.Next(1000, 5000));
            }
            snackbarService?.Show("Notice", $"Download complete, {works.Count} works in total", ControlAppearance.Success, null, TimeSpan.FromDays(1));
        }

        private void CopyUrl(object sender, System.Windows.Input.MouseButtonEventArgs e)
        {
            if (string.IsNullOrEmpty(tbUid.Text))
            {
                snackbarService?.Show("Notice", "Please enter the uid", ControlAppearance.Caution, null, TimeSpan.FromSeconds(3));
                return;
            }
            // Copy the request URL so the user can open it in the browser and save the JSON manually.
            Clipboard.SetText($"https://live.kuaishou.com/live_api/profile/public?count=9999&pcursor=&principalId={tbUid.Text}&hasMore=true");
            snackbarService?.Show("Notice", "Copied. Paste it into your browser to open it.", ControlAppearance.Success, null, TimeSpan.FromSeconds(3));
        }

        private void Info_Click(object sender, RoutedEventArgs e)
        {
            flyout.IsOpen = true;
        }
    }
}
using Downloader;
using System.Diagnostics;
using System.IO;

// The class wrapper is inferred from the DownLoadHelper.Download(...) calls in MainWindow.
public static class DownLoadHelper
{
    public static async Task Download(List<string> urls, DateTime dateTime, string downloadFolder, string fileNamePrefix)
    {
        string file = string.Empty;
        try
        {
            var downloader = new DownloadService();
            foreach (var url in urls)
            {
                Uri uri = new Uri(url);
                file = Path.Combine(downloadFolder, fileNamePrefix + Path.GetFileName(uri.LocalPath));
                if (!File.Exists(file))
                    await downloader.DownloadFileTaskAsync(url, file);

                // Set the file timestamps to the time the post was published.
                File.SetCreationTime(file, dateTime);
                File.SetLastWriteTime(file, dateTime);
                File.SetLastAccessTime(file, dateTime);
            }
        }
        catch
        {
            // Record the failed file. Appending directly avoids registering
            // a new trace listener on every failure.
            Debug.WriteLine(file);
            File.AppendAllText(Path.Combine(downloadFolder, "_FailedFiles.txt"), file + Environment.NewLine);
        }
    }
}
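The date handling in MainWindow's Download method can be isolated into a small helper. This is a sketch of an alternative rather than the author's code: it uses DateTime.TryParseExact, so an impossible date (for example month 13) yields false instead of throwing. PosterTimestamp is a hypothetical name, and the URL in the comment is made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PosterTimestamp
{
    // Same pattern as in the app: yyyy/MM/dd/HH somewhere in the URL.
    private static readonly Regex Pattern = new Regex(@"\d{4}/\d{2}/\d{2}/\d{2}");

    // Example input (hypothetical): https://example.com/upic/2024/08/15/13/abc.jpg
    public static bool TryParse(string posterUrl, out DateTime publishedAt)
    {
        publishedAt = default;
        Match match = Pattern.Match(posterUrl);
        if (!match.Success)
            return false;

        // The slashes are quoted so they are matched literally, whatever the culture.
        return DateTime.TryParseExact(
            match.Value,
            "yyyy'/'MM'/'dd'/'HH",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out publishedAt);
    }
}
```

A caller could fall back to DateTime.Now when TryParse returns false, which matches what the app does when the pattern is not found.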
Open https://github.com/hupo376787/KuaishouDownloader/releases/tag/1.0 , download the zip file, and unzip it. You can then use the tool as shown at the beginning.