Python 3 crawler: logging in to Zhihu

Crawler 2023-01-31 00:01:09 772 views 薄情痞子

Python file:

from fake_useragent import UserAgent
import requests
from http import cookiejar
import base64
from PIL import Image
import time, json
import hashlib, hmac
import execjs
from urllib import parse

ua = UserAgent()


class MyException(Exception):
    def __init__(self, status, msg):
        self.status = status
        self.msg = msg


class ZhiHu:

    def __init__(self, username=None, password=None):
        self.username = username
        self.password = password
        self.session = requests.Session()
        self.session.headers = {
            "user-agent": ua.random,
            "referer": "https://www.zhihu.com/",
            'host': 'www.zhihu.com',
        }

        self.session.cookies = cookiejar.LWPCookieJar(filename="./cookies.txt")

        self.login_param = {
            "client_id": "c3cef7c66a1843f8b3a9e6a1e3160e20",
            "grant_type": "password",
            "source": "com.zhihu.WEB",
            "username": "",
            "password": "",
            "ref_source": "homepage",
            "utm_source": "baidu",

        }

    def load_cookies(self):
        '''Load cookies from the file into the session.'''
        try:
            self.session.cookies.load(ignore_discard=True, ignore_expires=True)
            return True
        except FileNotFoundError:
            return False

    def login(self, captcha_lang: str = "en", is_load_cookies: bool = True):
        '''
        Log in to Zhihu.
        :param captcha_lang:  captcha type: "en" is a text captcha, "zh" is clicking the upside-down Chinese characters
        :param is_load_cookies:  whether to try the saved cookies first
        :return: True if logged in, otherwise False
        '''

        if is_load_cookies and self.load_cookies():
            # Try the saved cookies first
            print("Loaded cookies from file")
            if self.check__login():
                print("Login succeeded")
                return True
            print("Cookies have expired")

        # Reaching this point means we are not logged in, so log in now

        # Make sure a username and password have been entered
        self.check_user_input()

        # Get the xsrf value and set the request headers
        headers = self.session.headers.copy()
        xsrf = self.get_xsrf()
        headers.update({
            "content-type": "application/x-www-form-urlencoded",
            "x-xsrftoken": xsrf,
            "x-zse-83": "3_1.1",
        })

        self.login_param.update({
            "username": self.username,
            "password": self.password,
            "lang": captcha_lang
        })

        # Build the form data
        timestamp = int(time.time() * 1000)
        self.login_param.update({
            "timestamp": timestamp,
            "captcha": self.get_captcha() or "",
            "signature": self.get_signature(timestamp)
        })

        formdata = self.__encrypt(self.login_param)

        url = "https://www.zhihu.com/api/v3/oauth/sign_in"

        # Send the login request
        self.session.post(url=url, headers=headers, data=formdata)
        if self.check__login():
            # ignore_discard=True also saves session cookies, matching load_cookies()
            self.session.cookies.save(ignore_discard=True, ignore_expires=True)
            print("Cookies written to file")
            print("Login succeeded")
            return True
        print("Login failed")
        return False

    def check__login(self):
        '''Check whether we are already logged in.'''
        url = "https://www.zhihu.com/"
        response = self.session.get(url=url, allow_redirects=False)
        # 200 is treated as logged in; a 302 redirect (or anything else) as not logged in
        return response.status_code == 200

    def check_user_input(self):
        if not self.username:
            self.username = input("Enter phone number>>:").strip()
        if self.username.isdigit() and not self.username.startswith("+86"):
            self.username = "+86" + self.username

        if not self.password:
            self.password = input("Enter password>>:").strip()

    def get_captcha(self):
        '''Fetch the captcha. At least one request is always made; the methods are called in the order GET, PUT, POST.'''
        lang = self.login_param.get("lang")
        if lang == "en":
            captcha_api = "https://www.zhihu.com/api/v3/oauth/captcha?lang=en"
        else:
            captcha_api = "https://www.zhihu.com/api/v3/oauth/captcha?lang=cn"
        response = self.session.get(captcha_api)
        is_use_verify = response.json().get("show_captcha", False)
        if is_use_verify:
            # A captcha is required; the remaining requests are PUT, then POST
            # First fetch the captcha image as base64
            response = self.session.put(captcha_api)
            base64_img = response.json()['img_base64'].replace(r'\n', '')
            with open("./captcha.png", "wb") as f:
                f.write(base64.b64decode(base64_img))
            img = Image.open("./captcha.png")
            if lang == "en":
                img.show()
                code = input("Enter the captcha shown in the image>>:").strip()
            else:
                import matplotlib.pyplot as plt
                plt.imshow(img)
                print('Click all upside-down Chinese characters, then press Enter in the terminal to submit')
                points = plt.ginput(7)
                code = json.dumps({'img_size': [200, 44],
                                   'input_points': [[i[0] / 2, i[1] / 2] for i in points]})

            self.session.post(captcha_api, data={"input_text": code}, headers={"user-agent": ua.random, })
            return code

    def get_no_captch(self):
        '''Poll the captcha API until it reports that no captcha is needed, so login can proceed without one.'''
        lang = self.login_param.get("lang")
        if lang == "en":
            captcha_api = "https://www.zhihu.com/api/v3/oauth/captcha?lang=en"
        else:
            captcha_api = "https://www.zhihu.com/api/v3/oauth/captcha?lang=cn"
        while True:
            print("Requesting captcha status....")
            time.sleep(0.5)
            response = self.session.get(captcha_api)
            # show_captcha is a JSON boolean, so test it directly;
            # str(False) is 'False', which never equals 'false'
            if not response.json().get("show_captcha"):
                return ""
            print("Retrying...")

    def get_signature(self, timestamp):
        '''Compute the signature value.'''
        ha = hmac.new(key=b"d1b964811afb40118a12068ff74a12f4", digestmod=hashlib.sha1)
        client_id = self.login_param.get("client_id")
        grant_type = self.login_param.get("grant_type")
        source = self.login_param.get("source")
        ha.update(bytes(grant_type + client_id + source + str(timestamp), encoding="utf-8"))
        return ha.hexdigest()

    def get_xsrf(self):
        url = "https://www.zhihu.com/signin"
        response = self.session.get(url=url, headers=self.session.headers, allow_redirects=False)
        _xsrf = response.cookies.get("_xsrf")
        return _xsrf

    def __encrypt(self, data: dict):
        data = parse.urlencode(data)
        with open("./01.js", "r") as f:
            js_code = f.read()
        ctx = execjs.compile(js_code)
        res = ctx.call("Q", data)
        return res


if __name__ == '__main__':
    zhihu = ZhiHu()
    zhihu.login()
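
The `signature` field does not depend on the rest of the login flow, so it can be checked in isolation. The following is a minimal sketch, assuming the same key and the same concatenation order that `get_signature` uses above. The name `make_signature` and the module-level constants are only for illustration, and the copied values may no longer be accepted by Zhihu.

```python
import hashlib
import hmac
import time

# Values copied from the ZhiHu class above; they may be outdated.
HMAC_KEY = b"d1b964811afb40118a12068ff74a12f4"
CLIENT_ID = "c3cef7c66a1843f8b3a9e6a1e3160e20"
GRANT_TYPE = "password"
SOURCE = "com.zhihu.WEB"


def make_signature(timestamp_ms: int) -> str:
    """HMAC-SHA1 over grant_type + client_id + source + timestamp (hypothetical helper)."""
    message = f"{GRANT_TYPE}{CLIENT_ID}{SOURCE}{timestamp_ms}".encode("utf-8")
    return hmac.new(HMAC_KEY, message, hashlib.sha1).hexdigest()


if __name__ == "__main__":
    ts = int(time.time() * 1000)  # millisecond timestamp, as in login()
    print(ts, make_signature(ts))
```

The same timestamp has to be sent in the form data, since the server presumably recomputes the digest from the submitted fields.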

 

JavaScript file (the Python code loads it from ./01.js):

window = {
    "encodeURIComponent": encodeURIComponent
}
navigator = {
    "userAgent": "5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
}

function s(e) {
    return (s = "function" == typeof Symbol && "symbol" == typeof Symbol.t ? function (e) {
                return typeof e
            }
            : function (e) {
                return e && "function" == typeof Symbol && e.constructor === Symbol && e !== Symbol.prototype ? "symbol" : typeof e
            }
    )(e)
}

var t = "1.1"
    , __g = {};

function i() {
}

function h(e) {
    this.s = (2048 & e) >> 11,
        this.i = (1536 & e) >> 9,
        this.h = 511 & e,
        this.A = 511 & e
}

function A(e) {
    this.i = (3072 & e) >> 10,
        this.A = 1023 & e
}

function n(e) {
    this.n = (3072 & e) >> 10,
        this.e = (768 & e) >> 8,
        this.a = (192 & e) >> 6,
        this.s = 63 & e
}

function e(e) {
    this.i = e >> 10 & 3,
        this.h = 1023 & e
}

function a() {
}

function c(e) {
    this.n = (3072 & e) >> 10,
        this.e = (768 & e) >> 8,
        this.a = (192 & e) >> 6,
        this.s = 63 & e
}

function o(e) {
    this.A = (4095 & e) >> 2,
        this.s = 3 & e
}

function r(e) {
    this.i = e >> 10 & 3,
        this.h = e >> 2 & 255,
        this.s = 3 & e
}

function k(e) {
    this.s = (4095 & e) >> 10,
        this.i = (1023 & e) >> 8,
        this.h = 1023 & e,
        this.A = 63 & e
}

function B(e) {
    this.s = (4095 & e) >> 10,
        this.n = (1023 & e) >> 8,
        this.e = (255 & e) >> 6
}

function f(e) {
    this.i = (3072 & e) >> 10,
        this.A = 1023 & e
}

function u(e) {
    this.A = 4095 & e
}

function C(e) {
    this.i = (3072 & e) >> 10
}

function b(e) {
    this.A = 4095 & e
}

function g(e) {
    this.s = (3840 & e) >> 8,
        this.i = (192 & e) >> 6,
        this.h = 63 & e
}

function G() {
    this.c = [0, 0, 0, 0],
        this.o = 0,
        this.r = [],
        this.k = [],
        this.B = [],
        this.f = [],
        this.u = [],
        this.C = !1,
        this.b = [],
        this.g = [],
        this.G = !1,
        this.Q = null,
        this.R = null,
        this.w = [],
        this.x = 0,
        this.D = {
            0: i,
            1: h,
            2: A,
            3: n,
            4: e,
            5: a,
            6: c,
            7: o,
            8: r,
            9: k,
            10: B,
            11: f,
            12: u,
            13: C,
            14: b,
            15: g
        }
}

i.prototype.M = function (e) {
    e.G = !1
}
    ,
    h.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                e.c[this.i] = this.h;
                break;
            case 1:
                e.c[this.i] = e.k[this.A]
        }
    }
    ,
    A.prototype.M = function (e) {
        e.k[this.A] = e.c[this.i]
    }
    ,
    n.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                e.c[this.n] = e.c[this.e] + e.c[this.a];
                break;
            case 1:
                e.c[this.n] = e.c[this.e] - e.c[this.a];
                break;
            case 2:
                e.c[this.n] = e.c[this.e] * e.c[this.a];
                break;
            case 3:
                e.c[this.n] = e.c[this.e] / e.c[this.a];
                break;
            case 4:
                e.c[this.n] = e.c[this.e] % e.c[this.a];
                break;
            case 5:
                e.c[this.n] = e.c[this.e] == e.c[this.a];
                break;
            case 6:
                e.c[this.n] = e.c[this.e] >= e.c[this.a];
                break;
            case 7:
                e.c[this.n] = e.c[this.e] || e.c[this.a];
                break;
            case 8:
                e.c[this.n] = e.c[this.e] && e.c[this.a];
                break;
            case 9:
                e.c[this.n] = e.c[this.e] !== e.c[this.a];
                break;
            case 10:
                e.c[this.n] = s(e.c[this.e]);
                break;
            case 11:
                e.c[this.n] = e.c[this.e] in e.c[this.a];
                break;
            case 12:
                e.c[this.n] = e.c[this.e] > e.c[this.a];
                break;
            case 13:
                e.c[this.n] = -e.c[this.e];
                break;
            case 14:
                e.c[this.n] = e.c[this.e] < e.c[this.a];
                break;
            case 15:
                e.c[this.n] = e.c[this.e] & e.c[this.a];
                break;
            case 16:
                e.c[this.n] = e.c[this.e] ^ e.c[this.a];
                break;
            case 17:
                e.c[this.n] = e.c[this.e] << e.c[this.a];
                break;
            case 18:
                e.c[this.n] = e.c[this.e] >>> e.c[this.a];
                break;
            case 19:
                e.c[this.n] = e.c[this.e] | e.c[this.a]
        }
    }
    ,
    e.prototype.M = function (e) {
        e.r.push(e.o),
            e.B.push(e.k),
            e.o = e.c[this.i],
            e.k = [];
        for (var t = 0; t < this.h; t++)
            e.k.unshift(e.f.pop());
        e.u.push(e.f),
            e.f = []
    }
    ,
    a.prototype.M = function (e) {
        e.o = e.r.pop(),
            e.k = e.B.pop(),
            e.f = e.u.pop()
    }
    ,
    c.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                e.C = e.c[this.n] >= e.c[this.e];
                break;
            case 1:
                e.C = e.c[this.n] <= e.c[this.e];
                break;
            case 2:
                e.C = e.c[this.n] > e.c[this.e];
                break;
            case 3:
                e.C = e.c[this.n] < e.c[this.e];
                break;
            case 4:
                e.C = e.c[this.n] == e.c[this.e];
                break;
            case 5:
                e.C = e.c[this.n] != e.c[this.e];
                break;
            case 6:
                e.C = e.c[this.n];
                break;
            case 7:
                e.C = !e.c[this.n]
        }
    }
    ,
    o.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                e.o = this.A;
                break;
            case 1:
                e.C && (e.o = this.A);
                break;
            case 2:
                e.C || (e.o = this.A);
                break;
            case 3:
                e.o = this.A,
                    e.Q = null
        }
        e.C = !1
    }
    ,
    r.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                for (var t = [], n = 0; n < this.h; n++)
                    t.unshift(e.f.pop());
                e.c[3] = e.c[this.i](t[0], t[1]);
                break;
            case 1:
                for (var r = e.f.pop(), o = [], i = 0; i < this.h; i++)
                    o.unshift(e.f.pop());
                e.c[3] = e.c[this.i][r](o[0], o[1]);
                break;
            case 2:
                for (var a = [], c = 0; c < this.h; c++)
                    a.unshift(e.f.pop());
                e.c[3] = new e.c[this.i](a[0], a[1])
        }
    }
    ,
    k.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                e.f.push(e.c[this.i]);
                break;
            case 1:
                e.f.push(this.h);
                break;
            case 2:
                e.f.push(e.k[this.A]);
                break;
            case 3:
                e.f.push(e.g[this.A])
        }
    }
    ,
    B.prototype.M = function (t) {
        switch (this.s) {
            case 0:
                var s = t.f.pop();
                t.c[this.n] = t.c[this.e][s];
                break;
            case 1:
                var i = t.f.pop()
                    , h = t.f.pop();
                t.c[this.e][i] = h;
                break;
            case 2:
                var A = t.f.pop();
                t.c[this.n] = eval(A)
        }
    }
    ,
    f.prototype.M = function (e) {
        e.c[this.i] = e.g[this.A]
    }
    ,
    u.prototype.M = function (e) {
        e.Q = this.A
    }
    ,
    C.prototype.M = function (e) {
        throw e.c[this.i]
    }
    ,
    b.prototype.M = function (e) {
        var t = this
            , n = [0];
        e.k.forEach(function (e) {
            n.push(e)
        });
        var r = function (r) {
            var o = new G;
            return o.k = n,
                o.k[0] = r,
                o.J(e.b, t.A, e.g, e.w),
                o.c[3]
        };
        r.toString = function () {
            return "() { [native code] }"
        }
            ,
            e.c[3] = r
    }
    ,
    g.prototype.M = function (e) {
        switch (this.s) {
            case 0:
                for (var t = {}, n = 0; n < this.h; n++) {
                    var r = e.f.pop();
                    t[e.f.pop()] = r
                }
                e.c[this.i] = t;
                break;
            case 1:
                for (var o = [], i = 0; i < this.h; i++)
                    o.unshift(e.f.pop());
                e.c[this.i] = o
        }
    }
    ,
    G.prototype.v = function (e) {
        for (var t = new Buffer(e, "base64").toString("binary"), n = [], r = 0; r < t.length - 1; r += 2)
            n.push(t.charCodeAt(r) << 8 | t.charCodeAt(r + 1));
        this.b = n
    }
    ,
    G.prototype.y = function (e) {
        for (var t = new Buffer(e, "base64").toString("binary"), n = 66, r = [], o = 0; o < t.length; o++) {
            var i = 24 ^ t.charCodeAt(o) ^ n;
            r.push(String.fromCharCode(i)),
                n = i
        }
        return r.join("")
    }
    ,
    G.prototype.F = function (e) {
        var t = this;
        this.g = e.map(function (e) {
            return "string" == typeof e ? t.y(e) : e
        })
    }
    ,
    G.prototype.J = function (e, t, n) {
        for (t = t || 0,
                 n = n || [],
                 this.o = t,
                 "string" == typeof e ? (this.F(n),
                     this.v(e)) : (this.b = e,
                     this.g = n),
                 this.G = !0,
                 this.x = Date.now(); this.G;) {
            var r = this.b[this.o++];
            if ("number" != typeof r)
                break;
            var o = Date.now();
            if (500 < o - this.x)
                return;
            this.x = o;
            try {
                this.M(r)
            } catch (e) {
                if (this.R = e,
                    !this.Q)
                    throw "execption at " + this.o + ": " + e;
                this.o = this.Q
            }
        }
    }
    ,
    G.prototype.M = function (e) {
        var t = (61440 & e) >> 12;
        new this.D[t](e).M(this)
    }
    ,
1 && (new G).J("4AeTAJwAqACcAaQAAAAYAJAAnAKoAJwDgAWTACwAnAKoACACGAESOTRHkQAkAbAEIAMYAJwFoAASAzREJAQYBBIBNEVkBnCiGAC0BjRAJAAYBBICNEVkBnDGGAC0BzRAJACwCJAAnAmoAJwKoACcC4ABnAyMBRAAMwZgBnESsA0aADRAkQAkABGCnA6gABoCnA+hQDRHGAKcEKAAMQdgBnFasBEaADRAkQAkABgCnBKgABoCnBOhQDRHZAZxkrAUGgA0QJEAJAAYApwVoABgBnG6sBYaADRAkQAkABgCnBegAGAGceKwGBoANECRACQAnAmoAJwZoABgBNIOsBoaADRAkQAkABgCnBugABoCnByhQDRHZAZyRrAdGgA0QJEAJAAQACAFsB4gBhgAnAWgABIBNEEkBxgHEgA0RmQGdJoQCBoFFAE5gCgFFAQ5hDSCJAgYB5AAGACcH4AFGAEaCDRSEP8xDzMQIAkQCBoFFAE5gCgFFAQ5hDSCkQAkCBgBGgg0UhD/MQ+QACAIGAkaBxQBOYGSABoAnB+EBRoIN1AUCDmRNJMkCRAIGgUUATmAKAUUBDmENIKRACQIGAEaCDRSEP8xD5AAIAgYCRoHFAI5gZIAGgCcH4QFGgg3UBQQOZE0kyQJGAMaCRQ/OY+SABoGnCCEBTTAJAMYAxoJFAY5khI/Nk+RABoGnCCEBTTAJAMYAxoJFAw5khI/Nk+RABoGnCCEBTTAJAMYAxoJFBI5khI/Nk+RABoGnCCEBTTAJAMYBxIDNEEkB3JsHgNQAA==", 0, ["BRgg", "BSITFQkTERw=", "LQYfEhMA", "PxMVFBMZKB8DEjQaBQcZExMC", "", "NhETEQsE", "Whg=", "Wg==", "MhUcHRARDhg=", "NBcPBxYeDQMF", "Lx4ODys+GhMC", "LgM7OwAKDyk6Cg4=", "Mx8SGQUvMQ==", "SA==", "ORoVGCQgERcCAxo=", "BTcAERcCAxo=", "BRg3ABEXAgMaFAo=", "SQ==", "OA8LGBsP", "GC8LGBsP", "Tg==", "PxAcBQ==", "Tw==", "KRsJDgE=", "TA==", "LQofHg4DBwsP", "TQ==", "PhMaNCwZAxoUDQUeGQ==", "PhMaNCwZAxoUDQUeGTU0GQIeBRsYEQ8=", "Qg==", "BWpUGxkfGRsZFxkbGR8ZGxkHGRsZHxkbGRcZG1MbGR8ZGxkXGRFpGxkfGRsZFxkbGR8ZGxkHGRsZHxkbGRcZGw==", "ORMRCyk0Exk8LQ==", "ORMRCyst"]);
var Q = function (e) {
    return __g._encrypt(e)
};
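
The call to `J` in the JavaScript file starts a small virtual machine with a base64 program and a table of obfuscated string constants. How these two inputs are unpacked follows from `G.prototype.v`, `G.prototype.y` and `G.prototype.M`, and can be sketched in Python. The function names below are hypothetical, and the sketch only decodes data; it does not execute the bytecode.

```python
import base64
from typing import List


def decode_vm_string(encoded: str) -> str:
    """Mirror G.prototype.y: each output char is 24 ^ byte ^ previous output (seed 66)."""
    prev = 66
    chars = []
    for byte in base64.b64decode(encoded):
        code = 24 ^ byte ^ prev
        chars.append(chr(code))
        prev = code
    return "".join(chars)


def decode_vm_words(encoded: str) -> List[int]:
    """Mirror G.prototype.v: pair bytes into big-endian 16-bit instruction words."""
    raw = base64.b64decode(encoded)
    return [raw[i] << 8 | raw[i + 1] for i in range(0, len(raw) - 1, 2)]


def opcode(word: int) -> int:
    """Mirror G.prototype.M: the top 4 bits select one of the 16 handlers in G.D."""
    return (word & 0xF000) >> 12


if __name__ == "__main__":
    # First two entries of the constant table passed to J().
    print(decode_vm_string("BRgg"))          # __g
    print(decode_vm_string("BSITFQkTERw="))  # _encrypt
    # First eight base64 characters of the program.
    words = decode_vm_words("4AeTAJwA")
    print([hex(w) for w in words], [opcode(w) for w in words])
```

The first two constants decode to `__g` and `_encrypt`, which matches the `__g._encrypt(e)` call inside `Q`.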

 

This is based on the blog of this author: https://home.cnblogs.com/u/zkqiang

 

Article link: https://www.lsjlt.com/news/182058.html (please include the source link when reposting)
