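Today my boss gave me a task: write a small tool that queries Baidu search results and extracts the title of each result.
My Python is too weak, so I wrote it in PHP instead. Enough talk, let's get to it.
First, we need QueryList, a scraping library for PHP.
Then download the package. My local environment runs PHP 5.3, so I used QueryList v3.
Once the package is loaded, we can go straight to the code: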
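<?php
// QueryList v3 is loaded manually; it depends on phpQuery.
require 'vendor/phpQuery.php';
require 'vendor/QueryList.php';
use QL\QueryList;

if (!empty($_POST['keyword'])) {
    // Cast the page range to integers; page numbers are treated as 1-based.
    $min = isset($_POST['min']) ? max(1, (int) $_POST['min']) : 1;
    $max = isset($_POST['max']) ? max($min, (int) $_POST['max']) : $min;
    for ($page = $min; $page <= $max; $page++) {
        // Baidu's pn parameter is a result offset: 0 for page 1, 10 for page 2, and so on.
        $url = 'http://www.baidu.com/s?wd=' . urlencode($_POST['keyword']) . '&pn=' . (($page - 1) * 10);
        // array() instead of [] because short array syntax needs PHP 5.4+.
        $data = QueryList::Query($url, array('title' => array('h3', 'text')))->data;
        foreach ($data as $v) {
            // Escape the scraped text before echoing it into the page.
            echo htmlspecialchars($v['title'], ENT_QUOTES, 'UTF-8');
            echo '<br>';
        }
    }
    exit;
}
?>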
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Baidu Search Result Scraper</title>
<style>
html, body, h1, form, fieldset, legend, ol ,li{
padding:0;
margin:0;
}
ol{
list-style:none;
}
body{
background:#fff;
color:#111;
padding:20px;
}
form#payment{
background:#9cbc2c;
-webkit-border-radius:5px;
border-radius:5px;
padding:20px;
width:400px;
}
form#payment fieldset{
border:none;
margin-bottom:10px;
}
form#payment fieldset:last-of-type{ margin-bottom:0; }
form#payment legend{
color:#384313;
font-size:16px;
font-weight:bold;
padding-bottom:10px;
text-shadow:0px 1px 1px #c0d576;
}
form#payment > fieldset>legend:before{
content:"Step " counter(fieldsets) ":";
counter-increment:fieldsets;
}
form#payment fieldset fieldset legend{
color:#111;
font-size:13px;
font-weight:normal;
padding-bottom:0;
}
form#payment ol li{
background:#b9cf6a;
background:rgba(255, 255, 255, 0.3);
border:#e3ebc3;
border-color:rgba(255, 255, 255, 0.6);
border-style:solid;
border-width:2px;
-webkit-border-radius:5px;
border-radius:5px;
line-height:30px;
padding:5px 10px;
margin-bottom:2px;
}
form#payment ol ol li{
background:none;
border:none;
float:left;
}
form#payment label{
float:left;
font-size:13px;
width:110px;
}
form#payment fieldset fieldset label{
background:none no-repeat left 50%;
line-height:20px;
padding:0 0 0 30px;
width:auto;
}
form#payment fieldset fieldset label:hover{cursor:pointer;}
form#payment input:not([type=radio]), form#payment textarea{
background:#fff;
border:#fc3 solid 1px;
-webkit-border-radius:3px;
border-radius:3px;
outline:none;
padding:5px;
}
</style>
</head>
<body>
<form id=payment action="" method="POST">
<fieldset>
<legend>Baidu Search Result Scraper</legend>
<ol>
<li>
<label for="name">Keyword:</label>
<input type="text" id="name" name="keyword" placeholder="Enter the keyword to search for" required autofocus>
</li>
<li>
<label for="min">Start page:</label>
<input type="number" min="1" placeholder="Enter the first page to crawl" id="min" name="min">
</li>
<li>
<label for="max">End page:</label>
<input type="number" min="1" placeholder="Enter the last page to crawl" id="max" name="max">
</li>
</ol>
</fieldset>
<fieldset>
<button type="submit">Submit</button>
</fieldset>
</form>
</body>
</html>
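The paging logic can also be isolated into a small function, which makes it easier to check by hand. The following is a minimal sketch, not part of QueryList: `buildBaiduUrls` is a hypothetical helper name, and it assumes page numbers start at 1 and that Baidu's `pn` parameter is a zero-based result offset in steps of 10.

```php
<?php
// Hypothetical helper (not a QueryList function): build one Baidu result URL per page.
// Assumption: pages are 1-based, and pn is a zero-based result offset (10 results per page).
function buildBaiduUrls($keyword, $minPage, $maxPage)
{
    // Clamp the range so that it always contains at least one valid page.
    $minPage = max(1, (int) $minPage);
    $maxPage = max($minPage, (int) $maxPage);

    $urls = array(); // array() keeps the sketch compatible with PHP 5.3
    for ($page = $minPage; $page <= $maxPage; $page++) {
        $urls[] = 'http://www.baidu.com/s?wd=' . urlencode($keyword)
                . '&pn=' . (($page - 1) * 10);
    }
    return $urls;
}

// Usage: list the URLs for pages 1 and 2 of a search for "php".
foreach (buildBaiduUrls('php', 1, 2) as $url) {
    echo $url, "\n";
}
```

Each returned URL can then be handed to `QueryList::Query` in place of the hand-built string.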
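The listing above also includes the front-end code. PS: I won't tell you that's purely to make it look nice.
The install package and source files are attached below.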
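If QueryList isn't at hand, the extraction step (collecting the text of every `h3`) can be approximated with PHP's built-in DOM extension. This is a rough sketch under that assumption, not a description of how QueryList works internally; `extractH3Titles` is a made-up helper name and the sample markup is invented for illustration.

```php
<?php
// Hypothetical helper: return the trimmed text of every <h3> in an HTML string.
function extractH3Titles($html)
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world pages are rarely valid HTML.
    $previous = libxml_use_internal_errors(true);
    // Common workaround: an XML prolog hints UTF-8 to libxml so non-ASCII titles survive.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = array();
    foreach ($doc->getElementsByTagName('h3') as $node) {
        // textContent flattens nested tags such as <a> and <em> into plain text.
        $titles[] = trim($node->textContent);
    }
    return $titles;
}

// Invented sample markup, loosely shaped like a search result list.
$sample = '<div><h3><a href="#">First <em>result</em></a></h3><h3> Second result </h3></div>';
print_r(extractH3Titles($sample));
```

For the sample markup this yields the two heading texts, "First result" and "Second result".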