Command Line
The Pider framework ships with several commands for different purposes, and each one accepts its own set of arguments and options.
Many default behaviors in the Pider framework are controlled by configuration. Pider uses PHP array-style configuration files:
- src/Config/config.php (framework wide)
- Config/config.php (inside a Pider project's root)
Settings from these files are merged in the order listed, so the project configuration in Config/config.php takes priority over the framework-wide configuration (src/Config/config.php). The framework configuration is recommended to hold only default behaviors and should not be modified unnecessarily, while the project configuration can be adjusted to each project's requirements.
You can start by running the pider tool with no arguments; it prints usage help and the available commands.
[root@41f16764df90 pider]# ./pider
.______ __ _______ _______ .______
| _ \ | | | \ | ____|| _ \
| |_) | | | | .--. || |__ | |_) |
| ___/ | | | | | || __| | /
| | | | | '--' || |____ | |\ \----.
| _| |__| |_______/ |_______|| _| `._____|
Usage:
./pider [command]
Description:
Project tools for pider
Available commands:
help
list
crawl
runspider
rundigest
checkurl
If you don't know or remember how to use a command, run it with only the help option to see its detailed usage (./pider list --help or ./pider crawl --help).
[root@41f16764df90 pider]# ./pider list --help
Description:
list all availabe spiders
Usage:
list
Options:
-h, --help Display this help message
-q, --quiet Do not output any message
-V, --version Display this application version
--ansi Force ANSI output
--no-ansi Disable ANSI output
-n, --no-interaction Do not ask any interactive question
-v|vv|vvv, --verbose Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug
[root@41f16764df90 pider]# ./pider crawl --help
Description:
crawl urls supplied
Usage:
crawl [options] [--] [<url>]
Arguments:
url url to crawled
Options:
-f, --file=FILE file contains urls to be crawled
-s, --spider[=SPIDER] spider be appointed
-t, --filetype[=FILETYPE] filetype specified, defaults: txt
-a, --attach[=ATTACH] data will be attached to request,json format
-l, --loglevel[=LOGLEVEL] log which matches level option will output
-h, --help Display this help message
-q, --quiet Do not output any message
-V, --version Display this application version
--ansi Force ANSI output
--no-ansi Disable ANSI output
-n, --no-interaction Do not ask any interactive question
-v|vv|vvv, --verbose Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug
By default, spiders are located in the spiders directory under your project root; you can change this default in the configuration.
[root@41f16764df90 pider]# ls -la
lrwxr-xr-x 1 root root 15 Sep 12 12:07 console -> src/bin/console
drwxr-xr-x 3 root root 96 Sep 11 09:41 doc
-rwxr-xr-x 1 root root 11902 Sep 11 11:56 install.sh
lrwxrwxrwx 1 root root 14 Sep 13 11:19 pider -> src/bin/pider2
lrwxrwxrwx 1 root root 14 Sep 12 07:24 piderd -> src/bin/piderd
drwxr-xr-x 3 root root 96 Sep 11 11:51 setup
drwxr-xr-x 4 root root 128 Sep 27 08:29 spiders
drwxr-xr-x 27 root root 864 Sep 21 06:36 src
[root@41f16764df90 pider]# ls -la spiders/
total 4
drwxr-xr-x 4 root root 128 Sep 27 08:29 .
drwxr-xr-x 29 root root 928 Sep 28 10:27 ..
-rw-r--r-- 1 root root 0 Sep 11 09:41 .spdierignores
-rw-r--r-- 1 root root 558 Sep 27 08:29 ExampleSpider.php
[root@41f16764df90 pider]# ./pider list
All Available Spiders:
* ExampleSpider
- with just a spider name
  ./pider runspider ExampleSpider
- with specified spider path
  ./pider runspider spiders/ExampleSpider.php
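The two invocation forms above can be tied together in a small helper script. The sketch below is only an illustration: the function names `spider_name_from_path` and `run_all_spiders` are hypothetical, and it assumes that a spider's name is its file name without the `.php` extension, as the ExampleSpider listing suggests.

```shell
#!/bin/sh
# Derive a spider name from its file path.
# Assumption: the name equals the file's basename without ".php".
spider_name_from_path() {
    basename "$1" .php
}

# Run every spider file in a directory (default: spiders/) by path.
run_all_spiders() {
    dir="${1:-spiders}"
    for file in "$dir"/*.php; do
        [ -e "$file" ] || continue   # skip when the glob matches nothing
        echo "running $(spider_name_from_path "$file")"
        ./pider runspider "$file"
    done
}
```

After sourcing the script from the project root, `run_all_spiders spiders` would call `./pider runspider` once for each PHP file in the directory.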
[root@41f16764df90 pider]# ./pider checkurl http://www.example.com
URL:
http://www.example.com
Available spiders:
* ExampleSpider
./pider crawl http://www.example.com
./pider crawl -f /path/to/file
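To try the file-based form, a URL file can be prepared first. This is a minimal sketch that assumes the default txt file type means one URL per line, which the help output does not confirm; the helper `write_url_file`, the temporary file name, and the second URL are hypothetical.

```shell
#!/bin/sh
# Write the given URLs to a text file, one per line (assumed file format).
write_url_file() {
    out="$1"
    shift
    printf '%s\n' "$@" > "$out"
}

url_file="${TMPDIR:-/tmp}/pider-urls.txt"
write_url_file "$url_file" "http://www.example.com" "http://www.example.com/about"

# Only invoke the tool when it is present in the current directory.
if [ -x ./pider ]; then
    ./pider crawl -f "$url_file"
fi
```

Keeping the URL list in a separate file makes it easy to reuse the same set of URLs across runs.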