improvements

* option to continue on URL mismatch for redirectTab
* support empty field selectors
* add Extractor.results()
* add ExtractResult.walk(), ExtractResult.visit()
* add ! directive to click elements
* code optimization
2021-04-20 14:20:05 +08:00
parent 108ebb835f
commit e87e7010ec
7 changed files with 146 additions and 28 deletions

@@ -17,6 +17,8 @@ Extract current page
```js
$('.item', ['a', 'a@href']);
new Extractor().task('.item', ['a', 'a@href']).start();
// fieldSelectors can be empty strings if items have no child to select
new Extractor().task('.item a', ['', '@href']).start();
```
> `$(...args)` is the short form of `new Extractor().task(...args).start();`, which is introduced later.
@@ -67,7 +69,9 @@ job = new Extractor().task('.search-list-item', ['a@href'], ["http://sample.com/
job.stop();
```
> The next time you call `job.start();`, the job will continue from where it stopped.
## Extract Attributes
e.g.: link text and target (use 'selector@attribute')
@@ -75,6 +79,14 @@ e.g.: link text and target (use 'selector@attribute')
new Extractor().task('.item', ['a', 'a@href']).start();
```
## Click Selected Elements
The following clicks the selected links and extracts each link's `text` and `href`:
```js
new Extractor().task('.item', ['!a', 'a@href']).start();
```
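To make the field selector forms used so far easier to compare (`'a'`, `'a@href'`, `''`, `'@href'`, `'!a'`), here is a minimal sketch of how such a string could be split into its parts. `parseFieldSelector` is a hypothetical helper written for illustration only; it is not part of the Extractor API and may not match how the library parses selectors internally. It assumes the attribute name follows the last `@` and that a leading `!` is the click directive.

```javascript
// Hypothetical helper for illustration; NOT part of the Extractor API.
// Splits a field selector such as '!a', 'a@href', '@href' or '' into parts.
function parseFieldSelector(field) {
  // Assumption: a leading '!' marks the element as one to click.
  const click = field.startsWith('!');
  const rest = click ? field.slice(1) : field;
  // Assumption: everything after the last '@' is the attribute name.
  const at = rest.lastIndexOf('@');
  const selector = at === -1 ? rest : rest.slice(0, at);
  const attribute = at === -1 ? null : rest.slice(at + 1);
  // An empty selector means "the item element itself".
  return { click, selector, attribute };
}

console.log(parseFieldSelector('!a'));
console.log(parseFieldSelector('a@href'));
console.log(parseFieldSelector('@href'));
console.log(parseFieldSelector(''));
```

In this sketch, an empty `selector` stands for the item element itself, which corresponds to the empty field selector case shown earlier.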
## Advanced Usage
### Use Task Chain.
@@ -202,6 +214,26 @@ To stop watching, you can either `close current window`, or:
e.stop();
```
## Results Operation
To get the results of a task:
```js
let results = job.results(0);
```
Visit URLs (if any) in the results one by one:
```js
results.visit();
```
Walk through all results one by one:
```js
results.walk((row,col,value)=>{console.log(value)});
```
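The callback shape `(row, col, value)` can be tried without a running job. The sketch below is a stand-in, not the library's implementation: it assumes results are stored as an array of rows, each an array of string values, and that `row` and `col` are zero-based indices. The README excerpt does not confirm either point, so check them against the actual `ExtractResult` before relying on this. The `walk` function and the `sample` data are hypothetical.

```javascript
// Hypothetical stand-in for ExtractResult.walk(); NOT the library's code.
// Assumption: results are rows of string values, and the callback receives
// the zero-based row index, the zero-based column index, and the cell value.
function walk(rows, callback) {
  rows.forEach((row, rowIndex) => {
    row.forEach((value, colIndex) => callback(rowIndex, colIndex, value));
  });
}

// Made-up sample data shaped like the output of task('.item', ['a', 'a@href']).
const sample = [
  ['First', 'http://sample.com/1'],
  ['Second', 'http://sample.com/2'],
];

const seen = [];
walk(sample, (row, col, value) => seen.push(`${row}:${col}=${value}`));
console.log(seen.join('\n'));
```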
## Development
Clone this project and execute: