update documents
@@ -1,7 +1,7 @@
|
|||||||
{
|
{
|
||||||
"manifest_version": 2,
|
"manifest_version": 2,
|
||||||
"name": "Data Extracter",
|
"name": "Data Extracter",
|
||||||
"version": "0.1.0",
|
"version": "0.5.0",
|
||||||
"author": "jebbs",
|
"author": "jebbs",
|
||||||
"description": "Extract data from web page elements as sheet.",
|
"description": "Extract data from web page elements as sheet.",
|
||||||
"icons": {
|
"icons": {
|
||||||
|
|||||||
readme.md
@@ -78,8 +78,54 @@ e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
|
|||||||
.start();
|
.start();
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Extractor Options
|
||||||
|
|
||||||
|
Specify extra options to make a task perform some actions before scraping the data.
|
||||||
|
|
||||||
|
```js
|
||||||
|
var job = new Extractor({ "scrollToBottom": 1 });
|
||||||
|
```
|
||||||
|
|
||||||
|
Available options:
|
||||||
|
|
||||||
|
- `scrollToBottom`: Try to scroll pages to the bottom, since some elements are loaded only when the user needs them.
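
One way such an option could behave is to keep scrolling until the page height stops growing. The sketch below is an assumption about that behavior, not the extension's actual code: the helper `scrollToBottom`, the `page` adapter and its `getHeight`/`scrollTo` methods are hypothetical names chosen for illustration.

```js
// Hypothetical sketch: not the extension's actual implementation.
// `page` is an assumed adapter; in a browser it could be
// { getHeight: () => document.documentElement.scrollHeight,
//   scrollTo: (y) => window.scrollTo(0, y) }.
async function scrollToBottom(page, maxRounds = 20, waitMs = 500) {
  let lastHeight = -1;
  for (let round = 0; round < maxRounds; round++) {
    const height = page.getHeight();
    // Height unchanged since the last scroll: nothing more was loaded.
    if (height === lastHeight) return round;
    lastHeight = height;
    page.scrollTo(height);
    // Give lazily loaded elements time to appear before measuring again.
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
  return maxRounds; // gave up; the page may still be growing
}
```

The function resolves with the number of scrolls it performed. The `maxRounds` cap keeps it from looping forever on pages that load content endlessly.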
|
||||||
|
|
||||||
|
|
||||||
|
### Export Result of Any Task
|
||||||
|
|
||||||
|
For a multi-task Extractor `e`:
|
||||||
|
|
||||||
|
```js
|
||||||
|
e = new Extractor()
|
||||||
|
e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
|
||||||
|
.task('list-item', ["a.title", "p.content"])
|
||||||
|
.start();
|
||||||
|
```
|
||||||
|
|
||||||
|
The user is asked to export the final result when the extractor finishes.
|
||||||
|
|
||||||
|
In case you want to export it again, use:
|
||||||
|
|
||||||
|
```js
|
||||||
|
e.export()
|
||||||
|
```
|
||||||
|
|
||||||
|
To export the result of a task other than the final one:
|
||||||
|
|
||||||
|
```js
|
||||||
|
// export the result of first task
|
||||||
|
// in the example above, that is a list of URLs
|
||||||
|
e.export(0)
|
||||||
|
// export the result of second task
|
||||||
|
e.export(1)
|
||||||
|
```
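
The index passed to `export` selects a task by position, and calling it without an index refers to the final task. A minimal sketch of that selection, assuming results are kept in an array with one entry per task; `pickResult` and the `results` array are hypothetical names, not part of the Extractor API:

```js
// Hypothetical helper: choose which task's result to export.
// `results` is assumed to hold one entry per task, in task order.
function pickResult(results, index) {
  // No index given: fall back to the final task, like e.export().
  const i = index === undefined ? results.length - 1 : index;
  if (!Number.isInteger(i) || i < 0 || i >= results.length) {
    throw new RangeError("no task at index " + index);
  }
  return results[i];
}

const results = [
  ["http://sample.com/abc/1", "http://sample.com/abc/2"], // task 0: urls
  [["Title", "Content"]], // task 1: scraped rows
];
pickResult(results, 0); // the list of urls from the first task
pickResult(results); // the rows of the final task
```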
|
||||||
|
|
||||||
|
## Task Management
|
||||||
|
|
||||||
### Continue Tasks
|
### Continue Tasks
|
||||||
|
|
||||||
|
Sometimes it's hard to finish all tasks in a single execution; that is why we need "Continuing of Tasks".
|
||||||
|
|
||||||
You can always continue tasks (as follows), even if execution stops in the middle of a task:
|
You can always continue tasks (as follows), even if execution stops in the middle of a task:
|
||||||
|
|
||||||
```js
|
```js
|
||||||
@@ -99,9 +145,11 @@ e.restart(0)
|
|||||||
e.restart(1)
|
e.restart(1)
|
||||||
```
|
```
|
||||||
|
|
||||||
### Save Result of Any Task
|
### Save & Load State
|
||||||
|
|
||||||
To a multiple task Extractor `e`:
|
It may also be hard to finish tasks even in a single day, so we need a way to save the current state and come back tomorrow.
|
||||||
|
|
||||||
|
Create and run an extractor:
|
||||||
|
|
||||||
```js
|
```js
|
||||||
e = new Extractor()
|
e = new Extractor()
|
||||||
@@ -110,20 +158,16 @@ e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
|
|||||||
.start();
|
.start();
|
||||||
```
|
```
|
||||||
|
|
||||||
User will be asked to save the final result when it finishes.
|
Save the state:
|
||||||
|
|
||||||
Incase you want to save it again, use:
|
|
||||||
|
|
||||||
```js
|
```js
|
||||||
e.save()
|
e.save();
|
||||||
```
|
```
|
||||||
|
|
||||||
To save another task result, other than the final one:
|
Load the state:
|
||||||
|
|
||||||
|
Open the popup window and upload the saved state file. Then, in the background console:
|
||||||
|
|
||||||
```js
|
```js
|
||||||
// save the result of first task
|
e = new Extractor().load();
|
||||||
// to the example above, that is a list of urls
|
```
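
Conceptually, saving and loading means serializing whatever the extractor needs in order to resume, then reading it back later. The sketch below assumes a JSON file and a made-up state shape (`tasks`, `running`). `serializeState` and `parseState` are hypothetical and only illustrate the round trip, not the extension's actual file format:

```js
// Hypothetical state shape, for illustration only:
// { running: <index of the task in progress>, tasks: [ ...per-task data ] }
function serializeState(state) {
  // A version field lets a later release reject files it cannot read.
  return JSON.stringify({ version: 1, running: state.running, tasks: state.tasks });
}

function parseState(text) {
  const data = JSON.parse(text);
  if (data.version !== 1 || !Array.isArray(data.tasks)) {
    throw new Error("unrecognized state file");
  }
  return { running: data.running, tasks: data.tasks };
}

const saved = serializeState({
  running: 1,
  tasks: [{ done: true }, { done: false }],
});
const restored = parseState(saved); // same shape as the state that was saved
```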
|
||||||
e.save(0)
|
|
||||||
// save the result of second task
|
|
||||||
e.save(1)
|
|
||||||
```
|
|
||||||