
# DataExtracter Help
----------------------------

DataExtracter helps you quickly extract data from any web page.

All you need to do is:

- Find the selectors for the target data
- Type scripts in the console of the `extension background page`, as introduced below.

 ![](images/console.png)

## Quick Start

Extract the current page

```js
$('.item', ['a', 'a@href']);
```

Extract multiple pages (1-10, interval 1)

```js
$('.item', ['a', 'a@href'], "http://sample.com/?pn=${page}", 1, 10, 1);
```

Extract multiple urls (list)

```js
$('.item', ['a', 'a@href'], ["http://sample.com/abc", "http://sample.com/xyz"]);
```

Extract specified pages (1,3,5)

```js
$('.item', ['a', 'a@href'], "http://sample.com/?pn=${page}", [1, 3, 5]);
```
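To make the range and list forms concrete, the sketch below shows one way a `${page}` template could be expanded into concrete URLs. `fillTemplate`, `expandRange` and `expandPages` are hypothetical helpers written for this illustration, not part of DataExtracter, and the inclusive `to` bound is an assumption based on the "1-10" wording above. Note that the template is written in ordinary quotes: inside a backtick template literal, JavaScript would try to evaluate `${page}` itself.

```javascript
// Hypothetical helpers, for illustration only (not DataExtracter's API).
// Replace every "${page}" placeholder in the template with a page number.
function fillTemplate(urlTemplate, page) {
    return urlTemplate.split('${page}').join(String(page));
}

// Expand pages from..to (assumed inclusive), stepping by interval.
function expandRange(urlTemplate, from, to, interval) {
    if (interval <= 0) throw new Error('interval must be positive');
    const urls = [];
    for (let page = from; page <= to; page += interval) {
        urls.push(fillTemplate(urlTemplate, page));
    }
    return urls;
}

// Expand an explicit list of pages, e.g. [1, 3, 5].
function expandPages(urlTemplate, pages) {
    return pages.map((page) => fillTemplate(urlTemplate, page));
}

console.log(expandRange('http://sample.com/?pn=${page}', 1, 10, 1).length);
console.log(expandPages('http://sample.com/?pn=${page}', [1, 3, 5]));
```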

## Task Call Signatures

```ts
// extract data from current page
function (itemsSelector:string, fieldSelectors:string[])
// extract data from a range of pages
function (itemsSelector:string, fieldSelectors:string[], urlTemplate:string, from:number, to:number, interval:number)
// extract data from specified pages
function (itemsSelector:string, fieldSelectors:string[], urlTemplate:string, pages:number[])
// extract data from a list of urls
function (itemsSelector:string, fieldSelectors:string[], urls:string[])
// extract data from urls that were extracted by the previous task
function (itemsSelector:string, fieldSelectors:string[], urls:ExtractResult)
```

## Stop Tasks

The only way to stop a task before it finishes is to close the target tab.

> Tasks wait for their target elements to appear, because some elements are loaded asynchronously.
> If you type a wrong selector, the task waits forever for elements that don't exist.
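The waiting described in the note amounts to polling until a lookup succeeds. The sketch below illustrates that idea and is not DataExtracter's code: `waitFor` is a hypothetical helper, and the timeout is an addition for illustration only, since this README describes no timeout in the extension itself.

```javascript
// Hypothetical sketch, not DataExtracter's code: poll until find() returns
// a truthy value, or reject once timeoutMs has elapsed.
function waitFor(find, intervalMs, timeoutMs) {
    return new Promise((resolve, reject) => {
        const startedAt = Date.now();
        const poll = () => {
            const found = find();
            if (found) {
                resolve(found);
            } else if (Date.now() - startedAt >= timeoutMs) {
                reject(new Error('timed out waiting for element'));
            } else {
                setTimeout(poll, intervalMs);
            }
        };
        poll();
    });
}
```

In a page console this could be called as `waitFor(() => document.querySelector('.item'), 200, 10000)`. Without the timeout branch, a selector that never matches keeps the loop polling indefinitely, which is the situation the note warns about.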

## Extract Attributes

e.g.: Extract link text and target (use `selector@attribute`)

```js
new Extractor().task('.item', ['a', 'a@href']).start();
```
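As a rough illustration of the `selector@attribute` convention, the sketch below splits a field selector at its last `@` and reads either the attribute or the element text. `parseFieldSelector` and `readField` are hypothetical helpers, not DataExtracter's API, and returning trimmed text when no attribute is given is an assumption based on the "link text" wording above.

```javascript
// Hypothetical helpers, for illustration only (not DataExtracter's API).
// Split 'selector@attribute' into its parts; attribute is null when absent.
function parseFieldSelector(fieldSelector) {
    const at = fieldSelector.lastIndexOf('@');
    if (at === -1) {
        return { selector: fieldSelector, attribute: null };
    }
    return {
        selector: fieldSelector.slice(0, at),
        attribute: fieldSelector.slice(at + 1),
    };
}

// Read one field from an item element: the attribute value if one is named,
// otherwise the element's trimmed text. Returns null if nothing matches.
function readField(item, fieldSelector) {
    const { selector, attribute } = parseFieldSelector(fieldSelector);
    const el = item.querySelector(selector);
    if (!el) return null;
    return attribute ? el.getAttribute(attribute) : el.textContent.trim();
}
```

In a page console, `Array.from(document.querySelectorAll('.item'), (item) => ['a', 'a@href'].map((f) => readField(item, f)))` would then yield one `[text, href]` pair per item, which is a quick way to check selectors before starting a task.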

## Advanced Usage

### Use Task Chain

e.g.: Collect links from `http://sample.com/abc`, then extract data from each link

```js
e = new Extractor()
e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```
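The hand-off between chained tasks can be pictured as turning the first task's extracted `href` values into the URL list for the next task. The sketch below is an illustration under stated assumptions, not DataExtracter's code: `urlsFromResult` is a hypothetical helper, and it assumes a task result is an array of rows, each row an array of field values in `fieldSelectors` order.

```javascript
// Hypothetical sketch, not DataExtracter's code. Assumes a task result is an
// array of rows, each row an array of field values in fieldSelectors order.
function urlsFromResult(rows, baseUrl) {
    const urls = [];
    for (const row of rows) {
        for (const value of row) {
            if (!value) continue;                     // skip empty fields
            urls.push(new URL(value, baseUrl).href);  // resolve relative links
        }
    }
    return Array.from(new Set(urls));                 // drop duplicates
}

// Example rows as a first task with ['a@href'] might produce them.
const firstTaskResult = [['/post/1'], ['/post/2'], ['/post/1']];
console.log(urlsFromResult(firstTaskResult, 'http://sample.com/abc'));
```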

### Continue Tasks

You can always continue tasks with the following call, even if the extractor stopped in the middle of a task:

```js
e.start()
```

The `Extractor` keeps the state of its last execution and starts from where it stopped.

### Restart Tasks

What if you don't want to continue from the last state, but want to restart from a certain task?

```js
// restart all tasks
e.restart(0)
// restart from 2nd task
e.restart(1)
```
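One way to think about `start()` versus `restart(n)` is a runner that remembers the index of the next unfinished task. The class below is a simplified mental model, not DataExtracter's implementation: `TaskRunner` and its fields are hypothetical, and it resumes at whole-task granularity, whereas the `Extractor` is described above as resuming from where it stopped even inside a task.

```javascript
// Hypothetical model, not DataExtracter's code: a runner that remembers
// which task comes next, so start() resumes and restart(n) rewinds.
class TaskRunner {
    constructor(tasks) {
        this.tasks = tasks;    // array of functions, run in order
        this.results = [];     // results[i] holds the output of task i
        this.next = 0;         // index of the first task not yet finished
    }

    // Run all unfinished tasks, feeding each one the previous result.
    start() {
        while (this.next < this.tasks.length) {
            const previous = this.results[this.next - 1];
            this.results[this.next] = this.tasks[this.next](previous);
            this.next += 1;    // only advance after the task succeeds
        }
        return this.results[this.results.length - 1];
    }

    // Rewind to the task at index `from` (0-based) and run again from there.
    restart(from) {
        this.next = from;
        this.results.length = from;  // discard results from that task onward
        return this.start();
    }
}
```

In this model, calling `start()` after a failure retries only the task that failed, while `restart(0)` discards every stored result and `restart(1)` keeps the first task's result and reruns the rest.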

### Save Result of Any Task

For a multi-task Extractor `e`:

```js
e = new Extractor()
e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

You will be asked to save the final result when it finishes.

In case you want to save it again, use:

```js
e.save()
```

To save the result of a task other than the final one:

```js
// save the result of the first task
// in the example above, that is a list of urls
e.save(0)
// save the result of the second task
e.save(1)
```