# DataExtracter Help
----------------------------

DataExtracter helps you quickly extract data from any web page.

All you need to do is:

- Find the selectors for the target data
- Type scripts in the console of the `extension background page`, as introduced below.

![](images/console.png)

## Quick Start

Extract the current page

```js
$('.item', ['a', 'a@href']);
```

Extract multiple pages (1-10, interval 1)

```js
$('.item', ['a', 'a@href'], "http://sample.com/?pn=${page}", 1, 10, 1);
```

Extract multiple urls (list)

```js
$('.item', ['a', 'a@href'], ["http://sample.com/abc", "http://sample.com/xyz"]);
```

Extract specified pages (1,3,5)

```js
$('.item', ['a', 'a@href'], "http://sample.com/?pn=${page}", [1, 3, 5]);
```

## Task Call Signatures

```ts
// extract data from current page
function (itemsSelector:string, fieldSelectors:string[])
// extract data from a range of pages
function (itemsSelector:string, fieldSelectors:string[], urlTemplate:string, from:number, to:number, interval:number)
// extract data from a list of pages
function (itemsSelector:string, fieldSelectors:string[], urlTemplate:string, pages:number[])
// extract data from a list of urls
function (itemsSelector:string, fieldSelectors:string[], urls:string[])
// extract data from urls that were extracted by the previous task
function (itemsSelector:string, fieldSelectors:string[], urls:ExtractResult)
```

## Stop Tasks

The only way to stop a task before it finishes is `Closing the target tab`.

> Tasks wait for their target elements to appear, since some elements are loaded asynchronously.
> If you type a wrong selector, the task waits forever for elements that don't exist.

## Extract Attributes.

e.g.: link text and target (use 'selector@attribute')

```js
new Extractor().task('.item', ['a', 'a@href']).start();
```

## Advanced Usage

### Use Task Chain.

e.g.: Collect links from `http://sample.com/abc`, then extract data from each link

```js
e = new Extractor()
e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
    .task('list-item', ["a.title", "p.content"])
    .start();
```

### Continue Tasks

You can always continue tasks with the following call, even if execution stopped in the middle of a task:

```js
e.start()
```

The `Extractor` keeps the state of the last execution and starts from where it stopped.

### Restart Tasks

What if you don't want to continue from the last state, but restart from a certain task?

```js
// restart all tasks
e.restart(0)
// restart from 2nd task
e.restart(1)
```

### Save Result of Any Task

For a multi-task Extractor `e`:

```js
e = new Extractor()
e.task('.search-list-item', ['a@href'], ["http://sample.com/abc"])
    .task('list-item', ["a.title", "p.content"])
    .start();
```

You will be asked to save the final result when it finishes.

In case you want to save it again, use:

```js
e.save()
```

To save the result of a task other than the final one:

```js
// save the result of first task
// in the example above, that is a list of urls
e.save(0)
// save the result of second task
e.save(1)
```
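
The indexes passed to `e.restart()` and `e.save()` are zero-based positions in the task chain. As a rough mental model only, the sketch below uses a hypothetical `TaskChainSketch` class (the class, its `resultOf` method, and its function-based tasks are all made up for illustration; this is not the extension's `Extractor` and it does not touch any page) to show how keeping one result per task could make both calls possible.

```javascript
// Hypothetical stand-in for illustration only; NOT the extension's real Extractor.
class TaskChainSketch {
  constructor() {
    this.tasks = [];   // task functions, in the order they were added
    this.results = []; // results[i] holds the output of tasks[i]
  }

  // Register a task function; returns `this` so calls can be chained.
  task(run) {
    this.tasks.push(run);
    return this;
  }

  // Run tasks starting at `fromIndex` (zero-based).
  // Each task receives the stored result of the task before it.
  start(fromIndex = 0) {
    for (let i = fromIndex; i < this.tasks.length; i++) {
      const input = i > 0 ? this.results[i - 1] : undefined;
      this.results[i] = this.tasks[i](input);
    }
    return this;
  }

  // Result of task `index`; with no argument, the final task's result.
  resultOf(index = this.tasks.length - 1) {
    return this.results[index];
  }
}

const chain = new TaskChainSketch()
  .task(() => ['http://sample.com/1', 'http://sample.com/2']) // 1st task: collect urls
  .task((urls) => urls.map((url) => [url, 'title']))          // 2nd task: one row per url
  .start();

console.log(chain.resultOf(0)); // result of the first task (index 0)
console.log(chain.resultOf());  // result of the final task
```

In this model, `start(1)` would re-run only the second task using the stored result of the first, which mirrors the "restart from 2nd task" call above.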