# DataExtracter Help

----------------------------

DataExtracter helps you quickly extract data from any web page.

All you need to do is:

- Find the selectors (jQuery selectors) for the target data
- Type scripts in the console of the `extension background page`, as introduced below.

![](images/console.png)

## Quick Start

Extract the current page

```js
new Extractor().task(".list-item", ["a.title", "p.content"]).start();
```

Extract multiple pages (1-10, interval 1)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], "http://sample.com/?pn=${page}", 1, 10, 1).start();
```

Extract multiple urls (list)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], ["http://sample.com/abc", "http://sample.com/xyz"]).start();
```

Extract specified pages (1,3,5)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], "http://sample.com/?pn=${page}", [1, 3, 5]).start();
```

## Extractor.task() Signatures

```ts
// a task extracting data from the current page
task(itemsSelector:string, fieldSelectors:string[])
// a task extracting data from a range of pages
task(itemsSelector:string, fieldSelectors:string[], urlTemplate:string, from:number, to:number, interval:number)
// a task extracting data from a list of pages
task(itemsSelector:string, fieldSelectors:string[], urlTemplate:string, pages:number[])
// a task extracting data from a list of urls
task(itemsSelector:string, fieldSelectors:string[], urls:string[])
// a task extracting data from the urls extracted by the previous task
task(itemsSelector:string, fieldSelectors:string[], urls:ExtractResult)
```

## Advanced Usage

### Stop Tasks

Tasks wait for their target elements to appear, because some elements are loaded asynchronously.

But if you type a wrong selector, the task waits forever for elements that don't exist.

The only way to stop a task before it finishes is `Closing the host tab`.

### Extract Attributes

e.g.: link text and target (use 'selector@attribute')

```js
new Extractor().task('.list-item', ['a.title', 'a.title@href']).start();
```

### Use Task Chain

e.g.: Collect links from `http://sample.com/abc`, then extract data from each link

```js
new Extractor()
    .task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

### Save Result of Any Task

For a multi-task (chain) Extractor `e`:

```js
e = new Extractor()
e.task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

You will be asked to save the final result when it finishes.

In case you want to save it again, use:

```js
e.save()
```

You may want to save the result of a task other than the final one:

```js
// save the result of the first task
// in the example above, that is a list of urls
e.save(1)
```

### Restart Tasks

If a later task fails, you don't need to restart all tasks.

Here we have 2 tasks:

```js
e = new Extractor()
e.task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

Suppose the second task fails. We can restart and continue from task 2:

```js
e.restart(2);
```

If you'd like to restart all tasks, use:

```js
e.start();
// or
e.restart();
```
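
To make the restart behavior concrete, here is a minimal, self-contained sketch of a task chain that keeps each task's result and can resume from a given 1-based task number. It is only an illustration of the idea, not DataExtracter's actual implementation: the `TaskChain` class, its internals, and the simulated failure are all hypothetical, and tasks here are plain synchronous functions rather than page extractions.

```js
// Hypothetical sketch: NOT DataExtracter's real code.
// Each task is a function that receives the previous task's result.
class TaskChain {
  constructor() {
    this.tasks = [];   // task functions, in order
    this.results = []; // results[i] holds the output of task i + 1
  }

  // Append a task and return the chain, so calls can be chained.
  task(fn) {
    this.tasks.push(fn);
    return this;
  }

  // Run every task from the beginning.
  start() {
    return this.restart(1);
  }

  // Re-run from the 1-based task number `from`, reusing earlier results.
  restart(from = 1) {
    if (from < 1 || from > this.tasks.length) {
      throw new RangeError(`no task ${from}`);
    }
    if (this.results.length < from - 1) {
      throw new Error(`task ${from - 1} has no saved result`);
    }
    // Drop stale results of the tasks that are about to run again.
    this.results.length = from - 1;
    for (let i = from - 1; i < this.tasks.length; i++) {
      const input = i === 0 ? undefined : this.results[i - 1];
      this.results.push(this.tasks[i](input));
    }
    return this.results[this.results.length - 1];
  }
}

// Usage: the second task fails once, then succeeds after restart(2).
let firstTaskRuns = 0;
let failSecond = true;
const chain = new TaskChain()
  .task(() => {
    firstTaskRuns++;
    return ["http://sample.com/abc"];
  })
  .task((urls) => {
    if (failSecond) throw new Error("simulated failure");
    return urls.map((u) => `data from ${u}`);
  });

try {
  chain.start();
} catch (err) {
  // The second task failed; the first task's result is still saved.
}
failSecond = false;
const finalResult = chain.restart(2); // the first task is not run again
```

The point of the design is that a restart from task 2 only needs the saved result of task 1, which is why the earlier task does not have to run again.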