# DataExtracter Help
----------------------------

DataExtracter helps you quickly extract data from any web page. All you need to do is:

- Find the selectors (jQuery selectors) for the target data
- Call Extractor methods in the `extension background page console`, as introduced below.

Where is the extension background page console?

Go to the extensions page and click the `background page` link of the extension.

![](images/extnsion.png)

In the window that opens, find `Console` and type your scripts.

![](images/console.png)

## Quick Start

Extract the current page

```js
new Extractor().task(".list-item", ["a.title", "p.content"]).start();
```

Extract multiple pages (1-10, interval 1)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], "http://sample.com/?pn=${page}", 1, 10, 1).start();
```

Extract multiple URLs (list)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], ["http://sample.com/abc", "http://sample.com/xyz"]).start();
```

Extract specified pages (1, 3, 5)

```js
new Extractor().task(".list-item", ["a.title", "p.content"], "http://sample.com/?pn=${page}", [1, 3, 5]).start();
```

## Extractor.task()

Signatures:

```ts
// a task extracting data from the current page
task(itemsSelector:string, fieldSelectors:string[])
// a task extracting data from a range of pages
task(itemsSelector:string, fieldSelectors:string[], urlTemplate:string, from:number, to:number, interval:number)
// a task extracting data from specified page numbers
task(itemsSelector:string, fieldSelectors:string[], urlTemplate:string, pages:number[])
// a task extracting data from a list of URLs
task(itemsSelector:string, fieldSelectors:string[], urls:string[])
// a task extracting data from the URLs extracted by the previous task
task(itemsSelector:string, fieldSelectors:string[], urls:ExractResult)
```

## Advanced Usage

### Stop tasks

The only way to stop tasks before they finish is to close the tab that is running them.

### Extract attributes

e.g.: link text and target (use `selector@attribute`)

```js
new Extractor().task('.list-item', ['a.title', 'a.title@href']).start();
```

### Use task chain

e.g.: Collect links from `http://sample.com/abc`, then extract data from each link

```js
new Extractor()
    .task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

### Save result of any task

Take an Extractor `e` with multiple chained tasks:

```js
e = new Extractor();
e.task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

You will be asked to save the final result when the chain finishes.

You may also want to save the result of a task other than the final one:

```js
// save the result of the first task,
// that is, a list of urls
e.save(1)
```

If you want to save the final result again, use:

```js
e.save()
```

### Restart tasks

If a later task fails, you don't need to restart every task.

Here we have 2 tasks:

```js
e = new Extractor();
e.task('.search-list-item', ['.item a@href'], ["http://sample.com/abc"])
    .task('.list-item', ["a.title", "p.content"])
    .start();
```

Suppose the second task fails. We can restart and continue from task 2:

```js
e.restart(2);
```

To restart all tasks, use:

```js
e.start();
// or
e.restart();
```
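
Before starting or restarting a multi-page task, it can help to check which URLs a `urlTemplate` would expand to. The sketch below is not part of DataExtracter: `expandPages` and `expandPageList` are hypothetical helpers written for illustration. They assume that the `from`/`to` range includes both ends and that `${page}` is replaced literally by the page number, which the extension itself may handle differently.

```js
// Hypothetical helpers, not part of DataExtracter: preview the URLs that a
// page-range or page-list task would visit.
// Assumptions: the range includes both `from` and `to`, and every "${page}"
// in the template is replaced literally with the page number.
function expandPages(urlTemplate, from, to, interval) {
  if (!(interval > 0)) {
    throw new RangeError("interval must be a positive number");
  }
  const urls = [];
  for (let page = from; page <= to; page += interval) {
    // split/join replaces every occurrence without any regex escaping
    urls.push(urlTemplate.split("${page}").join(String(page)));
  }
  return urls;
}

// Same idea for an explicit page list such as [1, 3, 5].
function expandPageList(urlTemplate, pages) {
  return pages.map((page) => urlTemplate.split("${page}").join(String(page)));
}

// "${page}" inside a double-quoted string is plain text, not a template literal.
console.log(expandPages("http://sample.com/?pn=${page}", 1, 10, 3));
console.log(expandPageList("http://sample.com/?pn=${page}", [1, 3, 5]));
```

Under these assumptions, the first call lists the URLs for pages 1, 4, 7 and 10, and the second lists the URLs for pages 1, 3 and 5.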