spiderTarget Class
[This is preliminary documentation and is subject to change.]
Namespace: imbWEM.Core.crawler.targets
public class spiderTarget : ISpiderTarget
The spiderTarget type exposes the following members.
| Name | Description |
|---|---|
| spiderTarget(String, spiderTargetCollection) | Initializes a new instance of the spiderTarget class. |
| spiderTarget(spiderLink, spiderTargetCollection) | Initializes a new instance of the spiderTarget class. |

| Name | Description |
|---|---|
| content | |
| contentBlocks | |
| contentTree | |
| duplicateOf | Reference to the first crawled target that has the same HTML source code hash fingerprint. |
| evaluatedLanguage | Language detected during evaluation. |
| evaluation | |
| isDuplicate | True if this target is a content duplicate (confirmed by HTML source code hash) of another, already crawled target. The target that was loaded first has False; every later duplicate has True. |
| isLoaded | Whether the target has been loaded. |
| IsRelevant | Gets a value indicating whether this target is relevant (shortcut for testing the language of the evaluation result). |
| iterationDiscovery | Iteration in which this target was discovered. |
| iterationLoaded | |
| key | |
| linkVectors | |
| marks | |
| page | Attached page. |
| pageHash | |
| pageText | |
| parent | |
| targetHash | |
| tokens | Token table describing this target: tokens extracted from the URL. |
| url | |

| Name | Description |
|---|---|
| AddVector | Adds a new vector to the target. originPage must be specified; otherwise an exception is thrown. Returns true if the vector is new for this target. |
| AttachPage | Attaches the page. Returns false if the page was already attached. |
| Dispose | Releases all resources used by the spiderTarget. |
| Equals | (Inherited from Object.) |
| GetHashCode | (Inherited from Object.) |
| GetHtmlDocument | Gets the HTML document (HtmlDocument) from the loaded page. |
| GetIndexPage | Gets the indexPage entry from the current Index database instance. |
| getQuery | |
| GetType | (Inherited from Object.) |
| GetVectors | Gets all vectors, including those that come from the same page. |
| ToString | (Inherited from Object.) |
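
The descriptions of isDuplicate and duplicateOf imply a "first loaded wins" rule keyed on a hash of the HTML source. The following is a minimal sketch of that rule, not the library's implementation: `DemoTarget` and `DuplicateRegistry` are hypothetical stand-ins that are not part of imbWEM, and SHA-256 is an assumption, since this page does not say which hash function is used.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-in for spiderTarget; only the duplicate-related members are modelled.
public class DemoTarget
{
    public string Url { get; }
    public string PageHash { get; private set; }

    // Null for the first target seen with a given hash.
    public DemoTarget DuplicateOf { get; private set; }

    public bool IsDuplicate => DuplicateOf != null;

    public DemoTarget(string url) { Url = url; }

    internal void SetHash(string hash) { PageHash = hash; }
    internal void MarkDuplicateOf(DemoTarget first) { DuplicateOf = first; }
}

// Hypothetical helper that remembers the first target seen for every HTML hash.
public class DuplicateRegistry
{
    private readonly Dictionary<string, DemoTarget> firstByHash =
        new Dictionary<string, DemoTarget>();

    // Assumption: SHA-256 over the UTF-8 bytes of the HTML source.
    public static string Hash(string html)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(html));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    // Returns true when the target duplicates an earlier one.
    public bool Register(DemoTarget target, string html)
    {
        string hash = Hash(html);
        target.SetHash(hash);

        if (firstByHash.TryGetValue(hash, out DemoTarget first))
        {
            target.MarkDuplicateOf(first);
            return true;
        }

        // First target with this hash: it stays IsDuplicate == false.
        firstByHash[hash] = target;
        return false;
    }
}
```

In this sketch, registering two targets with identical HTML leaves the first with `IsDuplicate == false` and gives the second `IsDuplicate == true` with `DuplicateOf` pointing at the first, which matches the behavior described in the table.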