Documentation ¶
Constants ¶
const (
	// PageExtension is the file extension that downloaded pages get.
	PageExtension = ".html"
	// PageDirIndex is the file name of the index file for every dir.
	PageDirIndex = "index" + PageExtension
)
Variables ¶
This section is empty.
Functions ¶
func GetPageFilePath ¶
GetPageFilePath returns a filename for a URL that represents a page.
Types ¶
type Config ¶ added in v0.1.1
type Config struct {
	URL      string
	Includes []string
	Excludes []string

	ImageQuality uint // image quality from 0 to 100%, 0 to disable reencoding
	MaxDepth     uint // download depth, 0 for unlimited
	Timeout      uint // time limit in seconds to process each http request

	OutputDirectory string
	Username        string
	Password        string
	Proxy           string
}
Config contains the scraper configuration.
type Scraper ¶
Scraper contains all scraping data.
func (*Scraper) GetFilePath ¶
GetFilePath returns a file path for a URL to store the URL content in.
func (*Scraper) RemoveAnchor ¶
RemoveAnchor removes anchors from URLs.
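Removing an anchor amounts to dropping the fragment part of a URL (everything from "#" onward). The sketch below shows one way to do that with the standard net/url package. The free function stripAnchor and its string-in, string-out signature are assumptions, since the method's actual signature is not shown here.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripAnchor is a hypothetical stand-in for (*Scraper).RemoveAnchor.
// It returns raw without its fragment; input that does not parse as a
// URL is returned unchanged.
func stripAnchor(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return raw
	}
	// Clear both the decoded and the raw form of the fragment.
	u.Fragment = ""
	u.RawFragment = ""
	return u.String()
}

func main() {
	fmt.Println(stripAnchor("https://example.org/page.html#section-2"))
	fmt.Println(stripAnchor("https://example.org/a?x=1#top"))
}
```

Stripping the fragment lets a crawler treat links that differ only in their anchor as the same page, so each page is fetched once.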