Documentation ¶
Overview ¶
Package scrapePkg handles the chifra scrape command.
The application creates TrueBlocks' chunked index of address appearances, the fundamental data structure of the entire system. It also, optionally, pins each chunk of the index to IPFS.
chifra scrape is a long-running process, so we advise running it as a service or in a terminal multiplexer such as tmux. You may start and stop it as needed, but while it is stopped the scraper does not keep up with the front of the blockchain. The next time it starts, it must catch up to the chain, a process that may take several hours depending on how long ago it was last run. See the section below and the "Papers" section of our website for more information on how the scraping process works and the prerequisites for its proper operation.
You may adjust the speed of index creation with the --sleep and --block_cnt options. On some machines, or when running against some EVM node software, the scraper may overburden the hardware; slowing it down helps ensure proper operation.
Finally, you may optionally --pin each new chunk to IPFS, which naturally shards the database among all users. By default, pinning is against a locally running IPFS node, but the --remote option allows pinning to an IPFS pinning service such as Pinata.
Index ¶
- Variables
- func Notify[T notify.NotificationPayload](notification notify.Notification[T]) error
- func NotifyChunkWritten(chunk index.Chunk, chunkPath string) (err error)
- func NotifyConfigured() (bool, string)
- func ResetOptions(testMode bool)
- func RunScrape(cmd *cobra.Command, args []string) error
- func ServeScrape(w http.ResponseWriter, r *http.Request) error
- type BlazeManager
- func (bm *BlazeManager) AllowMissing() bool
- func (bm *BlazeManager) AsciiFileToAppearanceMap(fn string) (map[string][]index.AppearanceRecord, base.FileRange, int)
- func (bm *BlazeManager) BlockCount() base.Blknum
- func (bm *BlazeManager) Consolidate(blocks []base.Blknum) (error, bool)
- func (bm *BlazeManager) EndBlock() base.Blknum
- func (bm *BlazeManager) FirstSnap() base.Blknum
- func (bm *BlazeManager) HandleBlaze(blocks []base.Blknum) (err error, ok bool)
- func (bm *BlazeManager) IsSnap(block base.Blknum) bool
- func (bm *BlazeManager) IsTestMode() bool
- func (bm *BlazeManager) PerChunk() base.Blknum
- func (bm *BlazeManager) ProcessAppearances(appearanceChannel chan scrapedData, appWg *sync.WaitGroup, ...) (err error)
- func (bm *BlazeManager) ProcessBlocks(blockChannel chan base.Blknum, blockWg *sync.WaitGroup, ...) (err error)
- func (bm *BlazeManager) ProcessTimestamps(tsChannel chan tslib.TimestampRecord, tsWg *sync.WaitGroup) (err error)
- func (bm *BlazeManager) RipeFolder() string
- func (bm *BlazeManager) ScrapeBatch(blocks []base.Blknum) (error, bool)
- func (bm *BlazeManager) SnapTo() base.Blknum
- func (bm *BlazeManager) StageFolder() string
- func (bm *BlazeManager) StartBlock() base.Blknum
- func (bm *BlazeManager) UnripeFolder() string
- func (bm *BlazeManager) WriteAppearances(bn base.Blknum, addrMap uniq.AddressBooleanMap) (err error)
- func (bm *BlazeManager) WriteTimestamps(blocks []base.Blknum) error
- type ScrapeOptions
Constants ¶
This section is empty.
Variables ¶
var ErrConfiguredButNotRunning = fmt.Errorf("listener is configured but not running")
Functions ¶
func Notify ¶
func Notify[T notify.NotificationPayload](notification notify.Notification[T]) error
Notify may be used to tell other processes about progress.
func NotifyChunkWritten ¶
func NotifyChunkWritten(chunk index.Chunk, chunkPath string) (err error)
func NotifyConfigured ¶
func NotifyConfigured() (bool, string)
NotifyConfigured returns true if the notification feature is configured.
func ResetOptions ¶
func ResetOptions(testMode bool)
func RunScrape ¶
func RunScrape(cmd *cobra.Command, args []string) error
RunScrape handles the scrape command for the command line. It returns an error only because cobra requires it.
func ServeScrape ¶
func ServeScrape(w http.ResponseWriter, r *http.Request) error
ServeScrape handles the scrape command for the API. Returns an error.
Types ¶
type BlazeManager ¶
type BlazeManager struct {
// contains filtered or unexported fields
}
BlazeManager manages the scraper by keeping track of the progress of the scrape and maintaining the timestamp array and processed map. The processed map helps us know if every block was visited or not.
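The role of the processed map can be illustrated with a small sketch. The fields of BlazeManager are unexported, so the map type and the missingBlocks helper below are assumptions made for illustration only.

```go
package main

import "fmt"

// missingBlocks returns every block in [start, start+count) that is not
// marked as processed. The map[uint64]bool shape is an assumption.
func missingBlocks(processed map[uint64]bool, start, count uint64) []uint64 {
	var missing []uint64
	for bn := start; bn < start+count; bn++ {
		if !processed[bn] {
			missing = append(missing, bn)
		}
	}
	return missing
}

func main() {
	// Block 102 was never visited in this pass.
	processed := map[uint64]bool{100: true, 101: true, 103: true}
	fmt.Println(missingBlocks(processed, 100, 4)) // prints [102]
}
```

An empty result means every block in the pass was visited, which is the property the processed map exists to confirm.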
func (*BlazeManager) AllowMissing ¶
func (bm *BlazeManager) AllowMissing() bool
AllowMissing returns true for every chain except mainnet. On mainnet it returns the value of the config item (false by default). The scraper halts if AllowMissing is false and a block with zero appearances is encountered.
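The rule described for AllowMissing reduces to a two-branch decision. The sketch below restates it as a free function; the function name and its parameters are hypothetical, since the real method reads the chain and the config item from the manager itself.

```go
package main

import "fmt"

// allowMissing restates the documented rule: every chain other than mainnet
// allows missing appearances; mainnet follows its config item.
// Both parameter names are assumptions for illustration.
func allowMissing(chain string, mainnetSetting bool) bool {
	if chain != "mainnet" {
		return true
	}
	return mainnetSetting
}

func main() {
	fmt.Println(allowMissing("sepolia", false)) // prints true
	fmt.Println(allowMissing("mainnet", false)) // prints false
	fmt.Println(allowMissing("mainnet", true))  // prints true
}
```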
func (*BlazeManager) AsciiFileToAppearanceMap ¶
func (bm *BlazeManager) AsciiFileToAppearanceMap(fn string) (map[string][]index.AppearanceRecord, base.FileRange, int)
AsciiFileToAppearanceMap reads the appearances from the stage file and returns them as a map
func (*BlazeManager) BlockCount ¶
func (bm *BlazeManager) BlockCount() base.Blknum
BlockCount returns the number of blocks to process for this pass of the scraper.
func (*BlazeManager) Consolidate ¶
func (bm *BlazeManager) Consolidate(blocks []base.Blknum) (error, bool)
Consolidate calls into the block scraper to (a) call Blaze and (b) consolidate if applicable
func (*BlazeManager) EndBlock ¶
func (bm *BlazeManager) EndBlock() base.Blknum
EndBlock returns the last block to process for this pass of the scraper.
func (*BlazeManager) FirstSnap ¶
func (bm *BlazeManager) FirstSnap() base.Blknum
FirstSnap returns the first block to process.
func (*BlazeManager) HandleBlaze ¶
func (bm *BlazeManager) HandleBlaze(blocks []base.Blknum) (err error, ok bool)
HandleBlaze does the actual scraping, walking through block_cnt blocks and querying traces and logs and then extracting addresses and timestamps from those data structures.
func (*BlazeManager) IsSnap ¶
func (bm *BlazeManager) IsSnap(block base.Blknum) bool
IsSnap returns true if the block is a snap point.
func (*BlazeManager) IsTestMode ¶
func (bm *BlazeManager) IsTestMode() bool
IsTestMode returns true if the scraper is running in test mode.
func (*BlazeManager) PerChunk ¶
func (bm *BlazeManager) PerChunk() base.Blknum
PerChunk returns the number of blocks to process per chunk.
func (*BlazeManager) ProcessAppearances ¶
func (bm *BlazeManager) ProcessAppearances(appearanceChannel chan scrapedData, appWg *sync.WaitGroup, tsChannel chan tslib.TimestampRecord) (err error)
ProcessAppearances processes the scrapedData objects sent down the appearanceChannel.
func (*BlazeManager) ProcessBlocks ¶
func (bm *BlazeManager) ProcessBlocks(blockChannel chan base.Blknum, blockWg *sync.WaitGroup, appearanceChannel chan scrapedData) (err error)
ProcessBlocks reads block numbers from the block channel, queries the node for both traces and logs for each block, and sends the results down appearanceChannel.
func (*BlazeManager) ProcessTimestamps ¶
func (bm *BlazeManager) ProcessTimestamps(tsChannel chan tslib.TimestampRecord, tsWg *sync.WaitGroup) (err error)
ProcessTimestamps processes timestamp data (currently by printing to a temporary file)
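The signatures of ProcessBlocks, ProcessAppearances, and ProcessTimestamps suggest a three-stage pipeline connected by channels, with one WaitGroup per stage. The following is a minimal sketch of that pattern under stated assumptions: scraped and tsRecord are stand-ins for the package's scrapedData and tslib.TimestampRecord, fetch replaces the node queries, and the worker count is arbitrary (it must be at least 1).

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// scraped is a hypothetical stand-in for the unexported scrapedData type.
type scraped struct {
	block     uint64
	timestamp uint64
	addresses []string
}

// tsRecord is a hypothetical stand-in for tslib.TimestampRecord.
type tsRecord struct {
	block     uint64
	timestamp uint64
}

// runPipeline pushes blocks through three stages and returns the collected
// timestamp records sorted by block number.
func runPipeline(blocks []uint64, fetch func(uint64) scraped, workers int) []tsRecord {
	blockCh := make(chan uint64)
	appCh := make(chan scraped)
	tsCh := make(chan tsRecord)
	var blockWg, appWg, tsWg sync.WaitGroup

	// Stage 1: fetch data for each block and pass it downstream.
	for i := 0; i < workers; i++ {
		blockWg.Add(1)
		go func() {
			defer blockWg.Done()
			for bn := range blockCh {
				appCh <- fetch(bn)
			}
		}()
	}

	// Stage 2: a real scraper would record sd.addresses here; this sketch
	// only forwards the timestamp.
	for i := 0; i < workers; i++ {
		appWg.Add(1)
		go func() {
			defer appWg.Done()
			for sd := range appCh {
				tsCh <- tsRecord{block: sd.block, timestamp: sd.timestamp}
			}
		}()
	}

	// Stage 3: a single collector, so appending to out needs no lock.
	var out []tsRecord
	tsWg.Add(1)
	go func() {
		defer tsWg.Done()
		for rec := range tsCh {
			out = append(out, rec)
		}
	}()

	for _, bn := range blocks {
		blockCh <- bn
	}
	// Close each channel only after every sender on it has finished.
	close(blockCh)
	blockWg.Wait()
	close(appCh)
	appWg.Wait()
	close(tsCh)
	tsWg.Wait()

	sort.Slice(out, func(i, j int) bool { return out[i].block < out[j].block })
	return out
}

func main() {
	fetch := func(bn uint64) scraped {
		return scraped{block: bn, timestamp: 1_600_000_000 + bn*12}
	}
	for _, rec := range runPipeline([]uint64{3, 1, 2}, fetch, 2) {
		fmt.Println(rec.block, rec.timestamp)
	}
}
```

Closing the channels in stage order, each after its stage's WaitGroup has drained, is what lets every downstream range loop terminate without a send on a closed channel.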
func (*BlazeManager) RipeFolder ¶
func (bm *BlazeManager) RipeFolder() string
RipeFolder returns the path to the ripe folder.
func (*BlazeManager) ScrapeBatch ¶
func (bm *BlazeManager) ScrapeBatch(blocks []base.Blknum) (error, bool)
ScrapeBatch is called each time around the forever loop. It calls into HandleBlaze and writes the timestamps if there's no error.
func (*BlazeManager) SnapTo ¶
func (bm *BlazeManager) SnapTo() base.Blknum
SnapTo returns the scraper's snap-to-grid value.
func (*BlazeManager) StageFolder ¶
func (bm *BlazeManager) StageFolder() string
StageFolder returns the folder where the stage file is stored.
func (*BlazeManager) StartBlock ¶
func (bm *BlazeManager) StartBlock() base.Blknum
StartBlock returns the start block for the current pass of the scraper.
func (*BlazeManager) UnripeFolder ¶
func (bm *BlazeManager) UnripeFolder() string
UnripeFolder returns the path to the unripe folder.
func (*BlazeManager) WriteAppearances ¶
func (bm *BlazeManager) WriteAppearances(bn base.Blknum, addrMap uniq.AddressBooleanMap) (err error)
WriteAppearances writes the appearance for a chunk to a file
func (*BlazeManager) WriteTimestamps ¶
func (bm *BlazeManager) WriteTimestamps(blocks []base.Blknum) error
type ScrapeOptions ¶
type ScrapeOptions struct {
	BlockCnt  uint64                `json:"blockCnt,omitempty"`  // Maximum number of blocks to process per pass
	Sleep     float64               `json:"sleep,omitempty"`     // Seconds to sleep between scraper passes
	Touch     uint64                `json:"touch,omitempty"`     // First block to visit when scraping (snapped back to most recent snap_to_grid mark)
	RunCount  uint64                `json:"runCount,omitempty"`  // Run the scraper this many times, then quit
	Publisher string                `json:"publisher,omitempty"` // For some query options, the publisher of the index
	DryRun    bool                  `json:"dryRun,omitempty"`    // Show the configuration that would be applied if run, no changes are made
	Settings  config.ScrapeSettings `json:"settings,omitempty"`  // Configuration items for the scrape
	Globals   globals.GlobalOptions `json:"globals,omitempty"`   // The global options
	Conn      *rpc.Connection       `json:"conn,omitempty"`      // The connection to the RPC server
	BadFlag   error                 `json:"badFlag,omitempty"`   // An error flag if needed
	// EXISTING_CODE
	PublisherAddr base.Address `json:"-"`
}
ScrapeOptions provides all command options for the chifra scrape command.
func GetOptions ¶
func GetOptions() *ScrapeOptions
func GetScrapeOptions ¶
func GetScrapeOptions(args []string, g *globals.GlobalOptions) *ScrapeOptions
GetScrapeOptions returns the options for this tool so other tools may use it.
func (*ScrapeOptions) HandleScrape ¶
func (opts *ScrapeOptions) HandleScrape() error
HandleScrape enters a forever loop and continually scrapes --block_cnt blocks (or fewer if close to the head). The loop pauses after each round for --sleep seconds (or, if not close to the head, for 0.25 seconds).
func (*ScrapeOptions) HandleTouch ¶
func (opts *ScrapeOptions) HandleTouch() error
func (*ScrapeOptions) Prepare ¶
func (opts *ScrapeOptions) Prepare() (ok bool, err error)
Prepare performs the actions that must be done before entering the forever loop. It returns true if processing should continue, false otherwise. The routine cleans the temporary folders (if any) and then makes sure the zero block is processed, reading the allocation file if present.
func (*ScrapeOptions) ScrapeInternal ¶
func (opts *ScrapeOptions) ScrapeInternal() error
ScrapeInternal handles the internal workings of the scrape command. Returns an error.
func (*ScrapeOptions) String ¶
func (opts *ScrapeOptions) String() string
String implements the Stringer interface
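Given the json tags on ScrapeOptions, one plausible way to satisfy fmt.Stringer is to marshal the struct to JSON. Whether the real method does exactly this is an assumption; the sketch uses a cut-down demoOptions type with three of the documented fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// demoOptions is a hypothetical, reduced version of ScrapeOptions.
type demoOptions struct {
	BlockCnt uint64  `json:"blockCnt,omitempty"`
	Sleep    float64 `json:"sleep,omitempty"`
	DryRun   bool    `json:"dryRun,omitempty"`
}

// String renders the options as compact JSON; fields holding their zero
// value are dropped because of omitempty.
func (o *demoOptions) String() string {
	b, err := json.Marshal(o)
	if err != nil {
		return fmt.Sprintf("demoOptions(error: %v)", err)
	}
	return string(b)
}

// Compile-time check that *demoOptions implements fmt.Stringer.
var _ fmt.Stringer = (*demoOptions)(nil)

func main() {
	opts := &demoOptions{BlockCnt: 2000, Sleep: 14}
	fmt.Println(opts) // prints {"blockCnt":2000,"sleep":14}
}
```

Because fmt.Println checks for the Stringer interface, passing the pointer prints the JSON form rather than the default struct dump.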