krangio

package module
v0.2.12
Published: Feb 16, 2023 License: BSD-3-Clause Imports: 5 Imported by: 0

README

Krang IO

This module contains the Go bindings, proto file definitions, and types for Krang and its microservices, to be used by gRPC servers.


Installation

go get -u github.com/krang-backlink/io

Types

All tasks should accept and return a krangio.Page; see types.go for more detailed insight into its field data. Below is an example of the Page struct as stored in Mongo.

type (
	// Page represents an individual task scrape including
	// metadata from the Task.
	Page struct {
		ID             primitive.ObjectID  `bson:"_id,omitempty" json:"id"`
		ScrapeID       *primitive.ObjectID `bson:"scrape_id" json:"scrape_id"`
		URL            string              `bson:"url" json:"url"`
		GroupSlug      string              `bson:"group_slug,omitempty" json:"group_slug"`
		TaskID         int64               `bson:"task_id,omitempty" json:"task_id"`
		SearchTerm     string              `json:"search_term" bson:"search_term"`
		RelevancyScore uint                `json:"relevancy_score" bson:"relevancy_score"`
		SiteScore      uint                `json:"site_score" bson:"site_score"`
		Scrape         Scrape              `bson:"scrape" json:"scrape"`
		UpdatedAt      time.Time           `bson:"updated_at" json:"updated_at"`
		CreatedAt      time.Time           `bson:"created_at" json:"created_at"`
	}
)
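
A hedged sketch of persisting a Page with the official Go Mongo driver; the database and collection names ("krang", "pages") are assumptions, not defined by this module.

func StorePage(ctx context.Context, client *mongo.Client, page krangio.Page) error {
	// "krang" and "pages" are illustrative names; use whatever your deployment defines.
	_, err := client.Database("krang").Collection("pages").InsertOne(ctx, page)
	return err
}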

Errors

Any error returned from a Lambda function should be of type krangio.Error for error reporting. See below for how to create and consume one.

Types

type (
	// Error represents an error that occurred during the
	// processing of a Krang Lambda function.
	Error struct {
		Err     *errors.Error `json:"error"`
		Service string        `json:"service"` // Currently running function, for example "scrape"
		Meta    Meta          `json:"meta"`
	}
	// Meta represents the attributes of a failed task.
	Meta struct {
		GroupSlug  string         `json:"group_slug"`
		TaskID     int64          `json:"task_id"`
		ScrapeID   string         `json:"scrape_id"`
		URL        string         `json:"url"`
		SearchTerm string         `json:"search_term"`
		// Any additional data
		Data       map[string]any `json:"data"`
	}
)

Create a new Lambda error

meta := krangio.Meta{
	GroupSlug:  in.GroupSlug,
	TaskID:     in.TaskID,
	URL:        in.URL,
	SearchTerm: in.SearchTerm,
}

status, err := myThing()
if err != nil {
	return in, krangio.NewError(err, ServiceName, meta)
}
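
Consume one by checking whether a returned error is a *krangio.Error, for example when reporting a failed task. A minimal sketch: process is a stand-in for the task's own logic, and the log call is illustrative only.

if err := process(in); err != nil {
	var kerr *krangio.Error
	if errors.As(err, &kerr) {
		// Structured fields for the error reporter; ToMap is documented below.
		log.Printf("%s failed: %v", kerr.Service, kerr.ToMap())
	}
	return in, err
}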

Proto

Each service contains its own proto definition along with its implementation for Server/Client.

message CompleteRequest {
	string id = 1;
	string scrape_id = 2;
	string url = 3;
	string group_slug = 4;
	int64 task_id = 5;
	string search_term = 6;
}

message Response {
	bool error = 2;
	string message = 3;
}

service TasksService {
	rpc CompleteTask(CompleteRequest) returns(Response) {}
}

Usage

func Send() error {
	conn, err := grpc.Dial(":9000", grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return err
	}
	defer conn.Close()

	s := proto.NewTasksServiceClient(conn)

	response, err := s.CompleteTask(context.Background(), &proto.CompleteRequest{
		Id:         "",
		ScrapeId:   "",
		Url:        "",
		GroupSlug:  "",
		TaskId:     0,
		SearchTerm: "",
	})
	if err != nil {
		return err
	}

	fmt.Println(response)

	return nil
}
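
For the server side, a minimal sketch assuming the generated code follows the standard protoc-gen-go-grpc layout (UnimplementedTasksServiceServer, RegisterTasksServiceServer); the exact symbol names depend on how make generate is configured.

type tasksServer struct {
	proto.UnimplementedTasksServiceServer
}

func (s *tasksServer) CompleteTask(ctx context.Context, in *proto.CompleteRequest) (*proto.Response, error) {
	// Mark the task complete here, then report the outcome back to the caller.
	return &proto.Response{Error: false, Message: "completed " + in.Url}, nil
}

func Serve() error {
	lis, err := net.Listen("tcp", ":9000")
	if err != nil {
		return err
	}

	srv := grpc.NewServer()
	proto.RegisterTasksServiceServer(srv, &tasksServer{})

	return srv.Serve(lis)
}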

Development

To set up this repository, run:

make setup

To generate the proto files run:

make generate

Documentation

Index

Constants

const (
	// LogDatabase defines the database name for log entries
	// via Mongo.
	LogDatabase = "logs"
	// LogService defines the collection name for log entries
	// via Mongo.
	LogService = "api"
)
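
These constants resolve the Mongo namespace for log entries. A hedged sketch, assuming an already connected *mongo.Client and context; the payload fields are illustrative.

coll := client.Database(krangio.LogDatabase).Collection(krangio.LogService)

_, err := coll.InsertOne(ctx, bson.M{
	"service":    "scrape",
	"message":    "page processed",
	"created_at": time.Now(),
})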

Variables

This section is empty.

Functions

func GetObjectID added in v0.2.0

func GetObjectID(hex string) *primitive.ObjectID

GetObjectID returns the primitive.ObjectID if there is one set, otherwise it returns nil.
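
A small sketch of using GetObjectID to populate an optional field such as Page.ScrapeID; the hex literal is illustrative, and per the doc comment a nil result means no ObjectID was set.

scrapeID := krangio.GetObjectID("5f3c7b2e1c9d440000a1b2c3")
if scrapeID == nil {
	return fmt.Errorf("invalid scrape id")
}

page.ScrapeID = scrapeID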

Types

type BackLinkCheck added in v0.0.15

type BackLinkCheck struct {
	GroupSlug string `json:"group_slug" bson:"group_slug"`
	LinkID    int64  `json:"link_id" bson:"link_id"`
	URL       string `json:"url" bson:"url"`
	Link      string `json:"link" bson:"link"`

} //@name BackLinkCheck

BackLinkCheck represents the data sent to the Lambda function for checking if a backlink appears on the page.

type Error

type Error struct {
	Err     *errors.Error `json:"error" bson:"error"`
	Service string        `json:"service" bson:"service"` // Currently running function, for example "scrape"
}

Error represents an error that occurred during the processing of a Krang Lambda function.

func NewError

func NewError(err error, service string) *Error

NewError returns a new Lambda error.

func (*Error) Error

func (e *Error) Error() string

Error returns the JSON representation of the error message by implementing the error interface.

func (*Error) ToMap added in v0.1.3

func (e *Error) ToMap() map[string]any

ToMap returns a map of the error if there is one.

type Page

type Page struct {
	ID             primitive.ObjectID  `json:"id" bson:"_id,omitempty"`
	ScrapeID       *primitive.ObjectID `json:"scrape_id" bson:"scrape_id"`
	UUID           string              `json:"uuid,omitempty" bson:"-"` // Used for SQS dedupe.
	URL            string              `json:"url" bson:"url"`
	GroupSlug      string              `json:"group_slug" bson:"group_slug"`
	ProjectID      int64               `json:"project_id" bson:"project_id"`
	TaskID         int64               `json:"task_id" bson:"task_id"`
	SearchTerm     string              `json:"search_term" bson:"search_term"`
	RelevancyScore int                 `json:"relevancy_score" bson:"relevancy_score"`
	SiteScore      int                 `json:"site_score" bson:"site_score"`
	Scrape         Scrape              `json:"scrape" bson:"scrape,omitempty"`
	Status         ScrapeStatus        `json:"status" bson:"status"`
	Usage          PageUsage           `json:"usage" bson:"usage"`
	UpdatedAt      time.Time           `json:"updated_at" bson:"updated_at"`
	CreatedAt      time.Time           `json:"created_at" bson:"created_at"`

} //@name Page

Page represents an individual task scrape including metadata from the Task.

func (*Page) HasScrape added in v0.2.0

func (p *Page) HasScrape() bool

HasScrape determines if a page has a Scrape ID attached to it.

func (*Page) LogMessage added in v0.2.0

func (p *Page) LogMessage(service string) string

LogMessage returns a formatted message for processing Lambda functions.

func (*Page) LoggerFields added in v0.2.0

func (p *Page) LoggerFields(service string) map[string]any

LoggerFields returns logrus Fields to log the Page meta data.
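
LogMessage and LoggerFields are designed to be used together; a minimal sketch with logrus, where "scrape" is an illustrative service name.

// p is a *krangio.Page; "scrape" identifies the currently running service.
logrus.WithFields(p.LoggerFields("scrape")).Info(p.LogMessage("scrape"))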

type PageUsage added in v0.2.7

type PageUsage struct {
	Ahrefs PageUsageAhrefs `json:"ahrefs" bson:"ahrefs"`

} //@name PageUsage

PageUsage represents any costs that have been associated with the page.

type PageUsageAhrefs added in v0.2.7

type PageUsageAhrefs struct {
	Rows         int  `json:"rows_used" bson:"rows"`
	UnitCostRows int  `json:"unit_cost_rows" bson:"unit_cost_rows"`
	Cached       bool `json:"cached" bson:"cached"`
	Called       bool `json:"called" bson:"called"`

} //@name PageUsageAhrefs

PageUsageAhrefs represents the total cost of a single call to Ahrefs.

type Scrape

type Scrape struct {
	ID         primitive.ObjectID `json:"id" bson:"_id,omitempty"`
	URL        string             `json:"-" bson:"url" swaggerignore:"true"`
	HTTPStatus int                `json:"http_status" bson:"http_status"`
	Content    ScrapeContent      `json:"content" bson:"content"`
	Metrics    ScrapeMetrics      `json:"metrics" bson:"metrics"`
	Message    string             `json:"message" bson:"message"`
	Status     ScrapeStatus       `json:"status" bson:"status"`
	Error      any                `json:"error" bson:"error"`
	Service    string             `json:"service" bson:"service"` // Currently running function, for example "scrape"
	UpdatedAt  time.Time          `json:"updated_at" bson:"updated_at"`
	CreatedAt  time.Time          `json:"created_at" bson:"created_at"`

} //@name Scrape

Scrape represents an individual scrape of a page and its various metrics.

type ScrapeAhrefs added in v0.2.7

type ScrapeAhrefs struct {
	DR   float64  `json:"dr" bson:"dr"`     // Domain Ranking
	Rank *float64 `json:"rank" bson:"rank"` // Ahrefs Rank

} //@name ScrapeAhrefs

ScrapeAhrefs represents the metrics retrieved from the Ahrefs API, including cost, rows, and whether the response was cached.

type ScrapeContent

type ScrapeContent struct {
	H1            string          `json:"h1" bson:"h1"`
	H2            string          `json:"h2" bson:"h2"`
	Title         string          `json:"title" bson:"title"`
	ExternalLinks int             `json:"external_links" bson:"external_links"`
	Keywords      []ScrapeKeyword `json:"keywords" bson:"keywords"`

} //@name ScrapeContent

ScrapeContent represents the HTML markup of a page including any <body> content that's relevant for scoring.

type ScrapeKeyword added in v0.0.13

type ScrapeKeyword struct {
	Term     string  `json:"term" bson:"term"`
	Salience float64 `json:"salience" bson:"salience"`

} //@name ScrapeKeyword

ScrapeKeyword represents a singular entity extracted from a given piece of text.

type ScrapeMetrics

type ScrapeMetrics struct {
	Ahrefs ScrapeAhrefs `json:"ahrefs" bson:"ahrefs"`

} //@name ScrapeMetrics

ScrapeMetrics represents the scores and metrics retrieved from Ahrefs, Moz and Majestic.

type ScrapeStatus added in v0.1.3

type ScrapeStatus string

ScrapeStatus represents the status of a page task.

const (
	// ScrapeStatusProcessing is the status that defines
	// a processing page.
	ScrapeStatusProcessing ScrapeStatus = "processing"
	// ScrapeStatusFailed is the status that defines
	// a failed page task.
	ScrapeStatusFailed ScrapeStatus = "failed"
	// ScrapeStatusTimedOut is the status that defines
	// a timed out page task.
	ScrapeStatusTimedOut ScrapeStatus = "timed-out"
	// ScrapeStatusSuccess is the status that defines
	// a successful page task.
	ScrapeStatusSuccess ScrapeStatus = "success"
)
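
A small sketch of branching on a Page's status after processing; what each branch does is up to the caller.

switch page.Status {
case krangio.ScrapeStatusSuccess:
	// Task finished, nothing more to do.
case krangio.ScrapeStatusFailed, krangio.ScrapeStatusTimedOut:
	// Requeue or report the failure.
default:
	// ScrapeStatusProcessing: still in flight.
}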

