google.cloud.forseti.services.inventory.crawler module

Crawler implementation.

class Crawler(config)

Bases: google.cloud.forseti.services.inventory.base.crawler.Crawler

Simple single-threaded Crawler implementation.

dispatch(callback)

Dispatch crawling of a subtree.

Parameters: callback (function) – Callback to dispatch.
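
As a rough illustration of what dispatching a subtree can look like in a single-threaded crawler, the sketch below runs each callback inline on the caller's thread, which yields a depth-first walk. SketchCrawler, its visited list, and the nested-dict tree are hypothetical stand-ins and are not taken from the Forseti source.

```python
class SketchCrawler:
    """Hypothetical stand-in for a single-threaded crawler (not the Forseti class)."""

    def __init__(self):
        self.visited = []

    def dispatch(self, callback):
        # Single-threaded assumption: run the subtree crawl inline.
        callback()

    def visit(self, name, children):
        # Record the resource, then dispatch one callback per child subtree.
        self.visited.append(name)
        for child_name, grandchildren in children.items():
            # Bind loop variables as defaults so each callback keeps its own values.
            self.dispatch(lambda n=child_name, c=grandchildren: self.visit(n, c))


tree = {"folder-a": {"project-1": {}}, "folder-b": {}}
crawler = SketchCrawler()
crawler.visit("organization", tree)
print(crawler.visited)
```

Because every callback finishes before dispatch returns, the traversal order here is deterministic; a threaded dispatch would not give that guarantee.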

get_client()

Get the GCP API client.

Returns: GCP API client
Return type: object

on_child_error(error)

Process an error generated by a child of a resource.

The inventory does not stop on child errors; it raises a warning instead.

Parameters: error (str) – Error message to handle.
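
The "warn and continue" behaviour described for child errors can be sketched as follows. This is only an illustration under stated assumptions: SketchProgresser, its on_warning method, and crawl_children are hypothetical names invented for the example, not the Forseti API.

```python
class SketchProgresser:
    """Hypothetical progresser that only collects warnings."""

    def __init__(self):
        self.warnings = []

    def on_warning(self, message):
        self.warnings.append(message)


def crawl_children(children, progresser):
    """Fetch each child; report failures as warnings and keep going."""
    stored = []
    for name, fetch in children:
        try:
            stored.append(fetch())
        except Exception as exc:  # a failed child must not abort the crawl
            progresser.on_warning(f"{name}: {exc}")
    return stored


def denied():
    raise RuntimeError("permission denied")


progresser = SketchProgresser()
stored = crawl_children(
    [("bucket-1", lambda: "bucket-1 data"),
     ("bucket-2", denied),
     ("bucket-3", lambda: "bucket-3 data")],
    progresser,
)
print(stored)
print(progresser.warnings)
```

The failing child is reported once and its siblings are still stored, which is the property the description above relies on.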

run(resource)

Run the crawler, given a start resource.

Parameters: resource (object) – Resource to start with.
Returns: The filled progresser described in inventory.
Return type: QueueProgresser

update(resource)

Update the row of an existing resource.

Parameters: resource (Resource) – Resource to update.
Raises: Exception – Reraises any exception.

visit(resource)

Handle a newly found resource.

Parameters: resource (object) – Resource to handle.
Raises: Exception – Reraises any exception.

write(resource)

Save resource to storage.

Parameters: resource (object) – Resource to handle.

class CrawlerConfig(storage, progresser, api_client, variables=None)

Bases: google.cloud.forseti.services.inventory.base.crawler.CrawlerConfig

Crawler configuration to inject dependencies.
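
To make "inject dependencies" concrete, here is a minimal sketch in which a crawler receives its storage and API client through a config object, so a test can swap in fakes. SketchConfig, InMemoryStorage, and ConfiguredCrawler are hypothetical; only the constructor parameter names mirror the signature above, and the attribute names are assumptions.

```python
class SketchConfig:
    """Hypothetical config holding the crawler's collaborators."""

    def __init__(self, storage, progresser, api_client, variables=None):
        self.storage = storage
        self.progresser = progresser
        self.client = api_client          # attribute name is an assumption
        self.variables = variables or {}


class InMemoryStorage:
    """Fake storage that keeps written resources in a list."""

    def __init__(self):
        self.rows = []

    def write(self, resource):
        self.rows.append(resource)


class ConfiguredCrawler:
    """Hypothetical crawler that reaches every dependency through its config."""

    def __init__(self, config):
        self.config = config

    def get_client(self):
        return self.config.client

    def write(self, resource):
        self.config.storage.write(resource)


config = SketchConfig(InMemoryStorage(), progresser=None, api_client="fake-client")
crawler = ConfiguredCrawler(config)
crawler.write({"name": "project-1"})
print(config.storage.rows)
```

Since the crawler never constructs its own storage or client, replacing either one requires no change to the crawler itself.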

class ParallelCrawler(config)

Bases: google.cloud.forseti.services.inventory.crawler.Crawler

Multi-threaded Crawler implementation.

_process_queue()

Process items in the queue until the shutdown event is set.

_start_workers()

Start a pool of worker threads for processing the dispatch queue.
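
The worker pool, dispatch queue, and shutdown event described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the Forseti implementation: SketchParallelCrawler, its attributes, the 0.1 second poll interval, and the squaring workload are all invented for the example.

```python
import queue
import threading


class SketchParallelCrawler:
    """Hypothetical stand-in for a multi-threaded crawler (not the Forseti class)."""

    def __init__(self, threads=4):
        self._queue = queue.Queue()
        self._shutdown = threading.Event()
        self._threads = threads
        self._workers = []
        self._lock = threading.Lock()
        self.results = []

    def _process_queue(self):
        # Keep pulling callbacks until the shutdown event is set.
        while not self._shutdown.is_set():
            try:
                callback = self._queue.get(timeout=0.1)
            except queue.Empty:
                continue  # nothing queued yet; re-check the shutdown event
            try:
                callback()
            finally:
                self._queue.task_done()

    def _start_workers(self):
        # Start a fixed-size pool of worker threads.
        for _ in range(self._threads):
            worker = threading.Thread(target=self._process_queue, daemon=True)
            worker.start()
            self._workers.append(worker)

    def dispatch(self, callback):
        # Queue the work instead of running it inline.
        self._queue.put(callback)

    def _record(self, item):
        with self._lock:
            self.results.append(item * item)

    def run(self, items):
        self._start_workers()
        for item in items:
            self.dispatch(lambda i=item: self._record(i))
        self._queue.join()       # wait until every queued callback has finished
        self._shutdown.set()     # then tell the workers to exit
        for worker in self._workers:
            worker.join()
        return sorted(self.results)


squares = SketchParallelCrawler(threads=3).run(range(5))
print(squares)
```

Completion order across threads is not deterministic, which is why the sketch sorts the results before returning them.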

dispatch(callback)

Dispatch crawling of a subtree.

Parameters: callback (function) – Callback to dispatch.

on_child_error(error)

Process an error generated by a child of a resource.

The inventory does not stop on child errors; it raises a warning instead.

Parameters: error (str) – Error message to handle.

run(resource)

Run the crawler, given a start resource.

Parameters: resource (Resource) – Resource to start with.
Returns: The filled progresser described in inventory.
Return type: QueueProgresser

update(resource)

Update the row of an existing resource.

Parameters: resource (Resource) – The database row of the resource to update.
Raises: Exception – Reraises any exception.

write(resource)

Save resource to storage.

Parameters: resource (Resource) – Resource to handle.

class ParallelCrawlerConfig(storage, progresser, api_client, threads=10, variables=None)

Bases: google.cloud.forseti.services.inventory.base.crawler.CrawlerConfig

Multithreaded crawler configuration, to inject dependencies.

run_crawler(storage, progresser, config, parallel=True)

Run the crawler with a determined configuration.

Parameters:
  • storage (object) – Storage implementation to use.
  • progresser (object) – Progresser to notify status updates.
  • config (object) – Inventory configuration on the server.
  • parallel (bool) – If True, use the parallel crawler implementation.
Returns: The progresser implemented in inventory.
Return type: QueueProgresser
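
The effect of the parallel flag can be pictured as a choice between two implementations behind one entry point. The sketch below is a guess at the shape of that choice, not the actual function body: SingleThreaded, MultiThreaded, run_crawler_sketch, and the threads argument are hypothetical, and the returned strings merely stand in for a progresser.

```python
class SingleThreaded:
    """Hypothetical single-threaded implementation."""

    def run(self, resource):
        return f"crawled {resource} on one thread"


class MultiThreaded:
    """Hypothetical multi-threaded implementation."""

    def __init__(self, threads):
        self.threads = threads

    def run(self, resource):
        return f"crawled {resource} on {self.threads} threads"


def run_crawler_sketch(resource, parallel=True, threads=10):
    # Pick the implementation from the flag, then run it the same way.
    crawler = MultiThreaded(threads) if parallel else SingleThreaded()
    return crawler.run(resource)


print(run_crawler_sketch("organization"))
print(run_crawler_sketch("organization", parallel=False))
```

Callers see the same return value type either way, so switching to the single-threaded path for debugging needs only the flag.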