Fork: Databricks (DBRX), published Feb 5, 2024

databricks/bazel-remote

forked from buchgr/bazel-remote

Description: A remote cache for Bazel

License: Apache-2.0

Stars: 0

Forks: 0

Open issues: 0

Created: 2024-02-05T13:09:00Z

Pushed: 2025-04-17T12:54:09Z

Default branch: master

Fork: yes

Parent repository: buchgr/bazel-remote

Archived: no

README:

bazel-remote cache

bazel-remote is an HTTP/1.1 and gRPC server that is intended to be used as a remote build cache for REAPI clients like Bazel or as a component of a remote execution service.

The cache contents are stored in a directory on disk with a maximum cache size, and bazel-remote will automatically enforce this limit as needed, by deleting the least recently used files. S3, GCS and experimental Azure blob storage proxy backends are also supported.

Note that while bazel-remote is consumable as a Go module, we provide no guarantees on the stability or backwards compatibility of the APIs. We do, however, attempt to keep the standalone executable backwards-compatible between releases, and cache directory format changes are only allowed in major version upgrades.

Project status: bazel-remote has been serving TBs of cache artifacts per day since April 2018, both on commodity hardware and AWS servers. Outgoing bandwidth can exceed 15 Gbit/s on the right AWS instance type.

HTTP/1.1 REST API

Cache entries are set and retrieved by key, and there are two types of keys that can be used:

1. Content addressed storage (CAS), where the key is the lowercase SHA256 hash of the entry. The REST API for these entries is: /cas/ or with an optional but ignored instance name: //cas/.
2. Action cache, where the key is an arbitrary 64 character lowercase hexadecimal string. Bazel uses the SHA256 hash of an action as the key, to store the metadata created by the action. The REST API for these entries is: /ac/ or with an optional instance name: //ac/.
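As a rough illustration of the two key types, the sketch below derives a CAS key from a blob's bytes and builds request paths for both caches. The helper names (cas_key, cas_path, ac_path) are hypothetical and not part of bazel-remote, and the layout with the instance name placed before /cas/ or /ac/ is an assumption based on the description above.

```python
import hashlib
import re

# A valid key is exactly 64 lowercase hexadecimal characters.
_HEX64 = re.compile(r"[0-9a-f]{64}")


def cas_key(blob: bytes) -> str:
    """Return the lowercase hex SHA256 of the blob, used as its CAS key."""
    return hashlib.sha256(blob).hexdigest()


def _check_key(key: str) -> None:
    if _HEX64.fullmatch(key) is None:
        raise ValueError("key must be 64 lowercase hex characters")


def _prefix(instance: str) -> str:
    # Assumption: the optional instance name is the leading path component.
    return f"/{instance}" if instance else ""


def cas_path(key: str, instance: str = "") -> str:
    """Build the URL path for a CAS entry (hypothetical helper)."""
    _check_key(key)
    return f"{_prefix(instance)}/cas/{key}"


def ac_path(key: str, instance: str = "") -> str:
    """Build the URL path for an action cache entry (hypothetical helper)."""
    _check_key(key)
    return f"{_prefix(instance)}/ac/{key}"
```

For example, cas_path(cas_key(b"")) yields the path of the empty CAS blob used in the endpoint examples further down.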

Values are stored via HTTP PUT requests, and retrieved via GET requests. HEAD requests can be used to confirm whether a key exists or not.
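The three request types can be sketched with Python's standard urllib. This is only an illustration: build_request is a hypothetical helper, and the base URL (http://localhost:8080 in the usage note) is an assumption borrowed from the curl examples in this README.

```python
import urllib.request
from typing import Optional

# The HTTP methods described above: PUT stores, GET retrieves, HEAD checks existence.
ALLOWED_METHODS = ("PUT", "GET", "HEAD")


def build_request(base_url: str, path: str, method: str,
                  body: Optional[bytes] = None) -> urllib.request.Request:
    """Build (but do not send) a request for a cache entry.

    `path` is a cache path such as "/cas/<64 hex chars>" or "/ac/<64 hex chars>".
    """
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    # Only PUT carries a body; GET and HEAD must not.
    if (method == "PUT") != (body is not None):
        raise ValueError("a body is required for PUT and forbidden otherwise")
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=body, method=method)
```

A request built this way can be sent with urllib.request.urlopen(req), for example build_request("http://localhost:8080", path, "PUT", blob) followed by a GET of the same path. How the server responds to a HEAD for a missing key is not shown here.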

If GET requests specify zstd in the Accept-Encoding header, then zstandard-encoded data may be returned.

To upload zstandard compressed data, PUT requests must set Content-Encoding: zstd and include a custom X-Digest-SizeBytes header with the size of the uncompressed entry. The key must also refer to the uncompressed entry.
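The header rules for compressed transfers can be sketched as follows. The function names are hypothetical, the compressed bytes are assumed to come from whichever zstandard library the client already uses (none is imported here), and checking the response's Content-Encoding header to detect a compressed reply is an assumption based on standard HTTP behaviour rather than something this README states.

```python
import hashlib
from typing import Dict, Mapping, Tuple


def zstd_put_parts(uncompressed: bytes,
                   compressed: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Return (path, headers, body) for uploading a zstd-compressed CAS blob.

    The key and X-Digest-SizeBytes both describe the UNCOMPRESSED entry;
    only the request body is compressed.
    """
    key = hashlib.sha256(uncompressed).hexdigest()
    headers = {
        "Content-Encoding": "zstd",
        "X-Digest-SizeBytes": str(len(uncompressed)),
    }
    return f"/cas/{key}", headers, compressed


def zstd_get_headers() -> Dict[str, str]:
    """Headers telling the server that a zstd-encoded reply is acceptable."""
    return {"Accept-Encoding": "zstd"}


def is_zstd_response(headers: Mapping[str, str]) -> bool:
    """Assumption: a compressed reply is marked with Content-Encoding: zstd.

    The server may still answer with uncompressed data, so clients should
    check before decompressing.
    """
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            return value.strip().lower() == "zstd"
    return False
```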

If the --enable_ac_key_instance_mangling flag is specified and the instance name is not empty, then action cache keys are hashed along with the instance name to produce the action cache lookup key. Since the URL path is processed with Go's path.Clean function before extracting the instance name, clients should avoid using repeated slashes, ./ and ../ in the URL.
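A client can guard against the path cleaning issue by checking instance names before use. The sketch below is a conservative, hypothetical client-side check; it does not reproduce Go's path.Clean or bazel-remote's key hashing, and it also rejects leading and trailing slashes on the assumption that those would be altered by cleaning as well.

```python
def is_clean_instance_name(name: str) -> bool:
    """Return True if `name` contains nothing that path cleaning would rewrite.

    Rejects empty segments (repeated, leading or trailing slashes) and the
    special segments "." and "..". An empty name is accepted, since it
    simply means that no instance name is used.
    """
    if name == "":
        return True
    return all(segment not in ("", ".", "..") for segment in name.split("/"))
```

Names such as "team/project" pass, while "team//project", "./team" and "team/../other" are rejected.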

Values stored in the action cache are validated as an ActionResult protobuf message as per the Bazel Remote Execution API v2 unless validation is disabled by configuration. The HTTP server also supports reading and writing JSON encoded protobuf ActionResult messages to the action cache by using HTTP headers Accept: application/json for GET requests and Content-type: application/json for PUT requests.

Useful endpoints

/status

Returns the cache status/info.

$ curl http://localhost:8080/status
{
  "CurrSize": 414081715503,
  "ReservedSize": 876400,
  "MaxSize": 8589934592000,
  "NumFiles": 621413,
  "ServerTime": 1588329927,
  "GitCommit": "940d540d3a7f17939c3df0038530122eabef2f19",
  "NumGoroutines": 12
}
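
A monitoring script might turn this response into a fill ratio. The sketch below is a minimal example that assumes CurrSize and MaxSize are expressed in the same unit; cache_usage is a hypothetical helper, and how ReservedSize should be counted is not covered here.

```python
import json


def cache_usage(status_json: str) -> float:
    """Return CurrSize / MaxSize from a /status response body."""
    status = json.loads(status_json)
    max_size = status["MaxSize"]
    if max_size <= 0:
        raise ValueError("MaxSize must be positive")
    return status["CurrSize"] / max_size
```

With the sample response above, the result is a little under 5% of the configured maximum.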

/cas/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

The empty CAS blob is always available, even if the cache is empty. This can be used to test that a bazel-remote instance is running and accepting requests.

$ curl --head --fail http://localhost:8080/cas/e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
HTTP/1.1 200 OK
Content-Length: 0
Date: Fri, 01 May 2020 10:42:06 GMT
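
The same liveness probe can be sketched in Python. This is an illustration, not an official client: is_alive is a hypothetical helper, and the opener parameter exists only so that the function can be exercised without a running server.

```python
import urllib.error
import urllib.request

# SHA256 of the empty blob, which is always available in the CAS.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"


def is_alive(base_url: str, opener=urllib.request.urlopen,
             timeout: float = 5.0) -> bool:
    """Send a HEAD request for the empty CAS blob and report success.

    Returns False on any HTTP error status or connection failure.
    """
    url = f"{base_url.rstrip('/')}/cas/{EMPTY_SHA256}"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling is_alive("http://localhost:8080") mirrors the curl command above.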

Prometheus Metrics

To query endpoint metrics see github.com/slok/go-http-metrics's query examples.

gRPC API

bazel-remote also supports the ActionCache, ContentAddressableStorage and Capabilities services in the Bazel Remote Execution API v2, and the corresponding parts of the Byte Stream API.

When using the --enable_ac_key_instance_mangling feature, clients are advised to avoid repeated slashes, ../ and ./ strings in the instance name, for consistency with the HTTP interface.

Prometheus Metrics

To query endpoint metrics see github.com/grpc-ecosystem/go-grpc-prometheus's metrics documentation.

Experimental Remote Asset API Support

There is (very) experimental support for a subset of the Fetch service in the Remote Asset API which can be enabled with the --experimental_remote_asset_api flag.

To use this with Bazel, specify --experimental_remote_downloader=grpc://replace-with-your.host:port.

Byte Stream compressed-blobs

This version of bazel-remote supports the Byte Stream compressed-blobs REAPI feature, which provides a way for clients to upload and download CAS blobs compressed with zstandard, in order to improve network efficiency.

Uploaded CAS blobs are stored in a zstandard compressed format by default, which can increase the effective cache size and reduce load on the server if clients also download blobs in zstandard compressed form. If you would rather store CAS blobs in uncompressed form, add --storage_mode uncompressed to your configuration.

Usage

If a YAML configuration file is specified by the --config_file command line flag or BAZEL_REMOTE_CONFIG_FILE environment variable, then other command line flags and environment…
