Running the data service stack

The knowledge graph data service consists of three components: 1) the indexers, 2) an IPFS cache, and 3) the API. The indexers read through the knowledge graph blockchain serially and index relevant events in order. When an event references IPFS content, the indexers read that content from the IPFS cache rather than from IPFS directly. Reading from IPFS can be slow, especially for large files, so the IPFS cache runs as a separate process that reads through the chain in an optimized way and writes the IPFS contents to a local store on disk. Lastly, the API reads indexed data from the database and serves it to consumers in an ergonomic way.

Install dependencies and run migrations

The data service is dependent on the following tools:

The database has an expected schema for the IPFS cache and indexers. For now all of the schemas are managed through the API project.

To run migrations, first populate an .env file in the /api directory with the following:

DATABASE_URL="postgresql://localhost:5432/gaia" # or any connection string
CHAIN_ID="80451" # or 19411 for mainnet
IPFS_KEY=''
IPFS_GATEWAY_WRITE=''
IPFS_GATEWAY_READ=''
IPFS_ALTERNATIVE_GATEWAY_KEY='' # when using Pinata, use the JWT as in the docs example (it contains the API secret)
IPFS_ALTERNATIVE_GATEWAY_WRITE=''
RPC_ENDPOINT=''
DEPLOYER_PK=''
OPENSEARCH_URL="http://localhost:9200" # OpenSearch server URL (optional - if not set, search routes won't be added)

You can run a PostgreSQL container using the docker compose up command and then set the DATABASE_URL to postgresql://postgres:postgres@localhost:5432/gaia.

Then run the following commands from within the /api directory:

bun install
bun run db:migrate

If done correctly, you should see logs signaling a successful migration.

Running the IPFS cache

The indexers depend on the IPFS cache to handle preprocessing of IPFS contents. To run the cache, populate an .env file in the root of this directory with the following:

SUBSTREAMS_API_TOKEN=""
SUBSTREAMS_ENDPOINT=""
DATABASE_URL="postgresql://localhost:5432/gaia" # or any connection string
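Before starting the cache, it can help to confirm that the variables above are actually present in the environment. The following is a minimal sketch, not part of the project: the helper name missing_vars is hypothetical, and it assumes that all three variables are required and that an empty value counts as missing. It only inspects the process environment, so it assumes the values from the .env file have already been exported.

```rust
use std::env;

/// Variable names taken from the .env example above.
/// Assumption: all three are required by the cache.
const REQUIRED: [&str; 3] = ["SUBSTREAMS_API_TOKEN", "SUBSTREAMS_ENDPOINT", "DATABASE_URL"];

/// Returns the names in `required` that `lookup` reports as unset or blank.
/// `lookup` is injected so the check can run against any source of values.
fn missing_vars<F>(required: &[&str], lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    required
        .iter()
        // Assumption: a value made only of whitespace is treated as missing.
        .filter(|name| lookup(**name).map_or(true, |value| value.trim().is_empty()))
        .map(|name| name.to_string())
        .collect()
}

fn main() {
    // Look each name up in the current process environment.
    let missing = missing_vars(&REQUIRED, |name| env::var(name).ok());
    if missing.is_empty() {
        println!("all required variables are set");
    } else {
        eprintln!("missing or empty: {}", missing.join(", "));
    }
}
```

Passing the lookup as a closure keeps the check independent of how the variables are loaded, so the same function works with a parsed .env file or with the real environment.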

Then run the following command:

cargo run -p cache
# or with the --release flag to run in "production" mode
# cargo run -p cache --release

If done correctly, you should see the cache begin processing events and writing data to the ipfs_cache table in your PostgreSQL database.

The cache will continue to populate so long as the Rust process is still executing. If you run the process again, it will start from the beginning of the chain, but skip any cache entries that already exist in the database.
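The restart behavior described above (start from the beginning of the chain, skip entries that are already cached) can be illustrated with a small sketch. This is not the project's implementation: CacheStore, cache_if_missing, and run_cache are hypothetical names, and an in-memory HashMap stands in for the ipfs_cache table.

```rust
use std::collections::HashMap;

/// Stand-in for the ipfs_cache table: maps an IPFS URI to its stored contents.
/// Hypothetical type; the real cache writes to the database.
struct CacheStore {
    entries: HashMap<String, Vec<u8>>,
}

impl CacheStore {
    fn new() -> Self {
        CacheStore { entries: HashMap::new() }
    }

    /// Fetches and stores `uri` only if it is not cached yet.
    /// Returns true when the fetch ran, false when the entry was skipped.
    fn cache_if_missing<F>(&mut self, uri: &str, fetch: F) -> bool
    where
        F: FnOnce(&str) -> Vec<u8>,
    {
        if self.entries.contains_key(uri) {
            return false; // cached on a previous run: skip the slow IPFS read
        }
        let contents = fetch(uri);
        self.entries.insert(uri.to_string(), contents);
        true
    }
}

/// Walks every URI from the start of the chain and returns how many fetches ran.
fn run_cache(store: &mut CacheStore, chain_uris: &[&str]) -> usize {
    let mut fetched = 0;
    for uri in chain_uris {
        // The closure is a placeholder for the real IPFS read.
        if store.cache_if_missing(uri, |u| u.as_bytes().to_vec()) {
            fetched += 1;
        }
    }
    fetched
}

fn main() {
    let chain = ["ipfs://a", "ipfs://b", "ipfs://c"];
    let mut store = CacheStore::new();
    let first = run_cache(&mut store, &chain);
    let second = run_cache(&mut store, &chain); // simulates running the process again
    println!("first run fetched {first}, second run fetched {second}");
}
```

In this sketch the second pass still visits every URI but performs no fetches, which is why re-running is cheap compared with the initial population.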

Running the knowledge graph indexer

The knowledge graph indexer reads through the chain sequentially, listening for any events related to published edits. When it encounters an IPFS hash it reads from the cache, runs any transformations, then writes to the database.
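The flow described above (read events in order, look up the IPFS hash in the cache, transform, write) might be sketched as follows. Every name here is hypothetical: EditPublished, IndexedEdit, and index_events do not come from the project, a HashMap stands in for the cache, a Vec stands in for the database, and trimming whitespace stands in for the real transformations. Stopping with an error on a cache miss is also an assumption, since the source does not say how the indexer handles that case.

```rust
use std::collections::HashMap;

/// Hypothetical event shape: a published edit that points at an IPFS hash.
struct EditPublished {
    block_number: u64,
    ipfs_hash: String,
}

/// Hypothetical row written to the database after transformation.
#[derive(Debug, PartialEq)]
struct IndexedEdit {
    block_number: u64,
    content: String,
}

/// Processes events in chain order: cache lookup, transform, write.
/// Returns Err carrying the missing hash if the cache has no entry for an event.
fn index_events(
    events: &[EditPublished],
    cache: &HashMap<String, String>,
    db: &mut Vec<IndexedEdit>,
) -> Result<(), String> {
    for event in events {
        // Read from the cache instead of fetching from IPFS directly.
        let raw = cache
            .get(&event.ipfs_hash)
            .ok_or_else(|| event.ipfs_hash.clone())?;
        // Placeholder transformation.
        let content = raw.trim().to_string();
        db.push(IndexedEdit { block_number: event.block_number, content });
    }
    Ok(())
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert("Qm1".to_string(), "  first edit ".to_string());
    let events = vec![EditPublished { block_number: 1, ipfs_hash: "Qm1".to_string() }];
    let mut db = Vec::new();
    index_events(&events, &cache, &mut db).expect("all hashes are cached");
    println!("{db:?}");
}
```

Because events are handled strictly in order, rows written before a failure stay in place, and the error identifies the first hash that the cache could not serve.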

To run the knowledge graph indexer, run the following command:

cargo run -p indexer
# or with the --release flag to run in "production" mode
# cargo run -p indexer --release

If done correctly, you should see the indexer begin processing the knowledge graph events sequentially.

Running the actions indexer

The actions indexer processes all knowledge graph onchain actions. Currently the only action implemented is entity curation/voting.

To run the actions indexer, run the following command:

cargo run -p actions-indexer
# or with the --release flag to run in "production" mode
# cargo run -p actions-indexer --release

Other indexers

Currently only the knowledge graph indexer and the actions indexer are implemented, but in the near future there will be other indexers for processing governance events or managing the knowledge graph's history.

Documentation

Architecture and design documents are in the docs/ directory:

Hermes Architecture - Event streaming from blockchain to Kafka
K8s Secrets Isolation - Kubernetes secrets management

Project-specific documentation lives in each project's directory:

Atlas - Canonical graph computation
Hermes Substream - Event filtering and modification
