121 changes: 108 additions & 13 deletions AGENTS.md
@@ -14,21 +14,28 @@ pathrex/
├── Cargo.toml # Crate manifest (edition 2024)
├── build.rs # Links LAGraph + LAGraphX; optionally regenerates FFI bindings
├── src/
-│ ├── lib.rs # Public modules: graph, formats, lagraph_sys; utils is pub(crate)
+│ ├── lib.rs # Public modules: graph, formats, rpq, sparql, lagraph_sys, utils
│ ├── main.rs # Binary entry point (placeholder)
│ ├── lagraph_sys.rs # FFI module — includes generated bindings
│ ├── lagraph_sys_generated.rs# Bindgen output (checked in, regenerated in CI)
-│ ├── utils.rs # Internal helpers: CountingBuilder, CountOutput, VecSource,
-│ │ # grb_ok! and la_ok! macros
+│ ├── utils.rs # Public helpers: CountingBuilder, CountOutput, VecSource,
+│ │ # grb_ok! and la_ok! macros, build_graph
│ ├── graph/
│ │ ├── mod.rs # Core traits (GraphBuilder, GraphDecomposition, GraphSource,
│ │ │ # Backend, Graph<B>), error types, RAII wrappers, GrB init
│ │ └── inmemory.rs # InMemory marker, InMemoryBuilder, InMemoryGraph
│ ├── rpq/
│ │ ├── mod.rs # RPQ evaluation trait (RpqEvaluator), RpqResult, RpqError
│ │ ├── nfarpq.rs # NFA-based RPQ evaluator using LAGraph_RegularPathQuery
│ │ └── rpqmatrix.rs # Plan-based RPQ evaluator using LAGraph_RPQMatrix
│ ├── sparql/
│ │ └── mod.rs # SPARQL parsing (spargebra), PathTriple extraction, parse_rpq
│ └── formats/
│ ├── mod.rs # FormatError enum, re-exports
│ └── csv.rs # Csv<R> — CSV → Edge iterator (CsvConfig, ColumnSpec)
├── tests/
-│ └── inmemory_tests.rs # Integration tests for InMemoryBuilder / InMemoryGraph
+│ ├── inmemory_tests.rs # Integration tests for InMemoryBuilder / InMemoryGraph
+│ └── nfarpq_tests.rs # Integration tests for NfaRpqEvaluator
├── deps/
│ └── LAGraph/ # Git submodule (SparseLinearAlgebra/LAGraph)
└── .github/workflows/ci.yml # CI: build GraphBLAS + LAGraph, cargo build & test
@@ -204,6 +211,79 @@ Configuration is via [`CsvConfig`](src/formats/csv.rs:17):
[`ColumnSpec`](src/formats/csv.rs:11) is either `Index(usize)` or `Name(String)`.
Name-based lookup requires `has_header: true`.
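
The lookup rule above can be sketched as follows. This is a minimal illustration, not the crate's code: `resolve_column` and the local `ColumnSpec` definition are hypothetical stand-ins, and only the two variant names come from the text.

```rust
/// Hypothetical stand-in for the crate's `ColumnSpec`; only the two variant
/// names are taken from the description above.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnSpec {
    Index(usize),
    Name(String),
}

/// Resolve a column spec to a zero-based position (hypothetical helper).
/// `header` is `Some(..)` only when the file was read with `has_header: true`.
pub fn resolve_column(spec: &ColumnSpec, header: Option<&[&str]>) -> Result<usize, String> {
    match spec {
        // An index needs no header at all.
        ColumnSpec::Index(i) => Ok(*i),
        // A name can only be resolved against a header row.
        ColumnSpec::Name(name) => {
            let header =
                header.ok_or_else(|| format!("column '{name}' needs has_header: true"))?;
            header
                .iter()
                .position(|h| *h == name.as_str())
                .ok_or_else(|| format!("column '{name}' not found in header"))
        }
    }
}
```

Returning an error for a name without a header, rather than guessing a position, mirrors the stated requirement that name-based lookup needs `has_header: true`.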

### SPARQL parsing (`src/sparql/mod.rs`)

The [`sparql`](src/sparql/mod.rs) module uses the [`spargebra`](https://crates.io/crates/spargebra)
crate to parse SPARQL 1.1 query strings and extract the single property-path
triple pattern that pathrex's RPQ evaluators operate on.

**Supported query form:** `SELECT` queries with exactly one triple or property
path pattern in the `WHERE` clause, e.g.:

```sparql
SELECT ?x ?y WHERE { ?x <knows>/<likes>* ?y . }
```

Key public items:

- [`parse_query(sparql)`](src/sparql/mod.rs:51) — parses a SPARQL string into a
[`spargebra::Query`].
- [`extract_path(query)`](src/sparql/mod.rs:73) — validates a parsed `Query` is a
`SELECT` with a single path pattern and returns a [`PathTriple`](src/sparql/mod.rs:62).
- [`parse_rpq(sparql)`](src/sparql/mod.rs:196) — convenience function combining
`parse_query` + `extract_path` in one call.
- [`PathTriple`](src/sparql/mod.rs:62) — holds the extracted `subject`
([`TermPattern`]), `path` ([`PropertyPathExpression`]), and `object`
([`TermPattern`]).
- [`ExtractError`](src/sparql/mod.rs:31) — error enum for extraction failures
(`NotSelect`, `NotSinglePath`, `UnsupportedSubject`, `UnsupportedObject`,
`VariablePredicate`).
- [`RpqParseError`](src/sparql/mod.rs:204) — combined error for [`parse_rpq`]
wrapping both [`SparqlSyntaxError`] and [`ExtractError`].
- [`DEFAULT_BASE_IRI`](src/sparql/mod.rs:44) — `"http://example.org/"`, the
default base IRI constant.

The module also handles spargebra's desugaring of sequence paths
(`?x <a>/<b>/<c> ?y`) from a chain of BGP triples back into a single
[`PropertyPathExpression::Sequence`].
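
The re-folding of a desugared chain can be sketched with simplified types. Everything here is an assumption for illustration: `Path`, `Step`, and `fold_chain` are hypothetical, terms are plain strings instead of spargebra's `TermPattern`, and the left-nested result is a choice made for this sketch, not necessarily what the module produces.

```rust
/// Simplified stand-in for a path expression (hypothetical; the real type is
/// `spargebra::algebra::PropertyPathExpression`).
#[derive(Debug, Clone, PartialEq)]
pub enum Path {
    Label(String),
    Sequence(Box<Path>, Box<Path>),
}

/// One desugared step `subject --label--> object`, with terms as plain strings.
#[derive(Debug, Clone, PartialEq)]
pub struct Step {
    pub subject: String,
    pub label: String,
    pub object: String,
}

impl Step {
    pub fn new(subject: &str, label: &str, object: &str) -> Self {
        Step {
            subject: subject.to_string(),
            label: label.to_string(),
            object: object.to_string(),
        }
    }
}

/// Fold an ordered chain of steps into `(subject, path, object)`.
/// Returns `None` if the slice is empty or if two consecutive steps do not
/// share their intermediate term.
pub fn fold_chain(steps: &[Step]) -> Option<(String, Path, String)> {
    let (first, rest) = steps.split_first()?;
    let mut path = Path::Label(first.label.clone());
    let mut end = &first.object;
    for step in rest {
        // Each step must start where the previous one ended.
        if &step.subject != end {
            return None;
        }
        path = Path::Sequence(Box::new(path), Box::new(Path::Label(step.label.clone())));
        end = &step.object;
    }
    Some((first.subject.clone(), path, end.clone()))
}
```

For a chain such as `?x a _:b0`, `_:b0 b _:b1`, `_:b1 c ?y`, the sketch yields the endpoints `?x` and `?y` and a nested sequence over `a`, `b`, `c`; the intermediate terms disappear from the result.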

### RPQ evaluation (`src/rpq/`)

The [`rpq`](src/rpq/mod.rs) module provides an abstraction for evaluating
Regular Path Queries (RPQs) over edge-labeled graphs using GraphBLAS/LAGraph.

Key public items:

- [`RpqEvaluator`](src/rpq/mod.rs:47) — trait with a single method
[`evaluate(subject, path, object, graph)`](src/rpq/mod.rs:48) that takes
SPARQL [`TermPattern`] endpoints, a [`PropertyPathExpression`] path, and a
[`GraphDecomposition`], returning an [`RpqResult`](src/rpq/mod.rs:42).
- [`RpqResult`](src/rpq/mod.rs:42) — wraps a [`GraphblasVector`] of reachable
vertices.
- [`RpqError`](src/rpq/mod.rs:21) — error enum covering parse errors, extraction
errors, unsupported paths, missing labels/vertices, and GraphBLAS failures.

#### `NfaRpqEvaluator` (`src/rpq/nfarpq.rs`)

[`NfaRpqEvaluator`](src/rpq/nfarpq.rs:265) implements [`RpqEvaluator`] by:

1. Converting a [`PropertyPathExpression`] into an [`Nfa`](src/rpq/nfarpq.rs:27)
via Thompson's construction ([`Nfa::from_property_path()`](src/rpq/nfarpq.rs:35)).
2. Eliminating ε-transitions via epsilon closure
([`NfaBuilder::epsilon_closure()`](src/rpq/nfarpq.rs:198)).
3. Building one `LAGraph_Graph` per NFA label transition
([`Nfa::build_lagraph_matrices()`](src/rpq/nfarpq.rs:43)).
4. Calling [`LAGraph_RegularPathQuery`] with the NFA matrices, data-graph
matrices, start/final states, and source vertices.

Supported path operators: `NamedNode`, `Sequence`, `Alternative`,
`ZeroOrMore`, `OneOrMore`, `ZeroOrOne`. `Reverse` and `NegatedPropertySet`
return [`RpqError::UnsupportedPath`].

Subject/object resolution: a [`TermPattern::Variable`] means "all vertices";
a [`TermPattern::NamedNode`] resolves to a single vertex via
[`GraphDecomposition::get_node_id()`](src/graph/mod.rs:195).
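
Steps 1 and 2 above can be illustrated with a small, dependency-free sketch. It is not the code in `src/rpq/nfarpq.rs`: the `Path` and `Nfa` types here are hypothetical, labels are plain strings, only four operators are covered, and ε-closures are computed on the fly while simulating the automaton instead of being eliminated up front.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified path expression (labels are plain strings).
pub enum Path {
    Label(String),
    Sequence(Box<Path>, Box<Path>),
    Alternative(Box<Path>, Box<Path>),
    ZeroOrMore(Box<Path>),
}

/// NFA with ε-transitions (`None` label), built by Thompson's construction.
#[derive(Default)]
pub struct Nfa {
    pub n_states: usize,
    pub edges: Vec<(usize, Option<String>, usize)>,
    pub start: usize,
    pub accept: usize,
}

impl Nfa {
    fn new_state(&mut self) -> usize {
        self.n_states += 1;
        self.n_states - 1
    }

    /// Build the fragment for `p` and return its (entry, exit) states.
    fn fragment(&mut self, p: &Path) -> (usize, usize) {
        match p {
            Path::Label(l) => {
                let (s, t) = (self.new_state(), self.new_state());
                self.edges.push((s, Some(l.clone()), t));
                (s, t)
            }
            Path::Sequence(a, b) => {
                let (s1, t1) = self.fragment(a);
                let (s2, t2) = self.fragment(b);
                self.edges.push((t1, None, s2));
                (s1, t2)
            }
            Path::Alternative(a, b) => {
                let (s, t) = (self.new_state(), self.new_state());
                for sub in [a, b] {
                    let (si, ti) = self.fragment(sub);
                    self.edges.push((s, None, si));
                    self.edges.push((ti, None, t));
                }
                (s, t)
            }
            Path::ZeroOrMore(a) => {
                let (s, t) = (self.new_state(), self.new_state());
                let (si, ti) = self.fragment(a);
                // Enter the loop, leave it, skip it entirely, or repeat it.
                self.edges
                    .extend([(s, None, si), (ti, None, t), (s, None, t), (ti, None, si)]);
                (s, t)
            }
        }
    }

    pub fn from_path(p: &Path) -> Nfa {
        let mut nfa = Nfa::default();
        let (s, t) = nfa.fragment(p);
        nfa.start = s;
        nfa.accept = t;
        nfa
    }

    /// All states reachable from `states` using only ε-transitions.
    pub fn epsilon_closure(&self, states: &BTreeSet<usize>) -> BTreeSet<usize> {
        let mut closure = states.clone();
        let mut stack: Vec<usize> = states.iter().copied().collect();
        while let Some(q) = stack.pop() {
            for (from, label, to) in &self.edges {
                if *from == q && label.is_none() && closure.insert(*to) {
                    stack.push(*to);
                }
            }
        }
        closure
    }

    /// Check whether a sequence of edge labels is accepted.
    pub fn accepts(&self, word: &[&str]) -> bool {
        let mut current = self.epsilon_closure(&BTreeSet::from([self.start]));
        for sym in word {
            let moved: BTreeSet<usize> = self
                .edges
                .iter()
                .filter(|(from, label, _)| current.contains(from) && label.as_deref() == Some(*sym))
                .map(|(_, _, to)| *to)
                .collect();
            current = self.epsilon_closure(&moved);
        }
        current.contains(&self.accept)
    }
}
```

The `accepts` method checks a single label sequence, which is only useful for testing the construction; the evaluator described above instead hands per-label transition matrices to `LAGraph_RegularPathQuery` so that all paths in the data graph are explored at once.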

### FFI layer

[`lagraph_sys`](src/lagraph_sys.rs) exposes raw C bindings for GraphBLAS and
@@ -212,10 +292,11 @@ LAGraph. Safe Rust wrappers live in [`graph::mod`](src/graph/mod.rs):
- [`LagraphGraph`](src/graph/mod.rs:48) — RAII wrapper around `LAGraph_Graph` (calls
`LAGraph_Delete` on drop). Also provides
[`LagraphGraph::from_coo()`](src/graph/mod.rs:85) to build directly from COO arrays.
-- [`GraphblasVector`](src/graph/mod.rs:124) — RAII wrapper around `GrB_Vector`.
+- [`GraphblasVector`](src/graph/mod.rs:128) — RAII wrapper around `GrB_Vector`
+(derives `Debug`).
- [`ensure_grb_init()`](src/graph/mod.rs:39) — one-time `LAGraph_Init` via `std::sync::Once`.
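
The two patterns named in this list, one-time initialisation through `std::sync::Once` and release-on-drop wrappers, can be sketched without any native library. All names below (`ensure_init`, `Handle`, the counters) are hypothetical; the counters stand in for the real `LAGraph_Init` call and for native allocation and free calls.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

static INIT: Once = Once::new();
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);
static LIVE_HANDLES: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a one-time native initialisation such as `LAGraph_Init`.
/// `Once` guarantees the closure runs at most once, even across threads.
pub fn ensure_init() {
    INIT.call_once(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    });
}

/// Hypothetical RAII wrapper: acquiring bumps a counter, dropping releases it.
/// A real wrapper would hold a raw handle and call the native free function.
pub struct Handle {
    _private: (),
}

impl Handle {
    pub fn new() -> Self {
        // Every constructor goes through the init guard first.
        ensure_init();
        LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
        Handle { _private: () }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
    }
}

pub fn init_calls() -> usize {
    INIT_CALLS.load(Ordering::SeqCst)
}

pub fn live_handles() -> usize {
    LIVE_HANDLES.load(Ordering::SeqCst)
}
```

Calling the guard from every constructor means callers never have to remember to initialise the library themselves, and the `Drop` impl keeps the release on the same code path as every early return.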

-### Macros (`src/utils.rs`)
+### Macros & helpers (`src/utils.rs`)

Two `#[macro_export]` macros handle FFI error mapping:

@@ -225,20 +306,28 @@ Two `#[macro_export]` macros handle FFI error mapping:
appending the required `*mut i8` message buffer, and maps failure to
`GraphError::LAGraph(info, msg)`.
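
As a rough illustration of the pattern described for `la_ok!`, the sketch below appends a message buffer to a call and maps a non-zero status to an error. It is not the macro from `src/utils.rs`: `la_ok_sketch!`, `fake_lagraph_call`, the local `GraphError`, the 256-byte buffer, and the "zero means success" rule are all assumptions made for this example.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Local stand-in for the crate's error type (only this variant is sketched).
#[derive(Debug, PartialEq)]
pub enum GraphError {
    LAGraph(c_int, String),
}

/// Assumed message buffer length for this sketch.
const MSG_LEN: usize = 256;

/// Hypothetical macro: call `$func` with the given arguments plus a message
/// buffer, then turn a non-zero status into `GraphError::LAGraph`.
macro_rules! la_ok_sketch {
    ($func:ident ( $($arg:expr),* $(,)? )) => {{
        let mut msg = [0 as c_char; MSG_LEN];
        let info: c_int = unsafe { $func($($arg,)* msg.as_mut_ptr()) };
        if info == 0 {
            Ok(())
        } else {
            // The buffer starts zeroed, so it is NUL-terminated as long as the
            // callee stays within MSG_LEN - 1 bytes (assumed here).
            let text = unsafe { CStr::from_ptr(msg.as_ptr()) }
                .to_string_lossy()
                .into_owned();
            Err(GraphError::LAGraph(info, text))
        }
    }};
}

/// Fake "native" function standing in for an LAGraph call.
unsafe fn fake_lagraph_call(value: c_int, msg: *mut c_char) -> c_int {
    if value >= 0 {
        return 0;
    }
    for (i, byte) in b"negative input\0".iter().enumerate() {
        // SAFETY: the caller passes a buffer of MSG_LEN bytes, longer than this text.
        unsafe { *msg.add(i) = *byte as c_char };
    }
    -1
}

/// Safe entry point that hides the buffer handling from callers.
pub fn checked(value: c_int) -> Result<(), GraphError> {
    la_ok_sketch!(fake_lagraph_call(value))
}
```

Keeping the buffer inside the macro means call sites list only the meaningful arguments, which is the convenience the description above attributes to `la_ok!`.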

A convenience function is also provided:

- [`build_graph(edges)`](src/utils.rs:184) — builds an `InMemoryGraph` from a
slice of `(&str, &str, &str)` triples (source, target, label). Used by
integration tests.
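
To make the triple format concrete, here is a small sketch of how `(source, target, label)` triples could be split into per-label coordinate lists with interned vertex ids. It does not reproduce `build_graph` or `InMemoryGraph`; `LabelledCoo`, `decompose`, and `intern` are hypothetical names, and first-seen id assignment is an assumption of this sketch.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical result type: vertex ids plus one (rows, cols) pair per label.
#[derive(Debug, Default)]
pub struct LabelledCoo {
    pub node_ids: HashMap<String, usize>,
    pub per_label: BTreeMap<String, (Vec<usize>, Vec<usize>)>,
}

/// Assign ids in first-seen order and group edges by label.
pub fn decompose(edges: &[(&str, &str, &str)]) -> LabelledCoo {
    let mut out = LabelledCoo::default();
    for &(source, target, label) in edges {
        let s = intern(&mut out.node_ids, source);
        let t = intern(&mut out.node_ids, target);
        let (rows, cols) = out.per_label.entry(label.to_string()).or_default();
        rows.push(s);
        cols.push(t);
    }
    out
}

/// Return the id for `name`, allocating the next free id on first use.
fn intern(ids: &mut HashMap<String, usize>, name: &str) -> usize {
    let next = ids.len();
    *ids.entry(name.to_string()).or_insert(next)
}
```

Each `(rows, cols)` pair is the kind of COO input that a constructor like `LagraphGraph::from_coo()` is described as accepting, although its exact signature is not shown in this document.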

## Coding Conventions

- **Rust edition 2024**.
-- Error handling via `thiserror` derive macros; two main error enums:
-[`GraphError`](src/graph/mod.rs:15) and [`FormatError`](src/formats/mod.rs:24).
+- Error handling via `thiserror` derive macros; three main error enums:
+[`GraphError`](src/graph/mod.rs:15), [`FormatError`](src/formats/mod.rs:24),
+and [`RpqError`](src/rpq/mod.rs:21).
- `FormatError` converts into `GraphError` via `#[from] FormatError` on the
`GraphError::Format` variant.
-- Unsafe FFI calls are confined to `lagraph_sys`, `graph/mod.rs`, and
-`graph/inmemory.rs`. All raw pointers are wrapped in RAII types that free
-resources on drop.
+- Unsafe FFI calls are confined to `lagraph_sys`, `graph/mod.rs`,
+`graph/inmemory.rs`, and `rpq/nfarpq.rs`. All raw pointers are wrapped in
+RAII types that free resources on drop.
- `unsafe impl Send` and `unsafe impl Sync` are provided for `LagraphGraph` and
`GraphblasVector` because GraphBLAS handles are thread-safe after init.
- Unit tests live in `#[cfg(test)] mod tests` blocks inside each module.
-Integration tests that need GraphBLAS live in [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs).
+Integration tests that need GraphBLAS live in [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs)
+and [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs).

## Testing

@@ -256,7 +345,13 @@ native libraries.

Tests in `src/formats/csv.rs` are pure Rust and need no native dependencies.

-Tests in `src/graph/inmemory.rs` and [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs)
+Tests in `src/sparql/mod.rs` are pure Rust and need no native dependencies.
+
+Tests in `src/rpq/nfarpq.rs` (NFA construction unit tests) are pure Rust and need no
+native dependencies.
+
+Tests in `src/graph/inmemory.rs`, [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs),
+and [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs)
call real GraphBLAS/LAGraph and require the native libraries to be present.

## CI
1 change: 1 addition & 0 deletions Cargo.toml
@@ -8,6 +8,7 @@ csv = "1.4.0"
libc = "0.2"
oxrdf = "0.3.3"
oxttl = "0.2.3"
spargebra = "0.4.6"
thiserror = "1.0"

[features]
1 change: 1 addition & 0 deletions build.rs
@@ -83,6 +83,7 @@ fn regenerate_bindings() {
.allowlist_function("LAGraph_Delete")
.allowlist_function("LAGraph_Cached_AT")
.allowlist_function("LAGraph_MMRead")
.allowlist_function("LAGraph_RegularPathQuery")
.default_enum_style(bindgen::EnumVariation::Rust {
non_exhaustive: false,
})
1 change: 1 addition & 0 deletions src/graph/mod.rs
@@ -125,6 +125,7 @@ impl Drop for LagraphGraph {
unsafe impl Send for LagraphGraph {}
unsafe impl Sync for LagraphGraph {}

#[derive(Debug)]
pub struct GraphblasVector {
    pub inner: GrB_Vector,
}
34 changes: 34 additions & 0 deletions src/lagraph_sys_generated.rs
@@ -261,3 +261,37 @@ unsafe extern "C" {
        msg: *mut ::std::os::raw::c_char,
    ) -> ::std::os::raw::c_int;
}
unsafe extern "C" {
    pub fn LAGraph_RegularPathQuery(
        reachable: *mut GrB_Vector,
        R: *mut LAGraph_Graph,
        nl: usize,
        QS: *const GrB_Index,
        nqs: usize,
        QF: *const GrB_Index,
        nqf: usize,
        G: *mut LAGraph_Graph,
        S: *const GrB_Index,
        ns: usize,
        msg: *mut ::std::os::raw::c_char,
    ) -> ::std::os::raw::c_int;
}
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum RPQMatrixOp {
    RPQ_MATRIX_OP_LABEL = 0,
    RPQ_MATRIX_OP_LOR = 1,
    RPQ_MATRIX_OP_CONCAT = 2,
    RPQ_MATRIX_OP_KLEENE = 3,
    RPQ_MATRIX_OP_KLEENE_L = 4,
    RPQ_MATRIX_OP_KLEENE_R = 5,
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct RPQMatrixPlan {
    pub op: RPQMatrixOp,
    pub lhs: *mut RPQMatrixPlan,
    pub rhs: *mut RPQMatrixPlan,
    pub mat: GrB_Matrix,
    pub res_mat: GrB_Matrix,
}
4 changes: 3 additions & 1 deletion src/lib.rs
@@ -1,6 +1,8 @@
pub mod formats;
pub mod graph;
pub mod rpq;
pub mod sparql;
#[allow(unused_unsafe, dead_code)]
-pub(crate) mod utils;
+pub mod utils;

pub mod lagraph_sys;
54 changes: 54 additions & 0 deletions src/rpq/mod.rs
@@ -0,0 +1,54 @@
//! Regular Path Query (RPQ) evaluation over edge-labeled graphs.
//!
//! ```rust,ignore
//! use pathrex::sparql::parse_rpq;
//! use pathrex::rpq::{RpqEvaluator, nfarpq::NfaRpqEvaluator};
//!
//! let triple = parse_rpq("SELECT ?x ?y WHERE { ?x <knows>/<likes>* ?y . }")?;
//! let result = NfaRpqEvaluator.evaluate(&triple.subject, &triple.path, &triple.object, &graph)?;
//! ```

pub mod nfarpq;

use crate::graph::GraphDecomposition;
use crate::graph::GraphblasVector;
use crate::sparql::ExtractError;
use spargebra::SparqlSyntaxError;
use spargebra::algebra::PropertyPathExpression;
use spargebra::term::TermPattern;
use thiserror::Error;

#[derive(Debug, Error)]
pub enum RpqError {
    #[error("SPARQL syntax error: {0}")]
    Parse(#[from] SparqlSyntaxError),

    #[error("query extraction error: {0}")]
    Extract(#[from] ExtractError),

    #[error("unsupported path expression: {0}")]
    UnsupportedPath(String),

    #[error("label not found in graph: '{0}'")]
    LabelNotFound(String),

    #[error("vertex not found in graph: '{0}'")]
    VertexNotFound(String),

    #[error("GraphBLAS/LAGraph error: {0}")]
    GraphBlas(String),
}

#[derive(Debug)]
pub struct RpqResult {
    pub reachable: GraphblasVector,
}

pub trait RpqEvaluator {
    fn evaluate<G: GraphDecomposition>(
        &self,
        subject: &TermPattern,
        path: &PropertyPathExpression,
        object: &TermPattern,
        graph: &G,
    ) -> Result<RpqResult, RpqError>;
}