v0.2.0 — 16 Language Parsers

Fast, flexible graph database for code relationships

A fast, reliable, and flexible graph database optimized for storing and querying code relationships, with production-ready parsers for 16 languages: Python, Rust, TypeScript, Go, C, C++, Java, Kotlin, C#, PHP, Ruby, Swift, Tcl, Verilog, COBOL, and Fortran.

16 Language Parsers

1,300+ Tests

<1ms Node Lookup

Zero Unsafe Code

Overview

The five-minute loop, explained

CodeGraph is a graph database for code relationships. This page shows what the ecosystem does, why the design constraints matter, and how to get to a first run.

Choose a parser

Pick from 16 language parsers that all implement the same CodeParser trait. Each parser extracts functions, classes, imports, and relationships into the same graph format.

Parse your codebase

Parse single files or entire directories. The parser walks the AST and populates the graph with nodes for every entity and edges for every relationship — calls, containment, inheritance, and more.

Query the graph

Retrieve nodes by ID (<1ms), walk neighbors (<10ms), find transitive dependencies, detect circular dependencies, or traverse call chains up to depth 5.
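Circular dependency detection is usually built on a depth-first search that tracks which nodes are on the current path. The sketch below shows that general technique on a plain HashMap of file-to-file dependencies; it is not CodeGraph's implementation, and `find_cycle`, `visit`, and `State` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// DFS visit state for a node.
#[derive(Clone, Copy, PartialEq)]
enum State {
    Unvisited,
    InProgress, // currently on the DFS path
    Done,       // fully explored, no cycle found through it
}

/// Depth-first visit; returns a cycle as soon as a back edge is found.
fn visit<'a>(
    node: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    state: &mut HashMap<&'a str, State>,
    path: &mut Vec<&'a str>,
) -> Option<Vec<String>> {
    state.insert(node, State::InProgress);
    path.push(node);
    if let Some(targets) = deps.get(node) {
        for &next in targets {
            match state.get(next).copied().unwrap_or(State::Unvisited) {
                State::InProgress => {
                    // Back edge: the cycle is the path suffix starting at `next`.
                    // `next` is always on the path because it is InProgress.
                    let start = path.iter().position(|&n| n == next).unwrap();
                    let mut cycle: Vec<String> =
                        path[start..].iter().map(|s| s.to_string()).collect();
                    cycle.push(next.to_string());
                    return Some(cycle);
                }
                State::Unvisited => {
                    if let Some(cycle) = visit(next, deps, state, path) {
                        return Some(cycle);
                    }
                }
                State::Done => {}
            }
        }
    }
    path.pop();
    state.insert(node, State::Done);
    None
}

/// Returns one dependency cycle (first node repeated at the end), if any exists.
fn find_cycle<'a>(deps: &HashMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    let mut state: HashMap<&'a str, State> = HashMap::new();
    // Sort start nodes so the result does not depend on hash iteration order.
    let mut starts: Vec<&'a str> = deps.keys().copied().collect();
    starts.sort();
    for start in starts {
        if state.get(start).copied().unwrap_or(State::Unvisited) == State::Unvisited {
            let mut path = Vec::new();
            if let Some(cycle) = visit(start, deps, &mut state, &mut path) {
                return Some(cycle);
            }
        }
    }
    None
}

fn main() {
    let mut deps = HashMap::new();
    deps.insert("a.py", vec!["b.py"]);
    deps.insert("b.py", vec!["c.py"]);
    deps.insert("c.py", vec!["a.py"]);
    println!("{:?}", find_cycle(&deps));
}
```

The recursive form keeps the sketch short; for very deep graphs an explicit stack would avoid exhausting the call stack.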

Export and analyze

Export graphs to DOT (Graphviz), JSON (D3.js), CSV, or RDF N-Triples. Build custom analysis tools on top of the extracted relationships.
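To make the DOT target concrete, here is a minimal sketch of what serializing labeled edges to Graphviz DOT can look like. It uses only the standard library and does not call CodeGraph's exporter; `Edge`, `escape`, and `to_dot` are hypothetical names, and the graph name `code` is an arbitrary choice.

```rust
/// A directed, labeled edge between two named code entities.
struct Edge {
    from: String,
    to: String,
    label: String,
}

/// Escapes double quotes so a name can be used as a DOT quoted ID.
fn escape(s: &str) -> String {
    s.replace('"', "\\\"")
}

/// Renders the edges as a DOT digraph, one edge statement per line.
fn to_dot(edges: &[Edge]) -> String {
    let mut out = String::from("digraph code {\n");
    for e in edges {
        out.push_str(&format!(
            "  \"{}\" -> \"{}\" [label=\"{}\"];\n",
            escape(&e.from),
            escape(&e.to),
            escape(&e.label)
        ));
    }
    out.push_str("}\n");
    out
}

fn main() {
    let edges = vec![
        Edge { from: "app.py".into(), to: "main".into(), label: "contains".into() },
        Edge { from: "main".into(), to: "helper".into(), label: "calls".into() },
    ];
    print!("{}", to_dot(&edges));
}
```

The output can be piped to the Graphviz `dot` tool, for example `dot -Tsvg`, to render the graph.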

Features

What makes CodeGraph different from a normal graph database

A purpose-built graph database for code that combines persistence, performance, and a unified parser ecosystem.

Persistent Storage

Production-ready RocksDB backend with crash-safe write-ahead logging. Atomic batch operations ensure data integrity. Memory backend included for testing.
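The all-or-nothing behaviour of an atomic batch can be illustrated with a small in-memory store that validates the whole batch before writing anything. This is only a sketch of the semantics, assuming a duplicate ID is the one failure mode; it does not show how the RocksDB backend achieves atomicity, and `MemoryStore`, `insert_batch`, and `BatchError` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum BatchError {
    DuplicateId(u64),
}

/// A toy in-memory node store keyed by numeric ID.
#[derive(Default)]
struct MemoryStore {
    nodes: HashMap<u64, String>,
}

impl MemoryStore {
    /// Inserts every node in `batch`, or none of them if any ID collides.
    fn insert_batch(&mut self, batch: &[(u64, &str)]) -> Result<usize, BatchError> {
        // Phase 1: validate the whole batch without touching the store.
        let mut seen = HashSet::new();
        for (id, _) in batch {
            if self.nodes.contains_key(id) || !seen.insert(*id) {
                return Err(BatchError::DuplicateId(*id));
            }
        }
        // Phase 2: apply. Nothing here can fail, so the batch is all-or-nothing.
        for (id, name) in batch {
            self.nodes.insert(*id, name.to_string());
        }
        Ok(batch.len())
    }
}

fn main() {
    let mut store = MemoryStore::default();
    println!("{:?}", store.insert_batch(&[(1, "main"), (2, "helper")]));
    // ID 2 already exists, so node 3 must not be written either.
    println!("{:?}", store.insert_batch(&[(3, "new"), (2, "dup")]));
    println!("stored nodes: {}", store.nodes.len());
}
```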

16 Language Parsers

Comprehensive coverage from Python to COBOL. All parsers implement a unified CodeParser trait with drop-in interchangeability.

Efficient Queries

Single node lookup at ~7ns (more than 100,000x under the 1ms target). Neighbor queries at ~410ns-40us. BFS traversal at depth 5 completes in ~5ms.

1,300+ Tests

Comprehensive test coverage across all 18 crates. Test-driven development ensures reliability. If it is not tested, it is broken.

Zero Unsafe Code

Memory-safe by default. No global state, no automatic file scanning, no convention-over-configuration. Explicit error handling with no panics in library code.

Schema-less Properties

Flexible JSON properties on nodes and edges. Add arbitrary metadata to your code entities without predefined schemas or migrations.
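The idea behind schema-less properties is that each node carries a free-form map instead of fixed columns. The sketch below models that with a small value enum and a HashMap so it runs with the standard library alone; CodeGraph itself stores JSON properties, and `Value`, `Node`, `set`, and `get_int` here are hypothetical stand-ins, not its API.

```rust
use std::collections::HashMap;

/// A JSON-like value (simplified: no floats, nulls, or nested objects).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Int(i64),
    Text(String),
    List(Vec<Value>),
}

/// A node whose metadata is an open-ended key/value map.
#[derive(Debug, Default)]
struct Node {
    properties: HashMap<String, Value>,
}

impl Node {
    /// Sets a property and returns `self` so calls can be chained.
    fn set(&mut self, key: &str, value: Value) -> &mut Self {
        self.properties.insert(key.to_string(), value);
        self
    }

    /// Reads a property as an integer; `None` if missing or a different type.
    fn get_int(&self, key: &str) -> Option<i64> {
        match self.properties.get(key) {
            Some(Value::Int(n)) => Some(*n),
            _ => None,
        }
    }
}

fn main() {
    let mut func = Node::default();
    func.set("name", Value::Text("parse_file".into()))
        .set("line_start", Value::Int(42))
        .set("is_async", Value::Bool(false))
        .set("decorators", Value::List(vec![Value::Text("cached".into())]));
    println!("{:?}", func.get_int("line_start"));
    println!("{:?}", func.get_int("name"));
}
```

Because nothing is declared up front, a new key such as `decorators` can be added to one node without touching any other node, which is the trade-off against compile-time checking that typed accessors like `get_int` partly recover.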

Language Support

16 production-ready language parsers

A unified parser API across all languages with standardized entity types, consistent error handling, and drop-in interchangeability.

Crate	Version	Tests	Description
codegraph (core)	0.2.0	124	Graph database core
codegraph-parser-api	0.2.1	23	Unified parser trait and types
codegraph-python	0.4.3	121	Python parser
codegraph-typescript	0.4.2	103	TypeScript/JavaScript parser
codegraph-rust	0.2.1	97	Rust parser
codegraph-go	0.1.6	73	Go parser
codegraph-c	0.1.4	160	C parser (with kernel/EDA support)
codegraph-cpp	0.2.2	40	C++ parser
codegraph-java	0.1.2	62	Java parser
codegraph-kotlin	0.1.2	60	Kotlin parser
codegraph-csharp	0.1.2	64	C# parser
codegraph-php	0.2.1	86	PHP parser
codegraph-ruby	0.2.1	51	Ruby parser
codegraph-swift	0.1.2	41	Swift parser
codegraph-tcl	0.1.1	58	Tcl/SDC/UPF parser (with EDA support)
codegraph-verilog	0.1.0	51	SystemVerilog/Verilog parser
codegraph-cobol	0.1.0	47	COBOL parser
codegraph-fortran	0.1.0	47	Fortran parser

Core Principles

Design philosophy

Four principles guide every design decision in the CodeGraph ecosystem.

Unified Parser API

"One trait to parse them all."

All 16 language parsers implement the same CodeParser trait, providing a consistent API across languages, standardized entity types, uniform error handling, and drop-in interchangeability.
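Drop-in interchangeability follows from calling code depending only on a trait, never on a concrete parser. The sketch below illustrates that pattern with a deliberately tiny, hypothetical trait; `ToyParser`, `Entity`, `ToyPython`, `ToyRust`, `names_after`, and `parser_for` are invented for this example, and the real CodeParser trait has its own signatures that are not reproduced here.

```rust
/// An entity extracted from source code (simplified).
#[derive(Debug, PartialEq)]
struct Entity {
    kind: &'static str,
    name: String,
}

/// Hypothetical stand-in for a unified parser trait.
trait ToyParser {
    fn language(&self) -> &'static str;
    fn parse_source(&self, source: &str) -> Vec<Entity>;
}

/// Collects the identifier following `keyword` on each line that starts with it.
fn names_after(source: &str, keyword: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix(keyword))
        .map(|rest| {
            rest.chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_')
                .collect::<String>()
        })
        .filter(|name| !name.is_empty())
        .collect()
}

struct ToyPython;
struct ToyRust;

impl ToyParser for ToyPython {
    fn language(&self) -> &'static str {
        "python"
    }
    fn parse_source(&self, source: &str) -> Vec<Entity> {
        names_after(source, "def ")
            .into_iter()
            .map(|name| Entity { kind: "function", name })
            .collect()
    }
}

impl ToyParser for ToyRust {
    fn language(&self) -> &'static str {
        "rust"
    }
    fn parse_source(&self, source: &str) -> Vec<Entity> {
        names_after(source, "fn ")
            .into_iter()
            .map(|name| Entity { kind: "function", name })
            .collect()
    }
}

/// Picks a parser by file extension; callers only ever see the trait.
fn parser_for(extension: &str) -> Option<Box<dyn ToyParser>> {
    match extension {
        "py" => Some(Box::new(ToyPython)),
        "rs" => Some(Box::new(ToyRust)),
        _ => None,
    }
}

fn main() {
    for (ext, src) in [("py", "def load(path):\n    pass\n"), ("rs", "fn main() {}\n")] {
        if let Some(parser) = parser_for(ext) {
            println!("{}: {:?}", parser.language(), parser.parse_source(src));
        }
    }
}
```

The line-prefix matching is a toy; real parsers walk an AST. The point is only that `main` never names a concrete parser type.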

Performance First

"Sub-100ms queries or it didn't happen."

Single node lookup in ~7ns. Neighbor queries in ~410ns-40us. Graph traversal at depth 5 in ~5ms. 100K node graphs are practical and fast.

Test-Driven Development

"If it's not tested, it's broken."

1,300+ tests across the workspace cover every crate, from the core graph engine (124 tests) to language parsers like codegraph-c (160 tests) and codegraph-python (121 tests).

Zero Magic

"Explicit over implicit, always."

No global state. No automatic file scanning. No convention-over-configuration. Explicit error handling with no panics in library code. Zero unsafe code. Persistence comes first: the RocksDB backend uses crash-safe write-ahead logging.

Performance

Performance targets you can count on

Every operation has a clear performance budget. Actual benchmarks consistently exceed targets.

Operation	Target	Actual
Node Lookup	<1ms	~7ns
Neighbor Query	<10ms	~410ns-40us
BFS (depth=5)	<50ms	~5ms
Batch Insert (10K nodes)	<500ms	~7ms
Graph Load (100K nodes + 500K edges)	<5s	not reported
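The BFS (depth=5) budget above refers to a breadth-first traversal that stops expanding once a depth limit is reached. A minimal sketch of that technique over a plain adjacency map follows; it is an illustration under the assumption of integer node IDs, not the code that was benchmarked, and `bfs_within` is a hypothetical name.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns every node reachable from `start` within `max_depth` hops, in BFS order.
/// The start node itself is not included in the result.
fn bfs_within(
    adjacency: &HashMap<u32, Vec<u32>>,
    start: u32,
    max_depth: usize,
) -> Vec<u32> {
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut order = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // depth budget reached: do not expand this node
        }
        // A node with no entry in the map simply has no outgoing edges.
        for &next in adjacency.get(&node).into_iter().flatten() {
            if visited.insert(next) {
                order.push(next);
                queue.push_back((next, depth + 1));
            }
        }
    }
    order
}

fn main() {
    // A call chain 0 -> 1 -> ... -> 7, plus a side branch 0 -> 10.
    let mut adjacency: HashMap<u32, Vec<u32>> = HashMap::new();
    for i in 0..7 {
        adjacency.insert(i, vec![i + 1]);
    }
    adjacency.get_mut(&0).unwrap().push(10);
    println!("{:?}", bfs_within(&adjacency, 0, 5));
}
```

The visited set keeps the traversal linear in the number of edges explored, even when the graph contains cycles.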

Deep Dive

What Is CodeGraph? How the Unified Parser Architecture Works

A clear explanation of what CodeGraph is, who built it, how the architecture works, and how to use it.

CodeGraph is a graph database purpose-built for code. Unlike general-purpose graph databases, CodeGraph understands code structure — it knows what a function is, what a class is, how imports work, and how files relate to each other. The ecosystem provides a complete solution for building code analysis tools: a persistent graph database, a unified parser API, and 16 production-ready language parsers.

The parser architecture follows a layered design: user tools sit on top of code helpers, which use the query builder, which talks to the core graph engine, which is backed by RocksDB. Each layer has well-defined boundaries, can be tested independently, and has minimal dependencies on upper layers.

The core graph stores nodes (representing code entities like functions, classes, files) and edges (representing relationships like calls, contains, imports). Both nodes and edges carry flexible JSON properties, so you can attach arbitrary metadata without schema migrations. The adjacency indexing provides O(1) neighbor lookups.
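One common way to get constant-time neighbor lookups is to keep a hash map from each node ID to its edge list, in both directions. The sketch below shows that layout under simplifying assumptions (integer IDs, three edge kinds, no properties); `ToyGraph`, `EdgeKind`, and `callers_of` are hypothetical names, and CodeGraph's internal index may be organized differently.

```rust
use std::collections::HashMap;

type NodeId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeKind {
    Calls,
    Contains,
    Imports,
}

/// A toy graph with adjacency lists indexed by node ID in both directions.
#[derive(Default)]
struct ToyGraph {
    names: HashMap<NodeId, String>,
    outgoing: HashMap<NodeId, Vec<(EdgeKind, NodeId)>>,
    incoming: HashMap<NodeId, Vec<(EdgeKind, NodeId)>>,
    next_id: NodeId,
}

impl ToyGraph {
    fn add_node(&mut self, name: &str) -> NodeId {
        let id = self.next_id;
        self.next_id += 1;
        self.names.insert(id, name.to_string());
        id
    }

    /// Records the edge in both indexes so either endpoint can find it.
    fn add_edge(&mut self, from: NodeId, kind: EdgeKind, to: NodeId) {
        self.outgoing.entry(from).or_default().push((kind, to));
        self.incoming.entry(to).or_default().push((kind, from));
    }

    /// Finding the edge list is an average O(1) hash lookup;
    /// filtering it costs O(degree of the node).
    fn callers_of(&self, node: NodeId) -> Vec<&str> {
        self.incoming
            .get(&node)
            .into_iter()
            .flatten()
            .filter(|(kind, _)| *kind == EdgeKind::Calls)
            .filter_map(|(_, from)| self.names.get(from).map(String::as_str))
            .collect()
    }
}

fn main() {
    let mut graph = ToyGraph::default();
    let file = graph.add_node("app.py");
    let util = graph.add_node("util.py");
    let main_fn = graph.add_node("main");
    let helper = graph.add_node("helper");
    graph.add_edge(file, EdgeKind::Imports, util);
    graph.add_edge(file, EdgeKind::Contains, main_fn);
    graph.add_edge(util, EdgeKind::Contains, helper);
    graph.add_edge(main_fn, EdgeKind::Calls, helper);
    println!("callers of helper: {:?}", graph.callers_of(helper));
}
```

Storing each edge twice doubles the index memory, which is the usual price for answering both "what does X call?" and "who calls X?" without scanning all edges.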

CodeGraph supports two backends: RocksDB for production use with crash-safe write-ahead logging, and an in-memory backend for testing. Batch operations on both nodes and edges are atomic.

Read the full README →

Quick Start

How to go from reading about CodeGraph to using it

Add CodeGraph to your Cargo.toml and start parsing code in minutes.

Using the Complete Solution (Database + Parser)

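# Cargo.toml
[dependencies]
codegraph = "0.2.0"
codegraph-parser-api = "0.2.1"
codegraph-python = "0.4.3"

// main.rs
use std::path::Path;

use codegraph::CodeGraph;
use codegraph_parser_api::CodeParser; // brings the parser trait methods into scope
use codegraph_python::PythonParser;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let parser = PythonParser::new();
    // Open (or create) the persistent graph on disk.
    let mut graph = CodeGraph::open("./project.graph")?;
    // Parse everything under ./src and add the results to the graph.
    let info = parser.parse_directory(Path::new("./src"), &mut graph)?;
    println!("Parsed {} files", info.files.len());
    Ok(())
}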

Add dependencies

Add codegraph, codegraph-parser-api, and your chosen language parser to Cargo.toml.

Create parser and graph

Instantiate a parser (e.g. PythonParser::new()) and open or create a graph with CodeGraph::open().

Parse and query

Parse a file with parser.parse_file() or a directory with parser.parse_directory(). Query nodes, edges, and neighbors on the resulting graph.

Export and integrate

Export to DOT, JSON, CSV, or RDF formats for visualization or further analysis.

FAQ

The fastest answers to the questions people ask first

Start here if you want the creator, the language support, the performance numbers, or the design philosophy without reading the whole README first.

Who created CodeGraph?

CodeGraph is created and maintained by Andrey Vasilevsky (anvanster@gmail.com). The project is developed openly on GitHub under the Apache 2.0 license.

What languages does CodeGraph support?

CodeGraph ships with 16 production-ready language parsers: Python, Rust, TypeScript/JavaScript, Go, C, C++, Java, Kotlin, C#, PHP, Ruby, Swift, Tcl/SDC/UPF, Verilog/SystemVerilog, COBOL, and Fortran. All parsers share a unified API through the CodeParser trait.

What storage backends are supported?

CodeGraph supports two backends: RocksDB for production use with persistence, crash-safe write-ahead logging, and atomic batch operations; and an in-memory backend for testing and development.

How fast is it?

Single node lookup completes in ~7ns (target: <1ms). Neighbor queries run in ~410ns-40us (target: <10ms). BFS traversal at depth 5 completes in ~5ms (target: <50ms). Batch inserts of 10K nodes complete in ~7ms (target: <500ms).

Is it safe to use in production?

Yes. CodeGraph contains zero unsafe code, uses RocksDB with crash-safe write-ahead logging, has explicit error handling with no panics in library code, and includes 1,300+ tests across all crates. The project follows semantic versioning.

What is out of scope?

CodeGraph explicitly does not provide semantic analysis (no type inference or advanced static analysis), IDE integration (no LSP server), build system integration, or a complete analysis framework. You build the analysis logic; CodeGraph provides the infrastructure.

Primary Sources

Every claim on this page is grounded in the repository

All information is sourced from the official README, documentation, and crate metadata so you can verify the details yourself.

GitHub Repository

The source for the README, workspace structure, crate metadata, and quick-start examples.

github.com/anvanster/codegraph →

Crates.io

Published crates including codegraph core, parser API, and all 16 language parsers.

crates.io/crates/codegraph →

Documentation

Comprehensive docs generated from source code covering the API, traits, and architecture.

docs.rs/codegraph →