Neo4j GDS (Graph Data Science) vs. Core Neo4j (Cypher)

1. Purpose Comparison

| Aspect | Neo4j (Cypher) | GDS (Graph Data Science) |
|---|---|---|
| Primary Purpose | Transactional queries, CRUD operations | Graph analytics, algorithms, machine learning |
| Execution | Runs directly against the stored database | Projects an optimized graph into memory |
| Speed | Good for pattern matching and retrieval | Fast for graph-wide computations |
| Scale | Suited to operational systems | Handles millions to billions of nodes/relationships |
| Isolation | Operates on live data | Operates on an in-memory projection, isolated from the stored data |
| Flexibility | Good for flexible, ad hoc queries | Pre-built scalable algorithms (PageRank, Louvain, etc.) |
| Optimization | Index-backed query optimization | Memory-efficient subgraph projection |
| Persistence | Directly modifies the database (unless read-only) | Results can stay in memory or optionally be written back |
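
For contrast with the GDS column, the following is a minimal sketch of the kind of work the Cypher column describes: a small transactional write followed by a pattern-matching read. It assumes the same Person/FRIEND model used in the example in section 3; the names 'Alice' and 'Bob' are made up for illustration.

```cypher
// Transactional write: create two people and a friendship.
// MERGE makes the statement safe to re-run without creating duplicates.
MERGE (a:Person {name: 'Alice'})
MERGE (b:Person {name: 'Bob'})
MERGE (a)-[:FRIEND]->(b);

// Pattern-matching read: friends of Alice's friends who are not
// already her direct friends.
MATCH (a:Person {name: 'Alice'})-[:FRIEND]-(friend)-[:FRIEND]-(fof:Person)
WHERE fof <> a AND NOT (a)-[:FRIEND]-(fof)
RETURN DISTINCT fof.name AS suggestion;
```

Both statements touch only a small neighborhood of the stored graph, which is the workload Cypher is suited to; a graph-wide score such as PageRank is where the in-memory projection pays off.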

2. Summary of the GDS Workflow

| Step | Cypher Call | Purpose |
|---|---|---|
| ① Project Graph | gds.graph.project | Create an optimized in-memory graph |
| ② List Graphs | gds.graph.list | Inspect the in-memory graph catalog |
| ③ Run Algorithm (Mutate) | gds.pageRank.mutate, gds.degree.mutate, etc. | Compute properties and store them on the in-memory graph |
| ④ Stream Results | gds.graph.nodeProperties.stream | Retrieve computed properties |
| ⑤ (Optional) Write to DB | gds.pageRank.write, etc. | Persist computed results to the database |
| ⑥ Drop Graph | gds.graph.drop | Free memory by removing the in-memory graph |

// Neo4j GDS Flow Diagram
+----------------+   +-----------------------+   +----------------------+
| Neo4j Database |==>|  GDS Graph Projection |==>|  Graph Catalog       |
| (Stored Nodes, |   |  (In-Memory Subgraph) |   | (Manage In-Memory    |
|  Relationships)|   |                       |   |  Graphs: List, Drop) |
+----------------+   +-----------------------+   +----------------------+
                                                           ||
                                                           ||
                                                           \/
                                                 +---------------------+
                                                 |   GDS Algorithms    |
                                                 |  (PageRank,         |
                                                 |  Community Detect., |
                                                 |  Similarity, ML)    |
                                                 +---------------------+
                                                           ||
                                                           ||
                                                           \/
                                                 +---------------------+
                                                 |      Results        |
                                                 | (Mutate, Write back,|
                                                 |  Stream to client)  |
                                                 +---------------------+
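
The catalog step (② in the table) can be sketched as follows. This assumes GDS 2.x and a projection named 'friends-graph', as created in the example in section 3; the YIELD columns shown are a subset of what the procedures return.

```cypher
// List every projected graph in the catalog with its size and memory footprint.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount, memoryUsage
RETURN graphName, nodeCount, relationshipCount, memoryUsage
ORDER BY graphName;

// Check whether a specific projection exists before running an algorithm on it.
CALL gds.graph.exists('friends-graph')
YIELD graphName, exists
RETURN graphName, exists;
```

Checking the catalog first is a cheap way to avoid projecting the same graph twice or calling an algorithm on a graph that has already been dropped.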

3. Example: GDS Workflow Code Snippet (Requires the GDS Library)

// Project graph into memory
CALL gds.graph.project(
  'friends-graph',
  'Person',
  'FRIEND'
);

// Run PageRank algorithm and store scores in memory
CALL gds.pageRank.mutate(
  'friends-graph',
  { mutateProperty: 'pageRankScore' }
);

// Stream top PageRank results
CALL gds.graph.nodeProperties.stream(
  'friends-graph',
  ['pageRankScore']
)
YIELD nodeId, propertyValue
RETURN gds.util.asNode(nodeId).name AS personName, propertyValue AS pageRankScore
ORDER BY pageRankScore DESC
LIMIT 10;

// Clean up memory
CALL gds.graph.drop('friends-graph');

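🚀 Each step is invoked through a Cypher CALL, but the in-memory projection and the PageRank computation are provided by the GDS library; plain Cypher has no built-in equivalent for this flow.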
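
The optional write step (⑤ in the workflow table) could look like the sketch below. It assumes GDS 2.x and reuses the 'friends-graph' projection name and the Person/FRIEND model from the snippet above; the stored property name pageRankScore is chosen here for illustration.

```cypher
// Project the graph (skip this if 'friends-graph' is still in the catalog).
CALL gds.graph.project('friends-graph', 'Person', 'FRIEND');

// Run PageRank and persist the score on the stored Person nodes.
CALL gds.pageRank.write(
  'friends-graph',
  { writeProperty: 'pageRankScore' }
)
YIELD nodePropertiesWritten, ranIterations, didConverge
RETURN nodePropertiesWritten, ranIterations, didConverge;

// The scores are now ordinary node properties, readable with plain Cypher.
MATCH (p:Person)
RETURN p.name AS personName, p.pageRankScore AS pageRankScore
ORDER BY pageRankScore DESC
LIMIT 10;

// Clean up memory
CALL gds.graph.drop('friends-graph');
```

Unlike mutate mode, write mode changes the stored database, so the scores remain available after the projection has been dropped.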

4. Key Points

  • GDS Graphs ≠ Neo4j Database: They are temporary, in-memory projections optimized for analytics.
  • Projection: Include only the nodes, relationships, and properties you need.
  • Graph Catalog: Manage multiple in-memory graphs independently.
  • Mutate Mode: Save computed values on the in-memory graph without touching the database.
  • Write Mode: Explicitly write analytics results back to the database when needed.
  • Drop: Always drop projected graphs to free memory once analytics are complete.
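
The projection and drop points above can be sketched with the map form of gds.graph.project. This assumes GDS 2.x; the graph name 'friends-undirected' and the numeric node property age are hypothetical and used only for illustration.

```cypher
// Project only Person nodes (with one numeric property) and FRIEND
// relationships, treating friendships as undirected.
// 'age' is a hypothetical property assumed to exist on Person nodes.
CALL gds.graph.project(
  'friends-undirected',
  { Person: { properties: ['age'] } },
  { FRIEND: { orientation: 'UNDIRECTED' } }
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Drop the projection when done. Passing false for failIfMissing makes
// the call a no-op instead of an error if the graph is already gone.
CALL gds.graph.drop('friends-undirected', false)
YIELD graphName
RETURN graphName;
```

Keeping the projection narrow reduces its memory footprint, and the guarded drop makes cleanup scripts safe to re-run.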

🚀 Final Mindset

Use Cypher for database operations.
Use GDS for fast, scalable, and isolated graph analytics and machine learning.

📚 Reference