Neo4j GDS (Graph Data Science) vs. Core Neo4j (Cypher)
1. Purpose Comparison
Aspect | Neo4j (Cypher) | GDS (Graph Data Science) |
---|---|---|
Primary Purpose | Transactional queries, CRUD operations | Graph analytics, algorithms, machine learning |
Execution | Runs directly against the stored database | Projects an optimized graph into memory |
Speed | Good for pattern matching and retrieval | Fast for graph-wide computations |
Scale | Suited to operational systems | Designed for graphs with millions to billions of nodes/relationships, memory permitting |
Isolation | Operates on live data | Operates on an in-memory projection, isolated from live data unless results are written back |
Flexibility | Good for flexible, ad hoc queries | Pre-built scalable algorithms (PageRank, Louvain, etc.) |
Optimization | Query planning backed by indexes | Memory-efficient subgraph projection |
Persistence | Write queries modify the database directly (read queries do not) | Results can stay in memory or optionally be written back |
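To make the Cypher column concrete, here is a minimal sketch of the kind of targeted lookup that plain Cypher handles well. It assumes the `Person` label, `FRIEND` relationship type, and `name` property used in the example in section 3; the value 'Alice' is a hypothetical placeholder.

```cypher
// Find the direct friends of one person: a local pattern match that
// touches only the neighborhood of the starting node.
// 'Alice' is a hypothetical value; substitute a name from your own data.
MATCH (p:Person {name: 'Alice'})-[:FRIEND]-(friend:Person)
RETURN friend.name AS friendName
ORDER BY friendName
LIMIT 25;
```

A query like this starts from one node and expands outward. An algorithm such as PageRank has to visit every node and relationship repeatedly, which is the workload GDS projections are built for.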
2. Summary Flow of GDS Workflow
Step | Cypher Call | Purpose |
---|---|---|
① Project Graph | gds.graph.project | Create an in-memory optimized graph |
② List Graphs | gds.graph.list | Manage in-memory graph catalog |
③ Run Algorithm (Mutate) | gds.pageRank.mutate, gds.degree.mutate, etc. | Compute and store properties in memory |
④ Stream Results | gds.graph.nodeProperties.stream | Retrieve computed properties |
⑤ (Optional) Write to DB | gds.pageRank.write, etc. | Persist computed results to database |
⑥ Drop Graph | gds.graph.drop | Free memory by deleting in-memory graphs |
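Step ② can be sketched as follows. This is a minimal example assuming a GDS 2.x installation and a graph named 'friends-graph' (the name used in section 3); the yielded columns shown are a subset of what the procedures return.

```cypher
// List every graph in the catalog with its size and memory footprint.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount, memoryUsage
RETURN graphName, nodeCount, relationshipCount, memoryUsage
ORDER BY graphName;

// Check whether one specific graph exists before using or dropping it.
CALL gds.graph.exists('friends-graph')
YIELD graphName, exists
RETURN graphName, exists;
```

Checking the catalog first is a cheap way to avoid projecting the same graph twice or running an algorithm against a name that is no longer there.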
// Neo4j GDS Flow Diagram
+----------------+   +-----------------------+   +----------------------+
| Neo4j Database |==>| GDS Graph Projection  |==>| Graph Catalog        |
| (Stored Nodes, |   | (In-Memory Subgraph)  |   | (Manage In-Memory    |
|  Relationships)|   |                       |   |  Graphs: List, Drop) |
+----------------+   +-----------------------+   +----------------------+
                                ||
                                ||
                                \/
                     +---------------------+
                     | GDS Algorithms      |
                     | (PageRank,          |
                     |  Community Detect., |
                     |  Similarity, ML)    |
                     +---------------------+
                                ||
                                ||
                                \/
                     +---------------------+
                     | Results             |
                     | (Mutate, Write back,|
                     |  Stream to client)  |
                     +---------------------+
3. Example: GDS Workflow Code Snippet (Not Practical with Cypher Alone)
// Project Person nodes and FRIEND relationships into memory
CALL gds.graph.project(
  'friends-graph',
  'Person',
  'FRIEND'
);

// Run the PageRank algorithm and store scores on the in-memory graph
CALL gds.pageRank.mutate(
  'friends-graph',
  { mutateProperty: 'pageRankScore' }
);

// Stream the top 10 PageRank results
CALL gds.graph.nodeProperties.stream(
  'friends-graph',
  ['pageRankScore']
)
YIELD nodeId, propertyValue
RETURN gds.util.asNode(nodeId).name AS personName, propertyValue AS pageRankScore
ORDER BY pageRankScore DESC
LIMIT 10;

// Clean up memory
CALL gds.graph.drop('friends-graph');
🚀 Cypher alone has no in-memory projection and no built-in PageRank, so this analysis flow depends on the GDS procedures.
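The snippet above skips the optional write step (⑤). A minimal sketch of it follows, assuming 'friends-graph' is still in the catalog (that is, it runs before `gds.graph.drop`) and reusing `pageRankScore` as the property name.

```cypher
// Persist PageRank scores as a node property in the database.
// Run this while 'friends-graph' is still in the catalog.
CALL gds.pageRank.write(
  'friends-graph',
  { writeProperty: 'pageRankScore' }
)
YIELD nodePropertiesWritten, ranIterations, didConverge
RETURN nodePropertiesWritten, ranIterations, didConverge;

// After the write, plain Cypher can read the scores.
MATCH (p:Person)
WHERE p.pageRankScore IS NOT NULL
RETURN p.name AS personName, p.pageRankScore AS pageRankScore
ORDER BY pageRankScore DESC
LIMIT 10;
```

Write mode runs the algorithm again rather than reusing the mutated property. If the scores already exist on the in-memory graph, the catalog procedure `gds.graph.nodeProperties.write` in recent GDS 2.x releases is likely the better fit for persisting them.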
4. Key Points
- GDS Graphs ≠ Neo4j Database: They are temporary, in-memory copies optimized for analytics.
- Projection: Include only the node labels, relationship types, and properties you need.
- Graph Catalog: Manage multiple in-memory graphs independently.
- Mutate Mode: Store computed values on the in-memory graph without touching the database.
- Write Mode: Explicitly write analytics results back to the database when needed.
- Drop: Always drop in-memory graphs once analysis is complete to free memory.
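Two of these points, projecting only what fits in memory and dropping graphs afterwards, can be sketched as below. The example assumes GDS 2.x and reuses the `Person` / `FRIEND` schema and the 'friends-graph' name from section 3.

```cypher
// Estimate the memory needed to project Person nodes and FRIEND
// relationships before actually creating the in-memory graph.
CALL gds.graph.project.estimate('Person', 'FRIEND')
YIELD requiredMemory, nodeCount, relationshipCount
RETURN requiredMemory, nodeCount, relationshipCount;

// Drop the graph when finished; the second argument (failIfMissing = false)
// avoids an error if the graph was already removed.
CALL gds.graph.drop('friends-graph', false)
YIELD graphName
RETURN graphName;
```

Running the estimate first gives an early signal when a projection will not fit in the available heap, and the tolerant drop makes cleanup safe to repeat in scripts.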
🚀 Final Mindset
Use Cypher for database operations.
Use GDS for fast, scalable, and isolated graph analytics and machine learning.