How to Deploy and Use a Graph Database

Highlights

If you work with complex data, a graph database can simplify your workflow. Unlike relational databases that organize data into tables, a graph database structures information as a network of interconnected nodes. For queries that follow relationships between data points, this approach is often significantly faster and more efficient.

In this guide, we dive into graph databases, covering their key use cases and advantages over traditional database models, and the steps to deploy and use them. We’ll also explain how to query data using Cypher, and then demonstrate the workflow through a step-by-step tutorial using FalkorDB.

What Is a Graph Database?

A graph database is built to store and query relationships between data efficiently. In graph databases, you structure data as nodes (entities) and relationships (edges). Nodes can be assigned labels, relationships have types, and you can associate properties with both. This structure allows for highly efficient querying of complex relationships, making graph databases particularly well-suited for applications that involve interconnected data. Modern graph databases like FalkorDB are designed for high performance, fetching connections in milliseconds, even when managing billions of nodes.

Key Components of a Graph Database

Nodes – Represent entities such as people, products, or locations. Each node can have multiple properties (key-value pairs) describing its attributes.
Relationships (Edges) – Define the connections between nodes. These relationships have types (e.g., “FRIENDS_WITH” or “PURCHASED”) and can also carry properties.
Properties – Additional details assigned to nodes and relationships, such as names, timestamps, or weights, enhancing query capabilities.
Labels – Group nodes into categories (e.g., “User,” “Product”), enabling more efficient indexing and retrieval.
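
To see how these four components fit together, here is a minimal Python sketch of the data model. It is only an illustration, not how FalkorDB stores data internally; the class names `Node` and `Relationship` and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with one or more labels and key-value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection between two nodes, with its own properties."""
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: a user who purchased a product.
alice = Node({"User"}, {"name": "Alice"})
laptop = Node({"Product"}, {"name": "Laptop", "price": 1200})
purchase = Relationship("PURCHASED", alice, laptop, {"timestamp": "2025-02-17"})
```

Note that the relationship holds direct references to its start and end nodes, and carries properties of its own, just like a node does.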

Graph Databases vs Relational Databases

Relational databases store data in tables with predefined schemas and rely on joins to establish relationships, which can become inefficient for complex queries. As data volume and relationship complexity grow, relational databases require increasingly expensive joins, leading to slower queries and performance bottlenecks. Graph databases mitigate this by storing relationships as first-class entities, allowing direct traversal of connected nodes without the need for costly joins. This results in significantly faster queries and better scalability for complex, interconnected data.

Additionally, graph databases store relationships natively alongside the nodes they connect. This structure is more flexible and makes traversal fast, because following a relationship is a direct lookup rather than a join computed at query time. Depending on the engine and hardware, this can allow millions of connections to be traversed per second.

By eliminating the traditional JOIN operations found in relational databases, graph databases enhance both speed and efficiency, making them ideal for handling complex queries and large datasets with ease.
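
To make the contrast concrete, here is a small Python sketch of a two-hop "friends of friends" lookup done both ways. It is plain Python rather than database code, and the sample data and the names `follows_rows`, `adjacency`, `fof_with_joins`, and `fof_with_traversal` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (follower, followed) pairs, the way a
# relational table would store them.
follows_rows = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("bob", "dave"),
    ("carol", "erin"),
]


def fof_with_joins(rows, start):
    """Two-hop lookup as a self-join: every hop scans the whole table."""
    first_hop = {dst for src, dst in rows if src == start}
    return {dst for src, dst in rows if src in first_hop} - {start}


# Graph-style storage: each node keeps direct references to its neighbors.
adjacency = defaultdict(set)
for src, dst in follows_rows:
    adjacency[src].add(dst)


def fof_with_traversal(adj, start):
    """Two-hop lookup by following stored edges; only reachable nodes are touched."""
    result = set()
    for friend in adj.get(start, ()):
        result |= adj.get(friend, set())
    return result - {start}
```

Both functions return the same answer, but the join version does work proportional to the size of the table on every hop, while the traversal version does work proportional to the number of neighbors it actually visits.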

Graph Databases vs Vector Databases

In vector databases, data is stored as high-dimensional vector embeddings, which are numerical representations generated by machine learning models to capture the features of data. When querying, the input is converted into a vector embedding, and similarity searches are performed between the query vector and stored embeddings using distance metrics like cosine similarity or Euclidean distance to retrieve the most relevant results.
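
As a rough illustration of that query flow, the sketch below ranks a few stored embeddings by cosine similarity to a query vector. The three-dimensional vectors, the document names, and the `top_k` helper are made up for this example; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical stored embeddings, keyed by document name.
embeddings = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}


def top_k(query, store, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```

Calling `top_k([1.0, 0.0, 0.0], embeddings)` ranks the two pet documents ahead of the tax document, because their vectors point in nearly the same direction as the query.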

Most vector databases cannot explicitly capture relationships between data points, unlike graph databases. This means they struggle with queries that require understanding structured relationships, such as tracing multi-hop connections, identifying dependencies between entities, or performing path-based reasoning. While vector databases excel in finding similar items based on feature proximity, they are not designed for scenarios where relationship-driven insights, such as network analysis or hierarchical dependencies, are crucial.

Modern graph databases, like FalkorDB, allow you to store vector embeddings along with nodes and relationships between nodes, thereby giving you the best of both worlds.

Graph Databases vs NoSQL Databases

Traditional NoSQL databases store data in key-value, column-family, or document (for example, JSON) formats. They are designed primarily for scalability and flexibility rather than structured relationships, so developers have to handle complex relationship queries in application code. While NoSQL databases excel at handling unstructured or semi-structured data and large-scale distributed storage, they often struggle to query deeply connected data efficiently.

Since graph databases also let you store arbitrary properties on nodes and relationships, you get much of the schema flexibility that NoSQL databases offer, without sacrificing the ability to perform complex relationship queries.

Graph Database Examples: How Tech Giants Use Graphs to Scale and Optimize

Graph databases have become a cornerstone of building more reliable, less hallucination-prone AI and highly scalable systems. Did you know that tech giants like Google and Facebook have developed their own graph systems to handle vast amounts of interconnected data efficiently? These systems power everything from social networking to search and advertising. Let’s explore how they work.

Facebook: Behind Social Connections

Facebook’s entire platform thrives on relationships—friendships, follows, likes, and shares. A traditional relational database would struggle to manage these intricate connections at scale. Instead, Facebook relies on a graph database to store and query relationships directly, enabling features like “People You May Know” and targeted advertising to work in real time.

To handle its massive network of billions of users and interactions, Facebook developed TAO (The Associations and Objects), a distributed data store optimized for its social graph.

How TAO Works

Graph Structure: Stores users, posts, pages, and interactions as nodes and edges, making relationship queries seamless.
High-Speed Caching: Uses an efficient caching layer to serve read-heavy traffic, on the order of a billion reads per second according to Facebook’s published description of TAO.
Leader-Follower Model: Follows a leader-follower replication model, ensuring fast reads with low latency while maintaining data consistency.

Google: Graphs Powering Search, Maps, and Ads

Google relies on graph technology across multiple services. In Google Search, the Knowledge Graph helps understand connections between people, places, and concepts to improve search results. In Google Maps, graphs model roads, locations, and businesses to optimize navigation. Google Ads uses graphs to track user behavior and show relevant ads.

Google’s graph database setup includes:

- Knowledge Graph for entity relationships in search.
- Spanner, a distributed SQL database with graph-like capabilities.
- Bigtable & Pregel for large-scale graph processing, often used in AI/ML.

Key Graph Technologies at Google

Knowledge Graph:
- Powers Google Search by linking people, places, and concepts.
- Helps understand user queries better, providing richer, context-aware search results.
Spanner:
- A globally distributed SQL database with graph-like capabilities.
- Supports Google’s mission-critical applications, including ads and cloud services.
Bigtable & Pregel:
- Bigtable: A distributed storage system that supports large-scale graph processing.
- Pregel: A specialized system for running complex graph algorithms, often used in AI and machine learning tasks.

Core Concepts of Graph Databases

As we’ve already discussed, graph databases take a different approach to organizing data compared to traditional relational databases.

Let’s go a little deeper into the core components that make graph databases so powerful.

Nodes and Relationships

Nodes are the primary data points in a graph database, representing entities such as a person, product, or location. Relationships connect nodes and define how they interact. For example, a user can “follow” another user, or a customer can “purchase” a product.

Relationships are just as important as nodes. Unlike traditional databases, where relationships are inferred through foreign keys, graph databases store them explicitly. This allows for faster queries and better performance when analyzing complex connections.

Example Cypher Query:

CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (bob:Person {name: 'Bob', age: 25})
CREATE (alice)-[:FOLLOWS]->(bob)

This creates Person nodes for Alice and Bob, with a FOLLOWS relationship between them.

Properties and Labels

Nodes and relationships store additional details using properties, which are key-value pairs. For example, a person node may have properties like name and age, while a “purchased” relationship could include a timestamp to track when the purchase occurred.

To keep data organized, labels categorize nodes. For instance, all people can have a Person label, while all products can have a Product label. This grouping improves query performance by allowing searches within specific categories instead of scanning the entire database.

Together, nodes, relationships, properties, and labels define the structure of a graph database, enabling efficient data organization and retrieval.

Example Cypher Query:

CREATE (laptop:Product {name: 'Laptop', price: 1200})
CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (alice)-[purchase:PURCHASED {timestamp: '2025-02-17'}]->(laptop)

Here, a Product node (Laptop) is created, with a PURCHASED relationship from Alice, including a timestamp property.

Indexes

Indexes are specially designed data structures that play a crucial role in optimizing the performance of databases. By organizing data in a way that allows for quick retrieval, indexes significantly improve read operations, making them an indispensable feature in database management.

How Indexes Work

Quick Data Retrieval: Think of an index like a book’s table of contents. Instead of flipping through every page to find a specific topic, you can jump straight to the page you need. Similarly, an index in a database allows you to locate and access data swiftly without scanning the entire dataset.
Improving Query Speed: Without indexes, a simple search query could require scanning every entry in a database. This process, known as a full table scan, is time-consuming and resource-intensive. Indexes streamline this by providing a shortcut, reducing the time taken to execute queries and therefore speeding up overall system performance.
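
The difference between scanning and using an index can be sketched in a few lines of Python. This is a simplified model that uses a dictionary as the index; the sample nodes and the helper names `find_by_scan`, `build_index`, and `find_by_index` are assumptions for illustration, and real database indexes use more sophisticated structures.

```python
# Hypothetical sample nodes, each with an id, a label, and a name property.
people = [
    {"id": 1, "label": "Person", "name": "Alice"},
    {"id": 2, "label": "Person", "name": "Bob"},
    {"id": 3, "label": "Person", "name": "Alice"},
]


def find_by_scan(nodes, name):
    """Without an index: inspect every node."""
    return [n["id"] for n in nodes if n["name"] == name]


def build_index(nodes, prop):
    """Build a lookup from property value to the ids of matching nodes."""
    index = {}
    for n in nodes:
        index.setdefault(n[prop], []).append(n["id"])
    return index


def find_by_index(index, name):
    """With an index: jump straight to the matching ids."""
    return index.get(name, [])


name_index = build_index(people, "name")
```

The trade-off is that the index has to be built and kept up to date on every write, which is why you index only the properties you actually filter on. In FalkorDB, an index on a node property is created with a Cypher statement of the form `CREATE INDEX FOR (p:Person) ON (p.name)`; check the FalkorDB documentation for the syntax supported by your version.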

Types of Indexes

Single-column Indexes: These are created for a single column. They’re useful when your query primarily targets a specific column.
Multi-column Indexes: Also known as composite indexes, they involve multiple columns. These are beneficial when queries frequently involve filtering or sorting based on more than one column.
Unique Indexes: As the name suggests, they ensure that the indexed column remains unique, preventing duplicate entries, which can be critical for maintaining data integrity.
Full-text Indexes: These are perfect for text-heavy searches, enabling efficient search capabilities across large text fields.

Advantages of Using Indexes

Faster Data Access: By reducing the need for a complete table scan, indexes provide a direct route to the data, cutting down on retrieval times.
Efficient Sorting and Filtering: Indexes can enhance operations that involve sorting and filtering, as they organize the data in a manner conducive to these tasks.
Increased Performance for Large Databases: As databases scale, maintaining performance can become challenging. Indexes help minimize bottlenecks, ensuring that even large databases run smoothly.

Schemas and Constraints

When working with a graph database, it is important to ensure that data remains consistent and reliable. Schemas and constraints serve as a powerful framework for data integrity.

Understanding Schemas and Their Role

Schemas Define Structure: A schema provides an overarching structure for how nodes and relationships should be organized within the graph, acting as a blueprint for data storage.
Guiding Data Input: Schemas help data architects define what kind of data is permissible, ensuring that it fits within the established framework and adheres to the intended design.

Constraints

Enforcing Data Rules: Constraints are specific rules applied to ensure data remains consistent. They check that the data respects certain conditions before it is committed to the database.
Preventing Data Anomalies: Constraints can prevent data duplication by enforcing uniqueness, as well as guarantee the presence of essential data fields by enforcing mandatory properties.

Types of Constraints in Graph Databases

Uniqueness Constraints: Ensure that specific properties like unique identifiers do not replicate across different nodes or relationships.
Property Existence Constraints: Guarantee that nodes or relationships have specific, required properties before they are saved in the database.
Node Key Constraints: A more advanced combination of uniqueness and property existence, ensuring complete and consistent node representation.
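
To show what these constraints check, here is a small Python sketch that enforces a property existence rule and a uniqueness rule before a node is stored. It only models the behavior; the `PersonStore` class, the `ConstraintError` exception, and the choice of `email` and `name` as required properties are assumptions for this example, not a database API.

```python
class ConstraintError(ValueError):
    """Raised when a node would violate a constraint."""


class PersonStore:
    """Toy store for Person nodes with two constraints on write."""

    REQUIRED = ("email", "name")  # property existence constraint
    UNIQUE = "email"              # uniqueness constraint

    def __init__(self):
        self._by_email = {}

    def add(self, props):
        # Existence: every required property must be present.
        for key in self.REQUIRED:
            if key not in props:
                raise ConstraintError(f"missing required property: {key}")
        # Uniqueness: no two nodes may share the same email.
        if props[self.UNIQUE] in self._by_email:
            raise ConstraintError(f"duplicate {self.UNIQUE}: {props[self.UNIQUE]}")
        self._by_email[props[self.UNIQUE]] = dict(props)

    def __len__(self):
        return len(self._by_email)
```

Checking both rules on the same property set is roughly what a node key constraint does: the key properties must exist and their combination must be unique.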

Graph Database Use Cases

Graph databases excel in scenarios where connections between data matter most. They help detect patterns, enhance search accuracy, and enable faster real-time decision-making. In this section, we’ll explore how graph databases power fraud detection, AI/ML, RAG, recommendation engines, and pattern discovery.

Fraud Detection

Graph databases can be used to analyze connections between users, transactions, and devices to detect fraudulent activities. By identifying unusual relationship patterns, you can flag suspicious behaviors such as unexpected money transfers, fake accounts, and bot networks. This approach can be far more effective than traditional rule-based fraud detection, which often fails to uncover complex fraud rings.
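
One simple relationship pattern of this kind is many accounts sharing a single device. The Python sketch below flags such devices from a list of account-to-device edges. The edge list, the threshold of three accounts, and the `suspicious_devices` helper are illustrative assumptions; a production system would combine many such signals.

```python
from collections import defaultdict

# Hypothetical (account, device) edges, e.g. an account logged in from a device.
used_edges = [
    ("acct_1", "dev_A"),
    ("acct_2", "dev_A"),
    ("acct_3", "dev_A"),
    ("acct_4", "dev_B"),
    ("acct_5", "dev_B"),
    ("acct_5", "dev_C"),
]


def suspicious_devices(edges, threshold=3):
    """Return devices used by at least `threshold` distinct accounts."""
    accounts_by_device = defaultdict(set)
    for account, device in edges:
        accounts_by_device[device].add(account)
    return {
        device: sorted(accounts)
        for device, accounts in accounts_by_device.items()
        if len(accounts) >= threshold
    }
```

With the sample data, only `dev_A` is flagged, because three separate accounts are linked to it. A rule that looks at each account in isolation would not notice this, since the signal lives in the connections.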

AI/ML

Machine learning models benefit significantly from graph databases because graphs let them leverage interconnected data and reason over relationships more effectively. Large Language Models (LLMs), for instance, are trained on web-scale data up to a certain cut-off date and are prone to hallucination. If your data is not part of their training dataset, there is a high probability that the LLM will confidently generate incorrect or outdated outputs.

In such scenarios, you can enhance LLM performance by integrating graph databases to query relevant data, reason over structured relationships, and provide accurate contextual information. This helps the LLM generate more informed and reliable responses.

Graph Database for RAG

In Retrieval-Augmented Generation (RAG), graph databases enhance context retrieval for LLMs by organizing knowledge as interconnected nodes, an architecture pattern known as GraphRAG.

Traditional RAG systems use vector databases for retrieval. Vector search might retrieve similar data points, but fail when queries require more structured knowledge exploration. This is where graph databases like FalkorDB, especially if they support vector search as well, can be extremely powerful. The structured representation of data that knowledge graphs can capture strengthens the LLM’s reasoning capabilities, while the vector index can help refine the similarity searches, leading to more accurate and contextually relevant AI-generated responses.
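
The idea of combining the two retrieval styles can be sketched as follows: use vector similarity to pick a starting node, then follow its relationships to pull in connected facts. This is a toy Python model, not the GraphRAG SDK; the tiny knowledge graph, the two-dimensional embeddings, and the `retrieve_context` helper are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical knowledge graph: each node has text, an embedding, and neighbors.
nodes = {
    "falkordb": {
        "text": "FalkorDB is a graph database.",
        "embedding": [1.0, 0.0],
        "neighbors": ["cypher"],
    },
    "cypher": {
        "text": "Cypher is a graph query language.",
        "embedding": [0.6, 0.8],
        "neighbors": [],
    },
    "pasta": {
        "text": "Pasta is boiled in salted water.",
        "embedding": [0.0, 1.0],
        "neighbors": [],
    },
}


def retrieve_context(query_embedding, graph, k=1):
    """Pick the k most similar nodes, then add their direct neighbors."""
    ranked = sorted(
        graph,
        key=lambda name: cosine(query_embedding, graph[name]["embedding"]),
        reverse=True,
    )
    selected = []
    for seed in ranked[:k]:
        for name in [seed, *graph[seed]["neighbors"]]:
            if name not in selected:
                selected.append(name)
    return [graph[name]["text"] for name in selected]
```

For a query embedding close to the `falkordb` node, the function returns that node's text plus the text of its `cypher` neighbor, which similarity alone would have ranked lower. The returned passages are what you would place in the LLM prompt as context.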

Recommendation Engines

Graph databases can also be used to build recommendation systems. You can use them to analyze user preferences, purchase history, and interactions. Unlike traditional methods that rely on simple similarity scores, graph-based recommendations uncover deep connections between users and products. This approach enables highly personalized suggestions in e-commerce, streaming platforms, and social media, enhancing user engagement and satisfaction.
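
A basic version of this is "people who bought what you bought also bought". The Python sketch below scores candidate products by how much purchase history the other buyers share with the target user. The purchase data and the `recommend` helper are made up for this example, and real recommenders weigh many more signals.

```python
from collections import Counter

# Hypothetical user -> purchased products edges.
purchases = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "keyboard", "mouse"},
    "carol": {"laptop", "monitor"},
    "dave": {"phone"},
}


def recommend(user, data, limit=3):
    """Recommend products bought by users whose purchases overlap with `user`."""
    owned = data[user]
    scores = Counter()
    for other, items in data.items():
        if other == user:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # no shared purchases, so no path between the two users
        for item in items - owned:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:limit]]
```

For Alice, the keyboard ranks above the monitor because Bob shares two purchases with her while Carol shares only one. Dave's phone is never suggested, since no path connects him to Alice.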

Pattern Discovery

Businesses can also use graph databases to uncover hidden trends in customer behavior, supply chains, and cybersecurity. In customer behavior analysis, for instance, businesses can map relationships between purchasing patterns, product preferences, and social interactions to identify emerging trends and predict future consumer needs. This allows for hyper-personalized marketing and better demand forecasting.

In medical research and genetics, graph databases can facilitate the mapping of genetic relationships, protein interactions, and disease progression. By analyzing how genes and biological markers are interconnected, researchers can identify potential treatments, understand disease patterns, and accelerate drug discovery.

The applications are numerous, and we are witnessing new use cases emerge daily.

How to Use a Graph Database

Now that we have a solid understanding of graph databases, let’s see how to actually use one. Using a graph database means storing, querying, and analyzing data as interconnected nodes and relationships. You typically load data, define relationships, and run queries to extract meaningful insights.

In this section, we’ll cover the fundamentals of the Cypher query language and its syntax, then show how to set up a FalkorDB Cloud database instance on AWS or GCP.

Querying the Graph with Cypher

FalkorDB’s Cypher support makes it easy to work with graph data. With Cypher, you can search for patterns, filter results, and structure data efficiently. It allows you to add, update, or remove nodes and relationships with straightforward, readable queries.

For more advanced tasks, Cypher lets you combine results, loop through lists, and execute specialized functions. You can also import external data seamlessly, making integration smooth and efficient. FalkorDB takes this a step further with built-in graph algorithms and indexing, ensuring that even large-scale queries run fast.

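The essential Cypher clauses you need to get started with FalkorDB are CREATE (add nodes and relationships), MATCH (find patterns in the graph), WHERE (filter matches), RETURN (choose what a query outputs), SET (add or update properties), DELETE (remove nodes and relationships), and MERGE (match a pattern, creating it if it does not exist).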

Graph Database on FalkorDB Cloud

To start using FalkorDB, you first need to create an account. Go to the FalkorDB Cloud website and click on the Sign Up button. You’ll be asked to enter your email address and set a password. After filling in the details, submit the form to create your account.

Verifying Your Email

Once you sign up, you’ll receive an email from FalkorDB with a verification link. Open your inbox and click on the link to verify your email address. If you do not see the email, check your spam or promotions folder. Once verified, you can log in to your FalkorDB Cloud account.

Exploring the FalkorDB Cloud Dashboard

After logging in, you’ll be taken to the FalkorDB Cloud dashboard. This is where you can create and manage your database instances. The dashboard provides several important sections:

Instances: This is where all your deployed databases are listed. You can see their status, health, and other details.
Subscription Plans: FalkorDB offers different plans, including Free, Startup, Pro, and Enterprise. You can choose a plan based on your needs.
Cloud Providers: FalkorDB supports deployment on both Amazon Web Services (AWS) and Google Cloud Platform (GCP).
Monitoring Tools: You can track the health, usage, and logs of your database instances from the dashboard.

Creating a Deployment Instance

To create a new database instance, click on the Create Deployment Instance button. This will open a setup screen where you need to configure your database settings.

First, you must choose a subscription plan. For this tutorial, we are selecting the FalkorDB Free plan. This allows you to deploy a database at no cost with limited resources.

Choosing a Cloud Provider and Region

FalkorDB lets you deploy databases on either Amazon Web Services (AWS) or Google Cloud Platform (GCP).

AWS – you can choose Amazon EC2 R6i memory-optimized instances to deploy a FalkorDB cluster.
GCP – you can deploy on the Google Cloud E2 machine series.

After selecting a cloud provider, you need to choose a region. Each provider offers multiple data center locations.

For example, if you select AWS, you might choose us-east-2, while on GCP, you could select us-central1. The region determines where your database will be hosted, so it is best to choose a location close to your users for better performance.

Configuring the Deployment Instance

Now that you’ve selected the cloud provider and region, it is time to configure your database instance.

Name: Enter a name for your database instance. This will help you identify it later.
Description: You can add a short description to explain the purpose of this instance.
FalkorDB Username: The default username is set to falkordb, but you can change it if needed.
FalkorDB Password: Set a strong password for your database. Make sure to store it securely, as you’ll need it to connect to the database later.

Double-check all the details before proceeding to the next step.

Deploying the Database Instance

Once you’ve completed the configuration, click on the Create button. FalkorDB will now start deploying your database.

The deployment process may take a few minutes. During this time, the status of your instance will show as Initializing. Once the deployment is complete, the status will change to Healthy, indicating that your database is ready to use.

Viewing and Managing Your Database Instance

After the deployment is complete, you can view your instance in the Instances tab on the dashboard. Here, you’ll find important details about your database:

Instance Name: The name you assigned to your database.
Status: The health status of your database, which can be Healthy or Unhealthy.
Created Timestamp: The exact time your database was deployed.
Cloud Provider and Region: The platform and data center location where your database is hosted.

Congratulations, you’ve successfully deployed a FalkorDB instance on the cloud. You can now connect to it and start using it for your applications. Note that if you want to upgrade your plan, you will have to create a new deployment.

Navigating to the FalkorDB Browser and Creating a Graph Database

As the next step, go to the FalkorDB Browser by selecting the instance, and clicking ‘Open’ in the Actions selector, or by clicking the ‘Open’ button on the instance detail page.

Once you have launched the FalkorDB Browser, you can create and visualize the graph database using the Cypher language.

First, you need to enter the credentials: Host, Port, Username, Password.

Host: This should be the public endpoint or IP address of your FalkorDB instance. You can find this in the FalkorDB Cloud dashboard under your instance details, in the connectivity tab.
Port: FalkorDB typically uses port 6379 by default unless a different one was assigned during deployment.
Username: Enter the username you set during instance creation (falkordb by default).
Password: Enter the password you set during instance creation.
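
The same four values are what you need to connect from application code. Below is a hedged Python sketch that reads them from environment variables instead of hard-coding them. The variable names (`FALKORDB_HOST` and so on) and the helper functions are assumptions for this example; the `connect` function assumes the `falkordb` Python client is installed and that its `FalkorDB` constructor accepts `host`, `port`, `username`, and `password`, so check the client documentation for your version.

```python
import os


def connection_settings(env=os.environ):
    """Collect connection settings from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    return {
        "host": env.get("FALKORDB_HOST", "localhost"),
        "port": int(env.get("FALKORDB_PORT", "6379")),
        "username": env.get("FALKORDB_USERNAME", "falkordb"),
        "password": env.get("FALKORDB_PASSWORD"),
    }


def connect(settings):
    """Open a client connection (assumes `pip install falkordb`)."""
    # Imported lazily so connection_settings() works without the client installed.
    from falkordb import FalkorDB

    return FalkorDB(**settings)


# Example usage, assuming a reachable instance and a graph named "social":
#   db = connect(connection_settings())
#   graph = db.select_graph("social")
#   graph.query("MATCH (p:Person) RETURN p.name")
```

Keeping the password in an environment variable or a secrets manager means it never ends up in source control.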

Here’s a simple set of FalkorDB queries to create nodes and edges. Run them one by one in the FalkorDB Browser.

// Step 1: Create nodes
CREATE (:Person {name: "Alice", age: 30});
CREATE (:Person {name: "Bob", age: 25});

// Step 2: Create an edge (relationship)
MATCH (a:Person {name: "Alice"}), (b:Person {name: "Bob"})
CREATE (a)-[:FRIENDS_WITH]->(b);

By following these steps, you should now have a fully functional FalkorDB instance and a clear understanding of how to interact with it using queries. With your instance up and running, you can start exploring more advanced queries and integrations to unlock deeper insights from your data.

Wrapping Up

FalkorDB Cloud streamlines graph database deployment with an intuitive interface and seamless support for major cloud providers. Whether you’re a beginner or an experienced developer, its flexibility and scalability make it a powerful tool for managing connected data efficiently.

Install FalkorDB locally to get into graph processing, or deploy an instance on FalkorDB Cloud for scalable, high-performance data management. Start now and unlock the power of interconnected data!

What is a graph database?

A graph database stores data as nodes and relationships, enabling efficient querying of interconnected data.

How does FalkorDB compare to relational databases?

FalkorDB eliminates costly joins by storing relationships directly, offering faster queries for complex data.

Can I deploy FalkorDB on AWS or GCP?

Yes, FalkorDB supports deployment on both AWS and GCP with easy configuration and scalability.
