
When does Aerospike Graph Database Shine?

2024-08-13

Introduction

In today’s data-driven world, AdTech businesses often face the challenge of managing vast amounts of data with limited resources. Aerospike Graph shines when the graph-like data does not fit into cluster RAM, and it scales well under graph query loads.

Hybrid memory architecture enables large datasets

Many databases perform very well when the whole dataset, or at least the hot data, fits into the cluster’s memory. In that scenario, quite a few databases (especially in-memory caching solutions) have quicker response times, and we saw this in our internal tests. This is also true for graph databases. However, Aerospike Graph uses a hybrid memory architecture that combines DRAM with SSDs, offering consistent performance when the data doesn’t fit into the cluster memory.

For example, we compared the write speeds of AWS Neptune and Aerospike Graph. As the amount of stored data grew, AWS Neptune quickly slowed down, and Aerospike Graph soon outperformed it.

More importantly, Aerospike Graph had a predictable and stable write speed (see the figure below), regardless of the amount of stored data. For production workloads, that predictability often matters more than peak speed when choosing a database.


This graphic shows that Aerospike Graph remains stable and performant compared to a database that relies primarily on DRAM for quick access to data.
Past a certain data volume, Aerospike Graph became the faster of the two, and its write latency remained stable as the data volume kept growing.

Aerospike Graph shows such stable performance because of its hybrid architecture: DRAM is used only for storing key indexes, and the data itself is stored entirely on SSDs. The database is designed to work with SSDs directly, bypassing the operating system’s file system, and is optimized to avoid uneven wear on the drives when updating records.
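The split between an in-memory index and on-disk data can be pictured with a toy model. This is only an illustrative sketch, not Aerospike's implementation: the `HybridStore` class and its methods are hypothetical, and it writes to an ordinary file through the file system, whereas the database described above works with the drives directly.

```python
import os
import tempfile


class HybridStore:
    """Toy key-value store: a small index lives in memory, values live on disk.

    Hypothetical illustration only. It uses a regular file, not a raw device.
    """

    def __init__(self, path):
        self._file = open(path, "w+b")
        self._index = {}  # key -> (offset, length); the only per-key state in memory

    def put(self, key, value):
        # Append the value to the end of the file and remember where it went.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._file.flush()
        self._index[key] = (offset, len(value))

    def get(self, key):
        # One in-memory lookup, then one positioned read from disk.
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)

    def index_entries(self):
        return len(self._index)

    def close(self):
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = HybridStore(os.path.join(tmp, "data.bin"))
        store.put("user:1", b"device=d1;segment=sports")
        store.put("user:2", b"device=d2;segment=news")
        print(store.get("user:1"))
        print("index entries held in memory:", store.index_entries())
        store.close()
```

In this sketch, memory use grows with the number of keys rather than with the size of the values, which is the property the paragraph above relies on.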

Storing key data in memory enables quick traversal or transitive graph-style lookups. The way Aerospike Graph works with SSDs or flash drives also allows graph lookups to be read within milliseconds, even in all-flash mode. The hybrid database may be slightly slower than fully in-memory databases, but it still fits within many AdTech query time constraints. This hybrid model is a good option when you have large amounts of data and need stable, predictable performance.

Lower cost of storing data

Budget constraints are a common concern when scaling data infrastructure. Aerospike Graph offers a cost-efficient solution by minimizing the need for extensive memory and expensive hardware. Its hybrid storage model allows businesses to scale their data storage without proportional increases in cost.

Good failover characteristics and the ability to add a node to a live cluster allow us to maintain a manageable cluster size.

In an identity graph use case or a GDPR-style data layout, the graph often stores many small records (such as user IDs, attributes, and actions) per user. As a result, there are millions of records (primary keys and indexes to maintain graph structure), but the values themselves are small. Aerospike Graph needs 64 bytes of DRAM for each primary key, so with this kind of data you can run out of memory for the key data while the SSDs still have plenty of free space. This situation is common for AdTech use cases. Aerospike Graph has a solution to this problem: the all-flash (or SSD) storage mode, where all data, including the primary-key index, is stored on SSDs or flash drives.
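A back-of-the-envelope calculation shows how the index can exhaust DRAM long before the drives fill up. The 64 bytes per primary key comes from the text above; the record count, average record size, and replication factor below are hypothetical inputs, and the estimate ignores any other memory the database needs.

```python
INDEX_BYTES_PER_KEY = 64  # DRAM per primary key, as stated in the article


def index_dram_bytes(record_count, replication_factor=1):
    """Estimate cluster-wide DRAM needed for the primary-key index."""
    return record_count * replication_factor * INDEX_BYTES_PER_KEY


def data_ssd_bytes(record_count, avg_record_bytes, replication_factor=1):
    """Estimate cluster-wide SSD space needed for the record data."""
    return record_count * replication_factor * avg_record_bytes


if __name__ == "__main__":
    # Hypothetical workload: many small records, each kept in two copies.
    records = 5_000_000_000
    avg_record_bytes = 100
    copies = 2

    dram = index_dram_bytes(records, copies)
    ssd = data_ssd_bytes(records, avg_record_bytes, copies)
    print(f"index DRAM: {dram / 1e9:.0f} GB")
    print(f"data SSD:   {ssd / 1e9:.0f} GB")
    print(f"DRAM per byte of data: {dram / ssd:.2f}")
```

With records this small, the index needs more than half as much DRAM as the data needs SSD space, which is the imbalance that makes all-flash mode attractive.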

Conclusion

There are challenges in using graph databases for AdTech, especially with write latency and throughput, where batching and deduplication of records may help before writing to a database. However, Aerospike Graph’s hybrid memory architecture, high performance at scale, and cost efficiency make it a standout choice for managing large datasets within memory and budget limitations. By investing in an Enterprise Aerospike license, businesses can achieve the performance and scalability needed to stay competitive in a data-intensive world, without the prohibitive costs associated with traditional in-memory databases.
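Batching and deduplicating records before they are written could look roughly like the sketch below. The edge tuple shape, the `dedup_batches` helper, and the batch size are all hypothetical, and the actual write call to the database is deliberately left out.

```python
from typing import Hashable, Iterable, Iterator, List, Tuple

# Hypothetical edge shape: (source id, label, target id).
Edge = Tuple[Hashable, str, Hashable]


def dedup_batches(edges: Iterable[Edge], batch_size: int) -> Iterator[List[Edge]]:
    """Drop repeated edges and yield the rest in lists of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    seen = set()  # every edge already emitted in this run
    batch = []
    for edge in edges:
        if edge in seen:
            continue  # duplicate: skip the write entirely
        seen.add(edge)
        batch.append(edge)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


if __name__ == "__main__":
    incoming = [
        ("user:1", "uses", "device:a"),
        ("user:1", "uses", "device:a"),  # repeated event
        ("user:2", "uses", "device:b"),
        ("user:3", "uses", "device:c"),
    ]
    for batch in dedup_batches(incoming, batch_size=2):
        # A real pipeline would send each batch to the database here.
        print(len(batch), batch)
```

The `seen` set grows with the number of distinct edges, so a long-running stream would need to bound it, for example by resetting it per time window.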

Author: Alexey Rosolovsky
