Feed: Blog Post – Corporate – DataStax.
Author: Sebastian Estevez.
Yesterday at ApacheCon, our very own Patrick McFadin announced the public preview of an open source tool that enables developers to run their AWS DynamoDB workloads on Apache Cassandra. With the DataStax Proxy for DynamoDB and Cassandra, developers can run DynamoDB workloads outside of AWS (including on premises, in other clouds, and in hybrid configurations).
The Big Picture
The cloud has changed computing forever, and as cloud services continue to evolve up the stack, the new capabilities they offer developers come with trade-offs. One of the trade-offs developers, architects, and IT departments are making (sometimes deliberately, and sometimes accidentally) is increasing their reliance on particular cloud vendors in exchange for ease of use and time to market.
DataStax strives to help enterprises fend off disruptors by scaling their transactional workloads with technology rooted in open source Cassandra. With the rise of database-as-a-service (DBaaS) offerings, we see developers weighing their options. Should I use the most scalable database available, or the one that’s easiest to operate? Should I use a cloud database that allows me to autoscale? Is that worth being tied to a particular cloud vendor? What will my bill look like as my transaction load scales, given that I’m paying by request?
DataStax Constellation aims to provide the most scalable and performant DBaaS solution while still allowing users to pick the cloud providers and regions they want to run in. However, in some cases, developers have already made the choice to leverage cloud-specific DBaaS offerings and find themselves in the unenviable position of being unable to migrate their workloads away from a particular public cloud when they face technical limitations or commercial challenges.
At DataStax we believe it will be a better outcome for our customers if they are able to limit their reliance on individual tech giants, using only the best of each cloud, and reaping the benefits of their economies of scale. In the case of DynamoDB, this is an interoperability problem which we propose to solve with the DataStax Proxy for DynamoDB and Cassandra.
If you’re just here for the code you can find it in GitHub and DataStax Labs: https://github.com/datastax/dynamo-cassandra-proxy/
Possible Scenarios
The proxy may enable scenarios such as migrating DynamoDB workloads to self-hosted Cassandra clusters on-prem or on any cloud, and eventually to Constellation / DataStax Apache Cassandra as a Service. Hybrid deployments in which data replicates between AWS DynamoDB and Cassandra via DynamoDB Streams and Cassandra CDC may also be possible.
- Modernize DynamoDB workloads to persist on Cassandra or DataStax Enterprise (DSE) without a rewrite.
- Replicate data across AWS DynamoDB tables and Cassandra-backed tables on other clouds or on-prem. For technical details see the section below on Going Hybrid.
This proxy provides compatibility with the DynamoDB SDK, allowing existing applications to read and write data to DSE or Cassandra without any code changes. It also brings the hybrid, multi-model, and scalability benefits of Cassandra to DynamoDB users.
What’s in the Proxy?
The proxy is designed to enable users to back their DynamoDB applications with Cassandra. We determined that the best way to help users leverage this new tool and to help it flourish was to make it an open source Apache 2 licensed project.
The code consists of a scalable proxy layer that sits between your app and the database. It provides compatibility with the DynamoDB SDK which allows existing DynamoDB applications to read and write data to Cassandra without application changes.
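To make the "no application changes" point concrete, here is a minimal sketch of what a DynamoDB-style PutItem call looks like on the wire, using only the Python standard library. The endpoint address, table name, and helper function are hypothetical and chosen for illustration; the `X-Amz-Target` header and JSON body follow the DynamoDB low-level API. Real SDK requests also carry a signed Authorization header, which is omitted here, and this post does not say how the proxy treats it.

```python
import json
import urllib.request

# Hypothetical address where a proxy instance is listening (an assumption,
# not a documented default).
PROXY_ENDPOINT = "http://localhost:8080/"


def build_put_item_request(table, item, endpoint=PROXY_ENDPOINT):
    """Build (but do not send) a DynamoDB-style PutItem HTTP request."""
    body = json.dumps({"TableName": table, "Item": item}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            # Content type and target header used by the DynamoDB low-level API.
            "Content-Type": "application/x-amz-json-1.0",
            "X-Amz-Target": "DynamoDB_20120810.PutItem",
        },
    )


# Illustrative table and item; attribute values use DynamoDB's typed JSON.
request = build_put_item_request(
    "users",
    {"user_id": {"S": "u-1"}, "name": {"S": "Ada"}},
)
```

In practice an application would not build requests by hand. SDKs generally let you override the service endpoint (for example, the `endpoint_url` argument of a boto3 client), so pointing that setting at the proxy should be the only change the application sees.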
How It Works
We made a few design decisions when building the proxy. As always, these are in line with the design principles that we use to guide development for both Cassandra and our DataStax Enterprise product.
Why a Separate Process?
We could have built this as a Cassandra plugin that would execute as part of the core process but we decided to build it as a separate process for the following reasons:
- Ability to scale the proxy independently of Cassandra
- Ability to leverage k8s / cloud-native tooling
- Developer agility and easier contribution: developers can work on the proxy with limited knowledge of Cassandra internals
- Independent release cadence, not tied to the Apache Cassandra project
- Better AWS integration story for stateless apps (e.g., CloudWatch alarms and autoscaling)
Why Pluggable Persistence?
On quick inspection, DynamoDB’s data model is quite simple. It consists of a hash key, a sort key, and a JSON structure which is referred to as an item. Depending on your goals, the DynamoDB data model can be persisted in Cassandra Query Language (CQL) in different ways. To allow for experimentation, we built the translation layer so that different translators can be plugged in. We continue to build on this scaffolding to test out multiple data models and determine which are best suited for:
- Different workloads
- Different support for consistency / linearization requirements
- Different performance tradeoffs based on SLAs
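As an illustration of one possible translation (not necessarily one the proxy ships with), the sketch below maps the hash key to a Cassandra partition key, the sort key to a clustering column, and the item to a single JSON text column. The function name, the `item_json` column, and the choice of `text` for both key types are assumptions made for brevity; DynamoDB keys can also be numbers or binary.

```python
def cql_create_table(keyspace, table, hash_key, sort_key=None):
    """Return a CQL CREATE TABLE statement for one hypothetical translation.

    hash key -> partition key, sort key -> clustering column,
    item -> JSON document stored in a text column.
    """
    columns = [f"{hash_key} text"]
    primary_key = f"({hash_key})"
    if sort_key is not None:
        columns.append(f"{sort_key} text")
        # Double parentheses mark the partition key; the sort key clusters rows.
        primary_key = f"(({hash_key}), {sort_key})"
    columns.append("item_json text")  # hypothetical column holding the item
    column_sql = ", ".join(columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
        f"({column_sql}, PRIMARY KEY {primary_key})"
    )


# Illustrative keyspace, table, and key names.
statement = cql_create_table("dynamo", "events", "user_id", "created_at")
```

A different translator could instead spread item attributes across typed columns, trading schema flexibility for the ability to filter on those attributes. That is the kind of trade-off pluggable persistence leaves open.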
Going Hybrid
Global tables in DynamoDB are nothing more than multiple regional tables connected by DynamoDB Streams. Developers were “building their own” global tables using DynamoDB Streams for years before global tables existed; the feature is just syntactic sugar. We can use the same mechanism to replicate to Cassandra via the Dynamo-Cassandra proxy and provide the same level of guarantees that global tables provide. As part of the roadmap, we are looking into different approaches for streaming data back from Cassandra to DynamoDB, and we welcome external collaboration.
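The replication mechanism can be sketched as a small replay loop over stream records. This is a simplified illustration, not the proxy's implementation: a plain dictionary stands in for the Cassandra-backed table, the function names are hypothetical, and the records are assumed to come from a stream whose view type includes new images. The `eventName`, `Keys`, and `NewImage` fields follow the DynamoDB Streams record format.

```python
import json


def apply_stream_record(record, target):
    """Apply one DynamoDB Streams record to a dict standing in for the replica."""
    change = record["dynamodb"]
    # Serialize the key attributes so they can be used as a dictionary key.
    key = json.dumps(change["Keys"], sort_keys=True)
    if record["eventName"] in ("INSERT", "MODIFY"):
        # Assumes the stream is configured to include the new item image.
        target[key] = change["NewImage"]
    elif record["eventName"] == "REMOVE":
        target.pop(key, None)


def replay(records, target):
    """Apply records in the order received and return the target."""
    for record in records:
        apply_stream_record(record, target)
    return target


# Illustrative records: one insert followed by an update of the same item.
keys = {"user_id": {"S": "u-1"}}
records = [
    {"eventName": "INSERT",
     "dynamodb": {"Keys": keys, "NewImage": {**keys, "name": {"S": "Ada"}}}},
    {"eventName": "MODIFY",
     "dynamodb": {"Keys": keys, "NewImage": {**keys, "name": {"S": "Grace"}}}},
]
replica = replay(records, {})
```

A real consumer would also need to handle ordering across shards, retries, and conflicting writes made on both sides, which is where most of the design work in a hybrid deployment goes.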
Conclusion
If you have any interest in running DynamoDB workloads on Cassandra, take a look at the project. Getting started is easy and spelled out in the readme and DynamoDB sections. The set of features supported by the proxy is growing quickly, and collaborators are welcome.
https://github.com/datastax/dynamo-cassandra-proxy/
All product and company names are trademarks or registered trademarks of their respective owner. Use of these trademarks does not imply any affiliation with or endorsement by the trademark owner.