Amazon's new graph NoSQL database, Neptune, can be used to build and run applications that work with highly connected, contextual, relationship-driven datasets. It also supports read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones (AZs).
The Amazon team announced the preview version of the new database at the recent AWS re:Invent 2017 conference.
Amazon Neptune supports popular graph models like Property Graph and W3C's standard Resource Description Framework (RDF version 1.1), and their respective query languages, Apache TinkerPop Gremlin and SPARQL (specifically SPARQL Query 1.1, SPARQL Update 1.1, and SPARQL Protocol 1.1). For RDF, Neptune supports four serializations: Turtle, N-Triples, N-Quads, and RDF/XML.
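To give a feel for two of the serializations named above, here is a minimal Python sketch that renders the same statement as an N-Triples line and as an N-Quads line. The helper functions and the example.org IRIs are hypothetical and only illustrate the line-based syntax; a real application would more likely rely on an RDF library.

```python
def iri(value):
    """Wrap an absolute IRI in angle brackets, as N-Triples and N-Quads require."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def n_triple(subject, predicate, obj):
    """One N-Triples statement: subject, predicate, object, terminated by a dot."""
    return f"{subject} {predicate} {obj} ."


def n_quad(subject, predicate, obj, graph):
    """One N-Quads statement: a triple plus the IRI of the named graph it belongs to."""
    return f"{subject} {predicate} {obj} {graph} ."


# Hypothetical resources used purely for illustration.
ALICE = iri("http://example.org/alice")
BOB = iri("http://example.org/bob")
KNOWS = iri("http://xmlns.com/foaf/0.1/knows")
NAME = iri("http://xmlns.com/foaf/0.1/name")
SOCIAL_GRAPH = iri("http://example.org/graphs/social")

knows_triple = n_triple(ALICE, KNOWS, BOB)
name_triple = n_triple(ALICE, NAME, literal('Alice "Al" Smith'))
knows_quad = n_quad(ALICE, KNOWS, BOB, SOCIAL_GRAPH)

print(knows_triple)
print(name_triple)
print(knows_quad)
```

The only difference between the two formats in this sketch is the fourth term, which places the statement in a named graph.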
The core of Amazon Neptune is a purpose-built graph database engine optimized for storing large datasets of relationships and querying the graph with minimal latency. It supports up to 15 low-latency read replicas across three Availability Zones to scale read capacity and serve a high volume of graph queries per second. The database also features fault-tolerant, self-healing storage built for the cloud that replicates six copies of the data across three Availability Zones.
Neptune continuously backs up data to Amazon S3 and transparently recovers from physical storage failures. It's a fully managed database that takes care of database management tasks such as hardware provisioning, software patching, setup, configuration, and backups.
Similar to other graph databases, Neptune uses graph data elements such as nodes (data entities), edges (relationships), and properties to represent and store data. The relationships are stored as first-class citizens of the data model, which allows the data to be directly linked and improves the performance of queries that navigate relationships in the data.
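The node, edge, and property model can be sketched in a few lines of Python. This is a toy in-memory structure for illustration only, not Neptune's storage format or API; the class names and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data entity with a label and arbitrary key/value properties."""
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labeled relationship that can carry its own properties."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}
        # Edges are indexed by their source node id, so following a
        # relationship is a direct lookup rather than a scan or a join.
        self.out_edges = defaultdict(list)

    def add_node(self, node):
        self.nodes[node.id] = node

    def add_edge(self, edge):
        self.out_edges[edge.source].append(edge)

    def neighbors(self, node_id, label):
        """Return the nodes reachable from node_id over edges with the given label."""
        return [
            self.nodes[edge.target]
            for edge in self.out_edges[node_id]
            if edge.label == label
        ]


graph = PropertyGraph()
graph.add_node(Node("u1", "person", {"name": "Alice"}))
graph.add_node(Node("u2", "person", {"name": "Bob"}))
graph.add_node(Node("u3", "person", {"name": "Carol"}))
graph.add_edge(Edge("u1", "u2", "follows", {"since": 2016}))
graph.add_edge(Edge("u2", "u3", "follows", {"since": 2017}))

# Two-hop traversal: whom do the people Alice follows, follow in turn?
friends_of_friends = [
    fof.properties["name"]
    for friend in graph.neighbors("u1", "follows")
    for fof in graph.neighbors(friend.id, "follows")
]
print(friends_of_friends)  # ['Carol']
```

Because each edge is stored alongside its source node, the two-hop query walks links directly instead of matching rows across tables, which is the property the paragraph above describes.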
Neptune also provides multiple levels of data security, including network isolation using Amazon VPC, encryption of data at rest using AWS Key Management Service (KMS), and encryption of data in transit using TLS. On an encrypted Neptune instance, all copies of the data, including automated backups, snapshots, and replicas in the same cluster, are encrypted. For details of all the features, check out the Neptune Features page on the website.
Neptune can be used to implement graph use cases such as social networking, recommendation engines, fraud detection, knowledge graphs, life sciences, and network/IT operations.
How to use Amazon Neptune
Amazon Neptune offers two query engines, Gremlin and SPARQL. To query the Gremlin endpoint, you can use the following command:
curl -X POST -d '{"gremlin":"g.V()"}' https://your-neptune-endpoint:8182/gremlin
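The same Gremlin request can be issued from Python using only the standard library. This is a minimal sketch: the host name is the placeholder from the curl command, and the function names are hypothetical. Building the request is kept separate from sending it, so the payload can be inspected without access to a cluster.

```python
import json
import urllib.request

# Placeholder host copied from the curl example; replace with a real cluster endpoint.
NEPTUNE_ENDPOINT = "https://your-neptune-endpoint:8182"


def build_gremlin_request(traversal, endpoint=NEPTUNE_ENDPOINT):
    """Build (without sending) the HTTP POST that the curl command issues."""
    body = json.dumps({"gremlin": traversal}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint + "/gremlin",
        data=body,
        method="POST",
    )


def run_gremlin(traversal, endpoint=NEPTUNE_ENDPOINT):
    """Send the traversal and return the decoded JSON response.

    Requires network access to a running cluster, so it is not called here.
    """
    request = build_gremlin_request(traversal, endpoint)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_gremlin_request("g.V()")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With a reachable endpoint, calling run_gremlin("g.V()") would send the request and parse the JSON reply.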
If you are using SPARQL as the graph query language, you can query the SPARQL endpoint by entering the following command at the prompt:
curl -G https://your-neptune-endpoint:8182/sparql --data-urlencode 'query=select ?s ?p ?o where {?s ?p ?o}'
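A Python counterpart of the SPARQL command might look like the sketch below. It mirrors what curl's -G and --data-urlencode options do: the query is URL-encoded and appended to the endpoint as a query-string parameter of a GET request. The host name is the placeholder from the curl command and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Placeholder host copied from the curl example; replace with a real cluster endpoint.
NEPTUNE_ENDPOINT = "https://your-neptune-endpoint:8182"


def build_sparql_request(query, endpoint=NEPTUNE_ENDPOINT):
    """Build (without sending) a GET request carrying the URL-encoded query."""
    query_string = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url=endpoint + "/sparql?" + query_string,
        method="GET",
    )


def run_sparql(query, endpoint=NEPTUNE_ENDPOINT):
    """Send the query and return the raw response body as text.

    Requires network access to a running cluster, so it is not called here.
    """
    with urllib.request.urlopen(build_sparql_request(query, endpoint)) as response:
        return response.read().decode("utf-8")


request = build_sparql_request("select ?s ?p ?o where {?s ?p ?o}")
print(request.get_method(), request.full_url)
```

Encoding the query matters because characters such as spaces, question marks, and braces are not safe to place in a URL as-is.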
If you are interested in trying out the new graph database, you can sign up for the Amazon Neptune preview; requesting access requires an AWS Account Number. Other useful resources on the new graph database include the Getting Started page, Developer Resources, and the frequently asked questions (FAQs).