Introducing neo4j-embedded, a Neo4j driver for Node.js


Using Neo4j in standalone mode requires (except with clusters) that you communicate with the database over its native REST API. The big disadvantage of this method is that the REST API responds very slowly when you need to fetch large amounts of data.
Why is that?
Well, the problem seems to be the serialization to JSON and sending the resulting data over HTTP.


In a Java application you can use Neo4j as an embedded database. No serialization is needed, so the embedded database outperforms every REST client.

So to get close to this performance with Node.js, we need to run Neo4j in embedded mode.
Since Neo4j is written in Java, we need the Java Native Interface (JNI) for communication between the two languages.
A well-written Node.js module that bridges to Java via JNI is already available: node-java.

With this module you can use Java classes directly inside your Node.js code. Now, how cool is that?
The problem with JNI is that calls across it don't benefit from Java's code optimization. In Neo4j you mostly get iterators back from your calls, so you have to call a lot of Java methods from Node.js to read the results.

In Java you would do something like:

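// Requires java.util.Map, java.util.Map.Entry and Neo4j's ExecutionEngine / ExecutionResult.
ExecutionEngine engine = new ExecutionEngine( db );
// Assumption: the property name (n.name) and the return clause were lost from the listing and are reconstructed here.
ExecutionResult result = engine.execute( "start n=node(*) where n.name! = 'my node' return n, n.name" );
String rows = "";
for ( Map<String, Object> row : result ) {
  for ( Entry<String, Object> column : row.entrySet() ) {
    rows += column.getKey() + ": " + column.getValue() + "; ";
  }
  rows += "\n";
}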

As you can see, there are a lot of method calls inside the loop (entrySet(), getKey(), …). In Node.js there would be even more calls, because you cannot use Java's for loop to iterate over the results. That would reduce the performance gain, or could even make things slower than the REST API.

To overcome this, I’ve written a wrapper for Neo4j that returns a single result object for queries. It contains only simple arrays, so Node.js can iterate over the resulting data without calling any Java method. This results in higher memory and CPU usage, of course.
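
To make the idea concrete, here is a rough sketch of how such a flat result could be consumed on the Node.js side. The result shape (a columns array plus a data array of rows) and the function name formatRows are assumptions made up for illustration; they are not taken from the actual neo4j-embedded API.

```javascript
// Hypothetical result shape (NOT the real neo4j-embedded API):
// column names in one array, row values in nested plain arrays.
const result = {
  columns: ["n", "n.name"],
  data: [
    ["Node[1]", "my node"],
    ["Node[2]", "my node"],
  ],
};

// Builds the same "key: value; " text as the Java loop,
// using only plain JavaScript array access (no Java method calls).
function formatRows(res) {
  let rows = "";
  for (const row of res.data) {
    for (let i = 0; i < res.columns.length; i++) {
      rows += res.columns[i] + ": " + row[i] + "; ";
    }
    rows += "\n";
  }
  return rows;
}

console.log(formatRows(result));
```

Every access in the loop is a normal array lookup, so nothing crosses the JNI boundary while reading the data.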

There’s also a query builder for building Cypher queries.

Another great thing to mention here is the use of multi-core systems. Node.js itself is designed to run on a single core. If you have a multi-core machine, you will surely want to use all of its cores, and you especially don’t want Neo4j to use just a single one. Neo4j itself scales well on multi-core systems, and it still does so under the JNI hood.

Further development …

My goal was to write a Neo4j driver that outperforms the existing Node.js modules on big data exchanges.

Most of the methods are written synchronously, which is not really Node-like.
Queries, for example, which take some time to execute, are written asynchronously, so you can use the default Node semantics for them.
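
As a sketch of that split, the following uses a made-up stand-in object; the names db, setProperty and query, and their signatures, are assumptions for illustration and not the real neo4j-embedded API. A cheap call returns directly, while the query hands its result to a Node-style callback (error first, result second).

```javascript
// Hypothetical stand-in for a database wrapper (NOT the real neo4j-embedded API).
const db = {
  props: {},

  // Cheap operation: returns immediately, so a synchronous call is acceptable.
  setProperty(key, value) {
    this.props[key] = value;
    return this;
  },

  // Potentially slow operation: the result is delivered later through a
  // Node-style callback. setImmediate only simulates the deferred work here.
  query(cypher, callback) {
    setImmediate(() => {
      callback(null, { columns: ["n.name"], data: [[this.props.name]] });
    });
  },
};

db.setProperty("name", "my node");

db.query("start n=node(*) return n.name", (err, result) => {
  if (err) throw err;
  console.log(result.data);
});
```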

Maybe there will be an asynchronous version of each method in the future (if needed somehow), but I think calls such as setProperty, which aren’t that expensive, can stay synchronous. What are your thoughts about this?


Secure Neo4j Webadmin using HTTP auth and SSL

Neo4j offers a simple yet useful web interface to manage your database. You can secure its web administration interface with SSL. That’s great, but not good enough. If you open the port to the world, the world is invited to inspect and manipulate your database. That’s not very nice.

So what we will do about this is add an additional layer of security to our setup.
I use NGINX as the web proxy here, but the same approach should work with any other proxy.

Update: There’s a Neo4j Plugin out there, which might be a better option for you, depending on your needs:

In the first step we self-sign our own certificate. This should only be done in a development environment!

Our steps are:

  • Create a certificate to use for SSL access
  • Activate HTTPS in Neo4j
  • Create a credential file
  • Create an NGINX vhost
  • Drink some coffee

Create a certificate to use for SSL access

Ok let’s start by creating the certificates:

mkdir -p /var/ssl/neo4j

# Create the CA Key and Certificate for signing Client Certs
openssl genrsa -des3 -out /var/ssl/ca.key 4096
openssl req -new -x509 -days 365 -key /var/ssl/ca.key -out /var/ssl/ca.crt

# Create the Server Key, CSR, and Certificate
openssl genrsa -des3 -out /var/ssl/neo4j/server.key 2048
openssl req -new -key /var/ssl/neo4j/server.key -out /var/ssl/neo4j/server.csr

# We're self signing our own server cert here.  This is a no-no in production.
openssl x509 -req -days 365 -in /var/ssl/neo4j/server.csr -CA /var/ssl/ca.crt -CAkey /var/ssl/ca.key -set_serial 01 -out /var/ssl/neo4j/server.crt


Activate HTTPS in Neo4j

Now we have our certificates signed and ready to use.
Next we’ll modify the Neo4j configuration to suit our needs:


Restart your Neo4j instance if it is running; otherwise start it now.

neo4j restart

Now you can visit https://localhost:7473/webadmin/

Create a credential file

Create a credentials file for nginx HTTP basic authentication:

mkdir /var/auth

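# -apr1 (MD5-based, understood by nginx) is used instead of -crypt, which OpenSSL 3.x no longer offers
printf "%s:%s\n" "$USER" "$(openssl passwd -apr1 "$PASS")" >> /var/auth/neo4j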

# use the group nginx runs as, on my system it's "nginx"
chown root:nginx /var/auth/neo4j
chmod 640 /var/auth/neo4j
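
For context on what nginx checks this file against: with HTTP basic authentication the client sends the user name and password in an Authorization header, only base64-encoded. A small sketch of that encoding follows; the function names and the credentials "neo4j" and "secret" are placeholders chosen for illustration.

```javascript
// Builds the header value a client sends for HTTP basic authentication.
function basicAuthHeader(user, pass) {
  return "Basic " + Buffer.from(user + ":" + pass, "utf8").toString("base64");
}

// Reverses the encoding: base64 is an encoding, not encryption.
function decodeBasicAuth(header) {
  return Buffer.from(header.slice("Basic ".length), "base64").toString("utf8");
}

// Placeholder credentials, not values from this setup.
const header = basicAuthHeader("neo4j", "secret");
console.log(header);
console.log(decodeBasicAuth(header));
```

Since anyone who sees the header can decode it, basic auth is only reasonable in combination with the SSL layer.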

Create an NGINX vhost

For the nginx vhost edit your /etc/nginx/nginx.conf:

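server {
  # "listen ... ssl" already enables SSL; the deprecated "ssl on;" is not needed
  listen 443 ssl;

  ssl_certificate         /var/ssl/neo4j/server.crt;
  ssl_certificate_key     /var/ssl/neo4j/server.key;

  location / {
    auth_basic "Restricted";
    auth_basic_user_file  /var/auth/neo4j;
    proxy_set_header    X-Real-IP         $remote_addr;
    proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header    X_FORWARDED_PROTO https;
    proxy_set_header    Host              $http_host;
    proxy_buffering     off;
    proxy_redirect      off;
    # Assumption: the original listing is cut off here. The proxy target
    # below is the Neo4j HTTPS endpoint from the previous step.
    proxy_pass          https://localhost:7473;
  }
}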

Drink some coffee

Finally restart your nginx instance and… don’t forget the coffee.
