Saturday, December 7, 2019

graph-commit project

graph-commit project: Powershell "interpreter" for Cypher


This is an evolution of my second draft: "Using Powershell to execute cypher scripts with secure credentials and logging results/errors".

What was the issue?

I was running several .cypher scripts as a scheduled task on Windows, using cypher-shell to execute them.  This worked, but my .cypher files had to contain plain-text credentials to authenticate to the various REST API sites (and other data sources) I was using to feed my Neo4j database.  So I created some PowerShell credential storage/retrieval functions.

What was the other issue?
As I made changes to my scripts, I would inevitably introduce syntax errors into my cypher and unknowingly break my import process.  Often it would break just a little.  Unless I manually ran each bit of code in the Neo4j Browser, I had no simple way to verify or validate the results (or lack thereof).

MY SOLUTION: A set of PowerShell scripts/cmdlets that execute cypher and log the results (plus some statistical metadata), while surfacing syntax errors (exceptions) from the cypher.  I use the term "interpreter" loosely.

HOW IT WORKS:

  • Provide a Cypher query language (cql) source script that you wish to execute.  Be sure to replace any sensitive credentials or session keys with a unique string or "placeholder text", for example: mysecretvaluegoeshere
  • Supply your Neo4j database destination & credentials by first using set-regcredentials.ps1

  • Then supply any additional (API, web credentials) also using set-regcredentials.ps1

    Any credentials stored with set-regcredentials.ps1 have their password saved in the registry as a secure string.  This is relatively secure, as it can only be retrieved by the SAME user logged on to the SAME computer.  In a future version I'd prefer the script retrieve credentials from a secure store like Vault (HashiCorp).
  • When executing the command line, you can supply -creds1 (through -creds4) to have the wrapper perform a find/replace based on the key/value pair you created with set-regcredentials.ps1.
  • Alternatively, you can supply data in real time on the command line using the -findrep switch (followed by a JSON string of find/replace pairs).  This is useful for manual testing, or if you are supplying a dynamic session key, which is common when authenticating to a web API that gives you a one-time-use key for authorization.  We use this method when graphing data using the Veeam Backup Enterprise Manager web API.
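As a rough illustration of the secure-string storage described above, here is a minimal sketch of a round trip through the registry. It is not the code from set-regcredentials.ps1: the registry path, the value name 'Password', and the sample secret are all made up for the example. It relies on the fact that ConvertFrom-SecureString, when called without a key on Windows, encrypts with DPAPI, which is what ties the stored value to one user on one computer.

```powershell
# Hypothetical registry location and value name, for illustration only;
# set-regcredentials.ps1 may store things elsewhere.
$regPath = 'HKCU:\Software\ExampleCredStore\mycredname'

# Stand-in secret; an interactive script would use Read-Host -AsSecureString.
$secure = ConvertTo-SecureString 'S3cretValue' -AsPlainText -Force

# Without -Key, ConvertFrom-SecureString encrypts with Windows DPAPI,
# so only the same user on the same computer can decrypt the result.
$encrypted = ConvertFrom-SecureString -SecureString $secure
New-Item -Path $regPath -Force | Out-Null
Set-ItemProperty -Path $regPath -Name 'Password' -Value $encrypted

# Later: read the encrypted text back and rebuild the SecureString.
$stored   = (Get-ItemProperty -Path $regPath -Name 'Password').Password
$restored = ConvertTo-SecureString -String $stored
$cred     = New-Object System.Management.Automation.PSCredential('exampleuser', $restored)
$plain    = $cred.GetNetworkCredential().Password
```

Copying the stored registry value to another machine, or reading it as a different user, should make the final ConvertTo-SecureString call fail, which is the behaviour you want from this kind of storage.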
Then execute your cypher by running get-cypher-results.ps1:
 POWERSHELL 
.\get-cypher-results.ps1 -Datasource 'N4jDataSource' -cypherscript 'C:\path-to-my-script\myscript.cypher' -logging 'N4jDataSource' -creds1 'mycredname'

Or let's say you wanted to embed some cypher execution within another powershell script.  You may do something like this:

 POWERSHELL 
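# Run this from a script file stored alongside the graph-commit scripts; $PSScriptRoot is only set inside a script
cd "$env:programfiles\blue net inc\graph-commit"
# Load the shared helper functions
. "$PSScriptRoot\bg-sharedfunctions.ps1" | Out-Null
$neo4jdatasource = "myn4jserver"
$scriptpath = Join-Path $PSScriptRoot "get-cypher-results.ps1"
$csp = "c:\the-path-to\my-source-script.cypher"
# JSON find/replace pairs: placeholder text in the .cypher file -> value to substitute
$frstring = '{"mysecretvaluegoeshere":"1234567890abcd"}'
# Dot-source the wrapper so it runs in the current scope
. $scriptpath -Datasource $neo4jdatasource -cypherscript $csp -logging $neo4jdatasource -findrep $frstring -verbosity 1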
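
To make the -findrep idea concrete, here is a minimal sketch of how a JSON string of find/replace pairs could be applied to cypher text before it is executed. The function name Invoke-PlaceholderReplace and the sample cypher are hypothetical, and the wrapper's own implementation may well differ.

```powershell
function Invoke-PlaceholderReplace {
    param(
        [Parameter(Mandatory)][string]$Text,
        [Parameter(Mandatory)][string]$FindRepJson
    )
    # Each JSON property name is the placeholder; its value is the replacement.
    $pairs = ConvertFrom-Json -InputObject $FindRepJson
    foreach ($pair in $pairs.PSObject.Properties) {
        # String.Replace is literal and case-sensitive (no regex surprises).
        $Text = $Text.Replace($pair.Name, [string]$pair.Value)
    }
    return $Text
}

# Sample cypher containing the placeholder text (made up for this example).
$cypher   = 'WITH "mysecretvaluegoeshere" AS apikey RETURN apikey;'
$frstring = '{"mysecretvaluegoeshere":"1234567890abcd"}'
$ready    = Invoke-PlaceholderReplace -Text $cypher -FindRepJson $frstring
```

A literal replace is used on purpose here: placeholders often contain characters that mean something in a regular expression, so -replace would need escaping.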


Results:


The get-cypher-results.ps1 script will split your script into individual transactions.  A semicolon followed by a linefeed marks the end of each transaction.
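
The splitting rule can be sketched in a few lines. This is only an illustration of "semicolon followed by a linefeed", not the wrapper's actual code; the function name Split-CypherTransactions is hypothetical, and the sketch would also split on a semicolon-plus-linefeed that happens to sit inside a string literal.

```powershell
function Split-CypherTransactions {
    param([Parameter(Mandatory)][string]$ScriptText)
    $lineNumber = 1
    # Split right after each ";" that is followed by a linefeed (LF or CRLF).
    foreach ($chunk in [regex]::Split($ScriptText, '(?<=;)\r?\n')) {
        $trimmed = $chunk.TrimStart()
        if ($trimmed.Trim()) {
            # Skip leading blank lines so LineNumber points at the first real line.
            $leading = $chunk.Substring(0, $chunk.Length - $trimmed.Length)
            $skipped = ([regex]::Matches($leading, '\n')).Count
            [pscustomobject]@{
                LineNumber = $lineNumber + $skipped
                Statement  = $trimmed.TrimEnd()
            }
        }
        # Advance past this chunk's lines plus the linefeed consumed by the split.
        $lineNumber += ([regex]::Matches($chunk, '\n')).Count + 1
    }
}

$sample = "MATCH (n)`nRETURN n;`n`nCREATE (a);`n"
$transactions = @(Split-CypherTransactions -ScriptText $sample)
```

Tracking the starting line of each chunk is what makes it possible to report where in the source file a failing transaction begins.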

You can also give "sections" of your code a label by using the keyword section at the beginning of comments in the cypher script:

// section Main import routine to create (:Asset) nodes

...
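
One way such section comments could be picked out of a script is sketched below. The function name Get-CypherSections and the exact pattern are assumptions for illustration; the wrapper may match the keyword differently.

```powershell
function Get-CypherSections {
    param([Parameter(Mandatory)][string]$ScriptText)
    $lineNumber = 0
    foreach ($line in ($ScriptText -split '\r?\n')) {
        $lineNumber++
        # A comment that starts with the keyword "section"; -match ignores case.
        if ($line -match '^\s*//\s*section\s+(.+)$') {
            [pscustomobject]@{
                LineNumber = $lineNumber
                Section    = $Matches[1].Trim()
            }
        }
    }
}

$sample = "// section Main import routine to create (:Asset) nodes`nMATCH (n) RETURN n;`n// an ordinary comment"
$sections = @(Get-CypherSections -ScriptText $sample)
```

Ordinary comments are left alone; only comments whose first word is "section" produce a label.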
Each transaction will be run and the metadata results will be (optionally) recorded in a log entry (per transaction).  The logging is done (of course) as a neo4j graph using the label (:Cypherlogentry).  The following counter items will be recorded as properties:

ConstraintsAdded
ConstraintsRemoved
IndexesAdded
IndexesRemoved
LabelsAdded
LabelsRemoved
NodesDeleted
Notifications
Plan
Profile
PropertiesSet
RelationshipsCreated
RelationshipsDeleted
ResultAvailableAfter (how long did the transaction take to run)
StatementType
Version (of the target Neo4j server)
date (epoch when the transaction ran)
linenumber (of where this transaction begins in the script)
script (full path and filename of the .cypher script)
section (named section of code)
server (fqdn or IP and port of the neo4j server)
source (name of the computer the powershell script was executed from)
error (any exception error thrown by the neo4j engine will be recorded here)
All transactions from a single .cypher script will be bookended by a "BEGIN SCRIPT" and "END SCRIPT" section marker, with the END SCRIPT logging a "ResultAvailableAfter" that is a sum of all the transactions within the script.
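
The END SCRIPT total is simple arithmetic over the per-transaction entries. The sketch below uses made-up entries, and it assumes the timings are plain numbers in milliseconds, which the real log may not match.

```powershell
# Made-up per-transaction results; timings are assumed to be milliseconds.
$entries = @(
    [pscustomobject]@{ section = 'Main import routine'; linenumber = 3;  ResultAvailableAfter = 120; error = $null }
    [pscustomobject]@{ section = 'Main import routine'; linenumber = 9;  ResultAvailableAfter = 45;  error = 'Invalid input' }
    [pscustomobject]@{ section = 'Cleanup';             linenumber = 17; ResultAvailableAfter = 30;  error = $null }
)

# Sum of all transaction timings, as the END SCRIPT marker would carry.
$total = ($entries | Measure-Object -Property ResultAvailableAfter -Sum).Sum

# Entries where the engine reported an exception.
$failed = @($entries | Where-Object { $_.error })

$endMarker = [pscustomobject]@{ section = 'END SCRIPT'; ResultAvailableAfter = $total }
```

Comparing that total from run to run is one way to spot a section of code that is getting slower over time.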

All entries for a particular script execution will be tied together with a relationship: -[:PART_OF_SCRIPT_EXECUTION]-

The wrapper will complete the execution and supply some example cypher queries to return the error logging for that execution.



This gave me a method to quickly run batches of .cypher code against a neo4j database, and determine if I generated any exceptions, and log metadata to track trends for code sections.


GRAPH-COMMIT Installation procedure: 



PRE-REQUISITES:
  • Windows PC or server
  • An existing Neo4j database installation.  If you need help, visit the Configuring Neo4j server post.
  • PowerShell v5.0 or newer
  • .NET Neo4j driver (the script will install it if it is missing)


Script Repositories:
  • Github repo: pdrangeid/graph-commit
    Please review github code before running it in your environment!  Be safe folks!

Allow scripts to run for the current PowerShell session:
 POWERSHELL 
set-executionpolicy remotesigned -scope Process

Run the following commands in PowerShell to download the scripts from the git repositories:
 POWERSHELL 
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$client = new-object System.Net.WebClient
$client.DownloadFile("https://raw.github.com/pdrangeid/graph-commit/master/update-modules.ps1","$Env:Temp\update-modules.ps1")
. "$Env:Temp\update-modules.ps1"

The 'update-modules.ps1' script can be used to update the graph-commit scripts automatically (or scripts from other git repositories).  I use it to automatically update my own operational scripts via Windows scheduled tasks.


Configure your Neo4j datasource by providing the server name and credentials.  This will also verify that you have the .NET Neo4j driver installed (it will prompt to install it for you if it is missing).


 POWERSHELL 
cd "$env:Programfiles\Blue Net Inc\Graph-Commit"
.\set-regcredentials.ps1 -credname myn4jserver -n4j
 

If you haven't already installed the Neo4j DotNet driver, click no, and then answer YES when prompted to install it for you.



You will be prompted within the powershell to confirm installation of the Neo4j driver.

If it is successfully installed you will next be prompted to verify the logical name for the Neo4j datasource to be stored:


Next provide the address for your Neo4j server.  Neo4j's binary protocol is bolt, and defaults to TCP port 7687.  You can customize the port and use either a hostname or an IP address.
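
If you want to sanity-check the address before entering it, something like the following sketch may help. Both function names (Get-BoltUri and Test-TcpPort) are hypothetical helpers written for this example and are not part of graph-commit; only the bolt scheme and the default port 7687 come from Neo4j itself.

```powershell
function Get-BoltUri {
    param(
        [Parameter(Mandatory)][string]$Server,
        [int]$Port = 7687   # Bolt's default TCP port
    )
    'bolt://{0}:{1}' -f $Server, $Port
}

function Test-TcpPort {
    param(
        [Parameter(Mandatory)][string]$Server,
        [int]$Port = 7687,
        [int]$TimeoutMs = 2000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Start the connection and wait up to $TimeoutMs for it to complete.
        $async = $client.BeginConnect($Server, $Port, $null, $null)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs)) { return $false }
        $client.EndConnect($async)
        return $true
    } catch {
        return $false
    } finally {
        $client.Close()
    }
}

$defaultUri = Get-BoltUri -Server 'myn4jserver'
$customUri  = Get-BoltUri -Server '10.0.0.5' -Port 7688
```

Test-TcpPort only confirms that something is listening on the port; it does not prove the listener speaks bolt or that your credentials are valid.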


If the Neo4j server could be verified, you will be asked to provide logon credentials.


If your credentials are correct you should see something like this:



You are now ready to run your .cypher scripts via powershell.

Review the get-cypher-results.ps1 script for more detail, but the command line works as such:

.\get-cypher-results.ps1 -Datasource 'yourdatasource' -cypherscript 'path-to-cypher-source-code' -creds1 'yourcredname' -findrep '{json string for find/replace pairs}'
There are examples of using the get-cypher-results powershell wrapper for Cypher in my other posts. 


