Tuesday, November 12, 2019

Import vCenter infrastructure into a knowledge graph using Neo4j

Yes, I could have directly queried the VMware web API, but dealing with self-signed certificates and discovering all the API queries would have been a LOT of work. RVTools conveniently already gathers ALL the data I'm looking for and exports it into a single Excel file, which makes this process quite a bit easier.

When complete, this process will create the following schema in your Neo4j database:


Known Issues
    • Only tested against vCenter clusters (not standalone vSphere host output)
    • The script only builds standard vSwitches and ports/portgroups.
      Distributed virtual switches and ports ARE present in the .xls data export, but the .cypher will need modifications to properly map DV objects.

Installation Steps (PowerShell)
  1. Log in using the account you intend to use (particularly if scheduling for automation)
  2. Verify you are running PowerShell v5.0 or newer:
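One way to run that check, as a minimal sketch: `$PSVersionTable` is PowerShell's built-in version table, while the variable names `$requiredMajor`, `$installedMajor`, and `$isSupported` are only illustrative.

```powershell
# Major version required by the step above (v5.0 or newer).
$requiredMajor = 5

# $PSVersionTable.PSVersion holds the version of the running PowerShell engine.
$installedMajor = $PSVersionTable.PSVersion.Major
$isSupported = $installedMajor -ge $requiredMajor

if ($isSupported) {
    Write-Output "PowerShell $($PSVersionTable.PSVersion) is new enough."
} else {
    Write-Warning "PowerShell $($PSVersionTable.PSVersion) is older than v$requiredMajor.0; upgrade before continuing."
}
```

If you only want to eyeball it, typing `$PSVersionTable.PSVersion` at the prompt prints the same information.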

  1. If you don't have the base graph-commit script modules, run the following commands in PowerShell to download them from the git repository.
    This will result in the scripts being downloaded into %programfiles%\blue net inc\Graph-Commit
set-executionpolicy unrestricted -force
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$client = new-object System.Net.WebClient
$client.DownloadFile("https://raw.github.com/pdrangeid/graph-commit/master/update-modules.ps1","$Env:Temp\update-modules.ps1")
& "$Env:Temp\update-modules.ps1"

  1. Now download the script files to run the VMware data collector from the GitHub repository via PowerShell.
    This will result in the scripts being downloaded into %programfiles%\blue net inc\Graph-Commit
cd "$env:programfiles\blue net inc\graph-commit"
.\update-modules.ps1 -gitrepo pdrangeid/vmware-graph -gitfile refresh-vmware.cypher
  1. If this is the first time using your Neo4j database with my scripts, you will need to identify your Neo4j server location and provide credentials.
    This cmdlet will also verify you have the .NET Neo4j driver installed (the set-regcredentials cmdlet can install it automatically for you using the NuGet package manager):

.\set-regcredentials.ps1 -credname myneo4jserver -n4j
The prerequisites (NuGet, Neo4j .NET driver) will be validated, and you will be prompted to install any that are missing. Once complete, it will validate connectivity to your Neo4j database instance. A successful result should look like this:
  1. First let's generate your output file from RVTools.
    The example below assumes we will use passthrough authentication for the vCenter server. Review the RVTools documentation for specifying credentials.
    The resulting Excel document will be placed in the import subfolder within the Neo4j installation path (adjust this for your environment)
[string]$RVToolsPathExe = ${env:ProgramFiles(x86)}+"\Robware\RVTools\RVTools.exe"
$Arguments = " -passthroughAuth -s fqdn.yourvcenterserver.com -c ExportAll2xlsx -d c:\neo4j-community-3.5.12\import  -f fqdn.yourvcenterserver.com.xlsx"
$Process = Start-Process -FilePath $RVToolsPathExe -ArgumentList $Arguments -NoNewWindow -Wait

  1. If all went well you should have your vCenter environment exported into the Excel document in c:\neo4j-community-3.5.12\import
    Now we want to run the import process to ingest the data into the graph.
    The $findstring variable is used to perform a find/replace of the placeholder (in the .cypher script you downloaded earlier) with the path/file of your Excel document.
    Replace 'myneo4jserver' with the name of the Neo4j datasource credential you used with set-regcredentials.ps1 earlier.
    cd "$env:programfiles\blue net inc\graph-commit"
    $scriptpath = -join ($env:ProgramFiles,"\blue net inc\graph-commit\get-cypher-results.ps1")
    $findstring='{"path-to-vmware-import-file":"file:///c:/neo4j-community-3.5.12/import/fqdn.yourvcenterserver.com.xlsx"}'
    $csp=$(-join ($env:programfiles,"\blue net inc\graph-commit\refresh-vmware.cypher"))
    $result = . $scriptPath -Datasource 'myneo4jserver' -cypherscript $csp -logging 'myneo4jserver' -findrep $findstring
  2.  A successful import will cycle through the transactions and give you log queries to validate:
  3. Use the Neo4j browser: http://your-neo4jserver:7474
    Login with your credentials
  4. Review the cypher logs (run the log queries that were output from the script execution above)
  5. Review the VMware data that was imported.  Here are some sample cypher queries that will present an explorable graph:

    // SHOW vcenter, datacenter, cluster, folders and resource groups:
    MATCH (vc:Vcenterserver)
    MATCH (vc)--(vdc:Vspheredatacenter)
    MATCH (vc)--(vcc:Vcentercluster)
    WITH *,'/'+vdc.name as startpath
    OPTIONAL MATCH (vf:Vfolder) where vf.path starts with startpath
    OPTIONAL MATCH (vrp:Vresourcepool) where vrp.path starts with startpath
    WITH *
    MATCH (vm:Virtualmachine) where (vm)--(vf) or (vm)--(vrp) or (vm)--(vcc) or (vm)--(vdc)
    return vc,vdc,vcc,vf,vrp,vm

  6. DNS and NTP query:
    // SHOW vSphereHosts DNS,NTP, and vCenter relationships
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(ds:Dnsserver)
    OPTIONAL MATCH (vh)--(ns:Ntpserver)
    OPTIONAL MATCH (vh)--(vc:Vcenterserver)
    return vh,ds,ns,vc
  7. Results: 
  8. vSphere Hosts and datastores:
    // SHOW vSpherehost datastores, types, and vcenter
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(ds:Vdatastore)
    OPTIONAL MATCH (ds)--(dst:Vdatastoretype)
    OPTIONAL MATCH (vh)--(vc:Vcenterserver)
    return vh,ds,dst,vc
  9. results:  
  10. vSwitch, Portgroups, and Loadbalancing policies:
    // SHOW vSwitch portgroups, and lbpolicies
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(vs:Vswitch)
    OPTIONAL MATCH (vs)--(vlbp:Vlbpolicy)
    OPTIONAL MATCH (vpg:Vportgroup)
    OPTIONAL MATCH (vhpg:Vhostportgroup)--(vpg)
    RETURN vh,vs,vpg,vhpg,vlbp
  11. results:  

Configuring Neo4j server:

Yes, there are plenty of tutorials for setting up Neo4j already, but I wanted to focus on a few settings that make it easier to use for data integration.

This tutorial will focus on Neo4j server for Windows. It is NOT very complicated, and you'll be up and running in no time flat:

What you need before you begin
System Requirements:

  • Windows PC or server
  • JAVA (OpenJDK/Oracle/IBM Java) 8 or greater
  • Neo4j Server or Desktop Edition. For these instructions I used the community edition, which you can download from https://neo4j.com/download-center/#community

  1. Extract the archive into the folder you want to be your installation folder
  2. Use an editor to modify /conf/neo4j.conf
    Comment out
    allows custom imports on-demand
    Allow APOC file imports
    customize memory
    Memory configuration will depend on how large your graphDB will become.  Here's a good primer:
    Allows non-local connections
  3. Configure Plugins:
    Download URL
    Don't ask, just install it.  No really!  You want this.
    Install binary .jar into /plugins folder
    Add this if you need to connect to MS SQL
    Extract mssql-jdbc-7.x.x.jre8.jar into /plugins folder
    Excel (multiple file formats)
    To support import from these formats download the dependencies
    Place these .jar files into the /plugins folder
    Advantage Database
    To connect to Advantage (sybase) SQL  via JDBC
    This is to support the CRM I use (CommitCRM)
  4. Configure Windows Service
    Neo4j should be configured to run as a Windows service. Launch a command shell, and install the service from within the /bin folder

    neo4j install-service

    If you are upgrading from an older version, you will need to first unregister the service for the old version:
    neo4j uninstall-service
  5. Set Initial Password
    neo4j-admin set-initial-password mysupersecretpassword

  6. Start the service
    sc start neo4j
    (or use the service control panel)

    That's it!  You should be ready to go with a neo4j server that's ready to connect to SQL Server, import from CSV/XLS files, and you will have the APOC library plugins at your disposal!

Monday, January 28, 2019

Using Powershell to execute cypher scripts with secure credentials and logging results/errors.

This is a continuation of my 1st draft: Using a Powershell wrapper to securely authenticate to Neo4J to execute CYPHER using Bolt.

PROBLEM #1: I was running several .cypher scripts as a scheduled task on Windows, using cypher-shell to execute them. This was fine; however, my .cypher files had to include plain-text credentials to authenticate to the various REST API sites I was using to feed my Neo4j database. So I wrote the credential PowerShell wrapper (previous post).

PROBLEM #2: As I made changes to my scripts, I would inevitably write some syntax errors into my cypher scripts and unknowingly break my import process. But often it would just break a little. Unless I manually ran each bit of code in the Neo4j Browser, I didn't have an easy way to verify the results (or lack thereof) of my cypher script modifications.

MY WORK-AROUND: A full cypher execution method that would also log the results (and some statistics meta-data), and show me syntax errors (exceptions) from the cypher.

First you supply your Neo4j database destination & credentials using set-n4jcredentials.ps1. Then supply any additional credentials (API, web) using set-customcredentials.ps1. These scripts store the credentials in the registry, using a secure-string for the sensitive data, and attach them to a logical datasource name. When requested on the command line, your .cypher will have a search/replace performed on its text, swapping your placeholder text for the "actual" credential information retrieved from the secure-string stored in the registry, before it is submitted to the Neo4j engine.
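To picture the search/replace step, here is a minimal sketch. It is not the code from get-cypher-results.ps1: the helper name `Expand-Placeholder`, the placeholder text, the sample Cypher, and the example.com URL are all hypothetical, and the value is hard-coded here where the real scripts would read it from the secure-string in the registry. The JSON shape is an assumption chosen for illustration.

```powershell
# Hypothetical helper: replace every placeholder named in a JSON map with its value.
function Expand-Placeholder {
    param(
        [Parameter(Mandatory)][string]$Text,
        [Parameter(Mandatory)][string]$FindReplaceJson
    )
    # Each JSON property name is a placeholder; its value is the replacement text.
    $map = $FindReplaceJson | ConvertFrom-Json
    foreach ($pair in $map.PSObject.Properties) {
        # String.Replace performs a literal (non-regex) substitution.
        $Text = $Text.Replace($pair.Name, [string]$pair.Value)
    }
    return $Text
}

# Sample Cypher containing a placeholder instead of a real API key (illustrative only).
$cypher = 'CALL apoc.load.json("https://api.example.com/items?key=my-api-key-placeholder") YIELD value RETURN count(value);'

# In practice the value would come from the stored credential, not a literal string.
$findrep = '{"my-api-key-placeholder":"abc123"}'

$expanded = Expand-Placeholder -Text $cypher -FindReplaceJson $findrep
Write-Output $expanded
```

Keeping the substitution literal rather than regex-based means characters such as `?` or `$` in a credential are not misread as pattern syntax.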

Then execute your cypher by running get-cypher-results.ps1:

.\get-cypher-results.ps1 -Datasource 'N4jDataSource' -cypherscript 'C:\path-to-my-script\myscript.cypher' -logging 'N4jDataSource'


get-cypher-results.ps1 will segment your script into transactions; each transaction ends at a semicolon followed by a linefeed.

You can also give "sections" of your code a label by using the keyword section at the beginning of comments in the cypher script:

// section Main import routine to create (:Asset) nodes
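The segmentation and section labelling described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic from get-cypher-results.ps1: the sample script text is made up, the sketch drops the terminating semicolon, and it assumes a section label carries forward to later transactions until a new `// section` comment appears.

```powershell
# Hypothetical sample script; each statement ends with a semicolon and a linefeed.
$cypherText = @'
// section Create constraints
CREATE CONSTRAINT ON (a:Asset) ASSERT a.id IS UNIQUE;
// section Main import routine to create (:Asset) nodes
MERGE (a:Asset {id: 1})
SET a.name = 'server01';
MATCH (a:Asset) RETURN count(a);
'@

# Split wherever a semicolon is followed by a linefeed (or the end of the text),
# then discard empty chunks.
$chunks = @($cypherText -split ';[ \t]*(?:\r?\n|$)' | Where-Object { $_.Trim() })

# Walk the chunks in order, remembering the most recent "// section" label.
$currentSection = ''
$transactions = @(foreach ($chunk in $chunks) {
    foreach ($line in ($chunk -split '\r?\n')) {
        if ($line -match '^\s*//\s*section\s+(.+)$') {
            $currentSection = $Matches[1].Trim()
        }
    }
    [pscustomobject]@{
        Section = $currentSection
        Cypher  = $chunk.Trim()
    }
})

$transactions | Format-List
```

Splitting on the semicolon-plus-linefeed pair, rather than on every semicolon, leaves semicolons inside string literals on the same line untouched.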

Each transaction will be run and the metadata results will be (optionally) recorded in a log entry (per transaction). The logging is done (of course) as a neo4j graph using the label (:Cypherlogentry). The following counter items will be recorded as properties:

(how long did the transaction take to run)
Version (of the target Neo4j server)
date (epoch when the transaction ran)
linenumber (of where this transaction begins in the script)
script (full path and filename of the .cypher script)
section (named section of code)
server: fqdn or IP and port of the neo4j server
source: name of the computer the powershell script was executed from
error: (any exception error thrown by the neo4j engine will be recorded here)

All transactions from a single .cypher script will be bookended by a "BEGIN SCRIPT" and "END SCRIPT" section marker, with the END SCRIPT logging a "ResultAvailableAfter" that is a sum of all the transactions within the script.
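As a rough illustration of that END SCRIPT total, the sketch below sums per-transaction timings. The timing values, section names, and the choice of milliseconds as the unit are made-up assumptions for the example, not output from a real run, and the object shape is not the wrapper's actual log format.

```powershell
# Hypothetical per-transaction log entries (sample values only; milliseconds assumed).
$transactionEntries = @(
    [pscustomobject]@{ Section = 'Create constraints';  ResultAvailableAfter = 12 }
    [pscustomobject]@{ Section = 'Main import routine'; ResultAvailableAfter = 340 }
    [pscustomobject]@{ Section = 'Main import routine'; ResultAvailableAfter = 25 }
)

# The END SCRIPT marker carries the sum of every transaction's timing.
$total = ($transactionEntries | Measure-Object -Property ResultAvailableAfter -Sum).Sum
$endEntry = [pscustomobject]@{
    Section              = 'END SCRIPT'
    ResultAvailableAfter = $total
}

Write-Output $endEntry
```

Tracking that single total per run is what makes it easy to spot a script that has gradually become slower.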

All entries for a particular script execution will be tied together with a relationship: -[:PART_OF_SCRIPT_EXECUTION]-

The wrapper will complete the execution and supply some example cypher queries to return the error logging for that execution.

This gave me a method to quickly run batches of .cypher code against a neo4j database, and determine if I generated any exceptions, and log metadata to track trends for code sections.

All the scripts referenced in this post are available at github.com/pdrangeid/n4j-pswrapper

Sunday, December 2, 2018

Using a Powershell wrapper to securely authenticate to Neo4J to execute CYPHER using Bolt.

I've been busily developing some of my automation leveraging Neo4j with our CommitCRM and Check_MK (Nagios) monitoring platform.

I wanted to automate my process for updating the graph database and generating tickets.  In order to do this securely I wanted to execute CYPHER scripts with windows task scheduler.

To make this easier I've built a powershell wrapper to run my scripts.  This consists of defining a neo4j datasource with the server location, user and password:

ServerURL bolt://server.fqdn.or.ip:7687
DSPW         01000000d08c9ddf0115d1118c7a00c04fc297e...
DSUser        neo4j

Download/install the Neo4j.Driver nuget package

Check out this post by Glenn Sarti for more information.

Configure your datasource:

First you will be prompted to locate the Neo4j.Driver.

Next supply the datasource name

Now the URL to your Neo4j database

Then the user/password

Now the script will attempt to connect using the provided information, and if successful will store the information in your registry under HKCU\Software\neo4j-wrapper\Datasource\your-datasource-name.

set-n4credentials.ps1 can be used to store multiple named datasources and will store each server URL, user, and password combination within a separate reg key

Retrieve the datasource

Now you can use the execute-cypher-query.ps1 to securely retrieve the credentials from the registry and run your CYPHER code within powershell.

If the server is accessible and the credentials are correct, it will run whatever cypher code you place in the $query variable. For example, I've included a simple query to count the number of nodes in the graph database and return the results.

and here's the output:

In Summary

This should provide a secure way to run CYPHER scripts natively from PowerShell using the Bolt protocol, allowing for authentication without putting clear-text passwords within your PowerShell scripts.
Here are the example scripts on GitHub.
In a future version I'd like to use parameters to provide external .cypher scripts to run, to truly use this as a wrapper. I'd also like to store datasources that I reference from WITHIN my cypher scripts (for instance apoc.load.json, where I am required to provide an API key or user credentials within the URL or as a header).

Friday, November 9, 2018

My journey building a connected enterprise with CommitCRM using a Neo4j Graph database


Last September I attended GraphConnect 2018 in New York. If you are not familiar with graph databases or Neo4j, here's an excellent primer.

The first session I attended immediately got my wheels spinning. The example use case was a customer service ticketing/support platform that brings together data normally stuck in the silos of many different systems (HR, CRM, ticketing, asset management, etc). Hey, I've got that problem!

We have been using CommitCRM for several years. It serves as our CRM and PSA tool for providing IT services and support to our clients.

But the challenge has long been: how can we bring together holistic views and queries across CommitCRM and other systems that have no native integration with CommitCRM?

Here are a few systems that are very relevant, but don't currently integrate into CommitCRM:
  • Active Directory (both ours, and our clients') users, contacts, groups, computer objects
  • Office365 / Azure (CSP subscriptions, mailboxes, and users)
  • VMware (VMs and hosts)
  • Check_MK (a Nagios monitoring platform we use)
  • Avast Managed Workplace RMM tool (formerly AVG Level Platforms) 
  • DNS
Enter Neo4j Graph database.  This data platform solves several problems for us.  It allows us to ingest data from several platforms.  It's wicked fast, and allows us to ask complex relationship questions about our data to help decision support.  This essentially becomes a hyper-flexible reporting database that lets us ask questions about our clients' environments.

Another benefit of this platform is that you instantly have a visualization tool you can use to explore your environment.

I will be posting more about my experience integrating CommitCRM and our other systems using Neo4j to create a connected enterprise.