Wednesday, November 20, 2019

Create Veeam Graph

via the Veeam Backup Enterprise Manager webAPI


Problem:

How can I identify VMs that were never properly configured for backups, or that somehow aren't being backed up at the intended frequency?

Solution:

Create a knowledge graph with data from my Veeam backup servers in order to verify that backups were configured and running for intended VMs. 

For example, a query could compare this data against the records in your IT asset management system that define which machines are supposed to be protected.

You may find it helpful to also have your VMware data within your graph.  Here's how to do that.

The schema for this graph is fairly simple:


But enough of all that, on to the instructions!

Prerequisites:

Known Issues:
    • The scripts currently don't clean up after themselves.  I'm still working out exactly how often to purge old job data.  You could also just INIT the whole graph periodically.
    • This graph isn't intended to import ALL information about backups.  It's focused on capturing the latest successful backup for each configured VM.

Installation: Steps (powershell)
Log in using the account you intend to run the scripts under (particularly if you plan to schedule them for automation).
Now download the scripts that run the Veeam data ingester from the GitHub repositories via PowerShell.
The scripts will be downloaded into %programfiles%\blue net inc\Graph-Commit\

 POWERSHELL 
cd "$env:programfiles\blue net inc\graph-commit"
.\update-modules.ps1 -gitrepo pdrangeid/veeam-maint -gitfile purge-veeam.cypher
.\update-modules.ps1 -gitrepo pdrangeid/veeam-maint -gitfile init-veeam-wrapper.ps1
.\update-modules.ps1 -gitrepo pdrangeid/veeam-maint -gitfile init-veeam.cypher
.\update-modules.ps1 -gitrepo pdrangeid/veeam-maint -gitfile refresh-veeam.cypher
.\update-modules.ps1 -gitrepo pdrangeid/veeam-maint -gitfile refresh-veeam-last-backup.cypher

If this is the first time you are using your Neo4j database with my scripts, you will need to identify your Neo4j server location and provide credentials.

The set-regcredentials cmdlet will also verify that you have the .NET Neo4j driver installed (it can install the driver automatically for you using the NuGet package manager):

 POWERSHELL 
.\set-regcredentials.ps1 -credname myn4jserver -n4j

The prerequisites (NuGet, Neo4j .NET driver) are validated, and you are prompted to install anything that is missing.  Once complete, the cmdlet validates connectivity to your Neo4j database instance.  A successful result should look like this:


Now let's set our Veeam credentials and store them in the registry. The command below prompts you for your Veeam username and password, which will be stored in

HKEY_CURRENT_USER\Software\neo4j-wrapper\Credentials\yourveeamservername

The password is stored as a securestring value, which can only be decrypted on this computer while logged in as the user you are currently authenticated as.

 POWERSHELL 
.\set-regcredentials.ps1 -credname yourveeamservername -credpath "neo4j-wrapper\Credentials"

If successful you will see a message. You can also verify the result in the registry.
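
To illustrate why the stored password is tied to one user on one computer, here is a minimal sketch of the standard PowerShell securestring round trip. It is not taken from set-regcredentials.ps1; the sample password and the variable names are placeholders. On Windows, ConvertFrom-SecureString without a -Key uses the Data Protection API (DPAPI), which binds the result to the current user profile on the current machine.

```powershell
# Hypothetical sample secret; interactively you would use Read-Host -AsSecureString.
$secure = ConvertTo-SecureString 'example-password' -AsPlainText -Force

# Serialize to an encrypted string, suitable for storing in a registry value.
# With no -Key/-SecureKey, Windows protects it with DPAPI for the current user.
$stored = ConvertFrom-SecureString -SecureString $secure

# Later, as the SAME user on the SAME computer, rebuild the SecureString.
$restored = ConvertTo-SecureString -String $stored

# Recover the plain text only at the moment it is actually needed.
$plain = [System.Net.NetworkCredential]::new('', $restored).Password
```

This is also why a scheduled task must run under the same account that stored the credentials: another account (or another machine) cannot decrypt the value.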

Let's test the script.  The -sessionkey switch indicates that we don't want to run the full script, only authenticate to the Veeam API and return a session key to use.  Substitute the credential names you stored with set-regcredentials.ps1 for myveeam and myn4jserver.
 POWERSHELL 
.\init-veeam-wrapper.ps1 -baseapiurl http://yourveeamserver:9399/api -veeamcred myveeam -neo4jdatasource myn4jserver -sessionkey
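
For readers curious about what retrieving a session key involves, the sketch below shows one way to request a session from the Enterprise Manager REST API directly. This is an illustration, not the code inside init-veeam-wrapper.ps1: the function names New-BasicAuthHeader and Get-VeeamSessionId are hypothetical, and it assumes the API accepts HTTP Basic authentication on the sessionMngr resource and returns the session ID in the X-RestSvcSessionId response header.

```powershell
# Build an HTTP Basic Authorization header value from a user name and password.
function New-BasicAuthHeader {
    param(
        [Parameter(Mandatory)] [string] $UserName,
        [Parameter(Mandatory)] [string] $Password
    )
    $bytes = [System.Text.Encoding]::UTF8.GetBytes("${UserName}:${Password}")
    return 'Basic ' + [System.Convert]::ToBase64String($bytes)
}

# Request a session from Enterprise Manager and return the session ID header.
function Get-VeeamSessionId {
    param(
        # Base URL, e.g. http://yourveeamserver:9399/api
        [Parameter(Mandatory)] [string] $BaseApiUrl,
        [Parameter(Mandatory)] [pscredential] $Credential
    )
    $auth = New-BasicAuthHeader -UserName $Credential.UserName `
        -Password $Credential.GetNetworkCredential().Password
    $response = Invoke-WebRequest -Uri "$BaseApiUrl/sessionMngr/?v=latest" `
        -Method Post -Headers @{ Authorization = $auth } -UseBasicParsing
    return $response.Headers['X-RestSvcSessionId']
}
```

A call such as Get-VeeamSessionId -BaseApiUrl 'http://yourveeamserver:9399/api' -Credential (Get-Credential) should return a session ID if the credentials are accepted; later requests would pass that value back in the same header.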
 

If a proper session ID was returned, the wrapper script was able to authenticate and retrieve a session key.  Run the command again, omitting the -sessionkey switch and adding the -init switch, to run the script for real this time.

 POWERSHELL 
.\init-veeam-wrapper.ps1 -baseapiurl http://yourveeamserver:9399/api -veeamcred myveeam -neo4jdatasource myn4jserver -init
What does -init do?
The -init switch runs the initial ingestion of Veeam backups.  It is also the step that takes the longest.

a) Creates (:Veeamserver) nodes
b) Creates (:Veeamjob) nodes and relates them to their (:Veeamserver)
c) Creates (:Veeamprotectedvm) nodes (these are all the VMs that Veeam is aware of)
d) Finally, locates restore points to discover the MOST RECENT restore point for each (:Veeamprotectedvm)
Discovery is performed from the most recent restore point back through 32 days old.  Once a valid restore point is found for a VM, the search stops for that VM (remember, we're just trying to validate the most recent valid restore point for each protected asset).
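
The "newest first, stop at the first valid hit" search described in step d can be sketched as follows. This is only an illustration of the idea: it works on in-memory objects rather than the Veeam API, and the function name Find-LatestValidRestorePoint and the CreationTime and IsValid properties are hypothetical stand-ins for whatever the real script reads.

```powershell
function Find-LatestValidRestorePoint {
    param(
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $RestorePoints,
        [datetime] $Now = (Get-Date),
        [int] $MaxAgeDays = 32
    )
    # Walk backwards one day at a time, newest window first.
    for ($day = 0; $day -lt $MaxAgeDays; $day++) {
        $windowEnd   = $Now.AddDays(-$day)
        $windowStart = $Now.AddDays(-($day + 1))
        $hit = $RestorePoints |
            Where-Object {
                $_.IsValid -and
                $_.CreationTime -gt $windowStart -and
                $_.CreationTime -le $windowEnd
            } |
            Sort-Object CreationTime -Descending |
            Select-Object -First 1
        # Stop as soon as a valid point is found; older ones are not examined.
        if ($hit) { return $hit }
    }
    # Nothing valid within the search window.
    return $null
}
```

Stopping early is what keeps the search cheap for healthy VMs: a machine backed up last night is resolved in the first window, while only problem VMs cost the full 32 iterations.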


If you have multiple Veeam backup servers, be sure to run the -init process for each additional Veeam API endpoint.
Now we want to put the Veeam backups into "buckets" identifying how recently each VM was backed up:

 POWERSHELL 
$scriptpath = "$env:programfiles\blue net inc\graph-commit\get-cypher-results.ps1"
$csp="$env:programfiles\blue net inc\graph-commit\refresh-veeam-last-backup.cypher"
. $scriptPath -Datasource 'myn4jserver' -cypherscript $csp -logging 'myn4jserver'

Finally, you can now run the lighter-weight "refresh" script periodically (I run it hourly).
You only need to re-run the "init" script if you want to purge the data and start over.

 POWERSHELL 
$scriptpath = "$env:programfiles\blue net inc\graph-commit\get-cypher-results.ps1"
$csp="$env:programfiles\blue net inc\graph-commit\refresh-veeam-last-backup.cypher"
. $scriptPath -Datasource 'myn4jserver' -cypherscript $csp -logging 'myn4jserver'
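
If you want Windows to run the refresh hourly for you, one option is a scheduled task. The sketch below assumes you have saved the three refresh lines above into a wrapper script; the file name run-veeam-refresh.ps1 and the task name are hypothetical. Because the stored credentials can only be decrypted by the user who saved them, register the task while logged in as that same account.

```powershell
# Hypothetical wrapper script: save the three refresh lines above into this file first.
$wrapper = "$env:programfiles\blue net inc\graph-commit\run-veeam-refresh.ps1"

# Run the wrapper with Windows PowerShell; the path is quoted because it contains spaces.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument ('-NoProfile -File "{0}"' -f $wrapper)

# Start five minutes from now, then repeat every hour.
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date).AddMinutes(5) `
    -RepetitionInterval (New-TimeSpan -Hours 1)

# Register the task for the current user (the task name is an arbitrary example).
Register-ScheduledTask -TaskName 'Veeam graph refresh' -Action $action -Trigger $trigger
```

Registered this way, the task runs only while that user is logged on. Running it unattended requires supplying -User and -Password to Register-ScheduledTask, and some older Windows versions may also need an explicit -RepetitionDuration on the trigger.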

Review the Veeam data that was imported.  Here are some sample cypher queries that will present an explorable graph:
 CYPHER 
// SHOW veeam backups
MATCH (lgb:Lastgoodbackup)
MATCH (lgb)--(vvm:Veeamprotectedvm)
return lgb,vvm


Show specific Job information:

 CYPHER 
// Show jobs, backups, VMs, and lastgoodbackup for any jobs with 'exchange' in the job name
MATCH (vs:Veeamserver)--(vj:Veeamjob)--(vb:Veeambackup) where toLower(vj.name) contains 'exchange'
OPTIONAL MATCH (vb)--(vvm:Veeamprotectedvm)--(lgb:Lastgoodbackup)
return vs,vj,vb,vvm,lgb


Tuesday, November 12, 2019

Create vCenter Graph


Import vCenter infrastructure into a knowledge graph using Neo4j


Yes, I could have queried the VMware web API directly, but dealing with self-signed certificates and discovering all the API queries would have been a LOT of work.  RVTools conveniently already gathers ALL the data I'm looking for and exports it into a single Excel file, which makes this process quite a bit easier.

When complete this process will create the following database schema in your neo4j database:




Prerequisites:

Known Issues
    • Only tested against vCenter clusters (not standalone vSphere host output)
    • The script only builds standard vSwitches and ports/portgroups.
      Distributed virtual switches and ports ARE present in the .xls data export, but the .cypher script will need modifications to properly map DV objects.


Installation: Steps (powershell)

Log in using the account you intend to run the scripts under (particularly if you plan to schedule them for automation).
Now download the script files that run the VMware data import from the GitHub repositories:
 POWERSHELL 

cd "$env:programfiles\blue net inc\graph-commit"
.\update-modules.ps1 -gitrepo pdrangeid/vmware-graph -gitfile refresh-vmware.cypher

If this is the first time you are using your Neo4j database with my scripts, you will need to identify your Neo4j server location and provide credentials. The set-regcredentials cmdlet will also verify that you have the .NET Neo4j driver installed (it can install the driver automatically for you using the NuGet package manager).
 POWERSHELL 
.\set-regcredentials.ps1 -credname myneo4jserver -n4j


    The prerequisites (NuGet, Neo4j .NET driver) are validated, and you are prompted to install anything that is missing.  Once complete, the cmdlet validates connectivity to your Neo4j database instance.  A successful result should look like this:


    First let's generate your output file from RVTools.
    The example below assumes we will use pass-through authentication for the vCenter server.  Review the RVTools documentation for specifying credentials.
    The resulting Excel document will be placed in the import subfolder within the Neo4j installation path (adjust this for your environment).

     POWERSHELL 
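    # Path to the RVTools executable (installed under the 32-bit Program Files folder)
    [string] $RVToolsPathExe = ${env:ProgramFiles(x86)} + "\Robware\RVTools\RVTools.exe"
    # Export everything to a single .xlsx in the Neo4j import folder, using pass-through authentication
    $Arguments = "-passthroughAuth -s fqdn.yourvcenterserver.com -c ExportAll2xlsx -d c:\neo4j-community-3.5.12\import -f fqdn.yourvcenterserver.com.xlsx"
    # -PassThru returns the process object so $Process.ExitCode can be inspected afterwards
    $Process = Start-Process -FilePath $RVToolsPathExe -ArgumentList $Arguments -NoNewWindow -Wait -PassThru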
    

    If all went well you should have your vCenter environment exported into the Excel document in c:\neo4j-community-3.5.12\import

    Now we want to run the import process to ingest the data into the graph.

    The $findstring variable supplies a find/replace pair: the placeholder in the .cypher script you downloaded earlier is replaced with the path to your Excel document.

    Replace 'myneo4jserver' with the name of the Neo4j datasource credential you used with set-regcredentials.ps1 earlier.


     POWERSHELL 
    cd "$env:programfiles\blue net inc\graph-commit"
    $scriptpath = -join ($env:ProgramFiles,"\blue net inc\graph-commit\get-cypher-results.ps1")
    $findstring='{"path-to-vmware-import-file":"file:///c:/neo4j-community-3.5.12/import/fqdn.yourvcenterserver.com.xlsx"}'
    $csp=$(-join ($env:programfiles,"\blue net inc\graph-commit\refresh-vmware.cypher"))
    $result = . $scriptPath -Datasource 'myneo4jserver' -cypherscript $csp -logging 'myneo4jserver' -findrep $findstring
    

    A successful import will cycle through the transactions and give you log queries to validate:

    Use the Neo4j browser: http://your-neo4jserver:7474
    Log in with your credentials
    Review the Cypher logs (run the log queries that were output by the script execution above)
    Review the VMware data that was imported. Here are some sample Cypher queries that will present an explorable graph:


     CYPHER 
    // SHOW vcenter, datacenter, cluster, folders and resource groups:
    MATCH (vc:Vcenterserver)
    MATCH (vc)--(vdc:Vspheredatacenter)
    MATCH (vc)--(vcc:Vcentercluster)
    WITH *,'/'+vdc.name as startpath
    OPTIONAL MATCH (vf:Vfolder) where vf.path starts with startpath
    OPTIONAL MATCH (vrp:Vresourcepool) where vrp.path starts with startpath
    WITH *
    MATCH (vm:Virtualmachine) where (vm)--(vf) or (vm)--(vrp) or (vm)--(vcc) or (vm)--(vdc)
    return vc,vdc,vcc,vf,vrp,vm
    




    DNS and NTP query:
     CYPHER 
    // SHOW vSphereHosts DNS, NTP, and vCenter relationships
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(ds:Dnsserver)
    OPTIONAL MATCH (vh)--(ns:Ntpserver)
    OPTIONAL MATCH (vh)--(vc:Vcenterserver)
    return vh,ds,ns,vc
    


    vSphere Hosts and datastores:

     CYPHER 
    // SHOW vSpherehost datastores, types, and vcenter
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(ds:Vdatastore)
    OPTIONAL MATCH (ds)--(dst:Vdatastoretype)
    OPTIONAL MATCH (vh)--(vc:Vcenterserver)
    return vh,ds,dst,vc
    


    vSwitch, Portgroups, and Loadbalancing policies:

     CYPHER 
    // SHOW vSwitch portgroups, and lbpolicies
    MATCH (vh:Vspherehost)
    OPTIONAL MATCH (vh)--(vs:Vswitch)
    OPTIONAL MATCH (vs)--(vlbp:Vlbpolicy)
    OPTIONAL MATCH (vpg:Vportgroup)
    OPTIONAL MATCH (vhpg:Vhostportgroup)--(vpg)
    RETURN vh,vs,vpg,vhpg,vlbp
    

    Configuring Neo4j server




    Yes, there are plenty of tutorials for setting up Neo4j already, but I wanted to focus on a few settings that make it easier to use for data integration.

    This tutorial focuses on Neo4j server for Windows.  It is NOT very complicated, and you'll be up and running in no time flat:

    What you need before you begin
    System Requirements:

    • Windows PC or server
    • Java (OpenJDK/Oracle/IBM Java) 8 or greater
    • Neo4j Server or Desktop Edition. For these instructions I used the community edition, which you can download from https://neo4j.com/download-center/#community

    1. Extract the archive into the folder you want to be your installation folder
    2. Use an editor to modify /conf/neo4j.conf
      Setting: #dbms.directories.import=import
      Action: Comment out
      Description: Allows custom imports on demand

      Setting: apoc.import.file.enabled=true
      Action: Add
      Description: Allows APOC file imports

      Setting: #dbms.memory.heap.initial_size=5g
               #dbms.memory.heap.max_size=5g
               #dbms.memory.pagecache.size=7g
      Action: Customize memory
      Description: Memory configuration will depend on how large your graph database will become.  Here's a good primer:
                   https://neo4j.com/docs/operations-manual/current/tools/neo4j-admin-memrec/

      Setting: dbms.connectors.default_listen_address=0.0.0.0
      Action: Uncomment
      Description: Allows non-local connections
    3. Configure Plugins:
      Plugin: APOC
      Description: Don't ask, just install it.  No really!  You want this.
      Notes: Install binary .jar into /plugins folder

      Plugin: MSSQL JDBC
      Description: Add this if you need to connect to MS SQL
      Notes: Extract mssql-jdbc-7.x.x.jre8.jar into /plugins folder

      Plugin: Excel (multiple file formats)
      Description: To support import from these formats download the dependencies
      Download URL:
        https://repo1.maven.org/maven2/org/apache/poi/poi/4.1.2/poi-4.1.2.jar
        https://repo1.maven.org/maven2/org/apache/poi/poi-ooxml/4.1.2/poi-ooxml-4.1.2.jar
        https://repo1.maven.org/maven2/org/apache/poi/poi-ooxml-schemas/4.1.2/poi-ooxml-schemas-4.1.2.jar
        https://repo1.maven.org/maven2/org/apache/xmlbeans/xmlbeans/3.1.0/xmlbeans-3.1.0.jar
        https://repo1.maven.org/maven2/com/github/virtuald/curvesapi/1.06/curvesapi-1.06.jar
      Notes: Place these .jar files into the /plugins folder

      Plugin: Advantage Database
      Description: To connect to Advantage (Sybase) SQL via JDBC
      Notes: This is to support the CRM I use (CommitCRM)
    4. Configure Windows Service
      Neo4j should be configured to run as a Windows service. Launch a command shell, and install the service from within the /bin folder

      neo4j install-service

      If you are upgrading from an older version, you will need to first unregister the service for the old version:
      neo4j uninstall-service
        
    5. Set Initial Password
      neo4j-admin set-initial-password mysupersecretpassword

    6. Start the service
      sc start neo4j
      (or use the service control panel)

      That's it!  You should now have a Neo4j server that's ready to connect to SQL Server and import from CSV/XLS files, with the APOC library plugins at your disposal!
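
As a quick sanity check after step 6, something like the following can confirm that the service is running and the browser port is reachable. It is only a sketch: it assumes the service was registered under the name neo4j (as used with sc start above) and that the Neo4j browser listens on its default port 7474 on the local machine.

```powershell
# Show the Windows service status; 'Running' is the expected state after step 6.
Get-Service -Name neo4j

# Check that the Neo4j browser port answers locally (TcpTestSucceeded should be True).
Test-NetConnection -ComputerName localhost -Port 7474
```

If the port test fails while the service shows as running, give the database a minute to finish starting and check the log files in the Neo4j installation folder.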