microsoft/msticpy v2.6.0 (Parallel Queries, Velociraptor data) on GitHub

The three big changes in this release are:

Executing MS Sentinel and Kusto queries in parallel across multiple instances
Threaded (parallel) execution of time-split queries
Addition of data provider to query local (exported) Velociraptor logs

Many thanks to @d3vzer0 for inspiration and early work on the threaded query feature.
Many thanks to @juju4 for inspiration and work on the Velociraptor support.

Support for running a query across multiple connections (with optional threaded operation)

It is common for data services to be spread across multiple tenants or workloads: for example, multiple Sentinel workspaces, Microsoft Defender subscriptions or Splunk instances. You can use the MSTICPy QueryProvider to run a query across multiple connections and return the results in a single DataFrame.

To create a multi-instance provider:

Create an instance of a QueryProvider for your data source and execute the connect() method to connect to the first instance of your data service.
Then use the add_connection() method to add further instance connections. It takes the same parameters as the connect() method (these parameters vary by data provider).

add_connection() also supports an alias parameter to allow you to refer to the connection by a friendly name.

    from msticpy.data import QueryProvider

    qry_prov = QueryProvider("MSSentinel")
    qry_prov.connect(workspace="Workspace1")
    # Add a second workspace; alias gives it a friendly name
    qry_prov.add_connection(workspace="Workspace2", alias="Workspace2")
    qry_prov.list_connections()

When you now run a query with this provider, it runs on all of the connections and the results are returned as a single DataFrame.

    test_query = '''
        SecurityAlert
        | take 5
        '''

    query_test = qry_prov.exec_query(query=test_query)
    query_test.head()

Some of the MSTICPy drivers support asynchronous execution of queries against multiple instances, so the query takes much less time than running against each instance sequentially. Drivers that support asynchronous queries use this automatically. The initial set of multi-threaded drivers is:

MSSentinel_New (the new version of the MSSentinel driver)
Kusto_New (the new version of the Kusto/Azure Data Explorer driver)

By default, the queries use at most 4 concurrent threads. You can override this by initializing the QueryProvider with the max_threads parameter set to the number of threads you want. Be cautious about using too many simultaneous connections, however, because of the potential impact on cluster performance.

    qry_prov = QueryProvider("MSSentinel", max_threads=10)
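
The effect of a thread cap can be illustrated with a small standard-library sketch. It is not taken from MSTICPy's implementation: run_on_all and fake_query are hypothetical names, and the "query" is a stand-in that sleeps briefly and records how many calls overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_on_all(connections, run_query, max_threads=4):
    """Run run_query(conn) for every connection with at most max_threads workers."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the same order as `connections`
        per_connection = list(pool.map(run_query, connections))
    # Flatten the per-connection row lists into one combined result
    return [row for rows in per_connection for row in rows]


# Hypothetical stand-in for a real query: tracks how many calls run at once.
_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_query(workspace):
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)  # simulate network latency
    with _lock:
        _active -= 1
    return [{"workspace": workspace, "alert": n} for n in range(2)]


workspaces = [f"Workspace{i}" for i in range(1, 9)]
combined = run_on_all(workspaces, fake_query, max_threads=4)
```

With eight connections and max_threads=4, no more than four of the stand-in queries are in flight at any moment, and the rows from every connection end up in one combined list.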

Multi-threaded support for split/sharded queries

MSTICPy has supported splitting large queries by time-slice for a while. This is documented in "Splitting a Query into time chunks". With this release, we've added asynchronous support for this (if the driver supports threaded/async operation) so that multiple chunks of the query run in parallel.

    qry_prov.SecurityAlert.list_alerts(start=start, end=end, split_by="1d")

Use the parameter split_query_by or split_by to specify the time range of each chunk. The time unit uses the same syntax as pandas time intervals (e.g. "1D", "4h"); see the pandas documentation for more details.
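
To make the idea of time chunks concrete, here is a minimal sketch of splitting a start/end range into consecutive intervals. It uses only the standard library and a timedelta in place of a pandas-style string; split_time_range is a hypothetical helper, and the exact chunk boundaries MSTICPy generates may differ.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, chunk):
    """Return consecutive (chunk_start, chunk_end) pairs covering start..end."""
    if chunk <= timedelta(0):
        raise ValueError("chunk must be a positive timedelta")
    ranges = []
    chunk_start = start
    while chunk_start < end:
        # The final chunk is shortened so it never runs past `end`
        chunk_end = min(chunk_start + chunk, end)
        ranges.append((chunk_start, chunk_end))
        chunk_start = chunk_end
    return ranges


start = datetime(2023, 7, 1)
end = datetime(2023, 7, 3, 12)
# timedelta(days=1) plays the role of split_by="1D" in this sketch
chunks = split_time_range(start, end, timedelta(days=1))
```

Each pair in chunks can then be passed to a separate copy of the query, which is what makes the chunks suitable for parallel execution.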

In this release, sharding is also supported for ad hoc queries, as long as you add "start" and "end" parameters to the query (this is still experimental, so let us know if you have issues with it).

Velociraptor Local Data Provider

The Velociraptor data provider can read Velociraptor log files and provide convenient query functions for each data set in the output logs.

The provider can read files from one or more hosts, stored in separate folders. The files are read, converted to pandas DataFrames and grouped by table/event. Multiple log files of the same type (when reading in data from multiple hosts) are concatenated into a single DataFrame.
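
The grouping and concatenation described above can be sketched with the standard library alone. This is an illustration, not the provider's code: it assumes each log is a JSON-lines file with a .json extension whose file name identifies the table, uses plain lists of dicts instead of DataFrames, and load_tables is a hypothetical helper.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def load_tables(data_paths):
    """Group JSON-lines log files by file stem, concatenating rows across folders."""
    tables = defaultdict(list)
    for folder in data_paths:
        for log_file in sorted(Path(folder).expanduser().glob("*.json")):
            with log_file.open(encoding="utf-8") as handle:
                rows = [json.loads(line) for line in handle if line.strip()]
            # Files with the same name from different hosts share one table
            tables[log_file.stem].extend(rows)
    return dict(tables)


# Build two hypothetical host folders that each hold the same log type.
with tempfile.TemporaryDirectory() as root:
    for host, pid in (("host1", 804), ("host2", 848)):
        host_dir = Path(root) / host
        host_dir.mkdir()
        record = {"Name": "svchost.exe", "Pid": pid}
        (host_dir / "Windows_Forensics_ProcessInfo.json").write_text(
            json.dumps(record) + "\n", encoding="utf-8"
        )
    tables = load_tables([Path(root) / "host1", Path(root) / "host2"])
```

Here the two host folders produce a single Windows_Forensics_ProcessInfo table containing the rows from both hosts.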

To use the Velociraptor provider, create a QueryProvider instance, passing the string "Velociraptor" (or "VelociraptorLogs") as the data_environment parameter. You also need to add the data_paths parameter to specify the folders that you want to search for log files (you can instead set these paths in msticpyconfig.yaml if you do this frequently).

You can specify multiple folders to load the logs from different hosts.

    import msticpy as mp
    qry_prov = mp.QueryProvider("VelociraptorLogs", data_paths=["~/my_logs"])

Calling the connect method triggers the provider to read the locations of the
log files (although the contents are not read until a query function is run).

    qry_prov.connect()
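
The split between locating files at connect time and reading them at query time can be sketched as follows. LazyLogIndex is a hypothetical class written for this illustration; it assumes JSON-lines files named after their table and does not reflect the provider's actual internals.

```python
import json
import tempfile
from pathlib import Path


class LazyLogIndex:
    """Index log file locations on connect(); read contents only in query()."""

    def __init__(self, data_paths):
        self.data_paths = [Path(p) for p in data_paths]
        self.files = {}  # table name -> list of file paths
        self.reads = 0   # number of files actually opened

    def connect(self):
        # Only record where the files are; nothing is opened here
        for folder in self.data_paths:
            for log_file in sorted(folder.glob("*.json")):
                self.files.setdefault(log_file.stem, []).append(log_file)

    def query(self, table):
        rows = []
        for log_file in self.files[table]:
            self.reads += 1
            with log_file.open(encoding="utf-8") as handle:
                rows.extend(json.loads(line) for line in handle if line.strip())
        return rows


with tempfile.TemporaryDirectory() as root:
    (Path(root) / "ProcessInfo.json").write_text('{"Pid": 804}\n', encoding="utf-8")
    index = LazyLogIndex([root])
    index.connect()
    reads_after_connect = index.reads  # nothing has been opened yet
    tables_found = sorted(index.files)
    rows = index.query("ProcessInfo")  # the file is read here
    reads_after_query = index.reads
```

Deferring the read keeps connect() cheap even when the folders hold many large log files, since only the tables that are actually queried get loaded.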


Listing Velociraptor tables

    qry_prov.list_queries()

    ['velociraptor.Custom_Windows_NetBIOS',
     'velociraptor.Custom_Windows_Patches',
     'velociraptor.Custom_Windows_Sysinternals_PSInfo',
     'velociraptor.Custom_Windows_Sysinternals_PSLoggedOn',
     ....

Each query function returns the corresponding table of data retrieved from the logs.

    qry_prov.velociraptor.Windows_Forensics_ProcessInfo()

Name	PebBaseAddress	Pid	ImagePathName	CommandLine	CurrentDirectory	Env
LogonUI.exe	0x95bd3d2000	804	C:\Windows\system32\LogonUI.exe	"LogonUI.exe" /flags:0x2 /state0:0xa3b92855 /state1:0x41c64e6d	C:\Windows\system32\	{'ALLUSERSP
dwm.exe	0x6cf4351000	848	C:\Windows\system32\dwm.exe	"dwm.exe"	C:\Windows\system32\	{'ALLUSERSP
svchost.exe	0x6cd64d000	872	C:\Windows\System32\svchost.exe	C:\Windows\System32\svchost.exe -k termsvcs	C:\Windows\system32\	{'ALLUSERSP
svchost.exe	0x7d18e99000	912	C:\Windows\System32\svchost.exe	C:\Windows\System32\svchost.exe -k LocalServiceNetworkRestricted	C:\Windows\system32\	{'ALLUSERSP
svchost.exe	0x5c762eb000	920	C:\Windows\system32\svchost.exe	C:\Windows\system32\svchost.exe -k LocalService	C:\Windows\system32\	{'ALLUSERSP

What's Changed

Ianhelle/velociraptor provider 2023 05 19 by @ianhelle in #668
Updating github checkout and upload-artifact to v3 by @ianhelle in #669
Added multithreading support for additional connections (+fixes) by @d3vzer0 in #645
Bump readthedocs-sphinx-ext from 2.2.0 to 2.2.2 by @dependabot in #679
Bump sphinx-rtd-theme from 1.2.0 to 1.2.2 by @dependabot in #675
Bump httpx from 0.24.0 to 0.24.1 by @dependabot in #666
Ianhelle/fix func query names 2023 06 30 by @ianhelle in #680

Full Changelog: v2.5.3...v2.6.0
