github microsoft/msticpy v2.0.0.rc2
MSTICPy 2.0.0 pre-release 2

New Features

There are several new features in v2.0.0 of MSTICPy. The major
items include:

  • Folium map update - plot a map using multiple layers, custom
    icons, colors and tooltips from a single function call.
  • Time Series - calculate and display a time series anomalies
    plot from a single function call.
  • Threat Intelligence lookups - individual providers run asynchronously
    (simultaneously), making it many times faster to perform lookups
    across providers. Lookup progress is also displayed with a progress
    bar.

Pre-release documentation for v2.0.0 is on Read the Docs.
Note: API documentation should be up-to-date but user guides for new features
are still TBD.

Folium map update

The Folium module in MSTICPy has always been a bit complex to use
since it normally required that you convert IP addresses to MSTICPy
IpAddress entities before adding them to the map. You can now
plot maps with a single function call from a DataFrame containing
IP addresses or location coordinates. You can group the data
into folium layers, specify columns to populate popups and tooltips
and to customize the icons and coloring.

plot_map

The new plot_map function (in the msticpy.vis.foliummap module)
lets you plot mapping points directly from a DataFrame. You can
specify either an ip_column or coordinate columns (lat_column and
long_column). In the former case, the geolocation of the IP address
is looked up using the MaxMind GeoLiteLookup data.

You can also control the icons used for each marker with the
icon_column parameter. If you happen to have a column in your
data that contains names of FontAwesome or Glyphicons icons
you can use that column directly.
More typically you would combine icon_column with the
icon_map parameter, which can be either a dictionary or a
function. For a dictionary, the value of the row in icon_column
is used as a key and the matching value is a dictionary of icon
parameters passed to the folium Icon class. For a function, the
icon_column value is passed to the function as a single parameter and
the return value should be a dictionary of valid parameters for the
Icon class.
You can read the documentation for this function in the
docs.
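
To make the two icon_map forms concrete, here is a minimal sketch in
plain Python. The resolve_icon helper, the sample "os" values and the
chosen icon names are hypothetical and are not part of MSTICPy. The
sketch only assumes that the resulting dictionary holds keyword
arguments accepted by the folium Icon class (such as color, icon and
prefix).

```python
from typing import Any, Callable, Dict, Union

IconParams = Dict[str, Any]
IconMap = Union[Dict[str, IconParams], Callable[[str], IconParams]]

# Dictionary form: the value found in icon_column is used as the key.
icon_dict: Dict[str, IconParams] = {
    "Windows": {"color": "blue", "icon": "windows", "prefix": "fa"},
    "Linux": {"color": "green", "icon": "linux", "prefix": "fa"},
}


# Function form: receives the icon_column value, returns icon parameters.
def icon_func(value: str) -> IconParams:
    if value.lower().startswith("win"):
        return {"color": "blue", "icon": "windows", "prefix": "fa"}
    return {"color": "gray", "icon": "info-sign"}


def resolve_icon(icon_map: IconMap, value: str) -> IconParams:
    """Hypothetical helper: turn one icon_column value into icon parameters."""
    if callable(icon_map):
        return icon_map(value)
    # Unknown keys fall back to an empty dict (i.e. the default icon).
    return icon_map.get(value, {})


print(resolve_icon(icon_dict, "Linux"))
print(resolve_icon(icon_func, "Windows 10"))
```

With MSTICPy installed, either icon_dict or icon_func would be the
value passed as icon_map, with icon_column naming the column that
holds the values being mapped.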

plot_map pandas accessor

Plot maps from the comfort of your own DataFrame!
Using the msticpy mp_plot accessor you can plot maps directly
from a DataFrame containing IP or location information.
The folium_map function has the same syntax as plot_map
except that you omit the data parameter.

    df.mp_plot.folium_map(ip_column="ip", layer_column="CountryName")

Layering, Tooltips and Clustering support

In plot_map and .mp_plot.folium_map you can specify
a layer_column parameter. This groups the data
by the values in that column and creates an
individually selectable layer in Folium for each value. For performance
and sanity reasons this should be a column with a relatively
small number of discrete values.

Clustering of markers in the same layer is also implemented by
default - this will collapse multiple closely located markers
into a cluster that you can expand by clicking or zooming.

You can also populate tooltips and popups with values
from one or more column names.
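
As a rough illustration of why layer_column should have few distinct
values, the sketch below groups rows by a column (one layer per value)
and builds tooltip text from a list of columns. It is plain Python, not
MSTICPy code: the helper names, the sample rows (using documentation IP
ranges) and the "Country A"/"Country B" values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, str]

rows: List[Row] = [
    {"ip": "192.0.2.10", "CountryName": "Country A", "Account": "alice"},
    {"ip": "192.0.2.44", "CountryName": "Country A", "Account": "bob"},
    {"ip": "198.51.100.7", "CountryName": "Country B", "Account": "carol"},
]


def group_into_layers(data: List[Row], layer_column: str) -> Dict[str, List[Row]]:
    """Hypothetical helper: one layer per distinct value in layer_column."""
    layers: Dict[str, List[Row]] = defaultdict(list)
    for row in data:
        layers[row[layer_column]].append(row)
    return dict(layers)


def tooltip_text(row: Row, columns: List[str]) -> str:
    """Hypothetical helper: join the chosen column values into one string."""
    return ", ".join(f"{col}: {row[col]}" for col in columns)


layers = group_into_layers(rows, "CountryName")
for name, members in layers.items():
    print(name, [tooltip_text(row, ["ip", "Account"]) for row in members])
```

A column with thousands of distinct values would produce thousands of
entries in the layer selector, which is the situation the guidance above
is meant to avoid.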

Classic interface

The original FoliumMap class is still there for more manual
control. This has also been enhanced to support direct plotting
from IPs, coordinates or GeoHash values in addition to the
existing IpAddress and GeoLocation entities.
It also supports layering and clustering.

Threat Intelligence Providers - Async support

When you have configured more than one TI provider, MSTICPy will
execute requests to each of them asynchronously. This will bring big
performance benefits when querying IoCs from multiple providers.
Note: requests to individual providers are still executed synchronously
since we want to avoid swamping provider services with multiple
simultaneous requests.
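
The pattern described here (providers run concurrently, while each
provider handles its own requests one at a time) can be sketched with
asyncio. This is an illustration only and not MSTICPy's implementation:
query_provider and lookup_all are hypothetical names, the sleep stands
in for a network request, and the "unknown" verdict is a placeholder.

```python
import asyncio
from typing import List, Tuple

Result = Tuple[str, str, str]  # (provider, ioc, verdict)


async def query_provider(name: str, iocs: List[str], delay: float = 0.01) -> List[Result]:
    """Hypothetical provider: processes its IoCs serially, one request at a time."""
    results: List[Result] = []
    for ioc in iocs:
        await asyncio.sleep(delay)  # stand-in for a single HTTP request
        results.append((name, ioc, "unknown"))
    return results


async def lookup_all(providers: List[str], iocs: List[str]) -> List[Result]:
    """Run every provider concurrently and flatten the per-provider results."""
    per_provider = await asyncio.gather(
        *(query_provider(name, iocs) for name in providers)
    )
    return [row for rows in per_provider for row in rows]


results = asyncio.run(
    lookup_all(["OTX", "RiskIQ"], ["162.244.80.235", "82.118.21.1"])
)
print(results)
```

In this sketch the total time is roughly that of the slowest provider
rather than the sum of all providers, while no single provider ever sees
more than one request in flight.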

We've also implemented progress bar tracking for TILookups, giving a visual
indication of progress when querying multiple IoCs.

Combining the progress tracking with asynchronous operation means
that not only are lookups for lots of observables faster,
but you are also less likely to be left guessing whether or not
your kernel has hung.

TI Providers are now also loaded on demand - i.e. only when you have
a configuration entry in your msticpyconfig.yaml for that provider.
This prevents loading of code (and possibly import errors) due to providers
which you are not intending to use.

Finally, we've added functions to enable and disable providers
after loading TILookup:

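    from msticpy.context import TILookup

    # Providers configured in msticpyconfig.yaml are loaded on demand
    ti_lookup = TILookup()

    iocs = ['162.244.80.235', '185.141.63.120', '82.118.21.1', '85.93.88.165']
    # Look up the IoCs using only the named providers
    ti_lookup.lookup_iocs(iocs, providers=["OTX", "RiskIQ"])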

Time Series pandas accessor

Although the Time Series functionality was relatively simple to
use, it previously required several disconnected steps to compute
the time series, plot the data and extract the anomaly periods. Each of
these needed a separate function import. Now you can do all of these
from a DataFrame via pandas accessors.
(Currently there is a separate accessor, df.mp_timeseries, but we are
still working on consolidating our pandas accessors so this may change
before the final release.)

Because you typically still need these separate outputs, the accessor
has multiple methods:

  • df.mp_timeseries.analyze - takes a time-summarized DataFrame
    and returns the results of a time-series decomposition
  • df.mp_timeseries.plot - takes a decomposed time-series and
    plots the anomalies
  • df.mp_timeseries.anomaly_periods - extracts anomaly periods
    as a list of time ranges
  • df.mp_timeseries.kql_periods - extracts anomaly periods
    as a list of KQL query clauses
  • df.mp_timeseries.apply_threshold - applies a new anomaly
    threshold score and returns the results.

See documentation

Analyze data to produce time series.

    df = qry_prov.get_networkbytes_per_hour(...)
    ts_data = df.mp_timeseries.analyze()

Analyze and plot time series anomalies

    df = qry_prov.get_networkbytes_per_hour(...)
    ts_data = df.mp_timeseries.analyze().mp_timeseries.plot()

Analyze and retrieve anomaly time ranges

    df = qry_prov.get_networkbytes_per_hour(...)
    ts_data = df.mp_timeseries.analyze().mp_timeseries.anomaly_periods()
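
To show what "anomaly periods as a list of time ranges" means in
practice, the following plain-Python sketch merges consecutive anomalous
points into (start, end) ranges. It is not MSTICPy's implementation:
anomaly_ranges is a hypothetical helper, the hourly sample data is made
up, and the input is assumed to be sorted by time.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Point = Tuple[datetime, bool]  # (timestamp, is_anomaly)
Range = Tuple[datetime, datetime]


def anomaly_ranges(points: List[Point], pad: timedelta = timedelta(0)) -> List[Range]:
    """Hypothetical helper: merge runs of anomalous points into time ranges."""
    ranges: List[Range] = []
    start: Optional[datetime] = None
    last: Optional[datetime] = None
    for timestamp, is_anomaly in points:
        if is_anomaly:
            if start is None:
                start = timestamp  # a new run of anomalies begins
            last = timestamp
        elif start is not None:
            ranges.append((start - pad, last + pad))  # the run has ended
            start = None
    if start is not None:  # close a run that reaches the end of the data
        ranges.append((start - pad, last + pad))
    return ranges


base = datetime(2022, 5, 1)
flags = [0, 0, 1, 1, 0, 1, 0]  # made-up hourly anomaly flags
points = [(base + timedelta(hours=i), bool(flag)) for i, flag in enumerate(flags)]
periods = anomaly_ranges(points)
print(periods)
```

Each resulting range could then be used as the start and end time of a
follow-up query for the raw events in that period.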

In next pre-release

Plot networks (graphs) directly from a DataFrame

One frequently requested feature is the ability to easily plot
networks from data. For example, you may want to view the interactions
between account names and IP addresses. This feature uses
NetworkX to build the graph and
Bokeh to plot the graph.

Note: The graph has the usual Bokeh interactivity - zooming, panning, selecting,
hover-over tooltips. It does not allow you to move individual
nodes and interactively recalculate the layout. For the
latter, you can use this functionality to build a networkx graph
and plot using something like GraphViz or PyViz.

The network plot will give you two functions:

  • df.mp.to_graph to convert a DataFrame to a networkx graph
  • df.mp_plot.network to create and plot the graph in a single step.

(There is also a separate function, msticpy.vis.network_plot.plot_nx_graph,
that will just do the NX -> plot operation.)

You can specify the columns to use as source and target. An edge
is created between source and target when the two occur
in the same row (or in more than one row). You can also
specify columns to use as node and edge attributes.
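
The row-to-edge rule can be sketched without any graph library. The
example below is plain Python, not the MSTICPy or NetworkX code: the
build_edges helper, the column names and the sample rows are
hypothetical. The per-edge row count is kept only to show how several
rows collapse into a single edge.

```python
from collections import Counter
from typing import Dict, List, Tuple

Row = Dict[str, str]
Edge = Tuple[str, str]

rows: List[Row] = [
    {"Account": "alice", "IpAddress": "192.0.2.10"},
    {"Account": "alice", "IpAddress": "192.0.2.10"},
    {"Account": "bob", "IpAddress": "192.0.2.10"},
    {"Account": "bob", "IpAddress": "198.51.100.7"},
]


def build_edges(data: List[Row], source_col: str, target_col: str) -> Counter:
    """Hypothetical helper: one edge per distinct (source, target) pair."""
    return Counter((row[source_col], row[target_col]) for row in data)


edges = build_edges(rows, "Account", "IpAddress")
nodes = {node for edge in edges for node in edge}

print(sorted(nodes))
for (source, target), row_count in edges.items():
    print(f"{source} -> {target} (rows: {row_count})")
```

Here the two identical alice rows yield one edge, and the shared IP
address becomes a single node connected to both accounts.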

To Do items

We intend to add the following before release:

  • allow you to specify the networkx layout algorithm to use
    (currently it uses the default spring_layout)
  • assign edge weight attribute based on number of rows contributing
    to an edge

MS Sentinel Workspaces API

Lets you query and resolve details for Sentinel workspaces.
This is integrated into the MpConfigEdit and MpConfigFile utilities
to let you look up workspace details when you are editing your
settings:

  • paste in a URL from the Sentinel Azure portal to populate workspace settings
  • or resolve full details from partial workspace details such as the workspace ID.
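
As an illustration of the URL case, the sketch below pulls the
subscription, resource group and workspace name out of a portal-style
URL. It is not the MSTICPy API: parse_workspace_url is a hypothetical
helper, the sample URL is made up, and it assumes the URL embeds a
URL-encoded Azure resource ID of the form
/subscriptions/.../resourcegroups/.../providers/microsoft.operationalinsights/workspaces/...

```python
import re
from typing import Dict, Optional
from urllib.parse import unquote

# Assumed shape of the Azure resource ID embedded in the portal URL.
_WORKSPACE_RES_ID = re.compile(
    r"/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourcegroups/(?P<resource_group>[^/]+)"
    r"/providers/microsoft\.operationalinsights"
    r"/workspaces/(?P<workspace_name>[^/?#]+)",
    re.IGNORECASE,
)


def parse_workspace_url(url: str) -> Optional[Dict[str, str]]:
    """Hypothetical helper: extract workspace details from a portal URL."""
    match = _WORKSPACE_RES_ID.search(unquote(url))  # decode %2F into "/"
    return match.groupdict() if match else None


# Made-up example URL; the IDs and names are placeholders.
sample_url = (
    "https://portal.azure.com/#blade/Example/id/"
    "%2Fsubscriptions%2F00000000-0000-0000-0000-000000000000"
    "%2Fresourcegroups%2Fmy-rg"
    "%2Fproviders%2Fmicrosoft.operationalinsights"
    "%2Fworkspaces%2Fmy-workspace"
)
parsed = parse_workspace_url(sample_url)
print(parsed)
```

Note that the workspace ID (a GUID) does not appear in a resource ID of
this shape, so resolving it would still need a lookup against the
service.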

Other important fixes

The API details for most of the MSTICPy functions were not being
generated - this should now be fixed.

What's Changed (GitHub PR Summary)

  • Added pd accessor for time series functions. by @ianhelle in #381
  • Added new Sentinel Search Features - merge from main by @ianhelle in #380
  • Ianhelle/ti async lookup 2022 04 27 by @ianhelle in #383
  • Ianhelle/folium accessor 2022 04 30 by @ianhelle in #384
  • Updated tweet action to include more details by @petebryan in #406
  • Add Device Code fallback option for when interactive auth isn't available. by @petebryan in #401
  • Adding OData Delegated Auth Support into 2.0 by @petebryan in #410
  • Removed plaintext token cache from MSAL auth and replaced it with fall back to in memory caching by @petebryan in #414
  • Ianhelle/kql nbinit fixes merge2.0 2022 05 18 by @ianhelle in #412
  • Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #421
  • Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #422
  • Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #423

Full Changelog: v2.0.0.rc1...v2.0.0.rc2
