Agenda (Pacific Timezone)

Time

Session

08:30

Welcome to the InfoSec Jupyterthon 2021!

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

08:45

Workshop 1.1 - Jupyter Introduction

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

09:15

Break

09:20

Workshop 1.2 - Acquiring Data

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft
Jose Rodriguez @Cyb3rPandah, Researcher, MITRE ATT&CK
Pete Bryan @MSSPete, Something about security, Microsoft

10:20

Break

10:35

Workshop 1.3 - Basics of Data Analysis

Jose Rodriguez @Cyb3rPandah, Researcher, MITRE ATT&CK
Pete Bryan @MSSPete, Something about security, Microsoft
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

11:35

Break

11:40

Keynote - The Open Source InfoSec Revolution

“If you want to go fast, go alone; if you want to go far, go together.” The sharing of knowledge, techniques and tools through open collaboration is key to cyber defense. Initiatives like SIGMA, MITRE ATT&CK, Jupyter and OTRF allow defenders to pool resources to combat increasingly sophisticated and coordinated adversaries. Integrating open source tools like Jupyter into your SOC processes is not without cost: there is a learning curve that can appear steep. Is it worth the investment? Where do I start? What is the payback? John will describe his journey with Jupyter notebooks in InfoSec to try to answer these questions.

John Lambert @JohnLaTwC, Microsoft Threat Intelligence Center (MSTIC) GM, Microsoft

12:05

Lunch

12:50

Intro to Sessions

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

13:00

Using Jupyter Notebooks to Solve A Variety of Business Problems

Notebooks are a powerful tool — we all love simplifying the gap between research, presentation, and delivery of results, right? However, the best way to use notebooks is often different depending on the end consumer of the information you’re providing. For example, notebooks can be a place to quickly iterate on research and create a proof of concept, or they can enable data consumption and analysis for a non-technical user group. Using notebooks effectively can have a positive impact on your business. But how do different groups achieve that impact? During this talk, I’ll take you through some examples of how we’ve used notebooks to support different business needs at Expel. We’ll talk about the problems we solved, the tools we used to do it, and how notebooks helped us make the results understandable — and most importantly, accessible — to our teams, including non-technical users.

Elisabeth Weber, Principal Data Scientist, Expel

13:35

How MSTIC uses Notebooks to Analyze Threat Signals

An overview of how MSTIC created a platform on which analysts can use Jupyter notebooks to collect and process threat signals at scale.

Neal Shenk, Senior Threat Intelligence Analyst, Microsoft
Sil Han, Senior Threat Intelligence Analyst, Microsoft
Natanela Brod, Software Engineer II, Microsoft

14:05

Break

14:15

Incident response notebooks - Log Analysis

Log analysis is a big problem in incident response. It is common for an incident to require analyzing gigabytes of events, usually in unhelpful formats. This presentation proposes an approach to parsing, filtering and analyzing these events. It also covers the limitations of pandas for analyzing large amounts of data and how these situations can be handled. The presentation is directly applicable to incident response, but the techniques described can be used in any field of data analysis.

Luis F Monge @Lukky86, Forensic analyst and incident handler, Telefónica

14:50

Reason Cyber Campaigns With Kestrel

Is there any evidence of these IOCs on my systems? From the context, are they false positives? Are these IOCs part of a larger campaign? Do you frequently search for TTPs, manually walk process trees to identify the root cause, or backtrack an attack across systems on your network? Cyber reasoning is the abstraction behind all the above questions: like clearing the “fog of war” in a real-time strategy game, security analysts implement a reasoning step, execute it to clear a small piece of “fog” ahead, reevaluate the entire visible region, and plan for the next reasoning step to iterate. Analysts may enumerate well-known reasoning steps, encode the latest knowledge into never-seen steps, and probe in multiple directions to develop reasoning strategies for different clients regarding their security measures and gaps. The Kestrel Threat Hunting Language enables comprehensive reasoning by minimizing distractions from low-level incompatible data and operations. In this session, we will perform step-by-step reasoning using Kestrel on Jupyter Notebook. To discover an APT campaign reproduced in our red lab, we will match TTP patterns, pivot around endpoint entities like processes, files, and network traffic, apply analytic steps including context enrichment and visualization, and backtrack the cyber campaign across devices and networks. Join our hunters game at InfoSec Jupyterthon 2021 to reason cyber campaigns with Kestrel. https://github.com/opencybersecurityalliance/kestrel-lang

Xiaokui Shu, Research Staff Member, IBM

15:20

Break

15:40

Moonwalking - Deriving Audit Policies from Event IDs

I will walk users through connecting to an Elastic cluster and running queries that give them an idea of what their audit policy looks like based on the Event IDs. I will also walk them through matching the events they have to possible rules that would fire in the Elastic SIEM. When you have logs but don’t know what’s in them, I’ll show users how to enumerate the logs and understand the fields and values that are important.

Neil Desai @0x617075, Principal Security Strategist, Elastic

16:15

Threat Hunting at scale with Spark Notebooks

Jupyter notebooks already play a key role in threat hunting and in automating blue team workflows. In most use cases, the data being worked on is relatively small and can be manipulated effectively with the compute attached to Jupyter notebooks. In some cases, however, analysts have to perform threat hunting or reactive investigation on historical logs ranging from 14 to 30 days, or sometimes even more, on voluminous data sources such as network firewall logs. Loading such large datasets and working with such large dataframes can be challenging with limited compute and memory resources; in such cases we need to look at alternative solutions to handle the data at scale. In this presentation, we will look at native Python libraries available to manipulate large dataframes, as well as handling them via notebooks attached to Apache Spark pools. We will showcase notebooks demonstrating various use cases where Apache Spark can help perform distributed processing of large dataframes, carry out complex data manipulation operations to find anomalies, perform data engineering on raw datasets, and explore Spark ML libraries to perform threat hunting at scale.

Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft

16:50

GPU-era Hunting of MBs/GBs/TBs - Graph Visualization & Wrangling

2 mini-sessions in one!
1. One honeypot event log, many graphs: A notebook tour of visual modeling of event data
- physical data: network connectivity graph, identity graph, …
- metadata: multigraphs, hypergraphs, & handling time
- inferred data: k-core, …
- AI embeddings: tabular + graph
- Making it practical: streamlit/databricks/graphistry dashboarding so your team doesn’t hate you
2. The GPU-accelerated SOC: Going from pandas notebooks to 100GB/s AI
- pandas -> cudf: single gpu
- cudf -> dask-cudf: multi-gpu / bigger than memory gpu
- examples of normal DF ops and trickier corners like strings, graph partitions, & watching your memory
- discussion of going 100GB/s by thinking about parallel IO sw/hw (GPU Direct)
Please RSVP (first 50 people!) for private GPU Paperspace+Graphistry accounts: https://docs.google.com/forms/d/e/1FAIpQLSc7-RluHmzzEpgUsmEDzEETGJRhFW9p-aqHtth4DO7BKZSmrQ/viewform

Leo Meyerovich @lmeyerov, CEO & Founder, Graphistry

17:30

Closing

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

Time

Session

08:30

Welcome Back to the InfoSec Jupyterthon 2021!

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

08:35

Workshop 1.1 - Jupyter Notebooks Advanced

Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft
Luis F Monge @Lukky86, Forensic analyst and incident handler, Telefónica
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

09:25

Break

09:30

Workshop 1.2 - Visualizing Data

Jose Rodriguez @Cyb3rPandah, Researcher, MITRE ATT&CK
Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft
Pete Bryan @MSSPete, Something about security, Microsoft
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

10:20

Break

10:35

Workshop 1.3 - Advanced Pandas

Jose Rodriguez @Cyb3rPandah, Researcher, MITRE ATT&CK
Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft
Luis F Monge @Lukky86, Forensic analyst and incident handler, Telefónica
Pete Bryan @MSSPete, Something about security, Microsoft
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

11:20

Break

11:25

Workshop 1.4 - MSTICPy

Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft
Pete Bryan @MSSPete, Something about security, Microsoft
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

12:10

Lunch

12:55

Intro to Sessions

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft

13:00

Automating Incident Triage with Notebooks

Incident triage is a fundamental part of SOC work, and ensuring efficient and accurate triage results can be the key to catching and responding to major security incidents. Despite its criticality, triage is often a very manual task with little quality control. In this presentation we will talk about how notebooks, and the automated execution of them, can be used to help create a software-defined triage process that helps analysts by providing additional context, saves them time by pre-producing data, and validates their work to ensure quality, repeatability, and auditability.

Pete Bryan @MSSPete, Something about security, Microsoft

13:35

Scale Threat Investigations with Notebooks & Python Design Patterns

Writing code to scale threat hunting and incident response has been a proven practice of sophisticated infosec teams for years; now, with notebooks and well-designed Python libraries, orgs of all sizes, budgets and maturity levels can deploy repeatable IR tactics, find evil faster, and quickly level-up their entire team. In this session, we’ll underscore why notebooks matter, learn to add context and perform automated triage of IOCs, create our own threat intel for hunting activities, and connect different libraries, products and intel sources in a single repeatable pipeline. Along the way, we’ll explore a few low-level Python design patterns that will make your libraries super easy to use in a notebook context, even for folks early in their programming journey.

Mark Kendrick, Principal Program Manager, Microsoft

14:00

Break

14:15

SSH Session Hijack Analytic

SSH session hijacking is a technique used by adversaries that may lead to lateral movement and privilege escalation. In the lecture we will go through why each piece of logic was selected, what we can and can’t see in the data, which indicators helped, why combining weak hypotheses is important when you don’t have a strong indication, and how to explore it. https://hx015.medium.com/ssh-session-hijack-analytic-a2c684ba410f

Shachar Roitman @shacharoitman, Security Researcher, Palo Alto Networks

14:50

Hunting PowerShell Obfuscation with Linear Regression

PowerShell obfuscation is commonly used by adversaries because it allows for native code execution, and it evades static string detection. There’s no way to write static detection for all possible obfuscation techniques. Instead, let’s go hunt for the obfuscation! It turns out that for normal, non-obfuscated PowerShell commands, there are strong correlations between the length of a command and the count of various characters in that command. We can use statistical techniques such as linear regression to find commands that don’t match our expected correlations, and therefore have a higher chance of being obfuscated. This presentation will demonstrate an effective technique for finding these outliers.

Joe Petroske, Cyber Threat Hunter, Target
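
As a rough illustration of the idea in the abstract above (and not the speaker's implementation), here is a minimal Python sketch. It fits a least-squares line of character count against command length on a baseline of commands assumed to be benign, then flags candidates whose count deviates strongly from the fitted line. The function names, the tracked character, the threshold, and the one-character floor on the residual spread are all hypothetical choices made for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("baseline commands must not all have the same length")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(baseline, candidates, char, threshold=3.0):
    """Return candidates whose count of `char` is far from the baseline trend.

    `baseline` is a list of commands assumed to be benign (an assumption of
    this sketch); `threshold` is in units of the baseline residual spread.
    """
    lengths = [len(cmd) for cmd in baseline]
    counts = [cmd.count(char) for cmd in baseline]
    slope, intercept = fit_line(lengths, counts)

    # Root-mean-square residual of the baseline around the fitted line.
    residuals = [y - (slope * x + intercept) for x, y in zip(lengths, counts)]
    spread = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    # Hypothetical floor of one character, so that a perfectly linear
    # baseline does not make every tiny deviation look extreme.
    spread = max(spread, 1.0)

    flagged = []
    for cmd in candidates:
        residual = cmd.count(char) - (slope * len(cmd) + intercept)
        if abs(residual) > threshold * spread:
            flagged.append(cmd)
    return flagged


if __name__ == "__main__":
    # Illustrative commands only; not taken from any real dataset.
    baseline = [
        "Get-ChildItem -Path C:\\Temp",
        "Get-Process | Sort-Object CPU",
        "Write-Output 'done'",
        "Get-Service -Name 'Spooler' | Restart-Service",
    ]
    candidates = [
        "Get-Date",
        "iex ('Inv'+'oke'+'-Ex'+'pre'+'ssi'+'on')",
    ]
    print(flag_outliers(baseline, candidates, char="'"))
```

In practice one would likely repeat this for several characters and combine the results, since a single character count is a weak signal on its own.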

15:20

Break

15:40

Detecting C2 Beaconing using Statistical Analysis

RITA is an open source framework developed by Active Countermeasures. It can perform beaconing detection, DNS tunneling detection and blacklist checking. In this session, we will partially implement the RITA beacon analyzer in a Jupyter notebook and analyze a dataset that was generated and shared with the community by Ali Alwashali (@ali_alwashali). The dataset comprises around 2M records created from all “malware-traffic-analysis.net” PCAP files from 2013 to 2021. The audience will be able to use the notebook, replicate the analysis, and, hopefully, use it in their environments.

Mehmet Ergene @Cyb3rMonk, Threat Hunter, Confidential
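
To give a flavor of statistical beacon detection, the sketch below scores how regular the gaps between connections are for a single source/destination pair. It is a simplified illustration, not a reproduction of RITA's analyzer or of the session notebook: the choice of Bowley skewness and median absolute deviation as regularity measures, the equal weighting, the 30-second scale, and all function names are assumptions made for this example.

```python
import statistics


def interarrival_deltas(timestamps):
    """Seconds between consecutive connections, after sorting."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def bowley_skew(values):
    """Quartile-based skewness; 0 means the quartiles are symmetric."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    if spread == 0:
        return 0.0
    return (q1 + q3 - 2 * q2) / spread


def median_abs_deviation(values):
    """Median distance of the values from their own median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def beacon_score(timestamps, mad_scale=30.0):
    """Score in [0, 1]; values near 1 suggest evenly spaced connections.

    `mad_scale` (seconds) is a hypothetical tuning value: a median absolute
    deviation at or above it contributes nothing to the score.
    """
    deltas = interarrival_deltas(timestamps)
    if len(deltas) < 2:
        return 0.0  # too few connections to judge regularity
    skew_score = 1.0 - abs(bowley_skew(deltas))
    mad_score = max(0.0, 1.0 - median_abs_deviation(deltas) / mad_scale)
    return (skew_score + mad_score) / 2.0


if __name__ == "__main__":
    every_minute = [i * 60 for i in range(10)]
    bursty = [0, 5, 400, 410, 2000, 2100, 9000]
    print(beacon_score(every_minute), beacon_score(bursty))
```

A real analysis would typically group connection records by source and destination first and would also consider attributes such as connection counts and data sizes, which this sketch leaves out.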

16:15

Jupyterthon 2021 Panel

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft
Jose Rodriguez @Cyb3rPandah, Researcher, MITRE ATT&CK
Pete Bryan @MSSPete, Something about security, Microsoft
Ashwin Patil @ashwinpatil, Senior Program Manager, Microsoft
Luis F Monge @Lukky86, Forensic analyst and incident handler, Telefónica
Ian Hellen @ianhellen, Principal Developer and Security Engineer, Microsoft

16:45

Closing

Roberto Rodriguez @Cyb3rWard0g, Principal Threat Researcher, Microsoft