CANSECWEST 2022 DOJO

Automated Program Analysis Using Machine Learning

Instructor: Hahna Kane Latonick

May 14 - 17

Hahna Kane Latonick

For the past 16 years of her engineering career, Hahna Kane Latonick has worked throughout the defense industry, specializing in cybersecurity as a security researcher for the Department of Defense and defense contracting companies. She has been featured as a cybersecurity subject matter expert on Fox Business News, ABC, U.S. News and World Report, and other national media outlets. She has led multiple tech startups, serving as CTO, VP of R&D, and Director of R&D. She has trained and developed security researchers at one of the top five aerospace and defense industry companies. Over the years, she has also taught at different conferences, such as Ringzer0 and Security BSides Orlando. In 2014, she became a DEFCON CTF finalist, placing 6th and ranking in the top 1.5% of ethical hackers worldwide. She also holds CISSP and CEH certifications. Latonick attended Swarthmore College and Drexel University, where she earned her B.S. and M.S. in Computer Engineering along with a Mathematics minor.

Twitter: https://twitter.com/hahnakane

LinkedIn: https://www.linkedin.com/in/hahnakane/

Course Schedule

May 14 - 17.

This DOJO is offered REMOTELY only.

Course Abstract

This 4-day online course features a practical hands-on approach to automated program analysis using machine learning. Given the increasing pervasiveness of IoT devices and malware, there is a great need to perform automated reverse engineering at scale, especially since reverse engineering software and firmware can often be a manual, labor-intensive, and time-intensive process. This class is perfectly suited for students who are new to machine learning and want to leverage it to automate their program analysis and reverse engineering efforts.

The class kicks off with advanced program analysis: automatically identifying shared code relationships between applications using different binary features, computing code-sharing similarity over a data set to determine binary groupings, and then determining a new binary’s similarity to previously seen samples based on code-sharing patterns. We will also cover intermediate representations of binaries and how they can be used for advanced program analysis.
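As a rough illustration of the shared-code idea described above (not the course's actual lab code), the sketch below treats each binary as a set of byte n-grams and compares sets with the Jaccard index. The choice of byte n-grams as the feature, the window size of 4, and the function names `byte_ngrams`, `jaccard`, and `most_similar` are all assumptions made for this example.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of distinct n-byte windows found in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(new_sample: bytes, known: dict[str, bytes], n: int = 4):
    """Return (name, score) for the known sample sharing the most n-grams."""
    new_feats = byte_ngrams(new_sample, n)
    scores = {
        name: jaccard(new_feats, byte_ngrams(blob, n))
        for name, blob in known.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Toy byte strings standing in for real binaries (illustrative only).
    known = {
        "sample_a": b"\x55\x48\x89\xe5\x90\x90",
        "sample_b": b"\x00\x01\x02\x03\x04\x05",
    }
    print(most_similar(b"\x55\x48\x89\xe5\x90\xc3", known))
```

In practice the feature set could just as well be imported functions, strings, or instruction n-grams; the similarity computation stays the same.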

Next, we will introduce machine learning concepts and their applications to automated reverse engineering. We will first use unsupervised machine learning algorithms to find data patterns and features which can be useful for categorization. Then we will develop supervised machine learning models to classify binaries and make certain predictions about them. Lastly, we will apply deep learning to automate program analysis by building and evaluating neural networks. Throughout the class, labs will be conducted in a virtual environment. Students will leave the course with the necessary hands-on experience, knowledge, and confidence to conduct automated program analysis at scale using machine learning.
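To give a flavor of what classifying binaries with a supervised model can look like, here is a minimal k-nearest-neighbors sketch in plain Python. It is only an illustration under stated assumptions: the two features (mean byte entropy and fraction of printable bytes), the labels "packed" and "plain", and every numeric value in the training set are made up for the example and do not come from the course material.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features per binary: (mean byte entropy, fraction of printable bytes).
# All values below are invented for illustration.
train = [
    ((7.8, 0.05), "packed"),
    ((7.6, 0.08), "packed"),
    ((7.9, 0.04), "packed"),
    ((5.1, 0.40), "plain"),
    ((4.8, 0.45), "plain"),
    ((5.4, 0.35), "plain"),
]

if __name__ == "__main__":
    print(knn_predict(train, (7.7, 0.06)))
    print(knn_predict(train, (5.0, 0.42)))
```

The features here are on different scales, so a real pipeline would normally normalize them before computing distances.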

Applications covered in the class include, but are not limited to:

Binary Analysis
Malware Analysis
Firmware Analysis
Network/IoT Analysis
Mobile Security Analysis
Security Research / Vulnerability Discovery

Course Pre-requisites

Knowledge of Python 3 programming
Knowledge of computer architecture concepts
Knowledge of an assembly language (e.g., x86/x64, ARM, etc.)
Familiarity with navigating Linux environments and command line knowledge

Course Learning Objectives

Performing Shared Code Analysis
Leveraging intermediate representations for advanced program analysis
Introduction to Machine Learning
Exploring Unsupervised ML algorithms
Developing Supervised ML models
Building Neural Networks
Evaluating and measuring the effectiveness of ML systems

Who Should Attend

Reverse engineers, security researchers, and analysts with little to no experience with machine learning

Analysts, security researchers, and reverse engineers who want to automate and scale their program analysis and reverse engineering process

Agenda

Day 1:

Introduction to advanced program analysis
Identifying and extracting program features
EXERCISE: Similarities Lab
Leveraging N-Grams for program analysis
EXERCISE: N-Grams Lab
Performing agnostic program analysis
EXERCISE: Architecture and Compiler Agnostic Analysis Lab
Introduction to intermediate representations
EXERCISE: IR Lab

Day 2:

Introduction to Machine Learning
Evaluating ML systems
Unsupervised ML algorithm: K-Means Clustering
EXERCISE: K-Means Lab
Unsupervised ML algorithm: Agglomerative Hierarchical Clustering
EXERCISE: Agglomerative Analysis Lab
Unsupervised ML algorithm: Principal Component Analysis
EXERCISE: PCA Lab

Day 3:

Introduction to Supervised Machine Learning
Supervised ML algorithm: Logistic Regression
EXERCISE: Logistic Regression Lab
Supervised ML algorithm: Decision Tree
EXERCISE: Decision Tree Lab
Supervised ML algorithm: Random Forest
EXERCISE: Random Forest Lab
Supervised ML algorithm: K Nearest Neighbors
EXERCISE: KNN Lab
Supervised ML algorithm: Support Vector Machines
EXERCISE: SVM Lab

Day 4:

Introduction to Neural Networks
Building Neural Networks for Program Analysis
EXERCISE: Neural Networks Development Lab
Evaluating Neural Networks
EXERCISE: Neural Networks Performance Lab

Students will be provided with

Students will receive access to the course slides, sample code, and lab exercises, which they can keep to continue learning and practicing after the training ends.

Hardware Requirements

A working laptop or desktop (no Netbooks, no Tablets, no iPads)
Intel Core i3 (equivalent or superior) required
8 GB RAM required, at a minimum
10 GB free hard disk space, at a minimum

Software Requirements

The following software must be installed on each student's machine before the course begins:

Linux / Windows / Mac OS X desktop operating systems

VMware Workstation or Fusion. The free 30-day trial is sufficient and can be downloaded here: https://www.vmware.com/try-vmware.html

Administrator / root access MANDATORY
