Malware Analysis

Presenter Notes

What is it?

Shellcode used to be a protective shell for a bad program.

Then it came to mean an assembly payload that spawns a shell on a box

It has since evolved to mean any assembly payload

Presenter Notes

Position Independent

Shellcode is commonly used in code injection, thus it has no idea where it will land.

Cannot use hard-coded addresses

Must resolve its own EIP, imports, etc. at run time

Presenter Notes

PEB Techniques (x86)

Pointer to the PEB is stored at fs:[0x30]

Kernel32.dll has GetProcAddress and LoadLibraryA.

Kernel32.dll is typically the 3rd entry in the InMemoryOrderModuleList (after the executable itself and ntdll.dll)

  1. Parse the PEB to find the Ldr module lists, pick the InMemoryOrderModuleList, walk the _LIST_ENTRY chain and find the base address of kernel32.dll (its PE header)

  2. Resolve further functions from Kernel32.dll using the previously mentioned API calls

  3. To find those API calls in the first place: PE Header -> Export Table -> NumberOfNames -> AddressOfNames -> hash each name and compare it to our value -> resolve the absolute address of the matching function
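The export walk in step 3 can be sketched in Python. This is a simplified model, not real PE parsing: FAKE_EXPORTS, its base address and its RVAs are made-up values, and resolve_by_hash is a hypothetical helper name. It only shows how the three parallel export arrays fit together once a name hash matches.

```python
def ror(dword, bits):
    """Rotate a 32-bit value right by `bits`."""
    return ((dword >> bits) | (dword << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name):
    """Rotate-right-13 then add each character, kept to 32 bits."""
    h = 0
    for c in name:
        h = ror(h, 13)
        h = (h + ord(c)) & 0xFFFFFFFF
    return h


# Hypothetical, simplified export directory (all numbers are invented).
FAKE_EXPORTS = {
    "base": 0x76000000,                                       # module base address
    "names": ["GetProcAddress", "LoadLibraryA", "WinExec"],   # AddressOfNames
    "ordinals": [2, 0, 1],                                    # AddressOfNameOrdinals
    "functions": [0x1A2B0, 0x3C4D0, 0x0F100],                 # AddressOfFunctions (RVAs)
}


def resolve_by_hash(exports, wanted_hash):
    """Return the absolute address of the export whose name hashes to wanted_hash."""
    for i, name in enumerate(exports["names"]):
        if ror13_hash(name) == wanted_hash:
            ordinal = exports["ordinals"][i]      # name index -> ordinal index
            rva = exports["functions"][ordinal]   # ordinal index -> function RVA
            return exports["base"] + rva          # RVA -> absolute address
    return None


address = resolve_by_hash(FAKE_EXPORTS, 0xEC0E4E8E)  # hash of LoadLibraryA
print(hex(address))
```

Real shellcode does the same walk in assembly against the mapped DLL, so no function name ever has to appear in the payload.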

Presenter Notes

Hashing

Shellcode avoids plain strings for function names

  • Null terminated
  • Easy to spot

It uses a hash of the function name instead

ROR13 is quite common (described by Skape)

Project Shellcode on Hashing

Presenter Notes

ROR13

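Not a substitution cipher (that is ROT13): rotate the running 32-bit hash right by 13 bits, then add the next character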

ror13

Presenter Notes

How to use ROR13 for functions

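def ror( dword, bits ):
  # rotate a 32-bit value right by `bits`
  return (( dword >> bits | dword << ( 32 - bits ) ) & 0xFFFFFFFF)

def ror13_hash( function ):
  function_hash = 0
  for c in str( function ):
    function_hash = ror( function_hash, 13 )
    function_hash = ( function_hash + ord(c) ) & 0xFFFFFFFF  # keep to 32 bits
  return function_hash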

Shellcode that uses hashes walks a DLL's export names, hashes each one, and compares the result against its stored values. On a match, it resolves the address of the matching function.

C:\Python27\Scripts>hashing.py kernel32.dll LoadLibraryA
[+] Ran on Thu Apr 16 12:17:45 2015

[+] 0xEC0E4E8E = kernel32.dll!LoadLibraryA

Presenter Notes

Generate All Hashes for a DLL

C:\Python27\Scripts>hashing.py /mod c:\windows\system32 kernel32.dll | less
[+] Ran on Thu Apr 16 12:22:00 2015
[+] Scanning module 'kernel32.dll' in directory 'c:\windows\system32'.
IN SCAN
[+] 0xA77D8D5A = kernel32.dll!AcquireSRWLockExclusive
[+] 0xE2385C49 = kernel32.dll!AcquireSRWLockShared
[+] 0x2FA60624 = kernel32.dll!ActivateActCtx
[+] 0xECFC3453 = kernel32.dll!AddAtomA
[+] 0xECFC3469 = kernel32.dll!AddAtomW
[+] 0x99161276 = kernel32.dll!AddConsoleAliasA
[+] 0x9916128C = kernel32.dll!AddConsoleAliasW
[+] 0xE3EAC3E7 = kernel32.dll!AddDllDirectory
[+] 0x730CEFAE = kernel32.dll!AddIntegrityLabelToBoundaryDescriptor
[+] 0x33675025 = kernel32.dll!AddLocalAlternateComputerNameA
[+] 0x3367503B = kernel32.dll!AddLocalAlternateComputerNameW
[+] 0xBFC36E12 = kernel32.dll!AddRefActCtx
[+] 0xCDC729AB = kernel32.dll!AddSIDToBoundaryDescriptor
[+] 0x6B8B8FD9 = kernel32.dll!AddSecureMemoryCacheCallback
[+] 0x1A945C3B = kernel32.dll!AddVectoredContinueHandler
[+] 0x159B3EA0 = kernel32.dll!AddVectoredExceptionHandler
[+] 0x2CA68404 = kernel32.dll!AdjustCalendarDate
[+] 0xD9F868D8 = kernel32.dll!AllocConsole
[+] 0x6493DFD5 = kernel32.dll!AllocateUserPhysicalPages
[+] 0xDD6573EA = kernel32.dll!AllocateUserPhysicalPagesNuma
[+] 0xCEB4FB95 = kernel32.dll!ApplicationRecoveryFinished

Presenter Notes

Hashing Continued

Use hashes to look at shellcode imports

Identify which ROR variant or cipher the shellcode is using and reimplement the algorithm
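A minimal sketch of turning hashes back into import names, assuming the shellcode uses plain ROR13. CANDIDATE_NAMES is a small hand-picked list for illustration; in practice it would come from a DLL's full export list. 0xDEADBEEF stands in for a constant that is not a hash.

```python
def ror(dword, bits):
    """Rotate a 32-bit value right by `bits`."""
    return ((dword >> bits) | (dword << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name):
    """Rotate-right-13 then add each character, kept to 32 bits."""
    h = 0
    for c in name:
        h = ror(h, 13)
        h = (h + ord(c)) & 0xFFFFFFFF
    return h


# Illustrative subset of export names (assumption: normally the whole DLL).
CANDIDATE_NAMES = ["LoadLibraryA", "GetProcAddress", "WinExec", "ExitProcess"]

# Precompute hash -> name once, then look up constants seen in the shellcode.
HASH_TO_NAME = {ror13_hash(name): name for name in CANDIDATE_NAMES}

constants_seen_in_shellcode = [0xEC0E4E8E, 0xDEADBEEF]
for value in constants_seen_in_shellcode:
    print("0x%08X -> %s" % (value, HASH_TO_NAME.get(value, "unknown")))
```

If nothing matches, the shellcode is probably using a different rotate count or a different mixing step, and the hash function has to be adjusted before rebuilding the table.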

ror13

Presenter Notes

Malware RE

Presenter Notes

Resources

The IDA Pro Book 2nd Edition – Chris Eagle

Practical Malware Analysis – Michael Sikorski and Andrew Honig

Presenter Notes

Recall. . .

We can extract binaries and DLLs.

Import Address Table is destroyed when extracting using Volatility.

  • We can recover it with impscan and use the result in IDA

Presenter Notes

Reverse Engineering

Understanding the

Algorithm Discovery

  • Zeus encryption keys

Protocol Discovery

Understanding Purpose

  • What did binary do? How? Why? Exfiltration?
  • Damage caused?
  • Attribution?

Presenter Notes

Compilation

Source → Compiler → Assembler → Linker → Binary on Disk

Because we went through the compilation process

  • We lost program design
  • Source code
  • Developer comments
  • Function / variable names
  • Etc...

We are left with machine code!

   01111111 01000101 01001100 01000110
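As a quick check, the four octets above can be decoded in Python. They turn out to be the magic number that starts every ELF file. The variable names are only for illustration.

```python
bits = "01111111 01000101 01001100 01000110"

# Convert each 8-bit group to one byte.
raw = bytes(int(octet, 2) for octet in bits.split())

print(raw)        # the raw bytes
print(raw.hex())  # the same bytes as hex digits
```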

Presenter Notes

Disassembly

Generate assembly from machine language

Assembly Disasm(machine[])

Objdump, Dumpbin, IDA, Ollydbg...

Any non-IDE debugger performs disassembly

Presenter Notes

Disassembly

Each architecture has different assembly/machine code mapping

Malware most commonly targets Intel architectures; some targets ARM.

Algorithm needs to know how to read the file format to discover what is executable code / what is data.

PE has a lot of data as well as executable sections

Presenter Notes

Example

The IA32 Instruction 'ret'

Returns from a function: pops the address at the top of the stack (which should be the saved EIP) and jumps to it.

Machine Instruction is 0xc3.

Disassembler sees 0xc3

  • Input: 0xC3
  • IF Code → ret
  • ELSE data → 0xc3

Presenter Notes

Disassembly Algorithms

Linear Sweep

Recursive Descent

Both take in machine code and output assembly code

Both fail on obfuscated code

Presenter Notes

Linear Sweep

Parse down the .text segment linearly.

Iterate over a block of code and disassemble one instruction at a time

Determines size of instructions and then starts at the next one!

No regard for flow-control.

Data embedded within code chokes this algorithm.

Used by most debuggers

Complete coverage over program's code sections.
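A toy linear sweep, as a sketch only. The opcode table covers a handful of real IA-32 encodings and nothing else. CODE is a made-up 8-byte sample in which a short jmp skips two data bytes, one of which happens to look like the start of a mov.

```python
import struct

# Tiny subset of IA-32: opcode -> (mnemonic, total instruction length in bytes)
OPCODES = {
    0x55: ("push ebp", 1),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x74: ("jz", 2),
    0xEB: ("jmp", 2),
    0xB8: ("mov eax", 5),
    0xE8: ("call", 5),
}


def decode(code, off):
    """Decode one instruction at `off`; return (text, length)."""
    op = code[off]
    if op not in OPCODES or off + OPCODES[op][1] > len(code):
        return "db 0x%02x" % op, 1                      # unknown byte: emit as data
    mnem, length = OPCODES[op]
    if op in (0x74, 0xEB):                              # 8-bit relative branch
        rel = struct.unpack_from("<b", code, off + 1)[0]
        return "%s 0x%x" % (mnem, off + length + rel), length
    if op == 0xE8:                                      # 32-bit relative call
        rel = struct.unpack_from("<i", code, off + 1)[0]
        return "%s 0x%x" % (mnem, off + length + rel), length
    if op == 0xB8:                                      # 32-bit immediate
        imm = struct.unpack_from("<I", code, off + 1)[0]
        return "%s, 0x%x" % (mnem, imm), length
    return mnem, length


def linear_sweep(code):
    """Disassemble from offset 0 to the end, ignoring control flow."""
    off, listing = 0, []
    while off < len(code):
        text, length = decode(code, off)
        listing.append((off, text))
        off += length                                   # next instruction starts right after
    return listing


# jmp +2 | two data bytes (0xB8, 0x11) | nop nop nop ret
CODE = bytes([0xEB, 0x02, 0xB8, 0x11, 0x90, 0x90, 0x90, 0xC3])

for off, text in linear_sweep(CODE):
    print("%04x  %s" % (off, text))
```

The sweep decodes the data byte 0xB8 as a 5-byte mov and swallows the three nop instructions that follow, which is the failure mode described above.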

Presenter Notes

Recursive Descent

Goes through linear flow while reading each instruction and building a list of locations to disassemble.

For each call, jump, etc, add destination to a list to disassemble.

  • i.e., it follows control flow.

Stops parsing at a 'ret' or unconditional branch.

  • Then take next location from list and continue
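A toy recursive descent over the same kind of input, again only a sketch. The opcode table is a tiny subset of real IA-32 encodings, and CODE is a made-up sample where a short jmp skips two data bytes.

```python
import struct

# Tiny subset of IA-32: opcode -> (mnemonic, total instruction length in bytes)
OPCODES = {
    0x55: ("push ebp", 1),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x74: ("jz", 2),
    0xEB: ("jmp", 2),
    0xB8: ("mov eax", 5),
    0xE8: ("call", 5),
}


def decode(code, off):
    """Decode one instruction; return (text, length, branch_target, ends_run)."""
    op = code[off]
    if op not in OPCODES or off + OPCODES[op][1] > len(code):
        return "db 0x%02x" % op, 1, None, True          # unknown byte: stop this run
    mnem, length = OPCODES[op]
    if op in (0x74, 0xEB):                              # 8-bit relative branch
        target = off + length + struct.unpack_from("<b", code, off + 1)[0]
        return "%s 0x%x" % (mnem, target), length, target, op == 0xEB
    if op == 0xE8:                                      # 32-bit relative call
        target = off + length + struct.unpack_from("<i", code, off + 1)[0]
        return "%s 0x%x" % (mnem, target), length, target, False
    if op == 0xB8:                                      # 32-bit immediate
        imm = struct.unpack_from("<I", code, off + 1)[0]
        return "%s, 0x%x" % (mnem, imm), length, None, False
    return mnem, length, None, op == 0xC3               # ret ends the run


def recursive_descent(code, entry=0):
    """Disassemble only the bytes reachable by following control flow."""
    worklist, seen, listing = [entry], set(), {}
    while worklist:
        off = worklist.pop()                            # take next location from the list
        while 0 <= off < len(code) and off not in seen:
            seen.add(off)
            text, length, target, ends_run = decode(code, off)
            listing[off] = text
            if target is not None:
                worklist.append(target)                 # remember branch destination
            if ends_run:                                # ret or unconditional jmp
                break
            off += length
    return sorted(listing.items())


# jmp +2 | two data bytes (0xB8, 0x11) | nop nop nop ret
CODE = bytes([0xEB, 0x02, 0xB8, 0x11, 0x90, 0x90, 0x90, 0xC3])

for off, text in recursive_descent(CODE):
    print("%04x  %s" % (off, text))
```

Because the walk stops at the unconditional jmp and resumes at its target, the two data bytes are never decoded. The trade-off is that code reached only through computed or indirect branches is never visited.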

Presenter Notes

Malware Analysis

Static and Dynamic approaches

Static

  • Doesn't execute
  • Disassembly
  • Strings
  • Parsing file format

Dynamic

  • Execute
  • Running the binary
  • Debuggers

Presenter Notes

Static

Does not require running the malware

Analyze the code and structure of a program

Antivirus / YARA

Hashing

Strings

Determining if file is packed

Viewing imports / exports

Disassemble
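Two of the static steps above, hashing and strings, can be sketched with the Python standard library. SAMPLE is a made-up byte blob standing in for a file read from disk, and sha256_of and ascii_strings are hypothetical helper names. The sample is never executed.

```python
import hashlib
import re


def sha256_of(data):
    """Hex digest used to look the sample up in AV / threat-intel databases."""
    return hashlib.sha256(data).hexdigest()


def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Invented sample bytes; a real run would use open(path, "rb").read().
SAMPLE = b"\x00\x01MZ\x90\x00kernel32.dll\x00\xff\xfeLoadLibraryA\x00ab\x00"

print(sha256_of(SAMPLE))
for s in ascii_strings(SAMPLE):
    print(s)
```

A packed sample will usually show very few readable strings and imports, which is itself a useful signal.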

Presenter Notes

Static

Some approaches useful to write YARA signatures.

Do not rely solely on output of strings, dependencies, etc.

Disassembly listings are quite reliable unless the code is heavily obfuscated.

Recall that packers alter the structure of a binary

Presenter Notes

Dynamic

Run the malware!

Requires a secure environment

Observe malware functionality

Run through a sandbox

Process Monitor (Procmon)

Regshot

Fake a network (iNetSim/ApateDNS)

Wireshark

Presenter Notes

Assembly Primer

IA32 / x86

  • IDA Free only supports this

General Purpose Registers

  • Small amount of data storage available on CPU
  • Quick access
  • 8 General Purpose
  • Segment Registers
  • Status Register
  • Instruction Pointer

Presenter Notes

registers

Presenter Notes

Instructions

Building blocks of assembly.

Consists of a mnemonic and zero or more operands.

Mnemonic

  • Mov, Add, Sub, Ret

Operands

  • Source/Destination register, address, value

Mov eax, 0x42

Turned into opcodes at assembly time.

Multi-byte values are stored little endian.
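A small sketch of what the assembler emits for the example above. On IA-32, mov eax, imm32 is encoded as opcode 0xB8 followed by the 32-bit immediate in little-endian byte order. assemble_mov_eax is a hypothetical helper name.

```python
import struct


def assemble_mov_eax(imm32):
    """Encode `mov eax, imm32`: opcode 0xB8, then the immediate, low byte first."""
    return b"\xB8" + struct.pack("<I", imm32)


encoded = assemble_mov_eax(0x42)
print(encoded.hex())  # opcode byte followed by the four immediate bytes
```

The least significant byte (0x42) comes first in memory, followed by three zero bytes.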

Presenter Notes

Operands

Immediate

  • Fixed values such as 0x50

Register

  • EAX, EBX, etc

Memory Address

  • [EAX]

Intel syntax uses Dst, Src

  • Mov EAX, 0x50
    • EAX is the destination of 0x50.

Presenter Notes

Stack Primer

LIFO Data Structure

Grows from high memory to low memory

Elements are pushed onto, and popped off, the top of the stack.

stack
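A minimal model of the x86 stack, assuming 4-byte slots. The Stack class and the starting ESP value are made up for illustration. push moves ESP to a lower address before storing; pop reads and then moves ESP back up.

```python
class Stack:
    """Toy 32-bit stack: memory is a dict keyed by address."""

    def __init__(self, top=0x0012FF00):   # invented initial ESP
        self.esp = top
        self.mem = {}

    def push(self, value):
        self.esp -= 4                      # grow toward lower addresses
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem.pop(self.esp)     # read what ESP points at
        self.esp += 4                      # shrink back toward higher addresses
        return value


s = Stack()
s.push(0x11)
s.push(0x22)
print(hex(s.esp))    # 8 bytes below the starting ESP
print(hex(s.pop()))  # last value pushed comes off first
```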

Presenter Notes

Function Primer

Most functions consist of a prologue and an epilogue

Prologue sets up the stack frame for the current function

Push Ebp
Mov ebp, esp
Sub esp, sizeOfLocalArgs

Epilogue restores the caller's stack and frame pointers and returns to the EIP that was saved by the call

Mov esp, ebp
Pop ebp
ret

Presenter Notes

Function Calls

Several calling conventions

The Win32 API uses stdcall

  • passes arguments on the stack in reverse order (right to left); the callee cleans up the stack.

A call to a function foo with the arguments int x and int y

foo(int x, int y)

would need to be passed on the stack as such

push y
push x

The call instruction also pushes the address of the instruction following the call to foo; this return address is the old EIP.
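The resulting stack layout can be sketched by simulating the pushes, followed by the standard prologue. All addresses are invented, and call_foo is a hypothetical helper that models the caller's pushes plus push ebp / mov ebp, esp inside foo.

```python
def call_foo(x, y, return_address=0x00401005, esp=0x0012FF00, ebp=0x0012FF40):
    """Model `push y; push x; call foo` and foo's prologue; return (ebp, memory)."""
    mem = {}

    def push(value):
        nonlocal esp
        esp -= 4
        mem[esp] = value

    push(y)                # arguments go on right to left
    push(x)
    push(return_address)   # pushed implicitly by the call instruction
    push(ebp)              # prologue: push ebp
    ebp = esp              # prologue: mov ebp, esp
    return ebp, mem


ebp, mem = call_foo(1, 2)
print("saved ebp      [ebp+0]  =", hex(mem[ebp]))
print("return address [ebp+4]  =", hex(mem[ebp + 4]))
print("x              [ebp+8]  =", mem[ebp + 8])
print("y              [ebp+12] =", mem[ebp + 12])
```

Inside foo the first argument therefore sits at ebp+8 and the second at ebp+12, just above the return address.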

Presenter Notes

Conditionals

Decisions based on comparisons

  • C/Java/Whatever has IF, ELSE, WHILE, etc.

Assembly has instructions to make comparisons

  • Cmp, test . . .

And instructions to branch

  • Conditionally executed statement to control flow

  • Jmp, Jne, Jz, je. . .
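How a comparison and a branch work together can be sketched as follows. Only the zero and carry flags are modeled, and cmp_flags and branch_taken are hypothetical helper names. cmp subtracts its operands, discards the result and keeps the flags; the conditional jumps then test those flags.

```python
def cmp_flags(a, b):
    """Flags that `cmp a, b` would set (only ZF and CF are modeled here)."""
    result = (a - b) & 0xFFFFFFFF
    return {
        "ZF": result == 0,                              # operands were equal
        "CF": (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF),      # unsigned borrow
    }


def branch_taken(mnemonic, flags):
    """Decide whether a jump is taken given the current flags."""
    rules = {
        "jmp": True,                # unconditional
        "jz": flags["ZF"],
        "je": flags["ZF"],          # same encoding as jz
        "jnz": not flags["ZF"],
        "jne": not flags["ZF"],     # same encoding as jnz
    }
    return rules[mnemonic]


flags = cmp_flags(5, 5)
print(branch_taken("je", flags))    # equal operands set ZF
print(branch_taken("jne", flags))
```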

Presenter Notes

IDA

Presenter Notes

IDA

Interactive Disassembler from Hex-Rays

Recursive descent disassembler

Also applies heuristics to find additional code that the recursive descent pass did not reach.

Uses a working database

You are not modifying an executable when you make changes

Presenter Notes

IDA Disassembly

F.L.I.R.T.

Fast Library Identification and Recognition Technology

IDA recognizes standard library functions generated by certain compilers.

IDA parses the binary using recursive descent and then applies its library signatures, naming any functions that match.

Presenter Notes

F.L.I.R.T. Example

flirt

IDA matched two winAPI functions for manipulating the registry.

Presenter Notes

Viewing Disassembly

sub_xxx

  • Subroutine.

loc_xxx

  • Code at location xxx

byte_xxx, word_xxx, dword_xxx, etc.

  • Data of that size at location xxx

arg_xx

  • Argument passed into function

var_xx

  • Local variable in a function
  • IDA tries to determine if the stack frame uses EBP

Presenter Notes

Following Control Flow

Graph and Text mode.

  • “Space” switches between them
  • In “IDA-View” window.

Functions Window

  • Shows start address

Color band/navigation band

  • Linear view of address space.
  • Dark blue is user-written code!
  • Light blue is library code recognized by FLIRT

Presenter Notes

Control Flow/Branching

On a conditional jump, such as JNZ...

Green = Branch Taken / TRUE

Red = Branch Not taken / FALSE

Blue = Unconditional

Presenter Notes

More

Imports/Exports Window

  • IDA places any imports or exports it finds into these windows.
  • Includes the virtual address or ordinal number.

Strings

  • Displays strings of a configurable minimum length and type

Names Window

  • Listing of the global names within a binary
  • Symbolic descriptions given to an address.
    • Start, _Main
  • “L” = Library
  • “F” = function

Presenter Notes

Favorite Features

Xrefs

  • “X”
  • Shows all references to a name/address/string/function, etc.

Renaming functions

  • “N”
  • The new name is displayed everywhere it is referenced

Comments

  • “:” enters a regular comment, “;” a repeatable comment

Edit function “Alt+P”

  • IDA sometimes doesn't know what the stack frame looks like; this allows you to tell it.

Presenter Notes

What you should use

Assembly is hard to learn

  • Use “Auto Comments” to make it easier.

Line Prefixes and Stack Pointer

  • Not on by default in graph view; on in text view.

Presenter Notes

What Else?

There is NO UNDO IN IDA!

  • Save database frequently (^W)
  • Git commit the database with a script

Presenter Notes

EOF and DEMO

Presenter Notes