PolyMorphism

x86 · Mar 8, 2006

Polymorphism
By Elektrique
AFAIK this is about a year old. I found it on my old HDD and thought it might be useful. I don't have its source.

Computer viruses (in this text, "virus" also covers worms) have existed
since the very early days of personal computing. One of the first
examples, "Elk Cloner," was written around 1982 for the Apple II
platform. Roughly a decade later, in 1991, came "Tequila," one of the
first polymorphic viruses to spread widely. Since the beginning,
computer viruses and detection software have evolved competitively,
creating one of the most interesting and complicated aspects of
computer science.

The standardization of the x86 platform and DOS effectively increased
the range and potential damage of viruses designed for this platform.
Without this standardization in personal computing, virus detection
would not have become necessary as a commercial industry. When
antivirus software first appeared on the market, all it had to do was
keep a copy of each known virus and scan potentially infected files
for it.
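
The scanning approach described above amounts to a plain substring
search. Here is a minimal sketch of the scanner side, assuming a small
in-memory signature table. The names `SIGNATURES` and `scan`, and the
byte patterns themselves, are made up for illustration and do not
describe any real product or virus.

```python
# Hypothetical signature table: names and byte patterns are invented
# for illustration and do not correspond to any real virus.
SIGNATURES = {
    "Example.A": bytes.fromhex("b81021cd21"),
    "Example.B": b"\xde\xad\xbe\xef",
}


def scan(data: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures found verbatim in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


if __name__ == "__main__":
    clean = b"\x90" * 16
    infected = b"\x00\x01" + SIGNATURES["Example.A"] + b"\x02"
    print(scan(clean))
    print(scan(infected))
```

A scanner like this only matches bytes that appear exactly as stored,
which is why it stops working once the stored bytes differ from copy
to copy.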

Polymorphism has changed that, making virus detection quite a chore.
Here is the basic concept behind creating a polymorphic virus:

1 - Create the actual virus, the code that does damage and provides
the ability to spread.

2 - Create an encryption and decryption program. The encryption code
will be stored encrypted along with the virus code; the decryption code
will remain executable.

3 - Carefully index the decryption algorithm, noting where code may be
injected, and which parts of the code are pointers that need to be
incremented upon code injection.

4 - Create the mechanism that injects the code and maintains the index
to the decryption algorithm at a specified spread interval.

5 - Create a library of useless code that does nothing helpful or
harmful in any way.

Let me explain what we mean by morphing the decryption code. Let's say
our decryption program in pure form starts with these instructions:
CODE

MOV AX, 2110h
INT 21
PUSH DX
MOV DX, [DI]
PUSH DX
CALL 0500h

It is possible to modify this code by adding NOOP instructions to it
so that it becomes the following (assume the program starts at 100):
CODE

JMP 0110h ; NOOP INSTRUCTION: JUMPS BEYOND THE
DB 10h, 10h, 10h, 10h, 10h, 10h, 10h ; DB SECTION TO THE 1ST REAL INSTRUCTION
MOV AX, 2110h
PUSH AX ; NOOP PAIR: DOES NOTHING USEFUL,
POP AX ; JUST WASTES PROCESSOR TIME
MOV DX, [DI]
MOV CX, 0502h ; WE AREN'T ACTUALLY USING THE CX
; REGISTER FOR ANYTHING USEFUL
ADD CX, [DI] ; ADD RANDOM DATA
PUSH DI ; SAVE DI
MOV DI, 103 ; THE POINTER TO JUNK DB SECTION
MOV [DI], CX ; MOVE GENERATED DATA TO JUNK
POP DI ; RESTORE DI
CALL 0520h ; PROCEED WITH PROGRAM (NOTICE
; THAT THE OFFSET WAS INCREMENTED
; DUE TO ADDED CODE)
Hopefully you understand how this works. The virus creator knows his
own decryption program: where junk instructions can be inserted, how
many, and which JMP / CALL targets and absolute pointers need to be
incremented.

Virus detection software has a few tricks to defeat polymorphism:

1 - Run the program in a controlled environment so that it decrypts itself.
2 - Heuristics
3 - Crack the encryption

I think number 1 is quite obvious. There are ways to defeat it. One
could write the program so that it decrypts only every tenth run, or
only when it determines that it is not running in a controlled
environment. A very intelligent virus programmer could make the latter
apply to all antivirus software by creating a buffer overrun in a
Microsoft function, or by experimenting with various antivirus
products to see how each could be individually detected.

Heuristics work in a way that is similar to spam detection software.
Virus code will almost always sit at the beginning of a program and,
much more rarely, at program exit. The detection software takes
snippets that could be virus (decryption) code and analyzes them for
NOOP instructions. Every instruction in the code either increments or
decrements the score for the scanned program.

PUSH AX followed by POP AX would send alarm signals since it is
obvious that it does absolutely nothing useful.
CODE

MOV CX, 0502h ; WE AREN'T ACTUALLY USING THE CX
; REGISTER FOR ANYTHING USEFUL
ADD CX, [DI] ; ADD RANDOM DATA
PUSH DI ; SAVE DI
MOV DI, 103 ; THE POINTER TO JUNK DB SECTION
MOV [DI], CX ; MOVE GENERATED DATA TO JUNK
POP DI ; RESTORE DI

The above code doesn't actually look like a NOOP, so heuristic
detection on very generic settings might not notice it. However,
heuristic programs are generally fine-tuned for a number of specific
viruses that use either the same or closely related polymorphism
engines (PMEs) whose NOOP codes are commonly known. The detection
software may run a heuristic analysis two or three times, using
different configurations that detect different PMEs.
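
The scoring idea can be sketched from the detector's side as follows.
This is only a rough illustration under loud assumptions: it walks raw
bytes instead of truly disassembling (so operand bytes can be misread
as opcodes), it only ever adds to the score, and the profile names,
weights and thresholds in `PROFILES` are invented. The only facts it
relies on are the standard 16-bit encodings: PUSH reg is 50h+r, POP reg
is 58h+r, and NOP is 90h.

```python
# Naive byte-level heuristic, NOT a real disassembler.
# Profile names, weights and thresholds are invented for illustration.
PROFILES = {
    "generic": {"push_pop_pair": 3, "nop": 1, "threshold": 6},
    "engine_x": {"push_pop_pair": 5, "nop": 0, "threshold": 5},  # hypothetical engine
}


def score(code: bytes, profile: dict) -> int:
    """Add up the weights of do-nothing patterns found in code."""
    total = 0
    i = 0
    while i < len(code):
        current = code[i]
        following = code[i + 1] if i + 1 < len(code) else None
        # PUSH r16 is 0x50+r and POP r16 is 0x58+r; the same r back to
        # back (e.g. PUSH AX / POP AX) leaves every register unchanged.
        if 0x50 <= current <= 0x57 and following == current + 8:
            total += profile["push_pop_pair"]
            i += 2
            continue
        if current == 0x90:  # single-byte NOP
            total += profile["nop"]
        i += 1
    return total


def suspicious_under(code: bytes, profiles=PROFILES):
    """Return the names of the profiles whose threshold the code reaches."""
    return [name for name, p in profiles.items() if score(code, p) >= p["threshold"]]
```

Running the same bytes through several profiles mirrors the two or
three passes mentioned above: a snippet that stays under the generic
threshold can still trip a profile tuned to one engine's habits.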

If it is known to the AV community that the encryption used in a
targeted virus is particularly weak, AV developers may skip heuristics
and focus on cracking the encryption using cryptanalysis techniques.
Cryptanalysis may be made inefficient and nearly impossible by using
tried and tested secure encryption algorithms with a few extra layers
of simple shifting or substitution encryption. A cryptanalysis attack
should be the least of the intelligent programmer's worries.
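
To show what cracking weak encryption can look like on the scanner
side, here is a minimal sketch. It assumes the weakest possible case,
a single-byte XOR over the whole body; the text does not name a
cipher, so that choice is purely an assumption. The function name
`xray_xor` and the signature bytes are hypothetical.

```python
def xray_xor(data: bytes, signature: bytes):
    """Try every single-byte XOR key and return the keys under which
    the known plaintext signature shows up in the decoded data."""
    hits = []
    for key in range(256):
        decoded = bytes(b ^ key for b in data)
        if signature in decoded:
            hits.append(key)
    return hits


if __name__ == "__main__":
    signature = b"EXAMPLE-SIG"  # made-up plaintext pattern
    body = b"\x00\x01" + signature + b"\x02"
    encrypted = bytes(b ^ 0x5A for b in body)
    print(xray_xor(encrypted, signature))
```

With only 256 possible keys, the search is trivial. It stops being
practical as soon as a properly designed cipher is involved.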

Conclusion: Every new virus that gets developed means new work for
antivirus software developers. Well-designed encryption algorithms and
polymorphism engines make it harder for antivirus software to catch
up. Whereas a plaintext virus could be added to the detected virus
lists in an hour, a polymorphic one might take a week. A week is
indeed enough for the virus to spread and become a global
threat/nuisance.

NOTE: Assembly code in this document is not completely accurate.