Abstract:This lecture discusses viruses, worms, and Trojans and related terminology, the structure of these, and how virus removal programs work.
Slides: virusPM.pptx • Stuxnet 2011
"Unix. The world's first computer virus."
Title of Chapter 1 of The Unix Haters Handbook, ISBN: 1-56884-203-1
The above book is in fact written by serious computer scientists. Nevertheless, we must disregard the suggestion that Unix is a virus as an attempt at being hilarious. Equally unhelpful are the news media that use the term virus in referring to any piece of malicious software. The academic world uses the term "malware" for these. The purpose, not the approach, makes a program malicious. Many of the programs that fall in the malware categories have benevolent uses. For example, worms can be used to distribute computation on idle processors; back doors are useful for debugging programs; and viruses can be written to update source code and patch bugs. Rigorous definitions have been given by many computer security experts but they do not match the typical use even by other security experts. Thus, we must settle for "practical definitions" of malicious software.
A stealth virus has code in it that seeks to conceal itself from discovery or defends itself against attempts to analyze or remove it. The stealth virus adds itself to a file or boot sector but, when you examine, it appears normal and unchanged. The stealth virus performs this trickery by staying in memory after it is executed. From there, it monitors and intercepts your system calls. When the system seeks to open an infected file, the stealth virus displays the uninfected version, thus hiding itself.
Macro languages are (often) equal in power to ordinary programming languages such as C. A program written in a macro language is interpreted by the application. Macro languages are conceptually no different from so-called scripting languages. Gnu Emacs uses Lisp, most Microsoft applications use Visual Basic Script as macro languages. The typical use of a macro in applications, such as MS Word, is to extend the features of the application. Some of these macros, known as auto-execute macros, are executed in response to some event, such as opening a file, closing a file, starting an application, and even pressing a certain key. A macro virus is a piece of self-replicating code inserted into an auto-execute macro. Once a macro is running, it copies itself to other documents, delete files, etc. Another type of hazardous macro is one named for an existing command of the application. For example, if a macro named FileSave exists in the "normal.dot" template of MS Word, that macro is executed whenever you choose the Save command on the File menu. Unfortunately, there is often no way to disable such features.
In May 2000, an OutLook mail program macro virus called LOVELETTER propagated widely.
The most famous of the security incidents in the last decade was the Internet Worm incident which began from a Unix system. But Unix systems were considered virus-immune -- not so. Several Linux viruses have been discovered. The Staog virus first appeared in 1996 and was written in assembly language by the VLAD virus writing group, the same group responsible for creating the first Windows 95 virus called Boza.
Like the Boza virus, the Staog virus is a proof-of-concept virus to demonstrate the potential of Linux virus writing without actually causing any real damage. Still, with the Staog assembly language source code floating around the Internet, other virus writers are likely to study and modify the code to create new strains of Linux viruses in the future.
The second known Linux virus is called the Bliss virus. Unlike the Staog virus, the Bliss virus can not only spread in the wild, but also possesses a potentially dangerous payload that could wipe out data.
While neither virus is a serious threat to Linux systems, Linux and other Unix systems will not remain virus-free. Fortunately, Linux virus writing is more difficult than macro virus writing for Windows, so the greatest virus threat still remains with Windows. [July 2000, http://www.boardwatch .com/ mag/ 2000/ jul/ bwm142pg2.html ]
Whereas a Trojan horse is delivered pre-built, a virus infects. In the past, such malicious programs arrived via tapes and disks, and the spread of a virus around the world took many months. Antivirus companies had time to identify a new viral strain, and create cleaning procedures. Today, Trojan horses, and viruses are network deliverable as E-mail, Java applets, ActiveX controls, JavaScripted pages, CGI-BIN scripts, or as self-extracting packages.
Integrated mail systems such as Microsoft Outlook make it very simple to send not only a quick note edited within a limited text editor but also previously composed computer documents of arbitrary complexity to anyone, and to work with objects that you receive via standards such as MIME. They also support application programming interfaces (such as MAPI) that allow programs to send and process mail automatically. Well over 500 million E-mail messages are delivered daily in July 2000.
Mobile-program systems are becoming more and more widespread. The most widely-hyped examples today are Java and ActiveX. This technology became popular with Web servers and browsers, but it is now integrated (e.g., Java into Lotus Notes, and ActiveX into Outlook) mail systems. Both Java and ActiveX have been found to have security bugs.
Here is a simple structure of a virus. In the infected binary, at a known byte location in the file, a virus inserts a signature byte used to determine if a potential carrier program has been previously infected.
V() { infectExecutable(); if (triggered()) { doDamage(); } jump to main of infected program; } void infectExecutable() { file = chose an uninfected executable file; prepend V to file; } void doDamage() { ... } int triggered() { return (some test? 1 : 0); }
The above virus makes the infected file longer than it was, making it easy to spot. There are many techniques to leave the file length and even a check sum unchanged and yet infect. For example, many executable files often contain long sequences of zero bytes, which can be replaced by the virus and re-generated. It is also possible to compress the original executable code like the typical Zip programs do, and uncompress before execution and pad with bytes so that the check sum comes out to be what it was.
Known viruses are by far the most common security problem on modern computer systems. Several web sites maintain complete lists of known viruses. There are thousands. Visit, e.g., www.cai.com/ virusinfo/ encyclopedia/. In the month of July 2000, there were 200+ "PC Viruses in the Wild" (www. wildlist. org). Virus detection programs analyze a suspect program for the presence of known viruses.
Fred Cohen has proven mathematically that perfect detection of unknown viruses is impossible: no program can look at other programs and say either "a virus is present" or "no virus is present", and always be correct. But, in the real world, most new viruses are sufficiently like old viruses that the same sort of scanning that finds known viruses also finds the new ones. And there are a large number of heuristic tricks that anti-virus programs use to detect new viruses, based either on how they look, or what they do. These heuristics are only sometimes successful, but since brand-new viruses are comparatively rare, they are sufficient to the purpose.
Virus scanners are sometimes classified by their "generation." The first generation virus scanners used previously obtained a virus signature, a bit pattern, to detect a known virus. They record and check the length of all executables. The second generation scans executables with heuristic rules, looking, e.g., for fragments of code associated with a typical virus. They also do integrity checking by calculating a checksum of a program and storing somewhere else the encrypted checksum. The third generation use a memory resident program to monitor the execution behavior of programs to identify a virus by the types of action that the virus takes. The fourth Generation Virus Detection combines all previous approaches and includes access control capabilities.
It is very educational to study the details of a scanner. The paper by Sandeep Kumar, and Gene Spafford, "A Generic Virus Scanner in C++," Proceedings of the 8th Computer Security Applications Conference, IEEEPress, Piscataway, NJ; pp.210-219, 2-4 Dec 1992 [Local copy .pdf] is Required Reading.
The Morris worm has been extensively analyzed as it was perhaps the first worm to use Internet to spread out. Of course, the Internet of 1988 was only few thousand nodes. Robert Tappan Morris was convicted and sentenced to three years of probation, 400 hours of community service, a fine of $10,050, etc.
Popular media often labelled the StuxNet of 2010 as a virus, but it is a worm. There is a TED talk (Ralph Langner, "Cracking Stuxnet, a 21st-Century Cyber Weapon", Mar 31, 2011). Read also: http://securitywatch.pcmag.com/none/296603-report-stuxnet-worm-was-planted-by-an-iranian-secret-agent Apr 13, 2012. http://midsizeinsider.com/en-us/article/researchers-release-stuxnet-like-exploit Researchers Release Stuxnet-Like Exploits on Metasploit http://abcnews.go.com/blogs/headlines/2012/03/new-version-of-stuxnet-related-cyber-weapon-discovered/ Mar 23, 2012.
These lecture materials are gleaned from many sources. All are presented after careful reading. In some cases, I may have neglected proper attribution. I assure the reader it is not because I claim authorship. Indeed, in the lectures there is hardly any thing new that I have contributed. Suggestions for improvement are always welcome.