Archive for Philosophy

The Philosophical Future of Digital Immunization

Usually it's difficult for me to draw a correlation between the two primary subjects that I studied in college: computer science and philosophy. The first few things that pop into mind when attempting to relate the two are typically artificial intelligence and ethics. Lately, however, intuition has led me to ponder a direct link between modern philosophy and effective digital security.

More precisely, I've been applying the Hegelian dialectic to the contemporary signature-based approach to anti-virus and discussing the immediate results with my peers; the extended repercussions of this application are even more fascinating. Some of my thoughts on this subject were inspired by assertions made by Andrew Jaquith and Dr. Daniel Geer at the Source Boston 2008 security conference. During his keynote speech, Dr. Geer drew a beautiful analogy between the direction of digital security systems and the natural evolution of biological immune systems. Mr. Jaquith outlined the functional shortcomings of the current major anti-virus offerings. These two notions became the catalysts for the theoretical reasoning and practical applications I'm about to describe.

Hegel's dialectic is an explicit formulation of a pattern that tends to occur in progressive ideas. Bear with me here: in essence, it states that for a given action, an inverse reaction will occur, and subsequently the favorable traits of both the action and the reaction will be combined; then the process starts over. A shorter way to put it is: thesis, antithesis, synthesis. Note that an antithesis can follow a synthesis, and this is what creates the loop. The dialectic is a logical characterization of why great artists are eventually considered revolutionary despite initial ridicule for rebelling against the norm. When this dialectic is applied to anti-virus, we have: blacklist, whitelist, hybrid mixed-mode. Anti-virus signature databases are a form of blacklisting. Projects such as AFOSI's md5deep, the NIST NSRL, and Security Objectives' Pass The Hash are all whitelisting technologies.

A successful hybrid application of these remains to be seen, since the antithesis (whitelisting) is still a relatively new security technology that isn't utilized as often as it should be. A black/white-list combination that utilizes chunking for both is the next logical step for future security software. When I say hybrid mixed-mode, I don't mean running a whitelisting anti-malware tool and a traditional anti-virus in tandem, although that is an attractive option. A true synthesis would involve an entirely new solution that inherits the best of each parent approach, much as a mule inherits its parents' strength and size. The drawbacks of blacklists and whitelists are insecurity and inconvenience, respectively. These and other disadvantages are destined for mitigation by a hybridizing synthesis.
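
To make the synthesis concrete, here is a minimal Python sketch of what a mixed-mode scan loop might look like. The chunk size, the digest sets, and the verdict labels are hypothetical placeholders of my own, not a description of any shipping product:

import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size

def classify_file(path, whitelist, blacklist):
    """Hash a file in fixed-size chunks and consult both lists.

    whitelist/blacklist are sets of hex digests of known-good and
    known-bad chunks; a real system would carry far richer metadata."""
    chunk_known_good = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in blacklist:
                return "malicious"           # a blacklist hit is decisive
            chunk_known_good.append(digest in whitelist)
    if chunk_known_good and all(chunk_known_good):
        return "trusted"                     # every chunk is known-good
    return "unknown"                         # candidate for deeper analysis

A blacklist hit is treated as decisive, a file made entirely of whitelisted chunks is trusted, and everything else gets flagged for the heavier analysis that traditional engines already perform.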

The real problem with mainstream anti-virus software is that it's not stopping all of the structural variations in malware. PCs continue to contract viruses even when they're loaded with all the latest anti-virus signatures. This is analogous to a biological virus that becomes resistant to a vaccine through mutation. Signature-based matching was effective for many years, but now the total set of malicious code far outweighs legitimate code. To compensate, contemporary anti-virus has been going against Ockham's Razor, becoming ever more complex and compounding the problem as a result. It's time for the security industry to make a long-overdue about-face. Keep in mind that I'm not suggesting a wholesale abandonment of current anti-virus software; it still serves a purpose and will become part of the synthesis I describe above.

The fundamental change in the motivation for digital offensive maneuvers, from hobbyist to monetary and geopolitical, warrants a paradigm shift in how defensive countermeasures are implemented. For what it's worth, I am convinced that the aforementioned technique of whitelisting chunked hashes will be an invaluable force for securing the cloud. It will allow tailored information, metrics, and visualizations to be targeted at various domain-specific applications and verticals, for example finance, energy, government, or law enforcement, as well as the associated software inventory and asset management tasks of each. Our Clone Wars presentation featuring Pass The Hash (PTH) at Source Boston and CanSecWest will elaborate on our past few blog posts and much more. See you there!

Short-Term Memory

Sometimes I get the feeling that too many Internet users (especially the younger generation) view 1995, the beginning of the commercialized Internet, as the start of time itself. More specifically, I notice how people tend to have a short-term memory when it comes to security issues. A recent example was all the creative network exploitation scenarios that arose from the great DNS cache poisoning scare of 2008: intercepting e-mails destined for the MX of users who never actually clicked "Forgot Password," pushing out phony updates, innovative twists on spear phishing, etc. The fact of the matter is that man-in-the-middle attacks have always been a problem; cache poisoning makes them easier, but their feasibility has always been within reason. My point is that vendors should address such weaknesses before the proverbial fertilizer hits the windmill.

Too often, short-term memory is the catalyst for recurring breaches of information. Sometimes I wonder what (if anything) goes through the mind of one of those celebrities who has just had their cell phone hacked for the third time. Maybe it's something like, "Oh, those silly hackers! They've probably gotten bored by now and they'll just go away." Then I wonder how often similar thoughts enter corporate security (in)decision, which is likely why cellular carriers neglect to shield their clientele's voicemail from caller ID spoofing and other shenanigans. Nonetheless, the amusing charade that 2600 pulled on the Obama campaign for April Fools' Day was simply a case of people believing everything they read on the Internet.

Don’t get me wrong. I’ve seen some major improvements in how larger software vendors are dealing with vulnerabilities, but an overwhelming majority of their security processes are still not up to par. Short-term memory is one of those cases where wetware is the weakest link in the system.

The idea of the digital security industry using long-term memory to become more like insurance companies and less like firefighters is quite intriguing. Putting protective forethought into the equation dramatically changes the playing field. Imagine an SDLC where programmers don’t have to know how to write secure code, or even patch vulnerable code for that matter. I can say for sure that such a proposition will become reality in the not too distant future. Stay tuned…

Good grief!

Having just caught up on some of the Source Boston conference, I can't help but call out some of the musings of Andrew Jaquith. A more technical abstract can be read in Jeffrey Walton's article at The Code Project (pay special attention to Robin Hood and Friar Tuck). If anybody doubts the current trend of sophistication in malware, it is probably somebody who is currently penetrated. I've had the opportunity to devote specific analysis on occasion over the years to MAL code and its impact on the enterprise, and I know FOR SURE the level of sophistication is on the rise.

One thing I had to deal with recently: the capabilities of most desktop OSes are now so advanced that the majority of the functionality desired by MAL code is already pre-deployed. Unfortunately, this paves the way for configuration viruses and their ability to remain undetected, in that all they are is an elaborate set of configuration settings. As you can imagine, a configuration virus has the entire ability of your OS at its disposal: VPN/IPsec, self-(un)healing, remote administration, etc. The issue, then, is how you determine whether that configuration is of MAL intent; it's surely there for a reason and valid in many deployments, and it is only when the host is connected to a larger entity/botnet that the harm begins. Some random points to add, hard-learned through experience:

  • Use a native execution environment
    • VMware prevents the loading or typical operation of many MAL code variants, so samples that detect virtualization won't behave naturally.
      • I guess VM vendors have a big win here for a while, until the majority of targets are VM hosts.
  • Have an easily duplicated disk strategy
    • Mac systems are great for forensics; target disk mode and ubiquitous FireWire allow for live memory dumps and easy off-line disk analysis (without a drive carrier).
    • I'm planning a hash-tree based system to provision arbitrarily sized block checksums of clean/good files, useful for diffing out the noise on arbitrary media (memory, disk, flash); see the sketch after this list.
  • Install a Chinese translator locally
    • As you browse Chinese hack sites, you need to translate locally. (I think Russian sites are so quiet these days because they are financially driven, while the Chinese are currently motivated by nationalism.) Using a .com translation service gets detected and false content is rendered; translating locally avoids that problem.
      • Also, keep notes on lingo; there are no hacker-slang translation dictionaries yet. (I guess "code pigeon" refers to a homing pigeon; naturally, "horse/wood code" is a Trojan.)
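
Since that hash-tree system is still in the planning stage, here is only a rough Python sketch of the idea; the 4096-byte block size, the duplicate-last-node padding, and the function names are placeholder choices of mine, not the finished design:

import hashlib

BLOCK_SIZE = 4096  # placeholder block size

def block_hashes(path):
    """SHA-256 digest of each fixed-size block in a file or image."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests

def merkle_root(digests):
    """Fold block digests upward, Merkle-style, to a single root hash."""
    nodes = list(digests) or [hashlib.sha256(b"").digest()]
    while len(nodes) > 1:
        if len(nodes) % 2:                 # duplicate the last node on odd levels
            nodes.append(nodes[-1])
        nodes = [hashlib.sha256(nodes[i] + nodes[i + 1]).digest()
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

Blocks whose digests appear in the clean/good database can then be diffed out of a memory or disk image, leaving only the unexplained residue for manual review.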

Unfortunately, part of the attacker's advantage is the relatively uncoordinated fashion in which defenders operate; not being able to trust or vet your allies well enough to compare notes can be a real pain. One interesting aspect of a MAL system I recently analyzed was that it had no persistent signature. Its net-force mobility was so complete that the totality of its functionality could shift boot-to-boot; so long as it compromised a boot-up driver, it would rise again. The exalted C. Brown put it best: "Good grief!"

http://www.codeproject.com/KB/cpp/VirusProtect.aspx
http://www.sourceboston.com/blog/?p=25

Combinatoric Input Set Generation

I've been studying combinatoric methods of generating test cases for quite some time now. Most publicly available fuzz-testing packages implement fairly crude techniques for passing input values to applications, although recent research is becoming more creative in attacking the issue because of the insufficient path coverage achieved by orthodox methods. Generating input sets combinatorially is a much more holistic approach to the black-box software testing paradigm.

In this article, I'll be providing a brief overview of how set operations taken from the field of discrete mathematics can be applied to fuzz testing. Explicit definitions of the set operations can be found elsewhere, and links to Wikipedia are provided where appropriate. Brief explanations of set theory will be given here, but I will mainly focus on how it relates to software testing.

Let's start off with permutations; just about anyone with a computer science degree is familiar with these. They're the possible orderings of a sequence. A "sequence" may contain non-unique members, in contrast to a "set," whose members are unordered and unique. For example, the permutations of the sequence {1,2,3} are:

{{1,2,3},{1,3,2}, {2,1,3},{2,3,1},{3,1,2},{3,2,1}}

Easy enough. Permutations can be used when testing protocols where the order of commands is significant. For example, when testing the FTP protocol with the sequence {"STAT .","CWD ..","MKD foo"}, depending on the order in which the commands are executed, the client could be retrieving the status for and/or making the "foo" directory in either the parent or the child directory.
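
In Python, the itertools module generates these orderings directly; a fuzzer would feed each tuple to the target in turn:

from itertools import permutations

commands = ["STAT .", "CWD ..", "MKD foo"]
for ordering in permutations(commands):
    print(ordering)   # each of the 3! = 6 tuples is one candidate command sequence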

Fixed-size subsets (also known as combinations) are also quite useful and are expressed using what's called "n-choose-r" notation, because r elements are being chosen from a set that has n elements in total (the set is said to have cardinality n). Combinations can be utilized to enumerate command-line argument possibilities, since the order of argv values is usually irrelevant. Say an executable has the command-line flags "-n", "-i", "-q", and "-v", and we want to generate all subsets of cardinality 2 (4-choose-2). The answer is:

{{"-n","-i"},{"-n","-q"},{"-n","-v"},

{"-i","-q"},{"-i","-v"},{"-q","-v"}}
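
itertools.combinations reproduces this 4-choose-2 enumeration:

from itertools import combinations

flags = ["-n", "-i", "-q", "-v"]
print(list(combinations(flags, 2)))
# [('-n', '-i'), ('-n', '-q'), ('-n', '-v'), ('-i', '-q'), ('-i', '-v'), ('-q', '-v')]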

r-permutations are similar to combinations in that a given number of objects is chosen from the set, but since permutations are represented as sequences, the order of the elements matters. Using the same baseline sequence as the full permutations described earlier, the 2-permutations of {1,2,3} are as follows:

{{1,2},{2,1},{1,3},{3,1},{2,3},{3,2}}

Again, observe how these orderings would affect the logic of network protocol commands processed by a server daemon.
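
The same itertools call accepts an optional length argument for r-permutations:

from itertools import permutations

print(list(permutations([1, 2, 3], 2)))
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]  -- P(3,2) = 6 orderings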

Moving on, a power set is the set of all subsets including the empty set. Using the example above, the power set of {“-n”,”-i”,”-q”,”-v”} looks like this:

{{},{"-n"},{"-i"},{"-q"},{"-v"},{"-n","-i"},{"-n","-q"},

{"-n","-v"},{"-i","-q"},{"-i","-v"},{"-q","-v"},

{"-n","-i","-q"},{"-n","-i","-v"},{"-n","-q","-v"},

{"-i","-q","-v"},{"-n","-i","-q","-v"}}

The accuracy of the power set calculation can be checked because power sets have a cardinality of 2**n; in this case, 2**4 = 16. Other set operations can be checked with similar formulas.
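
The power set can be built by chaining together the k-combinations for every k from 0 to n, and the 2**n identity doubles as an assertion in code:

from itertools import chain, combinations

def power_set(items):
    """Every subset of items, from the empty set up to the full set."""
    return list(chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

subsets = power_set(["-n", "-i", "-q", "-v"])
assert len(subsets) == 2 ** 4   # 16 subsets, as computed above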

Taking it another step further, consider giving that executable whose command line flags are being generated environment variable input as well. Suppose that the power set for the command line options is:

{{},{"-n"},{"-i"},{"-n","-i"}}

and the analogous set of environment variable assignments is (each variable either unset or bound to a single value, giving 3 × 3 = 9 possibilities, so not strictly a power set):

{{},{"TERM=vt100"},{"TERM=%n%n%n"},{"LOGIN=root"},

{"LOGIN=%n%n%n"},{"TERM=vt100","LOGIN=root"},{"TERM=vt100","LOGIN=%n%n%n"},

{"TERM=%n%n%n","LOGIN=root"},{"TERM=%n%n%n","LOGIN=%n%n%n"}}

The environment variable names and values were paired up using a set operation known as the Cartesian product, i.e., the Cartesian product of the sets {"TERM"} and {"vt100","%n%n%n"} is:

{{"TERM","vt100"},{"TERM","%n%n%n"}}
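
itertools.product performs this pairing directly:

from itertools import product

print(list(product(["TERM"], ["vt100", "%n%n%n"])))
# [('TERM', 'vt100'), ('TERM', '%n%n%n')]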

Taking further advantage of Cartesian products, all possible pairings of command-line flag subsets and environment value subsets can be generated. I won't type the whole thing out here, as it is excruciatingly long, but it starts out something like this:

{{{},{}},{{},{"TERM=vt100"}},{{},{"TERM=%n%n%n"}},

{{},{"LOGIN=root"}}, … , {{"-n","-i"},{"TERM=%n%n%n","LOGIN=root"}},

{{"-n","-i"},{"TERM=%n%n%n","LOGIN=%n%n%n"}}}

I put the ellipsis in there because, well, you get the idea! This continues until every subset of command-line flags has been paired with every consistent assignment of environment variable names and values. Incidentally, the number of subsets of each cardinality within a power set follows the binomial coefficients, i.e., a row of Pascal's triangle.
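
In code, the full pairing is one more product over the two families of subsets; the tuples below abbreviate the sets listed above to keep the sketch short:

from itertools import product

flag_sets = [(), ("-n",), ("-i",), ("-n", "-i")]
env_sets = [(), ("TERM=vt100",), ("TERM=%n%n%n",), ("LOGIN=root",)]  # abbreviated
test_cases = list(product(flag_sets, env_sets))
print(len(test_cases))   # 4 * 4 = 16 (flag set, environment set) pairs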

The aforementioned set operations can be used to systematically exercise simple computer programs via deterministic testing; for sufficiently small input domains, this amounts to a proof of correct behavior. Modern computer programs are so complex that attempting to enumerate all possible input scenarios would be infeasible on a silicon-based machine. In the future, I expect it will be commonplace for quantum molecular systems or perhaps even DNA computers to solve software assurance problems (and many others) in constant time or O(1), but I digress.

Since silicon will be prevalent for the foreseeable future, the input space used for a real-world software test has to be reduced. The goal is to minimize test executions while maximizing path coverage, so disparate input sets must be chosen. This is where "pseudo-exhaustive", "n-way", or more recently "6-way" testing comes into practice. More information about input space reduction and pseudo-exhaustive testing can be found in Rick Kuhn's research at NIST.
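
To see why reduced, t-way coverage is attractive, here is a small counting sketch; the parameter names and domains are made up for illustration. It compares the exhaustive Cartesian product against the number of value pairs that any 2-way (pairwise) suite must cover; it is a counting exercise, not a covering-array construction:

from itertools import combinations

parameters = {
    "flag":  ["-n", "-i", "-q", "-v"],
    "TERM":  ["vt100", "%n%n%n"],
    "LOGIN": ["root", "%n%n%n"],
}

# Exhaustive testing: the full Cartesian product of all domains.
exhaustive = 1
for values in parameters.values():
    exhaustive *= len(values)              # 4 * 2 * 2 = 16 test cases

# Pairwise coverage: every pair of values from every pair of parameters
# must appear together in at least one test.
pairs_to_cover = sum(
    len(a_vals) * len(b_vals)
    for (_, a_vals), (_, b_vals) in combinations(parameters.items(), 2))

print(exhaustive, pairs_to_cover)          # 16 exhaustive runs vs. 20 pairs;
                                           # a covering array hits all 20 in as few as 8 runs

The savings look modest with three parameters, but the exhaustive product grows multiplicatively with each new parameter while a pairwise covering array grows only logarithmically, which is the whole appeal of pseudo-exhaustive testing.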
