Archive for Systems Theory

Changed-Rooted Jail Hackery Part 2

It’s been a while, but that doesn’t necessarily mean it was vaporware! 😉 As promised in Part 1, the system call tool that mimics GNU fileutils commands is in the code listing below. Support for additional commands is welcome; if anybody adds more, feel free to e-mail your source code. Extension should be fairly straightforward given the “if(){} else if(){} else{}” template: simply add another else-if block with the appropriate command line argument parsing. It’s too bad you can’t really do closures in C, but a likely approach to increasing this tool’s modularity is the use of function pointers (a rough sketch of that idea follows). Of course, new commands don’t have to come from GNU fileutils; mixing and matching Linux system calls in C offers limitless possibilities.
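To give a rough idea of what I mean, here’s a little stand-alone sketch of the function pointer approach. To be clear, this is not part of syscaller; the command table, the lookup() helper, and the cmd_rmdir() stub are hypothetical names I made up for illustration, and a real handler would issue the corresponding syscall() just like the listing further down.

/* Hypothetical sketch (not part of syscaller): dispatch subcommands through
 * a table of function pointers instead of a growing if/else chain. */
#include<stdio.h>
#include<string.h>

typedef int (*cmd_fn)(int argc, char **argv);

static int cmd_rmdir(int argc, char **argv)
{
  (void)argc;
  /* a real handler would do: return syscall(SYS_rmdir, argv[2]); */
  printf("would rmdir %s\n", argv[2]);
  return 0;
}

static const struct { const char *name; int min_argc; cmd_fn fn; } cmds[] = {
  { "rmdir", 3, cmd_rmdir }
  /* add new commands here, one row each */
};

static cmd_fn lookup(const char *name, int argc)
{
  size_t i;

  for(i = 0;i < sizeof cmds / sizeof *cmds;++i)
    if(!strcmp(cmds[i].name, name) && argc >= cmds[i].min_argc)
      return cmds[i].fn;

  return NULL;
}

int main(int argc, char *argv[])
{
  cmd_fn fn = argc < 2 ? NULL : lookup(argv[1], argc);

  if(!fn)
  {
    fprintf(stderr, "usage: %s command args\n", *argv);
    return 1;
  }

  return fn(argc, argv);
}

Each new command then becomes one row in the table plus one small handler function, instead of another branch in a long conditional chain.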

Speaking of GNU, I stumbled across an extremely useful GNU project called parallel. Essentially, it’s a parallelized version of xargs(1p). I’ve been including it in a lot of bash scripts I’ve written recently. It doesn’t seem to be part of the default install for any operating system distribution yet; maybe when it evolves into something even more awesome it’ll become mainstream. 🙂 Surprisingly, I was even able to compile it on SUA/Interix without any problems. The only complaint I have about it is that it’s written in Perl (not that I have anything against Perl); I simply feel that the parallelization could be that much faster if written in C. Maybe I’ll perlcc(1) it or something. Okay then, without further ado, here’s the code for syscaller:

/*
 * syscaller v0.8a - breaking out of chroot jails "ex nihilo"
 *
 * by Derek Callaway <decal@security-objectives.com>
 *
 *
 * Executes system calls instead of relying on programs from the
 * GNU/Linux binutils package. Can be useful for breaking out of
 * a chroot() jail.
 *
 * compile: gcc -O2 -o syscaller syscaller.c -Wall -ansi -pedantic
 * copy: cat syscaller | ssh user@host.dom 'cat > syscaller'
 *
 * If the cat binary isn't present in the jail, you'll have to be more
 * creative and use a shell builtin like echo (i.e. not the echo binary,
 * but bash's internal implementation of it.)
 *
 * Without any locally accessible file download programs such as:
 * scp, tftp, netcat, sftp, wget, curl, rz/sz, kermit, lynx, etc.
 * You'll have to create the binary on the target system manually.
 * i.e. by echo'ing hexadecimal bytecode. This is left as an exercise
 * to the reader.
 *
 */

/* Feature test macros must come before any system headers. */
#define _GNU_SOURCE 1
#define _USE_MISC 1

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>

#include<unistd.h>
#include<sys/syscall.h>
#include<sys/types.h>
#include<pwd.h>
#include<grp.h>

/* syscall() itself is declared in unistd.h when _GNU_SOURCE is defined. */

/* Shell to exec after a chdir so the new working directory is actually usable */
#define SHELL_PATHNAME "/bin/sh"

static void usage(char **argv)
{
  printf("usage: %s syscall arg1 [arg2 [...]]\n", *argv);
  printf("help:  %s help\n", *argv);
  exit(EXIT_FAILURE);
}

static void help(char **argv)
{
  puts("syscaller v0.8a");
  puts("=-=-=-=-=-=-=-=");
  puts("");
  puts("SYSCALLER COMMANDS");
  puts("");
  puts("chmod mode pathname");
  puts("chdir pathname");
  puts("chown user group pathname");
  puts("mkdir pathname mode");
  puts("rmdir pathname");
  puts("touch pathname mode");
  puts("");

  puts("Note: modes are in octal format (symbolic modes are unsupported)");
  puts("Note: some commands mask octal mode bits with the current umask value");
  puts("Note: creat is an alias for touch");
  puts("");
  puts("USEFUL SHELL BUILTINS");
  puts("");
  puts("ls -a / (via brace/pathname expansion): echo /{.*,*}");

  exit(EXIT_SUCCESS);
}

int main(int argc, char *argv[])
{
  register char *p = 0;
  signed auto int r = 1;

  if(argc < 2)
    usage(argv);

  /* I prefer to avoid strcasecmp() since it's not really standard C. */
  for(p = argv[1];*p;++p)
    *p = tolower(*p);

  do
  {
    if(!strcmp(argv[1], "chmod") && argc >= 4)
    {
      /* parse the octal mode string (e.g. 0644) into an integer */
      const mode_t m = strtol(argv[2], NULL, 8);

      r = syscall(SYS_chmod, argv[3], m);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %s, %d) => %d\n", SYS_chmod, argv[3], m, r);
#endif
    }
    else if((!strcmp(argv[1], "chdir") || !strcmp(argv[1], "cd")) && argc >= 3)
    {
      static char *const av[] = {SHELL_PATHNAME, NULL};
      auto signed int r2 = 0;

      r = syscall(SYS_chdir, argv[2]);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %s) => %d\n", SYS_chdir, argv[2], r);
#endif

      /* This is required because the new current working directory isn't
       * bound to the original login shell. */
      printf("[%s] exec'ing new shell in directory: %s\n", *argv, argv[2]);
      r2 = system(av[0]);
      printf("[%s] leaving shell in child process\n", *argv);

      if(r2 < 0)
        r = r2;
    }
    else if(!strcmp(argv[1], "chown") && argc >= 5)
    {
      struct passwd *u = NULL;
      struct group *g = NULL;

      if(!(u = getpwnam(argv[2])))
        break;

#ifdef DEBUG
  fprintf(stderr, "getpwnam(%s) => %s:%s:%d:%d:%s:%s:%s\n", argv[2], u->pw_name, u->pw_passwd, u->pw_uid, u->pw_gid, u->pw_gecos, u->pw_dir, u->pw_shell);
#endif

      if(!(g = getgrnam(argv[3])))
        break;

#ifdef DEBUG
  fprintf(stderr, "getgrnam(%s) => %s:%s:%d:", argv[3], g->gr_name, g->gr_passwd, g->gr_gid);

  {
    /* gr_mem is a NULL-terminated array of member name strings */
    char **mp = g->gr_mem;

    while(*mp)
    {
      fputs(*mp, stderr);

      ++mp;

      if(*mp)
        fputc(',', stderr);
    }
  }

  fputc('\n', stderr);
#endif

      r = syscall(SYS_chown, argv[4], u->pw_uid, g->gr_gid);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %s, %d, %d) => %d\n", SYS_chown, argv[4], u->pw_uid, g->gr_gid, r);
#endif
    }
    else if((!strcmp(argv[1], "creat") || !strcmp(argv[1], "touch")) && argc >= 4 )
    {
      const mode_t m = strtol(argv[3], NULL, 8);

      r = syscall(SYS_creat, argv[2], m);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %S, %d) => %d\n", SYS_creat, argv[2], m, r);
#endif
    }
    else if(!strcmp(argv[1], "mkdir") && argc >= 4)
    {
      const mode_t m = strtol(argv[3], NULL, 8);

      r = syscall(SYS_mkdir, argv[2], m);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %S, %d) => %d\n", SYS_mkdir, argv[2], m, r);
#endif
    }
    else if(!strcmp(argv[1], "rmdir") && argc >= 3)
    {
      r = syscall(SYS_rmdir, argv[2]);

#ifdef DEBUG
  fprintf(stderr, "syscall(%d, %S) => %d\n", SYS_rmdir, argv[2], r);
#endif
    }
    else if(!strcmp(argv[1], "help"))
      help(argv);
    else
      usage(argv);

    break;
  } while(1);

  if(r)
    perror(argv[1]);

  exit(r);
}

Please note that some lines of code in this article may appear truncated due to how WordPress’s CSS renders the text; you’ll still get every statement in its entirety when you copy it to your clipboard. The next specimen is similar to the netstat-emulating shell script from Part 1. It loops through the procfs PID directories and parses their contents to make it look like you’re running the actual /bin/ps, even though you’re inside a misconfigured root directory that doesn’t have that binary. It also defines some useful aliases and a simple version of uptime(1).

#!/bin/bash
# ps.bash by Derek Callaway decal@security-objectives.com
# Sun Sep  5 15:37:05 EDT 2010 DC/SO

alias uname='cat /proc/version' hostname='cat /proc/sys/kernel/hostname'
alias domainname='cat /proc/sys/kernel/domainname' vim='vi'

function uptime() {
  declare loadavg=$(cut -d' ' -f1-3 /proc/loadavg)
  declare uptime=$(( $(awk 'BEGIN {FS="."} {print $1}' /proc/uptime) / 60 / 60 / 24 ))
  echo "up $uptime day(s), load average: $loadavg"
}

function ps() {
    local file base pid st user ppid uid
    echo 'S USER     UID   PID  PPID CMD'
    for file in /proc/[0-9]*/status
        do base=${file%/status} pid=${base#/proc/}
        # State, PPid, and Uid appear in that order in /proc/PID/status;
        # the real UID is the second field of the Uid: line
        { read _ st _; read _ ppid; read _ uid _; } < <(egrep '^(State|PPid|Uid):' "$file")
        IFS=':' read user _ < <(getent passwd "$uid") || user=$uid
        # cmdline is NUL-delimited, so turn the NULs into spaces
        printf "%1s %-6s %5d %5d %5d %s\n" "$st" "$user" "$uid" "$pid" "$ppid" "$(tr '\0' ' ' < "$base/cmdline")"
    done
}

#EOF#


Jenny’s Got a Perfect Pair of..

binomial coefficients

..binomial coefficients?! That’s right. I’ve found the web site of a Mr. Bob Jenkins with an entire page dedicated to a pairwise covering array generator named jenny.c. I’m fairly sure that only the most hardcore of the software testing weenies have some notion of what those are, so for the sake of being succinct I’ll provide my own explanation here: a pairwise covering array generator is a program for silicon computing machines that deduces sequences of input value possibilities for the purposes of software testing. And yes, I did say silicon computers: since testing their software is really a question of the great Mr. Turing’s halting problem, the existence of a practical, affordable, and efficient nano/molecular computing device such as a DNA computer, Feynman machine, universal quantum computer, etc. would essentially predicate a swift solution to the problem of testing contemporary computer software in non-deterministic polynomial time. The only problem we would have then is how to test those fantastic, futuristic, seemingly science-fictional yet wondrous problem-solving inventions as they break through laborious barriers of algorithmic complexity that twentieth-century computer scientists could only have dreamed about: PCP, #P, PSPACE-complete, 2-EXPTIME, and beyond.. The stuff that dreams are made of.

Now, let’s return to Earth and learn about a few things that make Jenny so special. Computer scientists learned early on in their studies of software testing that pairwise test cases, i.e. test cases exercising two input values in combination, were the most likely to uncover erroneous programming or “bugs.” Forget the luxury of automation for a minute: old school programmers typed input pairs manually to test their own software. Code tested in that manner was most likely some sort of special-purpose console mode utility (Celsius to Fahrenheit, anyone?). As the computing power of the desktop PC increased according to Moore’s law, it became time-effective to write a simple program to generate these input pairs instead of toiling over it yourself; I suppose not testing at all was another option. Even today, some software is released to market after only very minor functional and/or quality assurance testing. Regression, stress, security, and other forms of testing cost money and reduce time to market, but in reality the return on investment acts as a hedge against losses incurred later; even ephemeral losses justify the necessity of these expenditures.

A Jenny built in modern times undoubtedly has the power to deductively prove that a software product of the eighties is composed of components (or units) that are fundamentally error-free. However, the paradox remains that improvements in automated software testers share a linear relationship with improvements in software in general. Thus, pairwise has become “n-way,” which describes the process of utilizing greater multiples of input values in order to cover acceptable numbers of test cases. The number of combinations to be covered in this fashion grows exponentially and can be calculated as a binomial coefficient (see the formula below).

(n choose r) in factorial terms: C(n, r) = n! / (r! (n - r)!)
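For the curious, here’s a tiny stand-alone C snippet (my own illustration, nothing to do with Jenny’s internals) that evaluates the coefficient iteratively; even modest numbers show the blow-up, e.g. 6-way interactions among just 20 parameters already means 38,760 parameter combinations to cover.

/* Illustrative only: evaluate the binomial coefficient C(n, r) iteratively,
 * keeping the intermediate result integral at every step. */
#include<stdio.h>

static unsigned long long choose(unsigned n, unsigned r)
{
  unsigned long long result = 1;
  unsigned i;

  if(r > n)
    return 0;

  if(r > n - r)
    r = n - r; /* C(n, r) == C(n, n - r); keeps the loop short */

  for(i = 1;i <= r;++i)
    result = result * (n - r + i) / i; /* exact division at each step */

  return result;
}

int main(void)
{
  /* e.g. 6-way interactions among 20 parameters */
  printf("C(20, 6) = %llu\n", choose(20, 6));
  return 0;
}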

According to Paul Black, former SAMATE (Software Assurance Metrics And Tool Evaluation) project leader, researchers at NIST (notably Rick Kuhn and Dolores Wallace) have pegged 6-way as the magic number for optimal fault interaction coverage. This conclusion is based on hard evidence from studies of real-world software, including medical devices and the aerospace industry. However, it would not surprise me to see this figure rise significantly in the coming decades, just as the paradoxical relationship between general-purpose software and automated software testing programs shifts in accordance with Moore’s law. If not by Moore, then by some other axiom of metric progression, such as Rogers’ bell curve of technological adoption.

I’ve also got a hunch that the tiny percentage of bugs in that “n is arbitrarily greater than 6” range includes some of the most critical, powerfully impacting software vulnerabilities known to man. They lie on an attack surface that’s almost non-existent, which makes them by definition obscure, non-obvious, shadowy, and hidden. Vulnerabilities in this category are the most important by their very nature. Therefore, detecting them will require people and tools that are masters of marksmanship and artistic in their innovation. Research in this area is just getting underway in earnest, especially within the realms of dynamic instrumentation or binary steering, active analysis, fault propagation, higher-order preconditions/dependencies, concurrency issues, race conditions, etc. I believe that combining the merits inherent in various analysis techniques will lead to perfection in software testing.

For perfection in hashing, check out GNU’s gperf, read how Bob used a perfect hashing technique to augment Jenny’s n-tuples; then get ready for our Big ßeta release of the BlockWatch client software (just in time for the holiday season!)


The Philosophical Future of Digital Immunization

Usually it’s difficult for me to make a correlation between the two primary subjects I studied in college: computer science and philosophy. The first few things that pop into mind when attempting to relate the two are typically artificial intelligence and ethics. Lately, intuition has caused me to ponder a direct link between modern philosophy and effective digital security.

More precisely, I’ve been applying the Hegelian dialectic to the contemporary signature-based approach to anti-virus while pontificating with my peers on the immediate results; the extended repercussions of this application are even more fascinating. Some of my thoughts on this subject were inspired by assertions of Andrew Jacquith and Dr. Daniel Geer at the Source Boston 2008 security conference. Dr. Geer painted a beautiful analogy between the direction of digital security systems and the natural evolution of biological immune systems during his keynote speech, and Mr. Jacquith laid out the current functional downfalls of major anti-virus offerings. These two notions became the catalysts for the theoretical reasoning and practical applications I’m about to describe.

Hegel’s dialectic is an explicit formulation of a pattern that tends to occur in progressive ideas. Now bear with me here: in essence, it states that for a given action, an inverse reaction will occur, and subsequently the favorable traits of both the action and the reaction will be combined; then the process starts over. A shorter way to put it is: thesis, antithesis, synthesis. Note that an antithesis can follow a synthesis, and this is what creates the loop. This dialectic is a logical characterization of why great artists are eventually considered revolutionary despite initial ridicule for rebelling against the norm. When this dialectic is applied to anti-virus, we have: blacklist, whitelist, hybrid mixed-mode. Anti-virus signature databases are a form of blacklisting. Projects such as AFOSI md5deep, NIST NSRL, and Security Objectives Pass The Hash are all whitelisting technologies.

A successful hybrid application of these remains to be seen, since the antithesis (whitelisting) is still a relatively new security technology that isn’t utilized as often as it should be. A black/white-list combo that utilizes chunking for both is the next logical step for future security software. When I say hybrid mixed-mode, I don’t mean running a whitelisting anti-malware tool and traditional anti-virus in tandem, although that is an attractive option. A true synthesis would involve an entirely new solution that inherits the best of each parent approach, similar to a mule’s strength and size. The drawbacks of blacklists and whitelists are insecurity and inconvenience, respectively. These and other disadvantages are destined for mitigation by a hybridizing synthesis.
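To make the chunking notion a little more concrete, here’s a deliberately minimal sketch of the idea: hash fixed-size chunks of a file and look each digest up in a known-good set. The FNV-1a hash and the stub whitelist array are stand-ins I chose for brevity; they’re not meant to reflect the internals of Pass The Hash or any other product.

/* Hedged sketch of chunked whitelisting: hash fixed-size chunks of a file
 * and check each digest against a known-good list. FNV-1a stands in for a
 * real cryptographic hash and the whitelist here is just a stub array. */
#include<stdio.h>
#include<stdint.h>

#define CHUNK_SIZE 4096

static uint64_t fnv1a(const unsigned char *buf, size_t len)
{
  uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
  size_t i;

  for(i = 0;i < len;++i)
  {
    h ^= buf[i];
    h *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
  }

  return h;
}

/* stub whitelist: in practice this would be a large indexed database */
static const uint64_t whitelist[] = { 0xcbf29ce484222325ULL };

static int is_whitelisted(uint64_t digest)
{
  size_t i;

  for(i = 0;i < sizeof whitelist / sizeof *whitelist;++i)
    if(whitelist[i] == digest)
      return 1;

  return 0;
}

int main(int argc, char *argv[])
{
  unsigned char chunk[CHUNK_SIZE];
  size_t n, idx = 0;
  FILE *fp;

  if(argc < 2 || !(fp = fopen(argv[1], "rb")))
  {
    fprintf(stderr, "usage: %s file\n", argv[0]);
    return 1;
  }

  while((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
  {
    uint64_t d = fnv1a(chunk, n);

    printf("chunk %zu: %016llx %s\n", idx++, (unsigned long long)d,
           is_whitelisted(d) ? "known-good" : "unknown");
  }

  fclose(fp);
  return 0;
}

The appeal of chunk-level digests is that a single modified region only invalidates the chunks it touches, rather than the whole-file hash.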

The real problem with mainstream anti-virus software is that it’s not stopping all of the structural variations in malware. PCs continue to contract virii even when they’re loaded with all the latest anti-virus signatures. This is analogous to a biological virus that becomes resistant to a vaccine through mutation. Signature-based matching was effective for many years, but now the total set of malicious code far outweighs legitimate code. To compensate, contemporary anti-virus has been going against Ockham’s Razor, becoming ever more complex and compounding the problem as a result. It’s time for the security industry to make a long overdue about-face. Keep in mind that I’m not suggesting we abandon current anti-virus software; it does serve a purpose and will become part of the synthesis I described above.

The fundamental change in motivation for digital offensive maneuvers, from hobbyist to monetary and geopolitical, warrants a paradigm shift in defensive countermeasure implementation. For what it’s worth, I am convinced that the aforementioned technique of whitelisting chunked hashes will be an invaluable force for securing the cloud. It will allow tailored information, metrics, and visualizations to be targeted towards various domain-specific applications and verticals, for example finance, energy, government, or law enforcement, as well as the associated software inventory and asset management tasks of each. Our Clone Wars presentation featuring Pass The Hash (PTH) at Source Boston and CanSecWest will elaborate on our past few blog posts and much more.. See you there!


Breaking Vegas Online

We recently published an advisory for PartyPoker, an online gambling site (SECOBJADV-2008-03). It covered a weakness in the client update process, a class of vulnerability that can affect many kinds of software. The past few years have also seen some vulnerabilities that are specific to online gaming software. Statically seeded random number generators that allow prediction of forthcoming cards and of reel values on upcoming slot spins were researched in the early days of online gaming; let’s take a look at some additional threats.

Usually, forms of online cheating are pretty primitive. Justin Bonomo was exposed for using multiple accounts in a single tournament on PokerStars and of course collusion between multiple players occurs as well. Absolute Poker’s reputation took a pretty big hit when players discovered that a site owner used a backdoor to view cards in play. Many private and public bots are also in use. However, a good human poker player will beat a bot, especially in no-limit which is less mathematical than other variations of the game; bots are likely to be most useful in low-stakes fixed-limit games.

Earlier this year, a logic flaw was exploited on BetFair (oh, the pun!) because of a missing conditional check to test for chip stack equality when determining finishing positions. As a result, if multiple players with the same amount of chips were eliminated at the same time, they would all receive the payout for the highest position, instead of decrementing positions. For example, if there were three players that all had chip stacks of the same size and everyone went all-in, the winner of the hand would finish in first place and the other two players would both receive second place money. Interesting!
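Purely as an illustration (this is not BetFair’s code, and actual payout rules vary by house), the sketch below shows the sort of equality handling that was evidently missing: once the eliminated players are ranked by chip stack, tied stacks still have to walk down to distinct finishing positions or split the combined prize, rather than all inheriting the highest payout.

/* Illustrative sketch of the finishing-position logic described above.
 * Players eliminated on the same hand are ranked by chip stack; the flaw
 * was that equal stacks all kept the same (highest) position. */
#include<stdio.h>
#include<stdlib.h>

struct player { const char *name; int chips; };

static int by_chips_desc(const void *a, const void *b)
{
  const struct player *pa = a, *pb = b;

  return pb->chips - pa->chips;
}

int main(void)
{
  struct player out[] = { {"alice", 1500}, {"bob", 1500}, {"carol", 1500} };
  size_t n = sizeof out / sizeof *out, i;
  int position = 2; /* the hand's winner took 1st; eliminations start at 2nd */

  qsort(out, n, sizeof *out, by_chips_desc);

  for(i = 0;i < n;++i)
  {
    /* the flawed logic: if(i && out[i].chips == out[i - 1].chips) it reused
     * the previous (highest) position, paying every tied player for the best
     * finish; always advancing the position (or splitting the combined prize
     * money, per house rules) is the missing equality handling */
    printf("%s finishes in position %d\n", out[i].name, position + (int)i);
  }

  return 0;
}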


Reducing the Cost of Software Regression

H.G. Wells Time Machine

A widely held notion among computer scientists is that 80% of a programmer’s time is spent maintaining code while the other 20% is spent actually writing the software. This inefficient allocation of effort was the subject of a master’s thesis at the Lund Institute of Technology called “Formalizing Use Cases with Message Sequence Charts.” According to a 2002 NIST study entitled “The Economic Impacts of Inadequate Infrastructure for Software Testing,” the annual cost of testing and fixing buggy software in the U.S. is estimated to be between $22.2 and $59.5 billion. What are the root causes of this costly inefficiency?

Unfortunately, corporate culture is naturally a contributing factor to this problem. Companies that produce commercial software are in business to make a profit first and foremost. Release schedules are expedited so the program can reach market quickly and copies are sold sooner rather than later; making code work as expected for baseline use cases takes priority over regression testing. Any aspect of the software development process that does not appear to be fully in line with company interests is considered a waste, and far too much emphasis is put on project progress. That progress is only perceived progress, because unforeseen problems will inevitably pop up later on. In the long run, bugs end up costing much more to fix, and the entire business apparatus pays the cost of maintenance. Secondary losses include customer support costs, internal communications, and a negative impact on the company’s public image, all of which can be preempted by proper preliminary work.

Since there’s so much emphasis on a speedy release, code writing starts as soon as possible, often with little regard for planning ahead or for ensuring quality afterwards. Furthermore, programmers tend to implement the items they are most familiar with first because it seems like the easy way to do things. In most cases, the most difficult parts of a task are best handled first, not last. Handling the difficult items first allows more time and attention for the important work; it also allows the developer to recognize how the simpler pieces will fit into the grand scheme of things.

Planning ahead is essential during the inception phase of a software project. Appropriately analyzing the problem and carefully designing the solutions will minimize the accumulation of technical hardships in the future. In fact, by taking advantage of the popular Unified Process for software development and diagramming specifications in UML, the need for writing code by hand almost disappears. UML CASE tools such as IBM’s Rational Rose will translate UML diagrams into program source code (typically an object-oriented language such as C++ or Java). Of course, creating detailed requirements and specifications is still extremely helpful even if UML is not practical for the task at hand. Writing code early on gives the illusion that progress is being made, but in reality it is a recipe for disaster. No code should be written until all implementation issues have been resolved.

Threat modeling and secure design principles need to be key focal points during the initial phases of a software project. After the code has been written, security issues will also comprise a large chunk of ongoing maintenance work. When software developers handle security fixes they have to stop what they’re doing, modify or maybe even rewrite the offending code, test the new code, report their progress, and so on. Since developers are rarely security specialists, they tend to write fixes in such a way that the security hole is not closed completely. As a result, the vulnerability persists and leads to more code rewriting. This cycle of stopping, rewriting, and continuing severely detracts from productivity in programming. The majority of a developer’s time should be spent doing what they do best: adding new features to software.

Removing developers from the patch creation process significantly increases their utilization. Dedicated security experts are most fit to create patches and conduct regression testing. A dynamic program analysis tool accelerates the regression testing process. Building security in by default from the ground-up will minimize software bugs along with subsequent patching and testing. In conclusion, proper planning, resource allocation, and testing procedures can greatly reduce the costs associated with software regression.

