SoftwareSecurity2013/Group 6/Requirements/GeneralBackground


Preventing Direct Execution

The following constructions are slightly out of scope, since they are better studied in the context of session management, but knowing them is necessary to understand which code must be examined directly. In terms of authentication, code need only be examined when it is directly reachable.

It does not matter if there is a function in a PHP file that can execute without authentication when the code is only reachable through another function in another PHP file that does enforce authentication.

MEDIAWIKI Constant

The MEDIAWIKI constant is used to ensure that a piece of code intended to be "required" from another piece of code inside Mediawiki cannot be called directly from the web. Code tests for it with defined(), so it is a PHP constant rather than a variable. It is not uncommon to find code like the following, taken from the beginning of /serialized/serialize.php:

<source lang="php">
if ( !defined( 'MEDIAWIKI' ) ) {
       $wgNoDBParam = true;
       $optionsWithArgs = array( 'o' );
       require_once( __DIR__ .'/../maintenance/commandLine.inc' );
       $stderr = fopen( 'php://stderr', 'w' );
       ...
}
</source>
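
Files that are only ever meant to be included often use the same test simply to abort when the constant is missing. The following is a minimal sketch of that pattern, not code taken from Mediawiki: the helper name assertValidEntryPoint() and the use of an exception are illustrative assumptions (real include files typically inline the check and call die()), chosen so that both outcomes can be shown without ending the script.

<source lang="php">
<?php
/**
 * Hypothetical helper: refuse to continue unless an entry point has
 * already defined the marker constant.
 */
function assertValidEntryPoint( string $marker = 'MEDIAWIKI' ): void {
	if ( !defined( $marker ) ) {
		throw new RuntimeException( 'This file is not a valid entry point' );
	}
}

// Direct request: no entry point has run, so the constant is missing.
try {
	assertValidEntryPoint();
	$result = 'included';
} catch ( RuntimeException $e ) {
	$result = 'rejected';
}
echo $result, "\n";

// Request through an entry point, which defines the constant first.
define( 'MEDIAWIKI', true );
assertValidEntryPoint(); // passes silently
echo "included\n";
</source>

The first call is rejected and the second passes, which is the property that matters for the authentication review: the body of a guarded file cannot be reached without going through an entry point first.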

Maintenance Class

Classes that do maintenance work can be created as a subclass of Maintenance, which includes a setup() function that will automatically stop and return an error if the code is not executed on the command line:

<source lang="php">
/**
 * Do some sanity checking and basic setup
 */
public function setup() {
       global $wgCommandLineMode, $wgRequestTime;
       # Abort if called from a web server
       if ( isset( $_SERVER ) && isset( $_SERVER['REQUEST_METHOD'] ) ) {
               $this->error( 'This script must be run from the command line', true );
       }

       [...]
</source>

As a result, even when not protected by a .htaccess file that prevents execution entirely, maintenance code will be automatically shut down when executed from the web unless specifically written to allow that use case.

Maintenance subclasses are implemented as follows:

<source lang="php">
[...]
require( "$maintenanceDir/Maintenance.php" ); // definition of Maintenance class

class YourMaintClass extends Maintenance {
  [...]
  public function __construct() {
      parent::__construct();
      $this->mDescription = "Describe your maintenance function";
  }

  public function execute() {
      // do whatever you want
  }
  [...]
}

$maintClass = "YourMaintClass"; // tells doMaintenance.php what class to use
require_once( "$maintenanceDir/doMaintenance.php" ); // causes the code in the class to execute
</source>

The included doMaintenance.php instantiates YourMaintClass and then calls YourMaintClass::setup(), which performs sanity checks to verify that the script is allowed to run in the current environment. Once those checks pass, YourMaintClass::execute() is called and the job is done.
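
The sequence just described can be sketched in a self-contained form. Everything here is a simplified stand-in rather than Mediawiki code: MaintenanceSketch, YourMaintClassSketch and runMaintenance() are hypothetical names, the server array is passed in explicitly instead of reading $_SERVER, and an exception replaces the call to error() that ends the real script.

<source lang="php">
<?php
// Simplified stand-in for the Maintenance base class (hypothetical).
abstract class MaintenanceSketch {
	/** Abort if the script appears to be running under a web server. */
	public function setup( array $server ): void {
		if ( isset( $server['REQUEST_METHOD'] ) ) {
			throw new RuntimeException( 'This script must be run from the command line' );
		}
	}

	abstract public function execute(): string;
}

class YourMaintClassSketch extends MaintenanceSketch {
	public function execute(): string {
		return 'maintenance work done';
	}
}

/** Stand-in for the driver role played by doMaintenance.php. */
function runMaintenance( string $maintClass, array $server ): string {
	$maintenance = new $maintClass(); // instantiate the named class
	$maintenance->setup( $server );   // sanity checks; throws on web requests
	return $maintenance->execute();   // only reached when the checks pass
}

// Command line: no REQUEST_METHOD, so execute() runs.
echo runMaintenance( 'YourMaintClassSketch', array() ), "\n";

// Web request: setup() stops the run before execute() is reached.
try {
	runMaintenance( 'YourMaintClassSketch', array( 'REQUEST_METHOD' => 'GET' ) );
} catch ( RuntimeException $e ) {
	echo $e->getMessage(), "\n";
}
</source>

For the review, the point is that execute() is never reached on a web request unless a subclass deliberately overrides the check.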

Exceptions

Types of Exceptions in Mediawiki

The following exceptions relate to permissions. Dedicated cryptographic exceptions do not appear to exist; cryptographic code simply throws MWException with a message string, and only in the cases where it uses exceptions at all. Any other cryptography-related messages are emitted only as debugging messages.

{| class="wikitable"
! Exception !! Description
|-
| MWException || Generic Mediawiki exception
|-
| PermissionsError || General error for insufficient permissions
|-
| ReadOnlyError || Page is not writable
|-
| ThrottledError || User is limited in how quickly they may make updates
|-
| UserBlockedError || User has had access blocked
|-
| UserNotLoggedIn || User must log in
|-
| PasswordError || Unable to set password
|}

Exception Handling (and lack of logging)

Exception handling for Mediawiki starts at the run() method in the MediaWiki class, which is the root of all page loads and normal operations of Mediawiki. As the catch clause shows, MWExceptionHandler::handle() is called for any exception that nothing else catches.

<source lang="php">
/**
 * Run the current MediaWiki instance
 * index.php just calls this
 */
public function run() {
       try {
               $this->checkMaxLag();
               $this->main();
               $this->restInPeace();
       } catch ( Exception $e ) {
               MWExceptionHandler::handle( $e );
       }
}
</source>

Now we look at the default handler in includes/Exception.php and see the calls to self::report($e) and wfLogProfilingData(), followed by an exit to stop execution.

<source lang="php">
public static function handle( $e ) {
       global $wgFullyInitialised;

       self::report( $e );

       // Final cleanup
       if ( $wgFullyInitialised ) {
               try {
                       // uses $wgRequest, hence the $wgFullyInitialised condition
                       wfLogProfilingData();
               } catch ( Exception $e ) {
               }
       }
       // Exit value should be nonzero for the benefit of shell jobs
       exit( 1 );
}
</source>

The call to self::report($e) calls $e->report() if it can, but otherwise performs no logging at all. The call to wfLogProfilingData() just makes sure that pages that end in an exception are still logged for profiling purposes (semi-anonymous usage statistics, etc.). The profiling data does not even include the error message reported.

Unless explicitly handled with a catch clause, no exceptions will be logged beyond the abnormal-failure entries, with a filename and line number, that a web server such as Apache generates. Other web servers may or may not log even those errors. The nature of the exception does not show up in the Apache logs.
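
To make the gap concrete, the following sketch models the handler flow with stand-in names. ReportableException, handleSketch(), the $log array and the generic fallback message are all assumptions made for illustration, not Mediawiki code. With the default arguments nothing is recorded anywhere, which mirrors the behaviour described above; the optional flag only marks the spot where a log entry would have to be written.

<source lang="php">
<?php
// Stand-in for an exception type that can render its own error page.
class ReportableException extends Exception {
	public function report(): string {
		return 'Error page: ' . $this->getMessage();
	}
}

/**
 * Stand-in for the last-resort handler. $log is only written to when
 * $logExceptions is true; false corresponds to the behaviour described above.
 */
function handleSketch( Exception $e, array &$log, bool $logExceptions = false ): string {
	if ( $logExceptions ) {
		// Hypothetical addition: record what went wrong.
		$log[] = get_class( $e ) . ': ' . $e->getMessage();
	}
	// Let the exception report itself if it can; the fallback text is a placeholder.
	return $e instanceof ReportableException ? $e->report() : 'Unexpected error';
}

$log = array();
try {
	throw new ReportableException( 'insufficient permissions' );
} catch ( Exception $e ) {
	echo handleSketch( $e, $log ), "\n";
}
echo count( $log ), " log entries\n";
</source>

The user sees the error page, while the log stays empty.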

Cryptographic Modules

The few cryptographic functions that are used in Mediawiki are contained within the MWCryptRand class. The public ones are the following:

{| class="wikitable"
! Function !! Description
|-
| generateHex || Generate a random string of n hex digits (effectively n/2 random bytes); a strong source is required if the "force secure" parameter is passed as true
|-
| generate || Same, except it returns a binary string of n bytes
|-
| wasStrong || Determines whether the last string generated by the particular MWCryptRand object was generated strongly
|}

By "strong," the authors mean that all the bits of the output were derived from a cryptographically strong entropy source. If even one byte was not, the output is not strong. As there is no way to guarantee that the entropy pool is actually large enough to generate those random numbers, the program is written so that it will not crash unexpectedly when it runs at scale.

Under normal operation, the PRNG takes random bytes from /dev/urandom (or a similar source on other operating systems) and fills the return buffer to the appropriate size. If not enough random data is available from the preferred sources, an additional "random" state is generated from the MWCryptRand::hash() of the following non-cryptographic entropy sources:

  • All information in the HTTP request
  • Hostname of the machine serving the request
  • Filename (on the filesystem)
  • Directory name where Mediawiki code lives
  • Parent directory
  • Mediawiki configuration file
  • All the stat information for all the files / directories referred to above (including date, time, inode number, last access, etc.)
  • Process ID
  • Current memory used by the program
  • Name of the Wiki
  • A 512-bit securely generated random number, generated at installation time
  • Output of PHP's various non-cryptographic PRNGs
  • A "0" character for each file that does not exist (one bit)

At that point, the rest of the requested buffer is filled with a fallback method, as follows:

<source lang="php">
while ( strlen( $buffer ) < $bytes ) {
       wfProfileIn( __METHOD__ . '-fallback' );
       $buffer .= $this->hmac( $this->randomState(), mt_rand() );
       // This code is never really cryptographically strong, if we use it
       // at all, then set strong to false.
       $this->strong = false;
       wfProfileOut( __METHOD__ . '-fallback' );
}
</source>

Obviously, the PRNG does not fail securely in a formal sense.
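
The fallback loop can be reproduced outside Mediawiki to see how the strength flag behaves. This is a sketch under stated assumptions: fallbackFill() is a hypothetical name, $state stands in for the hashed non-cryptographic state, SHA-512 is an assumed choice of HMAC hash, the mapping of the two arguments onto message and key simply follows the order shown above, and trimming the buffer to the requested length is our own addition.

<source lang="php">
<?php
/**
 * Top up $buffer to $bytes bytes with HMAC output and report whether the
 * result can still be called strong. The bytes already in $buffer are
 * taken to come from a strong source.
 */
function fallbackFill( string $buffer, int $bytes, string $state ): array {
	$strong = true;
	while ( strlen( $buffer ) < $bytes ) {
		// Mix the state with a value from a non-cryptographic PRNG.
		$buffer .= hash_hmac( 'sha512', $state, (string)mt_rand(), true );
		$strong = false; // any fallback byte taints the whole result
	}
	return array( substr( $buffer, 0, $bytes ), $strong );
}

// 4 bytes came from a strong source, 16 are wanted: the fallback supplies the rest.
list( $data, $strong ) = fallbackFill( random_bytes( 4 ), 16, 'example-state' );
echo strlen( $data ), ' bytes, strong: ', var_export( $strong, true ), "\n";
</source>

The caller still receives the full 16 bytes; the only trace of the degraded source is the flag.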

generate() and generateHex()

Each of these functions takes two parameters: the length of the output and whether the output is required to be based on a cryptographically strong random number generator. They default to not requiring cryptographically strong random strings. Both versions of the generate function are based on the function realGenerate().
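
The relationship between the two output formats (n hex digits correspond to roughly n/2 bytes) can be sketched as follows. generateSketch() and generateHexSketch() are hypothetical stand-ins, PHP's random_bytes() is an assumed substitute for the real byte source, and the second "force strong" parameter is left out for brevity.

<source lang="php">
<?php
/** Stand-in for generate(): returns $bytes bytes of binary data. */
function generateSketch( int $bytes ): string {
	return $bytes > 0 ? random_bytes( $bytes ) : '';
}

/** Stand-in for generateHex(): $chars hex digits built from ceil( $chars / 2 ) bytes. */
function generateHexSketch( int $chars ): string {
	$bytes = (int)ceil( $chars / 2 );  // two hex digits per byte, rounded up
	$hex = bin2hex( generateSketch( $bytes ) );
	return substr( $hex, 0, $chars ); // trim the extra digit for odd lengths
}

echo strlen( generateSketch( 16 ) ), "\n";    // 16 bytes of binary data
echo strlen( generateHexSketch( 32 ) ), "\n"; // 32 hex digits from 16 bytes
echo strlen( generateHexSketch( 7 ) ), "\n";  // 7 hex digits from 4 bytes
</source>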

realGenerate()

Both generate() and generateHex() ultimately call realGenerate(), and its code appears to be incorrect in a fundamental way. Its default assumption is that the returned bytes were derived securely, and when a non-integral number of bytes is requested it returns the floor of that number instead of the ceiling, which would be more appropriate (erring on the side of greater security).

<source lang="php">
public function realGenerate( $bytes, $forceStrong = false ) {
       [...]
       $bytes = floor( $bytes );
       static $buffer = '';
       if ( is_null( $this->strong ) ) {
               // Set strength to false initially until we know what source data is coming from
               $this->strong = true;
       }
       [...]
</source>
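
The rounding concern is easy to check with a small worked example. The 20-bit request is a made-up figure chosen only to produce a fractional byte count; nothing in the scan suggests a caller actually passes this value.

<source lang="php">
<?php
// A caller derives a byte count from a bit length that is not a multiple of 8.
$requestedBytes = 20 / 8; // 2.5 bytes for a 20-bit value

$floored = (int)floor( $requestedBytes ); // what the excerpt above does
$ceiled  = (int)ceil( $requestedBytes );  // the rounding the text argues for

echo $floored * 8, " bits with floor()\n"; // 16 bits: fewer than requested
echo $ceiled * 8, " bits with ceil()\n";   // 24 bits: at least what was requested
</source>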

wasStrong()

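The wasStrong() function stands a good chance of returning an incorrect result because of the apparent one-line error above: the comment says the strength is set to false initially, but the code sets it to true.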

PRNG Risk Analysis

Assumptions

  • Attacker has access to one or more accounts already
  • Attacker knows all above variables for pages loaded by them in the past
  • Attacker cannot manipulate any variables above, except HTTP request information he sends
  • Connections happen over HTTPS (HTTPS proves bi-directional communication)
  • The HTTP(s) request information includes source IP address
  • Attacker can successfully establish a TLS connection from an arbitrary forged IP address, which is nearly impossible to begin with
  • Attacker can perform new page loads with arbitrary HTTP request information
  • Attacker knows when there is no strong entropy source
  • Attacker cannot observe sessions of other users in cleartext
  • Attacker cannot predict full system state for users other than himself
  • Attacker does not know the value of $wgSecretKey
  • Attacker cannot forge a 512-bit HMAC
  • Attacker does not already know all passwords
  • Attacker possesses all browser and IP address information for all users logged into the system, including username but not password
  • Attacker does not possess existing session information for users
  • No entropy beyond what is provided by the fallback method is available

Consequences of No Strong Entropy Source

  • Output of the PRNG can be predicted based on previous values observed by the attacker
  • Prediction of PRNG output based on fallback entropy sources is infeasible because of the intractability of forging a 512-bit HMAC
  • Probability of state collisions increases dramatically to one in a billion, conservatively
  • To generate a state collision, an attacker must make 500 million requests.
  • Even knowing all HTTP request information from another user (including username and password), the attacker must successfully establish a TLS connection while spoofing the IP address of his victim.
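
The request counts above can be put in perspective with a short calculation. It rests on an assumption that the list itself does not state: that each request independently produces a collision with probability one in a billion. Under that model the chance of at least one collision in n requests is 1 - (1 - p)^n. The function name collisionProbability() is ours.

<source lang="php">
<?php
/**
 * Probability of at least one hit in $requests independent attempts,
 * each succeeding with probability $p: 1 - (1 - p)^n.
 * log1p() and expm1() keep precision when $p is tiny.
 */
function collisionProbability( float $p, float $requests ): float {
	return -expm1( $requests * log1p( -$p ) );
}

$p = 1e-9; // "one in a billion" from the list above
printf( "%.3f\n", collisionProbability( $p, 5e8 ) ); // 0.393
printf( "%.3f\n", collisionProbability( $p, 1e9 ) ); // 0.632
</source>

The figure of 500 million requests is consistent with searching, on average, half of a billion equally likely states. Under the independent-trial model used here, the same number of requests gives a success chance of roughly 39%.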

Random String Use Cases

Per the automated scan that provides us with a list of where the PRNG is used in the Mediawiki software in V7.6, we arrive at the following use cases:

{| class="wikitable"
! Use Case !! Risk !! Attacker Advantage
|-
| Generation of an identifier for a Watch List || One in every 500 million watch lists will have a colliding identifier, causing potential database problems. || Possible, but unlikely DoS
|-
| Login token generation || Determination requires full session information, which confers access by itself || Zero
|-
| Create account token || Determination requires full session information, which confers access by itself || Zero
|-
| Renew Session ID || Determination requires full session information, which confers access by itself || Zero
|-
| Normal Session ID generation || Determination requires full session information, which confers access by itself || Zero
|-
| Random password generation || Determination requires full session information, which confers access by itself || Zero
|-
| Updating of per-request session token || Determination requires full session information, which confers access by itself || Zero
|-
| Generation of editor token || Determination requires full session information, which confers access by itself || Zero
|-
| Generic token generation || Determination requires full session information, which confers access by itself || Zero
|-
| E-mail confirmation token || The attacker will not have seen the output of the PRNG for this exact state, because the exact parameters in the request for an e-mail confirmation to be sent will not expose output from the PRNG, unless the attacker happened to be in control of the e-mail address used by his victim and sent himself several hundred million e-mail confirmations. Of course, if he can forge an HTTPS connection from an arbitrary IP address, then he can simply watch the (probably) cleartext e-mail and intercept the token. || Infinitesimal
|-
| Generation of password salt || The attacker must know the full session data of the request that resulted in the salt generation (either password change or account creation, which includes the password for the account being updated or created) to proceed, which means he has already won || Zero
|}
Conclusion

The cryptographic module fails insecurely in general, but for the use cases of this particular application it is more than adequate for the intended purpose.

Class Autoloader

The file includes/AutoLoader.php sets up an autoloader, which allows the PHP code to use classes without explicitly including the files that define them, as in the following lines found at the end of index.php:

<source lang="php">
$mediaWiki = new MediaWiki();
$mediaWiki->run();
</source>

This makes it convenient to locate the implementations of class constructors, methods, etc.
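
The mechanism can be sketched with PHP's spl_autoload_register(). This is a generic illustration built on a class map; it does not reproduce the contents of includes/AutoLoader.php, and the class AutoloadSketchDemo, the temporary file and the $classMap array are all made up for the example.

<source lang="php">
<?php
// Write a throwaway class file so the example is self-contained.
$classFile = sys_get_temp_dir() . '/AutoloadSketchDemo.php';
file_put_contents(
	$classFile,
	"<?php\nclass AutoloadSketchDemo {\n\tpublic function run() {\n\t\treturn 'running';\n\t}\n}\n"
);

// Hypothetical class map: class name => file that defines it.
$classMap = array( 'AutoloadSketchDemo' => $classFile );

// PHP calls this function the first time an unknown class name is used.
spl_autoload_register( function ( $className ) use ( $classMap ) {
	if ( isset( $classMap[$className] ) ) {
		require_once $classMap[$className];
	}
} );

// No explicit require for AutoloadSketchDemo is needed here.
$demo = new AutoloadSketchDemo();
echo $demo->run(), "\n";

unlink( $classFile ); // clean up the throwaway file
</source>

With a map of this kind, finding the implementation of a class comes down to looking up its name in one place.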

Session Management

Session management code maintains various pieces of information about the session of a particular user. Since HTTP is stateless, web applications must track which requests are associated with which sessions. When a requested function requires privileges that are not available to the session because the user is not authenticated, the user is offered the option to log in by being pointed to Special:UserLogin.

Special Pages

Special pages are pages in the wiki that are processed differently from normal wiki pages. The special pages in the system include a variety of ways to look into the information held by the system. Given the availability of these pages on RU's wiki, one must wonder if the system administrators at RU who run this server are aware that the system was not designed to hide much of anything. It is usually inadvisable to oppose the nature of a system in such a fundamental way.

{| class="wikitable"
! Special Page !! Description
|-
| Special:Allpages || All pages in the wiki
|-
| Special:Categories || Page categories
|-
| Special:PrefixIndex || List prefixes in the wiki
|-
| Special:ListUsers || List registered users on the system
|-
| Special:ActiveUsers || List active users on the system
|-
| Special:ListAdmins || List all system administrators
|-
| Special:Log || List of everything anyone has updated on the wiki
|-
| Speciaal:Softwareversie || Version information for everything relevant to the system (page removed some time after 2011, when RU last updated this wiki)
|-
| Special:Listfiles || Show all the pictures on the system
|}

Note the link targets shown at the bottom of the browser window when hovering over the links. Each language that the wiki runs in provides a set of aliases for the special pages, like Speciaal:Aanmelden on this wiki, which has its language set to Dutch.