Buca Bay - Always nice

Dua tiko noqu toa loaloa, na yacana ko… laga mai…

Base conversion in PHP, Radix 255

September 29

Allows you to convert to any base between 2 and 255, effectively using nearly every possible byte value (the full extended ASCII range) as a digit.

In order to convert very large numbers with arbitrary precision you’ll need the BCMath extension. Without BCMath, large numbers will not be converted correctly because PHP’s native integers and floats cannot do the arithmetic exactly.

If you need to convert between bases 2-36, you can use the base_convert() function. However, converting to higher bases such as 255 has some benefits, such as “compressing” the characters.

You can compress a SHA1 hash from a 40-byte hex string to a string of about 20 bytes (a 160-bit value needs 20 or, occasionally, 21 base-255 digits).

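// Note: base_convert() can lose precision on a number this large, so use a BCMath-based hex-to-decimal step for exact results
echo base255(base_convert(sha1('test'), 16, 10));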

Since there is no loss of data, it can be used as a lossless re-encoding. Normal compression such as zlib won’t help on a SHA1 hash, since the input is short and has no repeating patterns.
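To make the conversion concrete, here is a minimal sketch of how such a base-255 encoder could look with BCMath. The function names are hypothetical and this is not the base255() implementation from this post; it assumes the BCMath extension is loaded and stores each digit value d as the byte chr(d).

```php
<?php
// Hypothetical helpers; not the author's base255() implementation.
// Requires the BCMath extension.

// Convert a hex string to a decimal string without float precision loss.
function hex_to_dec_sketch($hex)
{
    $dec = '0';
    foreach (str_split(strtolower($hex)) as $digit) {
        $dec = bcadd(bcmul($dec, '16'), (string) hexdec($digit));
    }
    return $dec;
}

// Convert a non-negative decimal string to a base-255 byte string.
// Digit value d is stored as chr(d), so byte 255 is never produced.
function to_base255_sketch($dec)
{
    if (bccomp($dec, '0') === 0) {
        return chr(0);
    }
    $out = '';
    while (bccomp($dec, '0') > 0) {
        $out = chr((int) bcmod($dec, '255')) . $out;
        $dec = bcdiv($dec, '255', 0);
    }
    return $out;
}

// Reverse the encoding: base-255 byte string back to a decimal string.
function from_base255_sketch($bytes)
{
    $dec = '0';
    for ($i = 0; $i < strlen($bytes); $i++) {
        $dec = bcadd(bcmul($dec, '255'), (string) ord($bytes[$i]));
    }
    return $dec;
}
```

Calling to_base255_sketch(hex_to_dec_sketch(sha1('test'))) should give a binary string of 20 or 21 bytes, and from_base255_sketch() recovers the original number. If all you need is the hash in compact form, sha1('test', true) already returns the 20 raw bytes.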


PHP Object Cache

June 25

The last two days I’ve been writing an object cache in PHP as part of a larger project. I released it today as open source so it will help those doing something similar, and to get some of that helpful feedback open source offers.

PHP Object Cache is a Memory Object cache, implemented with PHP Sockets. It runs on PHP4 or PHP5+ and requires your PHP build to have sockets enabled.

The project is a web (browser) based chat system called Joomla Ajax Chat. The initial development of the chat was three and a half years ago, so much of what worked then (HTTP polling and AJAX) is becoming old school now and is not as efficient as what is possible with browsers today.

One of the new features is implementing Comet-like HTTP, which essentially means keeping the HTTP connection open for as long as you can. XMPP defined a specification for this called BOSH so that their instant messaging protocol can work over HTTP. With a good implementation, Comet can be very efficient.

Our problem is that we have to implement Comet on regular ol’ Apache, Lighty, Nginx and IIS servers running different versions and builds of PHP. The code will have to run on everything from average shared hosting and dedicated servers to cloud based solutions. The other problem is that it has to run on top of Joomla and other CMSs. A call to a minimal Joomla page alone costs around 5-6 database reads, one or two writes and about 5-6 MB of RAM. It wouldn’t take much to crash a shared hosting account trying to implement Comet on top of that.

So the solution would have to be an object cache of some sort, even if it means a file based cache (last resort). Thus, PHP Object Cache, which hopefully will work for the percentage of shared servers that allow sockets. The idea is to take the most IO-intensive processes, such as session management and chat events, and put them into the object cache. When everyone online has viewed what they need from the cache, flush that bit to the database or some other persistent storage if needed.
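The flush rule described above can be sketched roughly as follows. This is only an illustration with a hypothetical class name: it is not the PHP Object Cache API, it keeps entries in a plain array rather than behind a socket server, and it assumes the caller knows which users are online when an entry is stored.

```php
<?php
// Hypothetical sketch; not the PHP Object Cache API. Entries live in a
// plain array instead of a socket server, purely to show the flush rule.
class ViewTrackingCacheSketch
{
    private $entries = array();   // key => value
    private $pending = array();   // key => array(userId => true) still to view
    private $flush;               // callable(key, value) for persistent storage

    public function __construct($flush)
    {
        $this->flush = $flush;
    }

    // Store a value and record which online users still need to see it.
    public function put($key, $value, array $onlineUserIds)
    {
        $this->entries[$key] = $value;
        $this->pending[$key] = array_fill_keys($onlineUserIds, true);
    }

    // Return the value for a user; flush and evict once nobody is pending.
    public function view($key, $userId)
    {
        if (!array_key_exists($key, $this->entries)) {
            return null;
        }
        $value = $this->entries[$key];
        unset($this->pending[$key][$userId]);
        if (count($this->pending[$key]) === 0) {
            call_user_func($this->flush, $key, $value);
            unset($this->entries[$key], $this->pending[$key]);
        }
        return $value;
    }
}
```

The flush callback is where a database write would go, so the expensive write happens once per entry instead of once per request.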

Since the browser is able to do more as time goes on, hopefully PHP on old shared servers can keep up.

Incoming mail with PHP Mime Mail Parser

May 13

I’ve just added the ability to parse mime mail from standard input into PHP Mime Mail Parser. This allows you to receive and parse email in PHP efficiently and effortlessly. 

To pipe email to PHP, follow one of these articles:

Evolt - Incoming Mail and PHP
DevPapers - Incoming Mail and PHP
DevArticles - Incoming Mail and PHP

Then for your PHP code, get the PHP Mime Mail Parser, and use:

<?php
// include the mime-mail-parser class
require_once('MimeMailParser.class.php');

// instantiate
$Parser = new MimeMailParser();
// read the email from stdin
$Parser->setStream(STDIN);

// get the email parts
$to = $Parser->getHeader('to');
$delivered_to = $Parser->getHeader('delivered-to');
$from = $Parser->getHeader('from');
$subject = $Parser->getHeader('subject');
$text = $Parser->getMessageBody('text');
$html = $Parser->getMessageBody('html');
$attachments = $Parser->getAttachments();

?>

A PHP Mime Mail parser using MailParse Extension

April 21

Ever tried parsing Mime Mail? Not for the faint hearted, I assure you. The great thing is that PHP has an extension for parsing Mime Messages called MailParse. The bad news is that using this extension is probably just as hard as writing your own Mime Parser. 

Fortunately with a bit of help I’ve put together a Mime Mail Parser Class that wraps the MailParse extension functions making it simple, efficient and fast to parse mime mail in PHP. 

Why another Mime Mail parser? Well for two main reasons. 

1) Pure PHP implementations are slow and inefficient compared to MailParse.

2) MailParse is too hard to use, and is not OO. 

Therefore welcome to MimeMailParser.

Here is an example that shows how easy it is to parse raw mime mail using MimeMailParser:

<?php

require_once('MimeMailParser.class.php');

$path = 'path/to/mail.txt';
$Parser = new MimeMailParser();
$Parser->setPath($path);

$to = $Parser->getHeader('to');
$from = $Parser->getHeader('from');
$subject = $Parser->getHeader('subject');
$text = $Parser->getMessageBody('text');
$html = $Parser->getMessageBody('html');
//$attachments = $Parser->getAttachments();
$attachments = $Parser->getAttachmentsAsStreams();

?>

You can find the source code here for your enjoyment.

Secure PHP Programming for Web Developers

February 12

Security in PHP is the same as in any server side programming language; they are all vulnerable to the same attacks.

PHP has a history of being vulnerable mainly because of its popularity.

1) More non-security aware developers use PHP than any other language, so their code has flaws almost 100% of the time.
2) There are more sites written in PHP than any other server side language, thus more chances to find security holes amongst so many.

PHP has had a rep of having security problems in the PHP core itself, but this has improved greatly, much of which can be attributed to the Hardened-PHP project.
http://www.hardened-php.net/suhosin/

If you are using a shared host, ask if they have PHP4 with the Suhosin patch or PHP5 or higher. I’m not sure if PHP5 still needs Suhosin, but many large sites don’t seem to use the two together, so I believe PHP5 has better security than PHP4 natively.

As a developer I think there are just about 4 or so main security vulnerabilities to keep in mind when coding.

1) XSS (Cross Site Scripting)

This is the most common vulnerability in any website. It is estimated that around 70% of websites have an XSS vulnerability.
http://en.wikipedia.org/wiki/Cross-site_scripting

A simple example in PHP:

<?php

echo $_GET['username'];

?>

What happens is that PHP echoes a variable passed in from HTTP (in this case a GET parameter). If a user typed in the browser URL:

site.com/example.php?username=<script>alert(document.cookie)</script>

They would see the cookies saved for their session. An attacker can make a user click a link that will also retrieve these cookies from JavaScript, and send them to the attacker - without the user knowing.

To prevent it:

<?php

echo htmlentities($_GET['username'], ENT_QUOTES, 'UTF-8');

?>

This will turn any HTML into HTML entities. You also have to specify the encoding you used for the page (in this case UTF-8). The reason is so PHP does not mangle the character encoding, which can also result in XSS.

I must say here, that you should never use your own PHP filtering functions for HTML, or any other "cleansing" of user input. Most likely you will miss something that an attacker will use.
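One way to make the escaping hard to forget is a tiny wrapper that you call on every value you echo. The sketch below uses a hypothetical function name and does no filtering of its own; it only forwards to PHP's built-in htmlentities() with the same flags as above.

```php
<?php
// Hypothetical helper name; it only wraps PHP's built-in htmlentities().
function escape_html_sketch($value)
{
    return htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
}

// Usage: escape at the point of output, not when the input arrives.
$username = isset($_GET['username']) ? $_GET['username'] : '';
echo 'Hello, ' . escape_html_sketch($username);
```

With ENT_QUOTES both single and double quotes are encoded, which matters when the value ends up inside an HTML attribute.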

2) XSRF - Cross Site Request Forgery

This is similar to XSS and just as common or maybe even more common. It is when a website fails to protect its users from being used by 3rd parties without their knowledge.
http://en.wikipedia.org/wiki/Cross-site_request_forgery

An example of this in PHP is a simple comment form.

<form action="submit.php">

<textarea name="comment"></textarea>
<input type="submit" value="Post Comment" />

</form>

Imagine the comment form is only available for logged in users. Now an attacker can just send an already logged in user the URL:

site.com/submit.php?comment=I hacked you&submit=Post Comment

So when the logged in user clicks on that link, they have posted the comment without knowing. This can even be done in a hidden frame, so the user never sees it.

So the attacker is using the user's already authenticated session (privileges) to do his/her bidding.

Preventing XSRF:

<form action="submit.php">

<textarea name="comment"></textarea>
<input type="submit" value="Post Comment" />
<input type="hidden" name="key" value="some_random_value" />

</form>

Notice the new <input /> called "key". It will contain a random value remembered by PHP. This random value should be saved, and be unique for every form that is sensitive.

This way, the attacker would not be able to make the user post something on their behalf, since they don't know the value of "key".

(This only works if you don't have an XSS vulnerability of the page itself, as that can lead to the attacker knowing what the value of "key" is)
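Generating and checking the key could look something like the sketch below. The function names and the 'form_key' index are hypothetical, and the $session parameter stands in for $_SESSION so the logic can be tried without starting a real session. It assumes PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helper names. $session stands in for $_SESSION so the logic
// can be exercised without starting a real session.

// Create a random key, remember it server-side, and return it for the form.
function issue_form_key_sketch(array &$session)
{
    $key = bin2hex(random_bytes(16)); // random_bytes() needs PHP 7+
    $session['form_key'] = $key;
    return $key;
}

// Accept a submitted key only once, and only if it matches the stored one.
function check_form_key_sketch(array &$session, $submitted)
{
    if (!isset($session['form_key']) || !is_string($submitted)) {
        return false;
    }
    $expected = $session['form_key'];
    unset($session['form_key']); // single use: a replayed form is rejected
    return hash_equals($expected, $submitted);
}
```

In a real page you would call session_start(), pass $_SESSION to both functions, print the issued key into the hidden input, and check the submitted value before saving the comment.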

3) SQL injection

SQL injection is when an attacker manages to manipulate any SQL database queries in your website in a way you didn't intend.
http://en.wikipedia.org/wiki/SQL_injection

Example:

<?php

$query = "SELECT * FROM users where password = '".$_GET['password']."'";
$result = mysql_query($query);

?>

Because the $_GET['password'] can be anything the attacker wishes to put in the URL, they could craft a URL like:


site.com/login.php?username=joe&password=nothing' or 1

Notice the ' (single quote) in the value for the parameter "password".

This will make your sql query:

SELECT * FROM users where password = 'nothing' or 1

This would make it match every row, so a typical login check would pick up the first user instead of the user with password = "nothing", since "or 1" is always true.

Preventing SQL injection:

<?php

$query = "SELECT * FROM users where password = '".mysql_real_escape_string($_GET['password'])."'";
$result = mysql_query($query);

?>

The function mysql_real_escape_string() prevents this injection by escaping any character that would otherwise terminate the quoted string. Note that it only protects values that are placed inside quotes in the query.
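On newer PHP versions the mysql_* functions are no longer available (the extension was removed in PHP 7), and bound parameters are the usual alternative to manual escaping. The sketch below uses PDO; the in-memory SQLite database, the table layout and the function name are assumptions made only to keep the example self-contained, and it needs the pdo_sqlite driver.

```php
<?php
// Sketch using PDO prepared statements. The in-memory SQLite database and
// the users table are assumptions so the example is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (username TEXT, password TEXT)');
$db->exec("INSERT INTO users VALUES ('joe', 'secret')");

// Look up a user; the values are bound, never concatenated into the SQL.
function find_user_sketch(PDO $db, $username, $password)
{
    $stmt = $db->prepare(
        'SELECT username FROM users WHERE username = ? AND password = ?'
    );
    $stmt->execute(array($username, $password));
    return $stmt->fetchColumn(); // username string, or false if no row
}
```

Because the attacker's quote is sent as data rather than as part of the query text, find_user_sketch($db, 'joe', "nothing' or 1") simply finds no row. The plain-text password column only mirrors the example above; it is not a recommendation.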

4) Remote File inclusion

Remote file inclusion is when an attacker can include a remote file into your PHP code. This is the most dangerous attack, as it allows the attacker to execute arbitrary code on your PHP server.
http://en.wikipedia.org/wiki/Remote_File_Inclusion

eg:

<?php

include('/pages/'.$_GET['page'].'.php');

?>

With this code the developer is hoping to have a URL such as:

site.com/pages.php?page=home

And this would include the file:

/pages/home.php

However, any attacker can now place a URL such as:


site.com/pages.php?page=../../passwords.txt

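Because .php is appended, PHP would look for ../../passwords.txt.php, but on older PHP versions an attacker could end the parameter with a %00 null byte to cut off that suffix, and it would reveal the contents of the file passwords.txt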

Or they could use it to include a remote file from their server, if the URL wrappers are enabled for file includes (which is common).

It is best not to have any user input in the files to include. However, if you think it is beneficial to your PHP website, then make sure you have a predefined list of files that can be included.

For example:

<?php

if (in_array($_GET['page'], array('home', 'contact', 'links', 'about'))) {
    include('/pages/'.$_GET['page'].'.php');
}

?>
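The same whitelist idea can be wrapped in a small function so the check and the path building stay together. The function name and parameters below are hypothetical; the sketch returns a path only for an exact, known page name and null for everything else.

```php
<?php
// Hypothetical helper: map a requested page name to a file path, or null.
function resolve_page_sketch($page, array $allowed, $baseDir)
{
    // Strict comparison so only exact string matches pass.
    if (!is_string($page) || !in_array($page, $allowed, true)) {
        return null;
    }
    return rtrim($baseDir, '/') . '/' . $page . '.php';
}

// Usage (commented out so the sketch has no side effects):
// $file = resolve_page_sketch($_GET['page'], array('home', 'contact'), '/pages');
// if ($file !== null) { include $file; }
```

Passing true as the third argument to in_array() avoids PHP's loose comparison rules, so an unexpected type in the request cannot match a list entry by accident.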

With these in mind, you should be able to keep your PHP site secure from most attacks. Now in one paragraph:

Whenever you write a PHP page, make sure all input from users (HTTP) is escaped according to the following rules: HTML must be entity encoded with htmlentities(), XML with htmlspecialchars(), and SQL must be escaped with mysql_real_escape_string(). User input should not be used for file inclusions at all, but if you do use it, make sure it comes from a predefined list of possible files. All forms must contain a secure, random key that can be used to authenticate the form only once, so that there can be no subsequent posts of the same form, or posts from other sources.

That actually covered all the points above. Can you see how simple it is? It is just keeping those in mind at all times while coding that is the hard part.

Google AJAX Language API with PHP

January 20

I had noticed some time ago that Google had released an API for their language translation service. A recent forum discussion made me revisit the API, and since I had a wee bit of time on my hands, I wrote a very rudimentary PHP class implementation of the API.

Google seems to like flaunting “AJAX”, in their APIs at least. So the API is called “Google AJAX Language API” and the main implementation is .. take a guess, AJAX. However, in addition to their pure JavaScript API, they also have a REST interface (of course JavaScript would need such an interface anyway).

The REST interface is just an HTTP endpoint (URL) that returns JSON. You just need to formulate an HTTP GET passing the parameters described in the API documentation, and Google will send you a nicely formatted JSON response with the translated text and some other details.

Here is an example request, to translate “Hello World” to Italian.

http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello%20world&langpair=en|it

The parameters are q=hello world and langpair=en|it

The JSON response looks like:

{"responseData": {"translatedText":"ciao mondo"}, "responseDetails": null, "responseStatus": 200}

So a simple implementation in PHP would be to use file_get_contents() to download the JSON text from the URL over HTTP and here it is in a PHP class.

http://code.google.com/p/php-language-api/source/browse/trunk/google.translator.php

Example Usage:

// example usage
$text = 'Welcome "to my " website.';
$trans_text = Google_Translate_API::translate($text, '', 'it');
if ($trans_text !== false) {
	echo $trans_text;
}

The class uses file_get_contents(), which assumes your PHP has the allow_url_fopen directive enabled in the PHP configuration (php.ini). This is usually the case; however, it can be disabled for security reasons (since it allows include() to use the URL wrapper and thus include remote files - a favourite exploit for attackers is to inject remote files into include() functions).

The class doesn’t use a JSON parser; I think it’s a bit of an overhead including the JSON libraries in PHP4. Instead it just uses regular expressions. The PCRE regular expression functions are pretty fast. PHP5 has native support for JSON however, so the class could be modified to use the PHP5 native JSON functions if you use PHP5 specifically.
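As a rough idea of what the PHP5 route could look like, the sketch below builds the request URL and reads the response with json_decode(). Both function names are hypothetical and this is not the code from the php-language-api class; the URL and field names are taken from the example request and response shown earlier in this post.

```php
<?php
// Hypothetical helpers; not the php-language-api class.

// Build the request URL from the parameters described in the post.
function build_translate_url_sketch($text, $from, $to)
{
    $query = http_build_query(array(
        'v' => '1.0',
        'q' => $text,
        'langpair' => $from . '|' . $to,
    ));
    return 'http://ajax.googleapis.com/ajax/services/language/translate?' . $query;
}

// Extract the translated text from a JSON response body, or return false.
function parse_translate_response_sketch($json)
{
    $data = json_decode($json, true); // native JSON support, PHP 5.2+
    if (!is_array($data)
        || !isset($data['responseStatus'])
        || $data['responseStatus'] !== 200) {
        return false;
    }
    return isset($data['responseData']['translatedText'])
        ? $data['responseData']['translatedText']
        : false;
}
```

The body fetched with file_get_contents(build_translate_url_sketch('hello world', 'en', 'it')) would then be passed to parse_translate_response_sketch(). A side benefit of json_decode() is that it decodes \u escape sequences in the JSON for you.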

Something I ran into was that Google returns not only UTF-8, but UTF-8 escape sequences. That is, they have certain characters (HTML-sensitive ones such as &, as well as characters outside the basic ASCII range) escaped with the UTF-8 escape sequence, which is \u followed by the character’s four-digit hex value. For example, the & symbol becomes:

\u0026

Cools aye. Sucks, because PHP does not understand this. JavaScript, which is the main method of invoking the Google Language API, understands this natively. PHP doesn’t even understand UTF-8 in PHP4. First I resorted to this ugly function to unescape the UTF-8 escape sequences (convert those UTF-8 escape sequences to actual UTF-8 byte sequences).

/**
 * Convert UTF-8 Escape sequences in a string to UTF-8 Bytes. Old version.
 * @return UTF-8 String
 * @param $str String
 */
function __unescapeUTF8EscapeSeq($str) {
        // Turn each \uXXXX into the HTML entity &#xXXXX; and let html_entity_decode() produce the UTF-8 bytes
        return preg_replace_callback("/\\\u([0-9a-f]{4})/i", create_function('$matches', 'return html_entity_decode(\'&#x\'.$matches[1].\';\', ENT_NOQUOTES, \'UTF-8\');'), $str);
}

The function is ugly because it uses html_entity_decode() to do the transformation for us. We just convert the UTF-8 escape sequence to an HTML escape sequence (HTML entities), then use html_entity_decode() which PHP handles well. I decided on a compatible function that uses bitwise operations instead. Both are included in the source however for reference.
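For reference, a bitwise version could be sketched along these lines. This is not the function shipped in the php-language-api source, and the names are hypothetical. It handles a single \uXXXX escape at a time (code points up to 0xFFFF) and does not combine surrogate pairs; the closure assumes PHP 5.3 or later.

```php
<?php
// Hypothetical sketch; not the function shipped in php-language-api.

// Encode one code point (0..0xFFFF) as UTF-8 bytes using bitwise operations.
function codepoint_to_utf8_sketch($cp)
{
    if ($cp < 0x80) {
        return chr($cp);                                   // 1 byte: 0xxxxxxx
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))                      // 110xxxxx
            . chr(0x80 | ($cp & 0x3F));                    // 10xxxxxx
    }
    return chr(0xE0 | ($cp >> 12))                         // 1110xxxx
        . chr(0x80 | (($cp >> 6) & 0x3F))                  // 10xxxxxx
        . chr(0x80 | ($cp & 0x3F));                        // 10xxxxxx
}

// Replace every \uXXXX escape in $str with the matching UTF-8 bytes.
function unescape_unicode_sketch($str)
{
    return preg_replace_callback(
        '/\\\\u([0-9a-f]{4})/i',
        function ($m) {
            return codepoint_to_utf8_sketch(hexdec($m[1]));
        },
        $str
    );
}
```

The three branches correspond to the one, two and three byte forms of UTF-8: the high bits of the code point are shifted into the lead byte, and each continuation byte carries six more bits behind the 10 prefix.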

The code is very early development and will be buggy. You can check out the latest sources via SVN:

svn checkout http://php-language-api.googlecode.com/svn/trunk/ php-language-api-read-only

Feel free to let me know on the Google project page if you find any bugs.

http://code.google.com/p/php-language-api/issues/list
