edit:
I zipped up the files and provided an example of the output. Go here to see the example and download the files.
It pains me when I read product feed users, particularly those using PHP, complaining about the duplicate content issue you get when you republish a feed on your site. It's an easy problem to solve with a little programming know-how.
One solution I use is to present most of the text as an image. In case you didn't know, PHP can create images dynamically using the now-standard GD library.
The first file contains just a function that draws line-wrapped text to fit the image. The second is the actual code I use to create the image of the text. You'll see that I call the drawing function twice: the first call determines the height the image needs for the given width and amount of text, and the second draws the text on an image of that height.
I am not going to go into great detail about how it all works. You either get it or you don't; feel free to post questions for clarification, though.
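The wrapping idea can be sketched on its own, separate from GD. This is a minimal illustration, not the code from the zip: `wrapWords()` and its `$measure` callback are hypothetical names, and the stand-in measurer simply assumes every character is 6 pixels wide. With GD and FreeType available, the callback would wrap `imagettfbbox()` instead.

```php
<?php
// Greedy word wrap with a pluggable width measurer.
// wrapWords() is a hypothetical helper for illustration only.
// $measure receives a string and returns its rendered width in pixels.
function wrapWords(string $text, float $maxWidth, callable $measure): array
{
    $words = preg_split('/\s+/', trim(strip_tags($text)), -1, PREG_SPLIT_NO_EMPTY);
    $lines = [];
    $line = '';
    foreach ($words as $word) {
        $candidate = ($line === '') ? $word : $line . ' ' . $word;
        if ($line !== '' && $measure($candidate) > $maxWidth) {
            $lines[] = $line; // the current line is full, so start a new one
            $line = $word;
        } else {
            $line = $candidate;
        }
    }
    if ($line !== '') {
        $lines[] = $line; // keep the last, partially filled line
    }
    return $lines;
}

// With GD and FreeType, the measurer could be based on imagettfbbox():
// $measure = function ($s) use ($font, $size) {
//     $box = imagettfbbox($size, 0, $font, $s);
//     return $box[2] - $box[0];
// };

// Stand-in measurer (assumption): every character is 6 pixels wide.
$measure = function ($s) {
    return strlen($s) * 6;
};
$lines = wrapWords('the quick brown fox jumps', 60, $measure);
```

Once the lines are known, the image height follows from the line count times the line height, which is why the text can be measured before the final image is created.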
Notes:
- You need to know where your ttf fonts are located.
- You need the GD library installed, with the FreeType library compiled into PHP.
- You'll need to know how to dynamically load your text in the script.
- Call the script by linking to it from an image tag, passing whatever you need to look up the text.
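A quick way to check the first two notes before going further is to ask PHP what it has available. This is a rough sketch: `textImageRequirements()` is a hypothetical helper name, and the font path in the usage line is a placeholder you would replace with your own.

```php
<?php
// Report whether the pieces listed in the notes are available.
// textImageRequirements() is a hypothetical helper, not part of the zip.
function textImageRequirements(string $fontPath): array
{
    $hasGd = extension_loaded('gd');
    $info = $hasGd ? gd_info() : [];
    return [
        'gd'       => $hasGd,
        'freetype' => !empty($info['FreeType Support']),
        'ttf'      => function_exists('imagettfbbox') && function_exists('imagettftext'),
        'png'      => !empty($info['PNG Support']),
        'font'     => is_readable($fontPath),
    ];
}

// Placeholder path (assumption): point this at your own .ttf file.
foreach (textImageRequirements('/path/to/fonts/VeraSe.ttf') as $name => $ok) {
    echo $name, ': ', $ok ? 'yes' : 'MISSING', PHP_EOL;
}
```

If anything prints MISSING, the scripts below are unlikely to work until that piece is installed or the path is corrected.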
Copy and paste this into a file called image_functions.php
<?php
//
// ripped this from http://us2.php.net/manual/en/function.imagettfbbox.php
//
// Draws $text word-wrapped between $left and $right and returns the total height of the lines.
function imageprintWordWrapped(&$image, $top, $left, $right, $font, $color, $text, $textSize, $halign = 'left')
{
    $maxWidth = $right - $left; // the trivial change
    $words = explode(' ', strip_tags($text)); // split the text into an array of single words
    $lines = array();
    $line = '';
    while (count($words) > 0) {
        $dimensions = imagettfbbox($textSize, 0, $font, $line . ' ' . $words[0]);
        $lineWidth = $dimensions[2] - $dimensions[0]; // the length of this line, if the word is to be included
        if ($lineWidth > $maxWidth) { // if this makes the text wider than anticipated
            $lines[] = $line; // add the line to the others
            $line = ''; // empty it (the word is added just below, outside the if)
        }
        $line .= ' ' . $words[0]; // add the word to the current sentence
        $words = array_slice($words, 1); // remove the word from the array
    }
    if ($line != '') { $lines[] = $line; } // add the last line to the others, if it isn't empty
    // added some padding in the line height
    $lineHeight = 1.5 * ($dimensions[1] - $dimensions[7]); // the height of a single line
    $height = count($lines) * $lineHeight; // the height of all the lines total
    // do the actual printing
    $i = 1;
    foreach ($lines as $line) {
        if ($halign == 'center') {
            // figure out width of line
            $dimensions = imagettfbbox($textSize, 0, $font, $line);
            $lineWidth = $dimensions[2] - $dimensions[0];
            // figure out where the center is.
            $center = floor($maxWidth / 2 + $left);
            $leftStart = $center - $lineWidth / 2;
        } else if ($halign == 'right') {
            // figure out width of line
            $dimensions = imagettfbbox($textSize, 0, $font, $line);
            $lineWidth = $dimensions[2] - $dimensions[0];
            $leftStart = $left + $maxWidth - $lineWidth;
        } else {
            $leftStart = $left;
        }
        // coordinates are rounded because imagettftext() expects whole pixels
        imagettftext($image, $textSize, 0, (int) round($leftStart), (int) round($top + $lineHeight * $i), $color, $font, $line);
        $i++;
    }
    return $height;
}
?>
This is the bit that creates the image of the text.
You will need to figure out how to dynamically get the correct text. I usually use SQL, but the choice depends on the situation.
Copy and save to text2image.php
<?php
// ROOT_DIRECTORY and FONT_DIR are constants you need to define for your own setup
include( ROOT_DIRECTORY . '/image_functions.php' );
$text = 'THE TEXT YOU WANT GOES HERE';
$text = preg_replace( '/\r?\n/', ' ', $text ); // turn line breaks into spaces so words don't run together
// grab font (the name must match your font file; the zipped download ships Vera.ttf)
$font = FONT_DIR . '/VeraSe.ttf';
// image width in pixels (an assumed value; set it to suit your layout)
$width = 400;
// black and white here
// first pass: draw on a throwaway 1-pixel-high image just to measure the text height
$req_height = 1;
$image = imagecreate( $width, $req_height );
$bg_color = ImageColorAllocate( $image, 255, 255, 255 );
$color = ImageColorAllocate( $image, 0, 0, 0 );
$textSize = 10;
$top = 10;
$left = 0;
$right = $width;
$height = imageprintWordWrapped( $image, $top, $left, $right, $font, $color, $text, $textSize, 'left' );
if( $height > $req_height )
{
    // second pass: recreate the image tall enough and draw the text for real;
    // doubling the measured height leaves ample room for the top offset and descenders
    ImageDestroy( $image );
    $image = imagecreate( $width, (int) ceil( $height * 2 ) );
    $bg_color = ImageColorAllocate( $image, 255, 255, 255 );
    $color = ImageColorAllocate( $image, 0, 0, 0 );
    imageprintWordWrapped( $image, $top, $left, $right, $font, $color, $text, $textSize, 'left' );
}
header( 'Content-type: image/png' );
ImagePNG( $image );
?>
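To use the script, point an image tag at it from the page where the text would otherwise appear. The sketch below builds such a tag; `textImageTag()` is a hypothetical helper, and the `id` and `width` query parameters are assumptions, since the script above does not read any parameters until you add your own text lookup.

```php
<?php
// Build the <img> tag that points at text2image.php.
// The "id" and "width" query parameters are hypothetical: use whatever
// your copy of the script reads to look up its text.
function textImageTag(string $scriptUrl, int $id, int $width, string $alt = ''): string
{
    $src = $scriptUrl . '?' . http_build_query(['id' => $id, 'width' => $width], '', '&');
    return sprintf(
        '<img src="%s" width="%d" alt="%s">',
        htmlspecialchars($src, ENT_QUOTES, 'UTF-8'),
        $width,
        htmlspecialchars($alt, ENT_QUOTES, 'UTF-8')
    );
}

echo textImageTag('/text2image.php', 42, 400), PHP_EOL;
```

Casting the parameters to integers on the way in, and again inside the script before using them in a query, keeps arbitrary input out of the lookup.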
Filed under: Content Creation, General, Web Technologies | Scott @ September 17, 2005 11:35 pm
I tried to use the zipped version of this resource. The TrueType font was named “Vera.ttf,” but the “text2image.php” code calls for “Verase.ttf.” I had to change it to get the script to work. Or you could make sure the font file name is correct (rename if necessary).
Thanks for the heads up on the typo. I’m surprised no one else brought that up.
interesting and pretty good ;)
keep on da nice job yo’
I like the image idea and have been using it myself for a few quotations that are repeated across a few of the sites I'm working on. I am also wondering about the efficacy of doing it the utterly honest way, by using the blockquote and cite tags for quotes and citations where they are repeated between sites, in order to avoid any detriment to search engine performance. Building these sorts of semantic indicators into the code seems to work for anything else; however, after a bit of searching I've found no evidence to back up the assumption that these tags help in any way with duplicate content. Logically they should, and some sites even recommend them for their SEM benefits. Could be time, however, to either ask Matt Cutts or just bite the bullet and test it.
I dunno, Chris. I have never tested that.