Creating a personal aggregation page using RSS, Blueprint and more
If you are like me, you post to several blogs, contribute to many other sites and want to promote your social network profiles as well. In this post, I will create a personal aggregation page for myself where I show the latest 5 posts from my blogs, my latest Twitter updates and links to my social network profiles. I wanted one URL to give to every site I comment on or post to, covering all my web content, instead of trying to pick the most relevant site for wherever I am posting (i.e. using my Sustain Myself blog on eco sites, my Fliquid URL for tech sites, etc). I now just use the one address for everything.
This post was initially going to be a simple RSS feed reader in PHP, then I just kept adding to it and now it's a bit of a giant.
In this post I will:
1. Demonstrate how easy Blueprint (CSS Framework) is to work with and create a simple 3 column layout.
2. Write a simple RSS reader in PHP
3. Create some simple .htaccess redirect rules
4. Glue it all together using MySQL and a cron job
The end result will resemble my personal aggregator at www.cb.net.au.
Blueprint CSS
I have never been too strong with my CSS skills. I tend to look over what other people have built in CSS and then adapt it to my needs, mainly because I have never given myself enough time to learn it properly, and things like float and positioning leave me stranded without a clue. When I decided on the topic for this post I thought I'd love to touch on some CSS for the layout, but my knowledge is not sufficient for that. I had heard a lot about Blueprint in the past and thought I'd have a quick look (it had to be quick, as time is of the essence for me). Generally a quick look for me means that I can't understand it and I give up trying, at least until I can sit down properly and nut it out. There was absolutely no need for any of that with Blueprint. After one simple tutorial, I had built a 3 column layout with really nice fonts in no time at all.
Blueprint uses a 'grid' layout system which I thought seemed a bit odd at first, but makes a lot of sense. You specify a number of 'cols' for your grid (in most examples it is 24). Then you split your 'cols' up into actual columns, with widths relative to how much of that 24 you wish to assign to each.
For example:
We create a container div:
<div class="container">
We then create a header div (and specify its width in columns with 'span-24'):
<div id="header" class="span-24">
That tells Blueprint to use all 24 cols to display the header, so there will be nothing on either side of it. When it comes to creating our 3 col layout below, it's as simple as giving each column a width in cols so that the total is no more than 24:
<div id="content" class="span-10">
<div id="midbar" class="span-6">
<div id="sidebar" class="span-8 last">
And that's it. A 3 col layout. The 'last' class removes the right-hand margin from the final column so that it is laid out properly. Easy eh?
I won't touch on the typography features or other layout classes, but a quick read of the Blueprint site and a Google search will provide more than enough info to get you started.
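To make the grid arithmetic concrete, here is a small sketch. It assumes the stock Blueprint grid (24 cols of 30px with a 10px gutter, giving a 950px container); the constant and function names are mine, purely for illustration.

```php
<?php
// Assumed defaults of the stock Blueprint grid: 24 cols, 30px each, 10px gutter.
const GRID_COLS = 24;
const COL_WIDTH = 30;
const GUTTER    = 10;

// Pixel width of a span-N block: N columns plus the gutters between them.
function spanWidth(int $cols): int
{
    return $cols * COL_WIDTH + ($cols - 1) * GUTTER;
}

// The three columns from the example above (id => number of cols).
$layout = ['content' => 10, 'midbar' => 6, 'sidebar' => 8];

$totalCols = array_sum($layout);
echo "Columns used: {$totalCols} of " . GRID_COLS . "\n";

foreach ($layout as $id => $cols) {
    echo "#{$id}: span-{$cols} = " . spanWidth($cols) . "px\n";
}
```

If the total goes over 24, the last column no longer fits in the container and drops below the others.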
This is the HTML for my site:
<div class="container">
    <div id="header" class="span-24 first last">
        <h1 class="loud">cb.net.au</h1>
        <h3 class="quiet">The home of Christian Biggins</h3>
    </div>
    <div class="span-10 colborder" id="content">
        <h2 class="loud">Blog Posts</h2>
    </div>
    <div class="span-6 colborder" id="sidebar">
        <div id="twitter_div">
        </div>
    </div>
    <div class="span-6 last" id="sidebar2">
    </div>
</div>
MySQL
Now, the next part was to create a small db to keep the feeds and their posts in. This post is not touching on storage, encoding or anything like that, so this is all pretty simple.
SQL:
DROP TABLE IF EXISTS feeds;
CREATE TABLE feeds (
    id int(10) unsigned auto_increment,
    Title varchar(255) NOT NULL,
    RSS tinytext NOT NULL,
    SiteURL tinytext NOT NULL,
    LastFetched TIMESTAMP NOT NULL,
    PRIMARY KEY(id)
);

DROP TABLE IF EXISTS feedcontents;
CREATE TABLE feedcontents (
    FeedID int(10) unsigned NOT NULL,
    FeedData blob NOT NULL,
    PRIMARY KEY(FeedID)
);
RSS Reader
I now needed to write a small RSS reader to scrape my FeedBurner feeds and put them into a user friendly array to serialize and store in the db. I know there are many RSS readers out there, but I love building my own stuff and I get a kick out of it. Why re-invent the wheel? Why not ask every company in the world why they try to compete with others?
That's a topic for another post.
So, my RSS reader (still in BETA: it's untested on anything apart from FeedBurner feeds, so if you come across a bug, please let me know in the comments, on Twitter, by email, or anything) can be seen here: http://www.fliquidstudios.com/projects/fliquid-rss-library/
It's a very simple script and can be read fairly easily. To instantiate:
$rss = new FliquidRSS('http://feedurl');
$rss->parseRSS();
var_dump($rss->xmlarray);
Then you can just use the same object for another URL, like so:
$rss->newURL('http://feedurl2');
The problem with RSS is that one post is spread across multiple elements. It's not as easy as just converting your XML to an array. You need to look for specific elements, such as the opening item element, start 'scraping' the data from there, and stop when you hit the closing item element. So, to do this we simply create a flag 'initem', and as long as that flag is set to true, we gather the required data:
$inItem = FALSE; // Not inside an item until we see an opening item tag
foreach ($this->xmlstruct as $element) {
    if ($element['tag'] == 'item' && $element['type'] == 'open') { // We have just opened a new item
        if ($this->itemcount >= $this->maxitems) break; // Stop once we have enough items
        $inItem = TRUE; // Set our 'initem' flag to true
        $this->itemcount++; // Increase our item count
    }
    if ($inItem) {
        if ($element['type'] == 'complete') { // A tag with a value and no children
            switch ($element['tag']) {
                case 'title':
                    $this->xmlarray[$this->itemcount]['title'] = $element['value'];
                    break;
                case 'link':
                    $this->xmlarray[$this->itemcount]['link'] = $element['value'];
                    break;
                case 'description':
                    $this->xmlarray[$this->itemcount]['description'] = $element['value'];
                    break;
            }
        }
    }
    if ($element['tag'] == 'item' && $element['type'] == 'close') { // We have just closed the item
        $inItem = FALSE; // No longer in an item, next item will get a new itemcount.
    }
}
So, that creates a multi dimensional array in which each post has the following array keys: title, link, description. There are many more elements in an RSS feed, but these are the only ones that were relevant to what I was doing. We discard the rest.
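For anyone who wants to try the flag technique without the library, here is a minimal standalone sketch. It assumes the element list is built with PHP's xml_parse_into_struct() (the 'tag', 'type' and 'value' keys match what that function returns, but I am not showing how the library builds xmlstruct here). The function name sketchParseItems and the example.com feed are made up for illustration.

```php
<?php
// Hypothetical helper, not part of FliquidRSS: pull up to $maxItems posts out of an RSS string.
function sketchParseItems(string $xml, int $maxItems = 5): array
{
    $parser = xml_parser_create();
    // Keep tag names as written ('item', not 'ITEM') and skip whitespace-only values.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $xml, $struct);

    $items  = [];
    $inItem = false; // The 'initem' flag
    foreach ($struct as $element) {
        if ($element['tag'] === 'item' && $element['type'] === 'open') {
            if (count($items) >= $maxItems) {
                break; // We have enough posts
            }
            $inItem  = true;
            $items[] = []; // Start a new post
        } elseif ($element['tag'] === 'item' && $element['type'] === 'close') {
            $inItem = false;
        } elseif ($inItem && $element['type'] === 'complete'
            && in_array($element['tag'], ['title', 'link', 'description'], true)) {
            // Only keep the three elements we care about, and only inside an item.
            $items[count($items) - 1][$element['tag']] = $element['value'] ?? '';
        }
    }
    return $items;
}

// A tiny made-up feed. The channel's own title and link sit outside any item, so they are ignored.
$feed = <<<XML
<rss version="2.0"><channel>
<title>Channel title</title>
<link>http://example.com/</link>
<item><title>First post</title><link>http://example.com/1</link><description>One</description></item>
<item><title>Second post</title><link>http://example.com/2</link><description>Two</description></item>
</channel></rss>
XML;

$posts = sketchParseItems($feed);
var_dump($posts);
```

Without the flag, the channel's title and link would be picked up as if they belonged to a post, which is exactly the problem described above.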
Updating our local posts database
We need to make an 'update' file to run in a cron job. You can't have the RSS reader running on every page load for the display as it takes a while (a good 8 to 10 seconds to grab my 3 feeds). So we pop the updating code in a file to run every night, every 12 hours, or as regularly as you want, and keep our tables up to date with our latest posts. Keep in mind that you do not want this file in a web accessible directory, as malicious users could run the script non stop and hammer both your server and the feeds it fetches.
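As an extra safety net, the update file can also refuse to run unless it was started from the command line. This is just a sketch of one way to do that, not something taken from my update file, and the function name isCommandLine is made up.

```php
<?php
// Hypothetical guard for the top of update.php.
// PHP_SAPI is 'cli' when the script is started with "php update.php" (which is how cron runs it).
function isCommandLine(): bool
{
    return PHP_SAPI === 'cli';
}

if (!isCommandLine()) {
    // Somebody requested the file through the web server: refuse and stop.
    http_response_code(403);
    exit('This script can only be run from the command line.');
}

echo "Running update...\n";
```

Keeping the file outside the web root is still the first line of defence. The guard only helps if the file ends up somewhere public by accident.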
The update file's code can be seen here: http://www.fliquidstudios.com/projects/christians-aggregator-updatephp/
As we need the array we store to be usable directly from the database, we serialize the array. This means that PHP converts the array into a string representation that can be stored in MySQL. Now, I cheat here and urlencode my serialized array so that quotes and other special characters cannot break the query, as I was not going to touch on encoding and decoding strings. It does the job but it is not the best solution for this.
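Here is a quick sketch of that round trip, with made-up post data. The database part is left out. $stored is simply the string that would go into FeedData.

```php
<?php
// Made-up example data in the same shape as the reader's array (keys start at 1).
$posts = [
    1 => [
        'title'       => "It's a post",
        'link'        => 'http://example.com/1',
        'description' => 'Quotes " and ampersands & survive the trip',
    ],
];

// Storing: serialize the array, then urlencode it so no quotes are left in the string.
$stored = urlencode(serialize($posts));
echo $stored . "\n";

// Reading: reverse the two steps. 'allowed_classes' => false tells unserialize()
// not to create objects, since we only ever expect plain arrays and strings here.
$restored = unserialize(urldecode($stored), ['allowed_classes' => false]);

var_dump($restored === $posts);
```

A cleaner approach is to pass the serialized string to MySQL as a bound parameter in a prepared statement, in which case the urlencode step is not needed at all.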
As a cron job can run a script from anywhere on the server, we can store the file anywhere and invoke it like so:
$ php update.php
A note on the updater and the index files: this post is not about databases. I could be creating my database connection, doing my querying, quoting results etc in a much more involved and pretty fashion, but I am not touching on db's and db objects here, so I have done the bare minimum. We will post on db objects and the best way to query in a later post.
Twitter updates
I could have used the RSS feed from my Twitter account and treated it in a similar fashion to the RSS feeds I was already gathering, but Twitter provides nice widgets in plain text and Flash. I decided to use the plain text (JavaScript powered) version so that later on, when I apply styling to my page (in another post), I can easily make it look the way I want. Grab the widgets here: http://twitter.com/widgets
My display
The PHP for my index can be seen here: http://www.fliquidstudios.com/projects/christians-aggregator-indexphp/
I just have that PHP code nested in my content div in my layout.
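To give a rough idea of what goes in the content div, here is a sketch that turns one feed's posts into HTML. It is not my actual index code (that is at the link above). The function name renderPosts and the sample data are made up, and it assumes the posts have already been read from the db and unserialized.

```php
<?php
// Hypothetical helper: build the HTML for one feed's posts.
function renderPosts(string $feedTitle, array $posts): string
{
    // Escape everything that came from a feed before putting it into the page.
    $html = '<h3>' . htmlspecialchars($feedTitle, ENT_QUOTES, 'UTF-8') . "</h3>\n<ul>\n";
    foreach ($posts as $post) {
        $title = htmlspecialchars($post['title'] ?? '', ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars($post['link'] ?? '#', ENT_QUOTES, 'UTF-8');
        $html .= "  <li><a href=\"{$link}\">{$title}</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Made-up data in the same shape the reader produces.
$output = renderPosts('Example blog', [
    1 => ['title' => 'Tom & Jerry', 'link' => 'http://example.com/1', 'description' => 'One'],
    2 => ['title' => 'Second post', 'link' => 'http://example.com/2', 'description' => 'Two'],
]);

echo $output;
```

The description is left out on purpose in this sketch. Feed descriptions usually contain HTML of their own, so how much of it to trust and print is a decision for your own site.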
My Links
I got so jack of trying to remember my usernames on all the different sites I am a member of when telling people, so I made a bunch of 301 redirects to those pages using Apache's .htaccess file.
redirect 301 /github http://www.github.com/cbiggins
So, when somebody visits cb.net.au/github they will be redirected to my github profile. Also, it means that I can link to that by assigning my href the value of “/github” – too easy.
So, my final result is this: www.cb.net.au
Further Reading:
Blueprint CSS: http://www.blueprintcss.org/
RSS 2.0 Specification: http://cyber.law.harvard.edu/rss/rss.html
