RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  $stories=0;
  $xml=file_get_contents($page);
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // Collect the matched fields for this item ($this is reserved in PHP).
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
  }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
   for($i=0; $i<count($tags); $i++) {
    $data[$i]=$item[$i][0];
   }
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   if ($cdata!='hide') {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
   }
   else {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
   }
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
   $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
$theData.="</dl>";
return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
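
The last argument decides what happens to CDATA sections: 'hide' drops them completely, anything else keeps their contents and removes only the wrapper. Here is a minimal sketch of just that step, pulled out into a hypothetical helper called handle_cdata() (the name and the sample string are mine, not part of the function above), using a pattern equivalent to the one in xml_parser():

```php
<?php
// Hypothetical helper for illustration only; xml_parser() does this inline.
function handle_cdata($text, $cdata = '') {
    // Ungreedy (U) and dot-matches-newline (s), as in the original pattern.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // 'hide' removes the whole section; any other value keeps the contents.
    $replacement = ($cdata == 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

$sample = 'Intro <![CDATA[<p>Full story</p>]]> outro';
echo handle_cdata($sample), "\n";         // Intro <p>Full story</p> outro
echo handle_cdata($sample, 'hide'), "\n"; // Intro  outro
```

Note that 'hide' leaves whatever sat outside the CDATA section untouched, which is why the second line still prints the surrounding words.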

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, Lockergnome - once with the CDATA content hidden and once with it shown - and flickr).

bbc

Wright quits Labour but not PCC role
Shaun Wright resigns from the Labour Party but refuses to quit his South Yorkshire police and crime commissioner role following a damning report into child abuse in Rotherham.
Date: Thu, 28 Aug 2014 01:08:01 GMT
PM: UK 'supports million Scots jobs'
The prime minister is to tell business leaders that the UK is "an economy of opportunity" which supports one million Scottish jobs.
Date: Wed, 27 Aug 2014 23:55:08 GMT
Russia pressed over Ukraine fighting
Germany demands an explanation from Russia's Vladimir Putin amid reports that Russian troops launched an incursion into south-east Ukraine.
Date: Thu, 28 Aug 2014 02:23:34 GMT
Depression in cancer 'overlooked'
Three-quarters of cancer patients who are depressed are not getting the psychological therapy they need, researchers say.
Date: Thu, 28 Aug 2014 00:36:20 GMT
'Deeply elitist UK locks out talent'
The UK is "deeply elitist" according to a new analysis of the backgrounds of more than 4,000 business, political, media and public sector leaders.
Date: Thu, 28 Aug 2014 00:43:47 GMT
UK Jews and Muslims issue peace call
The Jewish Board of Deputies and Muslim Council of Britain issue an unprecedented joint statement calling for peace and condemning prejudice.
Date: Thu, 28 Aug 2014 02:36:00 GMT
US hostage mother makes video plea
The mother of US journalist Steven Sotloff, held hostage by IS militants, pleads with his captors to release him in a video appeal.
Date: Wed, 27 Aug 2014 19:45:03 GMT
Ebola outbreak 'will get worse'
A top US public health official says the Ebola outbreak is set to get worse before it gets better, as West African health ministers meet in Ghana.
Date: Thu, 28 Aug 2014 02:36:27 GMT
Cancer drugs face NHS price squeeze
The government might threaten to stop buying some expensive cancer drugs if the companies that make them do not cut their prices, Newsnight learns.
Date: Wed, 27 Aug 2014 21:01:19 GMT
Brazil pursues Amazon 'destroyers'
Brazilian police say they have dismantled a criminal organisation they believe was the "biggest destroyer" of the Amazon rainforest.
Date: Wed, 27 Aug 2014 22:52:52 GMT

cnn

How grandma got hooked on heroin
Cynthia Scudo was a mother and a grandmother who worked full-time. But behind her all-American image was a dark secret that could have killed her.
Date: Wed, 27 Aug 2014 21:46:14 EDT
Alarming increase in near-collisions
New statistics from the FAA show a sharp increase in near-collisions of passenger planes. CNN's Rene Marsh reports.
Date: Tue, 26 Aug 2014 20:42:57 EDT
Vanessa Williams hit with tax lien
The IRS filed a tax lien against Vanessa Williams, saying the singer-actress owes the federal government $369,249 for her 2011 earnings.
Date: Wed, 27 Aug 2014 16:41:45 EDT
Flash floods wash bus away
At least seven people are dead after heavy rains spark flash flooding in South Korea. CNN's Mari Ramos reports.
Date: Tue, 26 Aug 2014 13:55:04 EDT
Solar system is inside bubble?
Ever feel like you're living in a bubble? You are. Our whole solar system is, say space scientists, who published work last month corroborating its existence. And, oh, what a bubble it is! 300 light years long. And about a million degrees hot.
Date: Wed, 27 Aug 2014 16:56:19 EDT
Kate Bush returns after 35 years
After 35 years away, British singer Kate Bush returned to the stage Tuesday night -- and the response was rapturous.
Date: Wed, 27 Aug 2014 12:17:19 EDT
Holocaust-like shirts pulled from stores
Spanish fashion retailer Zara has apologized for selling a striped T-shirt bearing a yellow star that drew criticism for its resemblance to uniforms worn by Jewish concentration camp inmates.
Date: Wed, 27 Aug 2014 14:50:26 EDT
Greatest building implosion ever?
A hotel is demolished in Albany, New York, with a colorful fireworks show. WRGB reports.
Date: Mon, 25 Aug 2014 07:10:04 EDT
Drone's eye view of quake damage
See a drone's eye view of the aftermath of the 6.0 Napa earthquake.
Date: Mon, 25 Aug 2014 21:30:13 EDT
Panda 'may have faked pregnancy'
A giant panda intended to be the star of the first ever live broadcast of the birth of panda cubs has lost the role -- after it was discovered the bear is not pregnant after all, Chinese state media reported.
Date: Wed, 27 Aug 2014 06:29:11 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

(Untitled)

richseow posted a photo:

Date: 2014-08-28T05:04:16Z
_MG_0094

paddlenswmarathon posted a photo:

_MG_0094

Date: 2014-08-28T05:04:17Z
IMG_2667

johnandchelsea2 posted a photo:

IMG_2667

Date: 2014-08-28T05:04:08Z

政偉WeGo posted a photo:

Date: 2014-08-28T05:04:12Z
_MG_3499.jpg

j.nillo_22 posted a photo:

_MG_3499.jpg

Date: 2014-08-28T05:04:13Z
ilex, myyard, jdy324 XX201211192264.jpg

rachelgreenbelt posted a photo:

ilex, myyard, jdy324 XX201211192264.jpg

Date: 2014-08-28T05:04:06Z
IMG_6656

Bjammin_B posted a photo:

IMG_6656

Date: 2014-08-28T05:04:11Z
DSC_0319_20_21 copy

Bowman Group Architectural Photography posted a photo:

DSC_0319_20_21 copy

Date: 2014-08-28T05:04:09Z
Convoy_MP2008_001

Megaplexcon posted a photo:

Convoy_MP2008_001

Date: 2014-08-28T05:04:12Z
P09A9712.jpg

Cap7ainClu7ch posted a photo:

P09A9712.jpg

Date: 2014-08-28T05:04:13Z
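
For comparison, here is a rough sketch of the same kind of extraction done with PHP's bundled SimpleXML extension instead of regular expressions. This is a different technique from the function above, not a drop-in replacement: feed_items() is a hypothetical name, the sample feed is made up so the code runs without a network connection, and it assumes the plain RSS 2.0 layout (channel, then item), so an Atom feed such as flickr's would need different element names.

```php
<?php
// Made-up sample feed; a real page would fetch the XML from a feed URL.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <title>First story</title>
      <description><![CDATA[<p>Some <b>markup</b></p>]]></description>
      <link>http://example.com/1</link>
      <pubDate>Thu, 28 Aug 2014 01:08:01 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <description>Plain text</description>
      <link>http://example.com/2</link>
      <pubDate>Wed, 27 Aug 2014 23:55:08 GMT</pubDate>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns up to $number items as associative arrays.
function feed_items($xml, $number = 100) {
    // LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
    $feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    $items = array();
    if ($feed === false) {
        return $items; // the XML could not be parsed
    }
    foreach ($feed->channel->item as $item) {
        if (count($items) >= $number) {
            break;
        }
        $items[] = array(
            'title'       => (string) $item->title,
            'description' => (string) $item->description,
            'link'        => (string) $item->link,
            'date'        => (string) $item->pubDate,
        );
    }
    return $items;
}

// Print one line per story: title followed by its date.
foreach (feed_items($xml, 10) as $story) {
    echo $story['title'], ' (', $story['date'], ")\n";
}
```

The returned arrays could then be turned into the same dl/dt/dd markup that xml_parser() builds.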

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

Thanks, this sorted out my CDATA parsing problem. It seems that is not too clear in the docs.

s
