RSS/XML feed parser

Here's some PHP:

PHP
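function xml_parser($page, $container, $tags, $number, $cdata) {
  // Default to a maximum of 100 items when no limit is passed.
  if (!$number) { $number = 100; }
  $xml = file_get_contents($page);
  // Collect every <container>...</container> block (s: dot matches newlines, U: ungreedy).
  preg_match_all("/<$container>.+<\/$container>/sU", $xml, $items);
  $items = $items[0];
  $itemsArray = array();
  foreach ($items as $item) {
    // $fields replaces the original $this, which is reserved in PHP and
    // cannot be used as an ordinary variable in a plain function.
    $fields = array();
    for ($i = 0; $i < count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      // Strip the enclosing tag pair, then decode HTML entities.
      $fields[$i] = preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU", '$1', $tag);
      $fields[$i] = array_map('html_entity_decode', $fields[$i]);
    }
    if (count($itemsArray) < $number) { array_push($itemsArray, $fields); }
  }
  $theData = "<dl>";
  foreach ($itemsArray as $item) {
    for ($i = 0; $i < count($tags); $i++) {
      // A tag missing from this item becomes an empty string instead of raising a notice.
      $data[$i] = isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title = $data[0];
    // Tidy the description: normalise <img> tags, drop <content> wrappers and border attributes.
    $dpatterns[0] = "/<img(.+)><\/img>/sU"; $dreplacements[0] = '<img$1>';
    $dpatterns[1] = "/<img(.+)\/>/sU"; $dreplacements[1] = '<img$1>';
    $dpatterns[2] = "/<(\/|)content?(.+|)>/sU"; $dreplacements[2] = '';
    $dpatterns[3] = "/border=\"0\"/sU"; $dreplacements[3] = '';
    // CDATA: keep the wrapped content unless $cdata is 'hide', in which case drop it entirely.
    $dpatterns[4] = "/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4] = ($cdata != 'hide') ? '$1' : '';
    $description = preg_replace($dpatterns, $dreplacements, $data[1]);
    // Atom-style <link href="..."/> elements: pull out the href value.
    $link = preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU", '$1', $data[2]);
    $date = $data[3];
    $theData .= "
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
  $theData .= "</dl>";
  return $theData;
}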

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get: the latest feeds from the BBC, CNN, Lockergnome (once with the CDATA content hidden and once with it shown) and flickr.

bbc

Benn denies fuel bill cave-in
Environment Secretary Hilary Benn denies ministers "caved in" to energy firms over cash rebates for soaring fuel bills.
Date: Fri, 05 Sep 2008 13:34:59 GMT
Shannon mother accused of kidnap
The mother of Shannon Matthews and a 40-year-old man plead not guilty to kidnapping the Dewsbury schoolgirl.
Date: Fri, 05 Sep 2008 11:49:13 GMT
Winds and rain battering Britain
Heavy rainfall and strong winds are sweeping across south Wales and western England, bringing the risk of flooding.
Date: Fri, 05 Sep 2008 14:01:29 GMT
Flooded Haitians 'in dire need'
The UN says severe storms that have hit Haiti in recent weeks have left some 600,000 people in desperate need of help.
Date: Fri, 05 Sep 2008 14:01:06 GMT
McCain vows to fight to change US
John McCain promises "change is coming" as he accepts the Republican party's candidacy for the White House.
Date: Fri, 05 Sep 2008 04:34:00 GMT
Canoeist appeals against sentence
Back-from-the dead canoeist John Darwin is to appeal against his prison sentence for fraud, his lawyer says.
Date: Fri, 05 Sep 2008 12:36:19 GMT
UK food prices show 8.3% increase
UK food prices have risen by 8.3% on average since January, with meat and fish up 23%, according to a study for the BBC.
Date: Thu, 04 Sep 2008 23:07:46 GMT
Sex killer 'could not be stopped'
A stalker who killed his ex-lover and then had sex with her corpse could not have been stopped, a report finds.
Date: Fri, 05 Sep 2008 06:05:37 GMT
Chaos at £20,000 petrol giveaway
There are angry scenes at a petrol station in north London which is giving away £20,000 of petrol in a promotion.
Date: Fri, 05 Sep 2008 12:17:46 GMT
Westlife's 'biggest fan' proves her love for Irish band
A mother of four gets a tattoo of her favourite band across her back to celebrate her 40th birthday.
Date: Fri, 05 Sep 2008 13:22:27 GMT

cnn

Bush to praise McCain as 'ready to lead'
In comments to be shown at the Republican National Convention tonight, President Bush calls presumptive nominee Sen. John McCain "ready to lead," saying he's been prepared for the presidency by a lifetime of service.

Date: Tue, 02 Sep 2008 19:50:24 EDT
Lawmaker: Palin due to face investigators
As she takes part in the Republican National Convention with Sen. John McCain, the abuse of power probe facing Alaska Gov. Sarah Palin at home is charging ahead -- and the governor is expected to be questioned later this month, according to the lawmaker overseeing the investigation.

Date: Tue, 02 Sep 2008 19:27:08 EDT
Anti-abortion group says Palin 'walks her talk'
Sarah Palin's announcement that her 17-year-old daughter is pregnant -- and she supports her daughter -- shows that the Alaska governor is steadfast in her support of family values, GOP loyalists and anti-abortion groups say. "She walks her talk," said Serrin Foster, the president of Feminists for Life of America.

Date: Tue, 02 Sep 2008 18:32:00 EDT
Leslie Sanchez: Obama's high-tech advantage
In 1840, a young Whig organizer named Abraham Lincoln wrote the guidebook on political field work. His "confidential" circular advised Whig campaign operatives to "make a perfect list of all the voters and ascertain with certainty for whom they will vote."

Date: Tue, 02 Sep 2008 11:47:35 EDT
Papers suggest Olympians got banned drugs
Two members of the 2008 Jamaican Olympic track team received shipments of performance-enhancing drugs through an Internet distribution network, according to documents obtained by SI.

Date: Tue, 02 Sep 2008 17:22:58 EDT
Trio of storms stirring up Atlantic
Hurricane Hanna's path and strength remain uncertain, but the latest forecast map from the National Hurricane Center predicts it could make landfall as a major hurricane on the southeastern U.S. coast by Friday evening.

Date: Tue, 02 Sep 2008 18:48:28 EDT
Sheriff: 'Strong probability' Caylee is dead
Read full story for latest details.

Date: Tue, 02 Sep 2008 13:37:12 EDT
Police: Tycoon killed family in mansion arson
British police said Tuesday they now believe a millionaire killed his wife and daughter before setting fire to their mansion home and killing himself.

Date: Tue, 02 Sep 2008 13:43:38 EDT
'Bandit' star Jerry Reed dies
Read full story for latest details.

Date: Tue, 02 Sep 2008 15:07:45 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

pictures (299)

pmahbobi posted a photo:

pictures (299)

Date: 2008-09-05T14:16:50Z
n286800394_126478_6438

rich viner posted a photo:

n286800394_126478_6438

In this photo: Kathryn Gelder, Jimmy Mousicos

Date: 2008-09-05T14:16:52Z
Resize of snapshot20080901225706

jimmysho100 posted a photo:

Resize of snapshot20080901225706

Date: 2008-09-05T14:16:52Z
000019

Vincent Arthur posted a photo:

000019

Date: 2008-09-05T14:16:50Z
DSC02438

eandgindominica posted a photo:

DSC02438

Date: 2008-09-05T14:16:51Z
IMG0138A

nhipxinh_6x_to posted a photo:

IMG0138A

Date: 2008-09-05T14:16:51Z
999

Abode of Chaos posted a photo:

999

Date: 2008-09-05T14:16:50Z
HPIM1089

telefonegurl posted a photo:

HPIM1089

Date: 2008-09-05T14:16:50Z
143_4355

vicedmonds posted a photo:

143_4355

Date: 2008-09-05T14:16:48Z
Sunset Exaggerated

nicholas mak posted a photo:

Sunset Exaggerated

Date: 2008-09-05T14:16:48Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
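
The if statement described in this comment might look something like the minimal sketch below. The helper name filter_cdata and its $mode argument are made up for illustration; only the pattern is taken from the parser function, and the sample string is invented.

```php
<?php
// Hypothetical helper (not part of the original parser): unwrap or drop
// CDATA sections depending on $mode.
function filter_cdata($text, $mode = '') {
    // s: dot matches newlines, U: ungreedy, so each CDATA block is matched separately.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    if ($mode != 'hide') {
        // Keep the wrapped content and remove only the CDATA markers.
        return preg_replace($pattern, '$1', $text);
    }
    // 'hide': remove the markers and everything inside them.
    return preg_replace($pattern, '', $text);
}

// Invented sample text for demonstration.
$sample = 'Intro <![CDATA[<p>Full story</p>]]> outro';
echo filter_cdata($sample), "\n";
echo filter_cdata($sample, 'hide'), "\n";
```

Passing 'hide' for one feed and an empty string for the next is what lets the behaviour be toggled per call.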

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?
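
One way the merge asked about here could be approached, as a rough sketch only. It assumes each feed has already been turned into an array of items with title, link and date keys; the parser on this page returns an HTML string, so that step is an assumption. The function name merge_latest and the '#' links are hypothetical placeholders.

```php
<?php
// Hypothetical helper: combine several feeds and keep the newest $limit items.
// Each feed is assumed to be an array of
// array('title' => ..., 'link' => ..., 'date' => ...).
function merge_latest(array $feeds, $limit = 10) {
    $all = array();
    foreach ($feeds as $feed) {
        $all = array_merge($all, $feed);
    }
    // Sort newest first; strtotime() parses the date strings shown in the output above.
    usort($all, function ($a, $b) {
        return strtotime($b['date']) - strtotime($a['date']);
    });
    return array_slice($all, 0, $limit);
}

// Sample items borrowed from the output above; the '#' links are placeholders.
$bbc = array(
    array('title' => 'Benn denies fuel bill cave-in', 'link' => '#', 'date' => 'Fri, 05 Sep 2008 13:34:59 GMT'),
    array('title' => 'Shannon mother accused of kidnap', 'link' => '#', 'date' => 'Fri, 05 Sep 2008 11:49:13 GMT'),
);
$flickr = array(
    array('title' => 'pictures (299)', 'link' => '#', 'date' => '2008-09-05T14:16:50Z'),
);

foreach (merge_latest(array($bbc, $flickr), 10) as $item) {
    echo $item['date'], ' ', $item['title'], "\n";
}
```

Sorting on a parsed timestamp rather than the raw string matters because RSS and Atom feeds write their dates in different formats.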

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
