RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  $stories=0;
  $xml=file_get_contents($page);
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
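   foreach ($items as $item) {
    // $this is reserved in PHP and can't be assigned to in current versions,
    // so collect each item's fields in an ordinary array instead.
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
    preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
    $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
    $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
     if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
   }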
  $theData="<dl>";
  foreach ($itemsArray as $item) {
  for($i=0; $i<count($tags); $i++) {
  $data[$i]=$item[$i][0];    }
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   if ($cdata!='hide') {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
   }
   else {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
   }
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
   $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
$theData.="</dl>";
return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

// Lockergnome with the CDATA content hidden
$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

// Lockergnome with the CDATA content shown
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
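
To see what the two regex passes do without fetching a live feed, here's a minimal sketch that runs the same kind of patterns against an inline string. The helper name extract_items() and the sample XML are made up for illustration; it returns plain arrays rather than HTML, and it only handles tags without attributes.

```php
<?php
// Hypothetical helper: the same two-pass regex idea as xml_parser(),
// but it takes the XML as a string and returns an array of items.
function extract_items($xml, $container, $tags) {
    // Pass 1: grab every <container>...</container> block
    // (s = dot matches newlines, U = ungreedy).
    $c = preg_quote($container, '/');
    preg_match_all("/<$c>.+<\/$c>/sU", $xml, $matches);

    $items = array();
    foreach ($matches[0] as $block) {
        $fields = array();
        foreach ($tags as $name) {
            // Pass 2: pull out the text between <tag> and </tag>.
            $t = preg_quote($name, '/');
            if (preg_match("/<$t>(.+)<\/$t>/sU", $block, $m)) {
                $fields[$name] = html_entity_decode($m[1]);
            } else {
                $fields[$name] = ''; // tag missing from this item
            }
        }
        $items[] = $fields;
    }
    return $items;
}

// Made-up sample feed, not taken from any real site.
$sample = '<rss><channel>
<item><title>First &amp; foremost</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>';

$items = extract_items($sample, 'item', array('title', 'link'));
echo count($items), "\n";       // 2
echo $items[0]['title'], "\n";  // First & foremost
```

Keying the fields by tag name instead of by position is a choice made for this sketch; xml_parser() above relies on the order of $tags.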

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, Lockergnome - once with CDATA hidden and once with it shown - and flickr).

bbc

Lifestyle quiz to secure a mortgage
Mortgage applicants will face tougher questions about their lifestyle from lenders, under new rules about to come into force.
Date: Thu, 24 Apr 2014 23:07:41 GMT
Russia 'destabilising' Ukraine - US
The US accuses Russia of "deception and destabilisation" in eastern Ukraine, warning of further sanctions unless Moscow defuses the crisis.
Date: Fri, 25 Apr 2014 01:34:20 GMT
UK science to get £200m polar ship
UK scientists are to get a £200m icebreaker, which will be one of the biggest, most capable polar research vessels in the world.
Date: Fri, 25 Apr 2014 02:28:21 GMT
Miliband plans zero-hours crackdown
Ed Miliband will unveil plans to tackle the "epidemic" of zero-hours contracts in a speech in Scotland later.
Date: Fri, 25 Apr 2014 03:10:36 GMT
Probe over Hillsborough insult posts
The government says it is making "urgent inquiries" into reports Whitehall computers were used to make insulting comments about the Hillsborough disaster.
Date: Thu, 24 Apr 2014 22:06:44 GMT
MPs urge action on foreign prisoners
Foreign prisoners are not being deported quickly enough to help cut costs and relieve overcrowding in jails, MPs warn.
Date: Fri, 25 Apr 2014 01:27:43 GMT
Obama in S Korea amid nuclear fears
US President Barack Obama arrives in Seoul for a visit that comes amid concern North Korea may be planning a fourth nuclear test.
Date: Fri, 25 Apr 2014 03:41:10 GMT
New hearing for Briton in US prison
A US judge orders a hearing to consider new evidence relating to the 1987 double-murder conviction of a British businessman in Miami.
Date: Fri, 25 Apr 2014 04:06:03 GMT
Tech giants settle hiring court case
Technology giants Apple, Google, Intel and Adobe settle a class action lawsuit alleging they conspired to hold down salaries.
Date: Fri, 25 Apr 2014 01:38:19 GMT
Terrorism financing suspect arrested
Officers from the Metropolitan Police arrest a man in London on suspicion of financing and encouragement of terrorism.
Date: Thu, 24 Apr 2014 23:12:50 GMT

cnn

A totally new kind of golf, dude
George Plimpton, in the 1973 nonfiction book "Mad Ducks and Bears," describes a charity golf tournament. One of its organizers was Alex Karras, a pro football player with a sense of humor every bit as absurd as his big scene in the Mel Brooks cowboy spoof "Blazing Saddles," in which Karras' character, Mongo, punches a horse.
Date: Fri, 25 Apr 2014 01:09:26 EDT
Doctors in Afghanistan didn't die in vain
Frida Ghitis says, as violence claims three U.S. doctors, the temptation is to despair, but aid to Afghanistan has made it a much better place
Date: Thu, 24 Apr 2014 15:18:57 EDT
'Guns everywhere' law packs dangers
Norcross, Georgia, Chief of Police Warren Summers says the new state law that allows guns in bars, churches and schools will have unintended consequences that don't make anybody safe.
Date: Thu, 24 Apr 2014 13:57:56 EDT
Your best travel photos
Photo of the Day
Date: Thu, 24 Apr 2014 20:21:55 EDT
11 assassination spots you can visit
Death -- one of life's great unknowns, yet it comes to all of us.
Date: Thu, 24 Apr 2014 22:00:59 EDT
2014's big-name graduation speakers
It's that time of year when colleges around the country announce the people who will offer the last lesson to soon-to-be graduates: the commencement speakers.
Date: Thu, 24 Apr 2014 22:28:49 EDT
10 things you're wasting money on
Financial guru Dave Ramsey and his daughter Rachel Cruze reveal their list of 10 things Americans waste money on.
Date: Wed, 23 Apr 2014 07:20:03 EDT
Pols rip rancher's race remarks
Cliven Bundy tells CNN he is putting meat on the table for America, but his comments on race have since gone viral, drawing widespread condemnation from Democrats and Republicans alike.
Date: Thu, 24 Apr 2014 23:14:05 EDT
Another folk hero exposes a nerve
Nevada Rancher Cliven Bundy's remarks about whether the "Negro" fared better under slavery represents the latest in a series of incendiary racial comments from a new crop of folk heroes embraced in some conservative circles.
Date: Thu, 24 Apr 2014 18:40:09 EDT
Fox News 'misunderstood me'
Nevada rancher Cliven Bundy talks to CNN's Bill Weir about Fox News hosts denouncing his recent comments.
Date: Thu, 24 Apr 2014 22:30:56 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

2014-04-11 18-22-56 IMGP1848

searchingforpatrick posted a photo:

2014-04-11 18-22-56  IMGP1848

Date: 2014-04-25T05:33:19Z
image

Fullerton x4 posted a photo:

image

Date: 2014-04-25T05:33:25Z
20130928-2013-09-28 13.14.00 HDR

strausshaydn posted a photo:

20130928-2013-09-28 13.14.00 HDR

Date: 2014-04-25T05:33:30Z
P1050827

mlinksva posted a photo:

P1050827

Date: 2014-04-25T05:33:18Z
Scratches!

Faisal Al-Duwaisan posted a photo:

Scratches!

Date: 2014-04-25T05:33:26Z
00D6FB010EED(IPCAM) motion alarm at 20140425063313

aurochcam posted a photo:

00D6FB010EED(IPCAM) motion alarm at 20140425063313

Date: 2014-04-25T05:33:27Z
Photo

Imajicka1 posted a photo:

Photo

Date: 2014-04-25T05:33:28Z
IMG_0076

swagata.rupa posted a photo:

IMG_0076

Date: 2014-04-25T05:33:14Z
#cake #coffee #waiting #food #foodporn #foodgasm #pinoyfood #pinoy #instafood #manila #philippines #pinas #IGers #IGersManila #earth #world #PinoyIGers #happytummy #gastronomic #gastrorgasm #hudas #instagram #instaphoto #itsmorefuninthephilippines #self

Judd Lax posted a photo:

#cake #coffee #waiting #food #foodporn #foodgasm #pinoyfood #pinoy #instafood #manila #philippines #pinas #IGers #IGersManila #earth #world  #PinoyIGers #happytummy #gastronomic #gastrorgasm #hudas  #instagram #instaphoto #itsmorefuninthephilippines #self

Date: 2014-04-25T05:33:26Z
GM244074

MrGPM posted a photo:

GM244074

Date: 2014-04-25T05:33:28Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the PHP or just have it all in one page. The page would have a '.php' extension, not '.html'.
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
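
The toggle idea, in roughly the form the function at the top of the page uses it, can be sketched as a standalone helper. The name handle_cdata() and the sample string are made up for illustration; only the pattern and the 'hide' value come from the code above.

```php
<?php
// Hypothetical helper: unwrap or drop CDATA sections depending on $cdata.
function handle_cdata($text, $cdata = '') {
    // s = dot matches newlines, U = ungreedy, so each section is matched separately.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // 'hide' removes the whole section; anything else keeps what is inside it.
    $replacement = ($cdata === 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

// Made-up description text.
$description = 'Intro <![CDATA[<p>Full story</p>]]> end';

echo handle_cdata($description), "\n";          // Intro <p>Full story</p> end
echo handle_cdata($description, 'hide'), "\n";  // Intro  end
```

Passing 'hide' for one feed and an empty string for the next gives the per-feed choice asked about above.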

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these feeds and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my CDATA parsing problem, seems that is not too clear in the docs

s
