RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  // Default to 100 stories if no limit is given
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Find every <container>...</container> block (s: . matches newlines, U: ungreedy)
  preg_match_all("/<$container>.+<\/$container>/sU",$xml,$items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // Collect the requested tags for this item
    $fields=array();
    for ($i=0; $i<count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU",$item,$tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode',$fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray,$fields);}
  }
  // Patterns used to tidy up the description
  $dpatterns=array(); $dreplacements=array();
  $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
  $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
  $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
  $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
  // Show the contents of CDATA sections unless $cdata is 'hide'
  $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
  $dreplacements[4]=($cdata!='hide') ? '$1' : '';
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    for ($i=0; $i<count($tags); $i++) {
      // Use an empty string if the tag was not found in this item
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom feeds keep the URL in the href attribute of <link ... />
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\r";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

// Lockergnome with the CDATA content hidden...
$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

// ...and with the CDATA content shown
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
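
To see what the two main patterns do without fetching a live feed, here's a minimal sketch. The feed string is made up for illustration, and only the title is pulled out; it isn't meant to replace the function above.

```php
<?php
// Made-up sample feed; the real function gets this string from file_get_contents()
$xml = <<<XML
<rss><channel>
<item>
<title>First story</title>
<link>http://example.com/1</link>
</item>
<item>
<title>Second story</title>
<link>http://example.com/2</link>
</item>
</channel></rss>
XML;

$container = 'item';
// s lets . match newlines; U makes .+ ungreedy, so each <item> is matched on its own
preg_match_all("/<$container>.+<\/$container>/sU", $xml, $items);

$titles = array();
foreach ($items[0] as $item) {
    // Capture the text between <title> and </title>
    if (preg_match("/<title>(.+)<\/title>/sU", $item, $m)) {
        $titles[] = $m[1];
    }
}
print_r($titles);
```

Without the U modifier, the first pattern would swallow everything from the first <item> to the last </item> as a single match.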

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, Lockergnome - once with the CDATA content hidden and once with it shown - and flickr).

bbc

Migrant benefits to be tightened
The time during which EU migrants can claim a range of UK benefits if they do not have realistic job prospects is to be halved to three months.
Date: Tue, 29 Jul 2014 01:03:10 GMT
Israel warns of 'prolonged' Gaza war
Israeli PM Benjamin Netanyahu warns of a "prolonged" military campaign in Gaza, following an upsurge of violence from both sides.
Date: Tue, 29 Jul 2014 01:29:39 GMT
Gold medal record for Scotland
Scotland have 13 gold medals after earlier eclipsing their record haul for Commonwealth Games.
Date: Mon, 28 Jul 2014 22:28:45 GMT
Police placing ads on piracy sites
The City of London police has started placing banner advertisements on websites believed to be offering pirated content illegally.
Date: Mon, 28 Jul 2014 23:04:02 GMT
Pause NHS privatisation - Labour
NHS privatisation is "being forced through at pace and scale" says shadow health secretary Andy Burnham, as he calls for it to be halted until after the general election.
Date: Tue, 29 Jul 2014 00:13:45 GMT
Clarify 'revenge porn' laws say peers
More clarification is needed about the circumstances in which cases of "revenge porn" can be prosecuted, peers say.
Date: Mon, 28 Jul 2014 23:02:34 GMT
Tory MP David Ruffley to stand down
Conservative MP David Ruffley, who had been criticised after receiving a police caution for common assault, is to retire from Parliament at the next election.
Date: Mon, 28 Jul 2014 20:13:03 GMT
Ministers plan student loan overhaul
Ministers have been working on a policy that could bring major changes to England's student loan system, and possibly higher university fees, Newsnight learns.
Date: Mon, 28 Jul 2014 21:06:56 GMT
PM urges 'prompt' Russian sanctions
David Cameron says he and fellow European leaders agree that "strong" economic sanctions should be imposed on Russia as soon as possible.
Date: Mon, 28 Jul 2014 20:57:36 GMT
MH17 jet 'downed by shrapnel'
Security officials in Ukraine say the Malaysia Airlines jet downed in eastern Ukraine suffered an explosive loss of pressure caused by missile shrapnel.
Date: Mon, 28 Jul 2014 20:47:42 GMT

cnn

Teen accused of decapitating classmate
A 16-year-old Japanese girl has been arrested in Sasebo, Nagasaki prefecture, on suspicion of murdering a fellow student. Police confirmed that the alleged attacker also dismembered her victim's body.
Date: Mon, 28 Jul 2014 09:22:59 EDT
2 Americans infected with Ebola
The deadliest Ebola outbreak in history continues to plague West Africa as leaders scramble to stop the virus from spreading.
Date: Mon, 28 Jul 2014 18:35:15 EDT
Terror group put heads on poles
In some of the most gruesome images yet to emerge from the latest mass violence in Syria, videos show militants raising their victims' decapitated heads on poles.
Date: Mon, 28 Jul 2014 19:21:04 EDT
Plane kills man walking on beach
A father was killed and his daughter critically injured Sunday when an airplane struck them as they walked along a Florida beach.
Date: Mon, 28 Jul 2014 12:43:49 EDT
Cops: Student hid cams in restrooms
University of Delaware students are being offered counseling after a doctoral student allegedly hid video cameras in restrooms around the university's Newark campus over a two-year period.
Date: Mon, 28 Jul 2014 15:15:27 EDT
Security clearance holders owe $730M
About 83,000 Defense Department employees and contractors with security clearances to protect the nation's secrets have delinquent federal tax debts totaling $730 million, according to an internal government audit.
Date: Mon, 28 Jul 2014 07:05:05 EDT
Boy, 3, smashes Jeep into house
A diaper-clad toddler crashed a Jeep into a neighbor's house in Oregon -- then scampered home, sat on the couch and watched cartoons, authorities said.
Date: Mon, 28 Jul 2014 11:55:02 EDT
Crane operator falls during climb
Firefighters raced to save a crane operator who fell climbing up a 200-foot tower. CNN affiliate KHOU reports.
Date: Sun, 27 Jul 2014 16:31:33 EDT
Meat scandal hits McDonald's, KFC
McDonald's and KFC are hit by the Shanghai meat Scandal. Customers get apologies but no refunds. Ralitsa Vassileva reports.
Date: Sun, 27 Jul 2014 21:09:58 EDT
Cops: He called 911 to avoid ticket
Florida police say a man called 911 to report a possible murder to divert attention away from a cop who pulled him over.
Date: Mon, 28 Jul 2014 11:34:27 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

DSC_0123.jpg

Robinsegg posted a photo:

DSC_0123.jpg

Date: 2014-07-29T02:34:23Z
P1120264

atticusfinch posted a photo:

P1120264

Date: 2014-07-29T02:34:26Z
(Untitled)

kennethreitz posted a photo:

Date: 2014-07-29T02:34:24Z
Kip Moore T-Shirt 1

venusnep posted a photo:

Kip Moore T-Shirt 1

My Kip Moore T-Shirt -Taken July 13, 2014 at Aaron's Amphitheatre at Lakewood.

Date: 2014-07-29T02:34:16Z
2014-06-22 01.50.19 1

s0urgrapesnape1 posted a photo:

2014-06-22 01.50.19 1

Processed with VSCOcam with f2 preset

Date: 2014-07-29T02:34:23Z
(Untitled)

shanebreland posted a photo:

Date: 2014-07-29T02:34:23Z
_MG_9628.jpg

Roque Fabular posted a photo:

_MG_9628.jpg

Date: 2014-07-29T02:34:20Z
Loved my 1st day @AOL. I get to work with the nicest & smartest people! I'm so lucky!

suhailahobba posted a photo:

Loved my 1st day @AOL. I get to work with the nicest & smartest people! I'm so lucky!

Date: 2014-07-29T02:34:23Z
IMG_4468.jpg

Mark Ghesquiere posted a photo:

IMG_4468.jpg

Date: 2014-07-29T02:34:25Z
20140728_192226

samreid86 posted a photo:

20140728_192226

Date: 2014-07-29T02:34:21Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
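
A rough sketch of that idea, for illustration only. The helper name handle_cdata() is made up and the sample description is invented; 'hide' is the flag value used in the calls at the top of the page.

```php
<?php
// Hypothetical helper: unwrap CDATA sections, or drop them entirely when $cdata is 'hide'
function handle_cdata($text, $cdata = '') {
    // '$1' keeps what was inside the CDATA wrapper; '' throws the whole section away
    $replacement = ($cdata == 'hide') ? '' : '$1';
    return preg_replace('/<!\[CDATA\[(.+)\]\]>/sU', $replacement, $text);
}

// Invented sample description
$description = 'Intro <![CDATA[<p>Full story text</p>]]>';
$shown  = handle_cdata($description);          // wrapper removed, content kept
$hidden = handle_cdata($description, 'hide');  // CDATA section removed completely
echo $shown, "\n", $hidden, "\n";
```

Passing the flag per feed is what lets one site show its full description while the next is cut back to titles and links.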

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
