RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  // Show at most 100 items when no limit is given.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Collect every <$container>...</$container> block ('s': dot matches newlines, 'U': ungreedy).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // Collect the tags in $fields: current PHP versions do not allow assigning to $this.
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
      // Match the tag with or without attributes ($tag stays empty if it is missing)...
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      // ...then reduce an attribute-free tag to its inner content.
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
  }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for($i=0; $i<count($tags); $i++) {
      // Use an empty string for a tag that was not found in this item.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Tidy the description: simplify <img> tags, drop <content> wrappers and border attributes.
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    // Show the contents of CDATA sections, or drop them entirely when $cdata is 'hide'.
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4]=($cdata!='hide') ? '$1' : '';
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // If the link came through as an element with an href attribute, keep only the URL.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

$tags=array('title','content:encoded','link','pubDate');
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
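
To see what the two central regular expressions in xml_parser() do, here is a minimal sketch that runs them against a small inline string instead of a live feed. The sample XML and the variable names ($matches, $titles) are made up for illustration, and the tag pattern is simplified to tags without attributes.

```php
<?php
// Hypothetical sample feed, inlined so nothing is fetched over the network.
$xml = <<<XML
<rss><channel>
<item><title>First story</title><link>http://example.com/1</link></item>
<item><title><![CDATA[Second & last]]></title><link>http://example.com/2</link></item>
</channel></rss>
XML;

// Step 1: collect each <item>...</item> block. The s modifier lets "." span
// newlines; U makes ".+" ungreedy so each match stops at the first </item>.
preg_match_all("/<item>.+<\/item>/sU", $xml, $matches);
$items = $matches[0];

// Step 2: pull the inner text of one tag out of each block and unwrap CDATA.
$titles = array();
foreach ($items as $item) {
    if (preg_match("/<title>(.+)<\/title>/sU", $item, $m)) {
        $titles[] = preg_replace("/<!\[CDATA\[(.+)\]\]>/sU", '$1', $m[1]);
    }
}

print_r($titles);
```

Without the U modifier the first pattern would match from the first <item> to the last </item> in one go, so the feed would come back as a single item.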

Here's some HTML with PHP

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, Lockergnome, once with CDATA hidden and once with it shown, and Flickr).

bbc

West Bank Gaza protests turn deadly
At least two Palestinians have been killed at a march in the West Bank in protests against Israel's military operation in Gaza.
Date: Fri, 25 Jul 2014 02:45:15 GMT
Wreckage of Algeria plane found
The wreck of a plane that disappeared with 116 people on board on a flight from Burkina Faso to Algiers has been found in Mali, officials say.
Date: Fri, 25 Jul 2014 03:52:14 GMT
Murdoch caps great first day for Scots
Swimmer Ross Murdoch secures gold to cap a stunning first day of Commonwealth Games action for hosts Scotland.
Date: Thu, 24 Jul 2014 22:11:44 GMT
One-shot cancer therapy gets NHS nod
A pioneering breast cancer treatment that replaces weeks of radiotherapy with a single, targeted shot is set to be offered on the NHS.
Date: Fri, 25 Jul 2014 00:10:38 GMT
Shift workers 'face diabetes risk'
Type 2 diabetes is more common in people who work shifts, with effects on waistlines, hormones and sleep increasing the risk, a study suggests.
Date: Fri, 25 Jul 2014 00:14:23 GMT
Ferry owner death cause 'unknown'
South Korea's forensic agency says it cannot determine the cause of death for the fugitive tycoon blamed for the recent ferry disaster.
Date: Fri, 25 Jul 2014 03:58:15 GMT
'Politicians protected by cover-up'
A former assistant director of social services at Lambeth council claims there was a cover-up to protect politicians after allegations of abuse at a children's home.
Date: Fri, 25 Jul 2014 05:02:01 GMT
UN sends 'unofficial' aid to Syria
The first UN aid shipment to Syria delivered without the consent of the government has arrived in the north of the country.
Date: Thu, 24 Jul 2014 21:39:20 GMT
Arizona halts executions amid review
Arizona halts executions pending a review of its death penalty procedures, after the allegedly botched lethal injection of a convicted murderer.
Date: Fri, 25 Jul 2014 00:18:42 GMT
Russia 'fired rockets into Ukraine'
The US says Russia has fired artillery at Ukrainian military positions in the east, a week after the Malaysia Airlines flight MH17 disaster.
Date: Fri, 25 Jul 2014 04:07:39 GMT

cnn

Chief Ebola doctor contracts Ebola
A doctor who has played a key role in fighting the Ebola outbreak in Sierra Leone is infected with the disease, according to that country's Ministry of Health.
Date: Thu, 24 Jul 2014 18:27:53 EDT
Condemned woman leaves Sudan
Mariam Yehya Ibrahim, the Sudanese Christian woman sentenced to death in Sudan for apostasy but subsequently pardoned, arrived in Rome on Thursday, the Italian Foreign Ministry said.
Date: Thu, 24 Jul 2014 17:40:26 EDT
Arizona: Killer's execution not botched
Joseph Wood gasped and struggled to breathe during his nearly two-hour execution involving a novel combination of drugs, some witnesses say.
Date: Thu, 24 Jul 2014 22:42:15 EDT
Tornado hits campground; 2 die
Two people are dead and at least 20 were injured at a Virginia campground on Thursday after a possible tornado, Virginia State Police spokeswoman Corinne Geller told CNN.
Date: Thu, 24 Jul 2014 22:36:31 EDT
Teen pilot dies in quest for record
An American teenager who was trying to set a world record for flying around the world was killed and his father is missing after their plane crashed into the ocean off American Samoa on Tuesday night, the boy's family said.
Date: Thu, 24 Jul 2014 05:09:48 EDT
Boy attacked by barracuda
A 13-year-old boy is recovering after a barracuda jumped into his boat and struck him. WKMG has more.
Date: Thu, 24 Jul 2014 07:32:55 EDT
'Fifty Shades of Grey' trailer sees 'Red'
"Fifty Shades of Grey's" infamous "Red Room" has arrived.
Date: Thu, 24 Jul 2014 12:15:17 EDT
Resident: I found 6 snakes in toilet
WFTX reports that one Floridian says she has found reptiles in her restroom.
Date: Wed, 23 Jul 2014 18:52:54 EDT
'Scammers' won't leave Airbnb house
If the old saw about houseguests being like fish is true -- after a few days they begin to stink -- imagine what it must smell like in poor Cory Tschogl's 600-square-foot condo in Palm Springs, California.
Date: Thu, 24 Jul 2014 14:10:00 EDT
Man drags cop in high speed chase
A wild police chase was caught on camera after a police officer pulled over a man for running a stop sign. WXYZ reports.
Date: Wed, 23 Jul 2014 14:49:40 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

20140313-_MG_5177.jpg

kabraxcis1 posted a photo:

20140313-_MG_5177.jpg

Date: 2014-07-25T05:15:42Z
DSC_5680

msebazco posted a photo:

DSC_5680

Date: 2014-07-25T05:15:44Z
Grest 2014 - 1392

Unità Pastorale di Campagnola posted a photo:

Grest 2014 - 1392

Date: 2014-07-25T05:15:44Z

invincibleworldwide posted a photo:

Date: 2014-07-25T05:15:37Z
Canada 2014/ some Duluth lol

Robert A. Luna posted a photo:

Canada 2014/ some Duluth lol

Date: 2014-07-25T05:15:37Z
(Untitled)

kathi.wachter posted a photo:

Aufgenommen mit einer Sony ILCE-6000.

Date: 2014-07-25T05:15:40Z

Nitro Gen posted a photo:

Date: 2014-07-25T05:15:38Z
Captured these photos when my friend play these game. #toys #teddybear #games #boring #throwback

shirayukiyumiko posted a photo:

Captured these photos when my friend play these game.  #toys #teddybear #games #boring #throwback

Date: 2014-07-25T05:15:40Z
IMG_6478

tristanrobledo posted a photo:

IMG_6478

Date: 2014-07-25T05:15:44Z
@healthytips6 : @healthytips6 : Twitter / healthytips6 : http : //t.co/aMYj3vIu0U http://t.co/mB8cqlnJXz http://t.co/RgeQiW6gfQ http://t.co/K4ian6HWAR

Alex Mathio posted a photo:

@healthytips6 : @healthytips6 : Twitter / healthytips6 : http : //t.co/aMYj3vIu0U http://t.co/mB8cqlnJXz http://t.co/RgeQiW6gfQ http://t.co/K4ian6HWAR

ift.tt/1pPo43z

Date: 2014-07-25T05:15:45Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.
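
A rough sketch of the include approach, so it can be tried as a single script: it writes a stand-in for the shared file to the temporary directory first. The file name parser_demo.php and the function feed_heading() are hypothetical; in practice the shared file would be created by hand and would hold xml_parser().

```php
<?php
// Stand-in for the shared file. In practice you would create parser.php by
// hand and put xml_parser() in it; the file name and the tiny feed_heading()
// function here are hypothetical.
$parserFile = sys_get_temp_dir() . '/parser_demo.php';
file_put_contents(
    $parserFile,
    '<?php function feed_heading($name) { return "<h2>" . htmlspecialchars($name) . "</h2>"; }'
);

// Any page that needs the function loads the shared file once...
require_once $parserFile;

// ...and can then call it as if it were defined in the page itself.
echo feed_heading('bbc'), "\n";
```

require_once stops with an error if the file is missing, while include only warns and carries on, so the former is usually the safer choice for a file the page cannot work without.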

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
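
A minimal sketch of that idea, assuming a hypothetical helper handle_cdata() whose $mode argument plays the role of the extra variable. It follows the switch used in the function at the top of the page ('hide' drops the CDATA content, anything else shows it), which is slightly different from the wording above.

```php
<?php
// Hypothetical helper: $mode plays the role of the extra variable.
// 'hide' drops CDATA sections entirely; anything else keeps their contents
// and removes only the <![CDATA[ ... ]]> wrapper.
function handle_cdata($text, $mode = '') {
    $pattern = "/<!\[CDATA\[(.+)\]\]>/sU";
    if ($mode == 'hide') {
        return preg_replace($pattern, '', $text);
    }
    return preg_replace($pattern, '$1', $text);
}

// Made-up description text for illustration.
$description = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo handle_cdata($description), "\n";
echo handle_cdata($description, 'hide'), "\n";
```

Because the mode is passed per call, one feed can show its full descriptions while the next is cut down to whatever sits outside the CDATA sections.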

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
