RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  // Default to 100 items when no limit is passed.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Collect every <$container>...</$container> block (ungreedy, spanning line breaks).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml,$items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // $fields stands in for the original $this, which is reserved in PHP
    // and cannot be used as an ordinary variable in current versions.
    $fields=array();
    for ($i=0; $i<count($tags); $i++) {
      // Find the element (attributes allowed), then strip plain open/close tags.
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU",$item,$tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode',$fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray,$fields);}
  }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for ($i=0; $i<count($tags); $i++) {
      // Element 0 is the cleaned full match; use '' if the tag was not found.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Clean-up rules for the description, applied in order.
    $dpatterns=array(); $dreplacements=array();
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    // CDATA: 'hide' drops the wrapped content, anything else keeps it and removes the wrapper.
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4]=($cdata!='hide') ? '$1' : '';
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Pull the URL out of an Atom-style <link href="..."/>; other values pass through unchanged.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\r";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

// Lockergnome twice: first with the CDATA content hidden, then with it shown.
$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get: the latest feeds from the BBC, CNN, Lockergnome (once with CDATA hidden and once with it shown) and flickr.

bbc

Murder sparks anti-Muslim backlash
There has been a large increase in anti-Muslim incidents since the murder of a British soldier in Woolwich, an inter-faith charity says.
Date: Sat, 25 May 2013 20:50:13 GMT
Hezbollah promises Syria 'victory'
The leader of Lebanon's Hezbollah group promises victory in Syria where his supporters are backing President al-Assad, amid an upsurge in fighting there.
Date: Sat, 25 May 2013 21:08:53 GMT
Borussia Dortmund 1-2 Bayern Munich
Arjen Robben scores a dramatic late winner as Bayern Munich beat Borussia Dortmund in the Champions League final.
Date: Sat, 25 May 2013 21:43:13 GMT
48 rescued as island boat hits rock
A total of 48 passengers, including children, are rescued from a boat after it hits a rock and starts taking in water off the Pembrokeshire coast.
Date: Sat, 25 May 2013 17:21:21 GMT
Warnings over flagship projects
More than 30 of the coalition's flagship schemes, including the Universal Credit, are at serious risk of failure, a government report warns.
Date: Sat, 25 May 2013 13:37:00 GMT
Police probe fatal tiger attack
The death of a zoo worker attacked by a tiger could have been due to "human or technical" factors, police say.
Date: Sat, 25 May 2013 16:24:58 GMT
Pakistan bus fire kills 16 children
At least 16 children and a teacher are killed in a fire on their school bus in the eastern Pakistan city of Gujrat, police say.
Date: Sat, 25 May 2013 11:07:22 GMT
'Maoist rebels' kill 17 in India
At least 17 people, including a senior leader of India's governing Congress party, are killed after their convoy was attacked by Maoist rebels, officials say.
Date: Sat, 25 May 2013 20:32:30 GMT
UK plane alert suspects still held
Police secure a 12-hour extension to question two men after RAF jets were scrambled to escort a Pakistan Airlines plane in UK airspace.
Date: Sat, 25 May 2013 17:48:32 GMT
French army in major Mali pullout
France begins a key stage of its military withdrawal from Mali, four months after sending troops to push Islamist rebels out of the north.
Date: Sat, 25 May 2013 11:33:03 GMT

cnn

What made London Samaritan so brave
A London woman got off a bus and talked to the London hacking suspects. Jason Marsh explains makes some step up and others hang back in a crisis.
Date: Sat, 25 May 2013 11:10:31 EDT
Soldiers and sex: Can men grow up?
Pepper Schwartz says with constant drumbeat of scandals in armed forces, the military must require education programs to teach men self control, address culture of sexual entitlement
Date: Sat, 25 May 2013 14:20:48 EDT
Who owns Jolie's genes?
Angelina Jolie, when writing about her preventive double mastectomy, did not discuss how much her surgeries cost, but she did mention that many women would not be able to afford the $3,000 to $4,000 test that led her to make the decision. What she failed to say was why the test costs so much.
Date: Fri, 24 May 2013 11:17:10 EDT
See lightning strike TV tower
Watch as lightning strikes a television tower on the waterfront in St. Petersburg, Russia.
Date: Sat, 25 May 2013 15:35:19 EDT
Two boys found dead; brother arrested
A 15-year-old Utah boy was arrested in connection with the slayings of his two younger brothers, ages 4 and 10, who were apparently stabbed to death in their home, authorities said Thursday.
Date: Thu, 23 May 2013 17:31:23 EDT
Amanda Bynes busted
Adding to what's becoming a lengthy list of run-ins with the law, New York police arrested actress Amanda Bynes on Thursday night after she allegedly tossed drug paraphernalia out the window of her 36th floor Manhattan apartment.
Date: Sat, 25 May 2013 01:57:11 EDT
Does Brad Pitt have face blindness?
It seems that anytime Brad Pitt speaks, the world stops to listen, and his latest interview with Esquire has been no exception.
Date: Fri, 24 May 2013 22:57:01 EDT
Mom dies, gives birth, then is revived
Three-month-old Elayna Nigrelli has redefined what it means to be a miracle baby. She was born while her mother was technically dead.
Date: Fri, 24 May 2013 17:24:34 EDT
Three men in their 20s are in custody
Metropolitan Police say the men are being held on suspicion of conspiracy to commit murder in the brutal killing of a British soldier. FULL STORY
Date: Sat, 25 May 2013 16:58:57 EDT
Far-right protesters hold march
The turnout appears to have been fueled by anger over the slaying of a British soldier by attackers who claimed an Islamist motive. FULL STORY
Date: Sat, 25 May 2013 16:53:00 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

IMG_0171

leobhmgbr posted a photo:

IMG_0171

Date: 2013-05-25T23:38:06Z
Calçada Romana da Roda - Portugal

Portuguese_eyes posted a photo:

Calçada Romana da Roda - Portugal

Provavelmente este seria um trecho da estrada proveniente da capital da Província Romana da Lusitânia (Mérida) e que entroncaria na estrada Olisipo (Lisboa) / Bracara Augusta (Braga)

Date: 2013-05-25T23:38:06Z
IMG_3284

maroei posted a photo:

IMG_3284

Date: 2013-05-25T23:38:06Z
Chris and Angelica

Jordan Enriquez posted a photo:

Chris and Angelica

Date: 2013-05-25T23:38:08Z
IMG_9591

rebz.org posted a photo:

IMG_9591

Date: 2013-05-25T23:38:08Z

newhonda.tw posted a photo:

Date: 2013-05-25T23:38:08Z
1996-421.jpg

theoldned posted a photo:

1996-421.jpg

Date: 2013-05-25T23:38:09Z
IMG_20130525_182527_236.jpg

kittip posted a photo:

IMG_20130525_182527_236.jpg

Date: 2013-05-25T23:38:10Z
CIMG1152

K A Johnson posted a photo:

CIMG1152

Date: 2013-05-25T23:38:10Z
Aburrida :S

Karen_Hdzz posted a photo:

Aburrida :S

Date: 2013-05-25T23:38:10Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
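
A minimal sketch of that if statement, pulled out so it can be tried on a plain string. The helper name cdata_filter() and the sample text are assumptions made for illustration; in the parser itself the same choice is driven by the last argument of xml_parser().

```php
<?php
// Hypothetical helper, not part of the original xml_parser(): it isolates the
// CDATA step so the toggle can be tested without fetching a feed.
function cdata_filter($text, $cdata = '') {
    // Same pattern as in xml_parser(): ungreedy match that may span line breaks.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // 'hide' drops the wrapped content; anything else keeps the content
    // and removes only the <![CDATA[ ... ]]> wrapper.
    $replacement = ($cdata === 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

// Made-up sample text standing in for a feed description.
$sample = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo cdata_filter($sample), "\n";          // wrapper removed, content kept
echo cdata_filter($sample, 'hide'), "\n";  // wrapper and content both removed
```

Passing 'hide' for one feed and '' for the next gives headlines only for the first and fuller details for the second, which is how the two Lockergnome calls differ.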

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
