RSS/XML feed parser

Here's some PHP:

PHP
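// Fetch a feed and return up to $number of its items as an HTML definition list.
// $tags must be in this order: title, description/content, link, date.
function xml_parser($page, $container, $tags, $number = 100, $cdata = '') {
  if (!$number) { $number = 100; }
  $xml = file_get_contents($page);
  if ($xml === false) { return ''; }   // the feed could not be fetched

  // Grab every <container>...</container> block (s: dot matches newlines, U: non-greedy).
  $c = preg_quote($container, '/');
  preg_match_all("/<{$c}>.+<\/{$c}>/sU", $xml, $items);
  $items = $items[0];

  $itemsArray = array();
  foreach ($items as $item) {
    if (count($itemsArray) >= $number) { break; }
    $fields = array();   // was $this, which PHP does not allow to be reassigned
    for ($i = 0; $i < count($tags); $i++) {
      $t = preg_quote($tags[$i], '/');
      // Match the tag with or without attributes, e.g. <content type="html">...</content>.
      if (!preg_match("/<{$t}(.+)(<\/{$t}>)/sU", $item, $tag)) {
        // Fall back to the self-closing form, e.g. Atom's <link href="..."/>.
        preg_match("/<{$t}\b[^>]*\/>/s", $item, $tag);
      }
      // Strip a plain <tag>...</tag> wrapper; tags with attributes are cleaned up below.
      $fields[$i] = preg_replace("/<{$t}>(.+)(<\/{$t}>)/sU", '$1', $tag);
      $fields[$i] = array_map('html_entity_decode', $fields[$i]);
    }
    $itemsArray[] = $fields;
  }

  $theData = "<dl>";
  foreach ($itemsArray as $item) {
    $data = array();
    for ($i = 0; $i < count($tags); $i++) {
      $data[$i] = isset($item[$i][0]) ? $item[$i][0] : '';   // empty if the tag was missing
    }
    $title = $data[0];
    // Clean-up rules for the description: normalise <img>, drop <content> wrappers and border="0".
    $dpatterns = array(
      "/<img(.+)><\/img>/sU",
      "/<img(.+)\/>/sU",
      "/<(\/|)content?(.+|)>/sU",
      "/border=\"0\"/sU",
      "/<\!\[CDATA\[(.+)\]\]>/sU",
    );
    // The last rule unwraps CDATA sections, or drops them entirely when $cdata is 'hide'.
    $dreplacements = array('<img$1>', '<img$1>', '', '', ($cdata != 'hide') ? '$1' : '');
    $description = preg_replace($dpatterns, $dreplacements, $data[1]);
    // Atom feeds keep the URL in <link href="..."/>; RSS links pass through unchanged.
    $link = preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU", '$1', $data[2]);
    $date = $data[3];
    $theData .= "
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\n";
  }
  $theData .= "</dl>";
  return $theData;
}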

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
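
If you want to try the item-matching step without fetching a live feed, here's a rough sketch that runs the same kind of non-greedy match on an inline sample. The sample XML and the helper name extract_items() are made up for illustration; they aren't part of the parser above.

```php
<?php
// Hypothetical helper: returns every <$container>...</$container> block found in $xml.
function extract_items($xml, $container) {
    $c = preg_quote($container, '/');
    // s: dot also matches newlines; U: quantifiers are non-greedy,
    // so each item is matched on its own instead of as one big block.
    preg_match_all("/<{$c}>.+<\/{$c}>/sU", $xml, $matches);
    return $matches[0];
}

// Made-up sample feed, just enough structure to exercise the pattern.
$sample = "<rss><channel>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>";

$items = extract_items($sample, 'item');
echo count($items), " items found\n";
```

Without the U modifier the pattern would be greedy and swallow everything from the first opening tag to the last closing tag as a single match.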

Here's some HTML with PHP

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and flickr).

bbc

Theresa May's bid to become Conservative leader has won the backing of two more cabinet ministers and the Daily Mail.
Date: Fri, 01 Jul 2016 00:57:26 GMT
Commemorations are being held in the UK and France to mark the centenary of the start of the Battle of the Somme, in which more than one million men were killed or wounded.
Date: Fri, 01 Jul 2016 06:07:55 GMT
Prostitution laws should be urgently rewritten in England and Wales so sex workers are no longer criminalised for soliciting, a new Commons report says.
Date: Fri, 01 Jul 2016 02:25:37 GMT
The top European Union trade official says the UK cannot begin negotiating terms for doing business with the bloc until after it has left it.
Date: Thu, 30 Jun 2016 18:23:25 GMT
Shadow Chancellor John McDonnell is to set out Labour's response to the EU referendum in a speech in London.
Date: Fri, 01 Jul 2016 00:39:37 GMT
Researchers say they have found the first clear evidence that the thinning in the ozone layer above Antarctica is starting to heal.
Date: Thu, 30 Jun 2016 20:23:20 GMT
The man at the centre of popular podcast Serial, Adnan Syed, is to receive a new trial, says a judge.
Date: Thu, 30 Jun 2016 22:32:11 GMT
The body of a tourist is recovered from a ravine in the Peruvian Andes after he fell while taking pictures of himself near the Machu Pichu site.
Date: Fri, 01 Jul 2016 00:24:06 GMT
Wales face Belgium in the last eight of their first European Championship on Friday - and boss Chris Coleman is refusing to "play the occasion down".
Date: Thu, 30 Jun 2016 19:56:20 GMT
The immune system can be trained to attack itself to reverse a devastating autoimmune disease, in animals.
Date: Thu, 30 Jun 2016 23:03:16 GMT

cnn

A doctor who was one of dozens of people killed at Istanbul's airport was trying to rescue his son from ISIS, a friend said.
Date: Fri, 01 Jul 2016 03:02:36 GMT
The Obama administration is considering a plan to coordinate strikes against terrorist groups in Syria with Russia if Moscow agrees to use its leverage with Syrian President Bashar al-Assad to stop bombing U.S.-backed rebels, U.S. officials said Thursday.
Date: Fri, 01 Jul 2016 02:30:21 GMT
Russians are the largest group of foot soldiers in the group from a non-Muslim majority country and they have also played a leadership role in the organization, write Peter Bergen and David Sterman
Date: Fri, 01 Jul 2016 05:36:01 GMT
A 3-year-old boy traveling with his aunts and an 8-year-old girl coming home to her father are among the 44 who died.
Date: Fri, 01 Jul 2016 03:33:33 GMT
Nearly 70 million airbags in U.S. cars have been or will be recalled as part of a massive safety scandal enveloping Takata since 2014.
Date: Fri, 01 Jul 2016 00:44:37 GMT
A report found that the 10 sailors captured by Iranians in January suffered from "failed leadership" at all levels on a mission that was plagued by mistakes from beginning to end.
Date: Fri, 01 Jul 2016 00:43:49 GMT
It's the last day of the first half of the year. And it's been a wacky one on Wall Street -- especially the past week! So what now?
Date: Thu, 30 Jun 2016 17:28:42 GMT
Rio de Janeiro finally got a federal bailout, with nearly a month to go before the Olympic Games start here.
Date: Thu, 30 Jun 2016 22:08:33 GMT
Donald Trump on Wednesday said his former Republican primary rivals who have refused to support him in November should be barred from running for public office again.
Date: Thu, 30 Jun 2016 23:32:25 GMT
New Jersey Gov. Chris Christie -- a former Donald Trump rival turned top defender -- is being vetted as a possible running mate for the presumptive Republican nominee, a source confirmed to CNN on Thursday.
Date: Fri, 01 Jul 2016 04:11:56 GMT

flickr

FORO FLAMENCO INFANTIL JUNIO 2016_08.jpg

FOTOS CANAL SUR posted a photo:

FORO FLAMENCO INFANTIL JUNIO 2016_08.jpg

Date: 2016-07-01T06:13:43Z

bjsowers1977 posted a photo:

Date: 2016-07-01T06:13:46Z
超級好天氣#花蓮#牛山呼庭#台11線#沙灘#海#石頭#陽光#熱辣辣#Hualian#sky#beach#mountain#hot#sea

alison0120 posted a photo:

超級好天氣#花蓮#牛山呼庭#台11線#沙灘#海#石頭#陽光#熱辣辣#Hualian#sky#beach#mountain#hot#sea

Date: 2016-07-01T06:13:37Z
DSC_7301.JPG

De temps en temps je prends des photos des courses posted a photo:

DSC_7301.JPG

Date: 2016-07-01T06:13:39Z
#jeep #jeepscrambler #jeepporn #graytonbeach #florida #lake #ocean #gulf #beach #canonphotography #canon60d #canon_official #photography #carinstagram #carporn #carsofinstagram

{ashley*} posted a photo:

#jeep #jeepscrambler #jeepporn #graytonbeach #florida #lake #ocean #gulf #beach #canonphotography #canon60d #canon_official #photography #carinstagram #carporn #carsofinstagram

Date: 2016-07-01T06:13:40Z
IMG_8130.jpg

fritzcat posted a photo:

IMG_8130.jpg

Date: 2016-07-01T06:13:42Z
มองบน

MoolekJidrid posted a photo:

มองบน

Date: 2016-07-01T06:13:43Z
(Untitled)

Ross Donnelly posted a photo:

Date: 2016-07-01T06:13:43Z
Laptop Battery New Arrival😘😘😘

iphone3cparts posted a photo:

Laptop Battery New Arrival😘😘😘

Date: 2016-07-01T06:13:38Z
IMG_2191.jpg

ozzobear posted a photo:

IMG_2191.jpg

Date: 2016-07-01T06:13:43Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
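
A minimal sketch of that idea, following the 'hide' convention the function at the top of the page uses for its last argument. The helper name handle_cdata() and the parameter name $mode are assumptions made for this example only.

```php
<?php
// Hypothetical helper showing the toggle on its own.
function handle_cdata($text, $mode = '') {
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // 'hide' drops the CDATA section together with its contents;
    // any other value keeps the contents and removes only the wrapper.
    $replacement = ($mode === 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

// Made-up description text for illustration.
$raw = 'Intro <![CDATA[<p>Full story</p>]]>';
echo handle_cdata($raw), "\n";          // wrapper removed, markup kept
echo handle_cdata($raw, 'hide'), "\n";  // whole CDATA section removed
```

Passing 'hide' for one feed and an empty string for the next gives the per-site switch between headlines only and fuller details.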

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?
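
One way the merge could work, as a rough sketch only. It assumes each feed has already been turned into an array of items with 'title' and 'date' keys, which is not what xml_parser() returns (it returns finished HTML), so the parser would need to hand back its item array first. The helper name merge_latest() and the one-letter titles are made up; the dates are taken from the sample output above.

```php
<?php
// Hypothetical helper: merge several item lists and keep the $limit newest.
function merge_latest(array $feeds, $limit = 10) {
    $all = array();
    foreach ($feeds as $items) {
        $all = array_merge($all, $items);
    }
    // Newest first. strtotime() reads both date styles shown in the output above.
    usort($all, function ($a, $b) {
        return strtotime($b['date']) - strtotime($a['date']);
    });
    return array_slice($all, 0, $limit);
}

$bbcItems = array(
    array('title' => 'A', 'date' => 'Fri, 01 Jul 2016 00:57:26 GMT'),
    array('title' => 'B', 'date' => 'Thu, 30 Jun 2016 18:23:25 GMT'),
);
$flickrItems = array(
    array('title' => 'C', 'date' => '2016-07-01T06:13:43Z'),
);

$latest = merge_latest(array($bbcItems, $flickrItems), 2);
foreach ($latest as $item) {
    echo $item['title'], "\n";
}
```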

#8
2007-11-02 steve says :

thanks, this sorted out my CDATA parsing problem; seems that is not too clear in the docs

s
