RSS/XML feed parser

Here's some PHP:

PHP
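function xml_parser($page,$container,$tags,$number,$cdata) {
  // Show at most 100 items when no limit is given.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Collect every <container>...</container> block (s: dot matches newlines, U: ungreedy).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // $this is reserved in PHP and cannot be assigned to, so use a plain array.
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
      // Find the tag (with or without attributes), then strip the bare tag wrappers.
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
  }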
  // Build a definition list: one <dt>/<dd> group per item.
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    for($i=0; $i<count($tags); $i++) {
      // Fall back to an empty string when a tag was not found in this item.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Tidy the description: simplify <img> tags, drop <content> wrappers and border attributes.
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    // Unwrap CDATA sections, or remove them entirely when $cdata is 'hide'.
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4]=($cdata!='hide') ? '$1' : '';
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom feeds keep the URL in an href attribute; RSS links pass through unchanged.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\n";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

// Lockergnome with CDATA content hidden...
$tags=array('title','content:encoded','link','pubDate');
$lockergnome1=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'hide');

// ...and again with CDATA content shown.
$lockergnome2=xml_parser("http://feed.lockergnome.com/nexus/all",$container,$tags,5,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
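
To see what the tag extraction and the CDATA switch do without fetching a live feed, here is a minimal sketch. It is not part of the original function: the sample feed and the helper names extract_tags() and handle_cdata() are made up for illustration, and it only assumes the same ungreedy regex approach used above.

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without a network call.
$sampleXml = <<<XML
<rss><channel>
<item>
<title>First story</title>
<description><![CDATA[<p>Hello &amp; welcome</p>]]></description>
</item>
<item>
<title>Second story</title>
<description>Plain text</description>
</item>
</channel></rss>
XML;

// Hypothetical helper: return the inner text of each requested tag,
// one row per <$container> block.
function extract_tags($xml, $container, $tags) {
  $rows = array();
  // s: dot matches newlines, U: ungreedy, as in xml_parser() above.
  preg_match_all("/<$container>(.+)<\/$container>/sU", $xml, $items);
  foreach ($items[1] as $item) {
    $row = array();
    foreach ($tags as $tag) {
      $quoted = preg_quote($tag, '/');
      // Empty string when the tag is missing from this item.
      $row[$tag] = preg_match("/<$quoted>(.+)<\/$quoted>/sU", $item, $m) ? $m[1] : '';
    }
    $rows[] = $row;
  }
  return $rows;
}

// Hypothetical helper: unwrap CDATA sections, or drop them when $cdata is 'hide'.
function handle_cdata($text, $cdata) {
  $replacement = ($cdata != 'hide') ? '$1' : '';
  return preg_replace("/<!\[CDATA\[(.+)\]\]>/sU", $replacement, $text);
}

$rows = extract_tags($sampleXml, 'item', array('title', 'description'));
echo $rows[0]['title'], "\n";                                    // First story
echo handle_cdata($rows[0]['description'], ''), "\n";            // <p>Hello &amp; welcome</p>
echo '[', handle_cdata($rows[0]['description'], 'hide'), "]\n";  // []
```

Passing 'hide' empties a description that consists only of a CDATA section, which is why the "hidden CDATA" version of a feed can end up showing little more than titles and dates.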

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>lockergnome (hidden CDATA)</h2>
<?php echo $lockergnome1; ?>
<h2>lockergnome</h2>
<?php echo $lockergnome2; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get: the latest feeds from the BBC, CNN, Lockergnome (once with CDATA content hidden and once with it shown) and Flickr.

bbc

Ashya parents in extradition hearing
The parents of a five-year-old boy who was taken from hospital against medical advice arrive at Spain's High Court for an extradition hearing.
Date: Mon, 01 Sep 2014 10:45:27 GMT
Ukraine troops abandon key airport
Ukrainian troops abandon the airport of Luhansk after clashes with pro-Russian rebels, ahead of negotiations on the crisis.
Date: Mon, 01 Sep 2014 10:03:30 GMT
Man Utd agree to sign Falcao on loan
Manchester United agree to sign Colombia striker Radamel Falcao on loan, as clubs scramble to attract new players before tonight's transfer deadline.
Date: Mon, 01 Sep 2014 10:14:33 GMT
Pupils begin 'tough' new curriculum
Teaching of a "tough" new national curriculum begins in England as children return to school this week.
Date: Mon, 01 Sep 2014 08:14:03 GMT
Cameron to set out anti-terror plans
Negotiations between Conservatives and Lib Dems are continuing ahead of the unveiling of a new plan to tackle the threat of Islamist extremists.
Date: Mon, 01 Sep 2014 09:07:32 GMT
Official heckled amid HK poll anger
Hong Kong pro-democracy activists disrupt a speech by a Beijing official, a day after China ruled out a direct leadership election in 2017.
Date: Mon, 01 Sep 2014 07:24:23 GMT
Troops return Pakistan TV to air
Pakistan's national television channel is back on air after security forces remove anti-government protesters from its headquarters in Islamabad.
Date: Mon, 01 Sep 2014 08:27:46 GMT
Greenhouse gas fear over meat eating
New research estimates greenhouse gases from food production will go up 80% if meat and dairy consumption continues to rise at its current rate.
Date: Mon, 01 Sep 2014 03:56:06 GMT
Conservative MP Kelly to stand down
Conservative MP for Dudley South Chris Kelly announces he will stand down at the 2015 general election after one term.
Date: Mon, 01 Sep 2014 10:41:11 GMT
High-powered hairdryers under threat
High-powered hairdryers are on a list of household electrical items the EU is considering banning in an attempt to curb energy consumption.
Date: Sun, 31 Aug 2014 23:01:07 GMT

cnn

Police, protesters battle in Pakistan
Protesters demanding the resignation of the Pakistani Prime Minister engaged in running battles with police near government offices in the capital Monday.
Date: Mon, 01 Sep 2014 02:49:50 EDT
Rivers' daughter: 'Fingers crossed'
The latest statement from Joan Rivers' daughter gave no new information about the comedian's condition three days after she was rushed to a New York hospital.
Date: Sun, 31 Aug 2014 16:29:00 EDT
Joan Rivers' chilling death joke
CNN's Miguel Marquez explains the chain of events that led up to Joan Rivers being rushed to the hospital.
Date: Sat, 30 Aug 2014 16:57:26 EDT
Unexpected reveal of new iPhone 6
Apple is expected to unveil the new iPhone 6 on September 9. Tech journalist Jon Erlichman talks about the new features.
Date: Sat, 30 Aug 2014 13:03:47 EDT
GOP jitters over Kansas
Our Labor Day weekend trip around the "Inside Politics" table included GOP jitters over Kansas, the 2016 impact of President Obama's immigration deliberations, progressive worries about Elizabeth Warren's hawkish foreign policy views, and Republican angst about a research project that was designed to help but may have done more harm then good.
Date: Sun, 31 Aug 2014 18:16:57 EDT
McCartney: Scotland, you can't do that
Beatles star Paul McCartney became the latest high profile figure to sign a letter calling on Scottish voters to choose to remain part of the United Kingdom in a vote on independence next month.
Date: Sat, 30 Aug 2014 12:07:39 EDT
Ferguson costs another cop his job
One St. Louis-area police officer resigned and another retired in the continued fallout from questionable police actions in the days after the fatal shooting of an unarmed black teenager in a Missouri suburb.
Date: Sat, 30 Aug 2014 23:29:53 EDT
CNN host's wild NASCAR ride
CNN anchor Christi Paul went for a ride with NASCAR driver Danica Patrick.
Date: Sun, 31 Aug 2014 18:09:33 EDT
SUV runs over boy, but then ...
Surveillance video shows a boy being hit by an SUV in front of his home. CNN's Andrew Stevens has more.
Date: Sun, 31 Aug 2014 17:03:13 EDT
Meet the Fittest Man on Earth
From Fittest Man on Earth to new father, CrossFit champion Rich Froning talks about his success at the 2014 Games and where he goes from here.
Date: Sat, 30 Aug 2014 10:09:00 EDT

lockergnome (hidden CDATA)

The lockergnome feed seems to be down.

lockergnome

The lockergnome feed seems to be down.

flickr

Первоклашка моя-:)))

skladgovna posted a photo:

Первоклашка моя-:)))

by dianazakarieva ift.tt/Y5pjV9 ift.tt/1nksazg

Date: 2014-09-01T11:09:17Z
PICT0652

olaigo posted a photo:

PICT0652

Date: 2014-09-01T11:09:15Z
DSC_8075

British Council Indonesia posted a photo:

DSC_8075

Date: 2014-09-01T11:09:09Z
(Untitled)

KimSolez posted a photo:

Date: 2014-09-01T11:09:16Z
Some milk

Silvia Aoi posted a photo:

Some milk

Date: 2014-09-01T11:09:17Z
140825(월) 스칸디나비아반도 여행_3일차

Lyussam posted a photo:

140825(월) 스칸디나비아반도 여행_3일차

140825(월) 스칸디나비아반도 여행_3일차

Date: 2014-09-01T11:09:17Z
camp 236

breefriendly posted a photo:

camp 236

Date: 2014-09-01T11:09:13Z
413397897629371243_3708150520140901-10503-17ykj3d

mechanic_x posted a photo:

413397897629371243_3708150520140901-10503-17ykj3d



42 Likes on Instagram

Date: 2014-09-01T11:09:17Z
Photo

ivan_draga1 posted a photo:

Photo

by nepman56 ift.tt/W191Ln

Date: 2014-09-01T11:09:18Z
SAMUDERA TIGA HATI

pojokbuku posted a photo:

SAMUDERA TIGA HATI

via RSS Toko Buku Online pojokbuku.com/item/samudera-tiga-hati-sku0059

Date: 2014-09-01T11:09:18Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these feeds and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
