RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  $stories=0;
  $xml=file_get_contents($page);
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
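   foreach ($items as $item) {
    // $this is reserved in PHP, so collect each item's fields in $fields instead
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
     // match the whole element, with or without attributes on the opening tag
     preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
     // strip the wrapping tags where the opening tag has no attributes
     $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
     $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
   }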
  $theData="<dl>";
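  foreach ($itemsArray as $item) {
   for($i=0; $i<count($tags); $i++) {
    // a tag missing from this item leaves no match, so fall back to an empty string
    $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
   }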
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   if ($cdata!='hide') {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
   }
   else {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
   }
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
   $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\n";
  }
$theData.="</dl>";
return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
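
To see what the two main regular expressions do without fetching a live feed, here is a minimal sketch that runs the same kind of ungreedy patterns against a small inline string. The sample feed and the helper name extract_items() are made up for illustration and are not part of the parser above. Unlike the parser, the sketch only handles tags without attributes.

```php
<?php
// Hypothetical sample feed, inlined so the sketch needs no network access.
$xml = '<rss><channel>'
     . '<item><title>First story</title><link>http://example.com/1</link></item>'
     . '<item><title>Second &amp; last</title><link>http://example.com/2</link></item>'
     . '</channel></rss>';

// Hypothetical helper: returns one array per <$container>, keyed by tag name.
function extract_items($xml, $container, $tags) {
    // s = dot also matches newlines, U = ungreedy, so each match stops at the
    // first closing tag instead of running on to the last one in the feed.
    preg_match_all("/<$container>.+<\/$container>/sU", $xml, $matches);
    $items = array();
    foreach ($matches[0] as $raw) {
        $fields = array();
        foreach ($tags as $tag) {
            // Capture the text between <tag> and </tag>; empty string if absent.
            $found = preg_match("/<$tag>(.+)<\/$tag>/sU", $raw, $m);
            $fields[$tag] = $found ? html_entity_decode($m[1]) : '';
        }
        $items[] = $fields;
    }
    return $items;
}

$items = extract_items($xml, 'item', array('title', 'link'));
print_r($items);
```

Without the U modifier the first pattern would swallow everything from the first opening container tag to the last closing one, and the whole feed would come back as a single item.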

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and Flickr).

bbc

The US Army private, jailed for leaking documents to Wikileaks, will be released in May.
Date: Tue, 17 Jan 2017 22:44:25 GMT
Nearly 1,000 Thomas Cook tourists will return from Gambia to the UK after Foreign Office advice.
Date: Tue, 17 Jan 2017 23:16:45 GMT
Theresa May sets out UK hopes for Brexit talks, warning that "no deal is better than a bad deal".
Date: Tue, 17 Jan 2017 19:18:57 GMT
More than 50 Democrats will miss Mr Trump's inauguration because of his attack on civil rights hero John Lewis.
Date: Tue, 17 Jan 2017 22:21:14 GMT
The BBC's Jeremy Bowen has visited the site of Aleppo's Umayyad Mosque to see what's left after the war in Syria.
Date: Tue, 17 Jan 2017 21:13:58 GMT
Lincoln City reach the fourth round of the FA Cup for the first time in 41 years with a deserved victory over Ipswich at Sincil Bank.
Date: Tue, 17 Jan 2017 23:08:15 GMT
The Riu Imperial Marhaba where 30 Britons were killed had just four unarmed guards, an inquest hears.
Date: Tue, 17 Jan 2017 21:15:43 GMT
An air force pilot thought he was attacking Boko Haram militants, the army says.
Date: Tue, 17 Jan 2017 20:59:01 GMT
Brother and sister in a critical condition after being struck by a car in County Antrim.
Date: Tue, 17 Jan 2017 22:54:03 GMT
A former Scotland international footballer and his ex-teammate are ruled to be rapists and ordered to pay £100,000 damages in a civil action.
Date: Tue, 17 Jan 2017 16:54:39 GMT

cnn

President Barack Obama on Tuesday commuted the sentence of Chelsea Manning and pardoned James Cartwright.
Date: Tue, 17 Jan 2017 22:55:36 GMT
U.S. Army soldier Chelsea Manning was sentenced to 35 years in prison for her role in leaking government documents to WikiLeaks.
Date: Tue, 17 Jan 2017 22:25:35 GMT
More than a million supporters of Edward Snowden have petitioned President Barack Obama to pardon him, but the former National Security Agency contractor hasn't submitted the required documents for clemency, according to the White House.
Date: Tue, 17 Jan 2017 21:17:51 GMT
President Barack Obama reduced or eliminated the sentences for hundreds more non-violent drug offenders on Tuesday, likely his final acts of clemency while in office.
Date: Tue, 17 Jan 2017 22:18:44 GMT
President-elect Donald Trump's secretary of interior nominee faced tough questions Tuesday about his and Trump's views on sexual assault during his Senate confirmation hearing.
Date: Tue, 17 Jan 2017 22:10:09 GMT
President-elect Donald Trump's nominee to lead the Interior Department pledged Tuesday to review Obama administration actions limiting oil and gas drilling in Alaska and said he does not believe climate change is a hoax.
Date: Tue, 17 Jan 2017 22:41:45 GMT
Ahead of Betsy DeVos' hearing to become education secretary, her experience with public schools is being called into question.
Date: Tue, 17 Jan 2017 20:10:24 GMT
Award-winning journalist Bob Woodward criticizes the Russia dossier that was presented to President-elect Donald Trump.
Date: Tue, 17 Jan 2017 15:58:41 GMT
President Barack Obama plans to travel to the Palm Springs area after the inauguration Friday, two sources familiar with his plans confirmed to CNN Tuesday.
Date: Tue, 17 Jan 2017 17:45:18 GMT
President Obama makes a surprise appearance as Press Secretary Josh Earnest delivers his final White House briefing.
Date: Tue, 17 Jan 2017 17:44:15 GMT

flickr

IMG-20161115-WA0034

agenciaperfilclass posted a photo:

IMG-20161115-WA0034

Date: 2017-01-17T23:22:38Z
YX57HJJ Vauxhall Movano with an unusual Gifa Collet Emergency Ambulance conversion, operated by Trust Medical,new to West Midlands AS Seen at Lancaster Royal Infirmary 28/12/2016

alanmagill1959 posted a photo:

YX57HJJ  Vauxhall Movano with an unusual Gifa Collet Emergency Ambulance conversion, operated by Trust Medical,new to West Midlands AS Seen at Lancaster Royal Infirmary 28/12/2016

Date: 2017-01-17T23:22:42Z
_DAM2497.jpg

St Thomas Aquinas College Athletics posted a photo:

_DAM2497.jpg

Date: 2017-01-17T23:22:42Z
City Year Milwaukee-31.jpg

cityyear posted a photo:

City Year Milwaukee-31.jpg

Date: 2017-01-17T23:22:45Z
IMG_7002

Acampamento Victoria posted a photo:

IMG_7002

Date: 2017-01-17T23:22:47Z
D-Link Camera Alert: BentleyLake.ca - Schedule Snapshot Recording

bentleylake posted a photo:

D-Link Camera Alert: BentleyLake.ca - Schedule Snapshot Recording

BentleyLake.ca schedule snapshot saved

Date: 2017-01-17T23:22:48Z
_UTA0502.jpg

ThatA480Guy posted a photo:

_UTA0502.jpg

Date: 2017-01-17T23:22:39Z
IMG_3089

Andy E. Nystrom posted a photo:

IMG_3089

In Port Angeles East but Port Angeles in background

Date: 2017-01-17T23:22:38Z
✈️✌

Michael Kerper posted a photo:

✈️✌

via Instagram ift.tt/2k2lzCH

Date: 2017-01-17T23:22:38Z
DTA 2016

famiglia_vienna posted a photo:

DTA 2016

Date: 2017-01-17T23:22:43Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
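
A minimal sketch of that if statement, following the behaviour of the $cdata argument in the function at the top of the page: 'hide' drops the CDATA-wrapped content altogether, and any other value keeps the content and strips only the markers. The helper name clean_cdata() is hypothetical and only there to make the example self-contained.

```php
<?php
// Hypothetical helper, not part of the original parser.
// $cdata == 'hide' removes CDATA sections entirely (headlines only);
// any other value keeps the wrapped content and strips only the markers.
function clean_cdata($text, $cdata = '') {
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    if ($cdata != 'hide') {
        return preg_replace($pattern, '$1', $text);
    }
    return preg_replace($pattern, '', $text);
}

$description = '<![CDATA[<p>Full story text</p>]]>';
echo clean_cdata($description), "\n";         // keeps the wrapped markup
echo clean_cdata($description, 'hide'), "\n"; // drops it
```

Passing 'hide' for one feed and an empty string for the next gives full descriptions on some feeds and headlines only on others.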

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these feeds and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

Thanks, that sorted out my CDATA parsing problem. It seems that is not too clear in the docs.

s
