RSS/XML feed parser

Here's some PHP:

PHP
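// Fetch a feed and return up to $number items as an HTML definition list.
// $container: element wrapping each item ('item' for RSS, 'entry' for Atom).
// $tags: element names for title, description, link and date, in that order.
// $cdata: pass 'hide' to drop CDATA content; any other value unwraps it.
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Collect every <container>...</container> block (s: dot matches newlines, U: lazy).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml,$items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // $this is reserved in PHP, so the per-item matches go in $fields.
    $fields=array();
    for ($i=0; $i<count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU",$item,$tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode',$fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray,$fields);}
  }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for ($i=0; $i<count($tags); $i++) {
      // A tag missing from this item becomes an empty string.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Tidy the description: normalise <img> tags, drop <content> wrappers and border attributes.
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    if ($cdata!='hide') {
      // Keep the CDATA content but remove the wrapper.
      $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
    }
    else {
      // Remove the CDATA section along with its content.
      $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
    }
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom feeds carry the URL in an href attribute rather than as element text.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\n";
  }
  $theData.="</dl>";
  return $theData;
}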

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
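
To see the extraction step on its own, here is a minimal offline sketch. It is not the function above: `extract_fields()` and the inline sample feed are made up for illustration, and the tag pattern is a slightly stricter variant that allows attributes on the opening tag.

```php
<?php
// Inline sample feed (made up for illustration) so the example runs offline.
$xml = <<<XML
<rss><channel>
<item><title>First story</title><pubDate>Sat, 25 Mar 2017 03:29:42 GMT</pubDate></item>
<item><title>Second &amp; last</title><pubDate>Sat, 25 Mar 2017 05:44:23 GMT</pubDate></item>
</channel></rss>
XML;

// Hypothetical helper: returns one associative array per container element.
function extract_fields($xml, $container, array $tags) {
  $c = preg_quote($container, '/');
  // s: dot matches newlines; U: quantifiers are lazy, so each match
  // stops at the first closing container tag.
  preg_match_all("/<$c>.+<\/$c>/sU", $xml, $items);
  $rows = array();
  foreach ($items[0] as $item) {
    $row = array();
    foreach ($tags as $tag) {
      $t = preg_quote($tag, '/');
      // Allow attributes on the opening tag; capture only the element content.
      $found = preg_match("/<$t(?:\s[^>]*)?>(.*)<\/$t>/sU", $item, $m);
      // A missing tag becomes an empty string instead of an undefined index.
      $row[$tag] = $found ? html_entity_decode($m[1]) : '';
    }
    $rows[] = $row;
  }
  return $rows;
}

$rows = extract_fields($xml, 'item', array('title', 'pubDate'));
print_r($rows);
```

Regular expressions are fine for a quick sketch like this, but they can break on feeds with nested or unusually formatted elements.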

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and Flickr).

bbc

US President Donald Trump suffers a major setback as his healthcare bill is withdrawn from Congress.
Date: Sat, 25 Mar 2017 03:29:42 GMT
Most of those arrested after the Westminster attack are freed as police investigate if the killer acted alone.
Date: Sat, 25 Mar 2017 05:44:23 GMT
The UN looks into reports that 200 people, mostly civilians, died in an air strike in the past week.
Date: Fri, 24 Mar 2017 22:54:36 GMT
Germany "lost" World War Two but is now being helped to "win the peace", veteran Tory politician says.
Date: Fri, 24 Mar 2017 16:36:09 GMT
The cast of Love Actually, Mrs Brown and Ed Sheeran persuade viewers to part with cash for charity.
Date: Sat, 25 Mar 2017 03:22:22 GMT
At least 11 people killed a sheep, undressed, and chained themselves up at the former Nazi camp.
Date: Fri, 24 Mar 2017 19:58:12 GMT
The UK's biggest supermarket will keep trolleys unlocked while it converts them to the new £1 coin.
Date: Fri, 24 Mar 2017 19:57:17 GMT
The Financial Exclusion Committee says banks are failing the customers who need them the most.
Date: Sat, 25 Mar 2017 00:08:14 GMT
Tini Owens says she has been left in a 'wretched predicament' locked in a 'loveless marriage'.
Date: Fri, 24 Mar 2017 13:59:48 GMT
The motorway is now fully open but long delays remain, Highways England say.
Date: Fri, 24 Mar 2017 18:33:01 GMT

cnn

House Speaker Paul Ryan is at the White House to brief President Donald Trump on the GOP health care bill, and it is not to deliver good news, a Republican source tells CNN.
Date: Fri, 24 Mar 2017 22:48:59 GMT
President Donald Trump likes winning. But on Friday he failed.
Date: Sat, 25 Mar 2017 02:46:15 GMT
The panel discusses fallout from the House Republican health care bill failure.
Date: Sat, 25 Mar 2017 03:55:19 GMT
Take a look at the week in politics from March 19 through March 25.
Date: Sat, 25 Mar 2017 02:17:47 GMT
Next move on Obamacare? It's up to President Trump.
Date: Sat, 25 Mar 2017 02:12:17 GMT
House Speaker Paul Ryan conceded the biggest defeat of his political career Friday: Republicans have failed to repeal and replace Obamacare.
Date: Sat, 25 Mar 2017 02:43:18 GMT
Sen. Bernie Sanders and President Donald Trump seem to agree on one thing when it comes to the failure of the GOP health care bill: Blame the Democrats.
Date: Sat, 25 Mar 2017 03:51:41 GMT
It seemed like a just-about-even, all-too-familiar fight: traditional corporate power players staring down scrappier and ultimately hungrier insurgents dead-set on torpedoing the Republican health care bill.
Date: Sat, 25 Mar 2017 05:13:45 GMT
What a fiasco.
Date: Sat, 25 Mar 2017 05:11:12 GMT
Date:

flickr

20170317174308-8ed8_wm.jpg

nhadatvideo posted a photo:

20170317174308-8ed8_wm.jpg

Date: 2017-03-25T05:54:48Z
Webdriver Cavi - Test

WebdriverCavi posted a photo:

Webdriver Cavi - Test

Date: 2017-03-25T05:54:47Z
Laura

Bert Rymenams posted a photo:

Laura

via Instagram ift.tt/2nRYOTI

Date: 2017-03-25T05:54:40Z
_DSC2888

ClintonC posted a photo:

_DSC2888

Date: 2017-03-25T05:54:44Z
2017-03-25_12-54-13

Dang Thanh Nguyen posted a photo:

2017-03-25_12-54-13

Date: 2017-03-25T05:54:34Z
Ferrari World Abu Dhabi

Bizarro's Theme Park Photography posted a photo:

Ferrari World Abu Dhabi

Tuesday 21st March, 2017

ECC Arabian Adventure 2017

European Coaster Club

Date: 2017-03-25T05:54:36Z
Quirky CubaDupa bar (1)

4nitsirk posted a photo:

Quirky CubaDupa bar (1)

CubaDupa is a street festival in Wellington, New Zealand. In 2017 it took place from 25 to 26 March 2017.

www.cubadupa.co.nz/

Date: 2017-03-25T05:54:39Z
upload

Pierre swarts posted a photo:

upload

Date: 2017-03-25T05:54:39Z
Nox, Eye To The Galaxy #space #tutorialtuesday #portrait #surreal #conceptual #abstract #photooftheday #photography #artwork #art #fantasy #scifi #steampunk #jj_creative #granmersunite #superhubs #fa_hypnotic #surrel42 #igcreative_editz #ig_underground

dracogem posted a photo:

Nox, Eye To The Galaxy #space #tutorialtuesday #portrait #surreal #conceptual #abstract #photooftheday #photography  #artwork  #art #fantasy #scifi #steampunk #jj_creative #granmersunite #superhubs  #fa_hypnotic #surrel42 #igcreative_editz #ig_underground

Date: 2017-03-25T05:54:39Z
Mca 2016 - 2017

si_nedvon0280 posted a photo:

Mca 2016 - 2017

Date: 2017-03-25T05:54:41Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.
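
As a rough sketch of the include approach: the snippet below writes a stand-in for the shared file to a temporary location and then pulls it in with `require_once`. The file name and `demo_heading()` are assumptions made for this example. In practice the shared file would hold `xml_parser()` and sit next to the page.

```php
<?php
// Stand-in for a shared file such as parser.php (the file name and
// demo_heading() are assumptions made for this example).
$lib = sys_get_temp_dir() . '/parser_demo.php';
file_put_contents($lib, '<?php function demo_heading($name) { return "<h2>$name</h2>"; }');

// Any page with a .php extension can pull the shared code in like this.
require_once $lib;

echo demo_heading('bbc'), "\n";

// Clean up the temporary file; the function stays defined for this request.
unlink($lib);
```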

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
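
A minimal sketch of that if-statement idea as a standalone helper. `handle_cdata()` and its `$mode` argument are made-up names for illustration, and the 'hide' value mirrors the `$cdata` argument of `xml_parser()`.

```php
<?php
// Hypothetical helper, not part of the original function.
// $mode 'hide' drops CDATA sections entirely; anything else keeps their content.
function handle_cdata($text, $mode = '') {
  $replacement = ($mode === 'hide') ? '' : '$1';
  // s: dot matches newlines; .*? is lazy so each CDATA section is handled separately.
  return preg_replace('/<!\[CDATA\[(.*?)\]\]>/s', $replacement, $text);
}

$description = 'Intro <![CDATA[<p>Full story text</p>]]> end';
echo handle_cdata($description), "\n";         // unwraps the CDATA content
echo handle_cdata($description, 'hide'), "\n"; // removes it
```

Passing a different `$mode` for each feed would give full descriptions for one site and headlines only for the next.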

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
