RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}
  // Fetch the feed; give up quietly if it cannot be read.
  $xml=file_get_contents($page);
  if ($xml===false) {return '';}
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
   foreach ($items as $item) {
    // Collect the matched text for each requested tag of this item.
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
     preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
     $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
     $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    // Keep at most $number items.
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
   }
  $theData="<dl>";
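  foreach ($itemsArray as $item) {
   for($i=0; $i<count($tags); $i++) {
    // Use the whole match for each tag, or an empty string if the tag was missing.
    $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
   }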
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   if ($cdata!='hide') {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='$1';
   }
   else {
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU"; $dreplacements[4]='';
   }
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
   $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\r";
  }
$theData.="</dl>";
return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
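
To see what the two regular-expression passes do without fetching anything over the network, here is a minimal sketch run against an inline sample. The sample feed string and the helper name extract_tag_text() are assumptions made for illustration; they are not part of the function above.

```php
<?php
// Hypothetical inline sample standing in for a downloaded feed.
$xml = '<rss><channel>'
     . '<item><title>First &amp; foremost</title><link>http://example.com/1</link></item>'
     . '<item><title>Second</title><link>http://example.com/2</link></item>'
     . '</channel></rss>';

// Hypothetical helper: return the text inside the first <$tag>...</$tag> of $item.
function extract_tag_text($item, $tag) {
    $quoted = preg_quote($tag, '/');
    // s lets . match newlines; U makes quantifiers lazy, so the match
    // stops at the first closing tag instead of the last one.
    if (preg_match("/<$quoted(?:\s[^>]*)?>(.*)<\/$quoted>/sU", $item, $m)) {
        return html_entity_decode($m[1]);
    }
    return '';
}

// Step 1: split the feed into one block per <item>.
preg_match_all('/<item>.+<\/item>/sU', $xml, $matches);
$items = $matches[0];

// Step 2: pull the requested tag out of each block.
$titles = array();
foreach ($items as $item) {
    $titles[] = extract_tag_text($item, 'title');
}

print_r($titles);
```

The lazy (U) modifier is what keeps each match inside a single item; without it, the first pattern would swallow everything from the first opening tag to the last closing tag.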

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get (the latest feeds from the BBC, CNN, and flickr).

bbc

Abuse victims 'want Woolf to quit'
Victims' groups are expected to tell child abuse inquiry officials that Fiona Woolf should step down as its head as concerns about her suitability persist.
Date: Fri, 31 Oct 2014 02:35:34 GMT
Alcohol 'should have calorie labels'
Alcohol should have a calorie content label in order to reduce obesity, according to public health doctors.
Date: Fri, 31 Oct 2014 00:10:00 GMT
Person missing after fireworks blaze
One person remains missing after a major blaze in a Stafford fireworks factory that also leaves two people needing hospital treatment.
Date: Fri, 31 Oct 2014 07:23:49 GMT
RBS reserves £400m for currency probe
Royal Bank of Scotland has set aside £400m to cover costs into possible manipulation of the foreign exchange market.
Date: Fri, 31 Oct 2014 08:09:33 GMT
Key Jerusalem holy site 'to reopen'
Israel promises to reopen a key Jerusalem holy site after closing it following the shooting of a Jewish activist.
Date: Fri, 31 Oct 2014 01:47:43 GMT
ChildLine seeing more suicidal children
More children who have talked about killing themselves are contacting ChildLine for help, the counselling service says.
Date: Fri, 31 Oct 2014 06:10:50 GMT
'Do more' to tackle aid corruption
The UK government is not doing enough to tackle "petty corruption" in countries to which it gives aid, a report by a scrutiny body says.
Date: Fri, 31 Oct 2014 01:24:26 GMT
Lavish praise 'does not help pupils'
Teachers who try to encourage low-achieving students with lots of praise could harm their learning, claims research into classroom tactics.
Date: Fri, 31 Oct 2014 00:46:01 GMT
Russia-Ukraine deals secures EU gas
Russia will resume gas deliveries to Ukraine this winter in a deal brokered by the European Union, which will also safeguard supplies to EU countries.
Date: Fri, 31 Oct 2014 01:08:33 GMT
Burkina Faso leader 'to stay on'
Burkina Faso's President Blaise Compaore says he will stay in power for a year under a transitional government, despite violent protests.
Date: Fri, 31 Oct 2014 01:58:48 GMT

cnn

Comedian, rap mogul arrested
Former rap mogul Marion "Suge" Knight and comedian Micah "Katt" Williams were arrested Wednesday, accused of stealing a photographer's camera last month.
Date: Thu, 30 Oct 2014 15:57:19 EDT
'House of Cards' actress dies
It's a sad time for the cast and crew of "House of Cards."
Date: Thu, 30 Oct 2014 15:57:51 EDT
Convicted mayor wants 'final rodeo'
Tuesday's election in the littlest of states is causing the biggest of stirs.
Date: Thu, 30 Oct 2014 18:32:22 EDT
New clue in Amelia Earhart mystery?
Could one of aviation's most enduring mysteries be solved? An aircraft recovery group says it may already have a part of Amelia Earhart's plane, and it thinks it knows where to find the rest of it.
Date: Thu, 30 Oct 2014 12:47:39 EDT
Hagel memo criticized Syria strategy
Earlier this month, while on an trip to Latin America to discuss climate change, Defense Secretary Chuck Hagel sat down and wrote a highly private, and very blunt memo to National Security Advisor Susan Rice about U.S. policy toward Syria.
Date: Thu, 30 Oct 2014 18:29:18 EDT
'Unusual' Russian flights over Europe
An "unusual" uptick in the size and scale of Russian aircraft flying throughout European airspace in recent days has raised alarm bells for NATO officials.
Date: Thu, 30 Oct 2014 06:33:43 EDT
What's fishy about your shrimp?
You can barbecue it, boil it, broil it, bake it and saute it, but one of America's favorite seafoods -- the humble shrimp -- might not be what you think it is.
Date: Thu, 30 Oct 2014 13:58:58 EDT
Scientists link 60 genes to autism
Researchers have found dozens of new genes that may play a role in causing autism, according to two studies published Wednesday in the medical journal Nature.
Date: Thu, 30 Oct 2014 15:56:57 EDT
Students: Dump Maher from graduation
The controversy over having TV host Bill Maher speak at the University of California Berkeley has taken another turn. Well, make that two.
Date: Thu, 30 Oct 2014 07:25:05 EDT
Senator's jokes caught on tape
South Carolina Sen. Lindsey Graham, who is toying with the idea of a presidential bid, joked in a private gathering this month that "white men who are in male-only clubs are going to do great in my presidency," according to an audio recording of his comments provided to CNN.
Date: Thu, 30 Oct 2014 16:40:26 EDT

flickr

Dekoration

Zytostatika posted a photo:

Dekoration

Date: 2014-10-31T08:13:28Z
Petit déjeuner : Routine : #thévert , #jus de fruits #pomme #poire #kiwi , #crunchy #lait écrémé . Bonne journée à vous !!! #motivation#eatclean#healthy#fitness#diet#fit#weightloss#healthylife#regime#regimeuse#instaregime#instaregimeuse#reequilibragealime

Luciaschallenge posted a photo:

Petit déjeuner : Routine : #thévert , #jus de fruits #pomme #poire #kiwi , #crunchy #lait écrémé . Bonne journée à vous !!! #motivation#eatclean#healthy#fitness#diet#fit#weightloss#healthylife#regime#regimeuse#instaregime#instaregimeuse#reequilibragealime

Date: 2014-10-31T08:13:30Z
看到拉庫音溪山屋了!!!

bagwolf525 posted a photo:

看到拉庫音溪山屋了!!!

Date: 2014-10-31T08:13:21Z
2104/10/31不分系對建築系籃球對抗賽 運動攝影拍照練習

klsd93040118 posted a photo:

2104/10/31不分系對建築系籃球對抗賽 運動攝影拍照練習

Date: 2014-10-31T08:13:12Z
DSC06566.jpg

tanja kitaina posted a photo:

DSC06566.jpg

Date: 2014-10-31T08:13:15Z
ISC_18-05-2013 (128)

truyenlm posted a photo:

ISC_18-05-2013 (128)

Date: 2014-10-31T08:13:21Z
Mom and children

Geobert Quach posted a photo:

Mom and children

Date: 2014-10-31T08:13:15Z
Good Morning.

fmpreuss posted a photo:

Good Morning.

Date: 2014-10-31T08:13:19Z
DSC_0397.jpg

IrisEve posted a photo:

DSC_0397.jpg

Date: 2014-10-31T08:13:20Z
DSC_8811

lovesymphony98 posted a photo:

DSC_8811

Date: 2014-10-31T08:13:32Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the PHP or just have it in one page. The page would have a '.php' extension, not '.html'.
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.
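
The "separate file" option described above can be sketched roughly like this. To keep the sketch runnable on its own, it writes a tiny stand-in file to the temp directory first; on a real site that file would simply sit next to the page. The file name parser_demo.php and the function feed_heading() are made up for illustration.

```php
<?php
// Stand-in for a shared file (hypothetical name). On a real site you would
// save the parser function in its own .php file instead of generating it.
$shared = sys_get_temp_dir() . '/parser_demo.php';
file_put_contents(
    $shared,
    '<?php function feed_heading($name) { return "<h2>" . htmlspecialchars($name) . "</h2>"; }'
);

// Any page that needs the function pulls the file in once,
// then calls the function as if it were defined locally.
require_once $shared;

echo feed_heading('bbc');
```

Using require_once rather than include means the page stops with an error if the file is missing, and the function is never declared twice.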

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
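
A minimal sketch of that kind of switch, modelled on the $cdata argument in the function at the top of the page: passing 'hide' drops the CDATA sections entirely, and anything else keeps their contents but strips the wrapper. The helper name handle_cdata() and the sample description are assumptions for illustration only.

```php
<?php
// Hypothetical helper: 'hide' removes CDATA sections, any other mode unwraps them.
function handle_cdata($text, $mode = '') {
    // Lazy match so each CDATA section is handled on its own; s lets . span newlines.
    $pattern = '/<!\[CDATA\[(.*?)\]\]>/s';
    if ($mode === 'hide') {
        return preg_replace($pattern, '', $text);
    }
    return preg_replace($pattern, '$1', $text);
}

// Made-up description containing a CDATA section.
$description = 'Summary <![CDATA[<p>Full story</p>]]>';

echo handle_cdata($description), "\n";          // wrapper stripped, contents kept
echo handle_cdata($description, 'hide'), "\n";  // whole section removed
```

Because the mode is passed per call, one feed can show the fuller description while the next is reduced to headlines.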

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?
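
One possible approach to merging, sketched under a big assumption: the function above returns finished HTML, so this presumes the items are available as arrays with 'title' and 'date' keys before any markup is built. The array shape and the function name latest_items() are hypothetical.

```php
<?php
// Hypothetical parsed items, using titles and dates from the sample output above.
$bbcItems = array(
    array('title' => "Abuse victims 'want Woolf to quit'", 'date' => 'Fri, 31 Oct 2014 02:35:34 GMT'),
    array('title' => 'Person missing after fireworks blaze', 'date' => 'Fri, 31 Oct 2014 07:23:49 GMT'),
);
$cnnItems = array(
    array('title' => 'Comedian, rap mogul arrested', 'date' => 'Thu, 30 Oct 2014 15:57:19 EDT'),
);

// Merge any number of item lists, sort newest first, and keep the first $limit.
function latest_items(array $feeds, $limit = 10) {
    $all = array();
    foreach ($feeds as $items) {
        $all = array_merge($all, $items);
    }
    usort($all, function ($a, $b) {
        // strtotime() turns the feed's date string into a Unix timestamp.
        return strtotime($b['date']) <=> strtotime($a['date']);
    });
    return array_slice($all, 0, $limit);
}

foreach (latest_items(array($bbcItems, $cnnItems), 10) as $item) {
    echo $item['date'], ' ', $item['title'], "\n";
}
```

Sorting on timestamps rather than on the raw strings matters because the feeds use different date formats and time zones.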

#8
2007-11-02 steve says :

Thanks, this sorted out my CDATA parsing problem. It seems that is not too clear in the docs.

s
