RSS/XML feed parser

Here's some PHP:

PHP
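function xml_parser($page,$container,$tags,$number,$cdata) {
  // Default to at most 100 items when no limit is given.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Grab every <container>...</container> block (ungreedy, across newlines).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // $this is reserved for objects and fails in a plain function on current PHP, so use $fields.
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
      // Find the whole element, then strip its tags when the opening tag has no attributes.
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    // Keep only the first $number items.
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
  }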
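  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for($i=0; $i<count($tags); $i++) {
      // A tag missing from this item yields an empty string instead of a notice.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }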
   $title=$data[0];
   $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
   $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
   $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
   $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
   // Unwrap CDATA sections, or drop them entirely when $cdata is 'hide'.
   $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
   $dreplacements[4]=($cdata!='hide') ? '$1' : '';
   $description=preg_replace($dpatterns,$dreplacements,$data[1]);
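   // For Atom-style links, pull the URL out of the href attribute, then escape it for use in the href below.
   $link=htmlspecialchars(preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]));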
   $date=$data[3];
   $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\n";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
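
The regular expressions above are sensitive to attributes and nesting inside the tags. As a rough alternative sketch (not the method used on this page), PHP's SimpleXML extension can read the same RSS fields. The function name `feed_items()` and the inline sample feed are made up for illustration, and the sketch only handles RSS-style `<channel>`/`<item>` feeds.

```php
<?php
// Hypothetical helper: returns up to $number items from an RSS 2.0 string.
// Assumes the SimpleXML extension is available.
function feed_items($xml, $number = 10) {
    // LIBXML_NOCDATA turns CDATA sections into ordinary text nodes.
    $feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($feed === false) {
        return array(); // not well-formed XML
    }
    $items = array();
    foreach ($feed->channel->item as $item) {
        if (count($items) >= $number) {
            break;
        }
        $items[] = array(
            'title'       => (string) $item->title,
            'description' => (string) $item->description,
            'link'        => (string) $item->link,
            'date'        => (string) $item->pubDate,
        );
    }
    return $items;
}

// Made-up sample feed so the sketch runs without a network connection.
$sample = <<<XML
<rss version="2.0"><channel>
  <item>
    <title>First story</title>
    <description><![CDATA[<p>Some <b>HTML</b> text</p>]]></description>
    <link>http://example.com/1</link>
    <pubDate>Wed, 26 Nov 2014 17:47:29 GMT</pubDate>
  </item>
  <item>
    <title>Second story</title>
    <description>Plain text</description>
    <link>http://example.com/2</link>
    <pubDate>Wed, 26 Nov 2014 17:30:30 GMT</pubDate>
  </item>
</channel></rss>
XML;

$items = feed_items($sample, 1);
echo "<dl>\n";
foreach ($items as $item) {
    echo '<dt><a href="' . htmlspecialchars($item['link']) . '">'
       . htmlspecialchars($item['title']) . "</a></dt>\n";
    // The description is deliberately left unescaped because it may contain HTML.
    echo '<dd class="story">' . $item['description'] . "</dd>\n";
    echo '<dd>Date: ' . htmlspecialchars($item['date']) . "</dd>\n";
}
echo "</dl>\n";
```

For a live feed you would pass in the result of `file_get_contents()` on the feed URL instead of `$sample`.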

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and Flickr).

bbc

Britons jailed over Syria terror camp
Two brothers from east London become the first Britons to be jailed for terrorism training in Syria.
Date: Wed, 26 Nov 2014 17:47:29 GMT
Family 'crushed' by Ferguson ruling
The family of Michael Brown, the black teenager shot dead by a policeman, say they are "crushed" by the decision not to charge Darren Wilson.
Date: Wed, 26 Nov 2014 17:30:30 GMT
Care home boss guilty of child abuse
A former children's homes boss is found guilty of 26 charges of sexually abusing youngsters in Wrexham.
Date: Wed, 26 Nov 2014 14:12:23 GMT
Meters overcharge 1.5m gas customers
About 1.5 million gas customers with pre-payment meters have been overcharged because their meters are faulty, the industry has admitted.
Date: Wed, 26 Nov 2014 16:32:05 GMT
Gangster Frankie Fraser dies, aged 90
Former gangland enforcer "Mad" Frankie Fraser has died in hospital, aged 90
Date: Wed, 26 Nov 2014 18:02:54 GMT
Concerns over terror exclusion plans
The UK's reviewer of terrorism laws raises concerns about plans to exclude people from the UK if they go abroad to fight with extremist groups, as a new counter-terror bill is published
Date: Wed, 26 Nov 2014 18:04:55 GMT
Burglars admit attacking lecturer
Four men admit a burglary during which a university lecturer was savagely beaten at his south London home.
Date: Wed, 26 Nov 2014 16:22:33 GMT
Facebook Lee Rigby attack 'unfair'
It would be "almost impossible" for internet firms to monitor all website postings for possible terrorist content, a former MI6 director says.
Date: Wed, 26 Nov 2014 16:49:23 GMT
'Radical' Scottish land reform plan
Scottish First Minister Nicola Sturgeon announces plans to take action against landowners who pose a "barrier" to development.
Date: Wed, 26 Nov 2014 18:03:08 GMT
Royal Mail 'scaremongering' on post
The boss of Royal Mail is accused of "scaremongering" by the business secretary, after telling MPs that there is a threat to the universal service.
Date: Wed, 26 Nov 2014 16:18:04 GMT

cnn

Rare Shakespeare's First Folio found
A librarian in northern France made what may be the discovery of his lifetime when he uncovered a rare Shakespeare's First Folio in his library's collection.
Date: Wed, 26 Nov 2014 11:07:11 EST
Opinion: A 'Tax Day' all could love?
Edward McCaffery says Americans can take a page from Finland, where tax returns of all citizens are made public
Date: Wed, 26 Nov 2014 11:51:31 EST
Pelosi: Senate Dem leader is wrong
Nancy Pelosi says there are 14 million reasons why Chuck Schumer is wrong.
Date: Wed, 26 Nov 2014 06:50:05 EST
DNA pioneer to sell Nobel Prize
DNA pioneer James Watson is to sell the Nobel Prize he won for his co-discovery of the double helix structure -- the building block of life.
Date: Wed, 26 Nov 2014 07:54:03 EST
See laser shoot drone from sky
The U.S. Navy is deploying a secret weapon to the Persian Gulf: A laser cannon that can bring down drones and ships.
Date: Mon, 24 Nov 2014 13:52:06 EST
7 of the most beautiful lake lodges
As far as bodies of water go, lakes are easily the most overlooked.
Date: Tue, 25 Nov 2014 17:38:03 EST
Life-size hologram greets travelers
A life-size hologram named 'Eva' will tells passengers what they need to know before they board their flights.
Date: Wed, 26 Nov 2014 06:37:36 EST
'Sad' Christmas tree will stay put
The 50-foot Christmas tree decorating downtown Reading, Pennsylvania, was supposed to spread holiday cheer, but instead it made some residents unhappy.
Date: Tue, 25 Nov 2014 17:57:46 EST
Are millennials really lazy?
Are millennials really lazy? Hear what Mike Rowe has to say. "Somebody's Gotta Do It", Wed at 9pm EST on CNN.
Date: Wed, 26 Nov 2014 06:37:14 EST
'CSI' surprise: George Eads to exit
"CSI" star George Eads will soon wash his hands of all those crime scenes.
Date: Tue, 25 Nov 2014 11:41:27 EST

flickr

Qatar arrests workers for protesting over low pay. http://t.co/iyQ6sbtMhB | #Qatar | | http://t.co/M2MPBsPFm4

movement news posted a photo:

Qatar arrests workers for protesting over low pay. http://t.co/iyQ6sbtMhB  | #Qatar | | http://t.co/M2MPBsPFm4

via Twitter twitter.com/movement_news

Date: 2014-11-26T18:21:29Z
His best angry face-30.jpg

V Y8s posted a photo:

His best angry face-30.jpg

Date: 2014-11-26T18:21:26Z
IMG_20141126_102718

SeninleyimDaima posted a photo:

IMG_20141126_102718

Date: 2014-11-26T18:21:28Z
Новое видео! New video! http://ift.tt/15zjDGr

skladgovna posted a photo:

Новое видео! New video! http://ift.tt/15zjDGr

by annie_aster ift.tt/15zjFOz ift.tt/15zjDWM

Date: 2014-11-26T18:21:29Z
DSC_5197.jpg

thunderbird-72 posted a photo:

DSC_5197.jpg

Date: 2014-11-26T18:21:22Z
Thanks to everybody that took time out of their day to come see me at Dangerfield's. Much appreciated

Magicdusa posted a photo:

Thanks to everybody that took time out of their day to come see me at Dangerfield's. Much appreciated

Date: 2014-11-26T18:21:27Z
(Untitled)

c8linchrist posted a photo:

Date: 2014-11-26T18:21:28Z
This is a mole cricket (Neocurtilla hexadactyla), named for its shovel-like forelimbs highly developed for digging! Similar forelimbs are found on moles!

jack.c.koch posted a photo:

This is a mole cricket (Neocurtilla hexadactyla), named for its shovel-like forelimbs highly developed for digging! Similar forelimbs are found on moles!

Date: 2014-11-26T18:21:29Z
Effraction - Format (46)

clodyus posted a photo:

Effraction - Format (46)

Date: 2014-11-26T18:21:28Z
AfterTheLeaves@LkMinne 11-9-14 (567)

CAKopp posted a photo:

AfterTheLeaves@LkMinne  11-9-14 (567)

Date: 2014-11-26T18:21:24Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.
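
To make the include option concrete, here is a rough sketch. The file name `parser_demo.php` and the `greet()` function are placeholders standing in for the real parser file and `xml_parser()`; the file is written to the temp directory only so that the example runs on its own.

```php
<?php
// Stand-in for a shared file such as parser.php (hypothetical name).
// In practice you would save the function in that file by hand.
$lib = sys_get_temp_dir() . '/parser_demo.php';
file_put_contents($lib, '<?php function greet($name) { return "Hello, $name"; }');

// Any page with a .php extension can now pull the function in.
require_once $lib;
?>
<h2>demo</h2>
<?php echo greet('dave'); ?>
```

The same `require_once` line could go at the top of every page that needs the function.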

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
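
A minimal sketch of that idea, mirroring the `$cdata` argument of the function at the top of the page: passing 'hide' drops the CDATA content, anything else unwraps it. `strip_cdata()` is a made-up helper name, and its regex is a simplified variant of the one in the parser.

```php
<?php
// Hypothetical helper: unwrap or remove CDATA sections in a string.
function strip_cdata($text, $cdata = '') {
    // 'hide' removes the section entirely; otherwise keep its contents.
    $replacement = ($cdata == 'hide') ? '' : '$1';
    return preg_replace('/<!\[CDATA\[(.*?)\]\]>/s', $replacement, $text);
}

$description = '<![CDATA[<p>Full <b>story</b></p>]]>';

// Full details for one feed...
echo strip_cdata($description), "\n";         // <p>Full <b>story</b></p>
// ...and nothing but the headline for the next one.
echo strip_cdata($description, 'hide'), "\n"; // (empty line)
```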

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?
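
If "merge" here means combining several feeds into one list, one possible approach is sketched below. It assumes each item is already an array with `title` and `date` keys; the parser on this page returns an HTML string, so it would need to return its item array instead. `merge_latest()` and the sample data are invented for illustration.

```php
<?php
// Hypothetical helper: merge several item lists and keep the newest $limit.
function merge_latest(array $feeds, $limit = 10) {
    $all = array();
    foreach ($feeds as $items) {
        $all = array_merge($all, $items);
    }
    // Sort newest first; strtotime() reads both date styles shown in the output above.
    usort($all, function ($a, $b) {
        $ta = strtotime($a['date']);
        $tb = strtotime($b['date']);
        if ($ta == $tb) { return 0; }
        return ($ta < $tb) ? 1 : -1;
    });
    return array_slice($all, 0, $limit);
}

// Made-up items using date formats like those in the feeds.
$bbcItems = array(
    array('title' => 'BBC one', 'date' => 'Wed, 26 Nov 2014 17:47:29 GMT'),
    array('title' => 'BBC two', 'date' => 'Wed, 26 Nov 2014 14:12:23 GMT'),
);
$flickrItems = array(
    array('title' => 'Flickr one', 'date' => '2014-11-26T18:21:29Z'),
);

// Prints the two newest titles: "Flickr one", then "BBC one".
foreach (merge_latest(array($bbcItems, $flickrItems), 2) as $item) {
    echo $item['title'], "\n";
}
```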

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
