RSS/XML feed parser

Here's some PHP:

PHP
function xml_parser($page,$container,$tags,$number,$cdata) {
  // Show at most 100 items when no limit is given.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Collect every <container>...</container> block (s: dot matches newlines, U: ungreedy).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml,$items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // Matches for each requested tag, in the same order as $tags.
    $fields=array();
    for ($i=0; $i<count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU",$item,$tag);
      // Strip a plain <tag>...</tag> wrapper, leaving just the content.
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode',$fields[$i]);
    }
    if (count($itemsArray)<$number) {array_push($itemsArray,$fields);}
  }
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for ($i=0; $i<count($tags); $i++) {
      // Element 0 is the tag's content; use '' when the tag was not found.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Tidy the description: normalise <img> tags, drop <content> wrappers and border="0".
    $dpatterns=array(); $dreplacements=array();
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    // CDATA: keep the wrapped content unless $cdata is 'hide', in which case drop it.
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4]=($cdata!='hide') ? '$1' : '';
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom-style <link href="..."/> elements: keep only the URL.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\n";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
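
To see what the item-matching step does without fetching a live feed, here is a minimal sketch that runs the same kind of pattern over a small inline string. The helper name `extract_blocks` and the sample XML are made up for illustration; they are not part of the parser above.

```php
<?php
// Hypothetical helper: returns every <$container>...</$container> block in $xml.
function extract_blocks($xml, $container) {
    // preg_quote guards against regex metacharacters in the tag name.
    $name = preg_quote($container, '/');
    // s: dot also matches newlines; U: quantifiers are ungreedy, so each
    // match stops at the first closing tag rather than the last one.
    preg_match_all("/<$name>.+<\/$name>/sU", $xml, $matches);
    return $matches[0];
}

// Made-up sample feed fragment.
$sample = "<channel>\n"
        . "<item><title>First</title></item>\n"
        . "<item><title>Second</title></item>\n"
        . "</channel>";

$blocks = extract_blocks($sample, 'item');
echo count($blocks), "\n";   // number of <item> blocks found
echo $blocks[0], "\n";       // the first block, tags included
```

Without the `U` modifier the single match would run from the first `<item>` to the last `</item>`, which is why the pattern is written ungreedy.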

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and flickr).

bbc

N Korea seeks joint Sony hack probe
North Korea urges a joint inquiry with the US into a cyber-attack on Sony that led to the studio cancelling the release of the film The Interview.
Date: Sat, 20 Dec 2014 12:13:16 GMT
Off-duty PC murder suspect arrested
A 28-year-old man is detained on suspicion of the murder of an off-duty PC who died after being attacked in Liverpool city centre.
Date: Sat, 20 Dec 2014 02:05:15 GMT
Cook 'gutted' to miss World Cup
Alastair Cook is "disappointed" to be left out of England's World Cup squad after Eoin Morgan is named new one-day captain.
Date: Sat, 20 Dec 2014 09:48:35 GMT
£1.2bn spent on 'Panic Saturday'
UK shoppers are expected to spend a record £1.2bn buying Christmas presents and groceries on what researchers are calling "Panic Saturday".
Date: Sat, 20 Dec 2014 13:09:47 GMT
Cairns children's mother arrested
Police arrest the mother of seven of the eight children found dead in a home in Cairns, Australia.
Date: Sat, 20 Dec 2014 12:26:13 GMT
Plan to help over-50s get jobs
Unemployed over-50s will be offered "career reviews" and help using computers as part of plans to get more people in that age group into work.
Date: Sat, 20 Dec 2014 11:44:07 GMT
Hart still UK's most desirable place
The most desirable place to live in the UK is named as Hart in Hampshire where residents are found to be the healthiest and live the longest.
Date: Sat, 20 Dec 2014 12:07:31 GMT
Migration system 'in intensive care'
A parliamentary committee warns ministers over UK exit controls, record-keeping and the use of a net migration target.
Date: Sat, 20 Dec 2014 14:18:26 GMT
Man remanded over stabbing death
A man is remanded in custody over the murder of a woman stabbed to death in a north-west London street.
Date: Sat, 20 Dec 2014 13:12:30 GMT
Antarctic science archive unlocked
UK experts are using aerial photographs from the 1940s and 1950s to probe the climate history of the Antarctic Peninsula.
Date: Sat, 20 Dec 2014 01:19:32 GMT

cnn

Family: Meatloaf dinner killed parents
The children of an elderly West Virginia couple who passed away months apart in late 2012 and early 2013 are blaming their deaths on a restaurant chain's meatloaf.
Date: Fri, 19 Dec 2014 03:34:32 EST
Why Castro asked FDR for $10
President Barack Obama jolted the long, tumultuous relationship between the United States and Cuba with his decision to normalize relations with a country that has fascinated and vexed Americans since the days of Theodore Roosevelt.
Date: Fri, 19 Dec 2014 02:55:52 EST
Photos: Everyday life in Cuba
Date: Thu, 18 Dec 2014 21:19:05 EST
5-year-old sees mom fall, then ...
A 5-year-old boy called 911 after he saw his mom fall off the couch and have a seizure. KTVZ reports.
Date: Fri, 19 Dec 2014 08:37:47 EST
Colbert signs off
For his final episode of "The Colbert Report," Stephen Colbert signed off with a star-filled sing-a-long.
Date: Fri, 19 Dec 2014 18:21:15 EST
Woman dies, wants dog to die, too
Connie Lay made an unusual request in her will before she died last month in Aurora, Indiana: She asked that her German shepherd, Bela, be euthanized and buried with her.
Date: Fri, 19 Dec 2014 05:53:09 EST
Perry really, really wrong on Hanukkah
On Tuesday night, Texas Gov. Rick Perry -- long a public fan of Judaism -- marked the beginning of the Jewish festival of Hanukkah by comparing it to the Boston Tea Party, which was celebrating its 241st anniversary the same day.
Date: Fri, 19 Dec 2014 16:17:29 EST
Astronaut prints 3-D wrench in space
This week, thanks to 3-D printing, astronaut and ISS Commander Barry "Butch" Wilmore, had a wrench he needed manufactured by a printer in just four hours. The ratcheting socket wrench was the first "uplink tool" printed in space, meaning it was designed on the ground, e-mailed to the space station and then manufactured in space.
Date: Fri, 19 Dec 2014 17:57:13 EST
Yearbook pic school wanted to ban
A high school senior's yearbook picture includes her two great loves: her dog and hunting. CNN affiliate WTEN has more.
Date: Fri, 19 Dec 2014 17:13:45 EST
U.S. military's stealth dirt bike
The U.S. military's building a stealth bike with an electric battery, capable of silent operation for covert missions.
Date: Fri, 19 Dec 2014 17:12:27 EST

flickr

scan6453

timington posted a photo:

scan6453

Date: 2014-12-20T14:19:47Z
Cartoon Wolverine Samurai sword USB 2.0 64GB flash drive memory stick pendrive

floresparmenides posted a photo:

Cartoon Wolverine Samurai sword USB 2.0 64GB flash drive memory stick pendrive

ift.tt/1zMnrPJ

Date: 2014-12-20T14:19:45Z
image

hello peggie posted a photo:

image

Date: 2014-12-20T14:19:46Z
(Untitled)

egveitikkje posted a photo:

Date: 2014-12-20T14:19:46Z
IMG_6364

Dr Sam C posted a photo:

IMG_6364

Date: 2014-12-20T14:19:45Z
1977 TOPPS STARTER Set (450 CARDS) in ALBUM - YAZ YOUNT BRETT CAREW +++ EX-MT

ciprianomogrovejo posted a photo:

1977 TOPPS STARTER Set (450 CARDS) in ALBUM - YAZ YOUNT BRETT CAREW +++ EX-MT

gekoo.co/buy-now/01/?query=361158295143

Date: 2014-12-20T14:19:47Z
All Photos-40

Lee Hopcroft posted a photo:

All Photos-40

Date: 2014-12-20T14:19:47Z
IMG_9572

johnchen6000 posted a photo:

IMG_9572

Date: 2014-12-20T14:19:48Z
Food is the most primitive form of comfort. #quote #quoteoftheday #iphone #iphonesia #kamerahpgw #fromwhatisee #fromwhereisit #foodquote

Emak Ndaru posted a photo:

Food is the most primitive form of comfort.   #quote #quoteoftheday #iphone #iphonesia #kamerahpgw #fromwhatisee #fromwhereisit #foodquote

Date: 2014-12-20T14:19:44Z
IMG_1295

bechtolsheimerhof posted a photo:

IMG_1295

Date: 2014-12-20T14:19:45Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.
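
As a rough sketch of the include approach: the file name `parser_demo.php` and the function `feed_heading` below are stand-ins invented for this example. In practice the included file would hold the `xml_parser()` function and sit next to the page. The demo writes the stand-in file to the temp directory only so that it runs on its own.

```php
<?php
// For a self-contained demo, write a stand-in shared file to the temp directory.
// Assumption: in real use this file already exists alongside the page.
$file = sys_get_temp_dir() . '/parser_demo.php';
file_put_contents(
    $file,
    '<?php function feed_heading($name) { return "<h2>" . htmlspecialchars($name) . "</h2>"; }'
);

// This is the step a page would use: pull in the shared code once, then call it.
include_once $file;

echo feed_heading('bbc'), "\n";
```

Any page that needs the function can repeat the `include_once` line, so the code lives in one place.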

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement: if the variable is not present, remove the CDATA tags; if it is, leave them where they are.
Does that make sense?
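
The if-statement idea could look something like the sketch below. The function name `clean_cdata` and the `$mode` parameter are hypothetical, chosen only for illustration; the 'hide' value follows the `$cdata` argument of `xml_parser()` in the post, where anything other than 'hide' keeps the wrapped content.

```php
<?php
// Hypothetical helper: unwrap CDATA sections, or drop them entirely when $mode is 'hide'.
function clean_cdata($text, $mode = '') {
    // s: dot matches newlines; U: ungreedy, so each CDATA section is matched separately.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    if ($mode != 'hide') {
        // Keep the wrapped content, remove only the <![CDATA[ ... ]]> markers.
        return preg_replace($pattern, '$1', $text);
    }
    // 'hide': remove the markers and everything inside them.
    return preg_replace($pattern, '', $text);
}

// Made-up description text.
$description = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo clean_cdata($description), "\n";          // markers stripped, content kept
echo clean_cdata($description, 'hide'), "\n";  // whole CDATA section removed
```

Passing the mode per call is what allows one feed to show full descriptions while the next shows headlines only.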

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
