RSS/XML feed parser

Here's some PHP:

PHP
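function xml_parser($page,$container,$tags,$number,$cdata) {
  // Default to at most 100 items when no limit is given.
  if (!$number) {$number=100;}
  $xml=file_get_contents($page);
  // Grab every <$container>...</$container> block (U = ungreedy, s = dot matches newlines).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml, $items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    // Field values for this item, indexed like $tags.
    // (Not named $this: PHP reserves that name for object context.)
    $fields=array();
    for($i=0; $i<count($tags); $i++) {
      // Find the tag, with or without attributes, then strip a bare <tag>...</tag> wrapper.
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU", $item, $tag);
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode', $fields[$i]);
    }
    // Keep only the first $number items.
    if (count($itemsArray)<$number) {array_push($itemsArray, $fields);}
  }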
  // Build a definition list: linked title in <dt>, description and date in <dd>.
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for($i=0; $i<count($tags); $i++) {
      // Element 0 holds the tag's inner text; use '' when the tag was missing.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    // Normalise <img> tags, then drop <content> wrappers and border="0" attributes.
    $dpatterns[0]="/<img(.+)><\/img>/sU"; $dreplacements[0]='<img$1>';
    $dpatterns[1]="/<img(.+)\/>/sU"; $dreplacements[1]='<img$1>';
    $dpatterns[2]="/<(\/|)content?(.+|)>/sU"; $dreplacements[2]='';
    $dpatterns[3]="/border=\"0\"/sU"; $dreplacements[3]='';
    // Unwrap CDATA sections, or remove them entirely when $cdata is 'hide'.
    $dpatterns[4]="/<\!\[CDATA\[(.+)\]\]>/sU";
    $dreplacements[4]=($cdata!='hide') ? '$1' : '';
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom-style <link href="..."/> is reduced to its URL; a plain URL passes through unchanged.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\n";
  }
  $theData.="</dl>";
  return $theData;
}

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
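
To see what the two regex passes do without fetching a live feed, here is a minimal sketch run against an inline sample. The sample XML and the helper name extract_tags() are made up for illustration; it is a simplified cousin of the function above, not a replacement for it.

```php
<?php
// Hypothetical sample feed; not taken from any real site.
$xml = '<rss><channel>'
     . '<item><title>First story</title><link>http://example.com/1</link></item>'
     . '<item><title>Second &amp; last</title><link>http://example.com/2</link></item>'
     . '</channel></rss>';

// Hypothetical helper: returns one array per container, keyed by tag name.
function extract_tags($xml, $container, $tags) {
    $rows = array();
    // U makes the match ungreedy so it stops at the first closing tag; s lets . span newlines.
    preg_match_all("/<$container>(.+)<\/$container>/sU", $xml, $blocks);
    foreach ($blocks[1] as $block) {
        $row = array();
        foreach ($tags as $tag) {
            // Capture the inner text of <tag>...</tag>; empty string if the tag is absent.
            $found = preg_match("/<$tag>(.*)<\/$tag>/sU", $block, $m);
            $row[$tag] = $found ? html_entity_decode($m[1]) : '';
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = extract_tags($xml, 'item', array('title', 'link'));
echo $rows[1]['title'], "\n"; // Second & last
```

Note that this simplified pattern only matches tags without attributes, which is why the full function needs its extra handling for Atom-style link tags.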

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get... (the latest feeds from the BBC, CNN, and Flickr).

bbc

Scottish Labour leader stands down
Scottish Labour Party leader Johann Lamont resigns with immediate effect after accusing the UK party of treating Scotland like a "branch office".
Date: Sat, 25 Oct 2014 09:42:42 GMT
Fourth Portsmouth man killed in Syria
A fourth man from Portsmouth who went to fight in Syria for Islamic State is believed to have been killed.
Date: Sat, 25 Oct 2014 12:18:17 GMT
Ebola outbreak cases pass 10,000
The number of cases in the Ebola outbreak passes 10,000, with 4,922 deaths, the World Health Organization's latest report says.
Date: Sat, 25 Oct 2014 11:47:10 GMT
Iran hangs woman despite campaign
Iran defies an international campaign and hangs a woman who killed a man she said was trying to sexually abuse her.
Date: Sat, 25 Oct 2014 10:49:13 GMT
Britain must pay EU bill, says MEP
The rest of Europe expects the UK to settle a £1.7bn EU budget demand "and that's that", a vice president of the European Parliament has said.
Date: Sat, 25 Oct 2014 10:04:44 GMT
Attacks 'threaten Egypt's existence'
Egypt faces an existential threat from jihadists, the president says, after at least 31 soldiers are killed in two attacks in the Sinai peninsula.
Date: Sat, 25 Oct 2014 12:03:52 GMT
Man killed in west Belfast shooting
A man dies in hospital after he was shot in a west Belfast alleyway following what is believed to have been a serious brawl.
Date: Sat, 25 Oct 2014 12:10:34 GMT
Cost of driving licence to be cut
The cost of getting a driving licence is being cut following a recent public consultation, the government says.
Date: Fri, 24 Oct 2014 23:01:11 GMT
New HS2 station proposed for Crewe
A report into the route of the second phase of the controversial fast train project HS2 is expected to recommend a new station be built in Crewe.
Date: Sat, 25 Oct 2014 10:55:58 GMT
Plan to clamp down on nuisance calls
Companies which cause "annoyance, inconvenience or anxiety" with marketing calls and text messages could be fined up to £500,000, ministers say.
Date: Sat, 25 Oct 2014 11:07:40 GMT

cnn

Last words she spoke to Nathan Cirillo
Thousands lined the streets on Friday as the body of Cpl. Nathan Cirillo began its final journey home.
Date: Fri, 24 Oct 2014 22:00:38 EDT
New video shows shooting details
Ottawa shooter Michael Zehaf-Bibeau is believed to have communicated with extremists. CNN's Jim Sciutto reports.
Date: Thu, 23 Oct 2014 21:11:35 EDT
Guess who bested Conan on Twitter?
The former secretary of state and the late night talk show host staged a humorous battle of words.
Date: Fri, 24 Oct 2014 23:57:41 EDT
Elizabeth Peña's cause of death ID'd
Actress Elizabeth Peña died from complications related to alcoholism, according to her death certificate obtained by CNN.
Date: Fri, 24 Oct 2014 14:40:26 EDT
Report: Monica Lewinsky mistreated
A government report obtained by the Washington Post on Thursday states that government agents and lawyers mistreated Monica Lewinsky when they approached the former White House intern in January 1998 to get her to cooperate with an investigation into President Bill Clinton.
Date: Fri, 24 Oct 2014 09:42:33 EDT
Foss Lake mystery solved: It's them
A decades-old mystery that captivated Oklahoma has been solved after officials confirmed the identities of two groups of people -- some of them teens -- who went missing in 1969 and 1970.
Date: Thu, 23 Oct 2014 14:54:14 EDT
The best (worst?) sick-day excuses
Had a big night out? Don't fancy leaving a cozy bed for the cold outdoors? Having a Ferris Bueller moment?
Date: Fri, 24 Oct 2014 05:52:49 EDT
Swerving driver crashes on camera
Two men captured video of a swerving driver who eventually hits a car that then crashes into the men. WTAE reports.
Date: Thu, 23 Oct 2014 06:10:56 EDT
Time-lapse shows solar eclipse
Time-lapse video shows the solar eclipse over Denver on October 23, 2014.
Date: Thu, 23 Oct 2014 21:48:38 EDT
11-foot python surprises workers
An 11-foot python was found in a pipe at a construction site in South Florida. WSVN has more on this story.
Date: Fri, 24 Oct 2014 10:15:34 EDT

flickr

Your IP Camera detected motion;here is a snapshot

yoyojp posted a photo:

Your IP Camera detected motion;here is a snapshot

Your IP Camera detected motion;here is a snapshot

Date: 2014-10-25T12:38:27Z
2014_1025_12495600

Steven's Transport Photos posted a photo:

2014_1025_12495600

Date: 2014-10-25T12:38:22Z
IMG_0723

rpealit posted a photo:

IMG_0723

Female Common Yellowthroat.

Date: 2014-10-25T12:38:22Z
DSC_0018

shusterpics posted a photo:

DSC_0018

Date: 2014-10-25T12:38:23Z
Screenshot_2014-10-25-14-33-11-1

benham_beham posted a photo:

Screenshot_2014-10-25-14-33-11-1

Date: 2014-10-25T12:38:27Z
Xmas gift!20x Women girls hair accessories ponytail rope Hairband Rope Elastic gekoo.co (eBay link)

celisodin posted a photo:

Xmas gift!20x Women girls hair accessories ponytail rope Hairband Rope Elastic gekoo.co (eBay link)

via gekoo.co (eBay link) rover.ebay.com/rover/1/711-53200-19255-0/1?ff3=2&tool...

Date: 2014-10-25T12:38:22Z
P1000184_edited-web

zatafish posted a photo:

P1000184_edited-web

Date: 2014-10-25T12:38:23Z
Tiny Bathroom Remodel Ideas http://t.co/XNejp6tBxl

rosemary_1302 posted a photo:

Tiny Bathroom Remodel Ideas http://t.co/XNejp6tBxl

Tiny Bathroom Remodel Ideas t.co/XNejp6tBxl (via Twitter ift.tt/1zrMkSs)

Date: 2014-10-25T12:38:25Z
upload

luis.mercado14 posted a photo:

upload

Date: 2014-10-25T12:38:25Z
file

kanakoahmad posted a photo:

file

Date: 2014-10-25T12:38:26Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.
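
A rough sketch of the include approach, assuming the function has been saved as parser.php in the same directory as the page. The file name, the $parserFile variable and the fallback message are only examples, not something the original page used.

```php
<?php
// Hypothetical page that pulls the parser in from a separate file.
// Assumes parser.php sits next to this file and defines xml_parser().
$parserFile = __DIR__ . '/parser.php';

if (is_file($parserFile)) {
    require_once $parserFile;
}

if (function_exists('xml_parser')) {
    $tags = array('title', 'description', 'link', 'pubDate');
    $bbc = xml_parser('http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml', 'item', $tags, 10, '');
} else {
    // Fallback so the page still renders if the include is missing.
    $bbc = '<p>parser.php not found</p>';
}

echo "<h2>bbc</h2>\n", $bbc, "\n";
```

Any other page can reuse the function the same way by repeating the require_once line.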

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement: if the variable is not present, remove the CDATA tags; if it is, leave them where they are.
Does that make sense?
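
As a minimal sketch of that idea, mirroring the $cdata handling in the function above: the helper name clean_cdata() and the sample string are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: unwrap CDATA sections, or drop them when $cdata is 'hide'.
function clean_cdata($text, $cdata = '') {
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // '$1' keeps the wrapped content; '' removes the whole section.
    $replacement = ($cdata != 'hide') ? '$1' : '';
    return preg_replace($pattern, $replacement, $text);
}

$raw = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo clean_cdata($raw), "\n";         // Intro <p>Full story</p> end
echo clean_cdata($raw, 'hide'), "\n"; // Intro  end
```

Passing 'hide' for one feed and an empty string for the next gives the per-site toggle dave asked about.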

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

thanks, sorted out my cdata parsing problem, seems that is not too clear in the docs

s
