RSS/XML feed parser

Here's some PHP:

PHP
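<?php
// Fetches a feed and returns its items as an HTML definition list.
// $tags must be given in the order: title, description, link, date.
function xml_parser($page, $container, $tags, $number, $cdata) {
  // Default to at most 100 items when no limit is given.
  if (!$number) { $number = 100; }
  $xml = file_get_contents($page);
  if ($xml === false) { return "<dl></dl>"; }

  // Collect every <container>...</container> block, up to $number of them.
  $c = preg_quote($container, '/');
  preg_match_all("/<$c>.+<\/$c>/sU", $xml, $items);
  $items = array_slice($items[0], 0, $number);

  $itemsArray = array();
  foreach ($items as $item) {
    // $fields replaces the original $this, which PHP reserves for objects.
    $fields = array();
    foreach ($tags as $i => $tagName) {
      $t = preg_quote($tagName, '/');
      // Match the whole element, with or without attributes...
      preg_match("/<$t(.+)(<\/$t>)/sU", $item, $tag);
      // ...then, for an attribute-free tag, keep only its inner content.
      $fields[$i] = preg_replace("/<$t>(.+)(<\/$t>)/sU", '$1', $tag);
      $fields[$i] = array_map('html_entity_decode', $fields[$i]);
    }
    $itemsArray[] = $fields;
  }

  // Clean-up rules for the description: normalise <img> tags, strip
  // <content> wrappers and border attributes, then handle CDATA sections.
  $dpatterns = array(
    "/<img(.+)><\/img>/sU",
    "/<img(.+)\/>/sU",
    "/<(\/|)content?(.+|)>/sU",
    "/border=\"0\"/sU",
    "/<\!\[CDATA\[(.+)\]\]>/sU",
  );
  // 'hide' drops CDATA content; any other value unwraps and keeps it.
  $dreplacements = array('<img$1>', '<img$1>', '', '', ($cdata != 'hide') ? '$1' : '');

  $theData = "<dl>";
  foreach ($itemsArray as $item) {
    $data = array();
    foreach ($tags as $i => $tagName) {
      // Use an empty string when a tag was missing from this item.
      $data[$i] = isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title = $data[0];
    $description = preg_replace($dpatterns, $dreplacements, $data[1]);
    // Atom-style <link href="..."/> elements: keep only the URL.
    $link = preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU", '$1', $data[2]);
    $date = $data[3];
    $theData .= "
   <dt><a href=\"$link\">$title</a></dt>
   <dd class=\"story\">$description</dd>
   <dd>Date: $date</dd>\n";
  }
  $theData .= "</dl>";
  return $theData;
}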

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
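
To see what the two regular expressions inside the tag loop actually capture, here's a minimal sketch that runs the same patterns on a hard-coded string instead of a live feed. The `$sample` data and the `extract_tag()` helper are made up for illustration; they are not part of the function above.

```php
<?php
// Hypothetical helper: pulls the first <$tag>...</$tag> out of $item using
// the same two-step regex approach as xml_parser().
function extract_tag($item, $tag) {
  $t = preg_quote($tag, '/');
  // Step 1: grab the whole element, attributes or not.
  if (!preg_match("/<$t(.+)(<\/$t>)/sU", $item, $match)) {
    return '';
  }
  // Step 2: if the tag has no attributes, keep only the inner content.
  $inner = preg_replace("/<$t>(.+)(<\/$t>)/sU", '$1', $match[0]);
  return html_entity_decode($inner);
}

// Made-up sample data standing in for a fetched feed.
$sample = '<rss><channel>'
  . '<item><title>First &amp; foremost</title><link>http://example.com/1</link></item>'
  . '<item><title>Second</title><link>http://example.com/2</link></item>'
  . '</channel></rss>';

// Same container match as in xml_parser(): one array entry per <item>.
preg_match_all('/<item>.+<\/item>/sU', $sample, $items);

foreach ($items[0] as $item) {
  echo extract_tag($item, 'title'), ' => ', extract_tag($item, 'link'), "\n";
}
```

Each `<item>` becomes one entry in `$items[0]`, and each requested tag is reduced to its decoded inner text. A tag that is missing from an item simply comes back as an empty string.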

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get (the latest feeds from the BBC, CNN, and Flickr):

bbc

Three found dead in house named
Police name three people found dead from stab wounds in Oxfordshire as the hunt continues for a 21-year-old suspect.
Date: Sun, 24 May 2015 16:51:30 GMT
US: Iraqi forces lack will to fight
US Defence Secretary Ashton Carter says the rout of Iraqi forces at Ramadi showed they lacked the will to fight against Islamic State.
Date: Sun, 24 May 2015 15:18:43 GMT
'Beautiful Mind' mathematician killed
Renowned mathematician John Nash, subject of the film A Beautiful Mind, dies in a New Jersey taxi crash with his wife.
Date: Sun, 24 May 2015 16:47:25 GMT
Tax credits change 'key to EU talks'
Limiting the rights of EU migrants to claim tax credits is a key part of renegotiating the UK's relationship with the EU, Sajid Javid says.
Date: Sun, 24 May 2015 12:43:27 GMT
Labour to back EU referendum bill
Acting leader Harriet Harman says Labour has dropped its opposition to an "in/out" vote on the UK's EU membership, and will now back the government's referendum bill.
Date: Sun, 24 May 2015 11:45:48 GMT
Malaysia finds 'migrant' mass graves
Authorities in Malaysia find several mass graves near the border with Thailand suspected of containing the bodies of trafficked migrants.
Date: Sun, 24 May 2015 09:20:08 GMT
Irish Church needs 'reality check'
One of Ireland's most senior Catholic clerics calls for the Church to take a "reality check" following the country's overwhelming vote in favour of same-sex marriage.
Date: Sun, 24 May 2015 14:57:28 GMT
Greece 'cannot afford IMF repayment'
Greece will not be able to make a debt repayment to the IMF due in early June as it does not have the money, the interior minister says.
Date: Sun, 24 May 2015 10:26:57 GMT
Nepal flood alert after landslide
Thousands flee their homes in western Nepal following a landslide which blocked a river, causing fears of flash flooding.
Date: Sun, 24 May 2015 06:27:02 GMT
No UK troops to fight IS, say Tories
Business Secretary Sajid Javid rejects a call from a former head of the army to send UK ground troops to fight Islamic State.
Date: Sun, 24 May 2015 11:05:36 GMT

cnn

Pentagon chief concerned that Iraqis don't fight ISIS
Defense Secretary Carter says Iraqis aren't outnumbered, but don't seem to have the will to fight. FULL STORY
Date: Sun, 24 May 2015 12:22:41 EDT
Members of Congress: ISIS making gains
Iraq's lack of national unity in the fight against ISIS threatens to splinter the country into several parts, a defense expert and two members of Congress said Sunday.
Date: Sun, 24 May 2015 11:38:58 EDT
Zelizer: Iraq isn't going away anytime soon
Iraq isn't going away anytime soon.
Date: Sun, 24 May 2015 11:28:37 EDT
'Beautiful Mind' mathematician John Nash, wife killed
John Forbes Nash Jr., the famed mathematician and inspiration for the film "A Beautiful Mind," and his wife died in a car crash Saturday in New Jersey, according to state police.
Date: Sun, 24 May 2015 11:32:30 EDT
Rain breaks records; more on the way
A firefighter died early Sunday while performing a high water rescue operation in Claremore, Oklahoma, an emergency management official said. FULL STORY
Date: Sun, 24 May 2015 08:47:24 EDT
Protests after cop acquitted
Police arrested demonstrators in downtown Cleveland following the acquittal of a police officer charged in the 2012 shooting death of two unarmed people. FULL STORY
Date: Sun, 24 May 2015 08:13:56 EDT
A band of sisters in Special Ops
Even before women were allowed in combat roles, a group of female soldiers gained crucial information for special operations units.
Date: Sat, 23 May 2015 21:30:37 EDT
Russia bans 'undesirable' NGOs
Non-governmental organizations working in Russia awoke Sunday to a new reality -- that they operate now under a law that allows the government to prosecute them on the grounds they are 'undesirable.'
Date: Sun, 24 May 2015 10:17:38 EDT
Gloria Steinem crosses Korea's DMZ
An international group of activists, including Gloria Steinem, crossed the heavily fortified border between North and South Korea to bring attention to the need for peace between the two nations. FULL STORY
Date: Sun, 24 May 2015 11:41:09 EDT
Julian Castro: Clinton emails a 'witch hunt'
Julian Castro labeled the House Republican-led investigation into Hillary Clinton's emails a "witch hunt" and a "sideshow" from dealing with America's most pressing problems.
Date: Sun, 24 May 2015 12:05:58 EDT

flickr

d53a25b7-4a70-46f8-90b2-63311a99c49d

Stu Smith1 posted a photo:

d53a25b7-4a70-46f8-90b2-63311a99c49d

Date: 2015-05-24T17:00:51Z
PVR_7952

Chebotarev Kirill posted a photo:

PVR_7952

Date: 2015-05-24T17:00:55Z
AA1_7795_1

ardilles posted a photo:

AA1_7795_1

Date: 2015-05-24T17:00:49Z
9-10 A Reserve Boys UWA Vs Mods_ (82)

Chris J. Bartle posted a photo:

9-10 A Reserve Boys UWA Vs Mods_ (82)

Date: 2015-05-24T17:00:53Z
Beautiful toes

blizzardbeaches posted a photo:

Beautiful toes

Date: 2015-05-24T17:00:55Z
Lamb Chops

DenRosen posted a photo:

Lamb Chops

Date: 2015-05-24T17:00:55Z
000C5DDCA28D(BLFD Fire Cam 3) motion alarm at 20150524090012

Buckslakefire posted a photo:

000C5DDCA28D(BLFD Fire Cam 3) motion alarm at 20150524090012

Date: 2015-05-24T17:00:55Z
Lookout

katiedodat posted a photo:

Lookout

Date: 2015-05-24T17:00:52Z
IMG_3987

okambuvacoop posted a photo:

IMG_3987

Date: 2015-05-24T17:00:54Z
IMG_4364

GianlucaTodini posted a photo:

IMG_4364

Date: 2015-05-24T17:00:55Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.
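
A minimal sketch of the include approach described above. The file name `parser.php` is an assumption, and the `xml_parser()` written into it here is only a placeholder so that the sketch runs without network access; in practice you would save the full function from the post into that file by hand.

```php
<?php
// For the sake of a runnable sketch, write a stand-in parser.php to the
// temp directory. Normally this file would hold the real xml_parser().
$parserFile = sys_get_temp_dir() . '/parser.php';
file_put_contents($parserFile, '<?php
function xml_parser($page, $container, $tags, $number, $cdata) {
  return "<dl><dt>stub for $page</dt></dl>"; // placeholder body
}
');

// Any page with a .php extension can then pull the function in with one line:
require_once $parserFile;

// From here on the page works as if the function were defined in it.
$bbc = xml_parser('http://example.com/feed.xml', 'item', array('title'), 10, '');
echo "<h2>bbc</h2>\n", $bbc, "\n";
```

With the function in its own file, every page that needs a feed only has to repeat the `require_once` line and its own `xml_parser()` calls.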

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
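
The if statement suggested above could look something like this minimal sketch. The function name `handle_cdata()` and its `$mode` parameter are made up for illustration; the 'hide' value follows the convention of the `$cdata` argument in the parser, and the regular expression is the one the parser uses.

```php
<?php
// Hypothetical helper mirroring the $cdata switch in xml_parser():
// 'hide' drops CDATA sections entirely, anything else keeps their content.
function handle_cdata($text, $mode = '') {
  $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
  $replacement = ($mode == 'hide') ? '' : '$1';
  return preg_replace($pattern, $replacement, $text);
}

// Made-up description, as it might arrive from a feed.
$description = 'Intro <![CDATA[<p>Full story text</p>]]>';

echo handle_cdata($description), "\n";          // keeps the wrapped markup
echo handle_cdata($description, 'hide'), "\n";  // removes the CDATA section
```

Because the mode is passed per call, one feed can be shown with full details and the next reduced to headlines just by changing the last argument.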

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these feeds and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

Thanks, this sorted out my CDATA parsing problem. It seems that's not too clear in the docs.

s
