RSS/XML feed parser

Here's some PHP:

PHP
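// Fetch a feed and return its items as an HTML definition list.
// $tags must be given in the order: title, description, link, date.
function xml_parser($page,$container,$tags,$number,$cdata) {
  if (!$number) {$number=100;}            // default: at most 100 items
  $xml=file_get_contents($page);
  if ($xml===false) {return '';}          // the feed could not be fetched
  // Grab every <container>...</container> block (lazy; dot matches newlines).
  preg_match_all("/<$container>.+<\/$container>/sU",$xml,$items);
  $items=$items[0];
  $itemsArray=array();
  foreach ($items as $item) {
    if (count($itemsArray)>=$number) {break;}
    $fields=array();                      // was $this, which PHP reserves for objects
    for ($i=0; $i<count($tags); $i++) {
      preg_match("/<$tags[$i](.+)(<\/$tags[$i]>)/sU",$item,$tag);
      // Strip the surrounding tags when the opening tag has no attributes.
      $fields[$i]=preg_replace("/<$tags[$i]>(.+)(<\/$tags[$i]>)/sU",'$1',$tag);
      $fields[$i]=array_map('html_entity_decode',$fields[$i]);
    }
    array_push($itemsArray,$fields);
  }
  // Clean-up rules for the description; they are the same for every item.
  $dpatterns=array(
    "/<img(.+)><\/img>/sU",               // <img ...></img> becomes <img ...>
    "/<img(.+)\/>/sU",                    // <img ... /> becomes <img ...>
    "/<(\/|)content?(.+|)>/sU",           // drop Atom <content ...> wrappers
    "/border=\"0\"/sU",                   // drop border="0"
    "/<\!\[CDATA\[(.+)\]\]>/sU"           // CDATA sections
  );
  // Unwrap CDATA sections unless the caller passed 'hide', which drops them.
  $dreplacements=array('<img$1>','<img$1>','','',($cdata!='hide') ? '$1' : '');
  $theData="<dl>";
  foreach ($itemsArray as $item) {
    $data=array();
    for ($i=0; $i<count($tags); $i++) {
      // Index 0 holds the whole match; fall back to '' when the tag was missing.
      $data[$i]=isset($item[$i][0]) ? $item[$i][0] : '';
    }
    $title=$data[0];
    $description=preg_replace($dpatterns,$dreplacements,$data[1]);
    // Atom feeds keep the URL in an href attribute of a self-closing <link/>.
    $link=preg_replace("/<link.+href=\"(.+)\"(.+|)\/>/sU",'$1',$data[2]);
    $date=$data[3];
    $theData.="
    <dt><a href=\"$link\">$title</a></dt>
    <dd class=\"story\">$description</dd>
    <dd>Date: $date</dd>\r";
  }
  $theData.="</dl>";
  return $theData;
}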

$container='item';
$tags=array('title','description','link','pubDate');
$bbc=xml_parser("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",$container,$tags,10,'');
$cnn=xml_parser("http://rss.cnn.com/rss/cnn_topstories.rss",$container,$tags,10,'');

$container='entry';
$tags=array('title','content','link','published');
$flickr=xml_parser("http://api.flickr.com/services/feeds/photos_public.gne",$container,$tags,10,'');
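
To see what the container and tag regular expressions capture without fetching a live feed, here is a minimal sketch that runs simplified versions of them on a hard-coded fragment. The $sampleXml string, the example.com URLs, and the variable names are made up for illustration, and the tag pattern is a cut-down form of the one in xml_parser(), not the exact original.

```php
<?php
// Hypothetical sample feed, hard-coded so no network access is needed.
$sampleXml = '<rss><channel>'
    . '<item><title>First</title><link>http://example.com/1</link></item>'
    . '<item><title>Second &amp; last</title><link>http://example.com/2</link></item>'
    . '</channel></rss>';

// Step 1: grab every <item>...</item> block (lazy; dot matches newlines).
preg_match_all('/<item>.+<\/item>/sU', $sampleXml, $matches);
$items = $matches[0];

// Step 2: pull the text of one tag out of each block and decode entities.
$titles = array();
foreach ($items as $item) {
    if (preg_match('/<title>(.+)<\/title>/sU', $item, $m)) {
        $titles[] = html_entity_decode($m[1]);
    }
}

print_r($titles);
```

The U modifier makes the quantifiers lazy, so each match stops at the first closing tag instead of running on to the last one in the feed.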

Here's some HTML with PHP:

HTML/PHP
<h2>bbc</h2>
<?php echo $bbc; ?>
<h2>cnn</h2>
<?php echo $cnn; ?>
<h2>flickr</h2>
<?php echo $flickr; ?>

Here's what we get: the latest feeds from the BBC, CNN, and Flickr.

bbc

MPs vote to take over Commons business in an unprecedented move aimed at finding a majority for a Brexit plan.
Date: Mon, 25 Mar 2019 23:40:54 GMT
Michael Avenatti is charged with trying to extort more than $20m from sports apparel giant Nike.
Date: Tue, 26 Mar 2019 00:54:55 GMT
Coroners may be given new powers to investigate stillbirths so that each death is independently assessed.
Date: Tue, 26 Mar 2019 00:44:46 GMT
Raheem Sterling and Callum Hudson-Odoi condemn the "unacceptable" racist abuse directed at England players during their 5-1 win in Montenegro.
Date: Mon, 25 Mar 2019 23:48:01 GMT
Riders Marion Calvo and Jorge Martinez have been banned from racing for two years after fighting on the track at Costa Rica's National Motorbike Championship.
Date: Mon, 25 Mar 2019 14:09:02 GMT
The tech giant confirmed it was focusing on online services, rather than devices, at a live event.
Date: Mon, 25 Mar 2019 21:21:22 GMT
The practice involves ironing a girl's chest with hot objects to delay breasts from growing
Date: Tue, 26 Mar 2019 00:47:37 GMT
EU nationals who have paid UK taxes for years could be denied benefits after Brexit, says a report.
Date: Tue, 26 Mar 2019 01:45:12 GMT
Israel blamed the militant group Hamas after seven Israelis were injured in a rocket attack.
Date: Mon, 25 Mar 2019 23:33:14 GMT
A class of five-year-olds beat 25,000 entries to win a Premier League poetry competition.
Date: Mon, 25 Mar 2019 16:33:58 GMT

cnn

For more than two years, Donald Trump has wanted the investigation into allegations of collusion with Russia to go away. Now that it has, the President isn't prepared to let go.
Date: Mon, 25 Mar 2019 22:04:04 GMT
Rudy Giuliani, President Donald Trump's lawyer, said the line special counsel Robert Mueller wrote in his report about not exonerating Trump on obstruction of justice is a "cheap shot."
Date: Tue, 26 Mar 2019 02:15:02 GMT
CNN's Erin Burnett discusses President Trump's reaction to Attorney General Bill Barr's memo on the Mueller Report while sources tell CNN that the White House has not seen the full report.
Date: Mon, 25 Mar 2019 23:50:13 GMT
Robert Mueller's investigation into Russia's interference in the 2016 election is over.
Date: Mon, 25 Mar 2019 20:33:41 GMT
CNN's Wolf Blitzer discusses California Democrat Eric Swalwell's comments on President Trump and Russian collusion in the last year, in response to Attorney General Bill Barr's memo saying that the Mueller investigation "did not establish that members of the Trump campaign conspired or coordinated with the Russian government."
Date: Mon, 25 Mar 2019 22:11:44 GMT
Donald Trump seems to think he is in the clear where the special counsel investigation is concerned. So why isn't he happier? Maybe because he already misses Robert Mueller.
Date: Tue, 26 Mar 2019 00:21:01 GMT
Former Director of National Intelligence James Clapper reacts to White House press secretary Sarah Sanders saying he and other former intelligence officials should be investigated after special counsel Robert Mueller did not find any evidence of collusion between the Trump campaign and Russia.
Date: Tue, 26 Mar 2019 01:25:29 GMT
The Trump administration on Monday said the entire Affordable Care Act should be struck down,in a dramatic reversal.
Date: Tue, 26 Mar 2019 02:20:30 GMT
The Pentagon notified Congress Monday night that it has authorized the transfer of $1 billion to begin new wall construction along the US-Mexico border, drawing immediate objections from Democratic lawmakers.
Date: Tue, 26 Mar 2019 02:21:20 GMT
For much of the past 22 months, Democrats waited with bated breath to see what special counsel Robert Mueller had found out about Russian interference in the 2016 election and the possibility that either Donald Trump or someone(s) close to him colluded with the Russians to help him win the election.
Date: Tue, 26 Mar 2019 00:32:37 GMT

flickr

Schweiz, Kanton Graubünden, Malix

gerag [Georg Ragaz] posted a photo:

Schweiz, Kanton Graubünden, Malix

Schweiz, Kanton Graubünden, Malix

Date: 2019-03-26T02:30:11Z
Bera vs Obras

aylengeraldine9 posted a photo:

Bera vs Obras

Date: 2019-03-26T02:30:11Z
_5507231-Edit.jpg

Adventurefish posted a photo:

_5507231-Edit.jpg

Date: 2019-03-26T02:30:14Z
Jeremy Lamb

trendingtopics posted a photo:

Jeremy Lamb

Date: 2019-03-26T02:30:18Z
20190313_124426

Antonius Giovanni posted a photo:

20190313_124426

Date: 2019-03-26T02:30:24Z
Purple Haze

luzYano posted a photo:

Purple Haze

Date: 2019-03-26T02:29:57Z
Birthday 2019

Bhavnish Sharma posted a photo:

Birthday 2019

Date: 2019-03-26T02:30:01Z
L-R: Hermann Esser, Hitler and Wilhelm Brückner, probably the Obersalzberg, date unknown

Troy-Tempest posted a photo:

L-R: Hermann Esser, Hitler and Wilhelm Brückner, probably the Obersalzberg, date unknown

Date: 2019-03-26T02:30:12Z
100_9128

mestes76 posted a photo:

100_9128

Date: 2019-03-26T02:30:04Z
2019-03-25_07-30-14

milofficer posted a photo:

2019-03-25_07-30-14

Date: 2019-03-26T02:30:25Z

Comments

#1
2007-03-02 dumb_dave says :

Sorry, I'm new to this stuff, willing to learn and all that, but I don't get the idea. Copy that snippet of PHP code into a file and call it, say, parser.php. Copy the other snippet of HTML into a file and call it, for lack of inventiveness, parser.html. Right so far? If so, where's the intermediate step? How does this HTML "call" or "include" the PHP in order to function? Or am I missing something so basic that even asking this will earn me the cherished "Idiot of the Day Award"? Thanks.

#2
2007-03-02 BonRouge says :

dave,
You can include the php or just have it in one page. The page would have a '.php' extension - not '.html.'
Here's a simple example of this page (with no style or anything) in one file.
Save it and change the extension to '.php'. If you don't have a server installed on your machine, you'll have to upload it to a remote server to view it.
If you want, you can take the php code out of that page and save it in a different file and include it into the page - that way, you could use it on more than one page if you wanted.

I hope that makes it a bit clearer.

#3
2007-03-02 dumb_dave says :

Thanks for the explanations. Much clearer now and ... yes, it indeed works like a champ. (Maybe I was just too tired? Putting 1 and 1 together and coming up with 11 instead of two?) Best regards and thanks for all the tips elsewhere as well.

#4
2007-03-07 dumb_dave says :

Useful indeed, BonRouge, but how does one display the <description> tagged material that is buried behind things like <![CDATA[ <p> etc.? Is the PHP code easily modified to handle that? And if so, can one apply it selectively? That is, show the fuller "description" material for one site but then reduce the next site entry to "headlines" only (i.e., "titles" and "links") and then toggle the next one back to fuller details? Hope this is not a major headache, but it's beyond my ability to work it out at this stage ... and everything tried brought the larger process to a grinding halt. (This isn't a do-my-homework-for-me question. I'm bewildered by the code.) Thanks.

#5
2007-03-07 BonRouge says :

dave,
I thought I'd already sorted out the problem of data wrapped in the CDATA stuff. Does the code have a problem? If you could show me where it's not working, I'll try to improve it.
As for choosing whether to show that particular data or not, yes - I think you could do that by adding another variable. You see near the top where there's a preg_replace() to remove the CDATA tags? You could put that in an if statement - if the variable is not present, remove the CDATA tags, if it is, leave them where they are.
Does that make sense?
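
A minimal sketch of that idea, pulled out into a standalone helper so it can be tried on its own. The function name handle_cdata() and the sample string are hypothetical; xml_parser() above does the equivalent inline through its fifth argument, $cdata.

```php
<?php
// Hypothetical helper (not part of the original function): isolates the
// CDATA handling so it can be toggled per feed.
function handle_cdata($text, $mode = 'show') {
    // Same pattern as in xml_parser(): lazy match, dot matches newlines.
    $pattern = '/<!\[CDATA\[(.+)\]\]>/sU';
    // 'hide' drops the wrapped content entirely; anything else unwraps it.
    $replacement = ($mode === 'hide') ? '' : '$1';
    return preg_replace($pattern, $replacement, $text);
}

$sample = 'Intro <![CDATA[<p>Full story</p>]]> end';
echo handle_cdata($sample), "\n";          // Intro <p>Full story</p> end
echo handle_cdata($sample, 'hide'), "\n";  // the CDATA section is removed
```

With the page's own function, the same switch is made per feed by passing 'hide' or an empty string as the last argument.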

#6
2007-03-10 BonRouge says :

dave,
I think I found the problem and sorted it out. As you can see, it seems to work OK now. Some of the characters in the Lockergnome feed don't show right on this page though. I wonder if it's anything to do with me being in Japan. Do you see strange characters?

#7
2007-05-01 Ice says :

I have been trawling the web for days looking for something like this. Thanks a WHOLE lot man. I was also wondering if you can modify this parser to merge these fields and display, say, only the latest 10 items?

#8
2007-11-02 steve says :

Thanks, this sorted out my CDATA parsing problem. It seems that is not too clear in the docs.

s

