This is a discussion on "Grabbing the first paragraph" within the PHP Forum section. This forum and the thread "Grabbing the first paragraph" are both part of the Program Your Website category.
Grabbing the first paragraph
#1
Grabbing the first paragraph
Ok ... a while back a friend of mine gave me this script that would grab the text in <p>...</p> from an article and display it. For some reason ... I just noticed that it displays the second paragraph instead of the first one like I want it to.
I have no idea how to fix this ... I suck at regular expressions and things like that in PHP. Anyway ... here's the code:
Normally I would call the functions like this
Any help would be great!
#2
Re: Grabbing the first paragraph
Just a little bump
I really need help with this.
#3
Re: Grabbing the first paragraph
I tried
#4
Re: Grabbing the first paragraph
Thank you so much Graham for looking into this.
Here's what I have in my db ...
So ... there's no extra stuff in there that I can see? The first paragraph that should be spitting out is
#5
Re: Grabbing the first paragraph
But there's *something* that's triggering a failure to match on the first line. Have you tried to print out $latestblogs and see what you have in there? Putting it through htmlspecialchars() will let you see all the tags.
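For anyone following along, that check could look something like the sketch below. $latestblogs is the variable name from this thread, but the sample content assigned to it here is made up purely for illustration.

```php
<?php
// Made-up sample content; in the real script this would come from the database.
$latestblogs = '<p>Read <a href="/blog/1">the full post</a> here.</p><p>Second paragraph.</p>';

// Escape the markup so the browser shows the tags instead of rendering them.
$visible = htmlspecialchars($latestblogs);

// <pre> keeps any line breaks in the stored text visible too.
echo '<pre>' . $visible . '</pre>';
```

Any tag hiding inside the first paragraph then shows up as literal text on the page.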
#6
Re: Grabbing the first paragraph
GOT IT ....
It's the extra <a href= ....> tag within the first paragraph. You're looking for <p>, then characters which are NOT <, then </p> .... but in the case of the first paragraph you have an extra <a href= .... in there, so it fails to match and the second paragraph ends up being the first match.
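Here is a small sketch that reproduces the symptom. The original script isn't shown in the thread, so both the pattern (written from the description above) and the sample HTML are assumptions.

```php
<?php
// Hypothetical sample content; the real article text is not shown in the thread.
$html = '<p>First, with a <a href="/more">link</a> inside.</p>' . "\n"
      . '<p>Second, plain text only.</p>';

// Assumed pattern: <p>, then any characters that are NOT "<", then </p>.
preg_match('/<p>([^<]*)<\/p>/', $html, $matches);

// The first paragraph is skipped because of the "<" that starts the <a> tag.
echo $matches[1]; // Second, plain text only.
```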
#7
Re: Grabbing the first paragraph
Ok ... so ... how can I make this work regardless of what's between the <p> ... </p>?
#8
Re: Grabbing the first paragraph
If you want to match *regardless* of intermediate tags, try
preg_match("/(<p>.*?<\/p>)/s", $subject, $matches);
That's any number of any character, but as few as possible (a lazy, or non-greedy, match). And note the extra "s" after the closing slash: it forces the "." to match newline characters too, in case the <p> and </p> are on different lines.
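A minimal sketch of that pattern in use. The content of $subject is made up, with the first paragraph deliberately containing a link and a line break.

```php
<?php
// Made-up sample: the first paragraph has a nested tag and spans two lines.
$subject = "<p>First, with a <a href=\"/more\">link</a>\nspanning two lines.</p>\n"
         . "<p>Second paragraph.</p>";

$firstParagraph = '';
// .*? is lazy, so the match stops at the first </p>; the s flag lets "." cross newlines.
if (preg_match('/(<p>.*?<\/p>)/s', $subject, $matches)) {
    $firstParagraph = $matches[1];
}

echo $firstParagraph;
```

One caveat: as written, the pattern only matches a bare <p>, so a paragraph opened with attributes (for example <p class="intro">) would be skipped.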
#9
Re: Grabbing the first paragraph
AH!!!! thank you thank you thank you thank you thank you!!!!!!!!!!!!!!!!!
*smooches*
#10
Re: Grabbing the first paragraph
On a slight tangent: if you want a generic solution that avoids issues with low-level changes to code and so on, you could do worse than learn to use Perl and LWP. Tokenise the HTML with something like HTML::TokeParser, or even split it into a tree, and you can develop some very robust solutions with minimal code to do all sorts of data extraction tasks.
Cheers, Dan
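For readers staying in PHP, the same "split it into a tree" idea can be sketched with the built-in DOM extension. This is not the Perl/LWP approach described above, just a rough PHP analogue; the sample HTML is made up.

```php
<?php
// Made-up sample; note the attribute on the first <p>, which a bare-<p> regex would miss.
$html = '<div><p class="intro">First, with a <a href="/more">link</a> inside.</p>'
      . '<p>Second.</p></div>';

// Collect libxml warnings internally instead of printing them for sloppy markup.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Take the first <p> element in document order, if there is one.
$first = $doc->getElementsByTagName('p')->item(0);
$firstText = $first !== null ? $first->textContent : '';

echo $firstText; // First, with a link inside.
```

textContent returns the paragraph's text with the inner tags dropped, which may or may not be what you want for a summary.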
#11
Re: Grabbing the first paragraph
What?!?! Was that whole paragraph even written in English?!
The code works ... and works just the way I want it.
#12
Re: Grabbing the first paragraph
Lol
It was a bit of a tangent. If your script works fine then obviously there's no reason at all to change it. I just added this for anyone else who is looking to do more intensive parsing of HTML at any stage and comes across this thread.
Dan
#13
Re: Grabbing the first paragraph
That sounds very interesting. Any links to resources on the subject?
Tags: php, regular expressions, summary