Re: [Question] Grabbing page source content with PHP

Author: godspeedlee (妳,我可以)   2011-08-06 17:36:32
※ Quoting SABER101 (None):
: http://www.epinions.com/review/Canon_PowerShot_SX210_IS_Digital_Camera/content_536777166468
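: I want to grab the Review section in the middle of the page at the URL above.
: The review sits where "review text" appears below:
: <span class=rkr>
: (review text)
: <br>
: I wrote the expression as <span class=rkr>\s(.*)\s<br>
: and it matches nothing at all.
: With \s(.*)\s<br> I get about 90% of the article,
: but the following passage is not captured: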
: Straight photography (Street, documentary, and environmental portraiture) is
: primarily concerned with capturing images of people in uncontrived, naturally
: lit and candid settings that evocatively depict or dramatically reveal some
: aspect of the human condition. In addition to being a first rate general
: purpose digicam, the nifty little SX210 is almost perfect for “straight”
: photography - it is compact, responsive, unobtrusive, features a 14x zoom
: (for a little extra standoff room) and dependably generates first rate images.
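: I did account for the line breaks, so why is this passage, which sits on a single line, not captured?
: Also, the only difference between the first and second expressions is <span class=rkr>,
: yet with it everything disappears?????
: This is really frustrating.
: I hope someone can explain.
: Thanks.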
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
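Adding the s modifier (PCRE_DOTALL) fixes it; by default . does not match newline characters.
It is also safer to change the pattern to the following, so that it is less likely to pick up extra data:
<span class=rkr>(.*?)<br>
Screenshot: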
http://imageshack.us/f/706/pttregexp.png/
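
For reference, a minimal sketch of the difference the s modifier makes. The $html string is a made-up stand-in for the fetched page source (the real epinions markup may differ), and the variable names are only for illustration.

```php
<?php
// Hypothetical sample input; an assumption, not the actual page source.
$html = "<span class=rkr>\n"
      . "First paragraph of the review.\n"
      . "Second line of the same paragraph.\n"
      . "<br>\n"
      . "footer text<br>";

// Without the s modifier, "." stops at newlines, so a body that spans
// several lines cannot be matched between the two tags.
$withoutDotall = preg_match('/<span class=rkr>(.*?)<br>/', $html, $m1);

// With s (PCRE_DOTALL), "." also matches newlines. The lazy (.*?) stops
// at the first <br>, so the trailing "footer text" is not captured.
$withDotall = preg_match('/<span class=rkr>(.*?)<br>/s', $html, $m2);

// Strip the surrounding newlines from the captured group.
$review = $withDotall ? trim($m2[1]) : null;

var_dump($withoutDotall); // 0: no match
echo $review, "\n";
```

If the review contains <br> tags of its own, this pattern stops at the first one, so the closing delimiter would have to be adjusted to whatever actually ends the review on the real page.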
