Removing attributes from IMG tags

Raichu 2014.01.04 15:53
$text = '<p id="paragraph" class="green">This is a paragraph with an image <img src="/path/to/image.jpg" width="50" height="75"/></p>';

echo preg_replace("/<([a-z][a-z0-9]*)(?:[^>]*(\ssrc=['\"][^'\"]*['\"]))?[^>]*?(\/?)>/i",'<$1$2$3>', $text);

// <p>This is a paragraph with an image <img src="/path/to/image.jpg"/></p>



The RegExp broken down:

/              # Start Pattern
 <             # Match '<' at beginning of tags
 (             # Start Capture Group $1 - Tag Name
  [a-z]         # Match 'a' through 'z'
  [a-z0-9]*     # Match 'a' through 'z' or '0' through '9' zero or more times
 )             # End Capture Group
 (?:           # Start Non-Capture Group
  [^>]*         # Match anything other than '>', Zero or More Times
  (             # Start Capture Group $2 - ' src="...."'
   \s            # Match one whitespace
   src=          # Match 'src='
   ['"]          # Match ' or "
   [^'"]*        # Match anything other than ' or " 
   ['"]          # Match ' or "
  )             # End Capture Group 2
 )?            # End Non-Capture Group, match group zero or one time
 [^>]*?        # Match anything other than '>', zero or more times, non-greedy (won't eat the /)
 (\/?)         # Capture Group $3 - '/' if it is there
 >             # Match '>'
/i            # End Pattern - Case Insensitive
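
The breakdown above can also be kept inside the code itself. The following is a minimal sketch, assuming PCRE's x (extended) modifier, which ignores whitespace and # comments in the pattern. The ~ delimiter and the sample input are my own choices for illustration, not part of the original answer; ~ is used so that a / needs no escaping.

```php
<?php
// Same pattern as above, written in extended (x) mode so it can carry its own comments.
// Assumption: ~ is used as the delimiter, so it must not appear in the pattern or comments.
$pattern = <<<'REGEX'
~
<                           # opening angle bracket
([a-z][a-z0-9]*)            # $1: tag name
(?:
    [^>]*                   # any attributes that come before src
    (\ssrc=['"][^'"]*['"])  # $2: whitespace plus the quoted src attribute
)?
[^>]*?                      # remaining attributes, matched lazily
(/?)                        # $3: optional self-closing slash
>
~ix
REGEX;

// Sample input (hypothetical) with attributes on both sides of src.
$sample = '<a href="#" class="x"><img alt="pic" src="a.png" width="10"></a>';

echo preg_replace($pattern, '<$1$2$3>', $sample), "\n";
// <a><img src="a.png"></a>
```

Note that closing tags such as </a> are left alone, because the tag name has to start with a letter right after the <.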

With the quoting added and <$1$2$3> used as the replacement text, this should strip every attribute other than src= from well-formed HTML tags.

Please note: this isn't necessarily going to work on ALL input, as the people who object to parsing HTML with regular expressions are quick to point out. There are a few pitfalls, most notably that <p style=">"> would end up as <p>">, plus a few other broken cases. I would recommend looking at Zend_Filter_StripTags as a more robust tag/attribute filter in PHP.
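
To make the pitfall concrete, here is a small sketch that first reproduces the <p style=">"> case with the regex and then shows one possible parser-based alternative. The alternative uses PHP's built-in DOMDocument, which is a swapped-in technique and not Zend_Filter_StripTags; the helper name keep_only_src and the sample markup are hypothetical. It also assumes ASCII input, since loadHTML does not treat the string as UTF-8 unless it is told to.

```php
<?php
// The pattern from the post, applied to the problem input mentioned above.
$pattern = "/<([a-z][a-z0-9]*)(?:[^>]*(\ssrc=['\"][^'\"]*['\"]))?[^>]*?(\/?)>/i";
$broken = preg_replace($pattern, '<$1$2$3>', '<p style=">">text</p>');
echo $broken, "\n"; // <p>">text</p>

// Hypothetical helper: parse the fragment and drop every attribute except src.
function keep_only_src(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a div so it has a single root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        // Copy the names first, because removing from the live attribute map while iterating skips entries.
        $names = [];
        foreach ($element->attributes as $attribute) {
            $names[] = $attribute->nodeName;
        }
        foreach ($names as $name) {
            if (strtolower($name) !== 'src') {
                $element->removeAttribute($name);
            }
        }
    }

    // Serialize only the children of the wrapper div.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo keep_only_src('<p style=">">text <img src="/a.jpg" width="5"></p>'), "\n";
```

Because the parser understands quoted attribute values, the > inside style=">" no longer ends the tag early. The trade-off is that the markup is re-serialized, so the output may differ in small ways from the input.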


Source: http://stackoverflow.com/questions/2994448/regex-strip-html-attributes-except-src



// Simpler variant: only rewrites <img> tags, keeping just the src attribute
$html2 = preg_replace('#<img.+?src=[\'"]([^\'"]+)[\'"].*?>#i', 
'<img src="$1">', $html2);
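
As a quick usage sketch of the img-only snippet above, with made-up sample markup: only the <img> tag is rewritten, and the surrounding <p> keeps its attributes.

```php
<?php
// Hypothetical input: an image with extra attributes inside a paragraph.
$html2 = '<p class="intro">Logo: <img class="logo" src="/img/logo.png" width="120" height="40"></p>';

$html2 = preg_replace('#<img.+?src=[\'"]([^\'"]+)[\'"].*?>#i', '<img src="$1">', $html2);

echo $html2, "\n";
// <p class="intro">Logo: <img src="/img/logo.png"></p>

// Caveat: .+? does not require whitespace before src, so an attribute that merely
// ends in "src" (for example data-src) is picked up if it comes first.
$lazy = preg_replace('#<img.+?src=[\'"]([^\'"]+)[\'"].*?>#i', '<img src="$1">',
    '<img data-src="lazy.jpg" src="real.jpg">');

echo $lazy, "\n";
// <img src="lazy.jpg">
```

The first pattern in this post avoids that particular case because it requires \s directly before src=.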

