
php - Regular expression to match an HTML tag without a certain attribute (to fix a WordPress plugin)

Question:

I need help from some regex experts to fix a bug in a WordPress plugin that is no longer maintained by its author.

Inside the plugin, the following PHP pattern is used to find included scripts:

'/(\\s*)(<script\\b[^>]*?>)([\\s\\S]*?)<\\/script>(\\s*)/i'

This pattern matches script tags regardless of the media they are written for. To fix a bug, it must be changed so that script tags with the attribute media="print" are not extracted.

How must this pattern be changed so that script tags with the attribute media="print" are not affected?

See here for the topic in the WordPress-Support-Forum.

Answer:

The preg functions are not meant to parse HTML tags. You can never know where and how the attributes will be written:

<script media="print">
<script media=print>
<script type="text/javascript" media="print">
<script media="print" type="text/javascript">

Basically, you cannot handle that reliably with the preg functions. I'd suggest loading the HTML you want to clean into a DOM (or even SimpleXML) object and selecting all script tags whose media attribute is "print" with an XPath query:

//script[@media="print"]
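The DOM approach described above could look roughly like the following. This is only a sketch: the sample markup and the variable names ($html, $kept) are made up for illustration, and the query is inverted with not(...) so that it selects the script tags that are not marked media="print", which is what the question asks to keep extracting.

```php
<?php
// Hypothetical sample markup; in the plugin this would be the page HTML.
$html = <<<'HTML'
<html><head>
<script type="text/javascript" src="a.js"></script>
<script media="print" src="print.js"></script>
<script media=print type="text/javascript">var p = 1;</script>
<script>var x = 2;</script>
</head><body></body></html>
HTML;

// Collect libxml warnings internally instead of emitting them,
// since real-world HTML is rarely strictly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select every script element whose media attribute is not "print"
// (this includes script elements without any media attribute).
$nodes = $xpath->query('//script[not(@media="print")]');

$kept = [];
foreach ($nodes as $node) {
    // Serialize each selected node back to an HTML string.
    $kept[] = $doc->saveHTML($node);
}

print_r($kept);
```

Because the parser normalizes the attributes, media="print" and media=print are treated the same way, and the attribute order no longer matters.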
Answer:

A pretty simple approach would be:

'#<script\b(?:\s+(?!media="print")[^\s>]+)*\s*>(.*?)</script>#is'

It uses a (?!...) negative lookahead to inspect each whitespace-separated token inside the tag. This does not exactly parse HTML attributes, but it is sufficient to detect this single case. You might need to add alternatives, though (media=print or media='print'), because preg_match compares raw strings and does not interpret HTML-equivalent spellings. (Using DOM, however, would certainly be overkill for this task.)
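As a rough illustration of the alternatives mentioned above, the lookahead can be widened to cover the three quoting styles. This is a sketch rather than a drop-in fix for the plugin: the sample input and the names $pattern, $html and $bodies are assumptions, the "s" modifier is added here so that "." also matches line breaks inside a script body, and the pattern is case-insensitive throughout because of the "i" modifier.

```php
<?php
// The lookahead rejects media="print", media='print' and media=print,
// each followed by whitespace or the closing ">" of the tag.
$pattern = '#<script\b(?:\s+(?!media=(?:"print"|\'print\'|print)[\s>])[^\s>]+)*\s*>(.*?)</script>#is';

// Hypothetical sample input covering the different spellings.
$html = <<<'HTML'
<script type="text/javascript">var a = 1;</script>
<script media="print">var b = 2;</script>
<script media=print type="text/javascript">var c = 3;</script>
<script type="text/javascript" media='print'>var d = 4;</script>
<script>
var e = 5;
</script>
HTML;

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the script bodies captured by (.*?).
$bodies = array_map('trim', $matches[1]);

print_r($bodies);
```

Like the original pattern, this still splits the tag on whitespace, so an attribute value that itself contains a space or a ">" character is not handled.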

Answer:

To remove tags, use strip_tags() as needed.
