-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX ODT substitution when many HTML tags in string #32515
base: 18.0
Are you sure you want to change the base?
FIX ODT substitution when many HTML tags in string #32515
Conversation
… everything is taken between first opening tag and last closing tag. Write test to reproduce.
72693f2
to
78bf506
Compare
htdocs/includes/odtphp/odf.php
Outdated
const FIND_TAGS_REGEX = '/<([A-Za-z0-9]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(.*)<\/\1>))/s'; | ||
const FIND_ENCODED_TAGS_REGEX = '/<([A-Za-z]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(.*)<\/\1>))/'; | ||
const FIND_TAGS_REGEX = '/<([A-Za-z0-9]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(((?!<\1>).)*)<\/\1>))/s'; | ||
const FIND_ENCODED_TAGS_REGEX = '/<([A-Za-z]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(((?!<\1>).)*)<\/\1>))/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note, FIND_ENCODED_TAGS_REGEX seems to be never used into all dolibarr source code, maybe a code cleanup @eldy to simplify the code ?
@@ -46,8 +46,8 @@ class Odf | |||
public $userdefined=array(); | |||
|
|||
const PIXEL_TO_CM = 0.026458333; | |||
const FIND_TAGS_REGEX = '/<([A-Za-z0-9]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(.*)<\/\1>))/s'; | |||
const FIND_ENCODED_TAGS_REGEX = '/<([A-Za-z]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(.*)<\/\1>))/'; | |||
const FIND_TAGS_REGEX = '/<([A-Za-z0-9]+)(?:\s([A-Za-z]+(?:\-[A-Za-z]+)?(?:=(?:".*?")|(?:[0-9]+))))*(?:(?:\s\/>)|(?:>(((?!<\1(\s.*)?>).)*)<\/\1>))/s'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(((?!<\1(\s.*)?>).)*)
This groups matches any character (last .
) as much as we want (last*
) EXCEPT (?!
) the first capturing group between opening and closing chevron (<\1>
). It allows any other chars between the chevrons, if a space separates them from the tag((\s.*)?
) : this is a lazy way to allow html attributes.
NB - In case of intricated tags <b class="truc">quatre<i>troc</i></b>
, we want the match to take the outer tags, because _getDataFromHtml() is called recursively on the tag inner content.
FIX ODT substitution : if a string contains many times the same HTML tag, only the first opening tag and the last closing tag are used.
To reproduce :
create a translation string containing two times the same HTML tag, :
I <b>contain</b> two <b>bold</b> tags
create an ODT template with this translation string and upload it for any object
generate the ODT
see that the string is incorrectly rendered
This PR fixes this behavior and adds a test to lock the correct behavior.
As a side enhancement, the PR reverses the orders of parameters in test assertEquals() method since they were not in the right order.