Does mochiweb's parse/1 ignore (some) white-space? #166
Comments
In Zotonic we have an HTML sanitizer. See: https://github.com/zotonic/z_stdlib/blob/master/src/z_html.erl
In our Mochiweb fork we have some changes that still need to be merged upstream.
Nice, that is way better than the regex replace of spaces between open and closing tags to :D If you would still like to go that way, I can send you a pull request. (All tests are passing now as well.)
@smeevil You mean sending a pull request to Mochiweb based on the Zotonic fork? That would be very nice. Or a pull to the Zotonic fork for getting the tests working? That is welcome as well 👍
no, sorry :)
@rrrene ah, in the html_sanitize_ex issue. Feel free to check the Zotonic fork - it is used a lot in production for lots of HTML (and z_stdlib also sanitizes the CSS).
Oh, I now see I mixed up the issues... sorry :O
@mworrell Thanks for getting back to me so quickly :) I will check the Zotonic fork in the evening, but your example looks very promising! 👍
For the most part, this code was designed to parse out data from HTML (e.g. microformats style data) and XML, not to deal correctly with all of HTML. The workaround I would use is
Is the current status of this 'Won't fix'?
The current status of this is "Nobody has contributed a fix"
First off, thanks for an amazing library. I love the parser and how it enabled me to concentrate on the "sanitizing" part while building an HTML sanitizer in Elixir. One thing that struck me as odd was that white-space between a closing and an opening tag seems to be omitted in the parser's return value.
The following is in Elixir syntax and I hope it is understandable. I am sorry that my Erlang is not good enough to translate this for a better bug report. 😞
When we put this binary into mochi_web:
the result is this:
The space between the closing of the first </b> and the opening of the second <b> is somehow lost. I would have expected the following, where the space is preserved as a "text node":
But maybe the described behaviour is intended? Or is this a bug?
Thanks again and keep up the good work! 👍
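For readers who want to see the distinction in isolation, here is a minimal sketch in Python, using only the standard library's html.parser. It is not mochiweb's code and makes no claim about how mochiweb's parser is implemented. The input string, the TreeBuilder class, the parse helper, and the keep_whitespace flag are all hypothetical names chosen for illustration. The sketch builds nested (tag, attrs, children) tuples and either keeps or drops whitespace-only text between a closing and an opening tag, which is the behaviour this report is about.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build nested (tag, attrs, children) tuples from well-formed HTML.

    Illustration only: keep_whitespace is a hypothetical switch that decides
    whether whitespace-only text between tags becomes a text node.
    """

    def __init__(self, keep_whitespace):
        super().__init__(convert_charrefs=True)
        self.keep_whitespace = keep_whitespace
        self.root = ("root", [], [])
        self.stack = [self.root]  # the innermost open element is last

    def handle_starttag(self, tag, attrs):
        node = (tag, attrs, [])
        self.stack[-1][2].append(node)  # attach to the current parent
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Only close the element if it matches the innermost open tag.
        if len(self.stack) > 1 and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        # Dropping whitespace-only text here reproduces the reported symptom.
        if not self.keep_whitespace and data.strip() == "":
            return
        self.stack[-1][2].append(data)


def parse(html, keep_whitespace):
    """Return the list of top-level nodes parsed from html."""
    builder = TreeBuilder(keep_whitespace)
    builder.feed(html)
    builder.close()
    return builder.root[2]


if __name__ == "__main__":
    sample = "<p><b>hello</b> <b>world</b></p>"  # hypothetical input
    print(parse(sample, keep_whitespace=True))
    # [('p', [], [('b', [], ['hello']), ' ', ('b', [], ['world'])])]
    print(parse(sample, keep_whitespace=False))
    # [('p', [], [('b', [], ['hello']), ('b', [], ['world'])])]
```

With keep_whitespace=True the single space survives as its own text node between the two b elements; with keep_whitespace=False it disappears, so the rendered words would run together.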