Describe the bug When an article has images embedded inside a <figure> element, the images are not parsed (all affected articles so far are from Substack or Medium).
To Reproduce Parse any article from Substack/Medium that contains images.
Expected behavior When using keep_article_html=True, the images should be embedded in the resulting article HTML.
Screenshots N/A
System information
Additional context Example of image not being parsed
<figure class="mi mj mk ml mm mn mf mg paragraph-image">
  <div role="button" tabindex="0" class="mo mp ed mq bh mr">
    <div class="mf mg mh">
      <picture>
        <source srcset="https://miro.medium.com/v2/resize:fit:640/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/format:webp/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" type="image/webp">
        <source data-testid="og" srcset="https://miro.medium.com/v2/resize:fit:640/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 640w, https://miro.medium.com/v2/resize:fit:720/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 720w, https://miro.medium.com/v2/resize:fit:750/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 750w, https://miro.medium.com/v2/resize:fit:786/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 786w, https://miro.medium.com/v2/resize:fit:828/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 828w, https://miro.medium.com/v2/resize:fit:1100/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 1100w, https://miro.medium.com/v2/resize:fit:1400/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px">
        <img alt="Two hands reaching for one another, seen under pink light against a bright pink background" class="bh ko ms c" width="700" height="680" loading="eager" src="https://miro.medium.com/v2/resize:fit:1155/1*IWuB7jVUQ0lsNICWUv5gPg.jpeg">
      </picture>
    </div>
  </div>
</figure>
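For anyone who needs a stopgap, here is a minimal sketch of the expected result: it collects every <img> nested anywhere inside a <figure>, including ones wrapped in <picture>. It uses only Python's standard-library html.parser and is not how the library itself extracts images. The names FigureImageCollector and images_in_figures are hypothetical, and SAMPLE is a trimmed version of the markup above with a shortened alt text.

```python
from html.parser import HTMLParser


class FigureImageCollector(HTMLParser):
    """Collect <img> attributes found anywhere inside a <figure> element."""

    def __init__(self):
        super().__init__()
        self._figure_depth = 0  # > 0 while we are inside at least one <figure>
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "figure":
            self._figure_depth += 1
        elif tag == "img" and self._figure_depth > 0:
            attr_map = dict(attrs)
            self.images.append(
                {"src": attr_map.get("src"), "alt": attr_map.get("alt", "")}
            )

    def handle_endtag(self, tag):
        if tag == "figure" and self._figure_depth > 0:
            self._figure_depth -= 1


def images_in_figures(html):
    """Return a list of {'src', 'alt'} dicts for images nested in <figure>."""
    collector = FigureImageCollector()
    collector.feed(html)
    collector.close()
    return collector.images


# Trimmed from the markup in this report (assumption: alt text shortened).
SAMPLE = (
    '<figure class="paragraph-image"><div><picture>'
    '<source srcset="https://miro.medium.com/v2/resize:fit:640/'
    '1*IWuB7jVUQ0lsNICWUv5gPg.jpeg 640w">'
    '<img alt="Two hands" src="https://miro.medium.com/v2/resize:fit:1155/'
    '1*IWuB7jVUQ0lsNICWUv5gPg.jpeg">'
    '</picture></div></figure>'
)

if __name__ == "__main__":
    for image in images_in_figures(SAMPLE):
        print(image["src"], "-", image["alt"])
```

The depth counter means the <img> is found no matter how many <div> or <picture> wrappers sit between it and the <figure>, which is the nesting Medium produces here.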