Page Content to Markdown version history - 2 versions
Page Content to Markdown by Jared
Be careful with older versions! These versions are shown for reference and testing purposes. You should always use the latest version of an add-on.
Latest version
Version 1.0.1
Released May 12, 2026 - 119.44 KB
Works with Firefox 109.0 and later
Fixed
- General extractor picks the largest matching candidate per selector, not the first. On The Verge, the first <article> on a story page is a related-cards stub, so first-match-wins picked it and returned empty markdown. The extractor now scores every match by textContent.length and picks the largest qualifying candidate.
- Tighter content-significance threshold. Bump the hasSignificantContent floor to ≥3 <p> descendants and ≥500 chars of trimmed text. Rejects related-card grids that previously slipped through because their aggregated link text passed the old 50-char gate.
- SVG elements no longer crash Turndown mid-traversal. SVG className is an SVGAnimatedString, not a string; calling .toLowerCase() on it threw and Turndown returned '' for the whole page. The converter now reads class via getAttribute('class') throughout, with a fallback to .baseVal for safety. Eliminates a silent empty-output failure mode on news sites that ship inline SVG icons.
- Visible junk inside the article body no longer ships through. Expanded the non-content substring regex with author-bio, author-card, byline-bio, topics-list, tags-list, tags-row, subscribe, affiliate, disclosure, disclaimer, share-row, share-icons, social-icons, related-articles, related-stories, read-more-cta, keep-reading, frequently-asked, faq-, further-reading, comments-section. Clears author-bio cards on TechCrunch / Tom's Guide, the trailing FAQ section on Mashable, and the end-of-post subscribe widget on Substack.
- Structural section rejector for related/topics/FAQ/subscribe blocks. Any <section> or <div> whose first heading (looking one level deep through a wrapper div) reads as Topics, Tags, Related…, Frequently Asked…, Further Reading, Read Next, Keep Reading, Recommended, or Subscribe to… gets rejected wholesale, regardless of class names. Catches framework-generated wrappers (mx-auto mt-12, pc-paddingTop-32) that didn't pattern-match before.
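
The first three fixes above can be sketched roughly as follows. This is an illustrative TypeScript sketch under stated assumptions, not the add-on's actual source: ElementLike, pickLargestCandidate, and readClassName are hypothetical names introduced here, while hasSignificantContent and the thresholds (at least 3 <p> descendants, at least 500 characters of trimmed text) come from the notes above.

```typescript
// Illustrative sketch only; not the add-on's actual source.
// ElementLike is a hypothetical stand-in for the DOM Element members used here,
// so the example also runs outside a browser.
interface ElementLike {
  textContent: string | null;
  // HTML elements expose a string; SVG elements expose an object with baseVal.
  className: string | { baseVal: string };
  getAttribute(name: string): string | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

// Thresholds taken from the 1.0.1 notes.
const MIN_PARAGRAPHS = 3;
const MIN_TRIMMED_CHARS = 500;

// A candidate qualifies only if it has enough paragraphs and enough trimmed text.
function hasSignificantContent(el: ElementLike): boolean {
  const trimmed = (el.textContent || "").trim();
  return (
    el.querySelectorAll("p").length >= MIN_PARAGRAPHS &&
    trimmed.length >= MIN_TRIMMED_CHARS
  );
}

// Hypothetical helper: choose the largest qualifying match instead of the first.
function pickLargestCandidate(matches: ArrayLike<ElementLike>): ElementLike | null {
  let best: ElementLike | null = null;
  let bestScore = -1;
  for (let i = 0; i < matches.length; i++) {
    const el = matches[i];
    if (!hasSignificantContent(el)) continue;
    // Score by textContent.length, as described in the first fix.
    const score = (el.textContent || "").length;
    if (score > bestScore) {
      best = el;
      bestScore = score;
    }
  }
  return best;
}

// Hypothetical helper: read the class list without assuming className is a string.
function readClassName(el: ElementLike): string {
  const attr = el.getAttribute("class");
  if (attr !== null) return attr.toLowerCase();
  // Fallback for SVG elements, whose className carries the value in baseVal.
  const cls = el.className;
  return (typeof cls === "string" ? cls : cls.baseVal).toLowerCase();
}
```

In the extension itself the same logic would presumably run against real elements, for example the result of document.querySelectorAll('article'). A related-cards stub that appears first in the match list is skipped because it fails hasSignificantContent, and the longest qualifying match is returned instead.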
Source code released under the MIT License
Older versions
Version 1.0.0
Released May 8, 2026 - 115.92 KB
Works with Firefox 109.0 and later
Source code released under the MIT License