Extract Anchor Elements

Browsers turn raw HTML text into structures the page can work with. Building a tiny slice of that logic is a practical way to learn how HTML parsing works: you have to walk the string, recognize where tags begin and end, and avoid being tricked by characters like > when they appear inside quoted attributes.
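
To make the quoted-attribute pitfall concrete, here is a minimal sketch of one way to find where a tag ends. The helper name findTagEnd is hypothetical (it is not part of the question), and the sketch assumes attribute values are delimited by matching single or double quotes.

```javascript
// Hypothetical helper, not part of the question's API.
// Given the index of a '<', returns the index of the '>' that closes that tag,
// ignoring any '>' that appears inside a quoted attribute value.
// Returns -1 if no closing '>' is found.
function findTagEnd(html, start) {
  let quote = null; // the quote character we are currently inside, or null
  for (let i = start + 1; i < html.length; i++) {
    const ch = html[i];
    if (quote !== null) {
      // Inside a quoted value: only the matching quote ends it.
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '>') {
      return i;
    }
  }
  return -1;
}

const tag = '<a title="1 > 0" href="/x">link</a>';
const end = findTagEnd(tag, 0);
console.log(tag.slice(0, end + 1)); // <a title="1 > 0" href="/x">
```

A naive tag.indexOf('>') would stop at the > inside the title attribute, which is exactly the mistake the scan above is meant to avoid.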

In this interview-sized version, implement extractAnchors(html), which returns every complete anchor element from an HTML string in source order.

Each returned item should be the exact substring from the original input, including attributes, whitespace, and nested non-anchor markup inside the anchor.

Examples

extractAnchors(
'<div> <a href="/home" >Home</a > <span>Later</span> <a >Docs</a> </div>',
);
// ['<a href="/home" >Home</a >', '<a >Docs</a>']

extractAnchors(
`<section>
<p>Read <a
href="https://example.com/docs"
data-track="docs link"
>the docs</a> first.</p>
</section>`,
);
// [
// `<a
// href="https://example.com/docs"
// data-track="docs link"
// >the docs</a>`
// ]

extractAnchors(
`<div>
<a href="https://example.com/guides" class="link" >
<abbr>API</abbr>
<span>guide</span>
</a>
</div>`,
);
// [
// `<a href="https://example.com/guides" class="link" >
// <abbr>API</abbr>
// <span>guide</span>
// </a>`
// ]

Arguments

extractAnchors(html)

  • html (string): A valid HTML-like string.

Returns

Returns an array of anchor element substrings in source order.

Notes

  • Inputs are guaranteed valid for this question.
  • Only lowercase <a ...> and </a> tags need to be handled.
  • Anchors are never nested inside other anchors.
  • Nested non-anchor tags can appear inside an anchor.
  • The HTML can contain extra spaces, indentation, and line breaks inside the markup. Return each anchor exactly as it appears in the input; do not normalize whitespace or rebuild the HTML.
  • You do not need to handle malformed markup, uppercase tags, comments, scripts, or entity decoding.

Off Limits

  • Do not use DOMParser, innerHTML, querySelectorAll(), getElementsByTagName(), or similar DOM helpers.
  • Build the result by scanning the string yourself.
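
One possible approach, shown here as a sketch and not as the reference solution, is to walk the string tag by tag and record where an anchor opens and closes. The helpers findTagEnd and isBoundary are hypothetical names introduced for illustration. The sketch relies on the guarantees in the notes above: valid input, lowercase tags, no nested anchors, and no comments or scripts. It also assumes every < in the input starts a tag.

```javascript
// Hypothetical helper: index of the '>' closing the tag that starts at `start`,
// skipping any '>' inside quoted attribute values. Returns -1 if none is found.
function findTagEnd(html, start) {
  let quote = null;
  for (let i = start + 1; i < html.length; i++) {
    const ch = html[i];
    if (quote !== null) {
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '>') {
      return i;
    }
  }
  return -1;
}

// Hypothetical helper: true if `ch` can directly follow the tag name "a",
// so that <abbr> and </abbr> are not mistaken for anchor tags.
function isBoundary(ch) {
  return ch === '>' || ch === ' ' || ch === '\n' || ch === '\t' || ch === '\r';
}

function extractAnchors(html) {
  const anchors = [];
  let anchorStart = -1; // index of the '<' of the open anchor, or -1 if none
  let i = 0;
  while (i < html.length) {
    if (html[i] !== '<') {
      i++;
      continue;
    }
    const end = findTagEnd(html, i);
    if (end === -1) break; // not expected for valid input

    const isOpen = html.startsWith('<a', i) && isBoundary(html[i + 2]);
    const isClose = html.startsWith('</a', i) && isBoundary(html[i + 3]);

    if (isOpen && anchorStart === -1) {
      anchorStart = i;
    } else if (isClose && anchorStart !== -1) {
      // Slice from the original input so whitespace is preserved exactly.
      anchors.push(html.slice(anchorStart, end + 1));
      anchorStart = -1;
    }
    i = end + 1; // jump past the whole tag, including its attributes
  }
  return anchors;
}

console.log(extractAnchors('<p><a href="/a">A</a> and <a >B</a ></p>'));
// [ '<a href="/a">A</a>', '<a >B</a >' ]
```

Because every tag is skipped as a unit, a > or even a literal </a> inside a quoted attribute of a nested element does not end the anchor early.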
