How We Extract Emails from Linktree Pages (and Why Influencer Niches Have Way More)

by HarvestMyData

Tags: instagram, linktree, email-extraction, scraping, technical

Instagram only shows you half the emails

When you look at an Instagram profile, you see two places where an email might be:

  1. The contact button on business/creator accounts (the "public email" field)
  2. Typed directly into the bio text, for example "collabs:" followed by an email address

Most scraping tools stop here. They grab the contact email from Instagram's API, maybe run a regex over the bio, and call it a day.

We do that too. But there's a third source that most people miss entirely, and it's responsible for a significant chunk of the emails we find: Linktree pages.

What's actually on a Linktree page

If you've been on Instagram for more than five minutes you've seen "link in bio" profiles that point to a linktr.ee URL. The person puts all their links there — their website, their YouTube, their booking form, whatever.

What's less obvious is that Linktree pages often contain an email address that isn't listed anywhere on Instagram itself. It might be in a "Contact Me" button, a mailto: link buried in a booking widget, or just plain text on the page. The person set it up on Linktree and never bothered to also add it as their Instagram contact email.

This happens a lot with creators and small businesses. They set up Linktree once, added their email there, and Instagram's contact email field either stayed empty or has some old address they don't check.

How we actually scrape it

The pipeline has three steps, in priority order:

Step 1 — We check Instagram's own data first. Business and creator accounts expose a public_email field through the API. This is the "official" contact email. If it's there, we use it and move on.

Step 2 — For accounts without a contact email, we run a regex over the bio text. You'd be surprised how many people just type their email into their bio. We filter out false positives (image filenames ending in .png/.jpg, etc.) and take the first valid match.
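Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the regex, the list of asset extensions, and the function name first_email_in_bio are all assumptions made for the example.

```python
import re
from typing import Optional

# Deliberately loose pattern (an assumption for this sketch, not the production regex).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# A "match" ending in one of these is really an asset filename such as logo@2x.png.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg")


def first_email_in_bio(bio: str) -> Optional[str]:
    """Return the first plausible email found in the bio text, or None."""
    for match in EMAIL_RE.finditer(bio):
        candidate = match.group(0).lower()
        if candidate.endswith(ASSET_SUFFIXES):
            continue  # false positive: image filename, not an address
        return candidate
    return None
```

Taking the first surviving match keeps the behaviour predictable when a bio lists several addresses.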

Step 3 — For accounts that still don't have an email after steps 1 and 2, we check if they have a Linktree link. We look in three places:

  • The bio text itself (people paste linktr.ee/username right in their bio)
  • The external_url field (the clickable link under the bio)
  • The bio_links array (Instagram's newer multi-link feature)

If we find a Linktree URL, we scrape the page.
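Here is a rough sketch of that three-place lookup. The external_url and bio_links names come from the fields mentioned above; the "biography" key, the "url" key inside each bio_links entry, and the function name find_linktree_url are assumptions for illustration.

```python
import re
from typing import Optional

# Matches linktr.ee/<username> with or without scheme and "www." prefix.
LINKTREE_RE = re.compile(
    r"(?:https?://)?(?:www\.)?linktr\.ee/([A-Za-z0-9._-]+)", re.IGNORECASE
)


def find_linktree_url(profile: dict) -> Optional[str]:
    """Look for a Linktree link in the bio, external_url, then bio_links."""
    candidates = [
        profile.get("biography") or "",      # assumed key for the bio text
        profile.get("external_url") or "",
    ]
    for link in profile.get("bio_links") or []:
        candidates.append(link.get("url") or "")  # assumed shape of each entry

    for text in candidates:
        match = LINKTREE_RE.search(text)
        if match:
            # Drop sentence punctuation that got glued onto the username.
            username = match.group(1).rstrip(".")
            if username:
                return f"https://linktr.ee/{username}"
    return None
```

Normalising every hit to the same https://linktr.ee/username form also makes the URL a convenient cache key later on.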

The scraping part

Linktree is a Next.js app. Every page has a <script id="__NEXT_DATA__"> tag that contains the full page data as JSON. We don't need to render JavaScript or parse the DOM — we just grab that script tag and parse the JSON.

Then we regex the entire JSON blob for email addresses. This catches emails wherever they appear — in link titles, descriptions, contact buttons, custom text blocks, anywhere. If someone put their email on their Linktree page in any form, we'll find it.
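A simplified version of this step might look like the following. Note one deliberate difference: instead of running the regex over the raw JSON text, this sketch parses the JSON and scans each string value, which avoids escape sequences such as \n getting glued onto an address. The regexes and the function names are assumptions, and nothing here depends on a particular layout inside __NEXT_DATA__.

```python
import json
import re
from typing import Iterator, List

NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def iter_strings(node) -> Iterator[str]:
    """Yield every string value nested anywhere inside parsed JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def emails_from_linktree_html(html: str) -> List[str]:
    """Return unique, lowercased emails found in the page's __NEXT_DATA__ JSON."""
    match = NEXT_DATA_RE.search(html)
    if not match:
        return []
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return []
    found = {}  # dict preserves first-seen order while de-duplicating
    for text in iter_strings(data):
        for hit in EMAIL_RE.finditer(text):
            found.setdefault(hit.group(0).lower(), None)
    return list(found)
```

Because the scan covers every string in the payload, a mailto: link and an address typed into a link title are picked up the same way.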

We run these requests through residential proxies at around 250 requests per second with gzip compression. Linktree doesn't rate-limit as aggressively as Instagram does, so we can move fast. Each response is cached, so the same page is never scraped twice, even across different jobs.
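The caching idea, stripped to its core, could look like this. It is only a sketch: the CachedFetcher class is hypothetical, the cache is a plain in-memory dict (sharing it across jobs would need a persistent store), and the actual HTTP call, proxying, and rate control are left to whatever fetch function you pass in.

```python
from typing import Callable, Dict


class CachedFetcher:
    """Wrap any fetch(url) -> html function so each page is fetched once."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    @staticmethod
    def _key(url: str) -> str:
        # Treat ".../name" and ".../name/" as the same page.
        return url.strip().rstrip("/")

    def get(self, url: str) -> str:
        key = self._key(url)
        if key not in self._cache:
            self._cache[key] = self._fetch(url)  # only hit the network on a miss
        return self._cache[key]
```

Keeping the network call behind a plain callable makes it easy to swap in a proxied HTTP client or a fake fetcher for testing.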

The whole Linktree step usually recovers an email for an extra 1-3% of the accounts in a typical job. That doesn't sound like much, but on a 50K scrape that's 500-1,500 emails that would otherwise be missing.

Which niches actually have Linktree links

Not every Instagram audience uses Linktree equally. We've seen a clear pattern after running thousands of jobs:

High Linktree usage (15-30% of accounts have one):

  • Fitness coaches and personal trainers
  • Beauty / makeup / skincare creators
  • Life coaches, mindset coaches, business coaches
  • Musicians and DJs
  • Photographers (especially portrait/wedding)
  • Small e-commerce brands
  • OnlyFans creators (obviously)

Medium usage (5-15%):

  • Food bloggers and recipe accounts
  • Travel creators
  • Real estate agents
  • Local businesses (salons, studios, cafes)
  • Freelance designers and developers

Low usage (under 5%):

  • Meme pages
  • News/media accounts
  • Large brand accounts
  • Sports fan pages
  • Most accounts with 100K+ followers (they tend to use direct website links instead)

The pattern is straightforward: if someone is trying to get contacted for business — coaching, bookings, collaborations — they're more likely to have a Linktree with their email in it. Passive accounts that just post content don't bother.

The overlap problem

One thing we had to handle: some people have the same email on both Instagram and Linktree. If we already found their email on Instagram in step 1 or step 2, we skip the Linktree scrape entirely. No point wasting a request.
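In code, that skip falls out naturally from checking the sources in priority order and returning at the first hit. The sketch below is illustrative only: resolve_email, the "biography" key, and the source labels are assumptions, and the bio and Linktree lookups are passed in as plain functions so the example stays self-contained.

```python
from typing import Callable, Optional, Tuple


def resolve_email(
    profile: dict,
    find_in_bio: Callable[[str], Optional[str]],
    find_on_linktree: Callable[[dict], Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (email, source), trying the cheapest source first."""
    # Step 1: Instagram's own contact field.
    email = profile.get("public_email")
    if email:
        return email, "instagram_contact"

    # Step 2: regex over the bio text ("biography" is an assumed key).
    email = find_in_bio(profile.get("biography") or "")
    if email:
        return email, "bio"

    # Step 3: only now pay for a Linktree request.
    email = find_on_linktree(profile)
    if email:
        return email, "linktree"

    return None, None
```

Returning the source alongside the email also makes it easy to measure how many addresses each step contributes.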

What this means for you

If you're scraping a niche full of creators, coaches, or small businesses — basically anyone who's trying to get clients — the Linktree step will meaningfully boost your email count. These are often better leads too, because a person who bothered to set up a Linktree with their contact info is actively looking to be reached.

On the other hand, if you're scraping a celebrity's followers (mostly passive fans), Linktree won't add much. The emails you get will come almost entirely from Instagram's own contact fields.

The takeaway: niche selection matters more than raw follower count. A 10K scrape of a business coach's followers will usually yield more usable emails than a 50K scrape of a celebrity's followers, partly because of Linktree and partly because business-oriented accounts are more likely to have contact info in general.

This is one of the reasons we show you the email count after each job — so you can learn which targets give the best results for your specific use case and adjust accordingly.

We built HarvestMyData to handle all of this for you.

No proxies, no code, no account needed.