Daxos · Internal Setup

Apify Scraping Setup

~5 minutes. Then Claude can pull LinkedIn, X, Google Maps & more for deal diligence.

1. Sign up

  1. Open console.apify.com/sign-up and sign up with Google (fastest, no credit card needed to create the account).
  2. Pick the Starter plan ($29/mo) when prompted — or start Free and upgrade later. See why in section 2.
  3. You're in. The account already has an API token waiting — grab it in section 3.

2. Which plan

Plan | Price | Monthly credit | Catch
Free | $0 | $5 | Many of the best scrapers block free-tier users
Starter ✓ | $29/mo | $29 | None; this is the pick
Scale | $199/mo | $199 | Overkill for our volume

Pick Starter. Our actual scraping fees run $2–8/month, and the $29 plan includes $29 of usage credit, so the plan fee covers the scraping itself with plenty of headroom. The only reason not to use Free: it gates the LinkedIn and Twitter scrapers we rely on.

3. Get your token

  1. Go to console.apify.com/settings/integrations
  2. Copy the Personal API token shown there (starts with apify_api_…).
  3. Paste it to Claude in chat: just send “here's the apify token: <paste>”. Claude stores it securely on the droplet, verifies it with a live call, and runs smoke tests.

🔒 Treat the token like a password. Paste it only into our Claude chat — not anywhere public. (This page is unlisted and holds no secrets.) You can rotate or scope it anytime from the same settings page.
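
For reference, the "verifies it with a live call" step could look roughly like the sketch below. It assumes the Apify API v2 current-user endpoint (`https://api.apify.com/v2/users/me`) with a Bearer token header; the function names are hypothetical and this is not necessarily how Claude implements the check on the droplet.

```python
import json
import urllib.request

# Assumption: Apify API v2 "current user" endpoint, used only to prove the token works.
APIFY_ME_URL = "https://api.apify.com/v2/users/me"
TOKEN_PREFIX = "apify_api_"  # prefix stated in step 2 above


def looks_like_apify_token(token: str) -> bool:
    """Cheap local check before any network call: right prefix, non-empty body."""
    token = token.strip()
    return token.startswith(TOKEN_PREFIX) and len(token) > len(TOKEN_PREFIX)


def build_verify_request(token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the token's owner."""
    return urllib.request.Request(
        APIFY_ME_URL,
        headers={"Authorization": f"Bearer {token.strip()}"},
    )


def verify_token(token: str, timeout: float = 10.0) -> str:
    """Send the live call; return the account username, or raise on a bad token."""
    if not looks_like_apify_token(token):
        raise ValueError("token does not start with 'apify_api_'")
    with urllib.request.urlopen(build_verify_request(token), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["data"]["username"]
```

In practice the token would be read from wherever it is stored (for example a hypothetical `APIFY_TOKEN` environment variable) rather than typed into code. An invalid token makes `urlopen` raise an HTTP error instead of returning a username.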


4. What we'll scrape

Every actor below is cookieless where it matters — no logging into your accounts, no ban risk. All prices are pay-per-result.

What | Use | Cost
LinkedIn profiles | Founder background checks (cookieless) | $4 / 1,000
LinkedIn companies | Headcount, growth | $3 / 1,000
LinkedIn jobs | Hiring = traction signal | $1 / 1,000
Twitter / X | Founder & company posts | $0.40 / 1,000
Google Maps + reviews | Business diligence | ~$1–2 / batch
Website traffic (SimilarWeb) | Is traffic real & growing? Where from? (cookieless) | $1 / 1,000
Website → markdown | Feed company sites into DRA | ~$0.02–0.50 / site
App Store / Play reviews | Consumer traction | $0.10–1 / 1,000
Instagram / TikTok | Consumer startup reach | ~$2 / 1,000
Google Search (SERP) | Open-web diligence | $1.80 / 1,000
Trustpilot / News | Reputation, signal | cheap
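
To make the pay-per-result pricing concrete, here is a small sketch that turns result counts into an estimated spend. The per-1,000 prices are copied from the rows above that quote one; the dictionary keys, the function name, and the example counts are hypothetical.

```python
# USD per 1,000 results, taken from the table rows that quote a per-1,000 price.
PRICE_PER_1K = {
    "linkedin_profiles": 4.00,
    "linkedin_companies": 3.00,
    "linkedin_jobs": 1.00,
    "twitter_x": 0.40,
    "similarweb": 1.00,
    "google_serp": 1.80,
}


def estimate_cost(results_by_source: dict[str, int]) -> float:
    """Estimated spend in USD: price per 1,000 results times results / 1,000."""
    total = sum(
        PRICE_PER_1K[source] * count / 1000
        for source, count in results_by_source.items()
    )
    return round(total, 2)


# Hypothetical single-company pull: 5 founder profiles, 1 company page,
# 1,000 tweets, 50 search results.
example = {
    "linkedin_profiles": 5,
    "linkedin_companies": 1,
    "twitter_x": 1000,
    "google_serp": 50,
}
print(estimate_cost(example))  # 0.51
```

Sources priced per batch or per site (Google Maps, website → markdown) are left out because their unit is different.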

Skip: Crunchbase & G2 scrapers. They're losing the anti-bot war (flaky), so we stay on Harmonic / Dealroom / SEC EDGAR for funding data.

4.5. PitchBook (the honest answer: don't)

You asked about PitchBook. Short version: don't scrape it — it's the one source on your list where scraping is genuinely a bad idea, for three reasons:

Issue | Why it matters
It's paywalled | The real data (valuations, AUM, cap tables) sits behind a paid login. The working scrapers only return the free public shell, which is near-worthless.
Needs your login | To get anything real, an actor needs your own PitchBook cookies. That can get your seat banned ($20–30k/yr) and breaches your subscriber terms.
Legal exposure | Public data (LinkedIn, for example) is fair game. Paywalled data is not; that's the line where the "hacking" law actually applies. PitchBook is Morningstar-owned and litigious, and their terms ban feeding their data into AI.

Get the same data, safely & cheaply: free SEC EDGAR Form D filings (US private raises filed under Regulation D: issuer, amount, date, officers) cover most of PitchBook's seed-relevant value. Add Dealroom via Apify ($49/mo, funding rounds + valuations) if you want structured coverage. That comes to ~1% of PitchBook's cost, without the legal exposure.

4.6. More sources worth having

Several of the best diligence sources are free official APIs — Claude can hit these with no Apify token at all. The rest are cheap Apify actors.

Source | Use | How
SEC EDGAR Form D | US private raises (who raised how much) | free API
Court records | Founder lawsuit history | free CourtListener API
Hacker News | Founder / product sentiment | free Algolia API
Wayback Machine | What the company claimed months ago (drift) | free CDX API
Patents | IP ownership / chain-of-title (deeptech) | free PatentsView, or Apify
Company registries | Catch dissolved / forfeited entities | OpenCorporates (Apify)
Reddit | Product sentiment | Apify, ~pay-per-result
YouTube | Founder talks, demos, conference clips | Apify, $0.50–5 / 1k
BuiltWith / WHOIS | Verify tech stack & real company age | Apify, cheap

The free APIs (EDGAR, CourtListener, HN, Wayback, Patents) are the diligence spine — funding, litigation, sentiment, claim-drift, IP — all free and legal. Apify fills the gaps. None of this needs anything from you beyond the one token in section 3.
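
As an illustration of how little setup the free APIs need, the sketch below builds query URLs for two of them: the Hacker News Algolia search API and the Wayback Machine CDX API. The helper names and example inputs are hypothetical; the endpoints and query parameters are the publicly documented ones as far as we know, so treat this as a starting point rather than a tested integration.

```python
from urllib.parse import urlencode

HN_SEARCH = "https://hn.algolia.com/api/v1/search"
WAYBACK_CDX = "https://web.archive.org/cdx/search/cdx"


def hn_search_url(query: str, tags: str = "story") -> str:
    """URL for Hacker News items matching a founder or product name."""
    return HN_SEARCH + "?" + urlencode({"query": query, "tags": tags})


def wayback_cdx_url(site: str, year_from: int, year_to: int, limit: int = 50) -> str:
    """URL listing archived captures of a site, for spotting claim drift."""
    params = {
        "url": site,
        "output": "json",            # JSON rows instead of plain text
        "from": year_from,
        "to": year_to,
        "filter": "statuscode:200",  # only captures that loaded successfully
        "collapse": "digest",        # drop consecutive identical captures
        "limit": limit,
    }
    return WAYBACK_CDX + "?" + urlencode(params)


print(hn_search_url("Acme Robotics"))
print(wayback_cdx_url("example.com", 2023, 2024))
```

Neither call needs the Apify token; each URL can be fetched with any HTTP client and the JSON response compared across dates or scanned for sentiment.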

5. What it costs

Full diligence sweep on one company ≈ $5–8.

(Founder LinkedIn + company page + 1k tweets + Instagram/TikTok + both app stores + site crawl + search + job postings + reviews + news — all of it.) A 100-company batch enrichment ≈ $2–5. These are rounding errors against the deal sizes, and they draw from the $29 monthly credit.
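
As a quick sanity check on the budget, the arithmetic below shows how many full single-company sweeps fit inside the $29 monthly credit at the quoted $5–8 per sweep. The constant and function names are hypothetical; the figures are the ones stated on this page.

```python
import math

MONTHLY_CREDIT = 29.0          # Starter plan usage credit (section 2)
SWEEP_COST_RANGE = (5.0, 8.0)  # full sweep on one company, low and high estimate


def sweeps_within_credit(credit: float, cost_range: tuple[float, float]) -> tuple[int, int]:
    """Return (worst case, best case) count of whole sweeps the credit covers."""
    low, high = cost_range
    return math.floor(credit / high), math.floor(credit / low)


print(sweeps_within_credit(MONTHLY_CREDIT, SWEEP_COST_RANGE))  # (3, 5)
```

Usage beyond the included credit is typically billed on top of the plan, so a month with more than a handful of full sweeps would be worth flagging in advance.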

6. Is this legal?

Short version: yes, for what we're doing. We scrape public data only, cookieless (never logging into your accounts), for internal diligence, and we never republish or resell it. That's standard practice across the VC industry, and sales-intelligence tools (Apollo, ZoomInfo, Harmonic) do the same. Courts have held that scraping public data isn't “hacking.” We keep it clean by tying scraped data to an active deal and deleting it when the deal dies: the data is a flag-raiser, not a permanent dossier.
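
The retention rule (scraped data lives only as long as its deal) can be sketched as a simple filter. The record type, field names, and function below are hypothetical and only illustrate the rule, not how the data is actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapedRecord:
    deal_id: str  # every record is tied to exactly one deal
    source: str   # e.g. "linkedin_profiles"
    payload: str  # the scraped content itself


def purge_dead_deals(
    records: list[ScrapedRecord], active_deal_ids: set[str]
) -> tuple[list[ScrapedRecord], list[ScrapedRecord]]:
    """Split records into (kept, purged); only records for active deals are kept."""
    kept = [r for r in records if r.deal_id in active_deal_ids]
    purged = [r for r in records if r.deal_id not in active_deal_ids]
    return kept, purged


records = [
    ScrapedRecord("deal-001", "linkedin_profiles", "..."),
    ScrapedRecord("deal-002", "twitter_x", "..."),
]
kept, purged = purge_dead_deals(records, active_deal_ids={"deal-001"})
print(len(kept), len(purged))  # 1 1
```

Running a purge like this whenever a deal is marked dead keeps the store limited to live diligence.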

Next step: grab the token from section 3 and paste it to Claude.
Everything else is already built and waiting.

Daxos Capital · internal · unlisted