Timo Uttenweiler aa50f46748 docs: add Coolify deployment instructions to README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 11:21:42 +01:00


# LeadFlow — Lead Generation & Email Enrichment Platform
A unified platform for three lead-scraping pipelines with email enrichment via Anymailfinder.
---
## Tech Stack
- **Next.js 16** (App Router) + TypeScript
- **SQLite** via Prisma 7 + better-sqlite3
- **shadcn/ui** + Tailwind CSS
- **Anymailfinder API v5.1** — email enrichment (bulk JSON + individual search)
- **Vayne API** — LinkedIn Sales Navigator scraping
- **Apify** — Google SERP scraping
---
## Setup
### 1. Install dependencies
```bash
cd leadflow
npm install
```
### 2. Configure environment
Copy `.env.local.example` to `.env.local`:
```bash
cp .env.local.example .env.local
```
Then set the following values in both `.env` and `.env.local`:
```env
APP_ENCRYPTION_SECRET=your-32-character-secret-here!!!
DATABASE_URL=file:./leadflow.db
```
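One way to produce a suitable value for `APP_ENCRYPTION_SECRET` is sketched below. It relies only on Node's built-in `crypto` module; the function name `generateSecret` is illustrative and not part of the app.

```typescript
import { randomBytes } from "node:crypto";

// 16 random bytes encode to exactly 32 hexadecimal characters.
function generateSecret(): string {
  return randomBytes(16).toString("hex");
}

// Print a line that can be pasted into .env and .env.local.
console.log(`APP_ENCRYPTION_SECRET=${generateSecret()}`);
```

Generate the value once and reuse it in every environment that shares the same database, since credentials encrypted with one secret cannot be read with another.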
### 3. Run database migration
```bash
npx prisma migrate dev
```
### 4. Start the app
```bash
npm run dev
```
Open [http://localhost:3000](http://localhost:3000) in your browser.
---
## API Keys — Where to Get Them
Go to **Settings** in the sidebar to enter and save credentials. All keys are AES-256 encrypted before storage.
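To illustrate what AES-256 encryption of a credential can look like, here is a minimal sketch using Node's `crypto` module. It assumes AES-256-GCM and that the 32-character `APP_ENCRYPTION_SECRET` is used directly as the key; the app's actual cipher mode, key derivation, and storage format are not documented here and may differ. The function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Assumption: the 32-character secret is used as-is as the 32-byte AES-256 key.
function keyFromSecret(secret: string): Buffer {
  const key = Buffer.from(secret, "utf8");
  if (key.length !== 32) throw new Error("secret must be exactly 32 bytes");
  return key;
}

// Returns "iv.tag.ciphertext", each part base64-encoded.
function encryptCredential(plaintext: string, secret: string): string {
  const iv = randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", keyFromSecret(secret), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptCredential; throws if the payload was tampered with.
function decryptCredential(payload: string, secret: string): string {
  const [iv, tag, ciphertext] = payload.split(".").map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", keyFromSecret(secret), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Usage: round-trip a placeholder API key.
const exampleSecret = "0123456789abcdef0123456789abcdef";
const stored = encryptCredential("example-api-key", exampleSecret);
console.log(decryptCredential(stored, exampleSecret) === "example-api-key");
```

GCM is chosen in this sketch because it authenticates the ciphertext, so a modified database value fails to decrypt instead of yielding garbage.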
### Anymailfinder
- Sign up at [anymailfinder.com](https://anymailfinder.com)
- Account → API → copy your key (format: starts with your account prefix)
- Pricing: 2 credits per valid decision-maker email, 1 credit per person email
- Bulk API charges only when downloading results
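As a worked example of the pricing above, the following sketch estimates the credits a download would consume. The type and function names are illustrative only, and the rates are simply the ones stated in this README.

```typescript
interface EnrichmentCounts {
  decisionMakerEmails: number; // valid decision-maker emails found
  personEmails: number; // valid person emails found
}

// 2 credits per decision-maker email, 1 credit per person email.
function estimateCredits(counts: EnrichmentCounts): number {
  return counts.decisionMakerEmails * 2 + counts.personEmails * 1;
}

// 120 decision-maker emails and 40 person emails: 120 * 2 + 40 * 1 = 280 credits.
console.log(estimateCredits({ decisionMakerEmails: 120, personEmails: 40 }));
```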
### Apify
- Sign up at [apify.com](https://apify.com)
- Console → Account → Integrations → API tokens
- The app uses the `apify/google-search-scraper` actor (pay-per-event)
### Vayne
- Sign up at [vayne.io](https://vayne.io)
- Dashboard → API Settings → generate token
- **Connect LinkedIn** in the Vayne dashboard — Vayne manages the LinkedIn session on their end
- No need to manually export cookies
---
## Pipeline Workflows
### Tab 1 — AirScale → Email
1. Export companies from AirScale as CSV
2. Upload → map domain column
3. Select decision maker categories
4. Start Enrichment → bulk API runs asynchronously
5. Export CSV/Excel
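The column-mapping step can be pictured with the sketch below, which pulls a chosen column out of a CSV and normalises each value to a bare domain. It is an assumption about how such a step might work, not the app's parser: it splits on commas only and does not handle quoted fields, and `domainsFromCsv` is a hypothetical name.

```typescript
// Extract unique, normalised domains from one named column of a simple CSV.
function domainsFromCsv(csv: string, domainColumn: string): string[] {
  const [headerLine, ...rows] = csv.trim().split(/\r?\n/);
  const headers = headerLine.split(",").map((h) => h.trim());
  const index = headers.indexOf(domainColumn);
  if (index === -1) throw new Error(`column "${domainColumn}" not found`);

  const seen = new Set<string>();
  for (const row of rows) {
    const cell = (row.split(",")[index] ?? "").trim().toLowerCase();
    // Strip scheme, leading "www." and any path so only the host remains.
    const domain = cell
      .replace(/^https?:\/\//, "")
      .replace(/^www\./, "")
      .replace(/\/.*$/, "");
    if (domain) seen.add(domain);
  }
  return Array.from(seen);
}

const sample = "Company,Website\nAcme,https://www.acme.de/about\nBeta,beta.io\nAcme 2,acme.de";
console.log(domainsFromCsv(sample, "Website"));
```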
### Tab 2 — LinkedIn Sales Navigator → Email
1. Build search in Sales Navigator, copy URL
2. Paste URL + set max results → Start Scrape
3. Vayne scrapes profiles (polls until done)
4. Select profiles → Enrich with Emails
5. Export results
### Tab 3 — SERP → Email
1. Enter search term (e.g. `"Solaranlage Installateur Deutschland"`)
2. Set country/language, enable social filter
3. Select decision maker categories
4. Start → Apify scrapes Google, domains extracted, then bulk enriched
5. Export results
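The domain-extraction step in Tab 3 could look roughly like the sketch below. It assumes the social filter means dropping results hosted on social networks; the host list and the name `extractDomains` are hypothetical, and the app's real filter may cover different sites.

```typescript
// Hypothetical list of hosts treated as "social" results.
const SOCIAL_HOSTS = ["facebook.com", "linkedin.com", "instagram.com", "twitter.com", "x.com", "youtube.com"];

// Turn SERP result URLs into a de-duplicated list of company domains.
function extractDomains(urls: string[], filterSocial = true): string[] {
  const seen = new Set<string>();
  for (const raw of urls) {
    let host: string;
    try {
      host = new URL(raw).hostname.toLowerCase();
    } catch {
      continue; // skip anything that is not a valid absolute URL
    }
    host = host.replace(/^www\./, "");
    const isSocial = SOCIAL_HOSTS.some((s) => host === s || host.endsWith("." + s));
    if (filterSocial && isSocial) continue;
    seen.add(host);
  }
  return Array.from(seen);
}

console.log(
  extractDomains([
    "https://www.solar-mueller.de/kontakt",
    "https://solar-mueller.de/",
    "https://de.linkedin.com/company/example",
  ]),
);
```

De-duplicating before enrichment matters because the bulk job is charged per result, so the same domain should not be submitted twice.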
---
## How Anymailfinder Bulk Works
1. All domains are submitted in a single POST to `/v5.1/bulk/json`
2. The app polls the job status every 5 seconds until it reports `completed`
3. Results are downloaded once (credits are charged at download, not at submission)
4. Speed: ~1,000 domains per 5 minutes
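The polling step and the throughput figure above can be sketched as follows. The status lookup is passed in as a callback so the example makes no claims about Anymailfinder's request or response shapes; the `failed` status, the attempt limit, and both function names are assumptions for illustration.

```typescript
// Poll a status callback until it reports "completed"; returns the number of polls made.
async function pollUntilCompleted(
  getStatus: () => Promise<string>,
  intervalMs = 5000, // the README's 5-second interval
  maxAttempts = 720, // hypothetical cap: one hour at 5 seconds per poll
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "completed") return attempt;
    if (status === "failed") throw new Error("bulk job failed"); // assumed status value
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("timed out waiting for bulk job");
}

// Rough duration from the stated rate of ~1,000 domains per 5 minutes.
function estimateMinutes(domainCount: number): number {
  return (domainCount / 1000) * 5;
}

// 2,500 domains at that rate take about 12.5 minutes.
console.log(estimateMinutes(2500));
```

Capping the number of attempts keeps a stalled job from polling forever, which is one possible cause of a job appearing stuck at "running".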
---
## Database
SQLite at `./leadflow.db`. Inspect with:
```bash
npx prisma studio
```
---
## Deploy with Coolify (Docker)
### 1. Add the Gitea repo in Coolify
- New Resource → Application → Docker Compose (or Dockerfile)
- Repository: `https://gitea.onyva.dev/TimoUttenweiler/lead-scraper`
- Branch: `main`
- Build method: **Dockerfile** (recommended) or **Docker Compose**
### 2. Set environment variables in Coolify
In the app's Environment Variables tab:
| Variable | Value |
|---|---|
| `APP_ENCRYPTION_SECRET` | Any random 32-character string (generate once, keep secret) |
| `DATABASE_URL` | `file:/data/leadflow.db` |
| `NODE_ENV` | `production` |
| `PORT` | `3000` |
### 3. Add a persistent volume
- Mount path in container: `/data`
- This stores the SQLite database across deployments
### 4. Port mapping
- Container port: `3000`
- Coolify will proxy via its reverse proxy automatically
### 5. Deploy
Click **Deploy**. Coolify builds the image from the Dockerfile, and migrations run automatically on startup via `docker-entrypoint.sh`.
### Dockerfile stages
1. **deps**: `npm ci` + native build tools for better-sqlite3
2. **builder**: Prisma generate + `next build` (standalone output)
3. **runner**: minimal Alpine image, non-root user, copies only what's needed
---
## Troubleshooting
| Issue | Solution |
|-------|----------|
| "API key not configured" | Add key in Settings |
| Job stuck at "running" | Check the server console (the terminal running `npm run dev`) for errors |
| Prisma errors on build | Run `npx prisma generate && npm run build` |