WOTC Tax Credit Suite

$200K+ revenue generated in 2024

A complete ecosystem for Work Opportunity Tax Credit processing. From federal forms to AI extraction to audio verification.

"The bottleneck wasn't the code. It was human comprehension. Audio guidance solved what better UI couldn't."

The Numbers

$200K+

Revenue 2024

7

Languages

86

CSV columns

<60s

Verification time

The Problem

Industry Standard (Before)

• Manual phone calls to state agencies
• $5K per applicant verification
• 45+ minutes per call
• Paper forms that got lost
• Applicants who didn't understand questions
• Each wrong checkbox = $5,000 lost

"The bottleneck wasn't the code. It was human comprehension. Applicants weren't clicking 'No' because they weren't eligible. They were clicking 'No' because they didn't understand the question."

The Question 13 Problem

The $5,000 Question

"Many employers miss out on tax credits through the New York Youth Program due to incorrect answers on Question 13. This question asks if the employee was unemployed, wanted more paid work, or felt their skills were underutilized when starting the job."

Most people qualify but often answer incorrectly because the legal terminology is confusing. The audio app reads each question aloud in plain language, which cut wrong answers from 30% to under 5%.

Before

30% wrong answers

After

<5% wrong answers

Difference

$5K per error saved
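
As a rough worked example of what those rates mean in dollars, the sketch below computes the credit value lost per 100 applicants. It assumes, purely for illustration, that every applicant is actually eligible and that each wrong answer forfeits the full $5,000. `expectedLoss` is a hypothetical helper, and 5% stands in as the upper bound of "<5%".

```typescript
// Hypothetical helper: credit value lost to wrong answers.
// Assumes every applicant is eligible and each error forfeits the full credit.
function expectedLoss(applicants: number, errorPercent: number, creditPerError: number): number {
  return (applicants * errorPercent * creditPerError) / 100;
}

const lossBefore = expectedLoss(100, 30, 5000); // 30% wrong answers
const lossAfter = expectedLoss(100, 5, 5000);   // upper bound of "<5%"
const recovered = lossBefore - lossAfter;
```

Under those assumptions, that is $150,000 lost per 100 applicants before and at most $25,000 after.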

The Solution

Three production systems working together:

1. Audio WOTC Verification

Built in 1 week. Solo. $200K+ generated.

Audio-guided self-service form. Every instruction read out loud. Simple Yes/No buttons. Mobile-friendly. Under 60 seconds to complete.

Before

$5K/applicant

After

~$0.10/applicant

Time

45min → <1min
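
The flow described above (one question at a time, read aloud, answered with Yes or No) could be modeled roughly as follows. This is a sketch of the state handling only; it assumes each question has a clip that reads the plain-language wording aloud. Every name here (`AudioQuestion`, `audioUrl`, `answerCurrent`) is hypothetical, and playback is left to the UI.

```typescript
// Sketch of the question flow only; audio playback itself is left to the UI.
// All names are hypothetical, not the production schema.
type Answer = "yes" | "no";

interface AudioQuestion {
  id: string;
  plainText: string; // plain-language wording shown on screen
  audioUrl: string;  // clip that reads the same wording aloud (assumed)
}

interface FlowState {
  index: number;                   // position of the current question
  answers: Record<string, Answer>; // answers keyed by question id
}

function startFlow(): FlowState {
  return { index: 0, answers: {} };
}

// Record the answer to the current question and advance by one.
function answerCurrent(questions: AudioQuestion[], state: FlowState, answer: Answer): FlowState {
  if (state.index >= questions.length) return state; // already finished
  const current = questions[state.index];
  const answers: Record<string, Answer> = { ...state.answers };
  answers[current.id] = answer;
  return { index: state.index + 1, answers };
}

function isComplete(questions: AudioQuestion[], state: FlowState): boolean {
  return state.index >= questions.length;
}
```

Keeping the state immutable means a "Back" button only needs to restore the previous state object.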

2. Digital IRS Form 8850

7 languages · Touch signatures · Real-time validation

Complete digitization of IRS Form 8850. Multi-step flow with conditional logic. Accessibility-first design. Works on mobile.

Languages: English · Spanish · French · Haitian Creole · Korean · Russian · Chinese
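
One way to express a multi-step flow with conditional logic is to attach an optional predicate to each step. The sketch below is illustrative only: the step ids and the `isVeteran` branch are hypothetical, since the real form's steps and branching are not shown here.

```typescript
// Hypothetical step definitions; the real steps and branching are not shown in the source.
interface FormData8850 {
  isVeteran?: boolean;
}

interface Step {
  id: string;
  showIf?: (data: FormData8850) => boolean; // omitted = always shown
}

const steps: Step[] = [
  { id: "personal-info" },
  { id: "veteran-status" },
  { id: "veteran-details", showIf: (d) => d.isVeteran === true },
  { id: "signature" },
];

// Return the ids of the steps that apply to the answers given so far.
function visibleSteps(all: Step[], data: FormData8850): string[] {
  return all.filter((s) => (s.showIf ? s.showIf(data) : true)).map((s) => s.id);
}
```

Recomputing `visibleSteps` after every answer keeps follow-up screens hidden until they become relevant.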

3. Enterprise Tax Credit Platform

AI extraction · PostGIS · Multi-tenant

Full processing platform. Gmail API monitoring → PDF extraction → OpenAI data extraction → PostGIS geographic eligibility → 86-column CSV export for federal reporting.

The Pipeline

Gmail API Watch → Email Processing → AI Extraction → PostGIS Queries → Dashboard → CSV Export
       ↓                  ↓                 ↓                ↓               ↓            ↓
 Auto-fetch         Parse PDFs        86 fields         EZ/TEZ zones      Kanban      Federal
   emails          attachments        extracted          verified          view       compliance

Technical Architecture Deep Dive

1. Document Ingestion Pipeline

emails → attachments → PDF split to JPG pages →
form classification (Gemini 2.5) → field extraction → ai_extraction_json

• Automatic PDF-to-JPG conversion for ML processing
• Form type classification (8850, Questionnaire, NY Youth)
• Page-level extraction for multi-form documents
• Confidence scoring on each extracted field
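
The page-level results described above might be shaped something like this. Only the three form types come from the list; the field names, the 0..1 confidence scale, and both helpers are assumptions made for the sketch.

```typescript
// Hypothetical shapes; only the three form types are taken from the text above.
type FormType = "8850" | "questionnaire" | "ny_youth";

interface ExtractedField {
  name: string;
  value: string;
  confidence: number; // assumed 0..1, reported per field
}

interface PageResult {
  page: number;       // 1-based page number within the source PDF
  formType: FormType; // output of the classification step
  fields: ExtractedField[];
}

// Group classified pages by form type so each form can be processed on its own.
function pagesByForm(pages: PageResult[]): Record<FormType, number[]> {
  const grouped: Record<FormType, number[]> = { "8850": [], questionnaire: [], ny_youth: [] };
  for (const p of pages) grouped[p.formType].push(p.page);
  return grouped;
}

// List the fields on a page whose confidence falls below a chosen threshold.
function lowConfidenceFields(page: PageResult, threshold: number): string[] {
  return page.fields.filter((f) => f.confidence < threshold).map((f) => f.name);
}
```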

2. Railway Webhook Architecture

5 microservices powering the extraction pipeline:

PDF Processor → Applicant Jobs → Form Responses → Orchestrator → Addresses

• Separate Railway services for isolation
• Database triggers for automation
• Supabase Edge Functions for lightweight tasks
• Webhook orchestration for complex flows
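
A minimal sketch of the handoff between services, using the order listed above. The stage ids and the idea that each service reports completion and is told which stage comes next are assumptions; the actual webhook contract is not described here.

```typescript
// Service order as listed above; the handoff mechanism is an assumption.
const STAGES = ["pdf-processor", "applicant-jobs", "form-responses", "orchestrator", "addresses"] as const;
type Stage = typeof STAGES[number];

// Given the stage that just finished, return the next one to call, or null at the end.
function nextStage(done: Stage): Stage | null {
  const i = STAGES.indexOf(done);
  return i >= 0 && i < STAGES.length - 1 ? STAGES[i + 1] : null;
}
```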

3. Human Verification Layer

"Every extraction goes through human verification, regardless of its confidence level. The manual step is simple: it shows eligible/not eligible for each of the 2 forms, defaults after 3 seconds to what the extraction found, and lets the user override."

• 3-second auto-advance
• One-click override
• Full audit trail
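
The review step could be reduced to a single pure function that turns a reviewer's action (or inaction) into a final value plus an audit record. This sketch assumes the two forms are the 8850 and the questionnaire and that an override only counts if it arrives within the 3-second window. All type and field names are hypothetical.

```typescript
// Hypothetical types: a reviewer sees the extracted result and may override it
// before the auto-advance window (3 seconds per the text) closes.
type Eligibility = "eligible" | "not_eligible";

interface ReviewInput {
  form: "8850" | "questionnaire";
  extracted: Eligibility; // what the extraction found
  override?: Eligibility; // set only if the reviewer clicked
  overrideAtMs?: number;  // ms after the card was shown
}

interface AuditEntry {
  form: string;
  extracted: Eligibility;
  final: Eligibility;
  source: "auto_default" | "human_override";
}

const AUTO_ADVANCE_MS = 3000;

// Resolve one review card into its final value plus an audit record.
function resolveReview(input: ReviewInput): AuditEntry {
  const overridden =
    input.override !== undefined &&
    input.overrideAtMs !== undefined &&
    input.overrideAtMs <= AUTO_ADVANCE_MS;
  return {
    form: input.form,
    extracted: input.extracted,
    final: overridden ? (input.override as Eligibility) : input.extracted,
    source: overridden ? "human_override" : "auto_default",
  };
}
```

Storing both `extracted` and `final` in each entry makes it possible to measure later how often reviewers disagree with the extraction.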

Tech Stack

Frontend

Next.js 14, React 18, TypeScript, Tailwind, shadcn/ui, TanStack Table

Backend

Supabase, PostgreSQL 15, PostGIS, Row-Level Security

AI

OpenAI GPT-4 for document extraction, confidence scoring

Integrations

Gmail API, Google Cloud Storage, Vercel, Railway

Geographic Intelligence

PostGIS Spatial Queries

-- Check whether a point falls inside any Empowerment Zone.
-- $1 = longitude, $2 = latitude (ST_MakePoint takes x/longitude first).
SELECT EXISTS (
  SELECT 1
  FROM empowerment_zones ez
  WHERE ST_Contains(
    ez.geometry,
    ST_SetSRID(ST_MakePoint($1, $2), 4326)
  )
) AS is_ez;

<50ms eligibility queries
• 10,000+ EZ/TEZ polygons
• 3,143 U.S. counties
• Address normalization + geocoding
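
From application code, the zone check above would typically be sent as a parameterized query. The sketch below builds a `{ text, values }` object, which is the shape node-postgres accepts in `client.query()`; using that client is an assumption, and `buildZoneQuery` is a hypothetical helper.

```typescript
// Builds a parameterized variant of the zone check.
// Assumption: the result is passed to a client that accepts { text, values }.
interface ZoneQuery {
  text: string;
  values: [number, number];
}

function buildZoneQuery(latitude: number, longitude: number): ZoneQuery {
  // The negated range checks also reject NaN.
  if (!(latitude >= -90 && latitude <= 90)) throw new RangeError("latitude out of range");
  if (!(longitude >= -180 && longitude <= 180)) throw new RangeError("longitude out of range");
  return {
    text:
      "SELECT EXISTS (" +
      "SELECT 1 FROM empowerment_zones ez " +
      "WHERE ST_Contains(ez.geometry, ST_SetSRID(ST_MakePoint($1, $2), 4326))" +
      ") AS is_ez",
    // ST_MakePoint takes x (longitude) first, then y (latitude).
    values: [longitude, latitude],
  };
}
```

Validating the ranges up front catches swapped latitude/longitude arguments for most U.S. addresses, where the longitude falls outside the valid latitude range.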

Automated EZ Detection Pipeline

"Fully automated pipeline that detects when someone signs a WOTC Form 8850, automatically creates coordinate records via database triggers, geocodes addresses using OpenCage API, and determines empowerment zone eligibility."

{
  "is_in_empowerment_zone": false,
  "ez_eligible": false,
  "confidence": 0.5,
  "county_name": "Queens County",
  "determination_method": "PostGIS spatial intersection + OpenCage geocoding",
  "coordinates": {"lat": 40.777214, "lng": -73.808409},
  "processing_time_ms": 956
}

• Database triggers on form_8850_present
• OpenCage API for address → coordinates
• PostGIS ST_Contains for zone intersection
• HUD Empowerment Zone polygon data
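
The geocoding step in that pipeline might look roughly like this. The endpoint and the `results[].geometry` fields follow OpenCage's documented JSON format. The function names, the `limit=1` choice, and the null-on-no-match behavior are assumptions, and the network call itself is left out.

```typescript
// Sketch of the geocoding step; function names and error handling are hypothetical.
interface LatLng {
  lat: number;
  lng: number;
}

interface OpenCageResponse {
  results: Array<{ geometry: LatLng; confidence?: number }>;
}

function buildGeocodeUrl(address: string, apiKey: string): string {
  return (
    "https://api.opencagedata.com/geocode/v1/json" +
    "?q=" + encodeURIComponent(address) +
    "&key=" + encodeURIComponent(apiKey) +
    "&limit=1"
  );
}

// Return the first match's coordinates, or null when nothing was found.
function firstCoordinates(response: OpenCageResponse): LatLng | null {
  if (response.results.length === 0) return null;
  const g = response.results[0].geometry;
  return { lat: g.lat, lng: g.lng };
}
```

The coordinates returned here are what the `ST_Contains` zone check would receive.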

Business Impact

90% reduction

in manual data entry time

95%+ accuracy

vs. 70% with manual entry

$50-100/app

cost savings vs manual

5x faster

application turnaround

Form 8850 Digital Implementation

Production-Ready IRS Form 8850

"Production-ready digital implementation of IRS Form 8850 for tax credit processing companies. Multi-language support (7 languages), digital signature capture, and real-time validation."

• React 18 + Vite + TypeScript
• i18next for 7-language support
• Touch-enabled signature capture
• Real-time field validation
• Accessibility-first design (WCAG 2.1)
• Mobile-responsive layout

EN · ES · FR · HT · KO · RU · ZH

English · Spanish · French · Haitian Creole · Korean · Russian · Chinese
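
Real-time validation and multi-language support fit together when validators return a translation key instead of a message. The sketch below checks a Social Security number that way. `validateSsn` and the key names are hypothetical; the key would be passed to a translation lookup such as i18next's `t()`.

```typescript
// Hypothetical validator: returns a translation key, or null when the value is acceptable.
// Key names are invented for illustration.
function validateSsn(input: string): string | null {
  const digits = input.replace(/[-\s]/g, "");
  if (digits.length === 0) return "errors.ssn.required";
  if (!/^\d{9}$/.test(digits)) return "errors.ssn.format";
  const area = digits.slice(0, 3);
  const group = digits.slice(3, 5);
  const serial = digits.slice(5);
  // The SSA does not issue area 000, 666, or 900-999, group 00, or serial 0000.
  if (area === "000" || area === "666" || area.charAt(0) === "9") return "errors.ssn.invalid";
  if (group === "00" || serial === "0000") return "errors.ssn.invalid";
  return null;
}
```

Because the validator knows nothing about languages, the same check serves all seven locales.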

AI/ML Extraction Engine

Multi-Model Competition

"Competition between Claude Sonnet 4 and Gemini 2.5 Flash-Lite on test PDFs: each model produces JSON output, and accuracy is compared on 1) unique applicants in the PDF, 2) correct page numbers for the 8850 and the questionnaire, 3) all responses extracted accurately."

• GPT-4 for document extraction
• Gemini 2.5 for form classification
• Claude for verification prompts
• Confidence scoring per field
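
A comparison like the one quoted above needs a scoring function that checks each model's JSON against a hand-labeled expected output. This is a minimal sketch of the three criteria. The `ExtractionOutput` shape and exact-match scoring are assumptions, not the harness actually used.

```typescript
// Hypothetical scoring harness for the three criteria in the quote above.
interface ExtractionOutput {
  applicantCount: number; // unique applicants found in the PDF
  pages: { form8850: number[]; questionnaire: number[] };
  responses: Record<string, string>; // field name -> extracted answer
}

interface Score {
  applicantsCorrect: boolean;
  pagesCorrect: boolean;
  responseAccuracy: number; // share of expected responses matched exactly, 0..1
}

function sameNumbers(a: number[], b: number[]): boolean {
  return a.length === b.length && a.every((n, i) => n === b[i]);
}

function scoreOutput(actual: ExtractionOutput, expected: ExtractionOutput): Score {
  const keys = Object.keys(expected.responses);
  const matched = keys.filter((k) => actual.responses[k] === expected.responses[k]).length;
  return {
    applicantsCorrect: actual.applicantCount === expected.applicantCount,
    pagesCorrect:
      sameNumbers(actual.pages.form8850, expected.pages.form8850) &&
      sameNumbers(actual.pages.questionnaire, expected.pages.questionnaire),
    responseAccuracy: keys.length === 0 ? 1 : matched / keys.length,
  };
}
```

Running `scoreOutput` once per model on the same expected output gives directly comparable scores.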

Field Extraction: 86 Columns

• Applicant PII (name, SSN, DOB)
• Address components
• 8850 checkboxes Q1-Q7
• Questionnaire responses
• Signatures & dates
• EZ/TEZ eligibility

The Insight

"Simple > Smart. No AI, no fancy extraction. Just audio + big buttons. The key insight: applicants weren't clicking 'No' because they weren't eligible. They were clicking 'No' because they didn't understand the question."

"$5K per wrong answer changes how you think about forms. Every input field is a potential failure point."

Use Cases

Payroll Services

Process WOTC for client employees

Staffing Agencies

High-volume applicant screening

HR Departments

Internal tax credit capture

Tax Credit Consultants

Multi-client service bureau
