Principal AI Systems Architect
Platform-Scale LLM Infrastructure & Workflow Engineering
Why This Role Exists
Rapid Alpha is executing a 2026 transition:
From: Analyst-heavy delivery
To: Platform-leveraged execution
Our objective is clear:
Increase margin, throughput, and reliability without increasing analyst headcount.
EVOS is not designed to eliminate human expertise.
It is designed to encode expert reasoning into systems that are testable, auditable, and reusable — so humans can focus on interpretation, judgment, and decision-making.
You are not joining to experiment.
You are joining to scale our ability to deliver results for our clients.
This role directly impacts platform profitability.
The System You Will Inherit
You are not starting from scratch. You will step into an existing, revenue-generating environment:
- Laravel-based SaaS application (web layer)
- Python services handling embeddings, indexing, and classification
- PostgreSQL database
- AWS infrastructure (EC2, RDS, S3)
- Multi-tenant client usage
- Analysts actively using EVOS to deliver strategic outputs
- Real client demand requiring scalable execution
Current constraints:
- AI jobs can compete with application workloads
- Bulk document processing needs stronger async isolation
- Analyst workflows are partially systematized but not fully decomposed
- Scaling requires architectural discipline
The foundation exists. Your responsibility is to professionalize and scale it.
What You Will Own (Non-Negotiable)
1. Encode Expert Reasoning Into Systems
You will work directly with analysts and the CEO to:
- Decompose decades of strategic experience into structured logic
- Convert reasoning patterns into multi-step AI workflows
- Design retrieval-first classification systems
- Implement structured outputs that support executive decision-making
Your work reduces cognitive repetition and increases platform throughput.
2. Drive Platform Margin Through Architecture
Your impact is measurable in P&L terms.
You will:
- Reduce manual analyst time per engagement
- Increase client concurrency capacity
- Lower cost-per-analysis via intelligent retrieval design
- Eliminate fragile, ad hoc prompt behavior
- Reduce token waste and redundant embedding work
Every workflow you systematize improves delivery leverage.
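As one illustration of the "reduce redundant embedding" goal: content-hash caching ensures a document is embedded at most once per model. This is a minimal sketch, not the EVOS implementation; the dict-like `cache` and the provider `embed_fn` are hypothetical stand-ins.

```python
import hashlib


def content_key(text: str, model: str) -> str:
    """Deterministic cache key: same text + same model => same embedding."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{model}:{digest}"


def embed_with_cache(texts, model, cache, embed_fn):
    """Embed only texts not already cached; return embeddings in input order.

    `cache` is any dict-like store (e.g. Redis behind a thin wrapper);
    `embed_fn` is the provider call that embeds a batch of strings.
    """
    keys = [content_key(t, model) for t in texts]
    missing = [(i, t) for i, (k, t) in enumerate(zip(keys, texts)) if k not in cache]
    if missing:
        new_vectors = embed_fn([t for _, t in missing])  # one batched API call
        for (i, _), vec in zip(missing, new_vectors):
            cache[keys[i]] = vec
    return [cache[k] for k in keys]
```

Re-ingesting an unchanged corpus then costs zero embedding calls, which is where most of the token and API savings come from.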
3. Build Production-Grade AI Systems
You will design systems that:
- Ingest and process thousands of documents
- Support multi-tenant concurrency
- Operate asynchronously
- Remain stable under load
- Produce structured, auditable outputs
- Maintain version history and reproducibility
You are accountable for correctness, stability, and repeatability.
4. Collaborate on Scalable Infrastructure
Working with our Platform Engineer, you will ensure:
- AI workloads are queue-driven and isolated
- Bulk embedding/classification does not impact web stability
- Rate limiting and retry logic are disciplined
- Observability supports diagnosis
- Scaling does not degrade reliability
You own the intelligence layer.
The system must hold under growth.
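The queue-isolation and retry discipline described above can be sketched with asyncio primitives. This is illustrative only; queue sizes, concurrency limits, and the job callables are assumptions, not the production setup, which would more likely sit behind a real broker.

```python
import asyncio
import random


async def run_with_retry(job, *, attempts=4, base_delay=0.5):
    """Retry a flaky async job with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return await job()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the failure
            delay = base_delay * (2 ** attempt) * (0.5 + random.random())
            await asyncio.sleep(delay)


async def worker(queue: asyncio.Queue, limiter: asyncio.Semaphore, results: list):
    """Drain jobs from an isolated queue; the semaphore caps provider concurrency
    so bulk AI work cannot starve the web-facing workload of API budget."""
    while True:
        job = await queue.get()
        try:
            async with limiter:
                results.append(await run_with_retry(job))
        finally:
            queue.task_done()
```

The key property is that backpressure lives in the queue and rate limits live in the semaphore, so each can be tuned without touching job logic.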
What You Must Be Able To Do
You should be able to explain clearly:
- How to classify 5,500+ documents across 100 dimensions without brute-force prompting.
- How to design retrieval systems that minimize API calls.
- How to prevent token explosion and runaway cost.
- How to evaluate classifier accuracy using measurable metrics.
- What fails first in LLM systems under scale.
- How to design workflows so outputs are auditable and version-controlled.
If you cannot articulate trade-offs in detail, this role is not a fit.
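For instance, one common answer to the first question is retrieval-first classification: shortlist candidate dimensions with cheap vector similarity, then send only that shortlist to the LLM instead of all 100 dimensions per document. The sketch below assumes precomputed embeddings; `llm_classify` is a hypothetical stand-in for the model call.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def shortlist_dimensions(doc_vec, dim_vecs, k=5):
    """Rank all dimension definitions by similarity; keep only the top k.

    dim_vecs: {dimension_name: embedding}. With 100 dimensions and k=5,
    the LLM sees 5 candidates per document instead of 100.
    """
    ranked = sorted(dim_vecs, key=lambda name: cosine(doc_vec, dim_vecs[name]),
                    reverse=True)
    return ranked[:k]


def classify(doc_text, doc_vec, dim_vecs, llm_classify, k=5):
    """Two-stage classification: cheap retrieval, then one narrow LLM call."""
    candidates = shortlist_dimensions(doc_vec, dim_vecs, k)
    return llm_classify(doc_text, candidates)  # LLM chooses among candidates only
```

At scale the similarity stage would run in a vector store (e.g. PGVector) rather than in Python, but the cost structure is the same: per-document LLM tokens grow with k, not with the total number of dimensions.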
Required Experience
- 5–10+ years total engineering experience
- 2+ years building production LLM/NLP systems
- Strong Python (async programming, concurrency, APIs)
- Experience with vector databases (PGVector, Pinecone, Weaviate, etc.)
- Experience designing RAG pipelines
- Experience processing large document sets programmatically
- Familiarity with AWS environments
- Experience collaborating across product and engineering functions
Formal degrees are optional; production experience is not.
What You Do NOT Own
- Frontend feature development
- Client-specific customization work
- General DevOps administration
- One-off prompt experiments
- Marketing AI content creation
This role is about system architecture, not experimentation.
What Success Means in 2026 Terms
Success is not “interesting prompts.”
Success is:
- Measurable reduction in analyst manual load
- Stable concurrent client execution
- Workflows that can be saved, rerun, versioned, and audited
- Increased margin without increasing delivery headcount
- A system that scales execution logic across clients
You are building execution leverage.
What Success Looks Like (First 90 Days)
By Day 30
- Clear decomposition of analyst workflows
- Architecture for scalable ingestion and classification defined
- Prototype pipeline running in staging
By Day 60
- At least two workflows automated end-to-end
- Bulk document processing stable and asynchronous
- Structured outputs versioned and stored reliably
By Day 90
- Measurable reduction in manual analyst workload
- Stable AI execution under concurrent client usage
- Clear documentation of workflow logic and evaluation metrics
Compensation & Benefits
- 100% remote
- Indian national holidays + 3 company shutdown weeks
- Full benefits
- Direct collaboration with founder and senior strategists
- High ownership and high accountability
Compensation reflects senior-level expectations.
About Rapid Alpha
Rapid Alpha helps mid-sized companies align Vision, Focus, and Results (VFR) into repeatable execution systems.
EVOS is not a chatbot.
It is a structured execution platform that encodes expert reasoning into reusable AI workflows.
Our 2026 objective is simple:
Scale results delivery without scaling headcount.
This role is central to that mission.
How to Apply
Send:
- A brief description of one production AI/LLM system you personally designed.
- A short explanation (max 500 words) of how you would architect a system to process 5,000+ research documents and classify them across multiple dimensions.
- Your CV and expected CTC.
Applications without this information will not be considered.