Who is Prasanth Padharthi?

Prasanth Padharthi is the Co-Founder and CTO of HONO AI, with 16 years of experience in enterprise software development. He is the architect of the world's first Headless HRMS and previously led the Innovation Lab at Ramco Systems in Singapore. He was elevated to Co-Founder in October 2025 and currently leads an engineering team of 40+.

What is a Headless HRMS?

A Headless HRMS is a revolutionary HR management system architecture where the entire backend is decoupled from any fixed interface. Users interact through natural language conversation, and an Intelligent Execution Engine dynamically selects and fires APIs based on intent. HONO AI Zero UI is the world's first implementation of this concept, built on Python, FastAPI, and Transformers for LLMs, with AWS Bedrock and Anthropic Claude powering the intelligence.

What is Zero UI?

Zero UI is an interface paradigm where software operations are executed through conversation rather than traditional graphical interfaces. Prasanth Padharthi architected HONO AI Zero UI, which launched publicly on May 7, 2026, as the world's first Headless HRMS.

What is HONO AI?

HONO AI is an enterprise HR technology company founded by Mukul Jain that serves 300+ enterprise clients across Asia. The HONO brand is operated by SequelOne Solutions Private Limited. Under Prasanth Padharthi's technical leadership, HONO AI developed the world's first Headless HRMS with embedded Generative AI and Agentic AI capabilities. The platform was rebuilt from a PHP stack to a modern architecture (React 18, Node.js, Apollo GraphQL, AWS VPC) designed to survive for a decade.

Is HONO AI the same company as SequelOne Solutions Private Limited?

Yes. HONO, HONO AI, HONO.AI, Hono AI, and SequelOne Solutions Private Limited all refer to the same company. HONO is the product brand; SequelOne Solutions Private Limited is the registered legal entity. The flagship product is HONO Zero UI, externally branded Era — by HONO. Prasanth Padharthi serves as Co-Founder and CTO; Mukul Jain is the Founder.

What technology stack does Prasanth Padharthi work with?

Prasanth Padharthi works with a modern enterprise stack including React 18, Node.js, Apollo GraphQL, Sequelize ORM, and AWS VPC for the core platform. For AI capabilities, he uses Python, FastAPI, Transformers for LLMs, AWS Bedrock, and Anthropic Claude. He also designed the MCP (Model Context Protocol) server topology for enterprise integrations.

What is Prasanth Padharthi currently working on in 2026?

As of mid-2026, Prasanth Padharthi is supporting HONO's next investment round for global expansion; delivering enterprise AI deployments in the hospitality, energy, and security-services sectors across APAC, the Middle East, Europe, and North America; extending HONO's agentic AI automation into invoice processing and contractor onboarding; leading the Forward Deployed Engineers program; and building HONO's MCP ecosystem for third-party enterprise integrations.


Architecture · Long-form

Why we killed the UI — Inside HONO Zero UI's Architecture

Published · May 2026 · ~12 min read

1. The premise

For 30 years, enterprise HR software competed on interface density. More screens. More fields. More dashboards. HONO Zero UI takes the opposite bet: the interface is not the product. The execution layer is.

If any HR action can be expressed as an intent, and any intent can be routed to an API, then the entire HRMS becomes headless. The UI shrinks to a single conversational surface: one chat window, slash commands, generative components rendered inline. Everything else (payroll, leave, attendance, payslips, recruitment, reimbursement, holidays, employee directory) becomes a backend capability the execution engine composes on demand.

The simple cases (apply leave, check payslip, mark attendance) are not the interesting part. Any transactional chatbot from the last decade could do those. They are intent-to-API mappings.

The harder case looks like this:

Brief me on my team before tomorrow's mid-year reviews. Flag anyone who looks at risk.

That single message has no pre-built workflow. The executor has to resolve “my team” from the org chart, pull each report's last six months of performance reviews, leave and attendance patterns, project allocations, and outstanding 1:1 actions. Then it has to synthesise across all of that and surface the patterns a busy manager is about to miss. Anywhere between eight and a dozen tool calls. Three different ways to define “at risk.” One generative UI card per direct report, rendered inline.

No transactional chatbot can do that. It requires multi-step planning, conditional tool selection, mid-pipeline reasoning, and the willingness to make a judgment call the user can still override. That is the bar the architecture in the rest of this piece has to clear.

2. The stack at a glance

Zero UI runs on a deliberately small, modern stack — every choice serves the headless thesis.

Frontend: React 19, TypeScript, Vite, TailwindCSS v4, PWA-ready
Backend: Express.js, CopilotKit v2 runtime (AG-UI protocol), monolithic by design for deployment simplicity
Models: Anthropic Claude (Haiku 4.5, Sonnet 4.6, Opus 4.7) as the primary stack, with 4o/5o-class models accessed through Azure AI and AWS Bedrock, plus self-hosted open-source LLMs. All routed per-tenant from the admin settings.
Data: PostgreSQL (Neon serverless) via Prisma ORM
Observability: Langfuse for end-to-end AI traces, an internal journey bus for in-app step visibility, an LLM cost dashboard at /admin/llm-costs
Embeddings: A small-dimension text embedding model (1.5K-dim) for tool retrieval and the HONO Any-API catalog (~1,429 GraphQL operations indexed). Same options as above: hosted via Azure or AWS Bedrock per-tenant, or self-hosted.

The product is branded Era — by HONO externally; the engineering repo retains the technical identifier HONO_ZERO_UI.

Before the deep-dive, here's the whole flow on one page. Each block below the message line is unpacked in the sections that follow.

Request flow · one user message, end to end

User message: “Brief me on my team for tomorrow's mid-year reviews. Flag anyone at risk.”
Tier 1 · Classifier (lightweight model): returns intent + confidence + suggested tools. cache_control: ephemeral · 5-min TTL · ~90% off cached prefix.
Hybrid tool routing (static intent map or embedding RAG): narrow intents (≤5 tools) → static allowlist; broad intents (≥6 tools) → top-K retrieval. Filtered by tenant.enabled_hrms_modules.
Tier 2 · Executor (efficient model, CopilotKit runtime): plans the tool sequence and invokes it through the action layer.
Action layer (defineTool wrappers): 45s hard timeout · date validation · OTP gating · auto auth injection.
HONO backend (GraphQL / REST): the same APIs the rest of the platform uses; Zero UI is just another caller.
Response: generative UI rendered inline, plus follow-up suggestions for next steps.

External MCP clients (Claude Desktop, Cursor, ServiceNow, Oracle Fusion, HubSpot) enter at the action layer via the MCP topology: same execution engine, different intent surface.

3. Dual-model routing — cheap classifier, smart executor

Most chat turns don't need a frontier model. Looking at a typical HONO conversation:

60–70% are simple lookups — “what's my leave balance?”, “mark me present.” One tool, format the result, done. A lightweight model handles this perfectly.
20–30% are multi-tool synthesis — “morning briefing.” Four tool calls plus a written summary. A mid-tier model is fine.
5–10% need the smart model — policy-nuance reasoning, ambiguous questions where the wrong tool pick costs three retries.

Zero UI runs every request through two tiers:

Tier 1 · Classification

Lightweight model (default Sonnet 4.6, optional Haiku 4.5). Max 256 tokens. Returns intent + confidence + suggested tools. The system prompt is marked cache_control: ephemeral for Anthropic prompt caching — 5-minute TTL, ~90% discount on cached prefix bytes.

Tier 2 · Execution

Efficient model (default Sonnet 4.6, optional Opus 4.7 for complex tenants). Full tool access via the CopilotKit runtime. The tools the classifier suggests inform tool filtering for this layer.

Tenants configure both tiers from the admin Settings page; cache hit/miss telemetry is logged on every request and surfaced in the LLM cost dashboard. The split is feature-flagged so a tenant can fall back to single-model if a workload shape changes.
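A minimal sketch of how a Tier 1 request could be shaped for prompt caching. The `system` block array and `cache_control: { type: "ephemeral" }` follow Anthropic's Messages API; the function name, the prompt text, and the model id parameter are hypothetical and not taken from the HONO codebase. The date goes into a second, unmarked block so the stable prefix stays byte-identical from one request to the next.

```typescript
// Subset of the Anthropic Messages API request shape used by this sketch.
interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface ClassifierRequest {
  model: string;
  max_tokens: number;
  system: SystemBlock[];
  messages: { role: "user"; content: string }[];
}

// Hypothetical stable prefix: identical on every request, so it can be cached.
const STABLE_CLASSIFIER_PROMPT =
  "Classify the user's HR request. Reply with JSON: intent, confidence, suggestedTools.";

function buildClassifierRequest(
  classifierModel: string, // per-tenant model id from admin settings (assumed)
  userMessage: string,
  today: Date,
): ClassifierRequest {
  return {
    model: classifierModel,
    max_tokens: 256, // Tier 1 cap described above
    system: [
      // Everything up to and including this block forms the cacheable prefix.
      { type: "text", text: STABLE_CLASSIFIER_PROMPT, cache_control: { type: "ephemeral" } },
      // The date-interpolated suffix stays outside the cached prefix.
      { type: "text", text: `Today is ${today.toISOString().slice(0, 10)}.` },
    ],
    messages: [{ role: "user", content: userMessage }],
  };
}
```

A real system prompt would need to exceed the provider's minimum cacheable length before any discount applies; the one-line prompt here only illustrates the layout.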

4. Hybrid tool routing — static intent map + embedding RAG

Once intent is classified, the executor needs a focused tool surface. Showing every tool to every request is wasteful (and hurts accuracy — LLMs get worse, not better, with too many tools in scope). Zero UI uses two routing strategies:

Narrow intents (≤5 tools) — direct hit via a static intent → tools allowlist. “Apply leave” deterministically resolves to the three or four tools needed. Zero embedding cost, instant.
Broad intents (≥6 tools) — embedding-based top-K retrieval. The full tool corpus is embedded once at boot into an in-memory Map<toolName, vector>; queries are embedded on the fly and matched by cosine similarity.
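The two strategies can be sketched as a single routing function. This is an illustration under stated assumptions, not the production router: the intent key, tool names, and `routeTools` signature are hypothetical, and the vectors are tiny stand-ins for real embeddings.

```typescript
type Vec = number[];

// Hypothetical static allowlist for narrow intents.
const STATIC_INTENT_TOOLS: Record<string, string[]> = {
  apply_leave: ["get_leave_balance", "list_leave_types", "apply_leave"],
};

// Cosine similarity between two vectors of equal length.
function cosine(a: Vec, b: Vec): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function routeTools(
  intent: string,
  queryVec: Vec,
  toolVectors: Map<string, Vec>, // embedded once at boot
  topK: number,
): string[] {
  const allowlist = STATIC_INTENT_TOOLS[intent];
  if (allowlist !== undefined) return allowlist; // narrow intent: no embedding cost

  // Broad intent: rank every tool by similarity and keep the top K.
  return Array.from(toolVectors.entries())
    .map(([toolName, vec]) => ({ toolName, score: cosine(queryVec, vec) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, topK)
    .map((entry) => entry.toolName);
}
```

A linear scan over an in-memory map is usually adequate for a corpus of this size; an approximate-nearest-neighbour index would only matter at a much larger scale.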

Both paths run side-by-side in a shadow-mode evaluator that logs which tools each strategy selected. An admin dashboard at /admin/rag-evaluation compares them so we can tighten the cutover threshold based on production data, not guesses.

Layered on top: per-tenant module gating. The enabled_hrms_modules setting filters which tool modules ship to the LLM at all — a hospitality tenant doesn't see manufacturing tools, a payroll-only tenant doesn't see recruitment tools. Module-agnostic tools (marked module: '*') always pass.
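Module gating reduces to a filter applied before any tool reaches the model. A minimal sketch, assuming each tool carries a `module` string and the tenant setting is a plain list of module names; the `ToolSpec` shape and function name are hypothetical.

```typescript
interface ToolSpec {
  toolName: string;
  module: string; // "*" marks a module-agnostic tool
}

// Keep only tools whose module is enabled for the tenant, plus module-agnostic ones.
function gateByModules(tools: ToolSpec[], enabledHrmsModules: string[]): ToolSpec[] {
  const enabled = new Set(enabledHrmsModules);
  return tools.filter((tool) => tool.module === "*" || enabled.has(tool.module));
}
```

For a payroll-only tenant, `gateByModules(allTools, ["payroll"])` would drop recruitment tools while keeping anything marked `"*"`.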

5. The Intelligent Execution Engine

The IEE is what makes Zero UI “headless” rather than “a chatbot wrapper around a HRMS.” It composes tool calls, validates inputs, talks to the underlying HONO GraphQL/REST backend, and renders structured results back to the conversation.

Authoring a new capability follows a thin-wrapper pattern. The bulk of HONO operations are GraphQL queries with a single shape — we describe them declaratively via a defineTool factory that bakes in: employee-code gating, per-request throttling, GraphQL invocation, default response wrapping, follow-up suggestion generation, and catch-all error handling. Complex tools that need OTP gating, multi-step orchestration, or external HTTP stay hand-written.
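The thin-wrapper idea can be illustrated with a small factory. This is a sketch of the pattern only: the real defineTool signature is not shown in this piece, so the `ToolDefinition` fields, the injected `runGraphQL` executor, and the result shape are all assumptions, and per-request throttling is left out for brevity.

```typescript
interface ToolContext {
  employeeCode?: string;
  // Injected GraphQL executor; stands in for the real backend client.
  runGraphQL: (query: string, variables: Record<string, unknown>) => Promise<unknown>;
}

type ToolResult =
  | { ok: true; data: unknown; suggestions: string[] }
  | { ok: false; error: string };

interface ToolDefinition {
  toolName: string;
  query: string;
  suggestions?: string[];
}

function defineTool(def: ToolDefinition) {
  return {
    toolName: def.toolName,
    async handler(ctx: ToolContext, variables: Record<string, unknown>): Promise<ToolResult> {
      // Employee-code gating: refuse before touching the backend.
      if (!ctx.employeeCode) return { ok: false, error: "Missing employee code" };
      try {
        const data = await ctx.runGraphQL(def.query, {
          ...variables,
          employeeCode: ctx.employeeCode,
        });
        // Default response wrapping plus follow-up suggestions.
        return { ok: true, data, suggestions: def.suggestions ?? [] };
      } catch (err) {
        // Catch-all: the model receives an error result instead of a thrown exception.
        return { ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    },
  };
}
```

With a factory like this, a new read operation is a few lines of declaration, for example `defineTool({ toolName: "get_leave_balance", query: "...", suggestions: ["Apply leave"] })`, where the query string is whatever the backend already exposes.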

Three safety invariants govern every tool handler:

45-second hard ceiling. The agent stream requires every tool_call to be paired with a tool_result. A hung handler strands the LLM transcript and silently breaks every subsequent message in that thread. The adapter races every handler against a 45-second timeout and synthesises an error result on expiry. Individual network calls have their own shorter, AbortController-driven timeouts inside.
Date validation on every write. LLMs reliably misencode dates (“9th April” → 2026-09-04). Every write action funnels user-supplied dates through a shared validator that enforces format AND business window (e.g. futureDays, pastDays). Successful responses echo a human-readable date back (“Friday, September 4, 2026”) so the renderer can surface it prominently — catches residual errors prompt engineering can't.
OTP gating on sensitive operations. Payslip access and similar high-trust actions route through an otpService before the underlying API is even called. Conversation is the interface; security is not optional because of it.
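The first two invariants can be sketched in a few lines. Both helpers below are hypothetical illustrations: `withHardTimeout` races a handler against a ceiling and synthesises an error result on expiry, and `validateWriteDate` enforces format plus a business window and echoes a human-readable date. The names, result shapes, and the en-US echo format are assumptions.

```typescript
type ToolOutcome = { ok: true; data: unknown } | { ok: false; error: string };

// Race a handler against a hard ceiling so every tool_call gets a paired tool_result.
function withHardTimeout(
  run: () => Promise<ToolOutcome>,
  ceilingMs = 45000,
): Promise<ToolOutcome> {
  return new Promise<ToolOutcome>((resolve) => {
    const timer = setTimeout(
      () => resolve({ ok: false, error: `Tool timed out after ${ceilingMs} ms` }),
      ceilingMs,
    );
    const settle = (outcome: ToolOutcome) => {
      clearTimeout(timer); // do not leave the timer pending once the handler settles
      resolve(outcome);
    };
    run().then(settle, (err: unknown) => settle({ ok: false, error: String(err) }));
  });
}

interface DateWindow {
  pastDays: number;
  futureDays: number;
}

type DateCheck = { ok: true; echo: string } | { ok: false; error: string };

function validateWriteDate(isoDate: string, today: Date, dateWindow: DateWindow): DateCheck {
  const parts = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (!parts) return { ok: false, error: "Expected YYYY-MM-DD" };
  const year = Number(parts[1]);
  const month = Number(parts[2]);
  const day = Number(parts[3]);
  const candidate = new Date(Date.UTC(year, month - 1, day));
  // Reject impossible dates such as 2026-02-30, which Date would silently roll over.
  if (candidate.getUTCMonth() !== month - 1 || candidate.getUTCDate() !== day) {
    return { ok: false, error: "Not a real calendar date" };
  }
  const todayUtc = Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate());
  const diffDays = Math.round((candidate.getTime() - todayUtc) / 86400000);
  if (diffDays > dateWindow.futureDays || diffDays < -dateWindow.pastDays) {
    return { ok: false, error: "Date is outside the allowed window" };
  }
  // Echo a human-readable form so the renderer can surface it prominently.
  const echo = candidate.toLocaleDateString("en-US", {
    weekday: "long",
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  });
  return { ok: true, echo };
}
```

With a 30-day future window and a request made on 1 April 2026, a misencoded `2026-09-04` is rejected outright, while the intended `2026-04-09` passes and is echoed back for the user to confirm.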

6. MCP topology — the external surface

Internally, every HONO subsystem already speaks the same tool vocabulary. The MCP layer exposes that vocabulary to external AI assistants — Claude Desktop, Cursor, Zapier, any MCP-compliant client — so they can act on HONO directly.

Two URL shapes are supported, mounted on the same router:

Path-based: /api/:tenant/mcp — tenant resolved from the path parameter.
Subdomain-based: <tenant>.mcp.hono.ai/mcp — tenant resolved from the host header. Cleaner to paste into external clients; the preferred form. Slug rules: exactly one level, case-insensitive, tight regex, reserved-subdomain list for confused-deputy safety.
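Tenant resolution for the two URL shapes might look like the sketch below. The slug regex and the reserved list are assumptions, since the actual rules are not given here, and applying the same slug rules to the path-based form is also an assumption; `resolveTenant` is a hypothetical name.

```typescript
// Assumed slug rule: lowercase alphanumerics and hyphens, no dots (exactly one level).
const TENANT_SLUG = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/;
// Assumed reserved subdomains that must never resolve to a tenant.
const RESERVED_SLUGS = new Set(["www", "api", "admin", "mcp"]);
const MCP_HOST_SUFFIX = ".mcp.hono.ai";

function resolveTenant(hostHeader: string, urlPath: string): string | null {
  // Drop any port and normalise case (slugs are case-insensitive).
  const host = (hostHeader.split(":")[0] ?? "").toLowerCase();
  let slug: string | null = null;

  if (host.endsWith(MCP_HOST_SUFFIX) && urlPath === "/mcp") {
    // Subdomain form: whatever precedes the suffix must be a single label.
    slug = host.slice(0, -MCP_HOST_SUFFIX.length);
  } else {
    // Path form: /api/:tenant/mcp
    const match = /^\/api\/([^/]+)\/mcp$/.exec(urlPath);
    if (match) slug = (match[1] ?? "").toLowerCase();
  }

  if (slug === null || !TENANT_SLUG.test(slug) || RESERVED_SLUGS.has(slug)) return null;
  return slug;
}
```

Returning `null` for anything that fails the slug check keeps the resolver fail-closed: a multi-level host such as `a.b.mcp.hono.ai` or a reserved name never maps to a tenant.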

Each MCP API key has a read_only flag. Classification is fail-closed by name prefix — get_, list_, fetch_, search_, view_, preview_ are read; everything else is mutating by default. New actions default to “protected” until a reviewer explicitly opts them into the read scope. This is layered on top of the tenant module filter, so an external read-only key sees the intersection of (tenant's enabled modules) ∩ (read-scope tools).
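The scope check and the intersection with module gating can be sketched as two small functions. The prefix list comes from the paragraph above; the `McpTool` shape, the `reviewedReadTools` opt-in set, and the function names are hypothetical, and treating reviewer opt-in as an explicit allowlist is one reading of the rule.

```typescript
const READ_PREFIXES = ["get_", "list_", "fetch_", "search_", "view_", "preview_"];

interface McpTool {
  toolName: string;
  module: string; // "*" marks a module-agnostic tool
}

// Fail-closed: a tool is read-scoped only by prefix or explicit reviewer opt-in.
function isReadScoped(toolName: string, reviewedReadTools: Set<string>): boolean {
  return (
    READ_PREFIXES.some((prefix) => toolName.startsWith(prefix)) ||
    reviewedReadTools.has(toolName)
  );
}

// Tools visible to a key: (tenant's enabled modules) ∩ (read scope, if the key is read-only).
function visibleTools(
  tools: McpTool[],
  enabledModules: string[],
  keyReadOnly: boolean,
  reviewedReadTools: Set<string> = new Set<string>(),
): string[] {
  const enabled = new Set(enabledModules);
  return tools
    .filter((tool) => tool.module === "*" || enabled.has(tool.module))
    .filter((tool) => !keyReadOnly || isReadScoped(tool.toolName, reviewedReadTools))
    .map((tool) => tool.toolName);
}
```

Because unknown names fall through to the mutating side, a newly added `export_payroll` tool stays hidden from read-only keys until someone adds it to the reviewed set.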

7. The cost levers — making AI-native economics work

Conversational HRMS sounds expensive until you do the math carefully. Four levers are wired into Zero UI:

Lever	Effect
Prompt caching	~90% off cached reads on Anthropic (opt-in via `cache_control`); 50% off matched input on 4o-class providers (automatic, ≥1,024-token stable prefix). Both achieved by moving the date-interpolated block out of the cached prefix into a small dynamic suffix.
Two-tier model routing	Cheap classifier handles ~60% of turns at a fraction of the execution model's rate.
Budget + alerting	Monthly cap per tenant with 80% / 100% alerts. Visible in the admin LLM cost dashboard with per-day, per-user, per-model breakdowns.
Prompt & tool description diet	A `prompt:audit` script that surfaces stable prefix size and per-tool token cost; a `COMPACT_TOOL_DESCRIPTIONS` flag that trims schemas the LLM doesn't need at runtime.

Combined effect on a typical repeat-user request through a 4o-class model:

Baseline: ~15,000 tokens × $2.50/M = $0.0375/request
+Caching + base diet: ~14,000 tokens, ~13,800 cached at 50% = ~$0.018/request (-53%)
+Compact descriptions: ~9,000 tokens, ~8,800 cached = ~$0.012/request (-68%)
+Two-tier (cheap model handles 60% of turns at 1/10 price) = ~$0.005/request (-87%)

Numbers are illustrative of what the architecture makes possible; actual tenant costs vary by traffic mix and model choice.
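The walk-through above can be reproduced with a few lines of arithmetic. This sketch counts input tokens only, uses the $2.50/M rate and 50% cached discount from the 4o-class example, and models the two-tier step as a weighted average; the function names are hypothetical, and small differences from the rounded figures in the text are expected.

```typescript
const PRICE_PER_MTOK = 2.5; // USD per million input tokens (baseline line above)
const CACHED_DISCOUNT = 0.5; // 50% off matched input on 4o-class providers

// Cost of one request when part of the input is served from the prompt cache.
function requestCost(totalTokens: number, cachedTokens: number): number {
  const freshTokens = totalTokens - cachedTokens;
  const cachedRate = PRICE_PER_MTOK * (1 - CACHED_DISCOUNT);
  return (freshTokens * PRICE_PER_MTOK + cachedTokens * cachedRate) / 1000000;
}

// Blend: a share of turns is served at a fraction of the execution model's price.
function twoTierCost(costPerRequest: number, cheapShare: number, cheapPriceRatio: number): number {
  return costPerRequest * (1 - cheapShare) + costPerRequest * cheapShare * cheapPriceRatio;
}

const baselineCost = requestCost(15000, 0); // ≈ 0.0375
const cachedDietCost = requestCost(14000, 13800); // ≈ 0.0178
const compactCost = requestCost(9000, 8800); // ≈ 0.0115
const blendedCost = twoTierCost(compactCost, 0.6, 0.1); // ≈ 0.0053

// Percentage saved relative to the baseline, rounded to a whole number.
function savingPct(cost: number): number {
  return Math.round((1 - cost / baselineCost) * 100);
}
```

Output tokens and cache-write surcharges are ignored here, so real invoices will differ; the point is only that the three levers multiply rather than add.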

8. What's next

Any-API Phase 2 — invoke_hono_api. Phase 1 (read-only discovery) is live — the LLM can search the ~1,429-row GraphQL catalog and get back an HMAC-signed, single-use, 5-minute-TTL endpoint envelope. Phase 2 wires the actual invocation, opening the long tail of operations the curated tool set doesn't cover.
Anthropic caching on the execution layer. CopilotKit's built-in agent doesn't yet expose a message-level hook for cache_control. We'll either subclass the agent or contribute upstream.
Multimodal intents. Voice (phone-calling architecture already in place), document upload as input, on-screen context as input — same execution engine, different intent surface.
Third-party MCP marketplace. External systems already plug in. The next step is a curated catalog so any enterprise can participate in the conversation by exposing an MCP server, without bespoke integration work.

The interface is not the product. The execution layer is.

Everything above is in production. Some of it is bleeding edge. All of it is open to feedback.

