Nepheli
Case Study, Oct 2025

Scalable Infrastructure for a Growing 200+ Employee Enterprise

From sprawling manual infrastructure to self-service platform engineering.

The challenge

A rapidly growing enterprise with over 200 employees had outgrown its original infrastructure setup. What began as a simple AWS deployment had evolved into a sprawling, undocumented environment with inconsistent naming conventions, no tagging strategy, overlapping security groups, and multiple single points of failure.

The platform team of three was overwhelmed. Provisioning a new service took two weeks of manual work, and the team had a backlog of 30+ pending infrastructure requests. HashiCorp's 2025 Cloud Complexity Report found that only 8 percent of organizations qualify as "highly mature" in cloud operations — and 56 percent cite hybrid and multi-cloud complexity as a leading challenge. This company was squarely in the struggling majority.

Assessment and planning

Nepheli conducted a comprehensive infrastructure assessment using our knowledge-graph methodology, cataloguing every resource, dependency, and configuration pattern. The assessment revealed 47 orphaned resources consuming $8,200 per month, 12 single points of failure in the production path, and 23 security group rules that were either redundant or overly permissive.
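The details of the knowledge-graph methodology are not described here, but the idea of finding orphaned resources can be sketched as a reachability check: model resources as nodes, "uses" relationships as edges, and flag anything that no production entry point reaches. The resource names, costs, and the find_orphans helper below are hypothetical and purely illustrative; they are not data from this engagement.

```python
from collections import defaultdict


def find_orphans(resources, edges, roots):
    """Return ids of resources that no entry point reaches, sorted by id."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)

    # Depth-first traversal starting from the known entry points.
    reachable = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in reachable:
            continue
        reachable.add(node)
        stack.extend(graph[node] - reachable)

    return sorted(rid for rid in resources if rid not in reachable)


# Hypothetical inventory: resource id -> monthly cost in whole dollars.
resources = {
    "alb-web": 25,
    "svc-api": 140,
    "db-main": 310,
    "vol-old-backup": 48,
    "eip-unused": 4,
}
# Hypothetical "uses" edges discovered while cataloguing the environment.
edges = [("alb-web", "svc-api"), ("svc-api", "db-main")]

orphans = find_orphans(resources, edges, roots=["alb-web"])
monthly_waste = sum(resources[rid] for rid in orphans)
print(orphans, monthly_waste)  # ['eip-unused', 'vol-old-backup'] 52
```

In practice a flagged resource is only a candidate for removal; something unreachable in the graph may still be in use through a dependency the catalogue missed.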

We developed a phased remediation plan: quick wins (immediate cost savings and security fixes), platform modernization (Kubernetes migration), and self-service enablement (developer portal and automated provisioning). Gartner predicts that by 2026, 80 percent of large software engineering organizations will have platform engineering teams — up from 45 percent in 2022 — and this engagement was designed to put the company ahead of that curve.

Platform modernization

We introduced a standardized account structure using AWS Organizations with service control policies enforcing security and compliance baselines. Workloads were migrated to a Kubernetes-based platform with automated scaling, self-healing capabilities, and namespace-level resource isolation per team.

A tagging taxonomy aligned with the company's cost-center model was implemented and enforced through policy-as-code, enabling accurate cost allocation by team, project, and environment for the first time. With 59 percent of organizations now having dedicated FinOps teams according to Flexera, cost visibility has become a baseline expectation rather than a nice-to-have.
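As a rough illustration of what a policy-as-code tag check might look like, the sketch below validates a resource's tags against a required set. The tag keys, the allowed environment values, and the tag_violations function are assumptions made for this example; the case study does not specify the actual taxonomy or the policy engine that was used.

```python
# Hypothetical taxonomy: tag key -> allowed values (None means any non-empty value).
REQUIRED_TAGS = {
    "cost-center": None,
    "team": None,
    "project": None,
    "environment": {"dev", "staging", "prod"},
}


def tag_violations(tags):
    """Return a list of human-readable problems with a resource's tags."""
    problems = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key, "").strip()
        if not value:
            problems.append(f"missing tag: {key}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid value for {key}: {value!r}")
    return problems


compliant = {
    "cost-center": "cc-104",
    "team": "payments",
    "project": "checkout",
    "environment": "prod",
}
print(tag_violations(compliant))  # []
print(tag_violations({"team": "payments", "environment": "qa"}))
```

A check like this would typically run in a CI pipeline or an admission step so that untagged resources are rejected before they are created, rather than discovered later in a cost report.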

Self-service enablement

The final phase introduced a self-service developer portal built on Backstage. Gartner predicts that by 2028, 85 percent of organizations with platform teams will provide internal developer portals. Some industry research suggests that organizations with mature internal developer portals see cycle time reductions of up to 50 percent.

The portal offers pre-approved templates for common service types (REST APIs, event consumers, scheduled jobs, and static sites), each with built-in monitoring, logging, and alerting configured by default. New services can now be provisioned in under 15 minutes through the portal, compared with the previous two-week manual process.
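The "secure and observable by default" idea behind such templates can be sketched in a few lines. This is not Backstage's template format or API (Backstage defines its own software templates); ServiceSpec, render_manifest, and the manifest layout are hypothetical names used only to show how defaults can be baked into a template so that requesters get monitoring, logging, and alerting without asking for them.

```python
from dataclasses import dataclass

# Service types named in the case study; the identifiers are illustrative.
SERVICE_TYPES = {"rest-api", "event-consumer", "scheduled-job", "static-site"}


@dataclass
class ServiceSpec:
    """What a developer fills in; observability is on unless explicitly disabled."""
    name: str
    service_type: str
    owner_team: str
    monitoring: bool = True
    logging: bool = True
    alerting: bool = True


def render_manifest(spec):
    """Turn a request into a provisioning manifest (hypothetical layout)."""
    if spec.service_type not in SERVICE_TYPES:
        raise ValueError(f"no approved template for {spec.service_type!r}")
    return {
        "name": spec.name,
        "template": spec.service_type,
        "owner": spec.owner_team,
        "observability": {
            "monitoring": spec.monitoring,
            "logging": spec.logging,
            "alerting": spec.alerting,
        },
    }


manifest = render_manifest(ServiceSpec("orders", "rest-api", "payments"))
print(manifest["observability"])
```

Restricting requests to an approved set of service types is what lets provisioning run unattended: anything outside the catalogue fails fast instead of landing in the platform team's queue.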

The results

Infrastructure costs decreased by 40 percent through right-sizing, resource consolidation, and improved utilization from Kubernetes bin-packing. This is the same knowledge-graph-powered cost optimization approach that underpins Nepheli's Hermeez platform. Service provisioning time fell from two weeks to under 15 minutes. The platform team's role shifted from fulfilling individual requests to maintaining the platform itself.
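To see why bin-packing improves utilization, consider a minimal first-fit-decreasing sketch that places workload CPU requests onto as few nodes as possible. This is a textbook heuristic used for illustration, not the Kubernetes scheduler's actual algorithm, and the request sizes and node capacity are made-up numbers rather than figures from this engagement.

```python
def first_fit_decreasing(requests, node_capacity):
    """Pack requests onto nodes, largest first; return one list of requests per node."""
    nodes = []  # requests placed on each node
    free = []   # remaining capacity on each node
    for request in sorted(requests, reverse=True):
        if request > node_capacity:
            raise ValueError(f"request {request} exceeds node capacity")
        for index, remaining in enumerate(free):
            if request <= remaining:
                nodes[index].append(request)
                free[index] -= request
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
            free.append(node_capacity - request)
    return nodes


# Hypothetical CPU requests in millicores, packed onto 4000m nodes.
requests = [500, 1500, 2000, 1000, 2500, 500]
placement = first_fit_decreasing(requests, node_capacity=4000)
print(placement)       # [[2500, 1500], [2000, 1000, 500, 500]]
print(len(placement))  # 2 nodes instead of one instance per workload
```

In this toy example six workloads that might previously have run on six separately sized instances fit on two fully used nodes, which is the mechanism behind the utilization gain described above.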

The company has since grown to over 350 employees without expanding the platform team beyond four engineers. The self-service model scaled naturally with headcount, and the standardized infrastructure patterns reduced onboarding time for new engineers by over 60 percent. The CTO described the transformation as "removing the ceiling on our ability to ship."