Viswanathan Ranganathan
Title of the Talk:
Git for Data: Versioned Dataset Distribution in Distributed Systems
Abstract of the Talk:
In distributed systems, large reference datasets—product catalogs, feature stores, configuration data, model metadata, pricing tables—must be served at microsecond latency across multiple application instances. Traditional local caching approaches break down at gigabyte scale: cold starts during deployments take minutes, memory spikes from full reloads trigger garbage collection pauses, and cache stampedes overwhelm source systems when TTLs expire simultaneously.
This keynote introduces a different paradigm: treating datasets as immutable, versioned snapshots with delta-based distribution—essentially applying Git’s model to in-memory data. Rather than invalidating caches, we version them. Rather than reloading gigabytes, we apply kilobyte-sized deltas. Producers and consumers are completely decoupled through a simple blob store interface.
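As a rough illustration of the idea, the sketch below models a producer that publishes numbered snapshots plus deltas to a blob store, and a consumer that catches up by applying only the deltas it has missed. Everything here is hypothetical and chosen for illustration: the names BlobStore, Producer, Consumer, and Delta, the in-memory dictionaries standing in for a real blob store, and the string key/value records. It is not Hollow's API, and a production library would likely use far more compact data structures than Python dicts.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Delta:
    """Changes needed to move a dataset from one version to the next."""
    from_version: int
    to_version: int
    upserts: Dict[str, str]      # keys that were added or changed
    removals: Tuple[str, ...]    # keys that were deleted


class BlobStore:
    """Hypothetical in-memory stand-in for a blob store (e.g. object storage)."""

    def __init__(self) -> None:
        self.snapshots: Dict[int, Dict[str, str]] = {}  # version -> full dataset
        self.deltas: Dict[int, Delta] = {}              # keyed by from_version
        self.latest = 0                                 # 0 = nothing published yet


class Producer:
    """Publishes each new dataset state as a snapshot and a delta."""

    def __init__(self, store: BlobStore) -> None:
        self.store = store
        self.current: Dict[str, str] = {}

    def publish(self, data: Dict[str, str]) -> int:
        prev = self.store.latest
        version = prev + 1
        self.store.snapshots[version] = dict(data)
        if prev:
            upserts = {k: v for k, v in data.items()
                       if k not in self.current or self.current[k] != v}
            removals = tuple(k for k in self.current if k not in data)
            self.store.deltas[prev] = Delta(prev, version, upserts, removals)
        self.current = dict(data)
        self.store.latest = version
        return version


class Consumer:
    """Holds one version in memory and moves forward version by version."""

    def __init__(self, store: BlobStore) -> None:
        self.store = store
        self.version = 0
        self.data: Dict[str, str] = {}

    def refresh(self) -> int:
        latest = self.store.latest
        if latest == 0:
            return self.version
        if self.version == 0:
            # Cold start: load the most recent full snapshot once.
            self.data = dict(self.store.snapshots[latest])
            self.version = latest
        while self.version < latest:
            delta = self.store.deltas[self.version]
            # Build the next state separately, then swap the reference, so a
            # reader never observes a half-applied version. (Copying the whole
            # dict is a simplification made for this sketch.)
            nxt = dict(self.data)
            nxt.update(delta.upserts)
            for key in delta.removals:
                nxt.pop(key, None)
            self.data, self.version = nxt, delta.to_version
        return self.version


store = BlobStore()
producer = Producer(store)
consumer = Consumer(store)

producer.publish({"sku-1": "9.99", "sku-2": "4.50"})
consumer.refresh()   # cold start: loads the snapshot for version 1

producer.publish({"sku-1": "9.99", "sku-2": "4.75", "sku-3": "1.00"})
producer.publish({"sku-2": "4.75", "sku-3": "1.00"})
consumer.refresh()   # applies two deltas instead of reloading the dataset
```

In this sketch the producer and consumer never call each other; they share only the store, which mirrors the decoupling described above. The consumer never invalidates anything: it either stays on its current version or advances to a later one.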
Drawing from Netflix’s production experience with Hollow—an open-source library implementing this pattern since 2016 to power everything from content recommendations to streaming infrastructure—attendees will learn:
(1) why versioned immutability eliminates traditional cache invalidation problems;
(2) how delta-based distribution reduces network overhead by 16x to 500x;
(3) the architectural decoupling that enables operational simplicity; and
(4) clear decision criteria for when this pattern applies versus traditional caching.
Through live demonstration and real performance comparisons, this talk provides engineers with a production-proven pattern for one of distributed computing’s persistent challenges—efficient large-scale data distribution without operational complexity.
Bio:
Viswanathan Ranganathan is a Senior Distributed Systems Engineer at Netflix, specializing in deployment infrastructure and developer platforms. His core expertise is reliability at scale – building systems that remain dependable whether processing massive data volumes or enabling safe software releases across large engineering organizations.
What distinguishes his approach is product thinking applied to infrastructure: designing technical systems from the perspective of the engineers who use them, rather than optimizing purely for technical metrics. This methodology has driven measurable impact throughout his career – from architecting analytics platforms that served enterprise customers processing over 250K events per second, to designing governance frameworks that reduced developer onboarding from six days to minutes.
His 11-year career spans five technology sectors: healthcare, e-commerce, communications, enterprise software, and streaming. Each domain contributed distinct capabilities – healthcare’s discipline and precision, e-commerce’s adaptability, communications infrastructure’s scale under pressure, enterprise software’s coordination skills, and streaming’s end-to-end planning across large organizations. This cross-domain experience provides perspective on which distributed systems challenges are universal and which are domain-specific.
He holds an M.S. in Computer Science from the University of Houston. He is an active member of the ACM Fremont Chapter and serves as a technical judge for hackathons, evaluating projects on architectural soundness, scalability, and operational maturity.
