PayPal extended its MLOps platform Cosmos.AI to support the development of generative AI applications using large language models (LLMs). The company incorporated support for vendor, open-source, and self-tuned LLMs and provided capabilities around retrieval-augmented generation (RAG), semantic caching, prompt management, orchestration, and AI application hosting.
PayPal conceived the Cosmos.AI Platform around 2020 and made it generally available in mid-2022. The company decided to consolidate the many bespoke and fragmented solutions that teams had previously built independently into a single enterprise platform supporting the end-to-end Machine Learning Development Life Cycle (MLDLC). Since its launch, Cosmos.AI has become the de facto AI/ML platform for the company and has been used by thousands of data scientists, analysts, and developers.
The unified platform provides capabilities comparable to those of cloud-provider offerings such as Amazon SageMaker, Azure Machine Learning, and GCP Vertex AI. Furthermore, Cosmos.AI decouples platform capabilities from their implementations, allowing users to choose between bespoke in-house implementations and those offered by open-source solutions or third-party vendors the platform integrates with. Moreover, the platform supports multi-tenancy and self-service and can operate in multi-cloud and hybrid-cloud environments.
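To illustrate the decoupling idea, here is a minimal sketch using hypothetical names that are not Cosmos.AI's actual interfaces: a capability is defined as a narrow contract, and the platform binds it to an in-house or open-source-backed implementation chosen by configuration.

```python
# Hypothetical capability contract and interchangeable backends; these names are
# illustrative and not part of Cosmos.AI.
from typing import Protocol

class FeatureStore(Protocol):
    """Narrow contract the platform exposes to its users."""
    def put(self, key: str, value: float) -> None: ...
    def get(self, key: str) -> float: ...

class InMemoryFeatureStore:
    """Bespoke in-house implementation."""
    def __init__(self) -> None:
        self._data: dict[str, float] = {}
    def put(self, key: str, value: float) -> None:
        self._data[key] = value
    def get(self, key: str) -> float:
        return self._data[key]

class RedisFeatureStore:
    """Implementation backed by an open-source store (requires the redis client)."""
    def __init__(self, host: str = "localhost") -> None:
        import redis
        self._client = redis.Redis(host=host)
    def put(self, key: str, value: float) -> None:
        self._client.set(key, value)
    def get(self, key: str) -> float:
        return float(self._client.get(key))

def make_feature_store(backend: str) -> FeatureStore:
    # Backend selection is configuration-driven; calling code never changes.
    return RedisFeatureStore() if backend == "redis" else InMemoryFeatureStore()
```

Swapping the backend then becomes a configuration change for the platform rather than a code change for its users.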
Architecture of Cosmos.AI MLOps Platform (Source: PayPal Technology Blog)
PayPal recognized the importance of generative AI and embraced it early on. Building on the flexible and extensible architecture of Cosmos.AI, the company invested in developing GenAI capabilities that leverage LLMs to foster product innovation. Jun Yang, engineering director at PayPal, provides an overview of the company's effort to centralize support for GenAI:
Thanks to the solid foundations we have in place for [the] platform with its remarkable extensibility, we were able to develop a Gen AI horizontal platform on PayPal Cosmos.AI in the span of a few months, allowing PayPal to fully tap into this technology and rapidly scale Gen AI application development across the company, while reducing costs by minimizing duplicated efforts on Gen AI adoptions among different teams.
The company augmented Cosmos.AI's training capabilities to allow fine-tuning of open-source and vendor-hosted LLMs. The model repository was also extended to enable easy onboarding of LLMs from public model gardens such as Hugging Face while supporting legal and licensing checks. Cosmos.AI now also provides LLMOps capabilities, including multi-GPU deployments, LLM optimizations, streaming interfaces, and enhanced logging and monitoring.
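As a rough illustration of what fine-tuning an open-source LLM onboarded from the Hugging Face Hub can look like, the sketch below uses the transformers and peft libraries to train LoRA adapters on a causal language model; the model name, dataset file, and hyperparameters are placeholders rather than anything PayPal has published.

```python
# Minimal LoRA fine-tuning sketch; the model name ("gpt2"), the train.txt file,
# and all hyperparameters are illustrative placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # any causal LM hosted on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach LoRA adapters so only a small number of extra weights are trained.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tiny illustrative dataset; in practice this is a domain-specific corpus.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
    out["labels"] = [ids.copy() for ids in out["input_ids"]]  # causal LM: labels mirror inputs
    return out

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./ft-llm", per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()
model.save_pretrained("./ft-llm/adapter")  # saves adapter weights only, not the base model
```

In a setup like Cosmos.AI's, the resulting adapter weights would then be registered in the model repository and served through the platform's LLMOps deployment path.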
LLMOps Capabilities of Cosmos.AI (Source: PayPal Technology Blog)
To support GenAI applications, PayPal developed a range of new capabilities, including a RAG framework leveraging a vector database and Cosmos.AI pipelines, semantic caching for LLM inference, a prompt management framework, a platform orchestration framework, LLM evaluation tools, and application hosting. Going forward, the team plans to evolve the AI/ML platform further and transition self-service and manual processes into autonomous workflows.
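As an illustration of the RAG pattern mentioned above, the sketch below (not PayPal's framework; the embedding model, sample documents, and the call_llm placeholder are assumptions) embeds documents into a small in-memory vector index, retrieves the chunks most similar to a question, and stitches them into the prompt sent to an LLM.

```python
# Minimal RAG sketch: an in-memory vector index built with sentence-transformers;
# the documents and the call_llm function are illustrative placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def embed(texts: list[str]) -> np.ndarray:
    vecs = encoder.encode(texts)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit vectors for cosine similarity

# Stand-in for a real vector database.
documents = [
    "Refunds are processed within five business days.",
    "Disputes can be opened from the Resolution Center.",
    "Two-factor authentication can be enabled in account settings.",
]
index = embed(documents)

def retrieve(question: str, k: int = 2) -> list[str]:
    scores = index @ embed([question])[0]        # cosine similarity against every document
    top = np.argsort(scores)[::-1][:k]           # indices of the k most similar documents
    return [documents[i] for i in top]

def answer(question: str, call_llm) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)                      # placeholder for the hosted LLM endpoint
```

A semantic cache typically sits in front of a flow like this, returning a stored response when a new prompt is embedding-similar to one already answered instead of invoking the LLM again.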