Cloud Run

Cloud Run is a container platform on Google Cloud, and I'm part of the Cloud Run team at Google. Before joining Google, I wrote the O'Reilly book on it.

Cloud Run at Cloud Next 2025

Get ready for Cloud Next 2025, where Cloud Run takes center stage with sessions covering everything from new features and serverless GPUs to AI agents and high-availability services.
April 07, 2025

Cloud Run Adds GPUs

Google Cloud Run now offers NVIDIA L4 GPUs in public preview, enabling fast, scalable, and cost-effective deployment of AI workloads like large language models with scale-to-zero capabilities.
August 21, 2024

TEQNation 2024: Using RAG and ReACT to Augment Your LLM App

At TEQNation 2024, I demonstrated how to build a powerful LLM app using RAG and ReACT, grounded in over 20 million Hacker News comments, and shared my journey of data scraping, indexing, and implementation with LangChain.
May 03, 2024

Build and Deploy a LangChain App with a Vector Database

Learn to build a Retrieval Augmented Generation (RAG) app with LangChain and Gemini on Cloud Run, using a vector database to answer questions from Cloud Run release notes.
April 27, 2024

Building Generative AI Apps on Google Cloud with LangChain

At Cloud Next 2024, we showed how to build generative AI applications on Google Cloud using LangChain, with insights from LangChain's founder, a demo of L'Oreal's internal GPT, and a hands-on guide to building a LangChain app from scratch.
April 11, 2024

Run Gemma with Ollama on Cloud Run

This sample project deploys Google's Gemma 2B open model on Cloud Run using Ollama for CPU-based inference, a straightforward way to run smaller LLMs in a serverless environment.
February 27, 2024

Book

My O'Reilly book on Google Cloud Run teaches you how to deploy containerized applications on a highly scalable serverless platform, with practical examples for developers, sysadmins, and cloud engineers.
December 16, 2020