318: One Extension to Rule Them All (And in the VS Code Bind Them)


Welcome to episode 318 of The Cloud Pod, where the forecast is always cloudy! We’re going on an adventure! Justin and Ryan have formed a fellowship of the cloud, and they’re bringing you all the latest and greatest news from Valinor to Helm’s Deep, and Azure to AWS to GCP. We’ve got water issues, some Magic Quadrants, and Aurora updates…but sadly no potatoes. Let’s get into it!

Titles we almost went with this week:

  • You’ve Got No Mail: AOL Finally Hangs Up on Dial-Up
  • Ctrl+Alt+Delete Climate Change
  • H2-Oh No: Your Gmail is Thirsty
  • The Price is Vibe: Kiro’s New Request-Based Model
  • Spec-tacular Pricing: Kiro Leaves the Waitlist Behind
  • SHA-zam! GitHub Actions Gets Its Security Cape
  • Breaking Bad Actions: GitHub’s Supply Chain Intervention
  • Graph Your Way to Infrastructure Happiness
  • The Tables Have Turned: S3 Gets Its Iceberg Moment
  • Subnet Where It Hurts: GKE Finally Gets IP Address Relief
  • All Your Database Are Belong to Database Center
  • From Droplets to Dollars: DigitalOcean’s AI Pivot Pays Off
  • DigitalOcean Rides the AI Wave to Record Earnings
  • Agent Smith Would Be Proud: Microsoft’s Multi-Agent Matrix
  • Aurora Borealis: A Decade of Database Enlightenment
  • Fifteen Shades of Cloud: AWS’s Unbroken Streak
  • The Fast and the Failover-ious: Aurora Edition
  • Gone in Single-Digit Seconds: AWS’s Speedy Database Recovery
  • Agent 007: License to Secure Your AI

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

General News

01:02 AOL is finally shutting down its dial-up internet service | AP News

  • AOL is discontinuing its dial-up internet service on September 30, 2025, marking the end of a technology that introduced millions to the internet in the 1990s and early 2000s.
  • Census data shows 163,401 US households still used dial-up in 2023, representing 0.13% of homes with internet subscriptions, highlighting the persistence of legacy infrastructure in underserved areas – which is honestly crazy.
  • Here’s hoping that these folks are able to switch to alternatives, like Starlink.
  • This shutdown reflects broader technology lifecycle patterns as companies retire legacy services like Skype, Internet Explorer, and AOL Instant Messenger to focus resources on modern platforms.
  • The transition away from dial-up demonstrates the evolution from telephone-based connectivity to broadband and wireless technologies that now dominate internet access.
  • AOL’s journey from a $164 billion valuation in 2000 to being sold by Verizon in 2021 illustrates the rapid shifts in technology markets and the challenges of adapting legacy business models.

02:30 British government asks people to delete old emails to reduce data centres’ water use

  • The UK government is advising citizens to delete old emails and photos to reduce water consumption by data centers, as England faces potential water shortages by 2050.
  • Data centers require significant water for cooling systems, with some facilities using millions of gallons daily to maintain optimal operating temperatures for servers.
  • This highlights the often-overlooked environmental impact of cloud storage, where seemingly harmless archived data contributes to ongoing resource consumption even when unused.
  • The recommendation represents a shift toward individual responsibility for cloud sustainability, though the actual impact of consumer data deletion versus enterprise usage remains unclear.
  • This raises questions about whether cloud providers should implement more aggressive data lifecycle policies or invest in water-efficient cooling technologies rather than relying on user behavior changes.
  • Bottom line: good for data privacy, bad for water usage.

03:01 Ryan – “It’s going to make it worse! Data at rest doesn’t use a whole lot of resources. Deleting anything from a file system is expensive from a CPU perspective, and it’s going to cause the temperature to go up – therefore, more cooling…”

01:17 Data centres to be expanded across UK as concerns mount

  • The UK is planning nearly 100 new data centers by 2030, representing a 20% increase from the current 477 facilities, with major investments from Microsoft, Google, and Blackstone Group totaling billions of pounds.
  • This expansion is driven by AI workload demands and positions the UK as a critical hub for cloud infrastructure.
  • Energy consumption concerns are mounting as these facilities could add 71 TWh of electricity demand over 25 years, with evidence from Ohio showing residential energy bills increasing by $20 monthly due to data center operations.
  • The UK government has established an AI Energy Council to address supply-demand challenges.
  • Water usage for cooling systems is creating infrastructure strain, particularly in areas serviced by Thames Water, with Anglian Water already objecting to one proposed site. New facilities are exploring air cooling and closed-loop systems to reduce environmental impact.
  • Planning approval timelines of 5-7 years are pushing some operators to consider building in other countries, potentially threatening the UK’s position as a major data center hub.
  • The government has designated data centers as critical national infrastructure and is overturning local planning rejections to accelerate development.
  • The concentration of new facilities in London and surrounding counties raises questions about regional infrastructure capacity and whether existing power and water systems can support this rapid expansion without impacting residential services and pricing.

07:12 Justin – “Power and cooling are definitely a problem… There is pressure on using water in data centers to cool them. That is a valid concern – especially with a hundred new data centers coming online, as well as powering. How do you power all those hungry, hungry GPUs?”

Cloud Tools

08:30 GitHub Actions policy now supports blocking and SHA pinning actions – GitHub Changelog

  • GitHub Actions now lets administrators explicitly block malicious or compromised actions by adding a ! prefix to entries in the allowed actions policy, providing a critical defense mechanism when third-party workflows are identified as security threats.
  • The new SHA pinning enforcement feature requires workflows to reference actions using full commit SHAs instead of tags or branches, preventing automatic execution of malicious code that could be injected into compromised dependencies (a minimal version of this check is sketched after this list).
  • This addresses a major supply chain security gap where compromised actions could exfiltrate secrets or modify code across all dependent workflows, giving organizations rapid response capabilities to limit exposure.
  • GitHub is also introducing immutable releases that prevent changes to existing tags and assets, enabling developers to pin tags with confidence and use Dependabot for safe updates without risk of malicious modifications.
  • These features are particularly valuable for enterprises managing large GitHub Actions ecosystems, as they can now enforce security policies at the organization or repository level while maintaining the flexibility of the open source action marketplace.
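
Because SHA pinning is easy to describe but easy to audit inconsistently, here is a minimal Python sketch of the kind of check the enforcement performs: it scans local workflow files for uses: references that are not pinned to a full 40-character commit SHA, plus a stand-in blocklist mirroring the new ! prefix. The action names are placeholders, and the real enforcement happens server-side in GitHub’s policy settings, not in a script like this.

```python
import re
import sys
from pathlib import Path

FULL_SHA = re.compile(r"^[0-9a-f]{40}$")             # a full commit SHA is 40 hex chars
USES_LINE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)")

# Hypothetical blocklist, mirroring "!" entries in the allowed-actions policy.
BLOCKED = ("someorg/compromised-action",)

def check_workflow(path: Path) -> list[str]:
    findings = []
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        m = USES_LINE.match(line)
        if not m:
            continue
        ref = m.group(1).strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue                                  # local and docker refs are out of scope here
        action, _, pin = ref.partition("@")
        if action.startswith(BLOCKED):
            findings.append(f"{path}:{lineno}: {action} is explicitly blocked")
        elif not FULL_SHA.match(pin):
            findings.append(f"{path}:{lineno}: {action} pinned to '{pin or '<none>'}' instead of a full SHA")
    return findings

if __name__ == "__main__":
    problems = [f for wf in Path(".github/workflows").glob("*.y*ml") for f in check_workflow(wf)]
    print("\n".join(problems) or "all action references are SHA-pinned")
    sys.exit(1 if problems else 0)
```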

09:41 Ryan – “This is something that’s been really relevant to my day job; I’ve been arguing for months now to NOT expand permissions to cloud and other integrations for GitHub actions, because I’m not a fan of the security actions.”

AWS

11:26 Kiro Pricing Plans Are Now Live – Kiro

  • Kiro is launching a tiered pricing model with Free, Pro ($29/month), Pro+ ($99/month), and Power ($299/month) plans, transitioning from their preview/waitlist model to allow broader access to their cloud development tool.
  • The pricing structure is based on “Vibe” and “Spec” requests, with the free tier offering 50 Vibe requests monthly and paid tiers providing varying amounts of both request types, plus optional overage charges for flexibility.
  • New users receive a 14-day welcome bonus of 100 Spec and 100 Vibe requests to evaluate the tool’s capabilities before committing to a paid plan, with immediate plan activation and modification available.
  • The tool integrates with Google, GitHub, and AWS Builder ID authentication, suggesting it’s positioned as a cloud development assistant or automation tool that works across major platforms.
  • Kiro appears to solve the problem of cloud development workflow optimization by providing request-based interactions, though the exact nature of what Vibe and Spec requests accomplish isn’t detailed in this pricing announcement.

13:19 Ryan – “I think it’s great, but I’m kind of put off by the free plan not including anything, and then the 14-day limit for new users. I just feel like that’s too constricting, and it will keep me from trying it.”

13:47 Amazon Athena now supports CREATE TABLE AS SELECT with Amazon S3 Tables

  • Athena now supports CREATE TABLE AS SELECT (CTAS) with Amazon S3 Tables, enabling users to query existing datasets and create new S3 Tables with results in a single SQL statement (see the sketch after this list).
  • This simplifies data transformation workflows by eliminating the need for separate ETL processes.
  • S3 Tables provide the first cloud object store with built-in Apache Iceberg support, and this integration allows conversion of existing Parquet, CSV, JSON, Hudi, and Delta Lake formats into fully-managed tables.
  • Users can leverage Athena’s familiar SQL interface to modernize their data lake architecture.
  • The feature enables on-the-fly partitioning during table creation, allowing optimization for different query patterns without reprocessing entire datasets. This flexibility is particularly valuable for organizations managing large-scale analytics workloads.
  • Once created, S3 Tables support INSERT and UPDATE operations through Athena, moving beyond the traditional read-only nature of S3-based analytics.
    • This positions S3 Tables as a more complete data warehouse alternative for cost-conscious organizations.
  • Available in all regions where both Athena and S3 Tables are supported, though specific pricing for S3 Tables operations isn’t detailed in the announcement.
    • Organizations should evaluate the cost implications of S3 Tables’ managed optimization features versus traditional S3 storage.
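
For a rough sense of what that looks like from code, the boto3 sketch below starts a CTAS query that materializes query results as a partitioned S3 Table. The catalog, namespace, table names, result bucket, and the partitioning property are placeholders; check the announcement and the S3 Tables documentation for the exact identifiers and supported table properties in your setup.

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# All identifiers below are placeholders for your own catalog, namespace, and tables.
ctas_sql = """
CREATE TABLE "s3tablescatalog/analytics-bucket"."sales"."orders_by_region"
WITH (partitioning = ARRAY['region'])
AS
SELECT order_id, region, order_total, order_date
FROM "awsdatacatalog"."raw_landing"."orders_csv"
"""

resp = athena.start_query_execution(
    QueryString=ctas_sql,
    WorkGroup="primary",
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print("Started CTAS, execution id:", resp["QueryExecutionId"])
```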

14:28 Ryan – “It’s the partitioning of data in your table on the fly. That’s the part where this is super valuable.”

16:44 Celebrating 10 years of Amazon Aurora innovation | AWS News Blog

  • Aurora celebrates 10 years since GA with a livestream event on August 21, 2025, featuring technical leaders discussing the architectural decision to decouple storage from compute that enabled commercial database performance at one-tenth the cost.
  • Key milestone announcements include Aurora DSQL (GA May 2025), a serverless distributed SQL database offering 99.99% single-Region and 99.999% multi-Region availability with strong consistency across all endpoints for always-available applications.
  • Storage capacity doubled from 128 TiB to 256 TiB with no upfront provisioning and pay-as-you-go pricing, while Aurora I/O-Optimized provides predictable pricing with up to 40% cost savings for I/O-intensive workloads.
  • Aurora now integrates with AI services through pgvector for similarity search, zero-ETL to Amazon Redshift and SageMaker for near real-time analytics, and Model Context Protocol (MCP) servers for AI agent integration with data sources.
  • Aurora PostgreSQL Limitless Database provides serverless horizontal scaling (sharding) capabilities, while blue/green deployments simplify database updates, and optimized read instances improve query performance for hundreds of thousands of AWS customers.

19:21 AWS named as a Leader in 2025 Gartner Magic Quadrant for Strategic Cloud Platform Services for 15 years in a row | AWS News Blog

  • AWS maintains its position as the highest-ranked provider on Gartner’s “Ability to Execute” axis for the 15th consecutive year, reinforcing its market leadership in strategic cloud platform services.
  • Gartner highlights AWS’s custom silicon portfolio (Graviton, Inferentia, Trainium) as a key differentiator, enabling better hardware-software integration and improved power efficiency for customer workloads.
  • The report emphasizes AWS’s extensive global community as a competitive advantage, with millions of active customers and tens of thousands of partners providing knowledge sharing and support through the new AWS Builder Center hub.
  • AWS Transform emerges as the first agentic AI service specifically designed to accelerate enterprise modernization of legacy workloads, including .NET, mainframe, and VMware migrations.
  • The recognition underscores AWS’s operational scale advantage, with its market share enabling a more robust partner ecosystem that helps organizations successfully adopt cloud services.
  • Right below Amazon was Google (yes, it came in above Microsoft on both ability to execute and completeness of vision), then Oracle in 4th. Alibaba was the only challenger from China, and IBM placed too, although we’re not sure how.

27:45 Amazon Web Services (AWS) Advanced Go Driver is generally available

  • AWS releases an open-source Go driver that wraps pgx PostgreSQL and native MySQL drivers to reduce database failover times from minutes to single-digit seconds for RDS and Aurora clusters.
  • The driver monitors cluster topology and status to identify new writers quickly during failovers, while adding support for Federated Authentication, AWS Secrets Manager, and IAM authentication.
  • This addresses a common pain point where standard database drivers can take 30-60 seconds to detect failovers, causing application timeouts and errors during Aurora’s automated failover events.
  • Available under Apache 2.0 license on GitHub, the driver requires no code changes beyond swapping import statements, making it a drop-in replacement for existing Go applications using PostgreSQL or MySQL.
  • For teams running critical Go applications on Aurora, this could significantly reduce downtime during maintenance windows and unplanned failovers without additional infrastructure costs.

27:43 Best performance and fastest memory with the new Amazon EC2 R8i and R8i-flex instances | AWS News Blog

  • AWS launches R8i and R8i-flex instances with custom Intel Xeon 6 processors, delivering 20% better performance and 2.5x memory bandwidth compared to R7i instances, specifically targeting memory-intensive workloads like SAP HANA, Redis, and real-time analytics.
  • R8i instances scale up to 96xlarge with 384 vCPUs and 3TB memory (double the previous generation), achieving 142,100 aSAPS certification for SAP workloads – the highest among comparable cloud and on-premises systems.
  • R8i-flex instances offer 5% better price-performance at 5% lower cost for workloads that don’t need sustained CPU usage, reaching full performance 95% of the time while maintaining the same memory bandwidth improvements.
  • Both instance types feature sixth-generation AWS Nitro Cards with 2x network and EBS bandwidth, plus configurable bandwidth allocation (25% adjustments between network and storage) for optimizing database performance.
  • Currently available in four regions (US East Virginia/Ohio, US West Oregon, Europe Spain) with specific performance gains: 30% faster for PostgreSQL, 60% faster for NGINX, and 40% faster for AI recommendation models.

30:58 Ryan – “I feel like AWS is just trolling us with instance announcements now. I feel like there’s a new one – and I don’t know the difference. They’re just different words.”

GCP

32:20 Multi-subnet support for GKE clusters increases scalability | Google Cloud Blog

  • GKE clusters can now use multiple subnets instead of being limited to a single subnet’s primary IP range, allowing clusters to scale beyond previous node limits when IP addresses are exhausted.
  • This addresses a common scaling bottleneck where clusters couldn’t add new nodes once the subnet’s IPs were depleted.
  • The feature enables on-demand subnet addition to existing clusters without recreation, with GKE automatically selecting subnets for new node pools based on IP availability. This provides more efficient IP address utilization and reduces waste compared to pre-allocating large IP ranges upfront.
  • Available in preview for GKE version 1.30.3-gke.1211000 or greater, with CLI and API support currently available, while Terraform and UI support are coming soon. This puts GKE on par with EKS, which has supported multiple subnets since launch.
  • Key benefit for enterprises running large-scale workloads that need to grow beyond initial capacity planning, particularly useful for auto-scaling scenarios where node count can vary significantly. The feature works with existing multi-pod CIDR capabilities for comprehensive IP management.
  • No additional costs are mentioned for the multi-subnet capability itself, though standard networking charges apply for the additional subnets created in the VPC.

30:58 Justin – “I always like when a feature comes out right when I need it.”

34:45 Database Center expands coverage | Google Cloud Blog

  • Database Center now monitors self-managed MySQL, PostgreSQL, and SQL Server databases on Compute Engine VMs, extending beyond just managed Google Cloud databases to provide unified fleet management across your entire database estate.
  • The service automatically detects common security vulnerabilities in self-managed databases, including outdated versions, missing audit logs, overly permissive IP ranges, missing root passwords, and unencrypted connections – addressing a significant gap for customers running databases on VMs.
  • New alerting capabilities notify teams when new databases are provisioned or when Database Center detects new issues, while Gemini-powered natural language queries now work at the folder level for better organization-wide database management.
  • Historical comparison features have expanded from 7 days to 30 days, enabling better capacity planning and trend analysis across database fleets, with Database Center remaining free for Google Cloud customers.
  • This positions Google competitively against AWS Systems Manager and Azure Arc, which offer similar hybrid database monitoring, though Google’s AI-powered approach and zero-cost model provide notable differentiation for enterprises managing mixed database environments.

35:33 Justin – “I’m glad to have this. I’m also glad that it can notify me that someone created a SQL cluster, rather than me being surprised by the bill, so that I do appreciate!”

36:54 Introducing Cloud HSM as an encryption key service for Workspace CSE | Google Cloud Blog

  • Google Cloud HSM now integrates with Workspace client-side encryption (CSE) to provide FIPS 140-2 Level 3 compliant hardware security modules for organizations in highly regulated sectors like government, defense, and healthcare that need to maintain complete control over their encryption keys.
  • The service addresses compliance requirements for ITAR, EAR, FedRAMP High, and DISA IL5 certifications while ensuring customer-managed encryption keys never leave the HSM boundary, giving organizations demonstrable data sovereignty and control over sensitive intellectual property or regulated data.
  • Cloud HSM for Google Workspace offers a 99.95% uptime SLA and can be deployed in minutes with a flat pricing model, currently available in the U.S. with global expansion planned in the coming months.
  • The architecture uses a two-step encryption process where data encryption keys (DEKs) are wrapped by customer-managed encryption keys (CMEKs) stored in the HSM, with all cryptographic operations performed inside the hardware security module and comprehensive audit logging through Cloud Logging (the wrapping pattern is illustrated in the sketch after this list).
  • This positions Google competitively against AWS CloudHSM and Azure Dedicated HSM by specifically targeting Workspace users who need hardware-backed key management, though pricing details aren’t disclosed in the announcement.
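
To make the two-step wrapping concrete, here is a purely illustrative Python sketch of the DEK/KEK envelope pattern using the cryptography package. In the real service the customer-managed key never leaves Cloud HSM and wrapping goes through Google’s key access service for Workspace CSE, not a key held locally in memory as shown here.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Stand-in for the customer-managed key that would actually live inside Cloud HSM.
kek = AESGCM(AESGCM.generate_key(bit_length=256))

def encrypt_document(plaintext: bytes) -> dict:
    dek = AESGCM.generate_key(bit_length=256)          # fresh data encryption key per item
    data_nonce, wrap_nonce = os.urandom(12), os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(data_nonce, plaintext, None)
    wrapped_dek = kek.encrypt(wrap_nonce, dek, None)   # step two: the KEK wraps the DEK
    # Only the ciphertext and the wrapped DEK are stored; the plaintext DEK is discarded.
    return {"ciphertext": ciphertext, "data_nonce": data_nonce,
            "wrapped_dek": wrapped_dek, "wrap_nonce": wrap_nonce}

def decrypt_document(blob: dict) -> bytes:
    dek = kek.decrypt(blob["wrap_nonce"], blob["wrapped_dek"], None)
    return AESGCM(dek).decrypt(blob["data_nonce"], blob["ciphertext"], None)

blob = encrypt_document(b"client-side encrypted draft")
assert decrypt_document(blob) == b"client-side encrypted draft"
```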

35:33 Justin – “It’s really going to be the CSE side, so it’s actually encrypting on my client. So my Gmail client actually will have a key that is being accessed from this HSM to encrypt the mail at my browser, before it gets sent.”

39:05 Security Summit 2025: Enabling defenders and securing AI innovation | Google Cloud Blog

  • Google Cloud announces comprehensive AI security capabilities at Security Summit 2025, introducing agent-specific protections for Agentspace and Agent Builder, including automated discovery, real-time threat detection, and Model Armor integration to prevent prompt injection and data leakage.
  • The new Alert Investigation agent in Google Security Operations autonomously enriches security events and builds process trees based on Mandiant analyst practices, reducing manual effort in SOC operations while providing verdict recommendations for human intervention.
  • Security Command Center gains three preview features: Compliance Manager for unified policy enforcement, Data Security Posture Management with native BigQuery integration, and Risk Reports powered by virtual red team technology to identify cloud defense gaps.
  • Agentic IAM coming later this year will auto-provision agent identities across cloud environments with support for multiple credential types and authorization policies, addressing the growing need for AI-specific identity management as organizations deploy more autonomous agents.
  • Mandiant Consulting expands services to include AI governance frameworks, pre-deployment hardening guidance, and AI threat modeling, recognizing that organizations need specialized expertise to secure their generative and agentic AI deployments.

35:33 Ryan – “A lot of good features; I’ve been waiting for these announcements…I’m really happy to see these, and there’s a whole bunch I didn’t know about that they announced that I’m super excited about.”

42:26 Rightsizing LLM Serving on vLLM for GPUs and TPUs | Google Cloud Blog

FYI – the link is broken. I tried to find an alternate version, but couldn’t. You’re just going to have to rely on Justin and Ryan’s summary. I apologize in advance. -Heather

  • Google published a comprehensive guide for optimizing LLM serving on vLLM across GPUs and TPUs, providing a systematic approach to selecting the right accelerator based on workload requirements like model size, request rate, and latency constraints.
  • The guide demonstrates that TPU v6e (Trillium) achieved 35% higher throughput (5.63 req/s vs 4.17 req/s) compared to H100 GPUs when serving Gemma-3-27b, resulting in 25% lower costs ($40.32/hr vs $54/hr) to handle 100 requests per second.
  • Key technical considerations include calculating minimum VRAM requirements (57GB for Gemma-3-27b), determining tensor parallelism needs, and using the auto_tune.sh script to find optimal gpu_memory_utilization and batch configurations (see the sketch after this list).
  • The approach addresses a critical gap in LLM deployment where teams often overprovision expensive hardware without systematic benchmarking, potentially saving significant costs for production workloads.
  • Google’s support for both GPU and TPU options in vLLM provides flexibility for different use cases, with TPUs showing particular strength for models requiring tensor parallelism due to memory constraints.
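
As a back-of-the-envelope illustration of the sizing step, the sketch below checks the rough weight footprint and then loads the model with vLLM’s offline Python API. The model name, parallelism degree, and memory utilization are illustrative starting points rather than the tuned values the guide’s auto_tune.sh run would produce, and the same API runs on GPUs or TPUs depending on the installed vLLM build.

```python
from vllm import LLM, SamplingParams

# Rough VRAM math: a 27B-parameter model in bfloat16 needs ~27e9 * 2 bytes ≈ 54 GB
# for weights alone (the guide quotes ~57 GB with overhead), before the KV cache.
params_billion = 27
print(f"~{params_billion * 2} GB of weights in bf16, excluding KV cache")

llm = LLM(
    model="google/gemma-3-27b-it",   # assumed checkpoint name
    tensor_parallel_size=1,          # raise when one accelerator can't hold the weights
    gpu_memory_utilization=0.90,     # one of the knobs a tuning sweep would adjust
    max_model_len=4096,
)

outputs = llm.generate(
    ["Summarize this week's cloud news in one sentence."],
    SamplingParams(max_tokens=64, temperature=0.7),
)
print(outputs[0].outputs[0].text)
```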

Azure

45:38 Announcing MSGraph Provider Public Preview and the Microsoft Terraform VSCode Extension | Microsoft Community Hub

  • Ryan claims he’s excited about this story, so I stand by my previous prediction that he is angling for an Azure job.
  • Microsoft launches the Terraform MSGraph provider in public preview, enabling day-zero support for all Microsoft Graph APIs, including Entra ID and M365 services like SharePoint, through standard HCL syntax.
  • This positions MSGraph as the AzureAD equivalent of what AzAPI is to AzureRM – providing immediate access to new features without waiting for provider updates.
  • The new Microsoft Terraform VSCode extension consolidates AzureRM, AzAPI, and MSGraph support into a single tool, replacing the separate Azure Terraform and AzAPI extensions. Key features include exporting existing Azure resources as Terraform code, intelligent code completion, and automatic conversion of ARM templates to AzAPI format.
  • This release targets organizations managing Microsoft 365 and Entra ID resources alongside traditional Azure infrastructure, addressing a gap where AWS has separate providers for different services (aws and awscc) while Microsoft now offers unified tooling. The MSGraph provider extends beyond the limited azuread provider to support all beta and v1 Graph endpoints.
  • The extension includes practical migration features like one-click migration from the old Azure Terraform extension and built-in conversion tools for moving AzureRM resources to AzAPI.
  • No pricing information was provided, but the tools follow standard Terraform provider models.
  • For DevOps teams, this enables infrastructure-as-code workflows for previously manual tasks like managing privileged identity management roles, SharePoint site provisioning, and Outlook notification templates – bringing Microsoft 365 administration into the same automation pipelines as cloud infrastructure.

46:42 Ryan – “So I understand why you hate this, because you hate all the services that are behind the Graph API, but there’s a single API point if you want to do anything in Teams. It’s the same API point if you want to query Entra ID for membership in a list of groups. It’s a graph API endpoint for anything in the docs or the mail space… It’s all just the same API. Because it’s a single API that way, the structure can get real weird real fast… so this is kind of neat. I’m hoping it makes things easier.”

48:07 Agent Factory: The new era of agentic AI—common use cases and design patterns | Microsoft Azure Blog

  • Microsoft introduces Agent Factory, a six-part blog series showcasing five core patterns for building agentic AI that moves beyond simple Q&A to executing complex enterprise workflows through tool use, reflection, planning, multi-agent collaboration, and real-time reasoning (ReAct) – a toy ReAct loop is sketched after this list.
  • Azure AI Foundry serves as the unified platform for agentic AI development, offering local-to-cloud deployment, 1,400+ enterprise connectors, support for Azure OpenAI and 10,000+ open-source models, and built-in security with managed Entra Agent IDs and RBAC controls.
  • Real-world implementations show significant efficiency gains: Fujitsu reduced proposal creation time by 67%, ContraForce automated 80% of security incident response for under $1 per incident, and JM Family cut QA time by 60% using multi-agent orchestration patterns.
  • The platform differentiates from competitors by supporting open protocols like Agent-to-Agent (A2A) and Model Context Protocol (MCP) for cross-cloud interoperability, while providing enterprise-grade observability through Azure Monitor integration and automated evaluation tools.
  • Target customers include enterprises seeking to automate complex multi-step processes across systems, with the platform addressing common challenges like secure data access, agent monitoring, and scaling from single agents to collaborative agent networks without custom scaffolding.
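
Since ReAct comes up repeatedly in the series, here is a deliberately tiny, framework-free Python sketch of the loop: the model proposes a thought and an action, a tool runs, the observation is fed back, and the loop exits on a final answer. The fake_llm function and toy tools are stand-ins; a real agent would call an actual model and the platform’s connectors rather than anything shown here.

```python
import re

# Toy tools standing in for real enterprise connectors.
TOOLS = {
    "lookup_order": lambda order_id: f"Order {order_id} shipped two days ago",
    "issue_refund": lambda order_id: f"Refund created for order {order_id}",
}

def fake_llm(transcript: str) -> str:
    """Stand-in for a model call; a real agent would send the transcript to an LLM."""
    if "Observation:" not in transcript:
        return "Thought: I need the order status first.\nAction: lookup_order[1234]"
    return "Thought: It already shipped, so no refund.\nFinal Answer: Order 1234 has shipped; no refund needed."

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_llm(transcript)
        transcript += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action:\s*(\w+)\[(.+?)\]", step)
        if action:
            tool, arg = action.groups()
            transcript += f"\nObservation: {TOOLS[tool](arg)}"   # feed the tool result back in
    return "No answer within the step budget."

print(react_loop("Should we refund order 1234?"))
```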

49:46 OneLake costs simplified: lowering capacity utilization when accessing OneLake | Microsoft Fabric Blog | Microsoft Fabric

  • Microsoft has unified OneLake’s capacity pricing by reducing proxy transaction rates to match redirect rates, eliminating cost differences based on access method and simplifying capacity planning for Fabric customers.
  • OneLake serves as the centralized data storage foundation for all Microsoft Fabric workloads, including lakehouses and warehouses, with storage billed pay-as-you-go per GB, similar to Azure Data Lake Storage and Amazon S3.
  • The pricing alignment removes architectural complexity for organizations using OneLake with third-party tools like Azure Databricks or Snowflake, as all access paths now consume Fabric Capacity Units at the same low rate.
    • The term “low” is VERY subjective.
  • This positions OneLake as a more cost-predictable alternative to managing separate data lakes across cloud providers, particularly for enterprises already invested in the Microsoft ecosystem.
  • The change reflects Microsoft’s strategy to make OneLake an open, vendor-neutral data platform that can serve as a single source of truth regardless of which analytics tools organizations choose to use.

51:12 Introducing Azure Linux with OS Guard: Secure, Immutable, and Open-Source Container Host

  • Azure Linux with OS Guard is Microsoft’s new hardened container host OS that enforces immutability, code integrity, and mandatory access control – essentially a locked-down version of Azure Linux designed specifically for high-security container workloads on AKS.
  • The OS uses IPE (Integrity Policy Enforcement), recently upstreamed in Linux kernel 6.12, to ensure only trusted binaries from dm-verity protected volumes can execute, including container layers – this prevents rootkits, container escapes, and unauthorized code execution.
  • Built on FedRAMP-certified Azure Linux 3.0, it inherits FIPS 140-3 cryptographic modules and will gain post-quantum cryptography support as NIST algorithms become available – positioning it for regulated workloads and future security requirements.
  • Unlike AWS Bottlerocket, which focuses on minimal attack surface, Azure Linux with OS Guard emphasizes code integrity verification throughout the stack – from Secure Boot through user space – while maintaining compatibility with standard container workloads.
  • Available soon as an AKS OS SKU via preview CLI with feature flag, customers can test the community edition now on Azure VMs – targeting enterprises needing stronger container security without sacrificing the operational benefits of managed Kubernetes.

46:42 Ryan – “This is interesting, because according to the blog post, it takes a sort of different approach than what we’ve seen in the past with core OS and Bottlerocket and stuff – where they’re trying to reduce what’s in that limit so much that you can’t have anything that vulnerable that can be exploited in it. And this uses a lot more of the protected VMs, where it uses the sort of encrypted memory objects. And so this is sort of a new take on securing container-wise workloads at the compute level.”

53:45 Microsoft is a Leader in the 2025 Gartner® Magic Quadrant for Container Management | Microsoft Azure Blog

  • Justin is turning into a softy and wanted to make it up to Azure for being so low on the last Magic Quadrant, so here we are.
  • Microsoft has been named a Leader in Gartner’s 2025 Magic Quadrant for Container Management for the third consecutive year, highlighting its comprehensive container portfolio that includes Azure Kubernetes Service (AKS), Azure Container Apps, and Azure Arc for hybrid/multi-cloud deployments.
  • AKS Automatic (preview) aims to simplify Kubernetes adoption by providing production-ready clusters with automated node provisioning, scaling, and CI/CD integration, while Azure Container Apps offers serverless containers with scale-to-zero capabilities and per-second billing for GPU workloads.
  • The platform integrates AI workload support through GPU-optimized containers in AKS and serverless GPUs in Container Apps, with Microsoft’s KAITO project simplifying open-source AI model deployment on Kubernetes – notably powering ChatGPT’s infrastructure serving 500M weekly users.
  • Azure Kubernetes Fleet Manager addresses enterprise-scale challenges by enabling policy-driven governance across multiple AKS clusters, while node auto-provisioning automatically selects cost-effective VM sizes based on workload demands to optimize spending.
  • Key differentiators include deep integration with Azure’s ecosystem (networking, databases, AI services), developer tools like GitHub Copilot for Kubernetes manifest generation, and Azure Arc’s ability to manage on-premises and edge Kubernetes deployments through a single control plane.

Oracle

54:57 Oracle To Offer Google Gemini Models To Customers 2025 08 14

  • Oracle is partnering with Google Cloud to bring Gemini 1.5 Pro and Gemini 1.5 Flash models to Oracle Cloud Infrastructure (OCI) Generative AI service, marking Oracle’s first major third-party LLM partnership beyond Cohere.
  • This positions Oracle as a multi-model cloud provider similar to AWS Bedrock and Azure OpenAI Service, though arriving later to market with a more limited selection compared to competitors’ broader model portfolios.
  • The integration targets Oracle’s existing enterprise customers who want to use Google’s models while keeping data within OCI’s security boundaries, particularly appealing to regulated industries already invested in Oracle’s ecosystem.
  • Gemini models will be available through OCI’s standard APIs with Oracle’s built-in security features, though pricing details remain unannounced, which makes cost comparison with direct Google Cloud access impossible.
  • The real test will be whether Oracle can attract new AI workloads or simply provide convenience for existing Oracle shops that would have used Google Cloud directly anyway.

56:01 Ryan – “What a weird thing.”

Other Clouds

56:42 DigitalOcean stock jumps nearly 29% as earnings and revenue top expectations – SiliconANGLE

  • DigitalOcean reported Q2 earnings of 59 cents per share on $219M revenue (14% YoY growth), beating analyst expectations and driving a 29% stock surge. The company’s focus on higher-spending “Scalers+” customers (spending $500+ monthly) showed 35% YoY growth and now represents nearly 25% of total revenue.
  • The company launched Gradient AI Platform, providing managed access to GPU infrastructure and foundation models from Anthropic, Meta, Mistral AI, and OpenAI. AI-related revenue more than doubled year-over-year, indicating strong developer adoption for building AI applications.
  • DigitalOcean partnered with AMD to expand GPU capabilities through GPU Droplets and the AMD Developer Cloud.
  • This positions them to compete more effectively in the AI infrastructure market against larger cloud providers.
  • The company achieved its highest incremental ARR since Q4 2022 and maintained a 109% net dollar retention rate for Scalers+ customers.
  • Full-year guidance of $888-892M revenue exceeded analyst expectations of $880.81M.
  • With over 60 new product features shipped across compute, storage, and networking categories, DigitalOcean continues to expand beyond its traditional developer-focused offerings.
  • The strong financial performance suggests their strategy of targeting both core cloud and AI workloads is resonating with customers.

58:14 Introducing SQL Stored Procedures in Databricks | Databricks Blog

  • Hold on to your butts… Databricks has entered Jurassic Park territory. Insert an Ian Malcolm meme here.
  • Databricks introduces SQL Stored Procedures following ANSI/PSM standards, enabling users to encapsulate repetitive SQL logic for data cleaning, ETL workflows, and business rule updates while maintaining Unity Catalog governance.
  • This addresses a key gap for enterprises migrating from traditional data warehouses that rely heavily on stored procedures.
  • The feature supports parameter types (IN, OUT, INOUT), nested/recursive calls, and integrates with SQL Scripting capabilities, including control flow, variables, and dynamic SQL execution. Unlike functions that return values, procedures execute sequences of statements, making them ideal for complex workflows (see the sketch after this list).
  • Early adopters like ClicTechnologies report improved performance, scalability, and reduced deployment time for critical workloads like customer segmentation. The ability to migrate existing procedures without rewriting code significantly simplifies transitions from legacy systems.
  • Key limitations heading toward GA include a lack of support for cursors, exception handling, and table-valued parameters, with temporary tables and multi-statement transactions currently in private preview. These gaps may impact complex enterprise workload migrations.
  • This positions Databricks to better compete with traditional enterprise data warehouses by offering familiar SQL constructs while maintaining lakehouse advantages. The commitment to contribute this to Apache Spark ensures broader ecosystem adoption beyond Databricks.
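
A hedged sketch of what this looks like from Python via the databricks-sql-connector package: the connection values are placeholders, and the CREATE PROCEDURE body is only an approximation of the ANSI SQL/PSM form the blog describes, so expect to adjust clauses against the current Databricks documentation.

```python
from databricks import sql  # pip install databricks-sql-connector

# Placeholder workspace and SQL warehouse credentials.
conn = sql.connect(
    server_hostname="dbc-example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi-REDACTED",
)

# Approximate SQL/PSM syntax: an IN parameter and a single-statement body.
create_proc = """
CREATE OR REPLACE PROCEDURE main.sales.archive_stale_orders(IN days_old INT)
LANGUAGE SQL
AS BEGIN
  DELETE FROM main.sales.orders
  WHERE order_date < date_sub(current_date(), days_old);
END
"""

cursor = conn.cursor()
cursor.execute(create_proc)
cursor.execute("CALL main.sales.archive_stale_orders(90)")  # invoke with an IN argument
cursor.close()
conn.close()
```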

59:28 Ryan – “Database people are gonna do data things.”

Cloud Journey

1:00:42 A guide to platform engineering | Google Cloud Blog

  • Google introduces “shift down” strategy for platform engineering, moving responsibilities from developers into the underlying platform infrastructure rather than the traditional DevOps “shift left” approach that pushes work earlier in development cycles.
  • The approach categorizes development ecosystems into types (0-4) based on how much control and quality assurance the platform provides – from flexible “YOLO” (yes, it really is called that, and yes, Ryan is now contractually obligated to get a tattoo of it) environments to highly controlled “Assured” systems where the platform handles security and reliability.
  • Key technical implementation relies on proper abstractions and coupling design to embed quality attributes like security and performance directly into the platform, reducing operational burden on individual developers.
  • Organizations should work backwards from their business model to determine the right platform type, balancing developer flexibility against risk tolerance and quality requirements for different applications.
  • This represents a shift in thinking about platform engineering – instead of one-size-fits-all approaches, Google advocates for intentionally choosing different platform types based on specific business needs and acceptable risk levels.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

  continue reading

313 епізодів

Artwork
iconПоширити
 
Manage episode 503383812 series 3680004
Вміст надано TCP.FM, Justin Brodley, Jonathan Baker, Ryan Lucas, and Matt Kohn. Весь вміст подкастів, включаючи епізоди, графіку та описи подкастів, завантажується та надається безпосередньо компанією TCP.FM, Justin Brodley, Jonathan Baker, Ryan Lucas, and Matt Kohn або його партнером по платформі подкастів. Якщо ви вважаєте, що хтось використовує ваш захищений авторським правом твір без вашого дозволу, ви можете виконати процедуру, описану тут https://uk.player.fm/legal.

Welcome to episode 318 of The Cloud Pod, where the forecast is always cloudy! We’re going on an adventure! Justin and Ryan have formed a fellowship of the cloud, and they’re bringing you all the latest and greatest news from Valinor to Helm’s Deep, and Azure to AWS to GCP. We’ve water issues, some Magic Quadrants, and Aurora updates…but sadly no potatoes. Let’s get into it!

Titles we almost went with this week:

  • You’ve Got No Mail: AOL Finally Hangs Up on Dial-Up
  • Ctrl+Alt+Delete Climate Change
  • H2-Oh No: Your Gmail is Thirsty
  • The Price is Vibe: Kiro’s New Request-Based Model
  • Spec-tacular Pricing: Kiro Leaves the Waitlist Behind
  • SHA-zam! GitHub Actions Gets Its Security Cape
  • Breaking Bad Actions: GitHub’s Supply Chain Intervention
  • Graph Your Way to Infrastructure Happiness
  • The Tables Have Turned: S3 Gets Its Iceberg Moment
  • Subnet Where It Hurts: GKE Finally Gets IP Address Relief
  • All Your Database Are Belong to Database Center
  • From Droplets to Dollars: DigitalOcean’s AI Pivot Pays Off
  • DigitalOcean Rides the AI Wave to Record Earnings
  • Agent Smith Would Be Proud: Microsoft’s Multi-Agent Matrix
  • Aurora Borealis: A Decade of Database Enlightenment
  • Fifteen Shades of Cloud: AWS’s Unbroken Streak
  • The Fast and the Failover-ious: Aurora Edition
  • Gone in Single-Digit Seconds: AWS’s Speedy Database Recovery
  • Agent 007: License to Secure Your AI

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

General News

01:02 AOL is finally shutting down its dial-up internet service | AP News

  • AOL is discontinuing its dial-up internet service on September 30, 2024, marking the end of a technology that introduced millions to the internet in the 1990s and early 2000s.
  • Census data shows 163,401 US households still used dial-up in 2023, representing 0.13% of homes with internet subscriptions, highlighting the persistence of legacy infrastructure in underserved areas – which is honestly crazy.
  • Here’s hoping that these folks are able to switch to alternatives, like Starlink.
  • This shutdown reflects broader technology lifecycle patterns as companies retire legacy services like Skype, Internet Explorer, and AOL Instant Messenger to focus resources on modern platforms.
  • The transition away from dial-up demonstrates the evolution from telephone-based connectivity to broadband and wireless technologies that now dominate internet access.
  • AOL’s journey from a $164 billion valuation in 2000 to being sold by Verizon in 2021 illustrates the rapid shifts in technology markets and the challenges of adapting legacy business models.

02:30 British government asks people to delete old emails to reduce data centres’ water use

  • The UK government is advising citizens to delete old emails and photos to reduce water consumption by data centers, as England faces potential water shortages by 2050.
  • Data centers require significant water for cooling systems, with some facilities using millions of gallons daily to maintain optimal operating temperatures for servers.
  • This highlights the often-overlooked environmental impact of cloud storage, where seemingly harmless archived data contributes to ongoing resource consumption even when unused.
  • The recommendation represents a shift toward individual responsibility for cloud sustainability, though the actual impact of consumer data deletion versus enterprise usage remains unclear.
  • This raises questions about whether cloud providers should implement more aggressive data lifecycle policies or invest in water-efficient cooling technologies rather than relying on user behavior changes.
  • Bottom line: good for data privacy, bad for water usage.

03:01 Ryan – “It’s going to make it worse! Data at rest doesn’t use a whole lot of resources. Deleting anything from a file system is expensive from a CPU perspective, and it’s going to cause the temperature to go up – therefore, more cooling…”

01:17 Data centres to be expanded across UK as concerns mount

  • The UK is planning nearly 100 new data centers by 2030, representing a 20% increase from the current 477 facilities, with major investments from Microsoft, Google, and Blackstone Group totaling billions of pounds.
  • This expansion is driven by AI workload demands and positions the UK as a critical hub for cloud infrastructure.
  • Energy consumption concerns are mounting as these facilities could add 71 TWh of electricity demand over 25 years, with evidence from Ohio showing residential energy bills increasing by $20 monthly due to data center operations.
  • The UK government has established an AI Energy Council to address supply-demand challenges.
  • Water usage for cooling systems is creating infrastructure strain, particularly in areas serviced by Thames Water, with Anglian Water already objecting to one proposed site. New facilities are exploring air cooling and closed-loop systems to reduce environmental impact.
  • Planning approval timelines of 5-7 years are pushing some operators to consider building in other countries, potentially threatening the UK’s position as a major data center hub.
  • The government has designated data centers as critical national infrastructure and is overturning local planning rejections to accelerate development.
  • The concentration of new facilities in London and surrounding counties raises questions about regional infrastructure capacity and whether existing power and water systems can support this rapid expansion without impacting residential services and pricing.

07:12 Justin – “Power and cooling are definitely a problem… There is pressure on using water in data centers to cool them. That is a valid concern – especially with a hundred new data centers coming online, as well as powering. How do you power all those hungry, hungry GPUs?”

Cloud Tools

08:30 GitHub Actions policy now supports blocking and SHA pinning actions – GitHub Changelog

  • GitHub Actions now lets administrators explicitly block malicious or compromised actions by adding a ! prefix to entries in the allowed actions policy, providing a critical defense mechanism when third-party workflows are identified as security threats.
  • The new SHA pinning enforcement feature requires workflows to reference actions using full commit SHAs instead of tags or branches, preventing automatic execution of malicious code that could be injected into compromised dependencies.
  • This addresses a major supply chain security gap where compromised actions could exfiltrate secrets or modify code across all dependent workflows, giving organizations rapid response capabilities to limit exposure.
  • GitHub is also introducing immutable releases that prevent changes to existing tags and assets, enabling developers to pin tags with confidence and use Dependabot for safe updates without risk of malicious modifications.
  • These features are particularly valuable for enterprises managing large GitHub Actions ecosystems, as they can now enforce security policies at the organization or repository level while maintaining the flexibility of the open source action marketplace.

09:41 Ryan – “This is something that’s been really relevant to my day job; I’ve been arguing for months now to NOT expand permissions to cloud and other integrations for GitHub actions, because I’m not a fan of the security actions.”

AWS

11:26 Kiro Pricing Plans Are Now Live – Kiro

  • Kiro is launching a tiered pricing model with Free, Pro ($29/month), Pro+ ($99/month), and Power ($299/month) plans, transitioning from their preview/waitlist model to allow broader access to their cloud development tool.
  • The pricing structure is based on “Vibe” and “Spec” requests, with the free tier offering 50 Vibe requests monthly and paid tiers providing varying amounts of both request types, plus optional overage charges for flexibility.
  • New users receive a 14-day welcome bonus of 100 Spec and 100 Vibe requests to evaluate the tool’s capabilities before committing to a paid plan, with immediate plan activation and modification available.
  • The tool integrates with Google, GitHub, and AWS Builder ID authentication, suggesting it’s positioned as a cloud development assistant or automation tool that works across major platforms.
  • Kiro appears to solve the problem of cloud development workflow optimization by providing request-based interactions, though the exact nature of what Vibe and Spec requests accomplish isn’t detailed in this pricing announcement.

13:19 Ryan – “I think it’s great, but I’m kind of put off by the free plan not including anything, and then the 14-day limit for new users. I just feel like that’s too constricting, and it will keep me from trying it.”

13:47 Amazon Athena now supports CREATE TABLE AS SELECT with Amazon S3 Tables

  • Athena now supports CREATE TABLE AS SELECT (CTAS) with Amazon S3 Tables, enabling users to query existing datasets and create new S3 Tables with results in a single SQL statement.
  • This simplifies data transformation workflows by eliminating the need for separate ETL processes.
  • S3 Tables provide the first cloud object store with built-in Apache Iceberg support, and this integration allows conversion of existing Parquet, CSV, JSON, Hudi, and Delta Lake formats into fully-managed tables.
  • Users can leverage Athena’s familiar SQL interface to modernize their data lake architecture.
  • The feature enables on-the-fly partitioning during table creation, allowing optimization for different query patterns without reprocessing entire datasets. This flexibility is particularly valuable for organizations managing large-scale analytics workloads.
  • Once created, S3 Tables support INSERT and UPDATE operations through Athena, moving beyond the traditional read-only nature of S3-based analytics.
    • This positions S3 Tables as a more complete data warehouse alternative for cost-conscious organizations.
  • Available in all regions where both Athena and S3 Tables are supported, though specific pricing for S3 Tables operations isn’t detailed in the announcement.
    • Organizations should evaluate the cost implications of S3 Tables’ managed optimization features versus traditional S3 storage.

14:28 Ryan – “It’s the partitioning of data in your table on the fly. That’s the part where this is super valuable.”

16:44 Celebrating 10 years of Amazon Aurora innovation | AWS News Blog

  • Aurora celebrates 10 years since GA with a livestream event on August 21, 2025, featuring technical leaders discussing the architectural decision to decouple storage from compute that enabled commercial database performance at one-tenth the cost.
  • Key milestone announcements include Aurora DSQL (GA May 2025), a serverless distributed SQL database offering 99.99% single-Region and 99.999% multi-Region availability with strong consistency across all endpoints for always-available applications.
  • Storage capacity doubled from 128 TiB to 256 TiB with no upfront provisioning and pay-as-you-go pricing, while Aurora I/O-Optimized provides predictable pricing with up to 40% cost savings for I/O-intensive workloads.
  • Aurora now integrates with AI services through pgvector for similarity search, zero-ETL to Amazon Redshift and SageMaker for near real-time analytics, and Model Context Protocol (MCP) servers for AI agent integration with data sources.
  • Aurora PostgreSQL Limitless Database provides serverless horizontal scaling (sharding) capabilities, while blue/green deployments simplify database updates, and optimized read instances improve query performance for hundreds of thousands of AWS customers.

19:21 AWS named as a Leader in 2025 Gartner Magic Quadrant for Strategic Cloud Platform Services for 15 years in a row | AWS News Blog

  • AWS maintains its position as the highest-ranked provider on Gartner‘s “Ability to Execute” axis for the 15th consecutive year, reinforcing its market leadership in strategic cloud platform services.
  • Gartner highlights AWS’s custom silicon portfolio (Graviton, Inferentia, Trainium) as a key differentiator, enabling better hardware-software integration and improved power efficiency for customer workloads.
  • The report emphasizes AWS’s extensive global community as a competitive advantage, with millions of active customers and tens of thousands of partners providing knowledge sharing and support through the new AWS Builder Center hub.
  • AWS Transform emerges as the first agentic AI service specifically designed to accelerate enterprise modernization of legacy workloads, including .NET, mainframe, and VMware migrations.
  • The recognition underscores AWS’s operational scale advantage, with its market share enabling a more robust partner ecosystem that helps organizations successfully adopt cloud services.
  • Right below Amazon was Google (yes, it came above Microsoft on ability to execute and completeness of vision), then Oracle in 4th. Alibaba was the only challenger from China, and IBM placed too. Although we’re not sure how.

27:45 Amazon Web Services (AWS) Advanced Go Driver is generally available

  • AWS releases an open-source Go driver that wraps pgx PostgreSQL and native MySQL drivers to reduce database failover times from minutes to single-digit seconds for RDS and Aurora clusters.
  • The driver monitors cluster topology and status to identify new writers quickly during failovers, while adding support for Federated Authentication, AWS Secrets Manager, and IAM authentication.
  • This addresses a common pain point where standard database drivers can take 30-60 seconds to detect failovers, causing application timeouts and errors during Aurora’s automated failover events.
  • Available under Apache 2.0 license on GitHub, the driver requires no code changes beyond swapping import statements, making it a drop-in replacement for existing Go applications using PostgreSQL or MySQL.
  • For teams running critical Go applications on Aurora, this could significantly reduce downtime during maintenance windows and unplanned failovers without additional infrastructure costs.

27:43 Best performance and fastest memory with the new Amazon EC2 R8i and R8i-flex instances | AWS News Blog

  • AWS launches R8i and R8i-flex instances with custom Intel Xeon 6 processors, delivering 20% better performance and 2.5x memory bandwidth compared to R7i instances, specifically targeting memory-intensive workloads like SAP HANA, Redis, and real-time analytics.
  • R8i instances scale up to 96xlarge with 384 vCPUs and 3TB memory (double the previous generation), achieving 142,100 aSAPS certification for SAP workloads – the highest among comparable cloud and on-premises systems.
  • R8i-flex instances offer 5% better price-performance at 5% lower cost for workloads that don’t need sustained CPU usage, reaching full performance 95% of the time while maintaining the same memory bandwidth improvements.
  • Both instance types feature sixth-generation AWS Nitro Cards with 2x network and EBS bandwidth, plus configurable bandwidth allocation (25% adjustments between network and storage) for optimizing database performance.
  • Currently available in four regions (US East Virginia/Ohio, US West Oregon, Europe Spain) with specific performance gains: 30% faster for PostgreSQL, 60% faster for NGINX, and 40% faster for AI recommendation models.

30:58 Ryan – “I feel like AWS is just trolling us wth instance announcements now. I feel like there’s a new one – and I don’t know the difference. They’re just different words.”

GCP

32:20 Multi-subnet support for GKE clusters increases scalability | Google Cloud Blog

  • GKE clusters can now use multiple subnets instead of being limited to a single subnet’s primary IP range, allowing clusters to scale beyond previous node limits when IP addresses are exhausted.
  • This addresses a common scaling bottleneck where clusters couldn’t add new nodes once the subnet’s IPs were depleted.
  • The feature enables on-demand subnet addition to existing clusters without recreation, with GKE automatically selecting subnets for new node pools based on IP availability. This provides more efficient IP address utilization and reduces waste compared to pre-allocating large IP ranges upfront.
  • Available in preview for GKE version 1.30.3-gke.1211000 or greater, with CLI and API support currently available, while Terraform and UI support are coming soon. This puts GKE on par with EKS, which has supported multiple subnets since launch.
  • Key benefit for enterprises running large-scale workloads that need to grow beyond initial capacity planning, particularly useful for auto-scaling scenarios where node count can vary significantly. The feature works with existing multi-pod CIDR capabilities for comprehensive IP management.
  • No additional costs are mentioned for the multi-subnet capability itself, though standard networking charges apply for the additional subnets created in the VPC.

30:58 Justin – “I always like when a feature comes out right when I need it.”

34:45 Database Center expands coverage | Google Cloud Blog

  • Database Center now monitors self-managed MySQL, PostgreSQL, and SQL Server databases on Compute Engine VMs, extending beyond just managed Google Cloud databases to provide unified fleet management across your entire database estate.
  • The service automatically detects common security vulnerabilities in self-managed databases, including outdated versions, missing audit logs, overly permissive IP ranges, missing root passwords, and unencrypted connections – addressing a significant gap for customers running databases on VMs.
  • New alerting capabilities notify teams when new databases are provisioned or when Database Center detects new issues, while Gemini-powered natural language queries now work at the folder level for better organization-wide database management.
  • Historical comparison features have expanded from 7 days to 30 days, enabling better capacity planning and trend analysis across database fleets, with Database Center remaining free for Google Cloud customers.
  • This positions Google competitively against AWS Systems Manager and Azure Arc, which offer similar hybrid database monitoring, though Google’s AI-powered approach and zero-cost model provide notable differentiation for enterprises managing mixed database environments.

35:33 Justin – “I’m glad to have this. I’m also glad that it can notify me that someone created a SQL cluster, rather than me being surprised by the bill, so that I do appreciate!”

36:54 Introducing Cloud HSM as an encryption key service for Workspace CSE | Google Cloud Blog

  • Google Cloud HSM now integrates with Workspace client-side encryption (CSE) to provide FIPS 140-2 Level 3 compliant hardware security modules for organizations in highly regulated sectors like government, defense, and healthcare that need to maintain complete control over their encryption keys.
  • The service addresses compliance requirements for ITAR, EAR, FedRAMP High, and DISA IL5 certifications while ensuring customer-managed encryption keys never leave the HSM boundary, giving organizations demonstrable data sovereignty and control over sensitive intellectual property or regulated data.
  • Cloud HSM for Google Workspace offers a 99.95% uptime SLA and can be deployed in minutes with a flat pricing model, currently available in the U.S. with global expansion planned in the coming months.
  • The architecture uses a two-step encryption process where data encryption keys (DEKs) are wrapped by customer-managed encryption keys (CMEKs) stored in the HSM, with all cryptographic operations performed inside the hardware security module and comprehensive audit logging through Cloud Logging (a conceptual sketch of this envelope-encryption pattern follows the list).
  • This positions Google competitively against AWS CloudHSM and Azure Dedicated HSM by specifically targeting Workspace users who need hardware-backed key management, though pricing details aren’t disclosed in the announcement.
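To make the two-step DEK/CMEK flow concrete, here is a minimal, purely conceptual Python sketch of envelope encryption using the `cryptography` package. This is not the Workspace CSE or Cloud HSM API; the `hsm_wrap_key` function is a hypothetical stand-in for the wrap operation that, in the real service, happens inside the HSM.

```python
# Conceptual envelope-encryption sketch (NOT the Workspace CSE / Cloud HSM API).
# pip install cryptography
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def hsm_wrap_key(kek: bytes, dek: bytes) -> bytes:
    """Stand-in for the CMEK wrap operation performed inside Cloud HSM.
    Here we just AES-GCM encrypt the DEK with a local key-encryption key."""
    nonce = os.urandom(12)
    return nonce + AESGCM(kek).encrypt(nonce, dek, b"dek-wrap")

def encrypt_document(kek: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    dek = AESGCM.generate_key(bit_length=256)   # per-document data encryption key
    nonce = os.urandom(12)
    ciphertext = nonce + AESGCM(dek).encrypt(nonce, plaintext, None)
    wrapped_dek = hsm_wrap_key(kek, dek)        # only the wrapped DEK is stored
    return ciphertext, wrapped_dek

# The KEK never leaves the HSM in the real service; it is generated locally here
# only so the sketch runs end to end.
kek = AESGCM.generate_key(bit_length=256)
ciphertext, wrapped_dek = encrypt_document(kek, b"sensitive draft email")
print(len(ciphertext), len(wrapped_dek))
```

In the actual flow, the wrap and unwrap calls are requests to the customer’s key service backed by Cloud HSM, so the CMEK material never leaves the FIPS 140-2 Level 3 boundary.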

35:33 Justin – “It’s really going to be the CSE side, so it’s actually encrypting on my client. So my Gmail client actually will have a key that is being accessed from this HSM to encrypt the mail at my browser, before it gets sent.”

39:05 Security Summit 2025: Enabling defenders and securing AI innovation | Google Cloud Blog

  • Google Cloud announces comprehensive AI security capabilities at Security Summit 2025, introducing agent-specific protections for Agentspace and Agent Builder, including automated discovery, real-time threat detection, and Model Armor integration to prevent prompt injection and data leakage.
  • The new Alert Investigation agent in Google Security Operations autonomously enriches security events and builds process trees based on Mandiant analyst practices, reducing manual effort in SOC operations while providing verdict recommendations for human intervention.
  • Security Command Center gains three preview features: Compliance Manager for unified policy enforcement, Data Security Posture Management with native BigQuery integration, and Risk Reports powered by virtual red team technology to identify cloud defense gaps.
  • Agentic IAM coming later this year will auto-provision agent identities across cloud environments with support for multiple credential types and authorization policies, addressing the growing need for AI-specific identity management as organizations deploy more autonomous agents.
  • Mandiant Consulting expands services to include AI governance frameworks, pre-deployment hardening guidance, and AI threat modeling, recognizing that organizations need specialized expertise to secure their generative and agentic AI deployments.

35:33 Ryan – “A lot of good features; I’ve been waiting for these announcements…I’m really happy to see these, and there’s a whole bunch I didn’t know about that they announced that I’m super excited about.”

42:26 Rightsizing LLM Serving on vLLM for GPUs and TPUs | Google Cloud Blog

FYI – the link is broken. I tried to find an alternate version, but couldn’t. You’re just going to have to rely on Justin and Ryan’s summary. I apologize in advance. -Heather

  • Google published a comprehensive guide for optimizing LLM serving on vLLM across GPUs and TPUs, providing a systematic approach to selecting the right accelerator based on workload requirements like model size, request rate, and latency constraints.
  • The guide demonstrates that TPU v6e (Trillium) achieved 35% higher throughput (5.63 req/s vs 4.17 req/s) compared to H100 GPUs when serving Gemma-3-27b, resulting in 25% lower costs ($40.32/hr vs $54/hr) to handle 100 requests per second.
  • Key technical considerations include calculating minimum VRAM requirements (57GB for Gemma-3-27b), determining tensor parallelism needs, and using the auto_tune.sh script to find optimal gpu_memory_utilization and batch configurations (a back-of-the-envelope sizing sketch follows this list).
  • The approach addresses a critical gap in LLM deployment where teams often overprovision expensive hardware without systematic benchmarking, potentially saving significant costs for production workloads.
  • Google’s support for both GPU and TPU options in vLLM provides flexibility for different use cases, with TPUs showing particular strength for models requiring tensor parallelism due to memory constraints.
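As a rough illustration of the sizing math: a 27B-parameter model in bf16 needs about 54 GB just for weights, which is why the guide lands near 57 GB once runtime overhead is added, and the tensor-parallel degree falls out of dividing that by usable per-accelerator memory. The overhead fraction, utilization value, and the CLI line at the end are illustrative assumptions, not tuned settings from the article.

```python
import math

def min_serving_vram_gb(params_b: float, bytes_per_param: float = 2.0,
                        overhead_frac: float = 0.05) -> float:
    """Back-of-the-envelope VRAM needed to hold model weights plus a small
    overhead; KV cache and activation memory come on top of this."""
    weights_gb = params_b * bytes_per_param          # 27B * 2 bytes ~= 54 GB
    return weights_gb * (1 + overhead_frac)          # ~57 GB with ~5% overhead

def tensor_parallel_degree(required_gb: float, per_device_gb: float,
                           utilization: float = 0.9) -> int:
    """Smallest number of accelerators whose usable memory covers the weights."""
    return math.ceil(required_gb / (per_device_gb * utilization))

need = min_serving_vram_gb(27)                       # Gemma-3-27b-class model
print(round(need), "GB ->", tensor_parallel_degree(need, per_device_gb=80), "x 80GB GPU")

# Launching vLLM with those choices might look like (model ID and values illustrative):
#   vllm serve google/gemma-3-27b-it --tensor-parallel-size 1 --gpu-memory-utilization 0.9
```

The same arithmetic applied to smaller per-device memory (for example 16 GB or 24 GB cards) is what pushes you into tensor parallelism, which is where the guide’s TPU-versus-GPU comparison becomes relevant.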

Azure

45:38 Announcing MSGraph Provider Public Preview and the Microsoft Terraform VSCode Extension | Microsoft Community Hub

  • Ryan claims he’s excited about this story, so I stand by my previous prediction that he is angling for an Azure job.
  • Microsoft launches the Terraform MSGraph provider in public preview, enabling day-zero support for all Microsoft Graph APIs, including Entra ID and M365 services like SharePoint, through standard HCL syntax.
  • This makes MSGraph to the azuread provider what AzAPI is to AzureRM – immediate access to new features without waiting for provider updates.
  • The new Microsoft Terraform VSCode extension consolidates AzureRM, AzAPI, and MSGraph support into a single tool, replacing the separate Azure Terraform and AzAPI extensions. Key features include exporting existing Azure resources as Terraform code, intelligent code completion, and automatic conversion of ARM templates to AzAPI format.
  • This release targets organizations managing Microsoft 365 and Entra ID resources alongside traditional Azure infrastructure, addressing a gap where AWS splits coverage across separate providers (aws and awscc) while Microsoft now offers unified tooling. The MSGraph provider extends beyond the limited azuread provider to support all beta and v1 Graph endpoints.
  • The extension includes practical migration features like one-click migration from the old Azure Terraform extension and built-in conversion tools for moving AzureRM resources to AzAPI.
  • No pricing information was provided, but the tools follow standard Terraform provider models.
  • For DevOps teams, this enables infrastructure-as-code workflows for previously manual tasks like managing privileged identity management roles, SharePoint site provisioning, and Outlook notification templates – bringing Microsoft 365 administration into the same automation pipelines as cloud infrastructure.

46:42 Ryan – “So I understand why you hate this, because you hate all the services that are behind the Graph API, but there’s a single API point if you want to do anything in Teams. It’s the same API point if you want to query Entra ID for membership in a list of groups. It’s a graph API endpoint for anything in the docs or the mail space. It’s all just the same API. Because it’s a single API that way, the structure can get real weird real fast… so this is kind of neat. I’m hoping it makes things easier.”

48:07 Agent Factory: The new era of agentic AI—common use cases and design patterns | Microsoft Azure Blog

  • Microsoft introduces Agent Factory, a six-part blog series showcasing five core patterns for building agentic AI that moves beyond simple Q&A to executing complex enterprise workflows through tool use, reflection, planning, multi-agent collaboration, and real-time reasoning (ReAct); a minimal ReAct-style loop is sketched after this list.
  • Azure AI Foundry serves as the unified platform for agentic AI development, offering local-to-cloud deployment, 1,400+ enterprise connectors, support for Azure OpenAI and 10,000+ open-source models, and built-in security with managed Entra Agent IDs and RBAC controls.
  • Real-world implementations show significant efficiency gains: Fujitsu reduced proposal creation time by 67%, ContraForce automated 80% of security incident response for under $1 per incident, and JM Family cut QA time by 60% using multi-agent orchestration patterns.
  • The platform differentiates from competitors by supporting open protocols like Agent-to-Agent (A2A) and Model Context Protocol (MCP) for cross-cloud interoperability, while providing enterprise-grade observability through Azure Monitor integration and automated evaluation tools.
  • Target customers include enterprises seeking to automate complex multi-step processes across systems, with the platform addressing common challenges like secure data access, agent monitoring, and scaling from single agents to collaborative agent networks without custom scaffolding.
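For readers new to the patterns named above, here is a deliberately tiny, framework-free Python sketch of the ReAct idea (reason, act, observe, repeat). It is not Azure AI Foundry code; the `call_llm` stub and the tool name are placeholders for whichever model endpoint and connectors you actually use.

```python
# Minimal ReAct-style loop (conceptual; not Azure AI Foundry's API).
import json

def call_llm(prompt: str) -> dict:
    """Placeholder for a model call. A real agent would send the prompt to an
    LLM and parse a structured 'thought + action' reply; here it is hard-coded."""
    return {"thought": "I should look up the order status first.",
            "action": "lookup_order", "action_input": "A-1042"}

TOOLS = {
    "lookup_order": lambda order_id: f"Order {order_id}: shipped, arriving Friday",
}

def react_step(task: str, history: list[str]) -> str:
    prompt = task + "\n" + "\n".join(history)
    decision = call_llm(prompt)                                        # reason
    observation = TOOLS[decision["action"]](decision["action_input"])  # act
    history.append(json.dumps(decision))                               # keep the trace
    history.append(f"Observation: {observation}")                      # observe
    return observation

history: list[str] = []
print(react_step("Where is order A-1042?", history))
```

A production agent repeats this loop until the model emits a final answer, and the platform pieces Microsoft describes (managed identities, connectors, observability) wrap around exactly this reason-act-observe cycle.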

49:46 OneLake costs simplified: lowering capacity utilization when accessing OneLake | Microsoft Fabric Blog | Microsoft Fabric

  • Microsoft has unified OneLake’s capacity pricing by reducing proxy transaction rates to match redirect rates, eliminating cost differences based on access method and simplifying capacity planning for Fabric customers.
  • OneLake serves as the centralized data storage foundation for all Microsoft Fabric workloads, including lakehouses and warehouses, with storage billed pay-as-you-go per GB, similar to Azure Data Lake Storage and Amazon S3.
  • The pricing alignment removes architectural complexity for organizations using OneLake with third-party tools like Azure Databricks or Snowflake, as all access paths now consume Fabric Capacity Units at the same low rate.
    • The term “low” is VERY subjective.
  • This positions OneLake as a more cost-predictable alternative to managing separate data lakes across cloud providers, particularly for enterprises already invested in the Microsoft ecosystem.
  • The change reflects Microsoft’s strategy to make OneLake an open, vendor-neutral data platform that can serve as a single source of truth regardless of which analytics tools organizations choose to use.

51:12 Introducing Azure Linux with OS Guard: Secure, Immutable, and Open-Source Container Host

  • Azure Linux with OS Guard is Microsoft’s new hardened container host OS that enforces immutability, code integrity, and mandatory access control – essentially a locked-down version of Azure Linux designed specifically for high-security container workloads on AKS.
  • The OS uses IPE (Integrity Policy Enforcement), recently upstreamed in Linux kernel 6.12, to ensure only trusted binaries from dm-verity protected volumes can execute, including container layers – this prevents rootkits, container escapes, and unauthorized code execution.
  • Built on FedRAMP-certified Azure Linux 3.0, it inherits FIPS 140-3 cryptographic modules and will gain post-quantum cryptography support as NIST algorithms become available – positioning it for regulated workloads and future security requirements.
  • Unlike AWS Bottlerocket, which focuses on minimal attack surface, Azure Linux with OS Guard emphasizes code integrity verification throughout the stack – from Secure Boot through user space – while maintaining compatibility with standard container workloads.
  • Available soon as an AKS OS SKU via preview CLI with feature flag, customers can test the community edition now on Azure VMs – targeting enterprises needing stronger container security without sacrificing the operational benefits of managed Kubernetes.

46:42 Ryan – “This is interesting, because according to the blog post, it takes a sort of different approach than what we’ve seen in the past with CoreOS and Bottlerocket and stuff – where they’re trying to reduce what’s in that image so much that you can’t have anything vulnerable that can be exploited in it. And this uses a lot more of the protected VMs, where it uses the sort of encrypted memory objects. And so this is sort of a new take on securing containerized workloads at the compute level.”

53:45 Microsoft is a Leader in the 2025 Gartner® Magic Quadrant for Container Management | Microsoft Azure Blog

  • Justin is turning into a softy and wanted to make it up to Azure for being so low on the last Magic Quadrant, so here we are.
  • Microsoft has been named a Leader in Gartner’s 2025 Magic Quadrant for Container Management for the third consecutive year, highlighting its comprehensive container portfolio that includes Azure Kubernetes Service (AKS), Azure Container Apps, and Azure Arc for hybrid/multi-cloud deployments.
  • AKS Automatic (preview) aims to simplify Kubernetes adoption by providing production-ready clusters with automated node provisioning, scaling, and CI/CD integration, while Azure Container Apps offers serverless containers with scale-to-zero capabilities and per-second billing for GPU workloads.
  • The platform integrates AI workload support through GPU-optimized containers in AKS and serverless GPUs in Container Apps, with Microsoft’s KAITO project simplifying open-source AI model deployment on Kubernetes; Microsoft notes that AKS powers ChatGPT’s infrastructure, serving 500M weekly users.
  • Azure Kubernetes Fleet Manager addresses enterprise-scale challenges by enabling policy-driven governance across multiple AKS clusters, while node auto-provisioning automatically selects cost-effective VM sizes based on workload demands to optimize spending.
  • Key differentiators include deep integration with Azure’s ecosystem (networking, databases, AI services), developer tools like GitHub Copilot for Kubernetes manifest generation, and Azure Arc’s ability to manage on-premises and edge Kubernetes deployments through a single control plane.

Oracle

54:57 Oracle to Offer Google Gemini Models to Customers (2025-08-14)

  • Oracle is partnering with Google Cloud to bring Gemini 1.5 Pro and Gemini 1.5 Flash models to Oracle Cloud Infrastructure (OCI) Generative AI service, marking Oracle’s first major third-party LLM partnership beyond Cohere.
  • This positions Oracle as a multi-model cloud provider similar to AWS Bedrock and Azure OpenAI Service, though arriving later to market with a more limited selection compared to competitors’ broader model portfolios.
  • The integration targets Oracle’s existing enterprise customers who want to use Google’s models while keeping data within OCI’s security boundaries, particularly appealing to regulated industries already invested in Oracle’s ecosystem.
  • Gemini models will be available through OCI’s standard APIs with Oracle’s built-in security features, though pricing details remain unannounced, which makes cost comparison with direct Google Cloud access impossible.
  • The real test will be whether Oracle can attract new AI workloads or simply provide convenience for existing Oracle shops that would have used Google Cloud directly anyway.

56:01 Ryan – “What a weird thing.”

Other Clouds

56:42 DigitalOcean stock jumps nearly 29% as earnings and revenue top expectations – SiliconANGLE

  • DigitalOcean reported Q2 earnings of 59 cents per share on $219M revenue (14% YoY growth), beating analyst expectations and driving a 29% stock surge. The company’s focus on higher-spending “Scalers+” customers (spending $500+ monthly) showed 35% YoY growth and now represents nearly 25% of total revenue.
  • The company launched Gradient AI Platform, providing managed access to GPU infrastructure and foundation models from Anthropic, Meta, Mistral AI, and OpenAI. AI-related revenue more than doubled year-over-year, indicating strong developer adoption for building AI applications.
  • DigitalOcean partnered with AMD to expand GPU capabilities through GPU Droplets and the AMD Developer Cloud.
  • This positions them to compete more effectively in the AI infrastructure market against larger cloud providers.
  • The company achieved its highest incremental ARR since Q4 2022 and maintained a 109% net dollar retention rate for Scalers+ customers.
  • Full-year guidance of $888-892M revenue exceeded analyst expectations of $880.81M.
  • With over 60 new product features shipped across compute, storage, and networking categories, DigitalOcean continues to expand beyond its traditional developer-focused offerings.
  • The strong financial performance suggests their strategy of targeting both core cloud and AI workloads is resonating with customers.

58:14 Introducing SQL Stored Procedures in Databricks | Databricks Blog

  • Hold on to your butts… Databricks has entered Jurassic Park territory. Insert an Ian Malcolm meme here.
  • Databricks introduces SQL Stored Procedures following ANSI/PSM standards, enabling users to encapsulate repetitive SQL logic for data cleaning, ETL workflows, and business rule updates while maintaining Unity Catalog governance.
  • This addresses a key gap for enterprises migrating from traditional data warehouses that rely heavily on stored procedures.
  • The feature supports parameter types (IN, OUT, INOUT), nested/recursive calls, and integrates with SQL Scripting capabilities, including control flow, variables, and dynamic SQL execution. Unlike functions that return values, procedures execute sequences of statements, making them ideal for complex workflows (see the sketch after this list).
  • Early adopters like ClicTechnologies report improved performance, scalability, and reduced deployment time for critical workloads like customer segmentation. The ability to migrate existing procedures without rewriting code significantly simplifies transitions from legacy systems.
  • Key limitations heading toward GA include a lack of support for cursors, exception handling, and table-valued parameters, with temporary tables and multi-statement transactions currently in private preview. These gaps may impact complex enterprise workload migrations.
  • This positions Databricks to better compete with traditional enterprise data warehouses by offering familiar SQL constructs while maintaining lakehouse advantages. The commitment to contribute this to Apache Spark ensures broader ecosystem adoption beyond Databricks.
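A hedged sketch of what this can look like in practice: the SQL below is approximate ANSI SQL/PSM in the spirit of the announcement (check Databricks’ documentation for the exact CREATE PROCEDURE syntax and three-level naming), submitted through the real `databricks-sql-connector` Python client; the hostname, HTTP path, token, and table names are placeholders.

```python
# pip install databricks-sql-connector   (connection values below are placeholders)
from databricks import sql

CREATE_PROC = """
CREATE OR REPLACE PROCEDURE main.sales.archive_stale_leads(IN days_old INT)
LANGUAGE SQL
AS BEGIN
  -- Move and then remove leads that have not been touched in `days_old` days.
  INSERT INTO main.sales.leads_archive
    SELECT * FROM main.sales.leads
    WHERE last_touch < date_sub(current_date(), days_old);
  DELETE FROM main.sales.leads
    WHERE last_touch < date_sub(current_date(), days_old);
END
"""

with sql.connect(server_hostname="dbc-xxxx.cloud.databricks.com",
                 http_path="/sql/1.0/warehouses/xxxx",
                 access_token="dapi-...") as conn:
    with conn.cursor() as cur:
        cur.execute(CREATE_PROC)                                  # define once
        cur.execute("CALL main.sales.archive_stale_leads(90)")    # reuse anywhere
```

The point of the pattern is that the archival logic lives in Unity Catalog under governance, so jobs, notebooks, and downstream tools all call the same procedure instead of copy-pasting the SQL.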

59:28 Ryan – “Database people are gonna do data things.”

Cloud Journey

1:00:42 A guide to platform engineering | Google Cloud Blog

  • Google introduces “shift down” strategy for platform engineering, moving responsibilities from developers into the underlying platform infrastructure rather than the traditional DevOps “shift left” approach that pushes work earlier in development cycles.
  • The approach categorizes development ecosystems into types (0-4) based on how much control and quality assurance the platform provides – from flexible “YOLO” (yes, it really is called that, and yes, Ryan is now contractually obligated to get a tattoo of it) environments to highly controlled “Assured” systems where the platform handles security and reliability.
  • Key technical implementation relies on proper abstractions and coupling design to embed quality attributes like security and performance directly into the platform, reducing operational burden on individual developers.
  • Organizations should work backwards from their business model to determine the right platform type, balancing developer flexibility against risk tolerance and quality requirements for different applications.
  • This represents a shift in thinking about platform engineering – instead of one-size-fits-all approaches, Google advocates for intentionally choosing different platform types based on specific business needs and acceptable risk levels.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod
