321: The Cloud Pod is in Tears Trying to Understand Azure Tiers
Welcome to episode 321 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Matt are all on hand to bring you the latest in cloud and AI news, including increased metrics data (because who doesn’t love more data), some issues over at Cloudflare, and even bigger issues at Builder.ai – plus so much more. Let’s get started!
Titles we almost went with this week
- Lost in Translation: Google Helps IPv6 Find Its Way to IPv4
- BigQuery’s Soft Landing for Hard Problems
- CloudWatch Gets a Two-Week Memory Upgrade
- VM Glow-Up: From Gen1 Zero to Gen2 Hero
- Azure Gets Contextual: API Management Learns to Speak AI
- The Cloud Pod: Now Broadcasting from 20,000 Leagues Under the Sea
- LoRA LoRA on the Wall, Who’s the Finest Model of Them All
- Azure Says MFA or the Highway for Resource Management
- Two-Factor or Two-Furious: Azure’s Security Ultimatum
- Agent 007: License to Build
- CUD You Believe It? Google’s Discounts Get More Flexible
- WAF’s New Deal: Free Logs with Every Million Requests Served
- SOC It To Me: Google’s AI Security Workshop Tour
- MFA mandatory in Azure, now you too can hate/hate MS Authenticator
- AWS AMIs no longer the Tribbles of cloud computing
- ECS Exec; Justin’s prediction from 2018 finally comes true
General News
00:56 FinOps Weekly Summit 2025
- Victor Garcia reached out and asked us to share the news about the FinOps Weekly Summit coming up on October 23rd, 2025.
- A lot of great speakers; if you’re in the FinOps space, we recommend it.
- Want to register? You can do that here.
01:53 Ignite Registration Opens
- San Francisco, Moscone Center
- November 18–21, 2025
- Need to convince your manager to pay for you to go? Find that letter here.
02:45 Addressing the unauthorized issuance of multiple TLS certificates for 1.1.1.1
- Some issues over at Cloudflare recently…
- Fina CA issued 12 unauthorized TLS certificates for Cloudflare’s 1.1.1.1 DNS resolver IP address between February 2024 and August 2025, violating domain control validation requirements and potentially allowing man-in-the-middle attacks on DNS-over-TLS and DNS-over-HTTPS connections.
- The incident highlights vulnerabilities in the Certificate Authority trust model where any trusted CA can issue certificates for any domain or IP without proper validation, though exploitation would require the attacker to have the private key, intercept traffic, and target clients that trust Fina CA (primarily Microsoft systems).
- Cloudflare failed to detect these certificates for months, despite operating its own Certificate Transparency monitoring service, because that system was configured to alert on domain-name certificates but not on certificates issued for IP addresses, exposing gaps in its internal security monitoring.
- The certificates have been revoked and no evidence of malicious use was found, but the incident demonstrates why Certificate Transparency logs are critical infrastructure – without Fina CA voluntarily logging these test certificates, they might never have been discovered.
- Organizations should review their root certificate stores and consider removing or restricting CAs with poor validation practices, while DNS client developers should implement Certificate Transparency validation requirements similar to modern browsers to prevent future incidents.
02:58 Matt – “I really like how in this they say we messed up, but also you should go review everyone that you don’t trust, and only keep ours, because we ARE trusted, and look what we just found and how we fixed it.”
AI Is Going Great – Or How ML Makes Money
06:02 How Builder.ai Collapsed Amid Silicon Valley’s Biggest Boom – The New York Times
- Builder.ai collapsed from a $1.5 billion valuation to bankruptcy after the board discovered sales were overstated by 75% – reported $217M revenue in 2024 was actually $51M, highlighting risks in AI startup valuations during the current investment boom
- The company spent 80% of revenue on marketing rather than product development, using terms like “AI-powered” and “machine learning” without substantial AI technology – its “Natasha AI” product manager was reportedly assisted by 700 Indian programmers rather than autonomous AI
- Microsoft invested $30M and partnered with Builder for cloud storage integration, while other investors included Qatar Investment Authority, SoftBank’s DeepCore, and Jeffrey Katzenberg – total funding reached $450M before the collapse
- SEC has charged multiple AI startups with fraud this year, including GameOn ($60M investor losses) and Nate (shopping app using Filipino contractors instead of AI), with Builder now under investigation by Southern District of New York prosecutors
- The .ai domain registrations are approaching 1 million, with 1,500 new ones added daily, compared to an estimated 10,000 total ventures during the dot-com era – demonstrating the scale of the current AI investment frenzy, where companies rebrand to attract funding
07:30 Ryan – “I’ve definitely seen this before, and you know, this sort of model of that’s like ‘we’ve got machine learning, we got this, and now it’s with AI too’. It’s the same sort of thing – fake it till you make it only goes so far.”
09:31 The Visual Studio August Update is here – smarter AI, better debugging, and more control – Visual Studio Blog
- Visual Studio’s August 2025 update integrates GPT-5 and introduces Model Context Protocol (MCP) support, enabling developers to connect AI agents directly to databases, code search, and deployment systems without custom integrations for each tool.
- MCP functions as “the HTTP of tool connectivity” with OAuth support for any provider, one-click server installation from web repositories, and governance controls via GitHub policy settings for enterprise compliance.
- The enhanced Copilot Chat now uses improved semantic code search to automatically retrieve relevant code snippets from natural language queries across entire solutions, reducing manual navigation time.
- Developers can now bring their own AI models using API keys from OpenAI, Google, or Anthropic, providing flexibility for teams with specific performance, privacy, or cost requirements in their cloud development workflows.
- New features include partial code completion acceptance (word-by-word or line-by-line), Git history context in chat, and unified debugging for Unreal Engine that combines Blueprint and native C++ code in a single session.
10:50 Ryan – “I’ve been using Copilot almost exclusively for a little while in VS Code, just because it’s better than some of the add-ons. There’s a couple of other integrations you can use with AWS Q and Gemini, and you can sort of tack them on, but Copilot, you can use multiple languages, and it has just built-in hooks into the client itself. So I don’t know if it’s a matter of it’s the first one I use, so I’m biased or what, but I really like it.”
AWS
11:37 AWS adds the ability to centrally manage access to AWS Regions and AWS Local Zones
- AWS Global View now provides centralized management of Region and Local Zone access through a single console page, eliminating the need to check opt-in status across multiple locations individually.
- The Regions and Zones page displays infrastructure location details, opt-in status, and parent Region relationships, giving administrators a comprehensive view of their global AWS footprint for compliance and governance purposes.
- This feature addresses a common pain point for enterprises managing multi-region deployments who previously had to navigate to each Region separately to verify access and opt-in status.
- The capability integrates with existing AWS Global View functionality that allows viewing resources across multiple Regions, extending the service’s utility for global infrastructure management.
- Available in all commercial AWS Regions at no additional cost, the feature simplifies Region access auditing and helps prevent accidental deployments to unauthorized locations.
- This is available for free…so thanks, Amazon. We’ll always happily accept services that should have existed a decade ago.
14:42 Amazon CloudWatch now supports querying metrics data up to two weeks old
- CloudWatch Metrics Insights now queries metrics data up to 2 weeks old instead of just 3 hours, enabling longer-term trend analysis and post-incident investigations using SQL-based queries.
- This extension addresses a significant limitation for teams monitoring dynamic resource groups, who previously couldn’t visualize historical data beyond 3 hours when using Metrics Insights queries.
- The feature is automatically available at no additional cost in all commercial AWS regions, with standard CloudWatch pricing applying only for alarms, dashboards, and API usage. (Although you’re already paying for CloudWatch Metrics Insights, so don’t let them fool you.)
- Operations teams can now investigate incidents days after they occur and identify patterns across their infrastructure without switching between different query methods or data sources.
- This positions CloudWatch Metrics Insights as a more viable alternative to third-party monitoring solutions that already offer extended historical data access for SQL-based metric queries.
15:35 Ryan – “3 hours is nowhere near enough. So many workloads are cyclical across a day, or we’ll even have different traffic patterns across a week, so it’s kind of crazy to me – 3 hours. I never used CloudWatch Metrics and now I understand why.”
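For context, a Metrics Insights query is just SQL over a metric schema. Here’s a minimal sketch of the new 14-day lookback – the metric and dimension names are illustrative, and the request is built as a plain dict (the shape boto3’s `get_metric_data` accepts) so it runs without AWS credentials:

```python
from datetime import datetime, timedelta, timezone

# Illustrative Metrics Insights SQL: top-10 instances by average CPU.
QUERY = (
    'SELECT AVG(CPUUtilization) '
    'FROM SCHEMA("AWS/EC2", InstanceId) '
    'GROUP BY InstanceId '
    'ORDER BY AVG() DESC LIMIT 10'
)

end = datetime.now(timezone.utc)
start = end - timedelta(days=14)  # the new two-week window (was 3 hours)

# Parameters you could pass to boto3's cloudwatch.get_metric_data().
request = {
    "MetricDataQueries": [
        {"Id": "q1", "Expression": QUERY, "Period": 300}
    ],
    "StartTime": start,
    "EndTime": end,
}
```
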
16:46 Amazon CloudWatch query alarms now support monitoring metrics individually
- CloudWatch query alarms now monitor multiple individual metrics through a single alarm using Metrics Insights SQL queries with GROUP BY and ORDER BY conditions, automatically adjusting as resources are created or deleted.
- This solves the operational burden of managing separate alarms for dynamic resource fleets like auto-scaling groups, where teams previously had to choose between aggregated monitoring or maintaining individual alarms for each resource.
- The feature works by creating alarms on Metrics Insights queries that dynamically update results with each evaluation, ensuring no resources go unmonitored as infrastructure scales up or down.
- Available in all commercial AWS regions plus GovCloud and China regions, with standard Metrics Insights query alarm pricing applying per the CloudWatch pricing page.
- Yet another of the “this should have been here 10 years ago” features. But what do we know?
- Real-world use cases include monitoring per-instance metrics across auto-scaling groups, tracking individual Lambda function performance in serverless architectures, or watching container metrics in dynamic ECS/EKS clusters without manual alarm management.
17:34 Ryan – “I can’t believe this took so long.”
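The GROUP BY behavior above can be sketched as alarm parameters. Everything here – names, threshold, periods – is made up, but the dict matches the shape boto3’s `put_metric_alarm` accepts, and the GROUP BY is what makes the alarm evaluate each instance individually as the fleet changes:

```python
# Hypothetical per-instance CPU alarm backed by a Metrics Insights query.
alarm = {
    "AlarmName": "per-instance-cpu-high",
    "EvaluationPeriods": 3,
    "Threshold": 90.0,
    "ComparisonOperator": "GreaterThanThreshold",
    "Metrics": [
        {
            "Id": "q1",
            "Expression": (
                'SELECT MAX(CPUUtilization) '
                'FROM SCHEMA("AWS/EC2", InstanceId) '
                'GROUP BY InstanceId'
            ),
            "Period": 300,
            # Alarm on the query results themselves.
            "ReturnData": True,
        }
    ],
}
```
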
18:09 Announcing general availability of Organizational Notification Configurations for AWS User Notifications
- AWS User Notifications now supports centralized notification management across Organizations, allowing Management Accounts or up to 5 Delegated Administrators to configure and view notifications for specific OUs or entire organizations from a single location.
- The feature integrates with Amazon EventBridge Events, enabling organizations to create notification rules for security events like console sign-ins without MFA, with alerts delivered to the AWS Console Mobile Application and Console Notifications Center.
- This addresses a key operational challenge for multi-account organizations by eliminating the need to configure notifications individually in each member account, significantly reducing administrative overhead for security and compliance monitoring.
- Organizations can now implement consistent notification policies across hundreds or thousands of accounts, improving incident response times and ensuring critical events don’t go unnoticed in sprawling AWS environments.
- The service is available in all AWS Regions where User Notifications is supported, with no additional pricing beyond standard EventBridge and notification delivery costs.
21:15 Justin – “The theme of the Amazon section today is just everything Ryan and I asked for ten years ago, in general.”
20:27 Amazon EC2 announces AMI Usage to better monitor the use of AMIs
- AMI Usage provides free visibility into which AWS accounts are consuming your AMIs across EC2 instances and launch templates, eliminating the need for custom tracking scripts that previously created operational overhead.
- The feature enables dependency checking within your account to identify resources using specific AMIs, including EC2 instances, launch templates, Image Builder recipes, and SSM parameters before deregistration.
- This addresses a common operational challenge where organizations struggle to track AMI proliferation across multiple accounts and teams, potentially reducing costs from unused or orphaned AMIs.
- The service is available at no additional cost in all AWS regions, including China and GovCloud, making it accessible for compliance-sensitive workloads that need AMI governance.
- Organizations can now safely deprecate old AMIs by understanding their full usage footprint, supporting better security hygiene, and reducing the attack surface from outdated images.
22:21 ECS Exec is now available in the AWS Management Console
- ECS Exec now provides direct console access to running containers without SSH keys or inbound ports, eliminating the need to switch between console and CLI for debugging tasks.
- The feature integrates with CloudShell to open interactive sessions directly from task details pages, while displaying the underlying CLI command for local terminal use.
- Console configuration includes encryption and logging settings at the cluster level, with ECS Exec enablement available during service and task creation or updates.
- This addresses a common debugging workflow where developers need quick container access for troubleshooting applications and examining running processes in production environments.
- Available in all AWS commercial regions with no additional charges beyond standard ECS and CloudShell usage costs.
23:10 Justin – “You can get to EC2 through SSM, and then you could access ECS tasks from there. But now you can just go right from the console, which is kind of nice.”
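The console now surfaces the equivalent CLI invocation for local use; as a sketch, here’s that command assembled in Python (the cluster, task ARN, and container names are placeholders):

```python
import shlex

# Equivalent of the ECS Exec session the console opens via CloudShell;
# all identifiers below are placeholders.
cmd = [
    "aws", "ecs", "execute-command",
    "--cluster", "prod-cluster",
    "--task", "arn:aws:ecs:us-east-1:123456789012:task/prod-cluster/abc123",
    "--container", "app",
    "--interactive",
    "--command", "/bin/sh",
]

print(shlex.join(cmd))
```
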
26:55 AWS IAM launches new VPC endpoint condition keys for network perimeter controls
- AWS IAM introduces three new global condition keys (aws:VpceAccount, aws:VpceOrgPaths, aws:VpceOrgID) that enable organizations to enforce network perimeter controls by ensuring requests to AWS resources only come through their VPC endpoints.
- These condition keys automatically scale with VPC usage and eliminate the need to manually enumerate VPC endpoints or update policies when adding or removing endpoints, working across SCPs, RCPs, resource-based policies, and identity-based policies.
- The feature addresses a common security requirement for enterprises that need to restrict access to AWS resources from specific network boundaries, particularly useful for organizations with strict compliance requirements around data locality and network isolation.
- Currently limited to a select set of AWS services that support AWS PrivateLink, which may require careful planning for organizations looking to implement comprehensive network perimeter controls across their entire AWS footprint.
- This enhancement simplifies zero-trust network architectures by providing granular control at the account, organization path, or entire organization level without the operational overhead of maintaining extensive VPC endpoint lists in policies.
27:39 Ryan – “It’s a good thing to have. It’s definitely on a lot of control frameworks, so it’s nice to have that easier button to check that compliance box.”
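As a sketch of how these keys get used, here’s a hypothetical SCP (built as a Python dict) that denies S3 access unless the request arrives through a VPC endpoint owned by your organization – the org ID is a placeholder, and the `IfExists` variant keeps non-endpoint service paths from being caught accidentally:

```python
import json

# Hypothetical SCP using the new aws:VpceOrgID condition key.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "*",
            "Condition": {
                # Deny unless the request came via one of our org's endpoints.
                "StringNotEqualsIfExists": {"aws:VpceOrgID": "o-example123"}
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```
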
28:50 AWS WAF now includes free WAF Vended Logs based on request volume
- AWS WAF now provides 500 MB of free CloudWatch Logs Vended Logs ingestion for every 1 million WAF requests processed, helping customers reduce logging costs while maintaining security visibility.
- The free allocation applies automatically to your AWS bill at month’s end and covers both CloudWatch and S3 destinations, with usage beyond the included amount charged at standard WAF Vended Logs pricing.
- This change addresses a common customer pain point where WAF logging costs could become substantial for high-traffic applications, making comprehensive security monitoring more accessible for cost-conscious organizations.
- Customers can leverage CloudWatch’s analytics capabilities, including Log Insights queries, anomaly detection, and dashboards, to analyze web traffic patterns and security events without worrying about base logging costs.
- The pricing model scales with usage, meaning customers who process more requests through WAF automatically receive more free log storage, aligning logging costs with actual traffic volume.
31:27 AWS Config now supports resource tags for IAM Policies
- AWS Config now adds resource tag tracking for IAM policies, enabling teams to filter and evaluate IAM policy configurations based on tags for improved governance and compliance monitoring.
- This enhancement allows Config rules to evaluate IAM policies selectively using tags, making it easier to enforce different compliance standards across development, staging, and production policies without creating separate rules for each environment.
- Multi-account organizations can now use Config aggregators to collect IAM policy data across accounts filtered by tags, streamlining centralized governance for policies that match specific tag criteria like department or compliance scope.
- The feature arrives at no additional cost in all supported AWS regions and automatically populates tags when recording IAM policy resource types, requiring only Config recorder configuration to enable.
- This addresses a common pain point where teams struggled to apply granular Config rules to subsets of IAM policies, previously requiring custom Lambda functions or manual processes to achieve tag-based policy governance.
32:32 Ryan – “Taking away all that Lambda spackle…making that no longer necessary? That’s fantastic.”
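A tag-scoped Config rule might be sketched like this – the rule name and tag are made up, and the `Scope` shape (`TagKey`/`TagValue`) is standard AWS Config; the managed rule identifier shown is an existing IAM-policy rule:

```python
# Hypothetical Config rule that evaluates only IAM policies tagged
# environment=production, now that Config records tags for that type.
rule = {
    "ConfigRuleName": "prod-iam-policy-check",
    "Scope": {
        "TagKey": "environment",
        "TagValue": "production",
    },
    "Source": {
        "Owner": "AWS",
        "SourceIdentifier": "IAM_POLICY_NO_STATEMENTS_WITH_ADMIN_ACCESS",
    },
}
```
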
GCP
33:23 Connect IPv6-only workloads to IPv4 with DNS64 and NAT64 | Google Cloud Blog
- Google Cloud introduces DNS64 and NAT64 to enable IPv6-only workloads to communicate with IPv4 services, addressing the critical gap as enterprises transition away from increasingly scarce IPv4 addresses while maintaining access to legacy IPv4 applications.
- This feature allows organizations to build pure IPv6 environments without dual-stack complexity, using DNS64 to synthesize IPv6 addresses from IPv4 DNS records and NAT64 gateways to translate the actual traffic between protocols.
- The implementation leverages Google’s existing Cloud NAT infrastructure with a simple three-step setup process: create IPv6-only VPC and subnets, enable a DNS64 server policy, and configure a NAT64 gateway through Cloud Router.
- Key use cases include enterprises facing private IPv4 address exhaustion, organizations with IPv6 compliance requirements, and companies wanting to future-proof their infrastructure while maintaining backward compatibility with IPv4-only services.
- While AWS offers similar functionality through NAT64 and DNS64 in their VPCs, Google’s approach integrates directly with their Cross-Cloud Network strategy, potentially simplifying multi-cloud IPv6 deployments for organizations using hybrid architectures.
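The address synthesis DNS64 performs is standardized (RFC 6052’s well-known 64:ff9b::/96 prefix): the IPv4 address is embedded in the low 32 bits of the IPv6 address. A sketch using only the standard library:

```python
import ipaddress

def dns64_synthesize(ipv4: str, prefix: str = "64:ff9b::/96") -> str:
    """Embed an IPv4 address in the well-known DNS64/NAT64 prefix (RFC 6052)."""
    net = ipaddress.IPv6Network(prefix)
    v4 = int(ipaddress.IPv4Address(ipv4))
    # OR the 32-bit IPv4 value into the low bits of the /96 prefix.
    return str(ipaddress.IPv6Address(int(net.network_address) | v4))

print(dns64_synthesize("192.0.2.33"))  # 64:ff9b::c000:221
```

This is what the DNS64 policy returns for an IPv4-only hostname; the NAT64 gateway then reverses the embedding on the wire.
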
34:50 BigQuery Managed Disaster Recovery adds soft failover | Google Cloud Blog
- BigQuery Managed Disaster Recovery now offers soft failover, which waits for complete data replication before promoting the secondary region, eliminating the risk of data loss during planned failovers; traditional hard failover could lose up to 15 minutes of data within the RPO window.
- This addresses a key enterprise concern where companies previously had to choose between immediate failover with potential data loss or delayed recovery while waiting for a primary region that might never recover, making DR testing particularly challenging for compliance-driven industries like financial services.
- The feature provides multiple failover options through BigQuery UI, DDL, and CLI, giving administrators granular control over disaster recovery transitions while maintaining their required RTO and RPO objectives without the operational complexity of manual verification.
- While AWS RDS offers similar automated failover capabilities and Azure SQL Database has auto-failover groups, BigQuery’s implementation focuses specifically on analytics workloads with built-in support for cross-region dataset replication and compute failover in a single managed service.
- The soft failover capability enables realistic DR drills without production impact, particularly valuable for regulated industries that require regular disaster recovery testing for compliance while maintaining zero data loss tolerance during planned maintenance windows.
35:36 Ryan – “There’s nothing worse than trying to DR for a giant data set, especially if you have big data querying or job-based things that you’re fronting into your application with those insights. It just can be so nightmarish.”
36:26 Expanded coverage for Compute Flex CUDs | Google Cloud Blog
- Google expands Compute Flex CUDs to cover memory-optimized VMs (M1-M4), HPC instances (H3, H4D), and serverless offerings like Cloud Run and Cloud Functions, allowing customers to apply spend commitments across more services.
- The new billing model charges discounted rates directly instead of using credits, simplifying cost tracking while expanding coverage beyond traditional compute instances to specialized workloads.
- This positions GCP competitively against AWS Reserved Instances and Azure Reserved VM Instances by offering more flexibility – commitments aren’t tied to specific resource types or regions.
- Key beneficiaries include SAP HANA deployments, scientific computing workloads, and organizations with mixed traditional and serverless architectures who can now optimize costs across their entire stack.
- Customers can opt in immediately, with automatic transition for all accounts by January 21, 2026, though new billing accounts created after July 15, 2025, will automatically use the new model.
37:18 Justin – “So, you have to remember there’s CUD and there’s Flex CUDs. So Flex CUDs were only on certain instance types, and it’s more like a savings plan, where the CUD is more like an RI. You get a better discount with a non-flex CUD. So if your workload is pretty static, then a CUD is actually a better use case. But then, when you do want to upgrade, you’re kind of hosed. So this ability allows you to move between the different versions without losing that CUD benefit.”
39:33 Introducing the Agentic SOC Workshops for security professionals | Google Cloud Blog
- Google Cloud is launching Agentic SOC Workshops, a free half-day training series for security professionals to learn practical AI applications in security operations centers, starting in Los Angeles and Chicago this September.
- The workshops focus on teaching security teams how to use AI agents to automate routine security tasks and reduce alert fatigue, positioning Google’s vision of every customer having a virtual security assistant trained by leading security experts.
- Participants will get hands-on experience with Gemini in Google Security Operations through practical exercises and a Capture the Flag challenge, learning to automate workflows that currently consume analyst time.
- This initiative targets security architects, SOC managers, analysts, and CISOs who want to move beyond AI marketing hype to actual implementation, with workshops planned for major cities across North America.
- While AWS and Azure offer security training and AI tools separately, Google is combining both into a focused workshop format specifically designed for SOC modernization, though no pricing details are provided for the underlying Google Security Operations platform.
40:44 Announcing Dataproc multi-tenant clusters | Google Cloud Blog
- Google Dataproc now supports multi-tenant clusters, allowing multiple data scientists to share compute resources while maintaining per-user authorization to data resources through service account mappings.
- This addresses the traditional tradeoff between resource efficiency and workload isolation in shared environments.
- The feature enables dynamic user-to-service-account mapping updates on running clusters and supports YAML-based configuration for managing large user bases.
- Each user’s workloads run with dedicated OS users, Kerberos principals, and restricted access to only their mapped service account credentials.
- Integration with Vertex AI Workbench and third-party JupyterLab deployments provides notebook users with distributed Jupyter kernels across cluster worker nodes.
- The BigQuery JupyterLab extension enables seamless connectivity, with kernel launch times of 30-50 seconds.
- This positions GCP competitively against AWS EMR Studio and Azure Synapse Spark pools by offering granular IAM-based access control in shared clusters. The autoscaling capability allows administrators to optimize costs by scaling worker nodes based on demand rather than provisioning isolated resources per user.
- Currently in public preview with no specific pricing announced beyond standard Dataproc cluster costs.
- Key use cases include data science teams in financial services, healthcare, and retail who need collaborative environments with strict data access controls.
41:21 Ryan – “Like two months ago, they announced serverless Dataproc, and I thought that that would basically mean you wouldn’t need this anymore? Because this means you’re going to host a giant Dataproc cluster and just pay for it all the time in order to use this.”
42:16 Now available: Rust SDK for Google Cloud | Google Cloud Blog
- Google Cloud launches its first official Rust SDK supporting over 140 APIs, including Vertex AI, Cloud KMS, and IAM, addressing the gap where developers previously relied on unofficial or community-maintained libraries that lacked consistent support and security updates.
- The SDK includes built-in authentication with Application Default Credentials, OAuth2, API Keys, and service accounts, with Workload Identity Federation coming soon, making it easier for Rust developers to integrate with Google Cloud’s security model.
- This positions Google Cloud competitively with AWS (which has had an official Rust SDK since 2021) and Azure (which offers Rust support through community SDKs), particularly targeting high-performance backend services, data processing pipelines, and real-time analytics workloads.
- The SDK is available on crates.io and GitHub with comprehensive documentation and code samples, though pricing follows standard Google Cloud API usage rates with no additional SDK-specific costs.
- Key use cases include building memory-safe microservices, secure data processing systems, and performance-critical applications where Rust’s zero-cost abstractions and memory safety guarantees provide advantages over traditional languages.
43:41 Justin – “Good to see more Rusts happening, hopefully to replace legacy C++ apps that are not thread safe.”
Azure
45:22 Generally Available: Upgrade existing Azure Gen1 VMs to Gen2-Trusted launch
- Azure now allows customers to upgrade existing Generation 1 VMs to Generation 2 with Trusted Launch enabled, addressing security gaps in legacy infrastructure without requiring VM recreation or data migration.
- Trusted Launch provides foundational security features, including Secure Boot and vTPM (virtual Trusted Platform Module), protecting VMs against boot kits, rootkits, and kernel-level malware – capabilities that were previously unavailable to Gen1 VM users.
- This positions Azure competitively with AWS Nitro System and GCP Shielded VMs, though Azure’s approach focuses on retrofitting existing workloads rather than requiring new deployments, potentially saving customers significant migration costs and downtime.
- The upgrade path targets enterprises running legacy Windows Server 2012/2016 and older Linux distributions on Gen1 hardware, enabling them to meet modern compliance requirements without application refactoring.
- While the upgrade process requires a VM restart and temporary downtime, it preserves existing configurations, network settings, and data disks, making it practical for production workloads during maintenance windows.
45:25 Matt – “So unlike Windows, Azure sometimes takes a scorched Earth technique – kind of like Apple does – when they release a lot of features and it takes them a while to get that migration path in there, and I kind of think some of it is because they want that time to test it out and get the scale.”
46:25 Generally Available: Gateway-level metrics and native autoscaling for Azure API Management v2 tiers
- Azure API Management v2 tiers now include gateway-level metrics that provide granular visibility into API performance, request patterns, and error rates at the gateway level rather than just service-wide metrics.
- Native autoscaling automatically adjusts compute capacity based on real-time gateway usage metrics, eliminating manual scaling operations and reducing costs during low-traffic periods while maintaining performance during spikes.
- This positions Azure API Management closer to AWS API Gateway’s automatic scaling capabilities, though Azure’s implementation focuses on gateway-specific metrics rather than Lambda-style request-based scaling.
- The feature targets enterprises running mission-critical APIs that need predictable performance without overprovisioning, particularly useful for organizations with variable traffic patterns or seasonal workloads.
- Available across all v2 tiers (Basic, Standard, and Premium), making enterprise-grade scaling accessible to smaller deployments while maintaining the simplified pricing model introduced with v2 tiers.
46:54 Matt – “The Premier Tier – it’s an arm and a leg, so be careful what you’re doing, and by default it’s not HA. It adds up real fast.”
47:42 Announcing gpt-realtime on Azure AI Foundry: | Microsoft Community Hub
- Microsoft launches gpt-realtime on Azure AI Foundry, a speech-to-speech model that combines voice synthesis improvements into a single API with 20% lower pricing than the preview version, positioning Azure to compete with Google’s voice AI capabilities and Amazon’s Polly service.
- The model introduces two new natural voices (Marin and Cedar), enhanced instruction following, and image input support that allows users to discuss visual content through voice without requiring video, expanding beyond traditional text-to-speech limitations.
- Pricing starts at $40 per million input tokens and $160 per million output tokens for the standard tier, with function calling capabilities that let developers integrate custom code directly into voice interactions for building conversational AI applications.
- Target use cases include customer service automation, accessibility tools, and real-time translation services, with the Real-time API enabling developers to build interactive voice applications that process speech input and generate natural responses in a single pass.
- Integration with Azure AI Foundry provides direct model access through Azure’s infrastructure, offering enterprise customers built-in compliance and security features while simplifying deployment compared to managing separate speech recognition and synthesis services.
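At those rates, back-of-envelope costing is straightforward; the token counts in the example are made up:

```python
def realtime_cost(input_tokens: int, output_tokens: int) -> float:
    """Standard-tier pricing cited above: $40/M input, $160/M output tokens."""
    return (input_tokens * 40 + output_tokens * 160) / 1_000_000

# A session consuming 250k input and 100k output tokens:
print(realtime_cost(250_000, 100_000))  # 10.0 + 16.0 = 26.0
```
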
48:56 The Responses API in Azure AI Foundry is now generally available | Microsoft Community Hub
- Azure’s Responses API simplifies building AI agents by handling multi-turn conversations, tool orchestration, and state management in a single API call, eliminating the need for complex orchestration code that developers typically write themselves.
- The API includes six built-in tools: File Search for unstructured content, Function Calling for custom APIs, Code Interpreter for Python execution, Computer Use for UI automation, Image Generation, and Remote MCP Server connectivity, allowing agents to decide which tools to use without manual intervention.
- This positions Azure between AWS Bedrock Agents (which require more manual orchestration) and Google’s Vertex AI Agent Builder, offering a middle ground with pre-built tools while supporting all OpenAI models, including the GPT-5 series and fine-tuned models.
- Early adopters like UiPath are using it for enterprise automation where agents interpret natural language and execute actions across SaaS applications and legacy desktop software, with other implementations in financial services for compliance tasks and healthcare for document analysis.
- The API integrates with Azure AI Foundry’s broader agent stack, where developers can start with the Responses API for single agents, then scale to Agent Service for multi-agent orchestration and enterprise integrations with SharePoint, Bing, and Microsoft Fabric.
49:34 Ryan – “I like these things, but I’ve been burnt by the 365 Graph API so many times…I would use it, but I don’t trust it.”
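To see what the Responses API is taking off developers’ plates, here is a toy version of the tool-selection loop you would otherwise write yourself. The tool names mirror the built-ins listed above, but the keyword routing is a deliberately naive stand-in for the model’s own tool choice, and none of this is Azure SDK code.

```python
# Toy sketch of the tool-orchestration loop that the Responses API
# automates. The routing heuristic is an illustrative assumption,
# not how the model actually decides.
from typing import Callable

def file_search(q: str) -> str:
    return f"searched files for '{q}'"

def code_interpreter(q: str) -> str:
    return f"ran Python for '{q}'"

def function_calling(q: str) -> str:
    return f"called a custom API for '{q}'"

TOOLS: dict[str, Callable[[str], str]] = {
    "file_search": file_search,
    "code_interpreter": code_interpreter,
    "function_calling": function_calling,
}

def pick_tool(query: str) -> str:
    """Naive keyword router standing in for the model's tool choice."""
    q = query.lower()
    if "document" in q or "file" in q:
        return "file_search"
    if "calculate" in q or "compute" in q:
        return "code_interpreter"
    return "function_calling"

def handle(query: str) -> str:
    return TOOLS[pick_tool(query)](query)

print(handle("calculate last quarter's totals"))
```

With the real API, the model performs the `pick_tool` step itself across multi-turn state, which is exactly the orchestration code this replaces.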
50:19 Azure mandatory multifactor authentication: Phase 2 starting in October 2025 | Microsoft Azure Blog
- Azure is implementing mandatory MFA for all resource management operations starting October 1, 2025, expanding beyond the portal-only enforcement that was completed in March 2025.
- This Phase 2 enforcement covers Azure CLI, PowerShell, REST APIs, SDKs, and Infrastructure as Code tools, addressing the fact that MFA blocks 99.2% of account compromise attacks.
- The enforcement uses Azure Policy for gradual rollout and allows Global Administrators to postpone implementation if needed.
- Workload identities like managed identities and service principals remain unaffected, maintaining automation capabilities while securing human access.
- Organizations need to update to Azure CLI version 2.76 and Azure PowerShell version 14.3 or later for compatibility. Microsoft provides built-in Azure Policy definitions to test impact before enforcement, allowing gradual application across different resource scopes, types, or regions.
- This positions Azure ahead of AWS and GCP in mandatory security controls, as neither competitor currently enforces MFA for all management operations by default. The approach balances security improvements with operational flexibility through postponement options and phased rollouts.
- The enforcement applies to Azure Public Cloud only, with no announced timeline for Azure Government or other sovereign clouds.
- Organizations can use Azure Service Health notifications and email alerts to track their enforcement timeline and prepare accordingly.
50:53 Justin – “It went so well – the first phase of this – I can’t imagine Phase 2 is going to go any better than the first phase did.”
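Before the October enforcement date, the minimum tool versions above are worth checking in a preflight script. This is a minimal sketch: the installed version strings are hard-coded illustrations, and in practice you would parse the output of `az version` or your PowerShell module inventory instead.

```python
# Preflight check against the minimum versions cited above
# (Azure CLI >= 2.76, Azure PowerShell >= 14.3).
# Installed versions below are illustrative assumptions.

def parse(v: str) -> tuple[int, ...]:
    """Turn '2.76.0' into (2, 76, 0) for tuple comparison."""
    return tuple(int(p) for p in v.split("."))

MINIMUMS = {"azure-cli": "2.76.0", "azure-powershell": "14.3.0"}

def compatible(tool: str, installed: str) -> bool:
    return parse(installed) >= parse(MINIMUMS[tool])

installed = {"azure-cli": "2.75.0", "azure-powershell": "14.4.1"}
for tool, version in installed.items():
    status = "OK" if compatible(tool, version) else "UPGRADE NEEDED"
    print(f"{tool} {version}: {status}")
```

Tuple comparison handles multi-digit components correctly (14.10 sorts after 14.3), which naive string comparison would get wrong.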
52:16 Agent Factory: From prototype to production—developer tools and rapid agent development | Microsoft Azure Blog
- Azure AI Foundry addresses the challenge of rapidly moving AI agents from prototype to production by providing a unified development experience across VS Code, GitHub, and enterprise deployment channels.
- The platform supports Microsoft frameworks such as Semantic Kernel and AutoGen alongside open-source options including LangGraph, LlamaIndex, and CrewAI, allowing developers to use their preferred tools while retaining enterprise-grade capabilities.
- The platform implements open protocols, including Model Context Protocol (MCP) for tool interoperability and Agent-to-Agent (A2A) for cross-platform agent collaboration, positioning Azure as protocol-agnostic compared to more proprietary approaches from competitors.
- This enables agents built on different frameworks to communicate and share capabilities across vendor boundaries.
- Azure AI Foundry integrates directly with Microsoft 365 and Copilot through the Microsoft 365 Agents SDK, allowing developers to deploy agents to Teams, BizChat, and other productivity surfaces where business users already work. The platform also provides REST API exposure and Logic Apps integration with thousands of prebuilt connectors to enterprise systems.
- The VS Code extension enables local agent development with integrated tracing, evaluation, and one-click deployment to Foundry Agent Service, while the unified Model Inference API allows model swapping without code changes. This addresses the common pain point of agents working locally but requiring extensive rewrites for production deployment.
- Built-in observability, continuous evaluation through CI/CD integration, and enterprise guardrails for identity, networking, and compliance are integrated into the development workflow rather than added post-deployment. This positions Azure AI Foundry as focusing on production readiness from the start, targeting enterprises that need rapid agent development without sacrificing governance.
Closing
And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter or Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.