Saturday, December 20, 2025

Building a Google Workspace Security Agent with ADK and Policy API

You might have Google Workspace configured perfectly for your startup or enterprise today. But configuration is not a one-time event; it's a never-ending lifecycle. You need to verify whether 2-Step Verification is truly enforced for everyone, or check which context-aware access levels are active. Clicking through the Google Admin Console to verify hundreds of settings is manual labor.

How do you create an agent with Gemini 3 Flash that checks these settings continuously?



Google has recently introduced the Policy API (part of Cloud Identity API), which lets you programmatically view settings that usually live deep inside the Google Workspace Admin Console UI.



Since I've been spending a lot of time with the Agent Development Kit (ADK) lately, I had an idea: Why not build a specific "Auditing Agent" to do the heavy lifting for me?



The Google Workspace Admin Console is great for setting things up, but difficult for auditing at scale. If you want to know "What is the security posture of the 'Marketing' group?", you have to dig deep. The official Policy API allows you to fetch all these policies programmatically. 
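
To give a rough idea of what "fetching policies programmatically" involves, here is a minimal sketch that pages through the `policies.list` method. It assumes the `https://cloudidentity.googleapis.com/v1/policies` endpoint with `pageSize`/`pageToken` query parameters and a response containing `policies` and `nextPageToken`. `fetchJson` is a hypothetical helper you would supply yourself (for example, a wrapper that adds the OAuth bearer token); it is not part of any library.

```javascript
// Assumed URL of the policies.list method in the Cloud Identity API (v1).
const POLICIES_URL = 'https://cloudidentity.googleapis.com/v1/policies';

/**
 * Collects all policies by following nextPageToken until none is returned.
 * fetchJson is a hypothetical caller-supplied function: (url) => parsed JSON body.
 */
function listAllPolicies(fetchJson, pageSize = 50) {
  const policies = [];
  let pageToken = '';
  do {
    let url = `${POLICIES_URL}?pageSize=${pageSize}`;
    if (pageToken) url += `&pageToken=${encodeURIComponent(pageToken)}`;
    const page = fetchJson(url);
    // A page without a "policies" field is treated as empty.
    policies.push(...(page.policies || []));
    pageToken = page.nextPageToken || '';
  } while (pageToken);
  return policies;
}
```

In the agent described below, this kind of plumbing is not written by hand; the sketch only shows what the tool calls amount to.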




I built a Proof-of-Concept Security Agent (with ADK) that reads the API documentation (OpenAPI spec), authenticates securely (OAuth), and answers our questions directly with Gemini 3 Flash.

The Architecture recap
The setup is surprisingly simple. It is about connecting three things: 
1. Policy API: The source of truth for your policies. 
2. ADK (Agent Development Kit): The framework that handles the plumbing (OAuth, tool calling).
3. Gemini 3 Flash: The brain that interprets the policy data.

Agent setup

Here is the workflow I followed.

I visited the Cloud Identity API documentation page, because the Policy API is part of this API.






There is no OpenAPI specification published there, but the API is described in the Google API Discovery format. See this JSON file: https://cloudidentity.googleapis.com/$discovery/rest?version=v1

So how do you get an OpenAPI file? I built a converter app in Google AI Studio that converts the Discovery JSON into OpenAPI YAML.

My prompt was:
create an application to convert static google discovery specification to openapi 3 yaml format.
I will paste a google discovery format (as json) and get return in yaml (text)
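
For illustration only, the core of such a conversion can be sketched in a few lines. This is not the converter that AI Studio generated for me; it is a simplified example that maps only methods and their parameters, ignores request and response schemas, and returns a plain object that would still need to be serialized to YAML. The function name `discoveryToOpenApi` is made up for this sketch.

```javascript
/**
 * Converts the methods of a Google API Discovery document into a minimal
 * OpenAPI 3 object. Sketch only: request/response schemas are ignored.
 */
function discoveryToOpenApi(discovery) {
  const baseUrl = (discovery.rootUrl + (discovery.servicePath || '')).replace(/\/$/, '');
  const openapi = {
    openapi: '3.0.0',
    info: { title: discovery.title || discovery.name, version: discovery.version },
    servers: [{ url: baseUrl }],
    paths: {},
  };

  // Discovery nests resources recursively, so walk them all.
  function walk(resources) {
    for (const resource of Object.values(resources || {})) {
      for (const method of Object.values(resource.methods || {})) {
        // "{+name}" (reserved expansion) has no OpenAPI equivalent; use "{name}".
        const path = '/' + method.path.replace(/\{\+/g, '{');
        const verb = method.httpMethod.toLowerCase();
        openapi.paths[path] = openapi.paths[path] || {};
        openapi.paths[path][verb] = {
          operationId: method.id,
          description: method.description || '',
          parameters: Object.entries(method.parameters || {}).map(([name, p]) => ({
            name,
            in: p.location, // "path" or "query" in both formats
            required: Boolean(p.required),
            schema: { type: p.type },
          })),
          responses: { '200': { description: 'Successful response' } },
        };
      }
      walk(resource.resources);
    }
  }

  walk(discovery.resources);
  return openapi;
}
```

The operation IDs and descriptions matter more than they might seem: they are what the model later reads to decide which endpoint to call.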



Once I had the exported YAML file, I was ready for the next step. This file is the "secret sauce" of the whole setup. It serves as a bridge between the complex Policy API and our agent's decision-making process. In the world of AI, agents are essentially a Large Language Model (LLM) paired with a set of tools. While Gemini 3 Flash provides the "brain," the tools provide the "hands" to actually interact with your Google Workspace data.

Usually, you have to define these tools by writing individual Python functions. You’d write the code, add detailed docstrings, and then the agent would use those descriptions to figure out which function to call. It works, but it’s a lot of manual work if you’re dealing with a complex API.

The beauty of using a YAML (OpenAPI schema) definition is that you can skip the manual coding. Instead of writing a separate function for every single API endpoint, you simply provide the YAML file to the Agent Development Kit (ADK), as a separate file in my case.
From the specification, the agent learns which endpoints are available and which parameters each request requires. It's a much more efficient way to build, especially when you want your agent to have full access to a comprehensive API like the Policy API.

Since we’re dealing with sensitive organizational data, authentication is a critical piece of the puzzle. Google’s ADK provides built-in OAuth objects that handle the heavy lifting of user authorization.

I went to the Google Cloud Console (http://console.cloud.google.com/auth/clients/) to generate an OAuth Client ID and OAuth Client Secret in Google Auth Platform.



The trick is to never hardcode these directly into your scripts. I store them in a .env file—it’s a simple way to keep your credentials secure and your code clean. 

Note: The Gemini 3 Flash Preview model is available in the global region.

.env ---


The agent.py file is where the magic happens. It contains the agent's core definition (Agent()), including specific instructions provided through a prompt. I've clearly defined its role: what it should accomplish and which tools it needs to call upon to get the job done. I followed best practices for Gemini 3 prompting (e.g., using <meta>tags</meta>).


Next, I configured the OAuth flow. This involves specifying the authorization URL and the necessary scopes. For this auditing agent, I used the scope https://www.googleapis.com/auth/cloud-identity.policies.readonly. We only need to read the data to analyze it; there's no need for write access, which follows the principle of least privilege.

The last step was setting up the redirect URI. For local testing, I pointed it to "http://localhost:8000/oauth-callback". Just a small heads-up: you must remember to add this local address to your allow-list in the Google Cloud Console. It's a common stumbling block, but once that's in place, the authentication handshake works perfectly.
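
To make the moving parts concrete, here is a small sketch of the authorization URL that such a flow sends the browser to. ADK builds this for you, so this is not ADK code; it only illustrates how the client ID, scope, and redirect URI fit together, assuming Google's standard OAuth 2.0 authorization endpoint. `buildAuthUrl` and the `YOUR_CLIENT_ID` value in the usage line are placeholders.

```javascript
// Google's OAuth 2.0 authorization endpoint.
const AUTH_ENDPOINT = 'https://accounts.google.com/o/oauth2/v2/auth';
// Read-only Policy API scope named in this post.
const POLICY_SCOPE = 'https://www.googleapis.com/auth/cloud-identity.policies.readonly';

/** Builds the URL the user is redirected to in order to grant access. */
function buildAuthUrl(clientId, redirectUri) {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri, // must match an allow-listed redirect URI exactly
    response_type: 'code',     // authorization code flow
    scope: POLICY_SCOPE,
  });
  return `${AUTH_ENDPOINT}?${params.toString()}`;
}

// Usage with a placeholder client ID:
const exampleAuthUrl = buildAuthUrl('YOUR_CLIENT_ID', 'http://localhost:8000/oauth-callback');
```

If the redirect URI in this URL differs from the one registered in the Cloud Console, even by a trailing slash, Google rejects the request with a redirect mismatch error.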



agent.py ---
Run Agent

With the agent defined and OAuth ready, it was time to take it for a run. 

The Agent Development Kit makes this process incredibly easy with a single command: adk web. Running this in your shell launches a local development interface—a sandbox where you can talk to your agent and see how it thinks.


Once the ADK Dev UI was up, I started with a direct question: "Audit my Google Workspace." Because the agent needs to access live data, it immediately triggered the OAuth flow. I was redirected to a standard Google login screen.

The beauty of this approach is that I didn't have to tell the agent how to call the API. It used the YAML definition to look up the correct endpoints in the Policy API, fetched the current configuration, and compared it against security best practices. It's like giving Gemini 3 a map and a set of keys, and letting it do the exploration for you.

The final result was exactly what I was looking for. Instead of digging through the Admin Console or parsing through raw JSON, I received a concise summary in natural language. The agent pointed out exactly which settings weren't aligned with my security goals.





Now you can audit your Google Workspace with just one specialized agent.


Note: Google Cloud credits are provided for this project during #AISprint 



Monday, June 30, 2025

Vibe Scraping with Google Apps Script and Gemini's URL Context

Nine years ago, I wrote an article describing how to scrape data from the internet in about five minutes. It featured a Google Apps Script library that allowed you to specify what to download from a webpage by identifying the text surrounding your target information. This became one of my most-read articles, and the library itself saved me a significant amount of time.

With the advent of large language models like Gemini, this entire paradigm is shifting. Just as "vibe coding" describes a more intuitive approach to programming, I'd say we're now entering an era of "vibe-scraping." 

You simply define what information you want and the URL, and the Gemini API handles the retrieval. The new features available in the Gemini API through Google AI Studio take this concept even further.



Let's explore this with a practical example I recently wanted to solve.

I maintain a list of movies I'm interested in watching in a Google Sheet. I want this sheet to include details like current ratings, genre, movie length, and other information typically found on ČSFD (a popular movie database for Czech users, similar to IMDb).

It occurred to me: what if I could simply tell a model what information to fetch, and it would automatically populate the data, structured, into the respective cells in my spreadsheet?


1. The first function is dedicated to calling the Gemini API, that is, the API endpoint accessible through Google AI Studio.


2. Extracting Structured Data from the URL.
The second function then calls the Gemini API using the URL Context tool: https://ai.google.dev/gemini-api/docs/url-context
This tool instructs Gemini to ground its responses in the actual content of the provided URL, which significantly decreases the likelihood of hallucinations.

I found that for simple text work, the Gemini 2.5 Flash model is sufficient.
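
As a rough sketch of what such a call looks like, the function below builds the request for `generateContent` with the URL Context tool enabled. It assumes the `v1beta` REST endpoint, the `gemini-2.5-flash` model ID, and the `x-goog-api-key` header; the function name and the way the instruction and URL are combined into one prompt are my own choices for this example. The returned options use the field names that `UrlFetchApp.fetch` accepts.

```javascript
// Assumed REST endpoint for Gemini 2.5 Flash in the Gemini API.
const GEMINI_ENDPOINT =
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent';

/**
 * Builds the URL and options for a generateContent call with URL Context.
 * The page URL travels inside the prompt text; the tool entry enables retrieval.
 */
function buildUrlContextRequest(apiKey, instruction, pageUrl) {
  const body = {
    contents: [{ parts: [{ text: `${instruction}\nSource: ${pageUrl}` }] }],
    tools: [{ url_context: {} }],
  };
  return {
    url: GEMINI_ENDPOINT,
    options: {
      method: 'post',
      contentType: 'application/json',
      headers: { 'x-goog-api-key': apiKey },
      payload: JSON.stringify(body),
      muteHttpExceptions: true, // inspect error responses instead of throwing
    },
  };
}
```

In Apps Script, the result would be passed on as `UrlFetchApp.fetch(request.url, request.options)`, with the API key ideally read from script properties rather than written into the code.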


4. To make this solution as universal as possible, I decided to define what information to extract using prompts in the first row of the Google Sheet.


For example, I'd have column headers like "Name," "Rating," "Genre," "Runtime," etc. This means a user can easily customize the data they want to pull by simply changing these header texts, without needing to modify any code. The script then reads these headers and instructs Gemini to find and place the corresponding information into the cells below for each movie.

In my case, I listed several pieces of information in the header row. The script then processes each movie title, finds its page, extracts the specified details using Gemini, and neatly places them into the correct cells in the Google Sheet. This approach elegantly combines the power of Gemini with the flexibility of Google Sheets for efficient, targeted web data extraction.
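
The header-driven idea can be sketched as two small helpers: one turns the header row into an instruction, the other maps the model's reply back to cells in the same column order. Both function names and the "reply with one JSON object" convention are assumptions made for this example, not necessarily how my script phrases it.

```javascript
/** Turns the header row into an instruction asking for one JSON object. */
function buildExtractionPrompt(headers) {
  return (
    'From the page at the given URL, extract the following fields: ' +
    headers.join(', ') +
    '. Reply with a single JSON object that uses exactly these field names as keys.'
  );
}

/** Maps the model's JSON reply to a row of cell values in header order. */
function replyToRow(replyText, headers) {
  // Models often wrap JSON in a Markdown code fence (three backticks,
  // written here as \u0060{3}); strip it before parsing.
  const json = replyText.replace(/^\u0060{3}(?:json)?\s*|\s*\u0060{3}$/g, '').trim();
  const data = JSON.parse(json);
  // Missing fields become empty cells so columns stay aligned.
  return headers.map((h) => (data[h] === undefined ? '' : String(data[h])));
}
```

Because the row is built by iterating over the headers, renaming or reordering columns in the sheet changes the output without touching the code.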






You simply enter the corresponding URL and the system populates your sheet. The script processes each movie entry, uses Gemini to extract the specified details, and automatically organizes them into your spreadsheet, streamlining what used to be a time-consuming manual or complex coding task.

Friday, February 28, 2025

Create AI agents in Google Apps Script with Vertex AI and Gemini




Imagine writing in plain English what you want to do in Google Workspace (e.g., workflows) and having it happen just like magic. You enter a text prompt, and Gemini generates the code for you and runs it immediately. A dream? No, reality, thanks to my conceptual and practical idea of how to implement AI agents in Google Apps Script, leveraging the V8 runtime.



Google Apps Script


Google Apps Script lets you connect and automate Google Workspace services (like Gmail, Docs, and Drive) by writing JavaScript code in your browser, without needing a separate server. Until now, you had to know programming to create that code.

Gemini


Large Language Models (LLMs) like Gemini are revolutionizing how we interact with technology. Gemini can convert natural language instructions ("prompts") into executable code. Imagine simply describing what you want to automate in plain English, and Gemini generates the code for you.

Vertex AI

Google Cloud's Vertex AI platform offers powerful tools for working with AI and machine learning models. The Vertex AI Reasoning Engine is a particularly interesting part of it. Its code interpreter lets you run generated code as you would in a local environment, but development requires a Python environment, and integrating it into Google Workspace via APIs adds complexity. While testing, it occurred to me: could it be simpler?

Introducing AI Agents for Google Apps Script

In this project, I explored a concept for AI agents. You describe your need in natural language; this is passed to Gemini (Gemini 2.0 Flash Thinking), which generates the necessary code. The generated code is then fed back into the Google Apps Script environment, where it can execute as part of Apps Script.

Generated code can do things you did not intend, so you have to double-check it before executing it. To mitigate this risk, a "dry run" function can be included. It sends the generated code to a smaller model, Gemini 2.0 Flash, which compares what the code does with the original task. The test results are presented in plain text for verification.




Explanation

1. Open Google Apps Script: The easiest way to start is to simply visit https://script.new in your browser. This will instantly create a new Apps Script project.



2. Configure the Apps Script project's manifest file. The manifest defines the project's settings, including the permissions ("scopes") that your application needs to access Google services. Select and authorize only the scopes your agent needs. If you get a permission error when running, you probably need to add more scopes.



3. Now I have prepared a function that calls the Gemini API within Vertex AI.




Copy the code into a new .gs file within the Google Apps Script project.

This code describes running an agent. First, it must be configured using the GCP Project and region. Then, you describe in natural language what needs to be done. After running the .act() method, Gemini 2.0 Flash Thinking within Vertex AI is called to generate code.

To ensure that the agent is doing the right thing, you can test the execution via a dry run, where the code is sent to an internal Tester agent which, via Gemini 2.0 Flash, comments on the code and summarizes it in a log.

If everything is in order, you can then run .run(). I remind you again that you have given the script rights to everything. So if you write something wrong, for example, to delete data, it will actually be deleted. I am not responsible for the results of the script, and you should always review it before running it.

The generated code is stored in the Cache, so after running a dry-run and then a run, the same version will be executed within the Cache limit (currently set to 5 minutes).
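
The caching step can be sketched like this. It is a simplified illustration rather than the agent's actual code: `getOrGenerateCode` and `generateCode` are hypothetical names, and `cache` is any object with `get(key)` and `put(key, value, ttlSeconds)`, which is the shape of the object returned by `CacheService.getScriptCache()` in Apps Script.

```javascript
const CACHE_TTL_SECONDS = 300; // 5 minutes, matching the limit mentioned above

/**
 * Returns cached code for a task, generating it only on a cache miss.
 * generateCode is a hypothetical function that asks the model for code.
 */
function getOrGenerateCode(cache, task, generateCode) {
  const key = 'agent-code:' + task;
  const cached = cache.get(key);
  // Apps Script caches return null for missing or expired entries.
  if (cached !== null && cached !== undefined) return cached;
  const code = generateCode(task);
  cache.put(key, code, CACHE_TTL_SECONDS);
  return code;
}
```

This is why a dry run followed by a run executes the same code: the second call finds the cached version instead of asking the model again. A real implementation would hash long task descriptions, since Apps Script cache keys are limited to 250 characters.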

  
  ```javascript
  // Define the label name
  const labelName = 'DEMO';

  // Check if the label exists, create if not
  let demoLabel = GmailApp.getUserLabels().find(label => label.getName() === labelName); // Find the label by name
  if (!demoLabel) { // If the label doesn't exist
    demoLabel = GmailApp.createLabel(labelName); // Create the label
    Logger.log(`Label "${labelName}" created.`); // Log that the label was created
  } else { // If the label already exists
    Logger.log(`Label "${labelName}" already exists.`); // Log that the label already exists
  }

  // Load last 10 emails with subject 'Security alert'
  const threads = GmailApp.search('subject:"Security alert"', 0, 10); // Search for threads with the specified subject, starting from the first thread (0) and retrieving a maximum of 10 threads

  // Set the 'DEMO' label to found emails
  if (threads.length > 0) { // If any threads were found
    threads.forEach(thread => { // Iterate over each thread
      thread.addLabel(demoLabel); // Add the 'DEMO' label to the current thread
    });
    Logger.log(`Label "${labelName}" applied to ${threads.length} emails.`); // Log the number of emails the label was applied to
  } else { // If no threads were found
    Logger.log('No emails with subject "Security alert" found in the last 10 threads.'); // Log that no emails were found
  }
```

**Summary:**

The code functions exactly as described in the task:

1.  **Creates a new label 'DEMO' if it doesn't exist:** The code first checks if a label named 'DEMO' already exists. If not, it creates the label.
2.  **Loads the last 10 emails with the subject 'Security alert':** The code then searches for the last 10 emails that have the subject 'Security alert'.
3.  **Sets the 'DEMO' label to the found emails:** Finally, the code iterates through the found emails (represented as threads) and applies the 'DEMO' label to each of them.

The code also includes logging statements to provide information about the actions being performed, which is good practice.
  

Acknowledgments

This project was developed during the Vertex sprints organized by Google's Developer Expert Program. Thanks to Google for providing the GCP credits that made this happen. #VertexAISprint