Foundry

Enterprise platform providing simulation, evaluation, data, and reinforcement learning infrastructure for building and improving AI web agents at scale.

Overview

Foundry is backed by Y Combinator (F24) and was founded by former Scale AI engineers Manil Lakabi and Pranav Raja. It enables organizations to develop browser-based AI agents that navigate real enterprise software platforms, handle complex multi-step workflows, and operate reliably, free of the web drift, IP bans, and rate limits that usually complicate work against live sites.

Key Capabilities

  • Simulation: Pixel-perfect, reproducible browser environments that eliminate drift, noise, and rate limits, allowing agents to be tested consistently across thousands of scenarios without depending on live external services. This makes it possible to run agents repeatedly in identical conditions, which is essential for reliable evaluation and regression testing.
  • Evaluation: Every agent action is tracked, classified, and tagged. Click failures, layout shifts, and misfires are surfaced immediately, giving teams visibility into exactly where and why agents fail. Evaluation covers multiple dimensions including factuality, tool use, tone, creativity, safety, and relevance.
  • Data: Expert annotators generate custom, long-horizon datasets for supervised fine-tuning of browser agents on real enterprise platforms such as Gmail, Salesforce, and LinkedIn. These datasets capture realistic multi-step workflows that are difficult to source through automated means.
  • Reinforcement Learning: Safely sample and evaluate unlimited trajectories, enabling teams to train browser agents at scale without anti-bot constraints or production risks. This RL infrastructure allows agents to learn from exploration in simulated environments before deployment.
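
The Evaluation capability above describes tracking, classifying, and tagging every agent action. The following is a minimal sketch of what such tagging could look like in plain Python. It does not use Foundry's SDK: the `AgentEvent` schema, its fields, and the tag names are all hypothetical, chosen only to mirror the failure types named in the list (click failures, layout shifts, misfires).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AgentEvent:
    """One recorded agent action (hypothetical schema, not Foundry's)."""
    step: int
    kind: str         # e.g. "click", "type", "navigate"
    ok: bool          # whether the action had its intended effect
    detail: str = ""  # free-text note about what went wrong


def tag_event(event: AgentEvent) -> str:
    """Assign a coarse tag to an event; the tag names are illustrative."""
    if event.ok:
        return "success"
    if event.kind == "click":
        return "click_failure"
    if "layout" in event.detail:
        return "layout_shift"
    return "misfire"


def summarize(events: list[AgentEvent]) -> Counter:
    """Count how many events fall under each tag."""
    return Counter(tag_event(e) for e in events)


events = [
    AgentEvent(0, "navigate", True),
    AgentEvent(1, "click", False, "element not found"),
    AgentEvent(2, "type", False, "layout moved input"),
    AgentEvent(3, "type", False, "wrong field"),
]
summary = summarize(events)
```

A per-tag count like this is one simple way to see where an agent fails most often; a production system would presumably record far richer context per event.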

How It Works

Foundry provides a Python SDK that integrates into existing agent workflows. Teams define tasks, spin up reproducible browser environments via Foundry's Agent Web Engine (AWE), run their agents against those environments, and compare results against ground truth. The platform records every event, state mutation, and agent action for detailed analysis. According to Foundry, setup takes approximately five minutes, and evaluations can run up to ten times faster than manual testing.
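
To illustrate the "compare results against ground truth" step, here is a minimal sketch of one possible scoring rule: the fraction of expected fields that the final state matches. The function name, the dictionary-shaped state, and the field names are assumptions made for this example; the source does not describe how Foundry actually represents or scores state.

```python
def score_against_ground_truth(final_state: dict, ground_truth: dict) -> float:
    """Return the fraction of ground-truth fields that final_state matches.

    Hypothetical scoring rule: extra fields in final_state are ignored, and
    an empty ground truth counts as fully satisfied.
    """
    if not ground_truth:
        return 1.0
    matched = sum(
        1
        for key, expected in ground_truth.items()
        if key in final_state and final_state[key] == expected
    )
    return matched / len(ground_truth)


# Illustrative data only: what the task expected vs. what the agent left behind.
expected_state = {"lead_status": "qualified", "owner": "alice", "notes_added": True}
observed_state = {"lead_status": "qualified", "owner": "bob", "notes_added": True, "tab": 2}

score = score_against_ground_truth(observed_state, expected_state)
```

Partial credit of this kind is only one design choice; a stricter evaluator could require every field to match before marking a task as passed.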

Use Cases

  • Enterprise SaaS Automation: Build and test agents that navigate complex enterprise software such as Salesforce and other CRM platforms, as well as Google Sheets, for tasks such as lead enrichment, data entry, and report generation. Agents can handle login credentials, payment details, and multi-step processes across different SaaS applications.
  • Agent Benchmarking and Research: The SDRBench benchmark provides 50 deterministic tasks simulating realistic sales development workflows, enabling reproducible evaluation and leaderboard-ready baselines for browser agent research on cross-application state persistence and error recovery.
  • Pre-deployment Quality Assurance: Run agents through thousands of simulated scenarios before production deployment to catch regressions in factuality, tool use, tone, and task completion accuracy that would otherwise reach end users.
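
The pre-deployment QA use case above amounts to comparing a candidate agent's results with a known-good baseline across the same scenarios. A minimal sketch of such a regression check follows, assuming each run is summarized as a mapping from scenario name to pass/fail. The scenario names and both helper functions are hypothetical and are not part of Foundry's SDK.

```python
def pass_rate(outcomes: dict[str, bool]) -> float:
    """Fraction of scenarios that passed (0.0 for an empty run)."""
    return sum(outcomes.values()) / len(outcomes) if outcomes else 0.0


def find_regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Scenarios that passed in the baseline but fail or are missing in the candidate."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )


# Illustrative results for two agent versions on the same simulated scenarios.
baseline = {"login": True, "export_report": True, "enrich_lead": False}
candidate = {"login": True, "export_report": False, "enrich_lead": True}

regressions = find_regressions(baseline, candidate)
```

Here both versions have the same overall pass rate, yet `export_report` has regressed, which is why a per-scenario comparison is more informative than an aggregate score alone.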

Integration

Teams integrate Foundry through its Python SDK, which provides environment management, task definition, and CDP (Chrome DevTools Protocol) URL access for running agents. The platform works with existing agent frameworks and requires only minimal changes to an agent's codebase. The evaluation loop follows a straightforward pattern: initialize an environment for each task, run the agent, capture the final state and events, and score the results against ground truth.
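
The four-step evaluation loop described above can be sketched as follows. This is not Foundry's SDK: `Task`, `StubEnvironment`, `toy_agent`, and the placeholder CDP URL are all hypothetical stand-ins so that the example runs on its own. A real agent would attach a browser automation client to the environment's CDP URL rather than writing state directly.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task definition with its expected end state (hypothetical shape)."""
    task_id: str
    goal: str
    ground_truth: dict


@dataclass
class StubEnvironment:
    """Stand-in for a reproducible browser environment (not Foundry's API)."""
    task: Task
    cdp_url: str = "ws://localhost:9222/devtools/browser/stub"  # placeholder URL
    state: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def apply(self, action: str, key: str, value) -> None:
        """Record the action as an event and mutate the environment state."""
        self.events.append({"action": action, "key": key, "value": value})
        self.state[key] = value


def toy_agent(env: StubEnvironment) -> None:
    """Toy agent: always fills in the same company name, whatever the goal."""
    env.apply("type", "company", "Acme")


def evaluate(tasks, agent) -> dict:
    results = {}
    for task in tasks:
        env = StubEnvironment(task)                   # 1. initialize an environment per task
        agent(env)                                    # 2. run the agent
        final_state, events = env.state, env.events   # 3. capture final state and events
        passed = all(                                 # 4. score against ground truth
            final_state.get(key) == value
            for key, value in task.ground_truth.items()
        )
        results[task.task_id] = {"passed": passed, "num_events": len(events)}
    return results


tasks = [
    Task("t1", "Set company to Acme", {"company": "Acme"}),
    Task("t2", "Set company to Globex", {"company": "Globex"}),
]
results = evaluate(tasks, toy_agent)
```

Creating a fresh environment for every task keeps runs independent, so one task's leftover state cannot affect the next task's score.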

Getting Started

Foundry is currently in private beta. Teams can apply for access through the Foundry website to unlock all platform features. The platform also offers a benchmark leaderboard for comparing agent performance and will provide sample evaluation datasets for local testing. Organizations interested in enterprise-grade agent infrastructure can contact the Foundry team directly through the website.

Tool Overview

Pricing

Not specified

Similar AI Tools

Stability AI Developer Platform

Stability AI is a developer platform for building image, video, audio, and 3D applications with APIs, sandbox tools, and credit-based pricing.

ChatGPT Code Interpreter

OpenAI sandboxed Python environment within ChatGPT that executes code, analyzes data, creates visualizations, and processes files through natural language conversations.

TeamPal

No-code AI workforce platform for building, customizing, and deploying AI agents across marketing, sales, HR, operations, finance, R&D, design, and IT departments.

Automix

AI-powered career development platform offering resume review, mock interviews, recruiter tools, and AI chat to automate and enhance the job search workflow.

Remalt

Remalt is an AI content workspace where founders and creators orchestrate 9+ LLMs on an infinite visual brainboard to generate platform-ready content for LinkedIn, YouTube and Instagram in under an hour.
