On Monday morning, Greenhouse — the hiring platform sitting under most mid-market US recruiting teams — announced it had agreed to acquire Ezra AI Labs. Ezra is a voice-AI interviewing startup. Its product calls candidates on the phone, runs a structured interview against a role-specific rubric, and writes a transcript and a score back into the ATS for the recruiter to review.
Two days later, on May 7, Greenhouse announced Greenhouse MCP — a separate product, rolling out in June, that lets recruiters run AI workflows (pipeline summaries, candidate roundups, audit narratives) against their Greenhouse data through conversational prompts.
Here are the numbers from Greenhouse’s own May press cycle that explain why both shipped the same week: applications per recruiter on Greenhouse have spiked 412% since 2023, fewer than 7% of applicants get an interview, and 63% of job seekers have already faced an AI interview (most haven’t had a good one yet).
Translation for the in-house recruiter or HR coordinator at a 50-500-person company: voice-AI phone screens are about to become the default first round at thousands of mid-market employers in Q3. If your team is going to flip one on — Ezra after the acquisition closes, or HireVue, Paradox Olivia, BrightHire, or any of the other voice-AI screeners that exist today — there is one piece of work that has to happen before the first AI call goes out.
You need to audit four things, and you need to write a 1-page policy. The audit takes about 30 minutes if you do it right; the policy brings the whole job to 45. It can be done by Tuesday.
What just changed (and why “we’ll do this later” is wrong)
Three things shifted in the last 10 days that turn this from a “next year” project into a “this quarter” decision.
Greenhouse-Ezra closes this quarter. When it does, voice-AI screening becomes a default option inside the most-used mid-market ATS in North America. You won’t have to evaluate vendors or run a procurement cycle — it will be a toggle in your existing system. That dramatically lowers the bar for “let’s just turn it on and see.”
Greenhouse MCP ships in June. That means a recruiter at a Greenhouse customer will be able to ask “show me my Mother’s Day-week candidate roundup with red-flag patterns” in a chat interface, and the system will pull from candidate records, voice-AI transcripts, and pipeline data simultaneously. The voice-AI screen output isn’t sitting in a silo anymore — it’s plugged directly into the AI workflows your recruiters are already running.
Connecticut SB 5 passed the state legislature on May 1. It’s heading to Governor Lamont’s desk and is being framed as the first US law requiring independent AI model verification plus employment-decision disclosure. Among many provisions, it regulates “automated employment-related decision technology” — broadly defined to cover any tech that processes personal data and produces a score, rank, recommendation, or classification that materially influences a hiring decision. Voice-AI phone screens are squarely in scope. Iowa SF 2417 already passed (focused on chatbots disclosing they’re not humans, especially to minors) and Colorado has a similar bill in committee.
That regulatory wave is not 18 months away. It is now. CT SB 5’s automated-employment-decision provisions begin October 1, 2026. Effective dates for chatbot-disclosure laws are landing through 2027. The compliance team will be asking the recruiting team for a written policy by September.
The 4-question 30-minute audit
This audit works for any voice-AI screener — Ezra, Paradox Olivia, HireVue, BrightHire, the new entrants that haven’t shipped yet. The vendor matters less than the four questions. Run all four before the first candidate gets an AI call.
1. Candidate-fairness: does the rubric map to a protected class — directly or indirectly?
Ezra’s design (and most peer products) interviews the recruiter and the hiring manager first to build a role-specific evaluation rubric. That rubric becomes the AI’s scoring instrument. So whatever bias is in the rubric is in every score the AI produces, at scale, with the patina of objectivity.
The five-question check, plain English:
- Does the rubric reward voice qualities (volume, accent, clarity, pace) that aren’t job-related? Voice-quality scores function as a proxy for hearing impairments, speech impairments, and non-native English, which track disability status (protected under the ADA) and national origin (protected under Title VII).
- Does it reward “cultural fit” language that maps to a school, neighborhood, or zip code? Those proxy for race, socioeconomic class, and national origin.
- Does it score on prior-employer prestige? That proxies for class background.
- Does it score “confidence” or “energy”? Both are heavily correlated with gender presentation and are common bias vectors.
- Does it score “fluency” beyond what the job actually requires? A retail-floor role doesn’t need a TOEFL-grade English screen.
If the rubric contains any of these, fix it before the first call. The fix is usually subtraction, not addition.
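If you want a mechanical first pass before the human review, a keyword scan over the rubric text catches the obvious cases. A minimal sketch in Python; the criteria format, the term list, and the examples are illustrative assumptions, not a legal standard or any vendor's feature:

```python
# Hypothetical rubric linter: flag criteria whose wording matches known
# proxy patterns. A keyword scan is a first pass, not a legal review.
PROXY_TERMS = {
    "voice quality (disability / national origin)": ("accent", "clarity", "well-spoken"),
    "culture fit (race / class / origin)": ("culture fit", "cultural fit", "polish"),
    "prestige (class background)": ("top-tier", "brand-name", "pedigree"),
    "presentation (gender)": ("confidence", "energy"),
    "fluency beyond the job": ("fluent", "native-speaker"),
}

def lint_rubric(criteria):
    """Return (criterion, risk category) pairs that need human review."""
    flags = []
    for criterion in criteria:
        text = criterion.lower()
        for category, terms in PROXY_TERMS.items():
            if any(term in text for term in terms):
                flags.append((criterion, category))
    return flags

rubric = [
    "Explains a past project with concrete metrics",  # job-related: passes
    "Speaks with confidence and high energy",         # gender-presentation proxy
    "Fluent, native-speaker English",                 # national-origin proxy
]
for criterion, category in lint_rubric(rubric):
    print(f"REVIEW: {criterion!r} -> {category}")
```

A clean scan doesn't mean the rubric is clean; it means the human reviewer can spend their time on the subtler proxies instead of the obvious ones.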
2. Structured-rubric: is it archived, reviewed, and benchmarked?
The rubric Ezra (or any voice-AI tool) builds for a role gets reused for every candidate for that role. So when the job description shifts, someone has to remember to re-version the rubric. When a recruiter leaves, the rubric has to be reviewable by the next person.
Three boxes to tick:
- The rubric is stored inside the ATS (Greenhouse, Lever, Workday, Ashby, etc.) and tied to the role’s req ID — not in a vendor-specific dashboard you might lose access to.
- A human reviews it at least monthly. Calendar invite, recurring, owner assigned. The review checks for drift between the rubric and the current job description.
- It is benchmarked quarterly against actual hire outcomes — does the score predict 90-day performance, or is it predicting something else?
The third one is the catch. Most teams skip it because it’s the hardest. It’s also the one regulators will ask about first under disparate-impact analysis.
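The disparate-impact math regulators reach for first is the EEOC's four-fifths rule: the selection rate for any demographic group should be at least 80% of the rate for the highest-selected group. A minimal sketch, assuming you can export pass/screened counts by self-reported group from your ATS; the counts below are made up for illustration:

```python
# Four-fifths (80%) rule check on AI-screen pass rates.
def adverse_impact(counts):
    """counts maps group -> (passed, screened); returns impact ratios
    relative to the highest-passing group."""
    rates = {g: passed / screened for g, (passed, screened) in counts.items()}
    top = max(rates.values())
    return {g: rate / top for g, rate in rates.items()}

screens = {"group_a": (120, 400), "group_b": (45, 220)}  # illustrative counts
for group, ratio in adverse_impact(screens).items():
    status = "OK" if ratio >= 0.8 else "FLAG: below four-fifths threshold"
    print(f"{group}: impact ratio {ratio:.2f} -> {status}")
```

The four-fifths rule is a screening heuristic, not a safe harbor; a flagged ratio is where the conversation with your employment lawyer starts.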
3. Candidate-experience: disclosure, do-overs, accommodation
Three things every voice-AI screen has to support before it goes live:
Pre-call disclosure that this is an AI conversation. Plain language, before the call starts: “You’ll be speaking with an AI interviewer named [name]. It’s not a human.” This is the thing Iowa SF 2417, CT SB 5, and Colorado’s pending bill all converge on. The disclosure must be explicit, in writing, before the candidate joins the call. A line buried in a privacy policy doesn’t count.
A do-over policy. Phones cut out. The candidate’s dog barks. Their kid runs in the room. The voice-AI sometimes mishears. The candidate has to be able to request another round under defined conditions, and there has to be a documented escalation path to a human if the AI screen fails for non-candidate reasons.
An accommodation pathway. This is a legal requirement under the ADA, not a nice-to-have. Candidates with hearing differences, speech differences, cognitive differences, or anxiety conditions that make voice-only formats hostile must be able to request a different format — text-based asynchronous, video, or live human screen. The pathway must be visible at the same point as the AI disclosure, not hidden in a separate FAQ. The default escalation has to be to a human who has authority to override the AI and proceed with the candidate.
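All three requirements share a shape: they gate the call, they don't follow it. A sketch of that gate; the function and field names are assumptions for illustration, not Ezra's or any vendor's actual API:

```python
# Hypothetical pre-call gate: disclosure, consent, and accommodation
# status are checked before the AI dials, never retrofitted after.
DISCLOSURE = (
    "You'll be speaking with an AI interviewer named {name}. It's not a "
    "human. You can request a human interviewer or a different format "
    "(text, video, live call) at any time."
)

def may_start_ai_call(disclosure_shown: bool, consent_captured: bool,
                      accommodation_pending: bool) -> bool:
    """No disclosure, no consent, or an unresolved accommodation request
    means the AI call does not start; route to a human instead."""
    return disclosure_shown and consent_captured and not accommodation_pending
```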
4. Audit-trail: recordings, scores, flags, retention
The trace you need on every AI call:
- Audio recording or transcript stored for at least the longer of 1 year or your jurisdiction’s record-retention requirement.
- The score and the rubric version that produced it.
- The disclosure timestamp and consent capture.
- A human-escalation pathway with a documented SLA. Our recommendation: every flagged call gets a human review within 24 hours.
- A consent and retention disclosure to the candidate, in writing, before the call starts.
The reason this matters: when a candidate disputes an outcome (and they will, because 46% of candidates already report decreased trust in hiring per Greenhouse’s own May data), you need the trace. When a regulator asks, you need the trace. When your CISO asks why audio recordings of candidate phone calls are sitting in a vendor cloud, you need the data-residency clause.
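The trace above maps onto a concrete per-call record. A sketch of what each field might look like; the names and types are assumptions, not any vendor's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ScreenAuditRecord:
    """One record per AI call, stored in the ATS next to the candidate."""
    req_id: str                   # ties the record to the role, not the vendor
    candidate_id: str
    transcript_uri: str           # audio/transcript, retained >= 1 year
    rubric_version: str           # exact rubric version that scored the call
    score: float
    disclosure_shown_at: datetime # timestamped pre-call AI disclosure
    consent_captured: bool
    retention_notice_shown: bool  # written consent + retention disclosure
    flagged_for_review: bool
    human_review_due: datetime | None  # 24-hour SLA clock for flagged calls
```

If your vendor can't populate every one of these fields, that gap is the negotiation point in the contract, not a detail to discover later.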
The 1-page voice-AI-screening policy template
Once the audit is clean, write the policy. One page. The point isn’t to satisfy a lawyer — it’s to be readable by the next recruiter who joins the team. Cover these eight items, two sentences each:
- Which roles use voice-AI screens. List them. If “all roles,” say all roles.
- Which vendor. Single named product, not “various tools.”
- The rubric review cadence. Monthly review owner, quarterly benchmarking owner.
- The disclosure script. Verbatim text the candidate sees before the call.
- The do-over policy. Conditions and limits.
- The accommodation pathway. Verbatim text + the human contact.
- The audit-trail. What’s stored, where, for how long, who has access.
- The escalation SLA. Time-to-human-review for flagged calls.
That’s the document your CISO, your employment lawyer, and your operations VP each want a copy of. Print it. Sign it. Pin it in the team Notion. When CT SB 5’s October 1 disclosure provisions kick in, this is what you’ll hand to the compliance team.
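If the policy lives in version control or a structured doc, completeness becomes a mechanical check. A small sketch; the key names simply mirror the eight items above:

```python
# The eight policy items as a machine-checkable list; an empty value
# means the policy is not ready to sign.
POLICY_ITEMS = (
    "roles_in_scope", "vendor", "rubric_review_cadence", "disclosure_script",
    "do_over_policy", "accommodation_pathway", "audit_trail", "escalation_sla",
)

def policy_gaps(policy):
    """Return the items that are missing or blank; an empty list means
    the policy covers all eight."""
    return [item for item in POLICY_ITEMS if not policy.get(item, "").strip()]
```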
What this means for you
If you’re a 1-3 person in-house recruiting team at a 50-500 employee company: the audit is your Q3 work, not “later.” Schedule a 90-minute working block this week: 45 minutes for the audit and the policy, the rest for reading your vendor’s documentation. Run the four questions on whatever voice-AI vendor your leadership is leaning toward (or the Ezra-after-Greenhouse-close path if you’re a Greenhouse customer). Write the policy. The first time a candidate’s lawyer asks for the rubric and the audit trail, you want this document already in the ATS, signed off, dated.
If you’re a hiring manager who owns reqs but doesn’t run recruiting full-time: your role is the rubric-builder interview. When the voice-AI tool’s setup process asks you “what are the must-have qualifications, what are the nice-to-haves, what are the disqualifiers” — your answers become the AI’s scoring rubric. Spend the 20 minutes that takes seriously. Don’t let it default to vague “communication skills” language. Specify the actual behavioral signals you’d be listening for in your own first-round phone screen.
If you’re a talent-ops lead at an SMB with mostly hourly hiring: voice-AI screens are most useful in your category — high-volume entry-level reqs with structured questions. They’re also where the bias risk is highest, because the candidate pool is more diverse on every demographic axis. The audit isn’t optional for you; it’s the difference between scaling cleanly and inviting a disparate-impact lawsuit.
If you’re an agency or contingent recruiter who places candidates into Greenhouse-using clients: your business model is changing in Q3. The first round is now likely an AI call your client’s system runs without you. Have a re-pricing conversation with your top 5 clients this week. The two viable repositioning paths: drop your placement fee to reflect the offloaded first-round work, or add a “voice-AI-screen QA” service tier where you review the AI’s scoring and flag patterns before the hiring manager sees the shortlist.
If you’re a small-business owner who hires occasionally without a dedicated recruiter: you’re not the audience for voice-AI screening tools yet — the per-call economics don’t work below ~30 hires/year. But you should still know about this for two reasons: (1) when you do hire, candidates may have been pre-screened by AI at other employers, and that experience colors how they engage with you; (2) AI interview prep, the candidate side of these tools, is already changing candidate behavior. People are practicing for AI interviews now. Your in-person interviews are getting compared against that bar.
What this audit can’t fix
The 30-minute audit gets you to “we can defend this if asked.” It does not get you to “this is the right hiring tool for our company.” Some honest limits:
It can’t tell you whether voice-AI screens improve your hires. That requires 90 days of post-hire performance data tied to AI scores. Run the experiment with one role at a time. Don’t roll out to every req on day one.
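When you do run that experiment, the core computation is a rank correlation between the AI's screen score and whatever 90-day performance signal you trust. A minimal sketch, hand-rolled to avoid dependencies (no tie handling, which is fine for a sketch); the data below is invented for illustration, and with only a handful of hires the number is directional, not proof:

```python
# Spearman rank correlation between AI screen score and 90-day rating.
def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = float(rank)
    return out

def spearman(xs, ys):
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n * n - 1))

ai_scores = [82, 74, 91, 60, 78, 88]          # screen scores, one role's hires
ratings_90d = [3.5, 3.0, 4.0, 3.2, 2.8, 4.5]  # manager ratings at 90 days
print(f"Spearman rho: {spearman(ai_scores, ratings_90d):+.2f}")
```

A rho near zero after a real quarter of data is the honest signal that the screen is processing applications, not predicting performance.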
It can’t fix a vendor’s underlying model bias. If the voice-AI tool’s underlying scoring model has been shown to underperform on accented speech (and several have been), no amount of rubric audit will save you. Read the vendor’s published bias audits. If they don’t have any, that’s a red flag bigger than your rubric.
It can’t replace a human escalation. The audit specifies that flagged calls go to humans. If your team’s hiring volume is growing and the voice-AI flags are growing too, the human escalation team has to grow proportionally. AI screening is not a “free” first round — it’s a different cost shape.
It can’t read state law for you. CT SB 5’s effective dates are staggered. Iowa SF 2417 applies to specific contexts. Colorado is in committee. New York City Local Law 144 has been live since 2023 and has its own rules. If your candidates span multiple states, the disclosure requirements are not uniform. Your employment lawyer has to map your candidate footprint to the disclosure requirements per state — this is not a thing the vendor’s compliance page can tell you.
The bottom line
Voice-AI phone screens are crossing the line from “experimental tool a few teams use” to “default first-round mechanic in the most common mid-market ATS.” Greenhouse-Ezra is the move that signals it. Your team will be asked, this quarter, whether to turn one on.
The cheapest insurance you can buy before that conversation is the 30-minute audit and the 1-page policy. Run the four questions. Write the eight-item document. Tie it to the ATS req. When the compliance team, the employment lawyer, the CISO, or the candidate’s attorney asks — you have the answer in writing.
Then, in 90 days, run the post-hire benchmarking pass. That’s the conversation about whether the AI is actually helping you hire better, or just helping you process more applications.
For HR teams who want a deeper playbook on running AI tools without creating new compliance debt, our AI for HR (legal-safe) course walks through the bias audit, the disclosure script library, and the post-hire benchmarking pattern.
Sources
- Greenhouse Has Entered into a Definitive Agreement to Acquire Ezra AI Labs — Greenhouse Newsroom, May 5, 2026
- Why we’re acquiring Ezra AI Labs — Greenhouse Blog
- Greenhouse Launches MCP, Giving Hiring Teams a Governed Way to Connect AI Tools to Greenhouse — PR Newswire, May 7, 2026
- 63% of Job Seekers Have Faced an AI Interview. Most Haven’t Had a Good One Yet — PR Newswire
- Greenhouse sets sights on AI interviewing as next TA game-changer — HR Brew
- Unpacking SB 5: Connecticut’s new AI law — DLA Piper
- Connecticut’s AI Responsibility and Transparency Act: Key Impacts on the Workplace — Shipman & Goodwin LLP
- Iowa Bill Restricting AI Chatbots Heads to Governor’s Desk — Government Technology
- Connecticut Passes First US Law Requiring Independent AI Model Verification, and AI Layoff Disclosure — TechJackSolutions