# LeetMentor — Voice-Enabled AI Interviewer (Chrome Extension)

Jorge Delgadillo · Repo: LeetMentor
LeetMentor overlays a conversational, voice‑enabled AI interviewer directly on LeetCode. It supports two voice modes—Traditional and Realtime—and cleanly separates the extension runtime (background, content, popup) from a small Node WebSocket proxy for low‑latency Realtime audio.
This document is written as MDX so we can embed rich UI bits later (e.g., callouts, tabs). It’s intentionally concise but technical, with links to source files and a couple of deep‑dive code snippets.
## TL;DR
- What: Chrome MV3 extension with a React + TypeScript UI over LeetCode, styled with Tailwind.
- Why: Practice algorithms via a guided mock interview with voice.
- How: Background service worker orchestrates config, sessions, OpenAI calls, and messaging; a content script injects the interview UI; a Node Realtime proxy bridges mic → OpenAI Realtime → audio stream back.
## Architecture (lean)

- Extension (MV3)
  - Popup: settings + start button
  - Content script: injects UI on LeetCode and extracts/caches problem data
  - Background (service worker): config/session store, message router, OpenAI calls, SPA navigation detection
  - Interview page (web‑accessible): React app that reads the cached problem plus session/config and runs the conversation
- Backend (optional, for Realtime): tiny Node `ws` proxy that relays mic/audio frames between the client and OpenAI Realtime and keeps the API key server‑side
- Data/State
  - `chrome.storage.sync`: API key, model, voice settings, history window
  - `chrome.storage.local`: current session, transcript, cached problem
  - `chrome.runtime.sendMessage`: popup/content/interview ↔ background
## Key Surfaces

- Background service worker (`src/background/background.ts`): Central message bus; reads/writes config (API key, model, voices) to `chrome.storage.sync`; manages sessions and OpenAI API calls; detects SPA navigation and notifies the content script.
- Content script (`src/content/standalone-react.tsx` or `src/content/content.ts`): Injects the panel, extracts LeetCode problem metadata, and caches the current problem to `chrome.storage.local` for the interview page.
- Interview page (`src/interview/interview.tsx` → `InterviewApp`): Loads the session and cached problem, fetches config, then drives the conversation via Traditional Voice (`VoiceService`) or Realtime Voice (`RealtimeVoiceService`).
- Realtime proxy (`backend/server.js`): Thin WS bridge: client ⇄ proxy ⇄ OpenAI Realtime. Keeps keys server-side, forwards audio frames, handles reconnection and basic per-origin control.
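The background's role as a central message bus can be sketched as a plain dispatcher. This is a minimal sketch, not the real router: apart from `START_INTERVIEW` and `NAVIGATION_DETECTED`, the message names, payload shapes, and handler bodies below are illustrative assumptions.

```typescript
// Minimal sketch of a background message router, assuming a
// chrome.runtime.onMessage-style contract. GET_CONFIG and all
// payload fields are hypothetical; see src/background/background.ts
// for the real implementation.
type Message =
  | { type: 'START_INTERVIEW'; problemId: string }
  | { type: 'GET_CONFIG' }
  | { type: 'NAVIGATION_DETECTED'; url: string };

type Handler = (msg: Message) => Record<string, unknown>;

const handlers: Record<Message['type'], Handler> = {
  // Create a session and hand back its id to the caller.
  START_INTERVIEW: () => ({ sessionId: `session-${Date.now()}` }),
  // Return resolved config (stubbed here with a default model).
  GET_CONFIG: () => ({ model: 'gpt-4o' }),
  // Acknowledge SPA navigation so the content script can re-extract.
  NAVIGATION_DETECTED: () => ({ acknowledged: true }),
};

function route(msg: Message): Record<string, unknown> {
  const handler = handlers[msg.type];
  if (!handler) throw new Error(`Unknown message type: ${msg.type}`);
  return handler(msg);
}
```

Centralizing dispatch like this keeps every surface (popup, content, interview) talking to one place, which is what makes the key-handling guarantees below enforceable.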
## Runtime Flow

1. Open a LeetCode problem. The content script extracts title/difficulty/tags and writes them to `chrome.storage.local`.
2. Click Start Interview (popup or injected UI). The popup/content script sends `START_INTERVIEW` to the background.
3. The background resolves config, ensures a session, and opens `interview.html` (a web-accessible resource) with a session id.
4. `InterviewApp` loads the session and problem, greets the user, and selects a voice mode based on settings.
5. Traditional mode uses `ChatGPTService` (Chat Completions) plus `VoiceService` (Web Speech / Whisper + TTS). Realtime mode uses `RealtimeVoiceService` to stream mic audio to the Node proxy, which speaks back with low latency.
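Step 3 (opening `interview.html` with a session id) can be sketched as a small URL builder. The `session` query-parameter name and the injected `getURL` stub are assumptions; in the extension, `chrome.runtime.getURL` resolves the web-accessible resource under the extension's origin.

```typescript
// Hypothetical sketch of how the background might build the interview
// page URL. The real code may pass the session id differently.
function buildInterviewUrl(
  getURL: (path: string) => string, // stand-in for chrome.runtime.getURL
  sessionId: string
): string {
  const url = new URL(getURL('interview.html'));
  url.searchParams.set('session', sessionId); // parameter name is an assumption
  return url.toString();
}
```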
Tech Stack
- UI: React 18, TypeScript, Tailwind (
src/**
,src/shared/styles.css
) - Bundling: Webpack 5,
ts-loader
, CSS pipeline, two configs (webpack.config.js
,webpack.standalone.config.js
) - Chrome: Manifest V3 service worker, content scripts, web-accessible resources (
public/manifest.json
) - OpenAI: Chat Completions, Whisper (STT), TTS, Realtime (
src/shared/constants.ts
) - Backend: Node +
ws
(backend/server.js
)
## Commands

```bash
# Build the extension (dist/)
npm run build

# Watch mode for rapid iteration
npm run dev

# Watch + local React test page (e.g., src/content/react-test.html)
npm run dev:react

# Realtime voice backend
cd backend && npm run dev
```

Load the unpacked extension from `dist/` at `chrome://extensions`.
## Configuration & Storage

- User config (API key, model, voice settings, history window) → `chrome.storage.sync`
- Session, transcript, problem cache → `chrome.storage.local`
- Messaging → `chrome.runtime.sendMessage` (popup/content/interview ↔ background)

Keep API keys out of the client where possible. Only Realtime requires a backend proxy (which keeps keys server‑side).
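Config resolution with defaults can be sketched as a pure merge over whatever `chrome.storage.sync` returns. The interface below is an assumption (only the fields listed above are confirmed by the doc; `voiceMode` and the default values are illustrative), and the history-window default of 8 mirrors the deep-dive snippet later in this document.

```typescript
// Sketch: merge stored user config with defaults. Shape and defaults
// are assumptions; the real config type lives in the extension source.
interface LeetMentorConfig {
  apiKey: string;
  model: string;
  historyWindow: number;
  voiceMode: 'traditional' | 'realtime';
}

const DEFAULTS: Omit<LeetMentorConfig, 'apiKey'> = {
  model: 'gpt-4o',
  historyWindow: 8,
  voiceMode: 'traditional',
};

function resolveConfig(stored: Partial<LeetMentorConfig>): LeetMentorConfig {
  // The API key has no sensible default: fail loudly, as the background does.
  if (!stored.apiKey) {
    throw new Error('API key not configured. Set it in the extension popup.');
  }
  return { ...DEFAULTS, ...stored, apiKey: stored.apiKey };
}
```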
## Voice Modes

### Traditional Voice

- TTS: Browser voice or OpenAI TTS via background
- STT: Web Speech API; fallback to Whisper via background
- LLM: `ChatGPTService` composes concise prompts + last‑N history and calls Chat Completions; tracks usage/cost

### Realtime Voice

- Transport: Mic audio → Node proxy → OpenAI Realtime → streaming audio back
- UX: Lower latency, barge‑in, conversational flow
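On the Realtime path, mic audio from the Web Audio API arrives as `Float32Array` samples in [-1, 1], while OpenAI Realtime expects 16-bit PCM frames. A conversion along these lines presumably happens in `RealtimeVoiceService` before frames are forwarded to the proxy; the exact framing LeetMentor uses may differ.

```typescript
// Sketch: convert Float32 Web Audio samples ([-1, 1]) to 16-bit PCM,
// the sample format the OpenAI Realtime API accepts for input audio.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to valid range
    // Scale asymmetrically so -1 maps to -32768 and +1 to 32767.
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```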
## Deep‑Dive Snippet #1 — Background Chat Handling

Here's a real excerpt from the background service showing how embedded interview messages are routed through OpenAI. It demonstrates:

- Config resolution (including storage fallback)
- Prompt construction with system messages + a history window
- Phase‑aware interviewing (understanding → implementation → testing)
- The API call to Chat Completions
```typescript
private async handleEmbeddedMessage(data: any): Promise<{ response: string }> {
  const { problem, message, conversationHistory = [], interviewPhase = 'problem-understanding' } = data;

  // Resolve config, falling back to storage if the caller did not pass one.
  if (!data.config || !data.config.apiKey) {
    const configResponse = await this.getConfig();
    if (!configResponse || !configResponse.apiKey) {
      throw new Error('API key not configured. Please configure your OpenAI API key in the extension popup.');
    }
    data.config = configResponse;
  }

  try {
    const conciseSystem = `You are a technical interviewer. Keep answers 1–2 sentences. Ask questions, avoid lecturing.`;
    const minimalProblem = `Problem: ${problem?.title || 'Unknown'} (${problem?.difficulty || 'Unknown'}). Phase: ${interviewPhase}`;
    const phaseInstruction = interviewPhase === 'implementation'
      ? 'Explicitly ask the candidate to implement now in the editor.'
      : interviewPhase === 'testing-review'
        ? 'Focus on analysis, complexity, and edge cases.'
        : '';

    // Read the window from data.config, which is guaranteed after the fallback above.
    const historyN = data.config.historyWindow || 8;
    const lastN = (conversationHistory || []).slice(-historyN);

    const messages = [
      { role: 'system', content: conciseSystem },
      { role: 'system', content: minimalProblem },
      phaseInstruction ? { role: 'system', content: phaseInstruction } : undefined,
      ...lastN,
      { role: 'user', content: message }
    ].filter(Boolean);

    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${data.config.apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: data.config.model || 'gpt-4o',
        messages,
        max_tokens: 200,
        temperature: 0.7,
      })
    });

    const result = await response.json();
    return { response: result.choices[0]?.message?.content ?? 'Error: empty response.' };
  } catch (err) {
    console.error('Error handling embedded message:', err);
    throw err;
  }
}
```
Notice how the background service centralizes API calls. This keeps keys out of content scripts and ensures consistent prompting.
## Content Script Notes

- The manifest currently loads one of the two content scripts (`src/content/standalone-react.tsx` or `src/content/content.ts`); the more feature‑rich problem detector lives in the other.
- Ensure the chosen content script caches the problem under `leetmentor_current_problem`; the interview page depends on it.

```typescript
chrome.storage.local.set({ leetmentor_current_problem: { title, difficulty, examples } });
```
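Part of the metadata the content script extracts can come straight from the URL rather than the DOM. A hypothetical sketch, assuming LeetCode's `/problems/<slug>/` URL scheme; the real detector reads richer fields (difficulty, tags, examples) from the page itself.

```typescript
// Hypothetical helpers: derive a problem slug and a display title from a
// LeetCode URL, e.g. https://leetcode.com/problems/two-sum/description/.
function parseProblemSlug(url: string): string | null {
  const match = new URL(url).pathname.match(/^\/problems\/([^/]+)/);
  return match ? match[1] : null; // null for non-problem routes
}

function slugToTitle(slug: string): string {
  // "two-sum" -> "Two Sum"
  return slug
    .split('-')
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(' ');
}
```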
## Getting Started (Dev)

1. Create `.env` files (client and backend). Do not commit keys.
2. Run `npm run dev` to watch the extension; `npm run dev:react` for UI test pages.
3. Run `cd backend && npm run dev` to enable Realtime mode.
4. Load `dist/` in `chrome://extensions`; toggle Realtime in the settings popup.
## Minimal Test Plan

- Problem detection: Navigate across LeetCode SPA routes; ensure `NAVIGATION_DETECTED` fires and the cache updates.
- Popup flow: Start an interview; `interview.html` opens with a session id present.
- Traditional voice: Mic capture → STT (Web Speech / Whisper) → Chat → TTS round‑trip.
- Realtime voice: Proxy connects; confirm full‑duplex audio and barge‑in.
- Persistence: Config survives reload via `chrome.storage.sync`; local transcripts are stored.
## Security Considerations

- Keep OpenAI keys out of content scripts. Background is acceptable for non‑Realtime calls; Realtime must use the backend.
- Restrict host permissions to what is necessary (`api.openai.com`, `leetcode.com/*`).
- Consider per‑origin connection controls in the proxy; log minimal PII; rotate keys if leaked.
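A per-origin gate for the proxy can be as simple as checking the WebSocket upgrade request's `Origin` header against an allow-list. This is a sketch only: the extension id is a placeholder, and `backend/server.js` may implement (or omit) this differently.

```typescript
// Sketch: allow-list check for the Realtime proxy. Entries are placeholders;
// a real deployment would list the actual extension id.
const ALLOWED_ORIGINS = new Set([
  'chrome-extension://your-extension-id', // placeholder extension id
  'https://leetcode.com',
]);

function isOriginAllowed(originHeader: string | undefined): boolean {
  if (!originHeader) return false; // reject connections with no Origin header
  // Normalize a trailing slash so "https://leetcode.com/" still matches.
  return ALLOWED_ORIGINS.has(originHeader.replace(/\/$/, ''));
}
```

With `ws`, a check like this would typically run in the server's connection (or `verifyClient`) handler, closing the socket before any audio is relayed.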
## Appendix: File Map

- `package.json` — scripts, deps
- `public/manifest.json` — MV3 definition
- `src/background/background.ts` — service worker / orchestrator
- `src/content/standalone-react.tsx` or `src/content/content.ts` — injected UI
- `src/interview/components/InterviewApp.tsx` — interview UX
- `src/shared/voice-service.ts` — Traditional voice
- `src/shared/realtime-voice-service.ts` — Realtime voice
- `src/shared/chatgpt-service.ts` — Chat + usage tracking
- `backend/server.js` — WS proxy to OpenAI Realtime
## Changelog

- 2025‑09‑17: First MDX draft with embedded `handleEmbeddedMessage` snippet.