LeetMentor — Voice-Enabled AI Interviewer (Chrome Extension)

Jorge Delgadillo

Repo: LeetMentor

LeetMentor overlays a conversational, voice‑enabled AI interviewer directly on LeetCode. It supports two voice modes—Traditional and Realtime—and cleanly separates the extension runtime (background, content, popup) from a small Node WebSocket proxy for low‑latency Realtime audio.

This document is written as MDX so we can embed rich UI bits later (e.g., callouts, tabs). It’s intentionally concise but technical, with links to source files and a couple of deep‑dive code snippets.


TL;DR

LeetMentor is an MV3 Chrome extension that turns LeetCode problems into mock interviews: a content script scrapes the problem, a background service worker owns config, sessions, and all OpenAI calls, and a web-accessible React interview page runs the conversation. Voice comes in two modes: Traditional (Web Speech/Whisper + TTS around Chat Completions) and Realtime (mic audio streamed through a small Node WebSocket proxy to OpenAI Realtime).

Architecture (lean)

Extension (MV3)

Popup: settings + start button

Content Script: injects UI on LeetCode and extracts/caches problem data

Background (service worker): config/session store, message router, OpenAI calls, SPA navigation detection

Interview Page (web‑accessible): React app that reads cached problem + session/config and runs the conversation

Backend (optional, for Realtime): tiny Node ws proxy that relays mic/audio frames between the client and OpenAI Realtime and keeps the API key server‑side

Data/State

chrome.storage.sync: API key, model, voice settings, history window

chrome.storage.local: current session, transcript, cached problem

chrome.runtime.sendMessage: popup/content/interview ↔ background
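
For instance, the popup's side of that messaging channel might look like this; the START_INTERVIEW message shape is an assumption, not the actual implementation:

// popup sketch: ask the background to start an interview
// (the message/response shape is an assumption)
async function onStartClick(): Promise<void> {
  const reply = await chrome.runtime.sendMessage({ type: 'START_INTERVIEW' });
  if (!reply?.ok) {
    console.error('Failed to start interview:', reply?.error);
  }
}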

Key Surfaces

Popup, content script, background service worker, interview page, and the optional Node Realtime proxy; see Architecture above for what each owns.

Runtime Flow

  1. Open a LeetCode problem. Content script extracts title/difficulty/tags and writes to chrome.storage.local.
  2. Click Start Interview (popup or injected UI). The popup/content script sends START_INTERVIEW to the background (see the router sketch after this list).
  3. Background resolves config, ensures a session, and opens interview.html (web-accessible resource) with a session id.
  4. InterviewApp loads session + problem, greets the user, and selects a voice mode based on settings.
  5. Traditional mode uses ChatGPTService (Chat Completions) + VoiceService (Web Speech / Whisper + TTS).
    Realtime mode uses RealtimeVoiceService to stream mic audio to the Node proxy, which speaks back with low latency.
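
To make steps 2–3 concrete, here is a minimal sketch of how the background might route that message. The handler name, payload shapes, and the leetmentor_current_session key are assumptions for illustration:

// background sketch: route START_INTERVIEW from the popup/content script
chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
  if (message?.type === 'START_INTERVIEW') {
    handleStartInterview()
      .then((sessionId) => sendResponse({ ok: true, sessionId }))
      .catch((err) => sendResponse({ ok: false, error: String(err) }));
    return true; // keep the channel open for the async response
  }
  return false;
});

async function handleStartInterview(): Promise<string> {
  const sessionId = crypto.randomUUID();
  await chrome.storage.local.set({
    leetmentor_current_session: { id: sessionId, startedAt: Date.now() },
  });
  // interview.html is a web-accessible resource; pass the session id along
  await chrome.tabs.create({
    url: chrome.runtime.getURL(`interview.html?session=${sessionId}`),
  });
  return sessionId;
}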

Tech Stack

TypeScript + React (interview page and injected UI), Chrome Manifest V3 APIs (storage, runtime messaging, service worker), Node + ws for the Realtime proxy, and the OpenAI APIs (Chat Completions, Whisper, TTS, Realtime).

Commands

# Build the extension (dist/)
npm run build

# Watch mode for rapid iteration
npm run dev

# Watch + local React test page (e.g., src/content/react-test.html)
npm run dev:react

# Realtime voice backend
cd backend && npm run dev

Load the unpacked extension from dist/ at chrome://extensions.

Configuration & Storage

User config (API key, model, voice settings, history window) → chrome.storage.sync

Session, transcript, problem cache → chrome.storage.local

Messaging → chrome.runtime.sendMessage (popup/content/interview ↔ background)

Keep API keys out of the client where possible. Only Realtime requires a backend proxy (and keeps keys server‑side).
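
A minimal sketch of that sync/local split; key names other than leetmentor_current_problem, and the config shape itself, are assumptions:

interface LeetMentorConfig {
  apiKey: string;
  model: string;          // e.g. 'gpt-4o'
  historyWindow: number;  // last-N messages sent to the model
}

// User config roams with the Chrome profile
async function saveConfig(config: LeetMentorConfig): Promise<void> {
  await chrome.storage.sync.set({ leetmentor_config: config });
}

// Session/problem state stays on this machine; written by the content script
async function loadCachedProblem(): Promise<unknown> {
  const items = await chrome.storage.local.get('leetmentor_current_problem');
  return items.leetmentor_current_problem;
}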

Voice Modes

Traditional Voice

TTS: Browser voice or OpenAI TTS via background

STT: Web Speech API; fallback to Whisper via background

LLM: ChatGPTService composes concise prompts + last‑N history and calls Chat Completions; tracks usage/cost
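
A sketch of that STT fallback path; the TRANSCRIBE_AUDIO message type and the recordOnce() helper are hypothetical:

// Hypothetical helper: capture one utterance from the mic as base64 audio
declare function recordOnce(): Promise<string>;

// STT sketch: try the Web Speech API first, else round-trip Whisper
// through the background service worker
function startListening(onText: (text: string) => void): void {
  const SR = (window as any).SpeechRecognition
    ?? (window as any).webkitSpeechRecognition;
  if (SR) {
    const rec = new SR();
    rec.continuous = false;
    rec.onresult = (e: any) => onText(e.results[0][0].transcript);
    rec.start();
    return;
  }
  recordOnce().then(async (audioBase64) => {
    const reply = await chrome.runtime.sendMessage({
      type: 'TRANSCRIBE_AUDIO',
      payload: { audioBase64 },
    });
    onText(reply.text);
  });
}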

Realtime Voice

Transport: Mic audio → Node proxy → OpenAI Realtime → streaming audio back

UX: Lower latency, barge‑in, conversational flow
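
A minimal relay sketch for that transport, assuming the ws package; the port, model name, and headers are assumptions, and the real proxy may buffer or transform frames:

// backend relay sketch (Node + ws): the browser never sees OPENAI_API_KEY
import { WebSocketServer, WebSocket } from 'ws';

const wss = new WebSocketServer({ port: 8787 });

wss.on('connection', (client) => {
  // One upstream Realtime connection per browser client
  const upstream = new WebSocket(
    'wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview',
    {
      headers: {
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
        'OpenAI-Beta': 'realtime=v1',
      },
    }
  );

  // Relay frames both ways; audio stays opaque to the proxy
  client.on('message', (frame) => {
    if (upstream.readyState === WebSocket.OPEN) upstream.send(frame);
  });
  upstream.on('message', (frame) => {
    if (client.readyState === WebSocket.OPEN) client.send(frame);
  });

  const closeBoth = () => { client.close(); upstream.close(); };
  client.on('close', closeBoth);
  upstream.on('close', closeBoth);
});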

Deep‑Dive Snippet #1 — Background Chat Handling

Here’s a real excerpt from the background service showing how embedded interview messages are routed through OpenAI. It demonstrates:

Config resolution (including storage fallback)

Prompt construction with system + history window

Phase‑aware interviewing (understanding → implementation → testing)

API call to Chat Completions

private async handleEmbeddedMessage(data: any): Promise<{ response: string }> {
  const { problem, message, conversationHistory = [], interviewPhase = 'problem-understanding' } = data;

  // Resolve config, falling back to storage if the caller didn't pass one
  if (!data.config || !data.config.apiKey) {
    const configResponse = await this.getConfig();
    if (!configResponse || !configResponse.apiKey) {
      throw new Error('API key not configured. Please configure your OpenAI API key in the extension popup.');
    }
    data.config = configResponse;
  }

  try {
    // System prompts: keep the interviewer terse and phase-aware
    const conciseSystem = `You are a technical interviewer. Keep answers 1–2 sentences. Ask questions, avoid lecturing.`;
    const minimalProblem = `Problem: ${problem?.title || 'Unknown'} (${problem?.difficulty || 'Unknown'}). Phase: ${interviewPhase}`;
    const phaseInstruction = interviewPhase === 'implementation'
      ? 'Explicitly ask the candidate to implement now in the editor.'
      : interviewPhase === 'testing-review'
        ? 'Focus on analysis, complexity, and edge cases.'
        : '';

    // Trim history to the configured window. Note: read from the resolved
    // data.config, not a destructured config that may be undefined.
    const historyN = data.config.historyWindow || 8;
    const lastN = (conversationHistory || []).slice(-historyN);
    const messages = [
      { role: 'system', content: conciseSystem },
      { role: 'system', content: minimalProblem },
      phaseInstruction ? { role: 'system', content: phaseInstruction } : undefined,
      ...lastN,
      { role: 'user', content: message },
    ].filter(Boolean);

    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${data.config.apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: data.config.model || 'gpt-4o',
        messages,
        max_tokens: 200,
        temperature: 0.7,
      }),
    });

    const result = await response.json();
    return { response: result.choices?.[0]?.message?.content ?? 'Error: empty response.' };
  } catch (err) {
    console.error('Error handling embedded message:', err);
    throw err;
  }
}

Notice how the background service centralizes API calls. This keeps keys out of content scripts and ensures consistent prompting.


Content Script Notes

The content script extracts the problem from the page and caches it so the interview page can read it without re-scraping:

chrome.storage.local.set({ leetmentor_current_problem: { title, difficulty, examples } });
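
A sketch of the extraction that produces that cache entry; the DOM selectors are assumptions, since LeetCode's markup changes often:

// content script sketch: scrape the problem and cache it for the interview page
function extractProblem() {
  const title =
    document.querySelector('[data-cy="question-title"]')?.textContent?.trim()
    ?? document.title;
  const difficulty =
    document.querySelector('[class*="difficulty"]')?.textContent?.trim()
    ?? 'Unknown';
  return { title, difficulty, examples: [] as string[] };
}

chrome.storage.local.set({ leetmentor_current_problem: extractProblem() });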

Getting Started (Dev)

  1. Create .env files (client and backend). Do not commit keys.
  2. npm run dev to watch the extension; npm run dev:react for UI test pages.
  3. cd backend && npm run dev to enable Realtime mode.
  4. Load dist/ in chrome://extensions; toggle Realtime in the settings popup.

Minimal Test Plan

  1. Build, load dist/, open a LeetCode problem; confirm the injected UI appears and the problem is cached in chrome.storage.local.
  2. Run a Traditional-mode interview with no backend; confirm STT/TTS round-trips and phase-aware replies.
  3. Start the backend and enable Realtime; confirm low-latency audio and barge-in.
  4. Navigate to another problem within LeetCode's SPA; confirm the background detects it and the cache updates.

Security Considerations

The API key lives in chrome.storage.sync and is read only by the background service worker; content scripts never handle it, and all Chat Completions/Whisper/TTS calls are made from the background. Realtime traffic goes through the Node proxy so the key stays server-side. Keep .env files (client and backend) out of version control.

Appendix: File Map


Changelog