This repository presents a lightweight, multilingual avatar system for real-time Human-AI interaction in Kazakh, Russian, and English. We compare two deployment architectures developed at ISSAI:
- Local: Uses the quantized Qolda model (4.3B parameters), Whisper Turbo ASR, and Matcha-TTS
- Cloud-based: Uses the Oylan LLM and MangiSoz APIs
Key Results:
- Local deployment cuts end-to-end latency by 62% (2.20 s vs 5.74 s)
- LLM inference latency: 76% lower locally (0.99 s vs 4.11 s)
- ASR latency: 38% lower locally
- Avatar rendering uses only 15-20% GPU at 60 FPS
- On-device models enable responsive, offline multilingual interaction
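
The percentages above follow from the reported latencies if they are read as relative latency reductions, that is, (cloud - local) / cloud. A minimal sketch of that arithmetic, where the function name is an illustrative assumption and not part of the repository:

```javascript
// Relative latency reduction of local versus cloud, as a whole-number percent.
// `latencyReductionPercent` is a hypothetical helper used only for illustration.
function latencyReductionPercent(localSeconds, cloudSeconds) {
  if (!(cloudSeconds > 0)) {
    throw new RangeError("cloudSeconds must be positive");
  }
  return Math.round(((cloudSeconds - localSeconds) / cloudSeconds) * 100);
}

console.log(latencyReductionPercent(2.20, 5.74)); // end-to-end: prints 62
console.log(latencyReductionPercent(0.99, 4.11)); // LLM inference: prints 76
```

The ASR figure cannot be recomputed this way because the underlying ASR latencies are not listed here.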
Features:
- Multilingual Support: Kazakh, Russian, and English language processing
- Dual Deployment Architectures: Cloud-based and local deployment options
- Real-time Human-AI Interaction: Low-latency conversational interface
- 3D Avatar Interface: Ready Player Me-based avatar rendering at 60 FPS
- Speech Processing Pipeline: End-to-end ASR, LLM inference, and TTS synthesis
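
The speech pipeline listed above (ASR, then LLM inference, then TTS) can be timed stage by stage to produce latency figures of the kind reported in the key results. The following is only a sketch under stated assumptions: `runTimedPipeline` and the `transcribe`, `generate`, and `synthesize` stage functions are hypothetical placeholders supplied by the caller, not the project's actual API.

```javascript
// Runs ASR -> LLM -> TTS in sequence and records per-stage latency in seconds.
// `stages` holds caller-supplied async functions (hypothetical names).
// `now` returns milliseconds and is injectable so the timing can be tested.
async function runTimedPipeline(audio, stages, now = () => performance.now()) {
  const timings = {};

  // Wraps one stage call and stores how long it took.
  const timed = async (name, fn, input) => {
    const start = now();
    const output = await fn(input);
    timings[name] = (now() - start) / 1000;
    return output;
  };

  const transcript = await timed("asr", stages.transcribe, audio);
  const reply = await timed("llm", stages.generate, transcript);
  const speech = await timed("tts", stages.synthesize, reply);

  timings.total = timings.asr + timings.llm + timings.tts;
  return { transcript, reply, speech, timings };
}
```

Because the stages are passed in, the same function could wrap either the local models or the cloud APIs, which keeps the two measurements comparable.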
The following diagram illustrates the complete system architecture comparing cloud-based and local deployment approaches:
- Node.js (v16 or higher)
- npm or yarn package manager
- Modern browser with microphone access support
- MangiSoz API access (STT and TTS services)
- Clone the repository:
git clone <repository-url>
cd r3f-virtual-girlfriend-frontend
- Install dependencies:
npm install
# or
yarn install
- Start development server:
npm run dev
# or
yarn dev
- Open in browser:
Navigate to http://localhost:5173/
- Click the microphone button to start voice recognition
- Speak your question to the AI educator
- Stop speaking - automatic 2-second countdown begins
- Message sends automatically - no button clicking needed!
- AI responds - microphone auto-pauses during response
- Auto-resumes after AI finishes for seamless conversation
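
The hands-free flow above (listen, wait for 2 seconds of silence, send, pause while the AI responds, resume) can be modeled as a small state machine. This is a minimal sketch, not the repository's implementation: `createTurnController` and its method names are assumptions made for illustration, and timestamps are passed in by the caller in milliseconds.

```javascript
// Minimal state machine for hands-free turn-taking.
// All names are hypothetical; timestamps are milliseconds supplied by the caller.
function createTurnController({ silenceMs = 2000, onSend }) {
  let state = "idle"; // "idle" | "listening" | "paused"
  let transcript = "";
  let lastSpeechAt = null;

  return {
    get state() {
      return state;
    },
    // Call when the user clicks the microphone button.
    start() {
      state = "listening";
      transcript = "";
      lastSpeechAt = null;
    },
    // Call whenever the recognizer reports newly recognized speech.
    onSpeech(text, nowMs) {
      if (state !== "listening") return;
      transcript = (transcript + " " + text).trim();
      lastSpeechAt = nowMs;
    },
    // Call periodically; sends the message once the silence window has elapsed.
    tick(nowMs) {
      if (state !== "listening" || lastSpeechAt === null) return;
      if (nowMs - lastSpeechAt >= silenceMs) {
        const message = transcript;
        transcript = "";
        lastSpeechAt = null;
        state = "paused"; // microphone stays paused while the AI responds
        onSend(message);
      }
    },
    // Call when the AI response has finished playing.
    onResponseFinished() {
      if (state === "paused") state = "listening";
    },
  };
}
```

In a React hook, `tick` would typically be driven by a timer and `onResponseFinished` by the end of audio playback; both wirings are left out here.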
src/
├── components/
│   ├── LandingPage.jsx          # Beautiful landing page
│   ├── ClassroomPage.jsx        # Main classroom wrapper
│   ├── ClassroomUI.jsx          # Zoom-like interface
│   ├── ClassroomExperience.jsx  # 3D classroom environment
│   ├── VoiceRecognition.jsx     # Voice control component
│   ├── Avatar.jsx               # AI educator 3D model
│   ├── Experience.jsx           # Original 3D scene
│   └── UI.jsx                   # Original UI (legacy)
├── hooks/
│   ├── useChat.jsx              # AI conversation management
│   ├── useMangiSozSTT.jsx       # MangiSoz STT integration
│   └── useVoiceRecognition.jsx  # Legacy voice recognition (deprecated)
├── assets/
├── App.jsx                      # Main app with routing
├── main.jsx                     # App entry point
└── index.css                    # Global styles + animations
Modify src/components/ClassroomExperience.jsx to add new environments:
- Change lighting presets
- Add new 3D models
- Customize classroom layout
Update src/hooks/useChat.jsx to modify:
- Educational context
- Subject specialization
- Response style
- Learning level
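
One way to keep these four settings in a single place is a plain object that is turned into an instruction string for the LLM. The sketch below is an assumption about how this could look; the field names and `buildSystemPrompt` are illustrative and are not taken from `useChat.jsx`.

```javascript
// Hypothetical settings object; field names are illustrative, not from useChat.jsx.
const educatorSettings = {
  context: "university classroom",
  subject: "physics",
  style: "concise and encouraging",
  level: "beginner",
};

// Turns the settings into a single instruction string for the LLM.
function buildSystemPrompt({ context, subject, style, level }) {
  return [
    `You are an AI educator in a ${context}.`,
    `You specialize in ${subject}.`,
    `Answer in a ${style} style.`,
    `Assume the learner is at ${level} level.`,
  ].join(" ");
}

console.log(buildSystemPrompt(educatorSettings));
```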
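Adjust src/hooks/useVoiceRecognition.jsx (marked as legacy in the project structure; the MangiSoz STT integration lives in src/hooks/useMangiSozSTT.jsx) for: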
- Silence detection timing (default: 2 seconds)
- Language settings
- Audio sensitivity
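
Audio sensitivity is commonly expressed as a level threshold below which a frame counts as silence. The following sketch shows one such check based on root-mean-square (RMS) level; it is an assumption about how sensitivity could be applied, and `rmsLevel`, `isSilentFrame`, and the default threshold of 0.01 are illustrative values not taken from the project.

```javascript
// Root-mean-square level of one audio frame (samples in the range [-1, 1]).
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// A frame counts as silent when its RMS level is below the sensitivity threshold.
// The default of 0.01 is an illustrative value, not one taken from the project.
function isSilentFrame(samples, threshold = 0.01) {
  return rmsLevel(samples) < threshold;
}
```

Lowering the threshold makes the detector treat quieter input as speech, which delays the silence countdown in noisy rooms; raising it does the opposite.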
npm run build
# or
yarn build

npm run preview
# or
yarn preview

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit to ISSAI
- NonCommercial — You may not use the material for commercial purposes
For more details, see the CC BY-NC 4.0 License.