feat: structured parsing #190

Merged
minglet merged 2 commits into dev from feat/structured_parsing
Jan 9, 2026

@byungchul-chae
Member

Structured Resume Parsing

Implements LLM-based resume parsing that converts raw resume text into structured JSON.

Key Changes

  • New resume_structured_parser.py module using ChatUpstage LLM
  • Structured data cached separately in resume_structured_cache
  • New /api/resume/structured endpoint for retrieving structured data
  • Automatic structured parsing in /api/matching and /api/chat endpoints
  • Comprehensive logging of parsing results in Docker logs
  • Data normalization: empty strings → null for consistency
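The normalization step in the last bullet could look roughly like the sketch below. This is an illustration only, not the code in resume_structured_parser.py: the function name `normalize_empty_strings` is hypothetical, and it assumes only exactly empty strings are replaced (whitespace-only values are left untouched).

```python
from typing import Any


def normalize_empty_strings(value: Any) -> Any:
    """Recursively replace empty strings with None (JSON null).

    Hypothetical helper: the name and the exact rules are assumptions,
    not taken from the PR's actual implementation.
    """
    if isinstance(value, str):
        # Only exactly-empty strings are normalized in this sketch.
        return None if value == "" else value
    if isinstance(value, dict):
        return {key: normalize_empty_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_empty_strings(item) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value
```

Applied to a parsed resume, nested fields such as `basic_info.email` would become `None` when the LLM returned `""`, so consumers only need to check for null rather than for both null and empty strings.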

Data Flow

PDF → Upstage API (text) → LLM (structured JSON) → Cache

Structured output follows RESUME_STRUCTURE_SCHEMA.md with fields like basic_info, careers, educations, skills, etc.
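The data flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged code: `parse_resume_structured`, the injected `extract_text` and `call_llm` callables, and the hash-based cache key are all hypothetical, and the dictionary only stands in for resume_structured_cache. The real module calls the Upstage API and ChatUpstage where the callables appear here.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# In-memory stand-in for resume_structured_cache (assumption: the real
# cache may be keyed and stored differently).
resume_structured_cache: Dict[str, Dict[str, Any]] = {}


def parse_resume_structured(
    pdf_bytes: bytes,
    extract_text: Callable[[bytes], str],
    call_llm: Callable[[str], str],
) -> Dict[str, Any]:
    """PDF -> text -> LLM (structured JSON) -> cache.

    extract_text and call_llm are hypothetical injection points for the
    text-extraction and LLM calls; call_llm is assumed to return a JSON string.
    """
    # Key the cache on the file content so the same PDF is parsed only once.
    cache_key = hashlib.sha256(pdf_bytes).hexdigest()
    if cache_key in resume_structured_cache:
        return resume_structured_cache[cache_key]

    raw_text = extract_text(pdf_bytes)
    structured = json.loads(call_llm(raw_text))

    resume_structured_cache[cache_key] = structured
    return structured
```

Passing the two external calls in as parameters keeps the sketch runnable without network access and makes the cache behavior easy to test with fakes.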

@ed-kyu
Member

ed-kyu commented Jan 9, 2026

I'll hook the API up on the frontend. Thanks for the work!

@minglet
Collaborator

minglet commented Jan 9, 2026

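@byungchul-chae Earlier I wrote backend/util/url_extractor.py, which extracts only the URLs from a resume. I take it you didn't use that module here?
In any case, I think URLs such as email and blog addresses (when they appear as hyperlinks) should go under the "email" and "link" keys in the parsing result. Could you add that as well?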

minglet merged commit 7e04cf0 into dev on Jan 9, 2026