Resume PDF Parser (Node.js + TypeScript)
A simple Node.js project that extracts structured resume information from a PDF file and converts it into a clean JSON format.
The parser reads a resume PDF, groups the extracted text first into lines and then into sections, and outputs structured data such as:
- Profile details
- Education
- Work experience
- Skills
- Projects (if present)
Features
- Parse resume data from PDF
- Extract structured fields:
  - Name, email, links, summary
  - Education history
  - Work experience
  - Skills
- Outputs a clean resume.json file
- Built with TypeScript
- Uses pdfjs-dist for PDF text extraction
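To illustrate the kind of field extraction listed above, here is a minimal sketch of pulling an email address and links out of text lines with regular expressions. The function name `extractContactInfoSketch`, the `ContactInfoSketch` type, and both patterns are assumptions made for this example; the project's actual extraction rules may differ.

```typescript
// Hypothetical result shape for this sketch only.
interface ContactInfoSketch {
  email: string | null;
  links: string[];
}

// Deliberately loose patterns; real resumes may need stricter rules.
const EMAIL_PATTERN = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/;
// Matches "domain.tld/path", optionally prefixed with http(s):// or www.
const LINK_PATTERN = /(?:https?:\/\/)?(?:www\.)?[\w-]+(?:\.[\w-]+)+\/[^\s]*/g;

function extractContactInfoSketch(lines: string[]): ContactInfoSketch {
  let email: string | null = null;
  const links: string[] = [];

  for (const line of lines) {
    // Keep only the first email address found in the document.
    if (email === null) {
      const emailMatch = line.match(EMAIL_PATTERN);
      if (emailMatch) email = emailMatch[0];
    }

    // Blank out email addresses first so their domain part is not
    // mistaken for a link.
    const withoutEmails = line.replace(new RegExp(EMAIL_PATTERN.source, "g"), " ");
    const linkMatches = withoutEmails.match(LINK_PATTERN) ?? [];
    for (const link of linkMatches) links.push(link);
  }

  return { email, links };
}
```

The link pattern only accepts URLs that contain a path (a `/` after the domain), which keeps plain words with dots from being picked up but also means a bare domain would be missed.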
Installation
npm install
Build
Compile TypeScript to JavaScript:
npm run build
Run
npm start
Build the project before running it; output is produced only from a built project.
Running the project will:
- Read sample_resume.pdf
- Extract resume data
- Generate resume.json
Output example:
{
  "profile": {
    "name": "John Doe",
    "email": "john@example.com",
    "url": "linkedin.com/in/johndoe",
    "summary": "Software Engineer with experience in..."
  },
  "educations": [],
  "workExperiences": [],
  "skills": {
    "descriptions": ["JavaScript", "Node.js", "Cloud"]
  }
}
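For code that consumes resume.json, the example output suggests a shape along the following lines. This is a sketch inferred from the sample alone: the interface names and the `isResumeOutput` guard are hypothetical, and because the sample shows `educations` and `workExperiences` as empty arrays, their element types are left as `unknown`.

```typescript
// Shape inferred from the sample output; not taken from the project's source.
interface ResumeProfile {
  name: string;
  email: string;
  url: string;
  summary: string;
}

interface ResumeOutput {
  profile: ResumeProfile;
  educations: unknown[]; // element shape not shown in the sample
  workExperiences: unknown[]; // element shape not shown in the sample
  skills: { descriptions: string[] };
}

// Minimal runtime check that parsed JSON has the top-level fields above.
function isResumeOutput(value: unknown): value is ResumeOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const profile = v.profile as Record<string, unknown> | undefined;
  const skills = v.skills as Record<string, unknown> | undefined;
  return (
    typeof profile === "object" && profile !== null &&
    typeof profile.name === "string" &&
    typeof profile.email === "string" &&
    Array.isArray(v.educations) &&
    Array.isArray(v.workExperiences) &&
    typeof skills === "object" && skills !== null &&
    Array.isArray(skills.descriptions)
  );
}
```

A caller would pass the result of `JSON.parse` on the file contents to `isResumeOutput` before reading fields such as `profile.name`.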
How It Works
- readPdf: extracts raw text items from the PDF
- groupTextItemsIntoLines: converts text items into readable lines
- groupLinesIntoSections: detects resume sections (Education, Experience, Skills)
- extractResumeFromSections: builds the structured resume JSON
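The two grouping steps can be sketched roughly as follows. This is not the project's implementation: the `PositionedText` shape, the `Sketch`-suffixed function names, the 2-unit y tolerance, the keyword list, and the "profile" bucket for lines before the first heading are all assumptions chosen for illustration.

```typescript
// Assumed shape of a positioned text fragment; the items returned by
// readPdf may carry different fields.
interface PositionedText {
  text: string;
  x: number;
  y: number;
}

// Items whose y coordinates differ by at most `tolerance` share a line.
function groupTextItemsIntoLinesSketch(
  items: PositionedText[],
  tolerance = 2
): string[] {
  // PDF coordinates grow upward, so a larger y is higher on the page.
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const buckets: PositionedText[][] = [];
  for (const item of sorted) {
    const current = buckets[buckets.length - 1];
    if (current && Math.abs(current[0].y - item.y) <= tolerance) {
      current.push(item);
    } else {
      buckets.push([item]);
    }
  }
  // Within each line, order fragments left to right and join them.
  return buckets.map((bucket) =>
    bucket
      .sort((a, b) => a.x - b.x)
      .map((item) => item.text)
      .join(" ")
      .trim()
  );
}

// Hypothetical heading keywords; a real parser likely uses more signals.
const SECTION_KEYWORDS = ["education", "experience", "skills", "projects"];

function groupLinesIntoSectionsSketch(lines: string[]): Record<string, string[]> {
  // Lines before the first heading are assumed to belong to the profile.
  const sections: Record<string, string[]> = { profile: [] };
  let current = "profile";
  for (const line of lines) {
    const normalized = line.trim().toLowerCase();
    const keyword = SECTION_KEYWORDS.find((k) => normalized.includes(k));
    // Treat a short line (at most three words) containing a keyword as a heading.
    if (keyword && normalized.split(/\s+/).length <= 3) {
      current = keyword;
      sections[current] = sections[current] ?? [];
    } else if (normalized.length > 0) {
      sections[current].push(line.trim());
    }
  }
  return sections;
}
```

Chaining the two, `groupLinesIntoSectionsSketch(groupTextItemsIntoLinesSketch(items))`, yields a map from section name to its lines, which a final step could turn into the JSON fields. The short-line heuristic is crude and would misclassify a body line such as "Skills: JS", so treat it as a starting point only.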
License
MIT

