Python Program to Extract Skills from a Resume


Install Dependencies

First, install the necessary Python libraries:

pip install spacy
python -m spacy download en_core_web_sm

Python Code

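import re

import spacy
from spacy.matcher import PhraseMatcher

# Load spaCy's English NLP model
nlp = spacy.load("en_core_web_sm")

# Predefined list of common skills (can be expanded)
SKILLS = [
    "Python", "Java", "C++", "Machine Learning", "Deep Learning", "NLP",
    "Data Science", "SQL", "TensorFlow", "PyTorch", "Computer Vision",
    "JavaScript", "React", "Django", "Flask", "Docker", "Kubernetes",
    "AWS", "Cloud Computing", "Agile", "Scrum", "Tableau", "Power BI"
]

# Build a phrase matcher so multi-word skills such as "Machine Learning" can match.
# Each skill is registered under its own name (spaCy v3 signature: add(key, patterns)).
matcher = PhraseMatcher(nlp.vocab)
for skill in SKILLS:
    matcher.add(skill, [nlp.make_doc(skill.lower())])

def extract_skills(text):
    """
    Extract skills from a given resume text (case-insensitive).
    """
    # Lowercase the text so it is tokenized the same way as the patterns;
    # only tokenization is needed here, so make_doc() is enough
    doc = nlp.make_doc(text.lower())

    extracted_skills = set()
    for match_id, start, end in matcher(doc):
        # The match ID is the hash of the skill name passed to matcher.add()
        extracted_skills.add(nlp.vocab.strings[match_id])

    return sorted(extracted_skills)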

def clean_resume(text):
    """
    Normalize whitespace in resume text.
    """
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    # Case is left unchanged so that skill names keep their original spelling.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

# Load resume text from a file
with open("resume.txt", "r", encoding="utf-8") as file:
    resume_text = file.read()

# Clean and process resume
cleaned_text = clean_resume(resume_text)
skills = extract_skills(cleaned_text)

print("Extracted Skills:", skills)

How It Works

  1. Loads a resume text file (resume.txt).
  2. Cleans the text by collapsing line breaks and repeated whitespace.
  3. Uses spaCy to split the text into tokens.
  4. Matches the tokens against a predefined list of skills.
  5. Prints the extracted skills.
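
The cleaning and matching steps above can also be tried without installing spaCy. The sketch below is a swapped-in technique (a single case-insensitive regular expression), not the spaCy tokenizer used in the script. The shortened SKILLS list and the names build_skill_pattern and find_skills are assumptions made for illustration.

```python
import re

# Hypothetical, shortened skill list for illustration only
SKILLS = ["Python", "Java", "JavaScript", "C++", "Machine Learning", "SQL"]

def build_skill_pattern(skills):
    """Compile one case-insensitive regex that matches any listed skill as a whole term."""
    # Try longer names first so "JavaScript" is preferred over "Java"
    ordered = sorted(skills, key=len, reverse=True)
    alternatives = "|".join(re.escape(s) for s in ordered)
    # Lookarounds stand in for \b, which does not work next to the "+" in "C++"
    return re.compile(rf"(?<![\w+#])(?:{alternatives})(?![\w+#])", re.IGNORECASE)

def find_skills(text, skills=SKILLS):
    """Return the listed skills found in text, using their spelling from the list."""
    canonical = {s.lower(): s for s in skills}
    text = re.sub(r"\s+", " ", text)  # collapse line breaks and repeated spaces
    pattern = build_skill_pattern(skills)
    found = {canonical[m.group(0).lower()] for m in pattern.finditer(text)}
    return sorted(found)

sample = "Built machine\nlearning pipelines in python and c++; front end in JavaScript."
print(find_skills(sample))
```

Because the match is on whole terms, "JavaScript" in the sample does not also count as "Java".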

Where to Use This?

  • HR & Recruitment: Automate skill extraction from resumes for candidate screening.
  • Job Matching Systems: Compare candidate skills with job descriptions.
  • Resume Parsing Software: Enhance NLP-based resume processing tools.
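
For the job matching use case, one simple way to compare a candidate with a job description is a set comparison of skill names. This is a minimal sketch under the assumption that both skill lists have already been extracted; the function name match_skills and the coverage score are hypothetical and not part of the script above.

```python
def match_skills(candidate_skills, required_skills):
    """Compare two skill lists case-insensitively and report the overlap."""
    candidate = {s.lower() for s in candidate_skills}
    required = {s.lower() for s in required_skills}
    matched = sorted(required & candidate)
    missing = sorted(required - candidate)
    # Share of required skills the candidate has; 0.0 when nothing is required
    coverage = len(matched) / len(required) if required else 0.0
    return {"matched": matched, "missing": missing, "coverage": coverage}

result = match_skills(
    ["Python", "SQL", "Docker"],         # skills extracted from a resume
    ["Python", "SQL", "AWS", "Docker"],  # skills listed in a job description
)
print(result)
```

A plain overlap score treats every skill as equally important; a real screening system would likely weight required and optional skills differently.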

How to Use?

  1. Prepare a Resume File: Save a resume as resume.txt.
  2. Run the Script: Execute the Python program.
  3. Get Extracted Skills: The program will print the skills found in the resume.
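
If resume.txt is missing or empty, the script stops with a raw traceback or prints an empty list. The sketch below shows one way to load the file with clearer error messages. The helper name load_resume is an assumption for illustration and is not part of the script above.

```python
from pathlib import Path

def load_resume(path="resume.txt"):
    """Return the resume text, or raise a readable error if the file is missing or empty."""
    resume_path = Path(path)
    if not resume_path.is_file():
        raise FileNotFoundError(f"Resume file not found: {resume_path}")
    text = resume_path.read_text(encoding="utf-8")
    if not text.strip():
        raise ValueError(f"Resume file is empty: {resume_path}")
    return text
```

The returned text can then be passed to clean_resume and extract_skills in place of the open() call.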

🚀 Enhancements: A fixed list only finds the skills it already contains. To recognize skills that are not on the list, you can train a custom Named Entity Recognition (NER) component in spaCy or fine-tune a transformer model such as BERT on labeled resumes.