Natural Language Processing
Fundamental Concepts
- Tokenization
- Stemming
- Lemmatization
- POS Tagging
- Named Entity Recognition
- Stopword Removal
- Syntax
- Dependency Parsing
- Parsing
- Chunking
Text Processing & Cleaning
- Text Normalization
- Bag of Words
- TF-IDF
- N-grams
- Word Embeddings
- Sentence Embeddings
- Document Similarity
- Cosine Similarity
- Text Vectorization
- Noise Removal
Tools, Libraries & APIs
- NLTK
- spaCy
- TextBlob
- Hugging Face Transformers
- Gensim
- OpenAI
- CoreNLP
- FastText
- Flair NLP
- ElasticSearch + NLP
Program(s)
- Build a Chatbot Using NLP
- Extracting Meaning from Text Using NLP in Python
- Extracting Email Addresses Using NLP in Python
- Extracting Names of People, Cities, and Countries Using NLP
- Format Email Messages Using NLP
- N-gram program
- Resume Skill Extraction Using NLP
- Sentiment Analysis in NLP
- Optimizing Travel Routes Using NLP & TSP Algorithm in Python
Python Program to Extract Skills from a Resume
Install Dependencies
First, install the necessary Python libraries:
pip install spacy pandas
python -m spacy download en_core_web_sm
import spacy
import re
# Load spaCy's English NLP model
nlp = spacy.load("en_core_web_sm")
# Predefined list of common skills (can be expanded)
SKILLS = [
    "Python", "Java", "C++", "Machine Learning", "Deep Learning", "NLP",
    "Data Science", "SQL", "TensorFlow", "PyTorch", "Computer Vision",
    "JavaScript", "React", "Django", "Flask", "Docker", "Kubernetes",
    "AWS", "Cloud Computing", "Agile", "Scrum", "Tableau", "Power BI"
]
def extract_skills(text):
    """
    Extract skills from a given resume text.
    """
    doc = nlp(text)

    # Map lowercased skill names back to their canonical form so matching
    # still works after clean_resume() lowercases the text
    skills_lookup = {skill.lower(): skill for skill in SKILLS}

    # Collect tokens that match an entry in the skills list
    # NOTE: token-level matching misses multi-word skills such as
    # "Machine Learning"; see the PhraseMatcher sketch under Enhancements
    extracted_skills = set()
    for token in doc:
        if token.text.lower() in skills_lookup:
            extracted_skills.add(skills_lookup[token.text.lower()])

    return list(extracted_skills)
def clean_resume(text):
    """
    Clean and preprocess resume text.
    """
    text = re.sub(r"\n+", " ", text)  # Remove new lines
    text = re.sub(r"\s+", " ", text)  # Remove extra spaces
    text = text.lower()               # Convert to lowercase
    return text
# Load resume text from a file
with open("resume.txt", "r", encoding="utf-8") as file:
    resume_text = file.read()
# Clean and process resume
cleaned_text = clean_resume(resume_text)
skills = extract_skills(cleaned_text)
print("Extracted Skills:", skills)
How It Works
- Loads a resume text file (resume.txt).
- Cleans the text by removing unnecessary spaces and formatting.
- Uses spaCy to analyze the text and extract skills.
- Matches detected words against a predefined list of skills.
- Prints the extracted skills.
Where to Use This?
- HR & Recruitment: Automate skill extraction from resumes for candidate screening.
- Job Matching Systems: Compare candidate skills with job descriptions (see the sketch after this list).
- Resume Parsing Software: Enhance NLP-based resume processing tools.
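As a rough illustration of the job-matching idea, the same extract_skills() function can be run on a job description and the two skill sets compared. The job_description string below is hypothetical; a real system would pull it from a job posting.
# Hypothetical job description text (illustrative only)
job_description = clean_resume("Looking for a Python developer with SQL, Docker and AWS experience.")

candidate_skills = set(extract_skills(cleaned_text))   # skills from the resume above
required_skills = set(extract_skills(job_description))

# Overlap between what the candidate has and what the job asks for
overlap = candidate_skills & required_skills
match_ratio = len(overlap) / len(required_skills) if required_skills else 0.0

print("Matching skills:", sorted(overlap))
print(f"Match ratio: {match_ratio:.0%}")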
How to Use?
- Prepare a Resume File: Save a resume as resume.txt.
- Run the Script: Execute the Python program.
- Get Extracted Skills: The program will print the skills found in the resume.
🚀 Enhancements: You can improve accuracy by using Named Entity Recognition (NER) with spaCy or a BERT-based model.
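Alongside the NER/BERT route, one concrete improvement is spaCy's PhraseMatcher, which catches multi-word skills such as "Machine Learning" that the token-by-token loop misses. A minimal sketch, reusing the SKILLS list and nlp model from the script above:
from spacy.matcher import PhraseMatcher

# Build case-insensitive patterns from the same SKILLS list
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
matcher.add("SKILLS", [nlp.make_doc(skill) for skill in SKILLS])

def extract_skills_phrase(text):
    """Extract skills, including multi-word ones, using PhraseMatcher."""
    doc = nlp(text)
    return sorted({doc[start:end].text for _, start, end in matcher(doc)})

# attr="LOWER" already makes matching case-insensitive, so the raw
# resume_text can be used directly without lowercasing it first
print(extract_skills_phrase(resume_text))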