Level-based Resume Classification on Nursing Job Positions

Gu, Haoqi (Spring 2020)

Permanent URL: https://etd.library.emory.edu/concern/etds/dv13zv23z?locale=es
Published

Abstract

This thesis focuses on real application resumes. Unlike most related work, we do not categorize resumes into broad job groups (for example, IT, medical care, or teaching); instead, we classify application resumes for a specific level-based job position, Clinical Research Coordinator (CRC), at the School of Nursing at Emory University. The position has four levels, CRC I, II, III, and IV, to which applicants can apply, and we aim to develop an algorithm that classifies resumes into these four levels based on their content. The methods used are string matching, feature vectors, bags of words, and ensemble models. The best model for predicting the admission result of a resume reaches 66.89%.

Table of Contents

1 Introduction

2 Background

2.1 Resume Classification (Job Category Classification)

2.2 Resume Classification (Level-based Position Classification)

2.3 Dataset Uniqueness

3 Dataset

3.1 Dataset Statistics

3.2 Annotation

3.3 Format Conversion (DOCX)

3.4 Section Extraction (DOCX)

3.5 Format Conversion (PDF)

3.6 Section Extraction (PDF)

3.7 Extraction Accuracy

3.8 Preprocessed Dataset

4 Approach

4.1 String Matching

4.2 Feature Vector

4.3 Bags of Words

4.4 Ensemble Models

5 Experiments

5.1 Data Split

5.2 Results

5.3 Error Analysis

6 Conclusion

Appendix A - Complete Result

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
