Author: Maheshkumar Kharade

Posted On Jun 13, 2018   |   3 min

Over the last year, companies offering recruitment solutions in the HR Tech industry have attracted significant investment in AI capabilities. Based on current trends, this is expected to increase in 2018, primarily in automated workflows for candidate sourcing and screening. Although this helps address most of the challenges recruiters face, I think AI can also be used to solve the problems with current resume parsing techniques.

Resume parsers have long struggled to achieve the desired accuracy. In my opinion, the resume is the most dynamic data source that a recruitment product needs to process. Because of this, most products rely on third-party libraries or services to process this data. A typical resume parser depends on keyword, format, and pattern matching. This approach handles specific formats well but fails on variations, because it only parses the text and lacks the ability to interpret it.
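To make the limitation concrete, here is a minimal sketch of a pattern-based parser. The pattern, the function name `parse_skills`, and the two sample resumes are hypothetical and only illustrate the general idea; they do not represent any particular product or library.

```python
import re

# Hypothetical pattern: expects a line that starts with "Skills:"
# followed by a comma-separated list.
SKILLS_LINE = re.compile(r"^Skills:\s*(.+)$", re.MULTILINE)


def parse_skills(resume_text):
    """Return the items found on a 'Skills:' line, or an empty list."""
    match = SKILLS_LINE.search(resume_text)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


# A resume that follows the format the pattern was written for.
expected_format = "Name: A. Candidate\nSkills: Python, SQL, Tableau\n"

# The same skills, described in a sentence instead of a labelled list.
varied_format = (
    "Name: A. Candidate\n"
    "I have built dashboards in Tableau on top of SQL and Python pipelines.\n"
)

print(parse_skills(expected_format))  # ['Python', 'SQL', 'Tableau']
print(parse_skills(varied_format))    # []
```

Both resumes list the same three skills, but the second one yields nothing because the parser matches a layout rather than the meaning of the text.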

At Harbinger, we have used AI to interpret candidate resumes using a custom NLP (Natural Language Processing) engine. In a nutshell, NLP is an AI technique used to understand human language. As every resume is different and written by an individual candidate, NLP adds the ability to interpret human-authored documents. Multiple NLP packages (such as NLTK and Stanford CoreNLP) offer basic language processing abilities. The abilities of such packages depend on the corpus or corpora they use. Corpora mainly consist of themed text used for statistical linguistic analysis when processing sentences or write-ups. By default, an NLP package uses its core corpora to understand and interpret basic sentences and text. Resume processing needs a custom corpus specific to a domain or industry, such as HR Tech.
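As a rough illustration of why a domain-specific corpus matters, the sketch below looks up resume text against a tiny hand-written skill vocabulary. Everything in it is an assumption made for the example: the `SKILL_CORPUS` entries, the alias mapping, and the `extract_skills` function are hypothetical, and plain dictionary lookup stands in for the statistical analysis a real NLP engine would perform. It is not the engine described in this post.

```python
import re

# Hypothetical skill corpus: surface form -> canonical skill name.
# A real corpus would be far larger and built from curated sources.
SKILL_CORPUS = {
    "python": "Python",
    "machine learning": "Machine Learning",
    "ml": "Machine Learning",
    "natural language processing": "Natural Language Processing",
    "nlp": "Natural Language Processing",
    "sql": "SQL",
}

# Longest phrase in the corpus, measured in words.
MAX_PHRASE_LEN = max(len(phrase.split()) for phrase in SKILL_CORPUS)


def extract_skills(text):
    """Return canonical skills in order of first appearance, without duplicates."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Prefer the longest corpus phrase that starts at position i.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in SKILL_CORPUS:
                skill = SKILL_CORPUS[phrase]
                if skill not in found:
                    found.append(skill)
                i += n
                break
        else:
            # No corpus phrase starts here; move to the next token.
            i += 1
    return found


resume = (
    "Built NLP pipelines in Python; 3 years of machine learning "
    "and natural language processing work."
)
print(extract_skills(resume))
# ['Natural Language Processing', 'Python', 'Machine Learning']
```

Because the vocabulary lives in data rather than in the matching logic, covering a new domain or a new way of writing a skill means adding entries to the corpus, not rewriting the parser.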

During one of our recent implementations, we developed such a custom corpus for resume parsing. We used public skill sources to create and train a custom skill corpus. The custom NLP engine uses this skill corpus to extract primary and secondary skills from resumes and job descriptions. Combining machine learning with the NLP engine makes it possible to extract the same skill even when it is described in multiple forms and phrases. The implementation also supports a periodic refresh of the skill corpus through a predefined scheduled process. This ensures that new skills are added to the corpus seamlessly and that accuracy improves over time. To summarize, I think NLP can add much-needed human-like interpretation abilities to resume parsers and recruitment products. To know more or share your experiences in building custom NLP engines, please write at