January 19, 2020

Reusable Data Enrichment Process

A government agency needed standardization and consistency in their data. Learn how Oxford delivered. 


Natural Language Processing
Data Engineering
Cloud Engineering

Data Scientist
Cloud Engineer
Data Engineer
Automation Engineer

The Challenge
Our client, a state government office focused on data and analytics associated with economic opportunity, collects data from various state organizations. The data set has many quality issues and lacks standardization and consistency. The client partnered with us to create a solution that would cleanse this data on an ongoing basis, allowing them to make better data-driven decisions regarding state programs and regulations.

The Solution
Our four-person engineering team designed a solution on Amazon Web Services™ (AWS) using fuzzy logic and text mining. We implemented a Python-based natural language processing algorithm to provide the data enrichment the client needed. Additionally, the team developed an AWS automation capability with Terraform™ so that environments could be easily created, updated and versioned.
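
To give a sense of what fuzzy matching on text fields can look like, here is a minimal Python sketch. It is an illustration only, not the algorithm delivered to the client: the function names, the suffix list, the sample organization names and the 0.85 threshold are all assumptions, and it relies solely on the standard library's difflib module.

```python
import re
from difflib import SequenceMatcher

# Hypothetical legal suffixes to ignore; a real list would be derived from the data.
SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the closest candidate, or None if none reaches the threshold."""
    target = normalize(name)
    scored = [
        (c, SequenceMatcher(None, target, normalize(c)).ratio())
        for c in candidates
    ]
    if not scored:
        return None
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


if __name__ == "__main__":
    # Made-up names standing in for records from two different sources.
    print(best_match("ACME Widgets, Inc.", ["Acme Widgets LLC", "Apex Wiring Co"]))
```

Normalizing before comparing keeps formatting differences between sources from lowering the similarity score, and the threshold leaves weak candidates unmatched rather than forcing a pairing.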

The Result
Our solution allows our client to match wage and benefits data with other data sources that will influence programs and policymaking. The process is reusable, so it can be run as often as necessary to create value from new and changed data sets as they flow into the organization. In addition to resolving our client’s data matching problem, our solution will also serve as a technology foundation for similar big data and artificial intelligence (AI) workloads.

Quality. Commitment.

Whether you want to advance your business or your career, Oxford is here to help. With nearly 40 years’ experience, we know that a great partnership is key to success. Start a conversation today.