#DataForGood: Data Visualization in the Public Sector ETL (Part I)

The public sector deals with vast quantities of data just like, if not more so than, the private sector. For-profit companies heavily rely on data analysis to determine their bottom line, but nonprofits also require these tracking metrics. In order to provide the most value per funded dollar, this organization intelligence is of upmost importance. For this first part of our two part series on Data Visualization in the Public Sector ETL, I’ll cover the importance of using an ETL process for data management in nonprofits and government agencies by giving a quick overview of what happens in each stage.

Public sector organizations, including governmental agencies, can interact with hundreds of people per day. This results in a massive amount of data which often comes in various formats from external sources. A data pipeline is often used to pool this data into a centralized location (Data Warehouse). However, it is important to turn this data into something digestible that can be used to highlight accomplishments, forecast performance, isolate areas that need to be improved, and spot trends. Use of an ETL (extract, transform, load) process is an excellent way of producing manageable data.

Extract

In the first phase of an ETL pipeline, data is gathered from various sources, such as governmental reporting agencies. For example, state run public school systems send data they’ve collected to the U.S. Department of Education. Each state may have data collected in different formats and at varying levels of granularity. As one cannot compare the proverbial apples to oranges, data must be homogenized – this is where the next step comes in.

Governor's Office of Georgia Graduation Statistics
Post high school graduate data collected in the Governor’s Office of Georgia is downloadable in Excel format.
State of California Post Graduation Statistics
Post high school graduate data from the State of California is downloadable in .txt format.

Transform

This data now needs to be transformed into a cohesive format. Nulls need to be removed as well as duplicates. In fact, we have worked with public sector organizations that have entire job positions dedicated to manually removing duplicates from their records. On top of the cleaned data, calculations are often needed to aggregate data, join multiple tables, and transpose columns and rows. Once data is dressed and ready, it needs to have a final destination where it can be properly maintained.

Load

Although the “load” phase may seem more trivial than its predecessors, keeping pretty data pretty is an integral part of any data pipeline. Once data is loaded into a Data Warehouse, integrity and uniqueness must be enforced so that you don’t end up with nicely formatted duplicates. New processes, including data backup, are often executed during this phase. After taking the time to collect and format data, you wouldn’t want to end up losing all of it!

Once you have a nicely populated Data Warehouse, it’s time to put that information to use. Here, data visualization and exploration tools, such as Tableau, come into play and can be a valuable asset both internally and externally to your organization. Once the basic architecture of data visualization software is setup, it can help to remove some reliance on the IT team that implemented the ETL. This software can enable other team members to access stored data and manipulate it to suit reporting needs. Colorful reports and up-to-date dashboards can all be created through use of a well designed data visualization system. Public accessibility to these tools has increased, making them necessary for any sized organization.

Conclusion

A common misconception when people think of nonprofits and government agencies is that they are the opposite of a for-profit business and therefore operate outside of the normal business model. The reality is that while there are differences in their respective models, there are many similarities, particularly when it comes to data management solutions.

In Part II, I’ll cover the importance of data visualization in the public sector by showing how Tableau graphs can help when telling a compelling story of your organization’s work and enable your organization to make a data-driven strategic plan for the future.

Until then, if you work with data in the public sector, let us know your experience with an ETL workflow or ingress data management by leaving a comment below.

By |2018-11-01T14:19:09+00:00October 26th, 2018|Categories: Data Analysis|Tags: , , , , , , |

About the Author:

Dedicated data visualization analyst who collaborates with others to achieve the end result of all projects. I have extensive experience working with data management for nonprofit and lobbying organizations. I enjoy finding new and creative ways to solve data challenges using Tableau and practicing Taekwondo and Muay Thai.

Leave A Comment