The legal industry, like other industries, is currently undergoing a digital transformation. The adoption of new technologies is changing the way legal professionals work, making data processing faster and more efficient, improving legal research, and streamlining internal workflows.

In this case study, we’ll take another dip into the world of LegalTech with Andrii Kovalov, one of our skilled Python Engineers, and explore how he successfully implemented Grafana to visualize a custom bill-processing ETL pipeline.

Background

Access to accurate and adequately structured data is one of the critical components of working with legal information. However, the traditional methods of manually reviewing and analyzing piles of legal records are time-consuming and error-prone, and they offer no real-time insights. We have worked with bills from the U.S. Congress for many years, using the open APIs provided by the Government Publishing Office.

Recognizing the need for modernization of our bill-processing workflow, our team has designed a custom ETL (Extract, Transform and Load) pipeline that automatically processes bills in the United States Congress.

“Our solution processes and transforms bills into .xml, .json, or .pdf files. It then extracts the necessary information from these files and presents the analyzed data on the client dashboard every morning. This modernization of the bill-processing workflow provides our clients with real-time access to accurate and structured data, which is essential for making informed decisions in the legal field.”

Challenge

Unfortunately, nothing is perfect, and there are times when the pipeline encounters issues. As a result, some items and data in the pipeline fail to get processed.

“Bill processing is a multifaceted operation, which, quite obviously, demands precision and accuracy. Although issues occur with less than 1% of total data and are really rare, some of the items that failed to process are crucial for our clients. Therefore, fixing this problem is our top priority.”

Reprocessing these items is not as easy as it may seem at first sight. It poses two significant challenges:

  1. Time-consuming search for failed items. As the pipeline contains an immense volume of items, it can take a lot of time and effort to query the database and scour logs to find failed or non-processed items manually.
  2. Time-consuming issue resolution. As we are a remote team with different schedules, fixing these unprocessed items is also time-consuming as it involves engaging in numerous conversations to identify the problems and delegate the necessary fixes.

Finding a way out

The time-consuming nature of both processes highlighted the need for a more efficient and streamlined solution than a manual search. We needed a tool that could display all items that could not be processed the first time through and enable us to quickly find and reprocess them.

“Our primary criteria were the ability to handle large data volumes, along with pagination and filtering capabilities. With this functionality at hand, we could simplify the process of locating unprocessed items and re-run the pipeline for a single or batch of items.”
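To make the pagination and filtering requirement concrete, here is a minimal sketch of what such a lookup could look like at the database level. It is an illustration only: the article does not describe the real schema or database, so the `pipeline_items` table, its columns, the `fetch_failed_items` helper, and the in-memory SQLite demo data are all assumptions.

```python
import sqlite3


def fetch_failed_items(conn, page=1, page_size=50, congress=None):
    """Return one page of failed items, optionally filtered by congress.

    Table and column names are hypothetical, not the real pipeline schema.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    where = ["status = 'failed'"]
    params = []
    if congress is not None:
        where.append("congress = ?")
        params.append(congress)
    sql = (
        "SELECT bill_id, congress, stage, error FROM pipeline_items "
        f"WHERE {' AND '.join(where)} "
        "ORDER BY bill_id LIMIT ? OFFSET ?"
    )
    # LIMIT/OFFSET turn the 1-based page number into a row window.
    params.extend([page_size, (page - 1) * page_size])
    return conn.execute(sql, params).fetchall()


def build_demo_db():
    """Create an in-memory database with a few made-up pipeline items."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pipeline_items "
        "(bill_id TEXT, congress INTEGER, stage TEXT, status TEXT, error TEXT)"
    )
    rows = [
        ("hr-1", 118, "extract", "failed", "timeout"),
        ("hr-2", 118, "load", "done", None),
        ("s-3", 117, "transform", "failed", "bad xml"),
        ("s-4", 118, "transform", "failed", "missing field"),
    ]
    conn.executemany("INSERT INTO pipeline_items VALUES (?, ?, ?, ?, ?)", rows)
    return conn


conn = build_demo_db()
first_page = fetch_failed_items(conn, page=1, page_size=2)
only_118 = fetch_failed_items(conn, congress=118)
```

The bill IDs returned for a page or a filter could then be handed to whatever step re-runs the pipeline, either one at a time or as a batch.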

Initially, we considered implementing a custom front-end app but gave up on this idea due to the high workload of our front-end team. Another option was to build a static web app with Flask, but we also decided against it, as it would not look visually appealing, and pagination and filtering would still be challenging.

Solution

During brainstorming sessions, our team identified Grafana as a potential solution to the obstacles we were facing. The primary objective was to leverage Grafana’s dashboards to create a transparent, real-time visualization of our bill-processing ETL pipeline. This approach would allow everyone on the team to monitor unprocessed items and gain crucial insights for issue resolution.

“My journey with Grafana began with a dedicated day of intensive investigation. During this initial stage, I delved into understanding the Grafana functionality, capabilities, and compatibility with our existing systems to see how we can apply it to meet the needs of our project. The more I became familiar with Grafana, the more I realized that it was precisely the solution we had been looking for.”

Andrii led the implementation of Grafana into the project with a clear and straightforward strategy. In just two days, he set up a custom Grafana Docker image (based on the original one), adjusted the Nginx configuration, and created the initial “Hello World” dashboards. After a few more days of handling data normalization, creating API endpoints for Grafana, and designing the final dashboards, our team had a tool that lets us effectively visualize, filter, and search data in the pipeline.
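The normalization and API endpoint steps can be sketched roughly as follows. This is not the project's actual code: the article does not say which web framework or Grafana data source was used, so the sketch assumes a data source that reads a plain JSON array over HTTP and uses only the Python standard library. The record fields, the `/items` path, and the `normalize` and `app` names are hypothetical.

```python
import json
from datetime import datetime, timezone


def normalize(record):
    """Flatten one raw pipeline record into a row with fixed columns."""
    failed_at = record.get("failed_at")
    if isinstance(failed_at, (int, float)):
        # Epoch seconds -> ISO 8601 in UTC so every row uses one time format.
        failed_at = datetime.fromtimestamp(failed_at, tz=timezone.utc).isoformat()
    return {
        "bill_id": str(record.get("bill_id", "")).strip().lower(),
        "stage": (record.get("stage") or "unknown").lower(),
        "status": (record.get("status") or "unknown").lower(),
        "failed_at": failed_at,
    }


# Made-up raw records with inconsistent casing, spacing, and time formats.
RAW_ITEMS = [
    {"bill_id": " HR-1 ", "stage": "Extract", "status": "FAILED", "failed_at": 0},
    {"bill_id": "s-3", "stage": None, "status": "failed",
     "failed_at": "2024-01-02T03:04:05+00:00"},
]


def app(environ, start_response):
    """WSGI app: a request to /items returns normalized rows as a JSON array."""
    if environ.get("PATH_INFO") != "/items":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']
    body = json.dumps([normalize(r) for r in RAW_ITEMS]).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For a local try-out, the app could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` and the resulting URL pointed to from a dashboard panel. Returning flat rows with consistent column names and a single time format is what makes table-style filtering and sorting straightforward on the dashboard side.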

Benefits

The implementation of Grafana dashboards has yielded the following benefits for our project:

  1. Transparency and accessibility. Grafana dashboards provide a clear and real-time view of the data in the ETL pipeline, making it accessible to all team members.
  2. Enhanced efficiency. Real-time access to data in dashboards streamlines data analysis. The team can quickly identify the unprocessed items or data and promptly address these issues, ensuring that critical items get processed.

“Grafana allows everyone in our team to use the dashboards for their data needs, which was previously inaccessible to them. It eliminates time-consuming interactions and allows us to focus on getting the crucial data processed and reducing the delays in the bill processing workflow.”

Conclusion

In our search for solutions to our challenges, Grafana’s data visualization and monitoring capabilities have become a game changer. By integrating it into our project, we were able to promote transparency and provide democratized access to critical data, leading to faster problem-solving.

“My experience with Grafana showed me that creativity, adaptability, and the willingness to experiment can transform a good experience into an extraordinary one. If you are facing similar problems and want more visibility in your project, I recommend familiarizing yourself with Grafana. Who knows, you might end up being able to have your cake and eat it, too.”