In recent years, the term “big data” has become a popular buzzword, sometimes thrown around to make projects seem more modern and exciting. But what does it really mean?
This is the first in a series of blog posts to help you, as a provider or other professional working with the Military Health System (MHS), better understand some of the newer sources of data and how they can be used to answer questions. This first post starts with the most general terms and ideas (which might not be news to many of you data folks), and we’ll work up to more advanced topics. If you have any questions along the way, please comment below and we’ll get back to you.
One of the most commonly cited definitions of big data comes from a 2001 paper by Doug Laney, which says that big data is all about volume, velocity and variety, or the “3 Vs.” Volume refers to the amount of data, velocity to the speed at which the data is generated and variety to the types or formats of the data (e.g., pictures, text, financial transactions).
If you dig into this topic more, you’ll see that, as with much else in the social sciences, there is a lot of gray area around what counts as big data. It’s easy to say that a survey of three people isn’t big data and that billing records for all of TRICARE are, but there’s a lot of space in between. Beyond that, many people also want to sort big data into specific types.
One specific type of big data is administrative data, or data collected for a reason other than research. In the MHS, this often means billing and medical records. While many social scientists (with education in psychology, social work, public health, economics and more) learn about surveys, focus groups and structured interviews as the main forms of data collection, the evolution of technology has led more scientists to use administrative data. As with any data, there are pros and cons to using it.
When social scientists start working with administrative data, many are surprised that much of the time and effort goes into data management and quality, rather than strictly analysis and statistics. Administrative data isn’t entered by hand from surveys. It is collected for other purposes and in huge amounts, so it takes work to turn the raw data into something that can be analyzed to answer questions. This can be overwhelming, but understanding the process and what data is available can help you think about how to answer your questions or use data in more productive ways.
For example, at the Deployment Health Clinical Center, we use administrative data from electronic health records to better understand the types of conditions that patients seek treatment for at primary care clinics. We look at how often ICD-10 codes are recorded at particular clinics and can then make recommendations about training for providers, where group sessions might be able to serve the needs of the patients, and more.
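For the data folks, here is a minimal sketch of what that kind of count might look like in Python. It is not our actual pipeline: the clinic names, the encounter rows and the `count_codes_by_clinic` helper are all made up for illustration, and the cleanup step is only a small taste of the data-quality work that real records need.

```python
from collections import Counter, defaultdict

# Hypothetical encounter records: (clinic name, ICD-10 code as entered).
# These rows are invented for illustration; they are not real MHS data.
encounters = [
    ("Clinic A", "F43.10"),
    ("Clinic A", "f43.10 "),  # same code, entered with stray case and spacing
    ("Clinic A", "F32.9"),
    ("Clinic B", "F41.1"),
    ("Clinic B", "F43.10"),
    ("Clinic B", ""),         # missing code, skipped below
]


def count_codes_by_clinic(records):
    """Tally how often each ICD-10 code appears at each clinic."""
    counts = defaultdict(Counter)
    for clinic, raw_code in records:
        # Basic cleanup: trim whitespace and normalize case.
        code = raw_code.strip().upper()
        if not code:
            continue  # ignore encounters with no recorded code
        counts[clinic][code] += 1
    return counts


counts = count_codes_by_clinic(encounters)
for clinic in sorted(counts):
    # most_common() lists codes from most to least frequent
    print(clinic, counts[clinic].most_common())
```

Even in this toy example, two of the six rows need attention before they can be counted, which is the data management point made earlier.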
As this example shows, big data can give you and your organization valuable information.
Make sure to read our future blog posts, where we’ll share many more examples of ways we have used administrative data in the MHS to inform policy recommendations and improve patient care.
The views expressed in Clinician's Corner blogs are solely those of the author and do not necessarily reflect the opinion of the Psychological Health Center of Excellence or Department of Defense.