Courtesy of 's Jairo Gomez, below is a transcript of his speaking session on 'Closing the gap: The building blocks of an effective data quality strategy' to Build a Thriving Enterprise that took place at IT Infrastructure & Cloud Strategies Live.
Closing the gap: The building blocks of an effective data quality strategy
The presentation provides business leaders with general directions, best practices, and certainly, lessons learned, in the pursuit of an effective data quality strategy:
Great speakers, and I'm very excited about our next speaker. I have Jairo Gomez with us. He's a senior leader of infrastructure and technology for ACAMS, and I'll tell you what that is in a little bit, coming directly from beautiful Miami, Florida today. Hello there, Jairo, great to have you with us.
Thank you for having me.
Ladies and gentlemen, Jairo is a Senior Leader of Infrastructure and Technology for the Association of Certified Anti-Money Laundering Specialists, ACAMS. He has over 10 years of experience leading cross-functional and cross-regional teams and coordinating strategic and tactical efforts related to continuous improvement and business transformations across multiple service-oriented industries. Jairo, really privileged to have you with us. Very much looking forward to your presentation. Thank you for taking the time to share your expertise.
Thank you, Georgia. Thank you for having me, and I'm looking forward to it as well.
Great. Hello, everyone.
Today we're going to be talking about a topic that is very dear to my heart, and one that I think everyone will definitely relate to: data and data quality.
So I'd like to start this journey by first giving you an idea of the critical role data quality plays in almost every business and, I'm sure, personal decision we make.
I'd also love to start by covering a short passage from one of my favorite books on the topic, Humble Pi, by Matt Parker.
One passage there speaks about the effects of poor data quality and human errors.
Parker goes into a series of examples of how little mistakes can directly impact our decision making and cause much larger miscalculations, such as the State Office of Education in Utah, which back in 2012 miscalculated its budget by $25 million because of what they called a faulty reference in a spreadsheet.
But perhaps the most shocking example was with Enron.
Now, some of you may recall that in the aftermath of the Enron scandal back in 2001, there was a massive disclosure of the company's information.
Part of it included more than 15,000 spreadsheets, spreadsheets which were sitting on the e-mail servers and which were released to the public.
All of those documents were quickly picked up by media and scholars, but one piece of research in particular sought to draw conclusions around the quality and the quantity of the data inside these documents.
Aside from the fact that there were 15,000 spreadsheets being shared via e-mail, which personally is shocking enough, their findings give an eye-opening insight into the state of the company's data.
Now, if you look at the screen, one of the things that they found is that the average size was around 113 kilobytes, but one spreadsheet in particular was a surprising 41 megabytes. So, I can only imagine what type of data was contained in it.
On average, these spreadsheets contained around five worksheets, but one in particular had 175 different worksheets, and I'm sure that many of you can relate to that as well.
And, speaking to the misuse, not just the use but the misuse, of tools such as Excel, 42% did not include a single formula.
One can only assume that these were simply used to convey information, which makes you question the tool they chose to do so. Excel may not be the right one in that case. But perhaps the most shocking statistic was that a startling 24% of all of the formula-containing spreadsheets had errors in them.
Now you can imagine the decisions that were taken using those close to 2,000, or more than 2,000, spreadsheets.
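As a rough check on where a figure of "more than 2,000" comes from, the percentages quoted above can be combined in a few lines. This is only a sketch: it assumes the lower bound of 15,000 spreadsheets and treats the quoted percentages as exact, and the variable names are illustrative.

```python
# Figures quoted in the talk; 15,000 is a lower bound ("more than 15,000").
total_spreadsheets = 15_000
share_without_formulas = 0.42   # spreadsheets with no formula at all
share_with_errors = 0.24        # among spreadsheets that do contain formulas

# Spreadsheets that contain at least one formula.
with_formulas = total_spreadsheets * (1 - share_without_formulas)

# Formula-containing spreadsheets that have errors.
with_errors = with_formulas * share_with_errors

print(round(with_formulas))  # 8700
print(round(with_errors))    # 2088
```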
But this all goes to show that data quality can impact the quality of our decisions, and depending on its use, it can actually branch out and trickle down into more incorrect data.
Now, let's set the context of this presentation by speaking about the challenges companies face when dealing with information.
But before diving into the intricacies of data quality, let's talk about data itself. We'll start by discussing data analytics, which is a relatively new discipline that emerged around 70 years ago, when early tools were developed that could help capture information and help analysts at the time discern certain patterns and draw certain conclusions.
Now, at the time, datasets were much smaller than today, and they were structured. So, you can say that data quality was a relatively contained issue.
Then, around the mid-2000s, companies like Google or Facebook began collecting and analyzing new types of information.
And while the term big data wasn't coined until 2010, early on, everyone understood the value and the scope of the vast amount of information being collected.
From customer-related demographics to personally identifiable information, a company's internal operations, sales-related figures, or lead-gen data sourced from private or public data sources. With the arrival of big data, new technologies and processes were also developed to allow companies to mine and produce insights, which would ultimately help transform them into profit-oriented, cost-saving operational changes.
The challenge with dealing with a vast amount of information is, and will always be, reliability. Companies crave data, but they crave reliability as much as they crave data itself.
Now, this is because information has become a fundamental part of a company's strategic decision making, and the source by which many decisions are made, from evaluating the feasibility of a project, for example,
to deciding which markets to avoid or go after, or even to small operational changes to help address declining KPIs.
There are many challenges that affect a company's ability to collect, analyze, and produce any form of insights.
For example, you could say that data is perishable, and can become irrelevant and unactionable over time.
If you start thinking about customers' information, the information can become irrelevant if we're not in a constant process of refreshing it. It also needs to be mined and visualized in a business-friendly way in order to be actionable.
I mean, I can't imagine drawing conclusions from one of those Enron spreadsheets, which contained around 7,000 different non-empty cells on average, without the use of any formal data visualization tool, or even a pivot table.
Any data related processes, from its collection, to its mining to its reporting, should also be efficient and preferably automated.
But perhaps the most important challenge when dealing with data is its quality.
To give you an industry perspective about the importance of data quality, I'd like us to walk through O'Reilly's cross-industry survey, titled The State of Data Quality, which was published sometime last year.
Now, we could probably spend a whole session diving into each one of these topics companies identified as pain points when dealing with data. And this is because, just like on an Ishikawa fishbone diagram, each of these can be traced back to one or multiple root causes.
Let's take, for example, too many data sources.
Now, as companies have shifted preference towards using more software-as-a-service applications,
one of the most important aspects of ensuring a good user experience for both internal and external customers is the integration between multiple applications. Companies rely on these third-party vendors not just to help us deliver our products or services, but also to collect data on how these applications are being used.
The problem comes when you're attempting to produce insights using data that may reside in multiple different systems, or from data which may have gone through some form of transformation or ETL to end up in a data warehouse, for example, where you end up with a basket full of apples and oranges when you really wanted a basket full of pears.
Another issue comes when no controls are put into place, and everyone has access to the information.
Even poorly labeled data can make the process of mining, and reporting extremely inefficient.
Now, perhaps one of the most insightful aspects of this study was the dichotomy between the frustration companies feel about poor data quality and the fact that 70% indicated that they do not have dedicated teams to address these issues. Sometimes, this is necessary when problems with data quality have turned chronic.
Now, let's cover a separate survey, which evaluated the industry perspective on data quality and reliability.
This time, in the context of Salesforce, a cloud based application used by over 150,000 companies worldwide, which also highlights some of these challenges.
The survey found that over half of respondents believe that 80% of their data was not useful or reliable.
Now, this may speak more to their frustration, and these types of issues can be amplified by user perception.
We also found that, interestingly, duplicate records, errors, and outdated information ranked as the top reasons why users struggle when trying to draw insights from this data.
Almost half of the respondents do not have any processes in place to prevent new data from creating problems, such as having internal controls, validation rules, and some form of error handling.
And almost half, around 43%, of upper management and executive leaders believe that the main reason to strive for data quality is to allow for better decision making and reporting, compared to any other motivator.
The lack, the absence, of relevant data and the presence of dirty data were the biggest blind spots to obtaining a complete customer overview.
In contrast, companies often seem to lack a basic understanding of who really owns data or is responsible for its quality.
And the answer is rather that all functional areas of the business are responsible for ensuring the quality of the data.
From an IT perspective, that means implementing controls and permissions; for sales and service operations teams, it applies at the time of entering new sales orders, or when simply confirming and updating information from a client whenever they're on a support call.
Perhaps marketing also has a responsibility when it comes to managing lead generation forms or implementing data enrichment tools.
Therefore, data quality requires a cross functional commitment, and, in many cases, a drastic shift in a company's culture.
The trick is seeing the forest behind the trees, and understanding that the little steps we take towards ensuring the accuracy of our data ultimately help us operate more efficiently and achieve our company's and departmental goals.
Now, let's talk about where data quality fits within our organization's data governance framework.
But first, let's define what data governance is.
Data governance is a comprehensive framework that deals with all of the IT and operational aspects of dealing with data, from its collection, security, transformation, and mining to its accessibility and reporting.
At its core, at the base of it,
you have the company's policies and procedures, which set the governing rules over each one of these processes. For example, data retention or privacy policies.
All of these policies and procedures are then later operationalized using a structured and organized framework of procedures, and also IT infrastructure, which is the vehicle by which users conduct each one of these data processing activities that will ultimately determine how actionable or reliable our data is.
These are the building blocks of a company's data, and understanding each one of these elements and their interdependencies is the most important aspect of implementing a solid and effective data quality management system.
All right. Now, let's talk about the different types of issues that you can encounter when looking at a dataset.
Data quality issues come in many different forms, and they all require a different set of controls in order to address and prevent them, whether they're technical, operational, or otherwise.
For example, your company may be experiencing issues with data duplication, which can ultimately be traced back to a lack of controls. For example, if you're finding that you have a lot of duplicate records, you may not have all the necessary duplicate management controls in place.
You may also have inconsistent formats, especially when you're dealing with pieces of information that require context in order to be interpreted.
Especially when you're dealing with multiple regions.
For example, phone numbers may require area codes to be contextualized.
Incomplete pieces of data, such as missing details in billing addresses, could ultimately impact our ability to invoice customers or process a payment. Or your company may experience a lack of demographic data, which prevents us from understanding our customer base. So you're put in a position of not knowing who you're dealing with.
You may find yourself looking at a customer record that has not been updated in years and wonder if their corporate e-mail address or even their phone numbers are still the same.
Look, you may even have a solid data quality foundation. But if your data is scattered around multiple systems, good luck trying to put together a decent dashboard or visualization without any manual intervention, or the help of Excel.
And being able to trace back the data to its source is an important aspect to ensure any reliability.
Now, I was speaking earlier about the Ishikawa fishbone diagram, which is used very often to draw conclusions about root causes. So, I thought it would be great to visualize the root causes of poor data quality in this form.
Setting aside that every data quality issue requires its own root cause analysis, I thought it was important to cover the three most common and most important factors that may produce these types of issues: technology, processes, and culture.
From a technical standpoint, we previously discussed how the lack of, or even poor, system integrations can contribute to poor data quality.
If your applications do not have governance rules in place in the form of validations or permissions, we will fail to set the boundaries that our users need at the time of entering or processing data.
And we could end up facing issues like the ones we discussed on the previous slide.
For example, a lack of duplicate management rules will allow users to enter a record that may already be in your system.
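To make the idea of a duplicate management rule concrete, here is a minimal sketch of a check that runs at the time a record is entered. Everything in it is an assumption for illustration: the `CustomerStore` class, the choice of e-mail as the matching key, and the normalization are hypothetical, and real systems often match on several fields with fuzzier rules.

```python
def normalize(record):
    """Build a comparison key: e-mail in lowercase with outer whitespace removed.

    Using e-mail as the matching key is an assumption made for this sketch.
    """
    return record["email"].strip().lower()


class CustomerStore:
    """Hypothetical in-memory store that rejects duplicate customer records."""

    def __init__(self):
        self._by_key = {}

    def add(self, record):
        key = normalize(record)
        if key in self._by_key:
            # The duplicate rule stops the record before it is saved.
            raise ValueError(f"duplicate record for {key}")
        self._by_key[key] = record

    def __len__(self):
        return len(self._by_key)


store = CustomerStore()
store.add({"email": "ana@example.com", "name": "Ana"})
try:
    # Same customer, typed with different casing and stray spaces.
    store.add({"email": " Ana@Example.com ", "name": "Ana G."})
except ValueError as error:
    print(error)  # duplicate record for ana@example.com
```

Without the check in `add`, both records would be stored and the duplicate would only surface later, during reporting.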
A poor system architecture can also lead to inadequate processes, automated or manual.
And this can impact the quality of the data collected or processed.
Some data enrichment tools can help outsource some of the data quality to third-party vendors, like ZoomInfo or Dun & Bradstreet, which may have more robust and up-to-date datasets of our customers, for example.
And of course, bugs, bugs, bugs. We can all relate to that one.
Bugs, more often than not, will cause issues either directly or indirectly. For example, failing to store records, or generating duplicate ones, or overwriting data.
Bugs will persistently decay the quality of our data unless and until they are resolved.
Now, moving on: from an operational standpoint, our business processes and procedures must also cover proper data processing practices and should outline the importance of following such guidelines.
This is why solid document governance and training will go a long way in addressing the operational aspects of data quality.
This will ensure all processing activities such as entering a new lead will follow the proper standards before that user hits the save button.
I think controls can be effective, but ultimately, the responsibility falls on the user who is creating, editing, or removing a record.
And last but not least, we have the company's culture.
Well, I can't stress this enough. There is a shared responsibility about preserving the integrity of our information.
I strongly believe that a data-driven culture is an organizational shift that goes from top to bottom and starts with the leadership's commitment to implementing a company-wide data governance strategy.
Small steps will go a long way. So, getting the basics right is better than just having no structure at all.
So, I strongly recommend that you go through the exercise of asking yourself which of those building blocks that we covered earlier you have in your organization or have covered.
All right. So, now, let's take the second part of this presentation to cover more tactical matters related to data quality, and we'll start with a basic approach on how to clean your data.
This framework will help you visualize the steps to take in order to address the most common yet critical data quality issues companies face. It is agnostic, and it is transferable regardless of the type of organization or systems that you use.
But first, I strongly recommend that you establish a cross functional data governance committee to provide the overall direction and governance regarding your data.
Once we have defined the who, then we can move to answer the what.
Meaning, we need to have a solid understanding of the type of data that we use. Data profiling can get the job done and can help you answer that question. It allows you to map the architecture, content, and source of your data, as well as the relationships with other datasets.
So the goal is ultimately to understand, what are the sources of your data?
We can do this by mapping the different data collection processes involved in gathering and storing your records.
For example, if you have customer web forms that customers use in order to enter their information, or if you have any internal data entry forms, or even API integrations with third-party apps.
Then we want to understand the scope and the size of your data. For example, understanding how many records, what is the aging of your data, and at what rate does your database grow?
We also want to identify the types of data that you have.
This one is more of a qualitative analysis of your data with the goal of understanding the format and the type of information you have stored.
It is also important to map out the interdependencies and the relationships with other datasets as I mentioned earlier.
And last but not least, knowing what the voice of the customer is telling you about the data.
And this is perhaps one of the most important aspects of understanding the current state of your data quality.
Speak to your users.
They'll be happy to share their thoughts with you, and their feedback will be critical in helping you understand both the perception and the facts behind the state of your data.
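The measurable parts of the profiling step described above (how many records, how complete they are, what types of values they hold, and how old they are) can be sketched in a few lines. This is an illustrative sketch only: the `profile` function, the field names, and the sample records are assumptions, not part of the talk, and dedicated profiling tools go much further.

```python
from collections import Counter
from datetime import date


def is_blank(value):
    """Treat None and empty strings as missing."""
    return value is None or value == ""


def profile(records, today):
    """Summarize size, completeness, value types, and age of dict records."""
    fields = sorted({name for record in records for name in record})
    missing = {
        field: sum(1 for record in records if is_blank(record.get(field)))
        for field in fields
    }
    types = {
        field: Counter(
            type(record[field]).__name__
            for record in records
            if not is_blank(record.get(field))
        )
        for field in fields
    }
    # Age in days since each record was last updated ("updated" is assumed).
    ages = [(today - record["updated"]).days
            for record in records if record.get("updated")]
    return {
        "record_count": len(records),
        "missing": missing,
        "types": types,
        "oldest_update_days": max(ages, default=None),
    }


# Hypothetical sample data for illustration.
records = [
    {"name": "Acme", "industry": "Banking", "phone": "+1 305 555 0100",
     "updated": date(2021, 1, 15)},
    {"name": "Globex", "industry": "", "phone": 3055550101,
     "updated": date(2018, 6, 1)},
    {"name": "Initech", "phone": None, "updated": date(2020, 3, 10)},
]

report = profile(records, today=date(2021, 6, 1))
print(report["record_count"])      # 3
print(report["missing"])           # industry is missing on 2 records, phone on 1
print(report["types"]["phone"])    # phone is stored both as str and as int
```

The mixed `str` and `int` phone values are the kind of inconsistent format that a profile surfaces before any cleanup starts.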
Now, we've covered the who and the what.
Once you have a solid understanding of the scope and the magnitude of your problems,
the next step is to outline and prioritize the issues, and as with any other decision, the rule of thumb here is to prioritize those issues which would ultimately drive more benefits.
So, it's important to understand what is important to your business.
Perhaps there is a data quality issue potentially causing a compliance gap, or perhaps impacting our ability to run accurate financial reporting and forecast.
The goal here is to align the priorities based on the company's strategic goals.
Then we have perhaps the most tedious and arduous, yet the most rewarding, step of all.
Clean your data.
This is because if you've got your priorities right on the previous step, then you should start seeing the benefits as you complete each one of these key data milestones.
Now, in order to avoid a garbage-in, garbage-out type of situation, we should always address the underlying issues that produce the data inaccuracies in the first place.
It is important that we understand the root causes and set a clear plan to address each one of these, either by resolving any outstanding bugs or issues, setting the IT controls, those governing rules and data controls in the form of permissions and validations, or addressing any process and training gaps.
This is likely not going to happen overnight. Therefore, in the meantime, consider implementing event-driven or periodic QA reports, which can help you monitor for exceptions and identify these issues in a proactive manner.
You may also consider using a tracker or a scorecard to monitor each one of these data cleansing activities and their impact, and whether you have addressed the underlying issues causing them. This is just an example of one, but it can be as complex and robust, or as simple, as you really need it.
The idea and the goal behind this is that you capture key information about your data cleaning activities.
For example, understanding what functional areas are impacted, or understanding what the pain points are.
Perhaps it's duplicates
that you want to tackle first. What is the impact to the business? Is it revenue related, or is it cost avoidance, or is it a compliance gap that you want to address?
Really understanding the size of the problem is important. So understanding the number of records impacted is going to be an important task.
Of course, as we discussed earlier, we have to make sure that we close the gaps to avoid spending all of this time cleaning the data and having that time wasted. So understanding the root causes in order to address them is important.
And ultimately, the tracker is important to track progress and basically put checkmarks on each one of the tasks that are completed.
For example, whether you completed the tasks that required a historical cleanup of your data, whether there was an exception report that needed to be implemented, or whether the root causes were addressed.
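The tracker described above could be kept in a spreadsheet, but as a minimal sketch it can also be expressed as a small data structure. The `CleanupItem` class and its field names are assumptions chosen to mirror the items mentioned in the talk (pain point, functional areas, business impact, records impacted, root cause, and the three checkmarks); the slide shown during the session may have used different columns.

```python
from dataclasses import dataclass


@dataclass
class CleanupItem:
    """One row of a hypothetical data cleaning tracker."""
    pain_point: str
    functional_areas: list
    business_impact: str      # e.g. "revenue", "cost avoidance", "compliance"
    records_impacted: int
    root_cause: str
    historical_cleanup_done: bool = False
    exception_report_in_place: bool = False
    root_cause_addressed: bool = False

    def progress(self):
        """Fraction of the three checkmarks that are completed."""
        checks = [
            self.historical_cleanup_done,
            self.exception_report_in_place,
            self.root_cause_addressed,
        ]
        return sum(checks) / len(checks)

    def is_closed(self):
        """An item is closed only when every checkmark is ticked."""
        return self.progress() == 1.0


# Hypothetical example row.
item = CleanupItem(
    pain_point="Duplicate customer records",
    functional_areas=["Sales", "Customer Service"],
    business_impact="revenue",
    records_impacted=1200,
    root_cause="No duplicate management rule on the lead form",
    historical_cleanup_done=True,
    exception_report_in_place=True,
)

print(f"{item.progress():.0%}")  # 67%
print(item.is_closed())          # False
```

Keeping the root cause checkmark separate from the cleanup checkmark reflects the point made above: cleaning the records without closing the gap means the time spent is likely to be wasted.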
Once the framework is in place, we'll be in a much better position to monitor for these errors, identify the root causes, address them, and re-evaluate the need for further exception reporting.
Last, it is important to understand that pursuing data quality requires a constant commitment, both from a cultural and an operational standpoint. And even getting the basics right will go a long way in improving the usability and the actionability of your data, and will provide a foundation pertaining to compliance, which will ultimately benefit our organizations.
So you'll basically be following the cycle that you see on your screen, where we're going to be constantly monitoring the exceptions, identifying root causes, and addressing them, whether they are related to a bug in your system, or maybe there's a training gap, or there's a gap in your data processing activities, like data entry, that needs to be addressed. It is important that you collaborate with these other cross-functional teams to close those gaps. And then, once you have cleaned up the historical data that may have been affected or impacted, and once you close those gaps, the last step is really to re-evaluate if there's a further need for exception reporting.
And on that note, I wanted to conclude this session, and thank everyone, and I really, really hope that this was helpful in giving you some of the building blocks in order to recalibrate our approach to data and data quality.
And on that note, I would like to open it up for questions and comments.
Excellent, Jairo, very, very good. A masterclass on data quality here for us. It's such an important topic because, you know, we can't really leverage great technologies to their full potential
if the underlying data that we're using for a lot of these technologies is not good. I'm going to ask you to stop sharing your presentation there. There he goes, so they can see the bigger screen. And we have time here for questions with Jairo. So, if you haven't submitted your question yet, go ahead and submit that, and I'm going to be monitoring them as we continue our discussion here.
So, our first question is related to this. The challenges that you highlighted in your presentation are large, and there are many of them. I mean, if you just think about business process management alone, you know, most organizations don't have a good handle on their key business processes. They don't know how to measure them effectively. And we haven't even started talking about data yet. Because if they had good processes, and if they knew how to measure the health of those processes effectively, then you would have, maybe, health metrics for those processes that you could monitor. And now you're getting to the point where data quality really matters. But that's a big set of assumptions that I have there to get to the data itself.
So, I'm curious, and this is just one path of challenges. There are many other paths that you have discussed in your presentation.
I'm curious, how do you start taking a bite out of this elephant?
You go into an organization and, you know, governance is unclear. Business processes are not clearly defined. I mean, you know, the organization is functioning. They're even making money, despite all of that. So how do you start
if you don't have something in place right now that is focused on data quality?
It's going to sound overly simplistic to some, but you said it yourself: it's one bite at a time.
It all really starts with understanding the foundations of it.
So there are two ways of approaching this. One is building the foundation from the ground up.
And that means understanding what our governing rules are.
And then, based on that, building the processes.
Now, you said it yourself, many organizations are already focused on their day-to-day. So this is just something that is an afterthought, or they don't think about it much.
So the other approach, which is more reactive,
but is also effective, is to establish, as I mentioned, the data governance committee, to make sure that this is on people's minds and basically ensure that it is a priority for the business.
Then start to look at each one of these data issues. And I mentioned this, and I can't stress this enough: it's important to capture the voice of the customer. By customer, I really mean internal customers and external customers as well. So, really, we need to understand the feedback from those users that are actually entering the data in the first place.
And, yes, there will be a combination of perception and fact. That's why I said that there was a combination of both perception and fact around it. But it is going to be important to hear them out.
And every single challenge that you end up capturing with regards to data quality is ultimately going to have a root cause.
I covered a few that were, to me, the most common and the most important ones. But it would likely have a component of each one of these. So maybe it's a process that needs to be revised. And you're absolutely right, we could probably have a separate session just to talk about continuous improvement,
because there are a lot of benefits in going through these types of process documentation and continuous improvement exercises.
You mentioned that companies are just focused on their day-to-day. But I challenge that.
Or I encourage people to just get a group together with the process owners and document each one of the steps needed to accomplish any task in the organization.
More often than not, what you see is that as those SMEs are going through the steps, they're going to ask themselves the question:
"Wait, we're doing that, really?"
And they're scratching their heads and they'll say, "OK, wait, that's not as efficient as we thought." So just putting the process in front of you will give you that.
There are definitely all these fancy tools that can be used in order to identify root causes, like these types of Six Sigma process improvement efforts.
But getting the basics right, which is understanding what your data processing activities are, will highlight the issues, that I guarantee.
No, that's a good tip. I have Leanna Montoya here asking a question. First of all, she said this was a great presentation. And she has a very specific question about a term you used during the presentation: what are event-driven QA reports?
So, those can be used depending on the application or the systems that you have, but an event-driven report, at its core, is a type of automation that you can set in place so that it creates exception reports based on certain conditions.
For example, if you want to generate an exception report that catches instances where your customer data is missing some demographic information like industry, then you can have an exception report that triggers automatically and just says, "Hey, there are 500 records, 200 records, or five records that meet these criteria." So, those are the types of event-driven reports. They have to meet specific criteria. And QA reports are an important aspect of this, by the way, because they will really help you monitor the state of your data. It's not just a one-time thing that you do. If set in place, exception reports can really help evaluate, monitor, and ensure the quality of your data is maintained.
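The missing-industry example above can be sketched as a small event hook. This is a sketch under stated assumptions: the `ExceptionReport` class, the `on_record_saved` hook, and the record fields are all hypothetical, and in practice this kind of automation is usually configured inside the CRM or reporting tool rather than hand-written.

```python
def missing_industry(record):
    """Condition: the industry field is absent, None, or blank."""
    return not (record.get("industry") or "").strip()


class ExceptionReport:
    """Hypothetical event-driven report that tracks records meeting a condition."""

    def __init__(self, name, condition):
        self.name = name
        self.condition = condition
        self.exceptions = {}  # record id -> record

    def on_record_saved(self, record):
        """Event hook: call whenever a record is created or updated."""
        if self.condition(record):
            self.exceptions[record["id"]] = record
        else:
            # A later save that fixes the record clears the exception.
            self.exceptions.pop(record["id"], None)

    def summary(self):
        return f"{self.name}: {len(self.exceptions)} record(s) meet the criteria"


report = ExceptionReport("Missing industry", missing_industry)
report.on_record_saved({"id": 1, "name": "Acme", "industry": "Banking"})
report.on_record_saved({"id": 2, "name": "Globex", "industry": ""})
report.on_record_saved({"id": 3, "name": "Initech"})
print(report.summary())  # Missing industry: 2 record(s) meet the criteria
```

Because the hook runs on every save, the report stays current without a one-time audit, and the same condition function could also be run on a schedule as a periodic report.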
That's very good. Thank you. Thank you, and thank you for the question, Leanna.
On that, now, you know, there are comments here and questions around the theme of, you know, you're leading infrastructure and technology at the Association of Certified Anti-Money Laundering Specialists, which sounds like an area that would be heavily regulated. And so, the question related to that is: how does data quality differ, or how do you address data quality, in these heavily regulated areas of business? Are there different approaches to that?
I'll speak to that in the broadest sense of compliance, and how data quality really interacts with the scope of compliance.
And it really depends on the type of organization.
If you're talking, for example, about organizations that are, that are publicly traded, that you have, certain compliance needs that you need to meet.
And you want to make sure that your financial reporting, for example, it is as accurate and as as intact as it needs to be. Because you want to make sure that you don't report on the wrong, under our financials.
So, it does have the ensuring data quality will. Definitely, it definitely has a compliance aspect.
Let's say, for example, GDPR, the European data privacy law that went into effect in May 2018.
You want to preserve the integrity of your customer data, because at the end of the day those types of data privacy laws put the burden on organizations to actually secure that data and ensure its integrity.
So it really depends. The compliance aspect of data really depends on the type of business or organization.
In my personal experience, and in working with my organization, we definitely do have some compliance obligations that we need to meet. Therefore data sits at the core of what we do, and ensuring its quality is definitely a primary responsibility of ours.
Another question that has come up here: are there industries that you see out there that are more advanced, as an industry, when it comes to the application and the understanding of the importance of data quality? Or is this kind of scattered, with the best practices all over the place, as opposed to having industry segments that may be a little bit more advanced than others on this journey of data quality?
I mean, definitely, heavily regulated industries will have a lot more emphasis on preserving the data, and that really is just the fact.
But I would say companies ultimately are more geared toward ensuring that they have good-quality data, because at the end of the day it is not just something that you use to meet a compliance obligation. You want to have good data because it ultimately is going to inform your decisions.
You want to have good data because, at the end of the day, if you want to make a decision about your business, say, where do I take my company in the next five years, you have to base it on information, on historical information or market information.
So the data will play an integral role in making those types of decisions.
I will say that when it comes to benchmarks and tools out there, there are many.
And it's definitely a conversation that we can dive into if we ask the question of what type of systems you use.
There are some best practices and also some tools that are available, depending on the type of application that you use, whether it is a CRM or whatever type of tool you're talking about.
As a personal exercise, and also as a recommendation, I always say to go with, for example, the Magic Quadrant from Gartner. If you want to look at some benchmarks around applications out there, that's a good tool. The tools and the benchmarks you want to apply, and the best practices you apply, will be very specific to your tools and your business. So it's important to understand what your requirements are.
Yeah, for sure, those are good perspectives, Jairo. I'll share that, throughout my career working with different clients, especially in the engineering and physical sciences space, I have had quite an interesting experience. I have had projects where a supplier would come up to me and say, hey, I have 18,000 data points on how this equipment functions. And then I'd look at it, and it was kind of meaningless, even though there were 18,000 data points. And then we'd go back to them and say, I need eight data points, but I need those data points to be taken in this specific way, and I can get better information from those eight data points than from the data points that they provided to me.
So the extrapolation of that is that we are in an era of big data. We have big data applications everywhere, versus small data applications. And quite often small data, if it's collected right, has better information quality than big data.
So, specific to the application of big data, how do you ensure big data quality? Because people sometimes make the assumption that big data is better, and also the dangerous assumption that the quality of that big dataset is decent. And as you know, we can have many correlations there that arise by happenstance, and those can drive really wrong decision making and behaviors based on big data. So in the era that we're in, of big data and its availability, what is the approach, specifically for those types of applications, to ensure data quality?
So, I'll first echo your sentiment at the beginning.
I strongly believe that big data amplified the issue with data quality, because at the end of the day the amount of information just grew exponentially.
That is the amount of information that we started collecting on everything: on our clients, on our systems, on absolutely everything.
It all goes back, and I'll take you back, to that slide around the building blocks.
If you have good policies around data governance, that is ultimately the foundation of everything. It is the basis upon which your IT infrastructure is built.
It is the basis of how your processes are built. So having that framework of policies as a baseline is always going to help.
As I kept saying earlier, getting the basics right will go a long way.
So, with some of these things, people may look at them and think, oh, you know what, that's very aspirational; we are probably years away from having a very comprehensive and robust data quality management system.
You don't really need to get to that point. You really just need to get the basics right and understand those questions: What type of data do you have? What is the format of that data?
Where is your data located?
How is it being collected? So getting the basics right, and understanding where your data is coming from, is going to go a long way. You don't really have to get to a robust implementation of a data quality management system to make a difference.
Very well. Jairo, one final question here, because you're a practitioner. That's why we like to have speakers like you: you do these things, you don't just write a book about it.
From the point of view of you and the people in your practice, are there useful tools or techniques they're using for data quality that you would suggest people check out?
Like I was telling you earlier, I think it really depends on the system that you use.
For example, Informatica is a company that has a lot of data quality tools and applications that you can use to clean your data. Or maybe you want to refresh customer data. Then you are talking about companies like swimming for Trauma and Dun and Bradstreet, which have massive, robust customer databases that can help you refresh some of the data you may have on your customers.
Maybe if you're using Salesforce, you want to use Cloudingo. I think one of the obstacles related to data quality is understanding what tools are out there.
So it really starts with understanding what your requirements are and, again, getting the answers to those questions that I mentioned earlier.
Jairo, thank you so much for a masterclass on data quality. We appreciate the insights and the thorough approach that you used in explaining those concepts. There were great insights in your presentation and in our dialog. So we're very thankful, on behalf of our global community, that you took the time to share your expertise with all of us.
Thank you. I'll say thank you for having me and thank you, everyone.
Ladies and gentlemen, that was Jairo Gomez, who is, again, the Senior Leader of Infrastructure and Technology at the Association of Certified Anti-Money Laundering Specialists, ACAMS, with a masterclass on the importance of a data governance structure and, specifically, data quality. It is a very important subject that is easy to overlook in organizations because of other competing priorities, and sometimes a bit of a rush to implement technology.
Well, we're going to wrap up for today, and I want to set up a little bit of what we're expecting tomorrow.
We're going to have great presentations tomorrow, again from global industry leaders in IT infrastructure and cloud. We're going to start tomorrow with Beacon Street and their Vice President of Software Engineering and Quality Assurance, discussing how you intelligently leverage data and why it is OK to shift right as part of your testing strategy. So Josh Mask is going to be with us talking about that. We're going to have presentations from Walter score, and the banking. And we're going to finish the day with the head of client delivery management for the MBA world region from SAP. Shima Rohingya is going to talk about purpose-driven cloud strategy and the role of IT infrastructure in it.
Shima is a global leader in this area for SAP and, along with all of the other speakers ... tomorrow, will provide a very comprehensive, multi-faceted view of IT infrastructure and cloud strategies. So for now, we're going to wrap up here. I ask that if you have questions or comments, or you want to thank the speakers and sponsors, you check out LinkedIn, under Joseph Theories, see the post that I have about this conference, and make your comment there.
Again, we appreciate the engagement that you have with us during the conference, and after the conference as well. Wherever you are in the world, have a great rest of your day. I look forward to seeing you here again tomorrow for another great day on IT infrastructure and cloud strategies. Thank you for now.
Sr. Manager, Infrastructure & Technology,
Subject matter expert in areas of business transformation, GRC, and business intelligence. 10+ years of experience in leading global cross-functional teams, coordinating both strategic and tactical efforts to drive continuous improvement and business transformation across multiple service-oriented industries.