Leveraging Big Data in the Modern Tax Function

Aug 14th 2013

By Stephanie Maxwell, David Steiner, and Scott Stein

Technology and the modern tax function go hand in hand. The complexity generated by increased globalization as well as a wealth of new regulatory and compliance requirements can only be managed effectively through the use of solutions geared toward the myriad calculations, reports, and analyses demanded of tax departments. The amount of data that is generated on a daily basis has swelled to staggering proportions.

Organizations that hope to remain competitive must be prepared to not only manage but capitalize on this windfall of information. Emphasis on data-driven decision making must be an enterprise-level undertaking, but such a transformation will take time and may be best accomplished in stages. This article will focus on the advent of big data, as well as solutions and tools that may be leveraged by best-in-class tax departments to make the transition to an efficient and effective member of the digital universe.

The Era of Big Data

One of the biggest challenges facing organizations today is how to handle the huge amounts of data that are being generated, including new and increasingly complex data types. Walmart handles more than a million customer transactions every hour, feeding databases estimated at more than 2.5 petabytes. Facebook is home to forty billion photos and their corresponding metadata. And decoding the human genome involves analyzing three billion base pairs, a task that took ten years when it was first completed in 2003 but can now be achieved in one week.

These are examples of a phenomenon that has been termed "big data." According to the Gartner IT Glossary, "Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making."

Volume is typically measured relative to the capacity of the existing available resources for storage and management of the data. The number of devices and applications capable of capturing digital data is growing daily, which means that the volume of available data will continue to expand.

In addition, the types of data being captured through social media, websites, and e-mails contribute to an ever-increasing variety of information available to organizations. Of particular relevance to tax departments, for example, is the potential to leverage information captured in e-mails and paper documents to comply with FATCA's US indicia requirements.

There are two basic categories for this data: structured and unstructured. Structured data refers to the more traditional data types that can easily be read or managed by existing technology, such as relational databases. Unstructured data is less formal and may be textual, such as e-mails, documents, or instant messages, or it may be generated by non-textual sources, such as images or audio and video files and the corresponding metadata, or "data about data."

With so many sources, the velocity, or frequency, of the data produced by organizations is growing by leaps and bounds. However, the crux of the challenges and opportunities presented by big data is value: finding the proverbial needle in the haystack. Identifying and using the meaningful data among the mass of information flowing into an organization requires not only deep analytical skills, but also tools designed for just this purpose. The difficulty of this task is increased when data is buried in disparate or obsolete systems built for other purposes.

A June 2011 report by the McKinsey Global Institute reveals that there is a 40 percent projected growth in global data generated each year, and within the United States, fifteen out of seventeen industry sectors have more data stored per company than the US Library of Congress. Organizations that get ahead of this trend and implement policies and strategies to harness big data will enjoy a significant competitive advantage within the marketplace.

This will have particular importance for tax departments that have to contend with the aforementioned growing burden of compliance and reporting requirements and the equally important focus on quality and risk management. The organization, retention, and disposal of the vast collection of data available to companies, including data within the tax department, are key considerations to manage risk and avoid liability down the road. Without the ability to manage volume and extract the important, quality data from the growing quantity, tax will be further hindered by inefficiency and risk.

Furthermore, time spent on manual data manipulation and reporting limits opportunities for tax to focus on value-added tasks, such as strategic planning and forecasting. Historically, tax departments have tended to sit in a silo, apart from the organization as a whole. As the trend toward big data leads organizations to emphasize the value to be found in deep analyses of available information, this tendency can no longer continue.

The core benefit of harnessing big data is the opportunity to exploit the collective mass of data throughout an organization to find business trends, detect patterns, identify and address risks, and boost productivity and market value. Tax must keep pace with the larger organization and seek opportunities to implement technology and process improvements. Doing so will both optimize day-to-day operations and provide opportunities to add value through planning, while enhancing the risk management profile of the tax function.

Tools of the Trade

According to a 2012 Forbes article, "big data is the new quant investing." Many of the ideas from quant investing make sense in this context; histories are huge and experimentation is easy. There is an underlying behavioral model, plus, you know your counter-parties. The large volume and variety of data allow use of new "data voracious" statistical and machine-learning methods that, in finance, are useful for high-frequency trading, but are worthless on daily or monthly market data. It is words as well as numbers, so natural language tools can work along with numerical calculation.

With such emphasis on data-driven decision making, organizations must make it a priority to effectively manage and utilize the wealth of data available to them in order to remain competitive, and tax departments are no exception to this reality. The capability to predict the potential tax impact of international transactions or proposed acquisitions through business intelligence solutions is just one example of the opportunities for tax to contribute to the success of the organization through the power of data. Data imaging technology offers the capability to better leverage paper notices by extracting meaningful information and transforming it into a readable electronic format that can be stored and later queried from a centralized database. There are multiple tools available to assist tax departments with these types of process improvements.

Data management and collection

The use of a centralized repository to house structured data is not a new concept, but it remains an effective and critical solution. One of the persistent struggles facing tax departments today is efficiently using data stored across multiple systems on varying platforms, many of which were designed without taking into account the needs of the tax department. Consolidating tax data into a data warehouse is an important first step toward a meaningful data management solution. Data warehouses provide the ability to maintain a single version of the truth and eliminate the risk of data getting out of sync between systems.

A data warehouse may be as broad as a mainframe solution that captures all of the relevant data maintained within the organization, or it may be focused on a single functional area such as the tax department.

Hand in hand with a data warehouse are ETL tools. The acronym ETL refers to Extract-Transform-Load, and as the name suggests, these tools provide the ability to collect data from disparate sources and transform it into a standard format for analysis and reporting. ETL tools are flexible enough to accommodate various data types and platforms while ensuring a consistent definition and use of the data that is collected. On their own, these tools may also help overcome the challenges presented by legacy systems that are not configured to accommodate the needs of the tax department, by automating the retrieval of critical information into downstream databases or tools for use in tax-oriented analysis and reports.
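To make the extract, transform, and load steps concrete, the following is a minimal sketch in Python rather than a depiction of any particular ETL product. It assumes two hypothetical source extracts (a general ledger and a legacy system) that describe the same kind of record in different layouts, and an in-memory SQLite table standing in for the warehouse. Every table, column, and entity name is invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two source systems that describe the same kind
# of record in different layouts (column names, delimiters, number and date
# formats). All names and values here are invented for illustration.
GENERAL_LEDGER_CSV = """entity,amount_usd,posted
Fund A,1200.50,2013-03-31
Fund B,310.00,2013-03-31
"""
LEGACY_SYSTEM_CSV = """ENTITY_NAME;AMT;DATE
fund a;99,50;31/03/2013
"""


def extract(text, delimiter):
    """Extract: read delimited text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def transform_ledger(row):
    """Transform: ledger rows already use the target number and date formats."""
    return (row["entity"].strip().title(), float(row["amount_usd"]), row["posted"])


def transform_legacy(row):
    """Transform: normalize decimal commas and DD/MM/YYYY dates."""
    day, month, year = row["DATE"].split("/")
    amount = float(row["AMT"].replace(",", "."))
    return (row["ENTITY_NAME"].strip().title(), amount, f"{year}-{month}-{day}")


def load(conn, rows):
    """Load: append standardized rows to a single warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tax_fact (entity TEXT, amount REAL, posted TEXT)"
    )
    conn.executemany("INSERT INTO tax_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run all three steps against both hypothetical sources."""
    rows = [transform_ledger(r) for r in extract(GENERAL_LEDGER_CSV, ",")]
    rows += [transform_legacy(r) for r in extract(LEGACY_SYSTEM_CSV, ";")]
    load(conn, rows)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    run_etl(warehouse)
    query = "SELECT entity, SUM(amount) FROM tax_fact GROUP BY entity ORDER BY entity"
    for entity, total in warehouse.execute(query):
        print(entity, total)
```

Once both sources share one definition of entity, amount, and date, a single query can total amounts per entity regardless of which system a record came from.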

Data mining

Big data is made up of structured and unstructured information. Structured data is typically found in databases and accounts for 10 percent of the big data that organizations possess. The other 90 percent is made up of unstructured data, such as e-mails, call center recordings, social media posts, and website logs. Organizations are faced with the challenge of translating all of this data into meaningful content in an efficient manner that allows for effective decision making.

Data warehouses and ETL tools open the door for effective data mining. "The era of big data brings new data management and data mining challenges, but just as many opportunities. The more data that exists and the more that data is in chaos, the more important translation systems are to future success. We rely on IT to manage these translation systems and to innovate and improve on both the systems and the translation processes" (TechNet Magazine, "IT Management: The Petabyte Era").

Data mining tools are used to identify the relevant and meaningful pieces of information within a dataset, a capability that is becoming increasingly important with the emergence of big data. As an example, text mining of keywords, tags, and patterns or terms in both structured and unstructured data, such as paper documents, can help identify US citizens to meet new FATCA reporting requirements. Data mining provides a chance to highlight areas requiring deeper analysis and investigation to help minimize the risk of negative tax impact from proposed business decisions. However, this analysis is only as good as the data itself. Data must be organized and stored at a sufficient level of detail to create meaningful value from data mining.
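As a rough illustration of keyword and pattern mining over unstructured text, the sketch below scans a handful of documents for signals that might warrant a closer look. The three patterns (a US-style phone number, a US place of birth, and a state-plus-ZIP address) are simplified assumptions chosen for the example; they are not the regulatory list of US indicia, and a production tool would need far more robust matching. The document identifiers and text are hypothetical.

```python
import re

# Illustrative patterns only. These are assumptions for the example and do
# not represent the regulatory definition of US indicia.
INDICIA_PATTERNS = {
    "us_phone": re.compile(r"\+1[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}"),
    "us_birthplace": re.compile(
        r"place of birth:\s*[^\n]*\b(USA|United States)\b", re.IGNORECASE
    ),
    "us_zip_address": re.compile(r"\b[A-Z]{2}\s+\d{5}(?:-\d{4})?\b"),
}


def flag_documents(documents):
    """Return {doc_id: sorted pattern names found} for documents with any hit."""
    hits = {}
    for doc_id, text in documents.items():
        matched = sorted(
            name for name, pattern in INDICIA_PATTERNS.items() if pattern.search(text)
        )
        if matched:
            hits[doc_id] = matched
    return hits


# Hypothetical text pulled from an e-mail, a scanned form, and a file note.
SAMPLE_DOCUMENTS = {
    "email_001": "Please call me at +1 212-555-0147 regarding the subscription.",
    "form_017": "Place of birth: Boston, USA\nMailing address: 10 Main St, Boston, MA 02108",
    "note_003": "Investor relocated to Zurich in 2012.",
}

if __name__ == "__main__":
    for doc_id, reasons in flag_documents(SAMPLE_DOCUMENTS).items():
        print(doc_id, reasons)
```

The output is a list of documents to review, not a determination. As the article notes, the analysis is only as good as the data, so hits of this kind are best treated as prompts for deeper investigation.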

Business intelligence and data discovery

As a next step, once data has been gathered and organized, business intelligence solutions and data discovery tools provide the functions to turn raw data into valuable results. Business intelligence (BI) solutions include the dashboards, analytics, and reports that allow tax departments to leverage the large amounts of structured data maintained in data warehouses. 

One of the most important components of BI is customizable dashboard views that provide the opportunity to organize and present data visually at varying levels of detail depending upon the intended audience. Given the vast quantities of data available, this type of solution is essential to distill the important pieces of information and present them in a way that is meaningful to each user. Dashboards offer the opportunity for collaboration and integration across the organization by making tax data and analysis visible to other departments and all levels of the organization in a way that they can easily digest in real time as the data is captured.

Predictive analytics is a specific branch of BI analysis that uses elements called predictors to create models that can be analyzed to forecast future probabilities. The models are revised or validated as new data becomes available. Predictive analytics is a valuable tool for data-driven decision making because it offers the ability to identify trends and evaluate the tax or financial impact of potential business decisions.
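A small sketch can show what "predictors," "models," and "revised as new data becomes available" mean in practice. It assumes a single hypothetical predictor (pre-tax income, in millions) and fits a straight line to hypothetical tax liabilities using ordinary least squares. Real predictive analytics tools use many predictors and richer models; the figures below are invented.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Forecast the outcome for a new value of the predictor."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: pre-tax income (predictor) and tax liability (outcome).
income = [10.0, 20.0, 30.0, 40.0]
liability = [3.0, 6.0, 9.0, 12.0]

model = fit_line(income, liability)
forecast = predict(model, 50.0)  # forecast for a proposed scenario

# A new observation arrives, so the model is refitted (revised) with it.
income.append(50.0)
liability.append(16.0)
revised_model = fit_line(income, liability)
revised_forecast = predict(revised_model, 60.0)

if __name__ == "__main__":
    print(round(forecast, 2), round(revised_forecast, 2))
```

Comparing the forecast with the value that later arrives is the validation step. When the two diverge, refitting on the enlarged history is the revision step.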

On an operational level, these solutions may be used to help identify processes that previously experienced significant or repeated delays in order to proactively implement improvements and plan work appropriately. From a risk management perspective, utilizing the output of this type of analysis can reduce the occurrence of missed or late filings. These solutions are designed to use data to make better business decisions and can significantly improve both the quality of the information and reporting that comes out of the tax department and its overall efficiency.

While the tools mentioned provide a wealth of capabilities, they are often best suited for structured data and sometimes struggle with the increasing volume of data captured by modern organizations. Data discovery is a growing new segment of business intelligence, which offers innovative technologies that leverage advanced processing to produce effective performance when analyzing the huge volume of big data. In a report published on June 17, 2011, Gartner states: "These data discovery alternatives to traditional BI platforms offer highly interactive and graphical user interfaces built on in-memory architectures to address business users' unmet ease-of-use and rapid deployment needs." This report further highlights the use of data discovery tools to address specific requirements of smaller workgroups that are most often addressed through manual data analysis and spreadsheets today.

These tools offer the option to function in a self-service model, which eliminates the need for the technical expertise of the IT department. Data discovery includes the capability for search-based discovery tools that can take advantage of unstructured data through the use of text-search inputs that can find the important information among reams of documents, e-mails, and notices. Where BI tools are traditionally oriented to the needs of an entire organization and implemented at an enterprise level, data discovery tools are focused on the individual business user.

Data discovery can be leveraged to address specific needs within a particular workgroup, such as analyzing tax department performance during prior tax cycles to identify opportunities for improvement. A number of companies are emerging to offer suites of data visualization tools that provide interactive graphical results built on top of BI-based analytics and reporting. These visually oriented outputs provide meaningful insights into complex information in a way that is intuitive for human consumption and can be easily presented to stakeholders at all levels. This is much more impactful than the traditional spreadsheet-based reports that have historically been the output of tax analysis and reporting.

Web-enabled portals for data access and tracking

At the front end of data management, a web-enabled portal takes full advantage of the carefully collected and organized data. The analysis offered by the previously mentioned tools is brought into a central view that offers increased opportunities for collaboration through shared data and reporting both within the tax department and beyond.

Portals utilize standard processes and queries that ensure all users of the data are receiving current, consistent information. With the growing international scope of business today, this type of collaboration becomes even more critical to ensuring that tax builds a relationship with the global organization to assist in managing the expanding compliance and reporting requirements of multiple jurisdictions.

Operationally, portals can also help reduce the amount of paper mailings to organization stakeholders, enabling quicker dissemination of information and allowing for reliable tracking and analysis of these activities. Stakeholder information can be managed directly within portals as well, which reduces the administrative burden on the tax department and allows data to flow directly to the data warehouse for use in analysis and reporting.

Going one step further, by leveraging the Internet, portals can support connections to external applications or third-party vendors for specialized features or calculations not available in-house. Along similar lines, portals allow for increased transparency where desired by allowing organizations to make relevant data available to investors, customers, management, or regulators.

From a risk-management perspective, the use of encryption in a web-enabled portal increases security of private data and provides auditability through website logs and tracking of data requests, submissions, and other transactions. These logs and other audit information are prime examples of unstructured big data that can then be accessed through data discovery and BI tools to provide insight into investor and stakeholder activities as well as maintaining an electronic record of who received or submitted what and when.

Document management

As the final step in the overall big data strategy, a well-executed and robust document management approach is critical to an effective tax function and can take inefficiencies out of the tax life cycle, thereby freeing up resources. Frequently, tax departments use shared file servers or paper files for a majority of their document management needs. These decentralized and outdated document management systems make it difficult to quickly and easily locate essential documentation needed to support the tax function. Quite often this leads to hours spent on rework and locating the final version of key documents.

Innovative document management systems, when managed and maintained properly, are highly effective in driving the re-use of tax documentation as well as significantly reducing the risk of using outdated or incorrect materials. As an example, private equity firms typically collect and store withholding statements for each investor, often in paper format or scanned into a repository typically maintained by the investor relations department. Tax will then take a copy of those statements in order to calculate withholding for investors, but may not check to ensure that it has the latest versions of those documents. Consequently, tax may not know if, for example, an investor has moved since the last filing. By storing these types of documents in a central location, accessible by both investor relations and tax, version control is effectively resolved and the risk of incorrect withholding is drastically reduced.
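The version-control problem in the withholding example can be sketched in a few lines. The code assumes a hypothetical central repository in which each withholding statement carries an investor name and a version number, and a record of which version the tax team last copied. The field names and sample data are illustrative and do not come from any specific document management product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    """One stored version of an investor's withholding statement."""
    investor: str
    version: int
    received: date
    address: str


def latest_by_investor(statements):
    """Return the highest-version statement for each investor."""
    latest = {}
    for statement in statements:
        current = latest.get(statement.investor)
        if current is None or statement.version > current.version:
            latest[statement.investor] = statement
    return latest


def stale_copies(repository, tax_copies):
    """List investors whose tax-side copy is older than the repository version."""
    latest = latest_by_investor(repository)
    return sorted(
        investor
        for investor, version in tax_copies.items()
        if investor in latest and version < latest[investor].version
    )


# Hypothetical repository contents and the versions tax is working from.
REPOSITORY = [
    Statement("Investor X", 1, date(2012, 1, 15), "New York, NY"),
    Statement("Investor X", 2, date(2013, 2, 1), "London, UK"),
    Statement("Investor Y", 1, date(2012, 6, 30), "Chicago, IL"),
]
TAX_COPIES = {"Investor X": 1, "Investor Y": 1}

if __name__ == "__main__":
    print(stale_copies(REPOSITORY, TAX_COPIES))
```

When both departments read from the same store, a check like this becomes unnecessary, because there is no separate copy to drift out of date.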

A centralized, structured document management solution provides a consistent, known location for critical documentation and provides effective capabilities for classification and indexing. In turn, organizations improve risk management through one version of the truth; modern document management systems provide for tracking, searching, version control and lockdown, and the ability to audit edits to documents. And as more and more tools continue to emerge to mine the unstructured data found in these files, a well-organized document management solution will put the organization one step ahead in leveraging this information.

As previously mentioned, document repository content is among the largest sources of unstructured data within an organization and represents a previously untapped source of valuable information about investors and business trends. A solution that offers the ability to easily tag or categorize the information in documents to facilitate ease of retrieval and identification of relevant data can offer significant improvements in efficiency and eliminate countless hours spent searching through disorganized data stores.

Tax departments may choose to enable web access to their document management system in order to facilitate real-time collaboration with internal stakeholders as well as outside service providers. As an additional benefit, the organization can more easily impose processes and standards around document archival and retention, security, and backups in a well-maintained document repository. The established document management system offers an efficient method for sharing and collaborating within the overall organization as well as minimizing risk within the tax function and eliminating administrative burdens around outdated shared drive file storage.

The Way Forward

Implementing a full-scale global tax platform is often too complex or costly for many organizations, particularly in today's economy. But that does not preclude opportunities to enhance the data literacy of the tax department. Most of the solutions discussed above can be effectively utilized independently and still add significant value to the organization. Regardless of the scale of data transformation within an organization, the key steps to an effective data management strategy are the same.

Determine the business goal

It is crucial that the emphasis be placed on accomplishing a specific business goal. The value of big data lies in what new or better knowledge can be gained through its analysis. Identify a specific goal within the tax function that is served through better or additional data as the focus of the new strategy and use this as an opportunity to establish a framework.

One example might be leveraging the reams of paper and electronic documents and e-mails that were previously unavailable for FATCA-related analysis, but which can become fully utilized through data mining or document management solutions. Starting small, with a specific objective, leaves room for refinement and expansion as the team becomes more comfortable with the new tools and methodologies, while minimizing investment until the value of the endeavor is proven.

Identify quality data

It cannot be overstated: the knowledge gained from big data is only as good as the data itself. Identifying the important, quality information from within the multitude of available data is critical to the success of a data management strategy. At a minimum, even if no other technology is utilized, tax departments can gain efficiencies from simply organizing and standardizing the currently available data.

Beyond that, knowing which data is important can help determine the best investment in technology and processes to meet the stated goal. It is also important to consider the need to coordinate data from multiple sources, including those external to tax. Finance, customer relations, and even marketing data can all be potential sources of meaningful information, and an effective strategy should include a mechanism to gather data from all relevant sources in an efficient manner.

Embrace new roles

With the growing emphasis on data-driven decision making, organizations will need to find resources with the proper skill sets to find the meaning in the vast quantities of data that are now available to them. Embracing the role of the data scientist in every area of the organization, including tax, is a necessary step to get the greatest benefit from advanced analytics. Creating opportunities for tax resources to increase their analytic skills and build their data literacy will also be crucial. Ultimately the mind-set of the tax function must be transformed to one that focuses on data and the corresponding algorithms to make decisions and generate knowledge to the benefit of the organization.

One of the biggest benefits of data discovery is its ease of use and focus on business versus IT users, so for organizations looking to leverage deeper data analysis quickly, these solutions can be a good option. However, even without new tools or technologies, the approach to data analysis and the focus on identifying new patterns and new relationships to gather insight about the business must be at the core of any data management strategy.

Change the culture

No new strategy will succeed if the culture within the organization does not evolve with it. Sponsorship from the top down and emphasis on the importance of data-driven decision making are essential. Regardless of the scope and level of investment in a data management platform, organizations must change the way that data and the resources that use it are perceived. Data analysis and insight are no longer the sole province of IT; they must be a common thread throughout the organization. Defining business goals supported by data and analytics, developing the appropriate skill sets, and implementing the proper architecture and framework to support data initiatives will send a clear message from leadership that data literacy is now a core value of the organization.

As the data-driven economy continues to evolve, it is clear that organizations that embrace the opportunities presented by an effective data management strategy will maintain a competitive edge within the global marketplace. While implementing a full-scale and comprehensive solution designed around big data will require significant investment and may take time to come to fruition, taking smaller steps now to enhance the use and management of available data is essential in today's information-focused culture.

For the tax department in particular, the explosion of new regulations and reporting requirements along with increased scrutiny by management and tax authorities alike makes this even more crucial. Redefining the tax technology strategy with an emphasis on data management can help tax functions meet increasing demands for transparency, quality, and reduced risk and become a more integrated and influential component of the larger organization.

About the authors:

Stephanie Maxwell is a senior manager at PwC. She has worked at PwC for fourteen years, focusing on tax technology application implementations and support.

David Steiner is a partner at PwC. He has been in the financial services industry for nineteen years, principally involved in the audit and taxation of hedge funds, private equity funds, funds of funds, and investment advisors. In addition, Steiner is the national partner in charge of technology and process for the Asset Management practice.

Scott Stein is a managing director in PwC's Asset Management tax practice. During his sixteen years with PwC, Stein has been involved in the design, development, implementation, and support of various tax technology initiatives.

