Category Archives: CaCPA

Why Most eDiscovery Tools and Online Archiving Offerings Are Terrible for Information Governance

By John Patzakis and Chas Meier

Many organizations assume that information governance initiatives—such as data privacy audits, purging ROT (Redundant, Obsolete, or Trivial) data, merger and acquisition-driven data separation, or data breach impact assessments—can be effectively addressed using eDiscovery tools or online archiving platforms. After all, eDiscovery solutions excel at identifying and searching through large volumes of unstructured data in high-stakes, reactive legal scenarios.

However, there is a critical distinction between eDiscovery and information governance workflows that organizations must understand when selecting the right solution. eDiscovery typically involves copying large volumes of data at multiple stages and continually moving that data upstream, eventually into third-party cloud platforms for processing and hosting. In contrast, duplicating and moving massive data sets is often the last thing you want to do in information governance projects, which are typically large-scale, enterprise-wide initiatives.

In fact, here are five major reasons why most eDiscovery tools and online archiving solutions are terrible for information governance. These tools:

  1. Dramatically Increase Risk
    Consider a scenario where an organization suffers a data breach and must assess 100 terabytes of data to identify compromised PII and determine reporting obligations. Most eDiscovery tools require a full copy of this data to be made and uploaded into a third-party environment, doubling the volume of sensitive material and compounding the risk. Instead of helping, this kind of mass data duplication exacerbates the compliance and privacy risks that governance initiatives aim to reduce. In fact, such data duplication directly conflicts with the GDPR’s data minimization and proportionality principles.
  2. Are Exorbitantly Expensive
    Information governance is not a small, tactical effort—it is a broad, enterprise-wide initiative. At X1, we rarely see governance projects involving less than 50 terabytes of data. Using traditional eDiscovery pricing models, even with volume-based discounts, these projects can quickly rack up tens of millions of dollars in costs due to unnecessary processing, storage, and hosting workflows designed for litigation—not governance.
  3. Can’t Meet Time Constraints
    Copying, transferring, uploading, and indexing 100 terabytes of data into a third-party cloud platform can easily take six months or more, even in an ideal scenario. That timeline is incompatible with the urgent nature of most information governance use cases, such as data breach impact assessments or M&A-related audits. Worse yet, by the time the data has been copied and indexed, it will likely already be stale—undermining the integrity of the project from the outset.
  4. Create Remediation Roadblocks
    Suppose you incur the costs and risk of copying and uploading a full data set into an external review platform and successfully identify sensitive or outdated data for remediation. Now what? You are merely working with copies of the data. The originals remain distributed across Microsoft 365, file servers, laptops, and other locations. Trying to trace back and manually remediate live data sources is costly, disruptive, and error-prone, defeating the very efficiency goals of the governance project.
  5. Do Not Support Microsoft 365 Effectively
    Many so-called “governance” tools are simply rebranded email archiving systems that rely on bulk copying data out of Microsoft 365. Not only is this approach expensive and inefficient, but it also creates serious technical and compliance risks. Microsoft 365 does not support mass data exports at scale without significant friction, and errors are common—as illustrated in FTC v. Match Group, No. 3:19-CV-2281-K, 2025 WL 46024 (N.D. Tex. Jan. 7, 2025). In that case, Microsoft Purview exports into an archival system failed, resulting in court-imposed discovery sanctions. If a solution does not support index-in-place capabilities—allowing analysis directly upon the native data—it is simply not viable for modern information governance needs.

A Different Approach is Required
Information governance requires agility, precision, and a fundamentally different approach than traditional eDiscovery processes. Organizations must be wary of legacy eDiscovery tools and outdated archiving platforms masquerading as governance solutions.

X1 Enterprise was purpose-built to address the challenges and inefficiencies that plague traditional eDiscovery tools and archiving platforms when applied to information governance. At the core of the X1 Enterprise Platform is its patented micro-indexing architecture, which enables organizations to search, analyze, and act on data in place, without needing to first copy, move, or centralize it.

This index-in-place capability means X1 can connect directly to endpoints, file shares, Microsoft 365, and other enterprise data sources to perform fast, scalable, and highly targeted data sweeps and analysis—without duplicating the data or exposing it to unnecessary risk. Whether you are performing a data privacy audit, a breach impact assessment, or an M&A data separation project, you can run real-time searches across tens of terabytes and thousands of custodians—with results returned in minutes, not months, and the data remediation performed in-place.

By eliminating the need for data movement, X1 avoids the five major pitfalls of legacy tools:
Risk: No mass duplication of data, reducing exposure and aligning with GDPR and other regulatory requirements.
Cost: No massive ingestion or hosting fees—X1 dramatically lowers total project costs by working directly with live data.
Time: Deploy and execute governance initiatives in a fraction of the time required by traditional methods.
Remediation: Act directly on live data—flag it, move it, delete it, or apply tags—in the original source locations.
Microsoft 365 Compatibility: X1 integrates natively with Microsoft 365 and other systems without requiring cumbersome exports or expensive additional licensing and services, enabling robust, reliable governance at enterprise scale. Simply put, we believe X1 provides the best available support for M365 data sources.

In short, X1 Enterprise offers a faster, safer, and far more cost-effective way to execute complex information governance projects—turning what used to be massive, reactive, months-long efforts into streamlined, proactive, and strategic workflows.

Learn more about how X1 Enterprise can streamline your next information governance project. Schedule a demo today at sales@x1.com or visit www.x1.com/solutions/x1-enterprise-platform.


Filed under Best Practices, CaCPA, Cloud Data, Corporations, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, GDPR, Information Governance, law firm, m365, Preservation & Collection, Records Management

Courts Favor Targeted eDiscovery Collections, but It Is Up to In-House Teams to Enable Such Cost Saving Proportional Efforts

By John Patzakis

In-House Legal Teams Enable Cost Savings

Corporate legal departments face ever-increasing costs and risk related to eDiscovery, driven largely by excessive and indiscriminate data collection. Many organizations default to an overbroad “collect everything” approach out of an abundance of caution or due to inefficient workflows imposed by third-party service providers or even outside counsel. Overcollection results in far higher costs upstream, critical delays, and increased risk. For this reason, courts consistently endorse proportional and targeted discovery practices that balance the needs of litigation with cost-effectiveness and reasonableness. But to best realize the benefits of proportionality, organizations should establish an in-house eDiscovery capability supported by best-practices technology.

Courts Support Proportional and Targeted ESI Collection
The Federal Rules of Civil Procedure (FRCP) emphasize proportionality and reasonableness in discovery. Specifically, Rule 26(b)(1) limits discovery to information that is relevant to any party’s claim or defense and proportional to the needs of the case.

Courts have routinely upheld this principle, encouraging parties to avoid overbroad collections:

  1. The Sedona Conference Principles
    While not binding, courts frequently rely on The Sedona Principles, which advocate for “reasonable and good faith efforts” to identify relevant ESI. (See The Sedona Principles, Third Edition, 19 Sedona Conf. J. 1 (2018)). Courts cite these principles to support reasonable limits on preservation and collection.
  2. In re Bard IVC Filters Prods. Liab. Litig., 317 F.R.D. 562 (D. Ariz. 2016)
    Here, the court recognized the proportionality limits of Rule 26(b)(1) and ruled that the defendant’s proposed targeted discovery approach—using custodians, date ranges, and agreed-upon search terms—satisfied its obligations.
  3. Oxbow Carbon & Minerals LLC v. Union Pacific Railroad Co., 322 F.R.D. 1 (D.D.C. 2017)
    The court rejected broad discovery requests that lacked proportionality, holding that the producing party could limit its search for ESI to agreed-upon custodians and relevant date ranges. The court emphasized that broad, burdensome demands are contrary to Rule 26(b)(1).
  4. Hernandez v. City of Houston, No. 4:16-CV-3577, 2020 WL 2542625 (S.D. Tex. May 19, 2020)
    Here, the court denied a motion to compel additional production of ESI beyond agreed search terms, explaining that the requested expansion was disproportionate given the marginal relevance and substantial burden of additional collection.

These and other decisions (further analysis available here) demonstrate that targeted, proportional collection efforts are not only defensible but expected by the courts. Overcollection is not mandated by the courts and, in fact, can increase risk by preserving irrelevant or privileged information unnecessarily.

So, the problem is not the law. The challenge is that many eDiscovery service providers favor full disk imaging or other forms of massive data overcollection for two reasons: 1) because they are not integrated into a company’s IT data architecture with an established and repeatable process, they revert to a reactive, one-off effort to collect everything that could possibly be relevant; and 2) they are financially incentivized to collect as much data as possible.

Advantages of In-House eDiscovery Capabilities for Targeted Collections
To align with the principles of proportionality, legal departments should move away from the outsourced collection model that favors bulk extraction. Instead, maintaining an in-house eDiscovery capability provides the following key advantages:

  1. Integrated, Precise Search and Collection
    Solutions like X1 Enterprise are designed to index data in place, allowing corporate legal and IT teams to search, cull, and collect only what is relevant—without moving massive volumes of unnecessary data. This reduces costs and minimizes data exposure.
  2. Iterative, Defensible Process
    With in-house capabilities, legal teams can collaborate directly with IT to conduct collections iteratively. They can refine search criteria and custodians in real-time, in response to case developments or meet-and-confer negotiations, ensuring defensibility and responsiveness.
  3. Faster Response Times and Lower Costs
    Deeply integrated technology removes reliance on expensive, reactive third-party vendors who often require full data exports up front. By indexing data where it resides, in-house teams can respond quickly to litigation holds and discovery deadlines.
  4. Enhanced Compliance and Risk Management
    By avoiding massive data dumps, corporations reduce the risk of producing irrelevant, privileged, or sensitive data unnecessarily. Proportionality helps mitigate privacy risks and comply with data minimization principles under privacy laws like the GDPR and CCPA.
  5. Control and Repeatability Across Multiple Use Cases
    In-house solutions preserve institutional knowledge and workflows. Future cases can reuse workflows and search parameters, creating repeatable, consistent, and auditable processes. Further, the same process can be readily leveraged for various information governance and other compliance use cases.
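
The custodian, date-range, and search-term criteria discussed above can be sketched as a simple filter. This is a minimal illustration only, not how X1 Enterprise or any other product works internally; the `Item` type, the `targeted_collection` function, and the sample data are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """Hypothetical record describing one document or email."""
    custodian: str
    modified: date
    text: str


def targeted_collection(items, custodians, start, end, terms):
    """Keep only items that match an agreed custodian, date range, and search term."""
    terms_lower = [t.lower() for t in terms]
    return [
        item for item in items
        if item.custodian in custodians
        and start <= item.modified <= end
        and any(t in item.text.lower() for t in terms_lower)
    ]


# Hypothetical sample data for illustration.
items = [
    Item("alice", date(2023, 3, 1), "Pricing agreement draft"),
    Item("alice", date(2021, 1, 5), "Pricing agreement v1"),
    Item("bob", date(2023, 4, 2), "Lunch menu"),
    Item("carol", date(2023, 5, 9), "pricing agreement signed"),
]

collected = targeted_collection(
    items,
    custodians={"alice", "bob"},
    start=date(2023, 1, 1),
    end=date(2023, 12, 31),
    terms=["pricing agreement"],
)
print(len(collected), "of", len(items), "items collected")
```

Because the criteria are plain parameters, they can be adjusted and rerun as meet-and-confer negotiations evolve, which is the iterative process described in point 2.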

Conclusion
Courts expect discovery to be proportional, targeted, and reasonable—not excessive or indiscriminate. Establishing an in-house eDiscovery capability with proven integrated technology like X1 Enterprise allows your organization to operationalize this legal standard. By doing so, you will reduce costs, minimize risks, and demonstrate good faith compliance with discovery obligations.


Filed under Best Practices, CaCPA, Cloud Data, Corporations, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, GDPR, m365, Preservation & Collection, proportionality

Dark Data is an Unmet Cyber Security Challenge

By John Patzakis

Enterprises today are creating and storing massive volumes of unstructured data, distributed across the enterprise, at a very fast pace. IT experts refer to this data type as “dark data.” Research advisory firm Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.” According to Rahul Telang, professor of information systems at Carnegie Mellon University, “[o]ver 90% of the data in business is dark data.”

Dark data exists due to organizational silos and a highly distributed and mobile workforce, a trend that proliferated during the COVID pandemic and has now solidified as the new normal. As a result, there is a proliferation of unmanaged data stored in file shares, laptops, unarchived email accounts, shared cloud drives such as OneDrive and Dropbox and many other repositories. According to Anthony Juliano, CTO of Landmark Ventures, “dark data is exploding rapidly with the dissolution of the perimeter; it’s a largely unaddressed risk vector. A vast majority of the CIOs and CISOs I speak with are now prioritizing solving this problem not only going forward, but also backwards – and it’s not easy.”

Cyber security platforms generally have a good handle on perimeter integrity, encryption, and other key priorities such as zero-day network attacks and malware. However, while these measures are clearly important, distributed dark data is largely a blind spot for cybersecurity tech, and as a result organizations have very little visibility into the content of such data. GDPR, CCPA, and other recent privacy regulatory requirements add urgency to this challenge.

CISOs and legal and compliance executives often aspire to implement information governance and security programs like defensible deletion, data migration, and data audits across their unstructured data to detect risks and remediate non-compliance. However, without an actual and scalable technology platform to effectuate these goals, those aspirations remain just that.

One tactic some CIOs use to address this daunting challenge is to periodically migrate disparate data from around the global enterprise into a central location, such as an archiving platform. But boiling the ocean through data migration and centralization is extremely expensive, highly disruptive, and frankly unworkable for numerous reasons. While such a concept may seem like a good idea when drawn up on the whiteboard, organizations quickly learn that they cannot simply migrate hundreds of terabytes of distributed dark data to an archive. The main obstacles are network bandwidth and other logistical constraints, as well as the reality that migration merely copies and duplicates the data, which actually makes the situation worse.
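
To make the bandwidth constraint concrete, here is a rough back-of-the-envelope sketch. It assumes a single sustained effective throughput and ignores indexing, retries, throttling, and business-hours limits, all of which would lengthen real projects. The `transfer_days` function and the 300-terabyte and throughput figures are hypothetical values chosen for illustration, not measurements.

```python
def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `terabytes` at a sustained effective throughput."""
    bits = terabytes * 1e12 * 8  # decimal terabytes converted to bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400  # seconds per day


# Hypothetical volume and throughput figures.
for mbps in (100, 500, 1000):
    print(f"{mbps:>5} Mbps: {transfer_days(300, mbps):6.1f} days")
```

Even under these idealized assumptions, the raw copy time alone scales linearly with volume, before any processing of the migrated data begins.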

Another tactic is data loss prevention (DLP). Again, this approach is thwarted by the new normal of a distributed, global workforce. Additionally, DLP tools are traditionally hampered by a lack of deep content insight into unstructured data, resulting in false positives, inaccurate classification, and unacceptable disruption to employee and business workflows.

What has always been needed is immediate, in-place visibility into unstructured distributed data across the enterprise: the ability to search and report across several thousand endpoints, file shares, and other unstructured data sources, and return results within minutes instead of days or weeks. None of the approaches outlined above comes close to meeting this requirement; in fact, they perpetuate information security and governance failures.

Born and bred to address global eDiscovery challenges, the X1 Enterprise platform (X1E) represents a unique approach to dark data by enabling enterprises to quickly and easily search across multiple distributed endpoints and data servers in place through a true distributed, parallelized computing architecture. Legal, security, and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes instead of days or weeks. With X1E, organizations can also automatically migrate, collect, or take other action on the data as a result of the search parameters. Built on our award-winning and patented X1 Search technology, X1E is the first product to offer true and massively scalable distributed searching that is executed in its entirety on the end-node computers for data audits across an organization. This game-changing capability vastly reduces costs while greatly mitigating risk and disruption to operations.
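
The general idea of searching in place, where each node searches its own data and only small result sets travel back, can be sketched as follows. This is a simplified simulation under stated assumptions, not X1E’s actual implementation: threads stand in for remote endpoints, and `NODES`, `search_node`, and `search_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for data that lives on each endpoint.
NODES = {
    "laptop-01": {"notes.txt": "quarterly forecast draft", "hr.txt": "employee SSN list"},
    "fileshare-02": {"plan.txt": "merger forecast and timeline"},
    "laptop-03": {"todo.txt": "renew license"},
}


def search_node(node: str, term: str) -> tuple[str, list[str]]:
    """Simulates a search run where the data lives; returns only matching file names."""
    docs = NODES[node]
    hits = sorted(name for name, text in docs.items() if term.lower() in text.lower())
    return node, hits


def search_all(term: str) -> dict[str, list[str]]:
    """Fan the query out to every node in parallel and merge the small result sets."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda n: search_node(n, term), NODES))
    return {node: hits for node, hits in results if hits}


print(search_all("forecast"))
```

The design point is that the document contents never leave the node; the coordinator only sees which files matched.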


Filed under CaCPA, Cyber security, eDiscovery & Compliance, GDPR, Information Governance, Information Management

Architecting a New Paradigm in Legal Governance

By Michael Rasmussen

Editor’s note: Today we are featuring a guest blog post from Michael Rasmussen, the GRC Pundit & Analyst at GRC 20/20 Research, LLC.

Exponential growth and change in business strategy, risks, regulations, globalization, distributed operations, competitive velocity, technology, and business data encumber organizations of all sizes. Gone are the years of simplicity in business operations.

Managing the complexity of business from a legal and privacy perspective, governing information that is pervasive throughout the organization, and keeping continuous business and legal change in sync is a significant challenge for boards and executives, as well as for the legal professionals in the legal department. Organizations need an integrated strategy, process, information, and technology architecture to govern legal, meet legal commitments, and manage legal uncertainty and risk in a way that is efficient, effective, and agile and extends into the broader enterprise GRC architecture.

In my previous blog, Operationalizing GRC in Context of Legal & Privacy: The Last Mile of GRC, I began this discussion, and here I aim to expound on it further from a legal context.

Legal today is more than legal matters, actions, and contracts. Today’s legal organization has to respond to incident/breach reporting and notification laws in a timely and compliant manner, respond to Data Subject Access Requests (DSARs), harmonize and monitor retention obligations, conduct eDiscovery, manage legal holds on data, and continuously monitor regulations and legislation and apply them to a business context.

In today’s global business environment, a broad spectrum of economic, political, social, legal, and regulatory changes is continually bombarding the organization. The organization continues to see exponential growth of regulatory requirements and legal obligations (often conflicting and overlapping) that must be met, which multiply as the organization expands global operations, products, and services. This requires an integrated approach to legal governance, risk management, and compliance (GRC) with a goal to reliably achieve objectives while addressing uncertainty and acting with integrity.[1] This includes adherence to mandatory legal requirements and voluntary organizational values and the boundaries each organization establishes. The legal department, with responsibility for understanding matter management, issue identification, investigations, policy management, reporting and filing, legal risk, and the regulatory obligations faced by the organization, is a critical player in GRC (what is understood as Enterprise or Integrated GRC), as well as in improving GRC within the legal function itself.

A successful legal management information architecture will be able to connect information across risk management and business systems. This requires a robust and adaptable legal information architecture that can model the complexity of legal information, discovery, transactions, interactions, relationships, cause and effect, and the analysis of information, and that can integrate and manage a range of business systems and external data. Key to this information architecture is a clear data inventory and map of information that informs the organization of what data it has, who in the organization owns it, what regulatory retention obligations are attached to it, and what third parties have access to it. This is a fundamental requirement for applying process and effectively operationalizing an organization’s GRC activities, as detailed in the previous blog.

There can and should be an integrated technology architecture that extends GRC technology and operationalizes it in a legal and privacy context. This connects the fabric of the legal processes, information, discovery, and other technologies together across the organization. This is a hub of operationalizing GRC and requires that it be able to integrate and connect with a variety of other business systems, such as specialized legal discovery solutions and integrate with broader enterprise GRC technology.

The right technology architecture choice for an organization involves the integration of several components into a core enterprise GRC and Legal GRC architecture, which can facilitate the integration and correlation of legal information, discovery, analytics, and reporting. Organizations suffer when they take a myopic view of GRC technology that fails to connect all the dots and provide context to discovery, business analytics, objectives, and strategy in real time, as the business operates.

Extending and operationalizing GRC processes and technology in context of legal and privacy enables the organization to use its resources wisely to prevent undesirable outcomes and maximize advantages while striving to achieve its objectives. A key focus is to provide legal assurance that processes are designed to mitigate the most significant legal issues and are operating as designed. Effective management of legal risk and exposure is critical to the board and executive management, who need a reliable way to provide assurance to stakeholders that the enterprise plans to both preserve and create value. Mature GRC enables the organization to weigh multiple inputs from both internal and external contexts and use a variety of methods to analyze legal risk and provide analytics and modeling.


[1] This is the OCEG definition of GRC.


Filed under Best Practices, CaCPA, eDiscovery & Compliance, GDPR, Information Governance, Information Management, Uncategorized

Operationalizing GRC in Context of Legal & Privacy: The Last Mile of GRC

By Michael Rasmussen

Editor’s note: Today we are featuring a guest blog post from Michael Rasmussen, the GRC Pundit & Analyst at GRC 20/20 Research, LLC.

At its core, GRC is the capability to reliably achieve objectives [GOVERNANCE], address uncertainty [RISK MANAGEMENT], and act with integrity [COMPLIANCE]. GRC is something organizations do, not something they purchase. They govern, they manage risk, and they comply with obligations. However, there is technology to enable GRC related processes, such as legal and privacy, to be more efficient, effective, and agile.

However, too often the focus on GRC technology is limited to the process management of forms, workflow, tasks, and reporting. These are critical and important elements, but the role of technology for GRC is much broader: it should operationalize GRC activities that are labor intensive, particularly in the context of legal and privacy. Simply managing forms, workflow, and tasks is no longer enough. Organizations need to start thinking about how they can integrate eDiscovery and data/information governance solutions within their core GRC architecture.

What is needed is the ability to search, find, monitor, interact, and control data throughout the business environment. GRC platforms are excellent at managing forms, workflow, tasks, analytics, and reporting. But behind the scenes there are still labor-intensive tasks or disconnected solutions that actually find, control, and assess the disposition of sensitive data in the enterprise. eDiscovery and information governance solutions have been disconnected and not strategically leveraged for GRC purposes. Together, the core GRC platform that integrates with eDiscovery and information governance technologies builds exponential economies in efficiency, effectiveness, and agility.

Specifically, an integrated GRC solution that weds the core GRC platform with eDiscovery and information governance technology delivers full value to an organization that:

  • Discovers the attributes and metadata of data no matter where it lives within the environment as a key component of GRC processes for legal and privacy compliance.
  • Enables 360° awareness to assessments by discovering the information needed to conduct and deliver assessments effectively into the core GRC platform.
  • Delivers a centralized console to interact with data/information and metadata of files on devices across the organization (such as network file shares, OneDrive, and Dropbox data).
  • Automates the ability to interact with downstream endpoints/systems to provide the ability to search the content of records for keywords and perform analysis using regular expressions and classifiers.
  • Controls data wherever it is with the ability to get to the data and analyze it from a centralized console.

An integrated approach that brings together the core GRC platform with eDiscovery and information governance technology enables the organization to discover, manage, monitor, and control data right from the central GRC platform console. It enables the organization to get centralized and accessible insight into where sensitive information is, how it is being used, and what can be done with it.

  • For example, within the GRC platform I can initiate a search based on keywords or patterns (e.g., Social Security numbers). The eDiscovery/information governance solution then finds where that information is throughout the enterprise and delivers a list of records back to the GRC platform for analysis and monitoring.
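
A pattern search of this kind can be sketched in a few lines. This is a minimal illustration, assuming a simplified Social Security number shape and an in-memory set of records. The `SSN_PATTERN`, `find_pattern_hits`, and sample `records` names are hypothetical and do not represent any particular GRC or eDiscovery product’s API.

```python
import re

# Simplified US SSN shape (AAA-GG-SSSS). Real classifiers add validation and
# context checks to reduce false positives.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_pattern_hits(records: dict[str, str], pattern: re.Pattern) -> list[dict]:
    """Return the locations that contain the pattern, with match counts only."""
    hits = []
    for location, text in records.items():
        count = len(pattern.findall(text))
        if count:
            hits.append({"location": location, "matches": count})
    return hits


# Hypothetical records keyed by where they live.
records = {
    "fileshare/hr/onboarding.txt": "Employee SSN: 123-45-6789, spouse 987-65-4321",
    "onedrive/sales/q3.txt": "Call 555-123-4567 about the renewal",
    "laptop-07/notes.txt": "No sensitive content here",
}

hits = find_pattern_hits(records, SSN_PATTERN)
print(hits)
```

Returning locations and counts rather than the matched values keeps the sensitive data itself out of the report that flows back to the central console.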

This enables an integrated GRC architecture that brings 360° contextual awareness into information across the enterprise. It delivers enhanced efficiency by saving the time and money otherwise spent chasing information through disconnected solutions and processes. It provides greater effectiveness through insight into and control of information, and it enables greater agility across a dynamic environment so the organization can respond to information governance issues. Together, a GRC platform with eDiscovery/information governance capabilities delivers more complete and accurate data governance and privacy assessments and integrated findings, with the ability to manage remediation tasks from one central place.


Filed under Best Practices, CaCPA, Data Audit, eDiscovery & Compliance, GDPR, Information Governance, Information Management