Tag Archives: Index-in-place

The Three Different eDiscovery Approaches to Address Microsoft 365 Data

By John Patzakis

Microsoft reports 345 million paid users worldwide of its Microsoft 365 platform (“M365”), spanning over two million companies, with more than one million of them based in the United States. M365’s cloud-based data sources such as OneDrive, Outlook mail, Teams and SharePoint online represent arguably the majority of ESI being produced in litigation going forward. However, M365 presents significant eDiscovery challenges and costs, requiring legal and eDiscovery professionals to be aware of the various methods to address this critical data source.

This article briefly addresses the benefits and challenges of each of the three main approaches to addressing eDiscovery and information governance in M365: 1) utilizing Microsoft Purview; 2) outsourced services; or 3) relying on a third-party, purpose-built eDiscovery solution.

Microsoft Purview
Microsoft Purview is the built-in M365 eDiscovery tool. It comes in different licensing tiers, the highest and most useful being Premium, also known as E5 licensing. A key benefit of utilizing Purview Premium is that it’s integrated with M365, which is convenient for both workflow and budgeting. Purview features a good legal hold process that allows legal holds to be applied in place for key M365 data sources.

There is also a good consultant ecosystem to provide training and add-on services, which are often needed, at extra cost, to address larger projects. A Premium license also provides functionality unrelated to eDiscovery, such as business data analytics and numerous security functions.

As for the challenges of Purview Premium that we hear from users, a common complaint is that it can be very expensive, with licenses costing about $600 per employee annually. For large cases, licenses for several thousand custodians run into the millions of dollars, and well into the tens of millions for a company with about 40,000 employees.
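The licensing figures above can be checked with simple arithmetic. The sketch below uses the approximate $600 annual per-seat price quoted in this article; the 5,000-custodian case is a hypothetical stand-in for "several thousand," and the function name is illustrative only.

```python
# Illustrative arithmetic only; the per-seat price is the approximate figure quoted above.
ANNUAL_COST_PER_LICENSE_USD = 600


def annual_license_cost(licensed_users: int,
                        cost_per_license: int = ANNUAL_COST_PER_LICENSE_USD) -> int:
    """Return the yearly licensing spend for a given number of licensed users."""
    return licensed_users * cost_per_license


# 5,000 is a hypothetical "several thousand" custodians; 40,000 is the company size cited above.
for users in (5_000, 40_000):
    print(f"{users:>6,} licenses -> ${annual_license_cost(users):,} per year")
```

Under these assumptions, 5,000 licenses come to $3 million per year and 40,000 licenses come to $24 million per year.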

But the biggest complaint that we hear is that it is not suited for large cases. M365 is built for user productivity, and its shared architecture is designed to support hundreds of millions of global users with normal individual workloads. eDiscovery and information governance projects are very large, aberrant workloads, so the system is designed to throttle large data throughputs. For instance, when you start a case in Purview, a separate, new index is created to allow eDiscovery and compliance searches, but according to Microsoft’s own documentation there is a 2 GB hourly limit when creating this index, which limits your ability to address larger cases in a timely manner. There are many documented concerns about the accuracy and transparency of search results and data exports, especially as cases get bigger and there are more custodians with higher volumes. Also, attachments over 150 MB are not supported, nor are many file types, such as engineering files like CAD drawings. Microsoft supports only 50 file types, while the right eDiscovery software will support over 500.

These search accuracy and throughput limitations were called out by Special Master Phillip Favro in Deal Genius, LLC v. O2COOL, LLC, No. 21-C-2046, 2022 WL 17418933, at *1–2 (N.D. Ill. Oct. 24, 2022), and further expounded upon by Favro in his recent technical whitepaper:

“Purview eDiscovery does not provide the advanced features offered by a full service e-discovery platform needed to support discovery efforts in complex cases such as multidistrict litigation and class actions or regulatory investigations like Hart-Scott-Rodino Second Requests. Even small lawsuits that involve high volumes of ESI can present difficulties for organizations that wish to manage much of their discovery process with Purview eDiscovery. Responding parties that rely on Purview eDiscovery may not be able to perform a comprehensive search to reasonably identify relevant information. Responding parties who wish to incorporate Purview eDiscovery functionality into their discovery workflows must understand its search limitations and take steps to address them so they can establish the defensibility of their discovery process.” “Microsoft Purview eDiscovery: Key Features and Limitations,” Practical Law (July 2024).

Finally, Purview only addresses data within M365. It does not address data sources such as Slack, or on-premises sources including laptops, file shares, and even on-premises Exchange or SharePoint.

Outsourced Services
The second approach to addressing M365 for eDiscovery is to retain an outsourced service provider. There are well over 100 consulting firms that perform such services, and the main benefit is that the right consultants can get the job done. The consultants know how to export M365 data into a standard eDiscovery workflow, are very good at project management, and are well versed in working with attorneys and their litigation deadlines. For smaller companies that lack internal resources or expertise, or that have backlogs, this can be a good approach.

The main drawback is that it can be very expensive, because service providers often parachute in and run very basic scripts to conduct a mass data export from M365. After that, the project defaults to a traditional eDiscovery workflow with processing tools, a lot of manual services, and then an upload to a standard review platform. This reactive approach results in a large amount of expensive data over-collection. Additionally, outsourced service providers typically require very high-level, super-admin privileges in order to run their bulk data download scripts, which can be a significant security concern. These privileges are sometimes delegated without the company’s knowledge, so it is important to be aware of and audit the privileges that are being granted.

Also, we have seen that for large eDiscovery collection projects in Europe, EU-based companies are required to perform a data protection impact assessment (DPIA), and bulk collections that involve copying all of the employees’ emails and other sensitive files and taking that data offsite are frowned upon by privacy auditors. That approach runs afoul of the GDPR’s proportionality and data minimization requirements.

Third Party eDiscovery Software Solution
And finally, a third approach is utilizing a non-Microsoft eDiscovery solution that’s purpose-built to conduct eDiscovery, including by connecting to M365. A benefit of this approach is that the right solution can scale for larger data sets. This is particularly important for information governance projects such as data compliance audits. The good solutions will not require expensive Purview Premium licensing for every custodian and will enable you to employ them as an established and repeatable process. They can also address the indexing throughput and completeness challenges in Purview. And finally, a platform like this should be able to support data outside of M365, such as on-premises sources or Slack.

One of the challenges of an in-house system is that internal IT resources or tech-savvy paralegals are needed to run the process. Some technology platforms still require you to have the most expensive Purview Premium licensing to support essential functionality, such as collection of hyperlinked documents, and other key features. Further, many of these vendors are simply providing repurposed email archiving platforms, which function by a mass copy and transfer of all the organization’s data in M365. This poses significant logistical challenges in terms of scalability, not to mention unnecessary cost. M365 does not easily allow for mass data download, which can lead to errors and data corruption, as in the recent case of FTC v. Match Group, No. 3:19-CV-2281-K, 2025 WL 46024, at *4 (N.D. Tex. Jan. 7, 2025), where MS Purview exports to an email archival system failed, resulting in court-imposed discovery sanctions. So, if the solution does not offer index-in-place functionality and instead relies on bulk download, copy and data transfer, there can be significant challenges with that approach.

The X1 Enterprise platform for M365 and on-premises sources takes a unique approach with a micro-indexing architecture in which each data source and each custodian is associated with its own index. This enables a true index-in-place capability for targeted search and analytics at the point of collection, which bypasses most of the M365 throttling issues so that hundreds of custodians can be addressed in hours, not weeks. Our customers have successfully addressed matters involving thousands of custodians and upwards of 80 terabytes of M365 data that was indexed in a very short period of time. X1 Enterprise does not require Purview Premium licensing to address all the required functionality, such as the search and collection of hyperlinked files, archived email, inactive mailboxes, as well as many other detailed requirements.

Simply put, we believe X1 Enterprise is the best solution available to address M365 data for eDiscovery and information governance requirements.

Ready to Learn More?
Organizations that navigate complex information governance and eDiscovery requirements, including those involving M365, and that rely on the X1 Enterprise Platform not only reduce costs and save valuable time but also gain a strategic advantage in managing those needs. For a demonstration of the X1 Enterprise Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/solutions/x1-enterprise-platform.

Filed under Best Practices, Cloud Data, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, Information Governance, law firm, m365, Preservation & Collection

True Index-in-Place Capability for Global Enterprise eDiscovery and Information Governance Only Possible with Distributed Micro-Indexing Architecture

By John Patzakis and Chas Meier

As legal and compliance teams grapple with exponential data growth, the need for faster, more efficient eDiscovery has never been greater. One key trend emerging from the 2025 State of Industry Report by eDiscovery Today is the growing demand for in-place indexing, with 15.5% of respondents citing it as a critical priority. But achieving true ‘index-in-place’, without bulk data transfers or excessive infrastructure costs, requires a fundamentally different architecture: distributed micro-indexing.

Unlike traditional eDiscovery tools that rely on centralized crawling and bulk data transfers, X1 Enterprise’s distributed micro-indexing architecture allows organizations to search, analyze, and collect data directly at the source—without moving vast amounts of information to a separate processing environment. This means faster insights, lower costs, and reduced security risks.

However, with this capability being highly valued, many vendors have parroted this messaging but have offerings that do not qualify as true index-in-place. Unlike traditional enterprise search or eDiscovery platforms that rely on centralized indexing (e.g., crawling, copying, and transferring all the data into a single repository), X1’s micro-indexing distributes the workload. It creates small, efficient indexes at the data source (whether a user’s laptop, email server, or a cloud source such as Microsoft 365) and unifies search results on demand. Transferring data in bulk to a central appliance or server farm via a crawling agent or Robocopy function does not qualify. A true index-in-place using distributed micro-indexes uniquely enables scalability and targeted collection while minimizing security and data governance risks in eDiscovery and information governance matters.
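To make the distinction concrete, here is a minimal Python sketch of the general idea of per-source indexes with on-demand result unification. It is not X1’s implementation, which is proprietary; the `MicroIndex` class, the `federated_search` function, and the source names are all hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict


class MicroIndex:
    """A small inverted index that lives alongside a single data source."""

    def __init__(self, source_name):
        self.source_name = source_name
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def add(self, doc_id, text):
        """Index one document where it resides; the text itself is not stored or moved."""
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[term].add(doc_id)

    def search_all(self, terms):
        """Return ids of documents that contain every term (Boolean AND)."""
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        return set.intersection(*sets) if sets else set()


def federated_search(indexes, terms):
    """Query each micro-index separately and merge only the hits."""
    hits = []
    for index in indexes:
        for doc_id in sorted(index.search_all(terms)):
            hits.append((index.source_name, doc_id))
    return hits


# Hypothetical sources: one laptop and one mailbox, each with its own index.
laptop = MicroIndex("laptop:jdoe")
laptop.add("notes.txt", "Pricing memo for the merger")

mailbox = MicroIndex("m365-mail:jdoe")
mailbox.add("msg-001", "Re: merger pricing call")
mailbox.add("msg-002", "Lunch on Friday?")

print(federated_search([laptop, mailbox], ["merger", "pricing"]))
```

In this toy version only document identifiers cross the boundary between a source and the search interface. A centralized design would instead copy every document to one repository before any query could run.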

Earlier this year, a Fortune 500 company faced a massive eDiscovery and GDPR compliance challenge: indexing and searching over 70 terabytes of data across Microsoft 365 and on-premises sources—all without disrupting operations. With X1 Enterprise, they accomplished this in just a few weeks—a feat impossible with traditional solutions that rely on slow, centralized processing.

X1’s unique approach is based upon distributed, micro-indexing search and collection capabilities. Below are the top ten benefits of this architecture tailored to eDiscovery and enterprise data governance and how it differs from alternative approaches.

  1. Rapid, In-Place Data Identification: Legal teams can locate relevant documents across endpoints, cloud sources, and network drives instantly—without waiting for slow, centralized crawls. X1’s micro-indexing creates lightweight, decentralized indexes at the endpoint level (e.g., individual laptops, servers, or cloud accounts).
  2. Real-Time Search Across Distributed Systems: Execute complex, Boolean-rich searches across terabytes of data in Microsoft 365, OneDrive, SharePoint, and beyond. X1 enables real-time, federated searches from a single interface across multiple data sources (e.g., Microsoft 365, local drives, email archives) totaling up to hundreds of terabytes, leveraging micro-indexes updated at the source.
  3. Minimized Over-Collection Risks: X1’s micro-indexing allows precise targeting of relevant data, minimizing the need to collect entire datasets for review. X1’s granular indexing supports instantaneous keyword searches and metadata filtering at the source.
  4. Lower eDiscovery Costs: By eliminating the need to transfer and reprocess massive datasets, X1 slashes infrastructure and vendor fees. By indexing and searching data in-place (without moving it to a central repository), X1 nearly eliminates reliance on third-party processing tools and expensive manual services, with dramatically reduced time to review.
  5. Optimized M365 eDiscovery Support: Avoids Microsoft Purview throttling, supports modern attachments, and enables cost-effective, high-speed data access. Each custodian is assigned an individual micro-index which enables X1 to achieve unmatched throughput, support modern attachments without premium licensing, address inactive mailboxes and more.
  6. Massive Scalability: X1’s micro-indexing distributes the workload on a parallelized basis, allowing the indexing and searching of hundreds of terabytes of data in-place at speeds not seen before in the enterprise eDiscovery and information governance industry. Micro-indexes are updated incrementally and in real-time as new data comes in, rather than requiring batch copying and re-indexing of an entire corpus.
  7. Support for Remote and Hybrid Workforces: X1’s endpoint indexing works seamlessly on distributed devices, ensuring data from remote employees or cloud platforms is readily accessible without requiring physical access.
  8. Proactive Compliance & Risk Monitoring: Instantly identify PII, unencrypted sensitive files, and policy violations across the enterprise. With micro-indexes updated in real-time, X1 allows organizations to monitor for policy violations (e.g., PII exposure, unencrypted sensitive files) across endpoints, fileshares and M365 accounts instantly.
  9. In-Place Remediation and Governance: As the data remains in place, remediation is effectively and accurately applied at scale. This contrasts with other “copy and move” processes that are merely working off-site with copies of your data, rendering effective remediation efforts extremely costly and burdensome, if not impossible.
  10. Data Minimization and GDPR Compliance: X1’s capabilities directly map to the GDPR’s proportionality and data minimization requirements. In contrast, tools that require full-disk imaging or bulk copy and transfer for basic eDiscovery collection are extremely problematic.

Conclusion
For legal, compliance, and IT teams struggling with slow, expensive, and inefficient eDiscovery workflows, distributed micro-indexing is the future. X1 Enterprise’s unique in-place search ensures rapid results, reduced costs, and ironclad compliance—without moving or duplicating sensitive data. If your organization relies on Microsoft 365, remote workforces, or high-volume data environments, X1 provides the speed, scalability, and security you need.

Ready to Learn More?
Discover how X1 Enterprise can revolutionize your eDiscovery and compliance strategy. Schedule a demo today at sales@x1.com or visit www.x1.com/solutions/x1-enterprise-platform.

Filed under Best Practices, Case Study, Cloud Data, Corporations, Data Audit, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, Information Governance, Preservation & Collection

X1 Achieves Unmatched Throughput and Results in Several Recent M365 eDiscovery and Information Governance Engagements

By John Patzakis and Chas Meier

As discussed previously on this blog, X1 and our active enterprise customers believe X1 Enterprise Collect is the best solution available to address M365 data sources as well as on-premises sources such as laptops and file shares. In recent weeks, our customers and partners have executed several projects on a massive scale and have captured and documented X1’s performance metrics.

No other solution in the industry, including Microsoft Purview Premium, can index data across the enterprise as quickly or scale as well as the X1 Enterprise platform. Compared to Microsoft Purview, with its built-in architectural constraints and throttling limitations, X1 can index nearly eight times the daily volume that Purview or any other competitive “connector” technology in the market can achieve. X1’s distributed index-in-place methodology, combined with horizontal scaling of our index hosts, makes X1 the only solution truly capable of handling the rapid indexing, identification, searching and collection or remediation of mass data sets in the terabytes or petabytes across the modern enterprise. X1 effectively addresses cloud and on-premises data sources in a unified manner, including distributed endpoints, network file shares, M365 data sources including Mail, OneDrive, Teams, and SharePoint, as well as other cloud data sources.

In several recent large-scale eDiscovery and information governance projects, X1 Enterprise Collect, on average, was able to collect and index M365 data (MS Mail [including archived mail and modern attachments], Teams, OneDrive and SharePoint) at a rate of approximately 350 GB per day. This is nearly 8 times faster than Microsoft Purview, with its documented throughput limitation of 2 GB per hour. X1 can achieve even faster throughput by scaling out virtual cloud computing resources.
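For readers who want to see how the two rates compare, the following sketch works through the arithmetic. It assumes the 2 GB hourly limit is sustained around the clock (48 GB per day), which is a simplification, and it uses a hypothetical 10,000 GB matter purely for scale; neither assumption comes from the engagements described here.

```python
PURVIEW_GB_PER_HOUR = 2  # hourly indexing limit cited in this post
X1_GB_PER_DAY = 350      # average daily rate reported in this post


def purview_gb_per_day(hours_per_day: float = 24) -> float:
    """Daily ceiling implied by the hourly limit, assuming it is sustained all day."""
    return PURVIEW_GB_PER_HOUR * hours_per_day


def days_to_index(total_gb: float, gb_per_day: float) -> float:
    """Days needed to index a data set at a constant daily rate."""
    return total_gb / gb_per_day


ratio = X1_GB_PER_DAY / purview_gb_per_day()
print(f"Throughput ratio: {ratio:.1f}x")

HYPOTHETICAL_MATTER_GB = 10_000  # assumed size, for illustration only
print(f"At 48 GB/day:  {days_to_index(HYPOTHETICAL_MATTER_GB, purview_gb_per_day()):.1f} days")
print(f"At 350 GB/day: {days_to_index(HYPOTHETICAL_MATTER_GB, X1_GB_PER_DAY):.1f} days")
```

Under the round-the-clock assumption the ratio works out to about 7.3. A matter of the assumed size would take roughly 208 days at 48 GB per day versus roughly 29 days at 350 GB per day.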

Daily indexing volumes for endpoints and on-premises file shares vary due to the performance characteristics of each machine, but X1 indexes and searches endpoints in parallel, yielding extremely high aggregate daily indexing and collection throughput.

Detailed documentation on these metrics and a further briefing on these engagements can be provided upon request.

X1 achieves such scalability through a decentralized approach that does not rely on the M365 or Purview search index, which has known issues with the number of file types supported, consistency of search results, accuracy, and throughput. X1’s approach enables very scalable, accurate, defensible, and robust indexing and data collection at unmatched speeds.

In addition to greatly reducing risk, X1’s capabilities also enable massive cost savings. X1 Enterprise Collect significantly streamlines the eDiscovery workflow by bringing targeted collection results directly into the review platform, thereby eliminating over-collection, over-processing, and over-importing just to cull. X1 will populate ESI (Electronically Stored Information) straight into Relativity from an X1 collection without multiple handoffs, extensive project management and inefficient data processing.

The ability to collect data directly and transparently from custodian laptops, desktops, M365 and other cloud sources into a RelativityOne/Relativity workspace is a game-changer that enables legal and compliance teams to begin review in hours rather than weeks. As facts become known and collection focus changes, X1 allows teams to pivot and respond in hours. With the ability to efficiently take multiple bites of the apple, X1 enables teams to start fast and stay agile.

For a demonstration of the X1 Enterprise Collect Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/x1-enterprise-collect-platform.

Filed under Best Practices, Cloud Data, Corporations, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, Information Governance, MS Teams, OneDrive, Preservation & Collection, SharePoint

Index and Search In-Place Workflows Are Essential for Information Governance

By John Patzakis and Charles Meier

Accurate pre-collection data insight is a game-changing capability that enables organizations and their legal teams to determine the scope, volume, and content of electronic information before the very disruptive and expensive step of collecting the data. This insight is enabled through distributed index and search in-place technology.

A true distributed index and search in-place capability for unstructured data requires a software-based indexing technology be deployed directly onto fileservers, laptops, or in the cloud to address Microsoft 365 and other cloud-based data sources. This indexing occurs where the data sources reside without requiring a bulk transfer of the data to a central location. Once indexed, searches can be performed in seconds, supporting complex Boolean operators, metadata filters and regular expressions. Searches can be iterated and refined without limitation, which is critical for large data sets.
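As a rough illustration of the kinds of criteria described above, the following Python sketch combines Boolean terms, metadata filters and a regular expression in one query. It is not X1’s query engine: the `Doc` record, the `matches` and `search` helpers, and the sample documents are hypothetical, a linear scan stands in for a real index lookup, and the SSN-shaped pattern is deliberately simplistic.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    custodian: str
    modified: date
    text: str


# Simplistic pattern for strings shaped like a US Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def matches(doc, all_terms=(), none_terms=(), custodian=None,
            modified_after=None, pattern=None):
    """Apply Boolean AND/NOT terms, metadata filters and an optional regex to one document."""
    words = set(re.findall(r"[a-z0-9]+", doc.text.lower()))
    if not all(t.lower() in words for t in all_terms):
        return False  # Boolean AND: every required term must appear
    if any(t.lower() in words for t in none_terms):
        return False  # Boolean NOT: no excluded term may appear
    if custodian is not None and doc.custodian != custodian:
        return False  # metadata filter on custodian
    if modified_after is not None and doc.modified <= modified_after:
        return False  # metadata filter on last-modified date
    if pattern is not None and not pattern.search(doc.text):
        return False  # regular expression filter
    return True


def search(docs, **criteria):
    """Return the ids of documents that satisfy every supplied criterion."""
    return [d.doc_id for d in docs if matches(d, **criteria)]


docs = [
    Doc("a.docx", "jdoe", date(2024, 3, 1), "Employee SSN 123-45-6789 on file"),
    Doc("b.msg", "jdoe", date(2023, 1, 5), "Draft contract, SSN 987-65-4321"),
    Doc("c.txt", "asmith", date(2024, 6, 1), "Contract renewal terms"),
]

print(search(docs, all_terms=("ssn",), none_terms=("draft",), custodian="jdoe",
             modified_after=date(2024, 1, 1), pattern=SSN_PATTERN))
```

Because each call returns only identifiers, the criteria can be tightened or loosened and the query rerun as often as needed, which mirrors the iterative refinement described above.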

While our previous blog post addressed the critical importance of this capability in eDiscovery matters, it is equally essential in information governance projects such as PII audits, the purging of redundant, obsolete or trivial (ROT) data, and due diligence and data separation efforts in support of corporate mergers and acquisitions. Many X1 customers have recently employed our indexing in-place technology on such projects with remarkable success.

Incredibly, many of these customers also received alternative proposals that leverage traditional eDiscovery workflows presenting much higher estimated costs and much longer durations. Traditional eDiscovery workflows mandate broad and manual data collection, copying and migration efforts, large scale data processing, and loading the data into a different platform for review and analysis. There are three fundamental reasons why this “traditional approach” is fatally flawed for information governance projects.

  1. Prohibitive Cost and Risk. The data scope of information governance projects involves terabytes and sometimes petabytes of data. Mass collection, copying and migration of these data sets with manual hand-offs for later analysis in a centralized location is extremely expensive, disruptive, and time consuming. Also, mass duplication and egress of enterprise data under control to execute ROT, PII, data separation or other due diligence projects is completely antithetical to their very purpose.
  2. The “Now What?” Problem. Let’s assume an organization has decided to incur the enormous cost, disruption and risk associated with the mass copying, migration, and centralization of unstructured data, and after loading the data into a review process, a key subset of documents and emails are finally identified for purging or other remedial action. Now what? You are merely working with copies! The live “original” emails and documents are in M365, email accounts, file servers or on laptops. It is possible to manually retrace and remediate, but that process is expensive and disruptive.
  3. Instant Staleness. Finally, a mass copying and migration effort, often requiring several weeks to complete, is stale the moment it is completed, as the live data in its original location has inevitably changed.

X1 solves these challenges through our proprietary and patented distributed index and search in-place technology that enables scale by bringing true distributed indexing in-place to laptops, file shares, M365 and other cloud sources. X1 Enterprise Collect significantly streamlines information governance workflows by identifying and allowing for the remediation of targeted data in-place, thereby eliminating the need for expensive and cumbersome data duplication and migration.

For a demonstration of the X1 Enterprise Collect Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/x1-enterprise-collect-platform.

Filed under Cloud Data, compliance, Corporations, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, Information Governance, law firm, Preservation & Collection

Index-In-Place eDiscovery Tech is in High Demand, but Beware of False Vendor Claims

By John Patzakis

Proportionality-based eDiscovery is a goal that all in-house corporate legal teams want to attain. Under Federal Rule of Civil Procedure 26(b)(1), parties may discover any non-privileged material that is relevant to any party’s claim or defense and proportional to the needs of the case. However, most core eDiscovery costs (outside of attorney review) stem from over-collection of electronically stored information (ESI), and over-collection thwarts the ability to attain proportionality. Law firm Nelson Mullins notes that “over preservation tends to have its own costs relating to storage of large amounts of electronically stored information (ESI) and the resources needed to manage it; leads to increased downstream e-discovery costs associated with collection, processing, and review.”

This is why accurate pre-collection data insight is a game-changing capability that enables counsel to set reasonable discovery limits and ultimately process, host, review and produce much less ESI. Counsel can further use pre-collection proportionality analysis to gather key information, develop a litigation budget, and better manage litigation deadlines. Such insights can also foster cooperation by informing the parties early in the process about where relevant ESI is located, and what keywords and other search parameters can identify and pinpoint relevant ESI.

And the means to enable this capability is distributed index and search in-place technology. Indexing and search in-place in this context means that a software-based indexing technology is deployed directly onto fileservers, laptops, or in the cloud to address cloud-based data sources. This indexing occurs without a bulk transfer of the data to a central location. Once indexed, the searches are performed in a few seconds, with complex Boolean operators, metadata filters and regular expression searches. The searches can be iterated and repeated without limitation, which is critical for large data sets.

However, with this capability being highly valued, many vendors have parroted this messaging but have offerings that do not qualify as true index-in-place. True distributed index-in-place means that the search indexes are forward-deployed and are actually installed on the target laptop, Mac computer, or fileserver, or in the cloud near where the target cloud data sources exist. Transferring data in bulk to a central appliance or server farm via a collector agent or Robocopy function does not qualify. A true index-in-place capability uniquely enables scalability and targeted collection while minimizing security and data governance risks in eDiscovery and information governance matters.

Conversely, a process requiring massive data copying, migration and centralization does not scale and creates significant data governance and privacy issues by needlessly duplicating data. For instance, if a matter requires that 10 terabytes be scanned to determine if relevant ESI exists within that data corpus, and the eDiscovery collection platform being used has no index-in-place capability, then all 10 terabytes must be copied and transferred to the tool for indexing and analysis. These limitations stem from tool vendors simply utilizing open-source indexing platforms like Lucene or Elasticsearch that are not forward-deployable and must reside in centralized locations with a very large amount of computing resources to make them viable for the type of data and data volumes typically seen in discovery and information governance matters.

This is why X1 leverages proprietary and patented index and search technology that is readily forward-deployable and thus can scale and allow true distributed indexing in-place. X1 Enterprise Collect significantly streamlines the eDiscovery workflow with integrated culling and deduplication, thereby eliminating the need for expensive and cumbersome ESI processing tools. That way, the ESI can be populated straight into Relativity from an X1 collection without multiple handoffs, extensive project management and inefficient data processing.

The ability to directly and transparently collect data from custodian laptops, desktops, Microsoft 365 and other cloud sources into a RelativityOne/Relativity workspace is a game-changer that enables attorneys to begin review in hours rather than weeks.

For a demonstration of the X1 Enterprise Collect Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/x1-enterprise-collect-platform.

Filed under Best Practices, Cloud Data, Corporations, ECA, eDiscovery, Enterprise eDiscovery, ESI, law firm, Preservation & Collection, proportionality