X1 Rapid Discovery: First Enterprise eDiscovery Solution Supporting IaaS Cloud

Today I am pleased to announce our launch of X1 Rapid Discovery, version 4. X1RD is a proven and now truly cloud-deployable eDiscovery and enterprise search solution enabling our customers to quickly identify, search, and collect distributed data wherever it resides, in the Infrastructure as a Service (IaaS) cloud or within the enterprise. X1RD is a sister product to our acclaimed X1 Social Discovery, which we launched last year. Version 4 of X1 Rapid Discovery remains a proven early case assessment and enterprise search application, but is now IaaS cloud deployable and features a new interface.

I know what you may be thinking — another eDiscovery CEO re-branding the company’s software as cloud. But hear me out on this. Sure, X1RD can serve as a hosted SaaS solution like many other tools (SaaS hosting has been around for over a decade), but the big news here is that X1RD is now deployable anywhere, anytime in the IaaS cloud within minutes. X1RD also features the ability to leverage the parallel processing power of the cloud to scale up and scale down as needed. In fact, X1RD is the first pure eDiscovery solution (not including a hosted email archive tool) to meet the technical requirements and be accepted into the Amazon AWS ISV program.

So what does this mean? Allow me to illustrate these ground-breaking capabilities through the following two increasingly common scenarios faced by organizations today:

Scenario 1: A F1000 company maintains 2 terabytes of data up in the Amazon EC2 or S3 (storage) cloud and suddenly must find the comparatively small amount of relevant data within those 2TB as quickly as possible to respond to a critical investigation requirement. There is no time to spend several weeks downloading the entire 2TB out of the cloud through the thin pipe or waiting for Amazon personnel to copy the entire data set to hard drives and ship it back. What is urgently needed is the ability to quickly install eDiscovery software to index, search and review that data in the very IaaS cloud environment where it exists. That way only the small data set (say 10 gigabytes) of relevant data is identified and then finally exported. That is what X1 Rapid Discovery delivers.

Scenario 2: The same investigation sends the company’s eDiscovery consultant overseas to collect data at a subsidiary site. Upon the collection of the first 200 gigabytes, the attorneys insist that the data must be quickly indexed for detailed, iterative searching in order to better inform the remaining on-site collection effort. However, the collection team left the large ECA appliance they normally use at home, as it doesn’t travel well and would not pass foreign customs. In this case there are several options with X1RD. If an eDiscovery software solution is truly cloud-capable, then it can quickly install anywhere, including in the IaaS cloud or on available hardware on-site. So the team can either locate available hardware resources running Windows or upload the data to a private or public IaaS cloud environment and operate a virtual eDiscovery lab with X1RD.

X1RD can just as easily be installed behind the firewall as in the cloud, but right now, all of our demos and proofs of concept are being performed in the IaaS cloud. But don’t just take our word for it. We would be happy to demonstrate this for you by remotely installing in your public or private IaaS cloud environment and collecting, indexing and searching your data. We are up for the challenge!

> Register for our live webinar on May 2 to see a demo of X1 Rapid Discovery and to hear from eDiscovery expert, Barry Murphy, on his view of the current eDiscovery market, with respect to the cloud.


Filed under Cloud Data, eDiscovery & Compliance, Enterprise eDiscovery, IaaS

Defining Truly Cloud-Capable eDiscovery Software

Last week we discussed the challenges of searching and collecting data in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud or Rackspace) for eDiscovery purposes. Today we discuss what is needed for eDiscovery and enterprise search vendors to provide a truly cloud-capable solution, and offer a decoder ring of sorts to cut through the hype. And there is a lot of hype: the cloud has become the latest eDiscovery hot button, with vendor marketing claims far surpassing actual capabilities.

In fact, many eDiscovery and enterprise software vendors claim to support the cloud, but are simply re-branding their long-existing SaaS offerings, which really have nothing to do with supporting IaaS. Barry Murphy of the eDiscovery Journal aptly identified this marketing practice as “cloud washing.” Data hosting, especially where the vendor’s manual labor is routinely required to upload and process data, does not meet defined cloud standards. Neither does a process that primarily exports data out of its resident cloud environment, through APIs or other means, to slowly migrate it to the vendor’s tools, instead of deploying the tools (and their processing power) to the data where it resides in the cloud. In order to truly support IaaS cloud deployments, eDiscovery and enterprise search software must meet the following three core requirements:

1. Automated installation and virtualization: The eDiscovery and search solution must rapidly install, execute and efficiently operate in a virtualized environment without rigid hardware requirements or on-site physical access. This is impossible if the solution is fused to hardware appliances or otherwise requires a complex on-site installation process. Hardware appliance solutions are by definition not cloud deployable, and enterprise search installations often require many months of labor to install and configure, so it is a significant question whether many of these vendors will be able to support robust IaaS cloud deployments in the reasonably foreseeable future.

2. On-demand self-service: In its definition of cloud computing, the National Institute of Standards and Technology (NIST) identifies on-demand self-service as an essential characteristic of the cloud where a “consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.”

Many hosted eDiscovery services require shipping of data to the provider or extensive behind-the-scenes manual labor to load and configure the systems for data ingestion. Conversely, solutions that truly support cloud IaaS will spin up, ingest data and fully operate in an automated fashion without the need for manual on-premises labor for configuration or data import.

3. Rapid elasticity: NIST describes this characteristic as capabilities that “scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.” This important benefit of cloud computing is accomplished by a parallelized software architecture designed to dynamically scale out over potentially several dozen virtualized servers to enable rapid ingestion, processing and analysis of data sets in that cloud environment. This capability would allow several terabytes of data to be indexed and processed within 2 to 4 hours on a highly automated basis at far less cost than non-cloud eDiscovery efforts.

However, many characteristics of leading eDiscovery solutions fundamentally prevent them from supporting this core cloud requirement. Most eDiscovery early case assessment solutions are developed and configured around a monolithic processing schema designed to operate on a single expensive hardware apparatus. While it has recently spawned some bold marketing claims of high speeds and feeds, such architecture is very ill-suited to the cloud, which is powered by highly distributed processing across multitudes of servers. Additionally, many of the leading eDiscovery and enterprise search solutions are tightly integrated with third-party databases and other OEM technology that cannot be easily decoupled (and that also present possible licensing constraints), making such elasticity physically and even legally impossible.
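To make the scale-out idea in requirement 3 concrete, here is a minimal sketch of sharded indexing: documents are split across workers, each worker builds a partial inverted index, and the partial results are merged. This is only an illustration under stated assumptions and does not describe any vendor's actual architecture. The function names (`index_shard`, `merge_indexes`, `build_index`) and the sample documents are hypothetical, and local threads stand in for what would be separate virtualized servers in an IaaS deployment.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def index_shard(shard):
    """Build an inverted index (term -> set of doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id, text in shard:
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Combine per-shard indexes into a single index."""
    merged = defaultdict(set)
    for partial in partials:
        for term, doc_ids in partial.items():
            merged[term] |= doc_ids
    return merged


def build_index(docs, workers):
    """Split docs into `workers` shards, index them concurrently, then merge.

    Threads are a stand-in (assumption) for separate cloud servers.
    """
    items = list(docs.items())
    shards = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(index_shard, shards))
    return merge_indexes(partials)


# Hypothetical sample data, for illustration only.
docs = {
    "doc1": "quarterly revenue forecast",
    "doc2": "revenue recognition memo",
    "doc3": "holiday schedule",
}
index = build_index(docs, workers=2)
print(sorted(index["revenue"]))  # ['doc1', 'doc2']
```

The point of the design is that the number of workers is just a parameter: the same data yields the same index whether it is processed by one worker or many, which is what lets capacity scale up and down with demand.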

So is there eDiscovery software that will truly support the IaaS cloud based upon these requirements, and address up to terabytes of data?  Stay tuned….


Filed under Cloud Data, Enterprise eDiscovery, IaaS

eDiscovery Search and Collection in the Cloud

After several dozen posts on social media eDiscovery, we are going to focus the next few weeks on the related issue of eDiscovery in the cloud. As we see it, despite the enormous cost benefits of the cloud, concerns about the feasibility of eDiscovery and general search across an organization’s critical cloud-resident data have to some degree prevented broader adoption.

The cloud means many things to many people, but I believe the real eDiscovery action (and pain point) is in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud, Rackspace, or pure enterprise cloud providers such as Fujitsu). According to a recent PwC report, cloud IaaS will account for 30% of IT expenditures by 2014. IaaS currently provides the means for organizations to aggressively store and virtualize their enterprise data and software, thus potentially spawning the same large data volumes, and the same critical search and eDiscovery requirements, as traditional enterprise environments. Amazon Web Services, the leading IaaS cloud provider, has told us in our discussions that its customers have extensive eDiscovery requirements that are currently addressed by inefficient and manual means. So for purposes of this discussion, we will focus on IaaS, which is essentially cloud for the enterprise and where there is a significant current eDiscovery challenge.

So if an organization maintains two terabytes of documents in the Amazon or Rackspace cloud, how does it quickly access, search, triage and collect that data in its existing cloud environment if a critical eDiscovery or compliance search requirement suddenly arises? This scenario is a significant current pain point for the IaaS cloud. In such situations, the organization typically resorts to one of two agonizingly inefficient processes. The first option involves shipping hard drives to the provider so that its IT staff can copy the data in bulk, and then having the drives shipped back. Rackspace’s guidelines provide that a transfer of 2 terabytes of bulk files would cost over $10,000 in fees and require about four to six weeks. And then all the company gets is a full 2 terabyte duplicate of its data that still must be searched, processed and reviewed.

The other alternative is to slowly download the data through a secure file transfer protocol connection. However, even with a robust T2 line, it would take three to six weeks to transfer the two terabytes, depending on how much bandwidth IT would be willing to dedicate to the exercise.
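As a rough sanity check on that download timeline, the arithmetic can be sketched as follows. This assumes nominal T-carrier line rates and decimal terabytes, neither of which is stated above, and it ignores protocol overhead, so real transfers would be slower. The function name `transfer_days` and the `share` parameter (the fraction of the link dedicated to the transfer) are hypothetical and used only for illustration.

```python
# Nominal T-carrier line rates in megabits per second (assumed figures;
# real-world throughput is lower once protocol overhead is included).
LINE_RATES_MBPS = {"T1": 1.544, "T2": 6.312, "T3": 44.736}

TWO_TB = 2 * 10**12  # two decimal terabytes, in bytes (assumption)


def transfer_days(size_bytes, rate_mbps, share=1.0):
    """Days needed to move size_bytes over a link of rate_mbps megabits/s
    when only `share` (0 < share <= 1) of the link is dedicated to it."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    bits = size_bytes * 8
    seconds = bits / (rate_mbps * 1_000_000 * share)
    return seconds / 86_400  # seconds per day


for name, rate in LINE_RATES_MBPS.items():
    print(f"{name}: {transfer_days(TWO_TB, rate):.1f} days at full line rate")
```

Under these assumptions, a fully dedicated T2 line works out to roughly 29 days, or a little over four weeks, and the figure grows in proportion as the share of bandwidth given to the transfer shrinks.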

So what is needed is robust eDiscovery software that can truly support the IaaS cloud where the data resides without first requiring mass data export. We will discuss what that entails and the requirements of truly cloud capable eDiscovery software in our next post, so please stay tuned!


Filed under Cloud Data, IaaS

689 Published Cases Involving Social Media Evidence (With Full Case Listing)

The torrent of social media evidence continues to grow. In November 2011 we searched online legal databases of state and federal court decisions across the United States to identify the number of cases from 2010 through November 2011 where evidence from social networking sites played a significant role. As we mentioned then, the numbers exceeded even our high expectations. Recently, we revisited the survey in a little more detail to include results for all of 2011 and to be sure we eliminated duplicate entries as well as de minimis entries, defined as cases with merely cursory or passing mentions of social media.

Under these criteria, the more exact number came to 689 cases. Our raw data and tallying methodology are now public, with the spreadsheet available here, allowing anyone to review the cases and provide their own analysis. The vast majority of the cases are accessible for free on Google Scholar. About 5 percent of the listed cases are only available by subscription to Westlaw or LexisNexis.

The search, limited to the top four social networking sites, tallied as follows: MySpace (315 cases), Facebook (304), LinkedIn (39) and Twitter (30). Oh, and my colleague Tod Cole insisted that I mention the lone Foursquare case. From the detailed review, a significant percentage, if not the majority, of the MySpace cases involved criminal matters. Facebook mentions were trending up and MySpace mentions trending down as cases with more recent facts worked their way through the system.

Criminal matters marked the most common category of cases involving social media evidence, followed by employment-related litigation, insurance claims/personal injury, family law and general business litigation (trademark infringement/libel/unfair competition). As only a very small number of cases involve a published decision that we can access online, it is safe to assume that several thousand, if not tens of thousands, more cases involved social media evidence during this time period. Even so, this limited survey is an important data point establishing the ubiquitous nature of social media evidence and the importance of best practices technology to search and collect this data for litigation and compliance requirements.

– VIEW ALL 689 CASES & MORE HERE >


Filed under Best Practices, Case Law