April 19, 2024


Data Science Center Tames Big Data Projects

Greg Negus is the chief operating officer at Cornerstone Research. In this interview, he shares developments in big data, as well as how artificial intelligence and machine learning can be used to support expert testimony.

Can you tell us a little about yourself and your role at Cornerstone Research?

Greg Negus: I have enjoyed a thirty-year career in professional services business management. Prior to joining Cornerstone Research, I worked as a chief operating officer and chief financial officer at several large law firms.

At Cornerstone Research, I am responsible for leading the firm’s business and administrative functions. I also serve on Cornerstone Research’s executive committee, among others. In this position, I also oversee the firm’s Data Science Center, which is an interdisciplinary team of in-house data scientists. The Data Science Center team’s expertise includes improving work efficiencies and applying state-of-the-art modeling as it relates to analysis of relevant data and case issues.

Can you talk about the goals of Cornerstone Research’s Data Science Center?

At the firm level, Cornerstone Research provides economic and financial consulting and expert testimony in high-profile litigation, investigations, and regulatory matters. We support clients with rigorous, objective analysis that is grounded in real-world data, state-of-the-art research, and case precedent. We are passionate about exceeding our clients’ expectations, and we saw early on that data science would be vital to upholding our standard of excellence.

That is why we created the Data Science Center, our hub of data science expertise. The Data Science Center is a pioneer in our industry in applying modern techniques such as artificial intelligence, machine learning, and text analytics to supplement more traditional econometric analyses. Our mission is to maintain our leadership position by continuing to set the standards for technology, data science, and data engineering.

What value does the Data Science Center bring to the firm’s casework and for clients?

The Data Science team brings value in four main areas: increasing efficiencies, unlocking the potential of data, providing scalable/bespoke solutions, and supporting defensible results. Let me address each in turn.

One example of the efficiencies we bring to clients is our use of IBM Netezza, which offers speeds 20 to 2,500 times as fast as conventional analytical platforms. The increased computational speed of Netezza often translates directly into important strategic advantages and results. These may include:

  • Timely identification of data quality deficiencies in produced data
  • The ability to perform the many required iterations, sensitivities, and robustness checks for an expert analysis
  • Quick turnaround on urgent requests, even when working with large datasets that typically require substantial computational processing times

With regard to unlocking potential, I am referring to our ability to digest and analyze large-scale data that opens up new lines of inquiry. We are able to conduct analyses and pursue ideas that would have been infeasible a few years ago, or would have taken an enormous amount of resources to deliver.

For example, in the T-Mobile/Sprint merger, our experts analyzed how consumers choose wireless carriers and how wireless carriers compete. This analysis used highly granular data comprising billions of data points on when, where, and how consumers used their mobile phones. Data Science was able to conceptualize and build an algorithm to efficiently categorize this data in house.

With regard to scalable/bespoke solutions, the Data Science Center focuses on providing right-sized solutions. For example, we can develop bespoke analytic approaches when appropriate or use scalable tools to help our teams automate analysis. So, in addition to providing clients with efficient and secure large-scale data analytics, we are also able to offer customized applications informed by years of experience.

By “defensible results,” I am referring to our experts’ ability to present and communicate analyses and results using both traditional methods and more cutting-edge technology supported by our Data Science Center.

Big data is a term that we have been hearing for several years. How is Cornerstone Research’s Data Science Center equipped to handle the challenge of an increasingly data-driven world?

It is an exciting challenge that we have been able to meet successfully. What makes this work especially challenging is not only that the volume of data we are asked to process and analyze has increased exponentially, but also that there has been an explosion in the types of formats we are dealing with.

As to how we handle substantial volumes of real-time and historical data, we have invested heavily in secure, on-premises analytics infrastructure with massively parallel processing capabilities, such as IBM Netezza. We routinely work on cases with hundreds of billions of data records. We are also experienced in leveraging cloud computing capabilities for surge storage or compute capacity. Our team of programming specialists and data engineers ensures that we can conduct large-scale data analytics efficiently and effectively in a fraction of the time it used to take.

The second major challenge with big data is that we are often asked to work with client data and other private and public sources from a wide variety of platforms and incompatible formats, which must be processed to yield reliable information. Our Data Science capabilities allow us to help counsel manage the discovery and data production process efficiently and effectively. We also work with clients to extract data in anticipation of the analytical needs of subsequent phases of work as well as in response to direct requests from regulators or litigants.

For example, in a high-frequency trading matter, the Data Science Center worked with several stock exchanges to obtain more than 200 TB of data in order to determine the best protocol to access these large and complex datasets, as well as to identify the relevant data subsets that would be used for the necessary analyses.

Big data also increasingly encompasses more than traditional structured data that comes in rows and columns. Our experience with AI and machine learning is valuable when analyzing unstructured data, such as documents and text, images, video, and audio.

How can AI (artificial intelligence) and ML (machine learning) support expert testimony?

AI-based systems replace human decisions with data-driven models. This can reduce subjectivity and error when processing large volumes of complex data. We use AI and ML to drive automation of increasingly complex tasks and unlock new approaches to analysis, including both supervised and unsupervised learning.


Our machine learning capabilities are enhanced by our in-house graphics processing units (GPUs). GPUs offer computational speeds that exceed those of even the fastest central processing units (CPUs). For example, in antitrust matters, we often need to calculate the distance between all suppliers and all consumers (coordinate pairs). Migrating this computation from CPUs to GPUs allows us to calculate distances between nearly 100 million coordinate pairs per second.
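To make the all-pairs distance computation concrete, here is a minimal sketch in plain Python. It is an illustration only, not the firm's implementation: the haversine (great-circle) formula, the function names, the Earth radius constant, and the sample coordinates are all assumptions. A GPU version would evaluate the same formula for many pairs in parallel.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairwise_distances(suppliers, consumers):
    """Return a matrix whose entry [i][j] is the distance from supplier i to consumer j."""
    return [
        [haversine_km(s_lat, s_lon, c_lat, c_lon) for (c_lat, c_lon) in consumers]
        for (s_lat, s_lon) in suppliers
    ]


# Hypothetical coordinates, chosen only for illustration.
suppliers = [(40.7128, -74.0060), (34.0522, -118.2437)]
consumers = [(41.8781, -87.6298), (40.7128, -74.0060), (29.7604, -95.3698)]
distances = pairwise_distances(suppliers, consumers)
```

The work grows with the number of suppliers times the number of consumers, and each pair is independent of the others, which is why this kind of calculation is a natural fit for parallel hardware.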

Social media and big data are the most prominent trends of the 21st century. How is the Data Science Center helping companies keep pace with these intertwined technologies?

With their large user base that eclipses traditional media, social media platforms provide rich sources of data that multiply at dizzying speed. In litigation contexts, knowing how to effectively navigate, collect, and characterize such large amounts of data is critical. In addition to our deep familiarity with social media data sources, our experience with AI and ML tools equips us to assess the relevance and relative prominence of content and contributors. This is a fast-growing area of data, and the insights these tools provide can be vital in supporting expert analyses of text, content, and sentiment.

What about some examples to illustrate that?

For large-scale analysis of Reddit subforums, commonly known as subreddits, we built web data pipelines and automated processes, leveraging ML to score a post’s textual/context-driven relevance to topics of interest and characterize the prominence of a given post relative to other posts in the subforum.
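As a rough illustration of what scoring a post's relevance to a topic can look like, the sketch below ranks posts by TF-IDF cosine similarity to a topic description. This is a simple stand-in, not the ML approach described in the interview, which is not specified; the function names, the smoothed IDF weighting, and the sample posts are all hypothetical. Prominence scoring is not covered here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Smoothed IDF (an assumed weighting) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relevance_scores(posts, topic):
    """Score each post by its TF-IDF cosine similarity to the topic description."""
    docs = [tokenize(p) for p in posts] + [tokenize(topic)]
    vectors = tfidf_vectors(docs)
    topic_vec = vectors[-1]
    return [cosine(vec, topic_vec) for vec in vectors[:-1]]


# Hypothetical posts and topic, for illustration only.
posts = [
    "The merger will raise wireless prices for consumers",
    "My cat learned a new trick today",
    "Wireless carriers compete on price and coverage",
]
scores = relevance_scores(posts, "wireless carrier prices")
```

A word-overlap measure like this cannot tell that "price" and "prices" are related or capture context, which is one reason trained language models are used for context-driven relevance in practice.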

In connection with In re Facebook Inc. IPO Securities and Derivative Litigation, we used advanced language models to accurately distinguish homographs in tweets and generate features for an ML classifier. This framework facilitated the reliable and scalable detection of public awareness of alleged material omissions prior to required disclosure.

Finally, we have extensive experience with online consumer reviews of products and services, which are among the most interesting (and challenging) social media data. These reviews can be the subject of litigation, but if used properly, they can also provide a valuable source of real-world data. We’re experienced in evaluating these unique data, including assessing the relative importance of product features, changes in consumer sentiment over time, and fraudulent reviews.

Can you talk a bit about Cornerstone Research’s investment in the Data Science Center? What technology and training support its work?

In litigation, we often deal with sensitive client data, so we invested heavily in secure infrastructure, including high-performance and high-throughput analytical servers and storage clusters. Our analytical infrastructure is on-premises, meaning client data are not exposed to the internet.

We have also invested in a range of off-the-shelf and proprietary software tools, packages, and data pipelines to facilitate efficient analysis. For example, when working with documents, we use tools to add high-quality text layers to documents, quickly extract tabular data, and develop custom approaches to extracting other key information.

Finally, we have invested in people. We have exceptional data scientists and practitioners with many years of experience across a large number of different clients and projects. Mike DeCesaris, who is the vice president of the group, has a background in economic consulting and computer science, which puts him in an ideal position to navigate the profound transformations that litigation and expert testimony continue to undergo with regard to data.

How do Cornerstone Research’s in-house experts and network of outside experts, which include leaders from academia and industry, work with the Data Science Center?

Cornerstone Research’s testifying experts are at the forefront of litigation trends, industry developments, and academic research. In turn, our experience with applying advanced data science techniques supports these experts in their analyses. Experts appreciate the fact that we bring such a deep understanding of AI and machine learning to automate complex tasks and develop analytic approaches to supplement traditional econometric and statistical methods. For example, Data Science staff applied machine learning methods to healthcare risk adjustment models, which explained about twice as much variation in claims data as the status quo linear regression model.

The views expressed in this article are solely those of the speaker, who is responsible for the content, and do not necessarily represent the views of Cornerstone Research.