Innovation in screening software

The investment community has challenged pharmaceutical research to become more efficient. IT budgets are stagnant or declining, but the requirement to cope with increasing and more complex experimental data volumes is escalating. The pressure is on research IT departments to identify software solutions that make research processes more efficient and address the data analysis bottleneck.

Software development for early-stage drug discovery must meet many needs – flexibility, compatibility and scalability. Dr Oliver Leven, Genedata, offers an insight into the R&D requirements of screening data analysis.

To appreciate today’s landscape of different software systems applied in early-stage drug discovery, it helps to review the evolution of software concepts. Broadly speaking, between 1990 and 2000, two major types of software applications emerged to support drug discovery research processes:

a) packages introduced by research IT departments to address broad or specific needs in scientific analysis or data handling; and b) software built by individuals to address a single issue at hand, which was not supported by option a). The former had a defined life cycle, was fully documented and supported; the latter was more of a one-off with the potential to be adapted to other needs yet was not well-supported by research IT departments, so it rarely became part of a supported package. No single software solution met cross-departmental research needs, resulting in many different installations of software packages that lacked compatibility and scalability. Moreover, they did not advance productivity and innovation.

At the turn of this century, new workflow-based software applications were introduced to help bridge the two camps. Based on a set of pre-defined rules, specific applications were supposed to automate data processing, yet offer the flexibility to accommodate slightly different workflows. While such applications were a major step forward, they proved neither truly scalable nor flexible enough, particularly when it came to meeting evolving requirements created by technologies such as high content screening (HCS), high throughput screening (HTS) and other assay types, e.g. label-free.

In parallel with workflow-based software, pharmaceutical R&D embraced all-purpose calculation engines (e.g. Excel and Spotfire), which were intended to provide a backbone for all types of calculations and visualisations. These engines, to a certain extent, provide basic and extensible functionality when combined with custom workflows implemented as hard-coded, specific formulae or small pieces of integrated functionality.

While such engines allowed relatively easy distribution across a research organisation and addressed a broad set of requirements, they have been unable to handle high sample data throughputs. They also carry other risks: for example, a change in software versions comes at a high cost as there are no abstraction layers or application programming interfaces (APIs). Additionally, in the event of a technology change, all the domain knowledge must be re-implemented.

Today, the situation has stabilised somewhat, and a high-level overview of applications results in three major offerings:

  • Instrument/technology-specific applications: Software that almost exclusively analyses data generated by a specific instrument or technology, often delivered in combination with instrument control software. It usually lacks data integration with other technologies and does not perform well with larger data sets and higher sample throughput.
  • Laboratory Information Management Systems (LIMS): Software platforms often implemented to support a complete lab or entire departments on a company-wide global basis. Typically, LIMS capture data directly from instruments and ensure data and context are recorded and archived, but usually provide only limited scientific data analysis capabilities.
  • Electronic Lab Notebooks (ELNs): Evolved from LIMS, these also capture raw instrument data. In addition, they store experimental information centrally and provide access to data, experimental conditions and data analysis results. This approach does not, however, allow for experimental data analysis, particularly in high-throughput settings with large quantities of data, such as mass spectrometry data, next-generation sequencing data, or data from HCS or label-free instruments. These experiments typically require a specific infrastructure setup that is only loosely connected to LIMS or ELNs.

Over the past few years, the investment community has challenged pharmaceutical research to become more efficient. As a consequence, IT budgets are stagnant or declining while requirements, such as coping with increasing and more complex experimental data volumes, continue to escalate. The pressure is on research IT departments to identify software solutions that make research processes more efficient and in particular address the data analysis bottleneck.

In-house development of customised software applications is no longer sustainable and these dynamics have R&D organisations looking to third-party software vendors who can, at much lower total cost of ownership, efficiently develop, maintain and continually advance such software solutions. Key requirements are software solutions that automate complex research processes, particularly in terms of data analysis, while being scalable, flexible, and open to support research at the frontiers of science.

High Content Screening Extension: screenshot of data and images from a High Content Screening experiment. The scientist has complete access to raw, cell-level data (left), including images (middle); automated processing for well-level results (upper right); and plate QC (lower right). (Screenshot from Genedata Screener)

Plate-based screening

While this trend away from costly in-house software development to vendor-provided software holds true for all software used in pharma R&D, the following focuses on plate-based screening requirements and software solutions. The standardisation of experiments onto the microtiter plate had a huge impact on the productivity of biopharmaceutical research. Not only did it facilitate running multiple experiments in parallel, increasing throughput and reliability, it also helped to automate the whole process. Pipetting devices, liquid handling devices, cell seeding and measuring instruments all became compatible and were automated by additional robotic devices to perform experiments at previously unseen throughput.

Similarly, data analysis needed to become automated yet remain flexible. To this end, an ideal software system should have three layers:

  1. Storage and handling of raw data from plate-based instruments (something ELNs/LIMS cannot do);
  2. Comprehensive data analysis;
  3. Result storage, query and reporting.

A well-designed system can separate these layers by defined and stable interfaces (APIs). This leads to a modular design capable of easily accommodating changes in input, analysis and storage layers when needed.
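The layered design described above can be illustrated with a minimal Python sketch. It is not taken from any product: the names `Plate`, `RawDataStore`, `Analysis`, `ResultStore`, `PercentOfPlateMean` and `process` are hypothetical, and the "analysis" is a deliberately trivial placeholder. The point is only that each layer is reached through a small, stable interface, so any one layer can be swapped without touching the others.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Protocol


@dataclass(frozen=True)
class Plate:
    """Raw readout of one microtiter plate: well name -> signal (hypothetical shape)."""
    plate_id: str
    wells: dict[str, float]


# Layer 1: storage and handling of raw plate data.
class RawDataStore(Protocol):
    def load(self, plate_id: str) -> Plate: ...


# Layer 2: data analysis.
class Analysis(Protocol):
    def run(self, plate: Plate) -> dict[str, float]: ...


# Layer 3: result storage, query and reporting.
class ResultStore(Protocol):
    def save(self, plate_id: str, results: dict[str, float]) -> None: ...


class InMemoryRawData:
    """Stand-in for an instrument file reader or raw-data database."""
    def __init__(self, plates: dict[str, Plate]) -> None:
        self._plates = plates

    def load(self, plate_id: str) -> Plate:
        return self._plates[plate_id]


class PercentOfPlateMean:
    """Toy analysis: express each well as a percentage of the plate mean."""
    def run(self, plate: Plate) -> dict[str, float]:
        centre = mean(plate.wells.values())
        return {well: 100.0 * value / centre for well, value in plate.wells.items()}


class InMemoryResults:
    """Stand-in for a results database."""
    def __init__(self) -> None:
        self.saved: dict[str, dict[str, float]] = {}

    def save(self, plate_id: str, results: dict[str, float]) -> None:
        self.saved[plate_id] = results


def process(plate_id: str, raw: RawDataStore, analysis: Analysis,
            results: ResultStore) -> None:
    """Wire the three layers together using only their interfaces."""
    results.save(plate_id, analysis.run(raw.load(plate_id)))


raw = InMemoryRawData({"P1": Plate("P1", {"A1": 50.0, "A2": 150.0})})
out = InMemoryResults()
process("P1", raw, PercentOfPlateMean(), out)
```

Replacing `InMemoryRawData` with a reader for a different instrument format, or `PercentOfPlateMean` with another method, would leave `process` and the other layers unchanged. That is the modularity the interfaces are meant to buy.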

Pharmaceutical R&D organisations try to address both the intrinsic complexity of recent experimental technologies and the variation in the actual experiments (different plate formats, different primary readouts, etc.) by specialisation. Typically, this succeeds only in small groups and organisations where single, empowered individuals set up the required data analysis workflows together with the scientists. If such support becomes scarce, this model fails.

A different, more promising approach is broad and generic support for required workflows and applications while giving users options to choose from – such as pre-defined, well-tested methods along the data analysis workflow. This approach is delivered in the Genedata Screener data management platform designed for screening data analysis.

Today’s plate-based screening technologies such as HCS, HTS, label-free, and Ion Channel demand a data management and analysis platform that can balance flexibility with standardisation. For example, HCS experiments by their nature generate a vast amount of data in the form of images and numerical results. Each well in a microtiter plate contains hundreds or thousands of cells, and HCS must capture these cells to quantify their response to the changed experimental conditions as expressed by their phenotypes. Typical experiments range from a few 96-well plates to hundreds of 384-well plates. A flexible system can combine results from multiple wells and enable researchers to study the potency of new chemical substances or to quantify the damage to cells after drug administration. A data analysis workflow system that consolidates image data gathered from multiple instruments allows researchers to import data into a single platform and conduct in-depth analysis of HCS experiments with thousands of images. The efficacy of such a platform is validated by the number of pharmaceutical companies that have adopted a standardised yet flexible platform for screening data analysis.
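To make the step from cell-level data to well-level results concrete, here is a small Python sketch under stated assumptions. The input layout (one tuple per cell holding plate, well and a single feature value), the function name `well_level_results`, and the choice of the median as the per-well summary are all illustrative. They do not describe how any particular screening platform performs this step.

```python
from collections import defaultdict
from statistics import median


def well_level_results(cells):
    """Condense per-cell measurements into one summary per well.

    cells: iterable of (plate_id, well, feature_value) tuples (assumed layout).
    Returns {(plate_id, well): {"cell_count": int, "median": float}}.
    """
    grouped = defaultdict(list)
    for plate_id, well, value in cells:
        grouped[(plate_id, well)].append(value)
    return {
        key: {"cell_count": len(values), "median": median(values)}
        for key, values in grouped.items()
    }


# Five hypothetical cells measured in two wells of one plate.
cells = [
    ("P1", "A1", 0.8), ("P1", "A1", 1.2), ("P1", "A1", 1.0),
    ("P1", "A2", 2.0), ("P1", "A2", 4.0),
]
summary = well_level_results(cells)
```

The median is used here only because it is less sensitive to a few atypical cells than the mean. A real workflow would choose the summary statistic per assay and would typically keep the cell-level data accessible alongside the well-level result.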

Standardisation drives efficiency

Standardisation drives cost savings, increases productivity, and improves efficiencies. When assessing data analysis platforms for plate-based screening, look for solutions that:

  • Empower all scientists to run basic data analyses without any IT support; this enables IT to focus on IT-related projects
  • Support newer applications of established technologies
  • Adopt new and proven applications so that new functionality is easily integrated into the system
  • Analyse and integrate native instrument data to avoid oversight and duplication of effort
  • Ensure complete experiment overview to identify all trends and outliers
  • Guarantee performance for basic operations such as data loading, result calculation and result storage

Standardised software capable of instrument-independent workflow support opens a whole new world of possibilities for R&D screening data processes, leading to solutions that generate intellectual property for pharmaceutical companies and ultimately new life-saving drug discoveries.