Working with Phenotypes

What are Phenotypes?

Clinical phenotypes are manifestations of disease, treatment, and response as represented in the electronic health record (EHR). EHRs contain a large number of data elements representing diagnoses, procedures, laboratory test results, medication orders, and more that together indicate what diagnoses and treatments a patient has had, and what the patient's treatment responses have been. The problem is that the medical record typically does not represent these phenotypes as separate data elements. They must be inferred from combinations of data. Clinicians do this in their heads as they browse and scan a medical record. Personnel using the medical record for secondary purposes such as research and quality improvement do this during the process of chart abstraction. Because chart abstraction is manual, it inherently limits the volume of EHR data that can be used in a research or quality analysis.

Eureka! aims in part to automate chart abstraction. In other words, it automatically computes the data elements defined for a research or quality improvement study from the data elements in the medical record. We call such study data elements derived data elements. The Phenotype Editor screens in Eureka! are where you specify your study's data elements and how they should be computed, either from EHR data elements or from other study data elements that you have already specified. The Phenotype Editor supports creating four types of user-defined data elements (Category, Sequence, Frequency, and Value Threshold), described below. These may be combined to specify clinical phenotypes. One example of a combination is a custom category of diagnosis codes together with a frequency threshold on the number of codes from that category that appear in a patient's data. Another is a high blood pressure result together with at least two hypertension diagnosis codes within 6 months. A wizard-style interface guides you through creating these derived data elements.

Editing Phenotypes

The Phenotype Editor screen may be accessed by clicking the Editor link near the top of the screen. It contains a list of the derived data elements that you have specified already, and it provides a Create New Element link to launch the derived data element wizard. Icons to the left of each derived data element in your list provide for editing and deleting the adjacent element.

Select derived data element type

Clicking Create New Element opens a wizard-style interface. First, you select a data element type. The four available types are:

Category
Category data elements allow specifying clinically significant groupings and hierarchies of data elements.
Sequence
Sequence data elements allow specifying two or more data elements that must occur in a specified temporal order.
Frequency
Frequency data elements allow specifying the number of times a specified data element must be present in or computed from a patient’s data.
Value threshold
Value threshold data elements allow specifying lower and/or upper limits on one or more numerical observation data elements such as laboratory test results or vital signs.

Select Elements from Ontology Explorer

After selecting the data element type, you specify your derived data element's definition in the provided forms. All involve dragging clinical concepts from the provided ontology explorer on the left side of the screen. An ontology is a representation of clinical concepts and their relationships to each other. The explorer has two tabs: System and User Defined. The System tab contains a hierarchy of clinical concepts representing common data found in electronic health records. The User Defined tab lists the derived data elements you have previously specified in the Phenotype Editor. Thus, you can define derived data elements in terms of other derived data elements. For example, you can specify a frequency derived data element on a previously defined value threshold data element, such as at least two blood pressure values over 130/80.

In general, you drop concepts from the ontology explorer into one or more blue boxes on the right side of the form. Underneath those blue boxes, you may specify additional constraints on the concept in form fields. You may specify a duration constraint (for example, 2 days to 1 month). You also may specify a particular property value, if the dropped concept has properties (for example, encounters with type INPATIENT). Make sure to check the checkboxes next to the constraint fields that you intend to use.

Category data element

Drag and drop concepts from the ontology explorer into the blue box on the right side to specify the members of your category. You may drop in previously created categories from the User Defined tab to create category hierarchies. If you make a mistake, click the icon to the left of a category member to delete that member. Click Next when you are done. Note that wherever another derived data element definition's drop box accepts one of a category's members, you may drop the category itself instead. The derived data values may then be computed from any of the category's members.
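
The idea of a category hierarchy can be illustrated with a minimal Python sketch. This is not Eureka!'s implementation. The function name expand_category, the dictionary layout, and the example category names and codes are all assumptions made for illustration. It shows how a category containing another category resolves to a flat set of member concepts.

```python
from typing import Dict, List, Set


def expand_category(name: str, categories: Dict[str, List[str]]) -> Set[str]:
    """Return every non-category concept reachable from the named category.

    `categories` maps a category name to its direct members; a member that is
    itself a key in `categories` is treated as a nested category.
    """
    leaves: Set[str] = set()
    seen: Set[str] = set()  # guards against accidental cycles
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in categories:
            stack.extend(categories[current])
        else:
            leaves.add(current)
    return leaves


# Hypothetical categories built from illustrative diagnosis codes.
categories = {
    "Diabetes codes": ["ICD9:250.00", "ICD9:250.01"],
    "Cardiometabolic codes": ["Diabetes codes", "ICD9:401.9"],
}
members = expand_category("Cardiometabolic codes", categories)
```

Here members holds the two diabetes codes plus the hypertension code, which mirrors how dropping a category into another definition lets values be computed from any of its members.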

Sequence data element

Drag and drop concepts from the ontology explorer into the blue boxes for the primary data element (at the top of the form) and the first related data element. Computed sequence derived data values will have a start time and finish time corresponding to the start and finish time of the specified primary data element. Optionally select duration and property constraints as described above. Then, specify the temporal relationship between these concepts (before or after) and whether the second concept must be a specified minimum and/or maximum time distance away from the first (for example, before by 2 days to 2 months). Click the Add to sequence link to specify additional temporal relationships between the primary and related data elements. Click Next when you are done.
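
As a rough illustration of what a sequence definition expresses, the following Python sketch checks a "primary before related" relationship with optional minimum and maximum time distances. It is a simplified model, not Eureka!'s matching logic. The Interval class, the find_sequences function, and the choice to measure the distance from the primary element's finish to the related element's start are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    concept: str
    start: datetime
    finish: datetime


def find_sequences(
    primary: List[Interval],
    related: List[Interval],
    min_gap: Optional[timedelta] = None,
    max_gap: Optional[timedelta] = None,
) -> List[Interval]:
    """Return one derived interval per primary value that precedes a related value.

    Assumption: the gap runs from the primary's finish to the related's start.
    Each derived interval copies the primary's start and finish times.
    """
    matches: List[Interval] = []
    for p in primary:
        for r in related:
            gap = r.start - p.finish
            if gap < timedelta(0):
                continue  # related value does not come after the primary
            if min_gap is not None and gap < min_gap:
                continue
            if max_gap is not None and gap > max_gap:
                continue
            matches.append(Interval("sequence", p.start, p.finish))
            break  # one match per primary value is enough
    return matches


# Hypothetical data: two encounters and two diagnosis codes.
encounters = [
    Interval("encounter", datetime(2020, 1, 1), datetime(2020, 1, 5)),
    Interval("encounter", datetime(2020, 3, 1), datetime(2020, 3, 2)),
]
diagnoses = [
    Interval("diagnosis", datetime(2020, 1, 10), datetime(2020, 1, 10)),
    Interval("diagnosis", datetime(2020, 3, 3), datetime(2020, 3, 3)),
]
# "Before by 2 days to 2 months", with 2 months approximated as 60 days.
result = find_sequences(encounters, diagnoses, timedelta(days=2), timedelta(days=60))
```

Only the January encounter matches: its diagnosis follows by 5 days, while the March diagnosis follows its encounter by just 1 day, which is below the 2-day minimum.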

Frequency data element

Drag and drop a concept from the ontology explorer into the blue box. Select the count of the data values represented by the dropped concept. Then, select whether you care only about the first n values, or any time at least n values occur. If you drop a value threshold into the blue box, a consecutive check box will appear. If checked, the threshold has to be satisfied by consecutive values of whatever data element was thresholded. Optionally select duration and property constraints as described above. Finally, you may require that data values be a specified minimum and/or maximum time distance from each other in order to participate in the frequency relationship. Click Next when you are done. Computed frequency derived data values will have a start time and finish time corresponding to the minimum temporal extent of the data that satisfy the specified frequency threshold.
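
The "at least n values" option and the resulting start and finish times can be sketched in Python as follows. This is an illustrative simplification rather than Eureka!'s algorithm. The function name first_frequency_match and the rule that the distance constraints apply to adjacent values in time order are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def first_frequency_match(
    times: List[datetime],
    n: int,
    min_gap: Optional[timedelta] = None,
    max_gap: Optional[timedelta] = None,
) -> Optional[Tuple[datetime, datetime]]:
    """Find the earliest run of n values whose adjacent gaps satisfy the limits.

    Returns the (start, finish) of the run, i.e. the temporal extent of the
    values that satisfy the frequency threshold, or None if no run qualifies.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ordered = sorted(times)
    for i in range(len(ordered) - n + 1):
        window = ordered[i:i + n]
        gaps = [later - earlier for earlier, later in zip(window, window[1:])]
        if min_gap is not None and any(g < min_gap for g in gaps):
            continue
        if max_gap is not None and any(g > max_gap for g in gaps):
            continue
        return window[0], window[-1]
    return None


# Hypothetical timestamps of hypertension diagnosis codes for one patient.
code_times = [datetime(2020, 1, 1), datetime(2020, 9, 1), datetime(2020, 10, 15)]
# At least 2 codes no more than 180 days apart (an approximation of 6 months).
match = first_frequency_match(code_times, n=2, max_gap=timedelta(days=180))
```

The January and September codes are 244 days apart and do not qualify, so the match spans the September and October codes.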

Value threshold data element

Drag and drop a concept representing a data element that has a numerical observation from the concept tree on the left side of the screen into the box on the right side labeled Drop Thresholded Data Element Here. Then specify upper and/or lower thresholds on the value of that data element in the provided form fields. Click the Add threshold link to specify thresholds on additional data element values. If you specify more than one threshold, use the Value thresholds selector at the top of the form to specify whether your derived data element should be computed if any of the specified thresholds is found or only if all of them are found. In some situations, you may want a threshold to apply only in one or more clinical contexts, such as patients with a particular diagnosis. You may specify a context as one or more data elements that must be present in order for the thresholds to apply by dragging data elements from the ontology explorer into the Drop Contextual Data Element Here boxes. You may require these contextual data elements to be within a specified time distance before or after the data element(s) being thresholded. Click Next when you are done. Computed value threshold derived data values will have the same timestamps as the data from which they were computed.
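
The difference between the "any" and "all" settings of the Value thresholds selector can be shown with a small Python sketch. It is a simplified model under stated assumptions, not Eureka!'s implementation: the Threshold class and meets_thresholds function are hypothetical, bounds are treated as inclusive, all values are assumed to belong to a single observation, and contextual data elements are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Threshold:
    concept: str
    lower: Optional[float] = None  # inclusive lower limit (assumption)
    upper: Optional[float] = None  # inclusive upper limit (assumption)

    def satisfied_by(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def meets_thresholds(
    observation: Dict[str, float], thresholds: List[Threshold], mode: str = "any"
) -> bool:
    """Check an observation against thresholds using 'any' or 'all' semantics."""
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    results = [
        t.concept in observation and t.satisfied_by(observation[t.concept])
        for t in thresholds
    ]
    return any(results) if mode == "any" else all(results)


# Hypothetical blood pressure reading and thresholds at 130/80.
reading = {"systolic": 142.0, "diastolic": 78.0}
thresholds = [Threshold("systolic", lower=130.0), Threshold("diastolic", lower=80.0)]
high_by_any = meets_thresholds(reading, thresholds, mode="any")
high_by_all = meets_thresholds(reading, thresholds, mode="all")
```

With this reading, only the systolic threshold is met, so high_by_any is true and high_by_all is false.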

Select a name

Next, you specify a name and a description for your derived data element.

Review and save

After saving, your specified derived data element will be computed for any subsequent data processing job that you submit.

Phenotyping a Dataset