SNOMED Clinical Terms®
Supporting Post-coordination with an Expression Repository
IHTSDO Implementation SIG Webinar
by David Markwell
The Clinical Information Consultancy Ltd
david@clininfo.co.uk
www.cliniclue.com and www.clininfo.co.uk
© 2002-2010 The Clinical Information Consultancy Ltd (includes some material © 2002-2010 IHTSDO)

Overview
Refresher about expressions
– What is an expression
– Pros and cons of post-coordination
Issues with post-coordination
A practical approach to storage issues
– Expression repository
A practical approach to retrieval issues
– Expression link table
– Expression transitive closure

SNOMED CT Expressions
A ‘SNOMED CT expression’ is a collection of references to one or more SNOMED CT concepts, used to express an instance of a clinical idea
Expressions can be used to represent:
– Instances of clinical information in electronic health records
– Knowledge links in resources such as decision support protocols and online reference materials

Expressions can be pre-coordinated or post-coordinated
Pre-coordinated expression
– A single ConceptId represents the required meaning
  • Example: 31978002 (fracture of tibia)
Post-coordinated expression
– A combination of ConceptIds represents a concept
  • Example: 31978002 : 272741003 = 7771000
    (fracture of tibia : laterality = left)
    In human readable form … “fracture of left tibia”

Expressions can exist in different forms
Close-to-user form
– The concept or concepts selected by the user
  • (or by a user-interface designer)
Normal form
– The result of applying a set of logical rules that transform different expressions with the same meaning into a common comparable form
Both these forms may include or exclude a situation context wrapper
– If included, this explicitly states the context of a finding or procedure

Advantages of post-coordination
Scope coverage and terminology size
– Coverage of scope to an adequate level of specificity does not require every possible concept to exist
– Reduces the
need for “combinatorial explosion” in concept numbers to cover every eventuality
Terminology maintenance
– The maintenance burden is related to terminology size
  • Discussed further on next slide
Structured data entry
– Ability to represent refined content is not dependent on a specific concept existing
– Expressions can be constructed in a consistent manner rather than searching hundreds of similar terms for precisely the correct one
Consistent retrieval
– Less dependency on modelling of individual concepts
  • Discussed further on subsequent slide

Advantages of post-coordination
Terminology maintenance
Maintenance burden is roughly proportional to terminology size. This is due to:
– Requirements to add new content
  • Accurate modelling and addition of synonymous terms
  • Translation (adding terms in other languages)
– Errors such as
  • Ambiguity
  • Non-synonymous terms (within or between languages)
  • Inconsistent modelling
– Enhancements of the SNOMED CT Concept Model
  • Which require some concepts to be remodelled
Note: This rule applies to Extensions as well as the International Release

Advantages of post-coordination
Consistent retrieval
Use of post-coordination makes retrieval less dependent on modelling of individual concepts
For example:
– Accurate retrieval of the following pre-coordinated expression is dependent on the accuracy and specificity of the defined causative agent
  • 91936005 | allergy to penicillin |
– In contrast, the following post-coordinated expression explicitly identifies the substance
  • 416098002 | drug allergy | : 246075003 | causative agent | = 372725003 | penicillin V |

Disadvantages of post-coordination
(Note: issues addressed in more detail on subsequent slides)
Human readability, entry and display
– ‘Extreme post-coordination’ leads to loss of natural terms
Data entry
– Requiring users to select several items to build a post-coordinated expression may be a burden for entry of common composite data items
Storage
–
Post-coordinated expressions have variable length so it may be difficult to efficiently represent them in a database table
Retrieval
– Performance may be impaired by
  • Storage issues that may prevent optimal indexing
  • Complexity of testing query predicates against post-coordinated expressions

Pre & post-coordination with SNOMED CT
SNOMED CT supports both pre and post-coordination
– No absolute boundaries between them
SNOMED CT enables computation of equivalence and subsumption between pre and post-coordinated expressions that have the same meaning

Addressing post-coordination issues
Human-readability
‘Extreme post-coordination’ leads to loss of natural terms
Example
If “appendectomy” is only represented as:
  71388002 | procedure | :
    { 260686004 | method | = 129304002 | excision action | ,
      405813007 | procedure site - Direct | = 66754008 | appendix structure | }
the word ‘appendectomy’ is not present
Thus this does not support search by, or display of, the term clinical users expect to see
Clinical ideas that are associated with common names (other than composites that could be derived from a post-coordinated expression) need to be represented by adding concepts to SNOMED CT

Addressing post-coordination issues
Concept model limits
Clinical ideas that cannot be fully represented by a post-coordinated expression due to limitations of the SNOMED CT Concept Model should be represented by either:
– Adding concepts to SNOMED CT; or
– Using information model constructs to link expressions together or to other related data
The choice between these approaches depends on the information to be represented.
Only ideas that fit within the scope and editorial guidelines applied to SNOMED CT should result in addition of new concepts

Addressing post-coordination issues
Data entry
The too many keystrokes problem
– Multiple selections to build a post-coordinated expression may be a burden for entry of common composite data items
– Options to address this issue will be the subject of another Implementation SIG webinar

Addressing post-coordination issues
Storage
Post-coordinated expressions have variable length so it may be difficult to efficiently represent them in a database table
– Using the SNOMED CT grammar
  • The shortest expression is 6 characters in length
  • The length of an expression is theoretically unlimited
  • Real examples exist with over 300 characters (using ids only) or over 1,000 characters (including the term text)
One way to address this is by using an ‘Expression Repository’
– This is described in the next few slides

Expression Repository
General approach
Users and/or user-interface designers have full access to post-coordinated expressions as a way to record information
When a post-coordinated expression is entered, it is looked up in a repository of expressions
If the expression is found in the repository
– The unique identifier associated with that expression is stored in the record
If the expression is not found in the repository
– It is added to the repository with a new unique identifier
– The new unique identifier is stored in the record

Expression Repository
Nature of the expression looked up
The record looked up in, or added to, the repository is the post-coordinated expression as:
– entered by the user via a generalized user-interface; or
– specified by the designer of a data entry form (or protocol) to represent a particular user selection
The post-coordinated expression is not transformed to normal form before look up
The order of refinements in the expression may be sorted
– To avoid creating repository entries for expressions containing
identical refinements simply due to the order in which a refinement was applied

Expression Repository
Management of the repository
Expression repository management is a fully automated process
Every distinct expression entered is recorded in the repository with a unique identifier
Entries in the repository are never deleted
The repository must be backed up and kept secure so the recorded data is not compromised
– Note: This backup is mission critical and needs to be treated as part of the overall record system
No manual intervention or terminology expert involvement in repository maintenance
– The repository is simply a technical artefact that provides a reference link between the literal expression and the stored unique identifier

Expression Repository
Supporting communication of expressions
Communications use the literal form of the expression
– Not the unique repository identifier
When a record that contains an expression is sent
– The unique identifier is looked up in the repository
– The associated expression is included in the communication
When a record including an expression is received
– The expression received is looked up in the repository
– If the expression is not found a new entry is made in the repository
– The associated unique identifier is stored in the record

Expression Repository
Sharing and merging of the expression repositories
No need to share the expression repository to support communication
– The literal form of the expression is communicated
– Where necessary the target expression repository allocates a new identifier for a received expression
Systems may share an expression repository subject to
– Performance and data integrity considerations
– Organisational boundaries and responsibilities
– System architecture and software application design
Expression repositories may need to be merged to support changes in organisational structure and system architecture
– The method of allocating identifiers should ensure global uniqueness to
facilitate future mergers without compromising data integrity

Expression Repository
Associations between terms and expressions
The expression repository must not contain any manually specified terms associated with a post-coordinated expression
– The meaning of a post-coordinated expression in the repository is no more and no less than the meaning represented by its constituent parts.
– Excluding manually entered terms avoids the risk of deviation between such terms and the inherent meaning of the expression
The expression identifier is just a reference to the full expression
– This is the main difference between an entry in the expression repository and a concept definition
– This is the key to avoiding manual maintenance of the repository

Expression Repository
An example schema
Column         Datatype         Notes
uid            UUID             Unique identifier of Expression in the repository. Created by standard UUID/GUID generation algorithm.
hash           String (40)      Hash from expression used as search key (not necessarily unique)
expression     String (varlen)  Text of the expression
createdTime    UtcTime          Time stamp of creation of this entry

Clinical record entry table
Partial example schema
Column         Datatype         Notes
uid            UUID             Unique identifier of record entry. Created by standard UUID/GUID generation algorithm.
patientId      UUID/LocalId     Patient identifier. Links record entry to the patient.
expressionUid  UUID             Reference to row in the Expression Repository. Refers to the Expression that was used in this record entry.
…              …                Various other fields containing text, values, units, dates, times, provenance, etc.

Addressing post-coordination issues
Retrieval
Performance may be impaired by
– Storage issues limiting use of indexing
  • Indexing of variable length fields is less efficient
  • Some databases limit the maximum field length for indexing
– Complexity of testing query predicates against post-coordinated expressions
  • Normal form transforms are tractable but repeating them each time a retrieval request is made is unlikely to meet performance requirements
  • Alternative approaches using Description Logic classification computation are similarly less efficient than a simple search for a set of codes
Enhancements to an ‘Expression Repository’ are one way to address this issue
– This is described in the next few slides

Normal form optimisation
Expression Linkage
Expression Repository
– Each expression in a record is represented by an expressionUid
– This can be dereferenced in the Expression Repository
Expression Links
– An Expression Links Table represents links between Expressions using the UIDs of the linked Expressions
Normal Form Expression Generation
– Each new expression added to the Expression Repository is transformed to its Normal Form
– The Normal Form expression is also looked up in, and if necessary added to, the Expression Repository
– A link is created between an Expression and its Normal Form Expression
Rapid access to Normal Forms
– It is simple and fast to look up the Normal Form for an Expression by querying the Expression Links Table

Normal form optimisation
Expression Link Maintenance
For each new SNOMED CT release
– The normal forms of all expressions affected by modelling changes are re-computed
– The new Normal Forms are looked up in (or if necessary added to) the Expression Repository
– New Links are added to the Expression Links Table
– Use of the state-valid approach (as adopted in the SNOMED CT Release Format 2) allows these new links to supersede the pre-existing links
as they have a more recent effectiveTime
The resulting maintenance burden can be managed automatically

Expression Link Table
An example schema
Column         Datatype  Notes
sourceUid      UUID      UID of an Expression
linkType       Integer   Enumeration of transformation types. Allows alternative transformation results for the same expression (e.g. with/without context)
effectiveTime  UtcTime   Time stamp of creation of this entry. If normal forms change due to enhancement of the Concept Model and/or modification of the Concept Definitions, a new row with the same sourceUid and linkType but a newer effectiveTime supersedes this.
targetUid      UUID      The UID of the Normal Form Expression.

Retrieval Optimization
Expression Repository Transitive Closure
For added performance the Expression Repository can be “auto-classified” using a description logic classifier.
The results of classification can be appended to a transitive closure table
– This Expression Transitive Closure Table enables instant subsumption testing using general purpose SQL queries

Retrieval Optimization
Expression Repository Transitive Closure Maintenance
For each new SNOMED CT release
– Expression Repository Transitive Closure (ERTC) needs to be recomputed
– The computed ERTC can use an effectiveTime to enable views of the subsumption relationships between expressions after every release or change
This process can be automated, minimizing the maintenance burden

Demonstration
Demonstration of
– Expression Repository
– Expression Link Table
Using
– CliniClue Xplore to
  • Select expressions
  • Generate Normal Forms
– Access database with following tables
  • Expressions : The Expression Repository
  • Journals : A simple record mockup
  • ExpressionLinks : The ExpressionLinks to NormalForms

Summary
The value of post-coordination
Attempts to avoid the use of post-coordination by adding pre-coordinated concepts to meet every requirement
– Increase the size of the terminology
– Introduce risks of errors from modelling
  • In particular
mismatches between descriptions and associated concept definitions
– Create a growing maintenance burden
– The impact of these factors is probably even greater if additions are made in Extensions, as a result of duplication of effort on different and potentially divergent Extensions

Summary
Practical approaches to storage and retrieval of post-coordinated expressions
An Expression Repository
– Enables predictable, indexable storage of post-coordinated expressions
– Allows post-coordinated expressions to be entered, stored, retrieved and communicated
Adding an Expression Links Table
– Allows rapid access to Normal Forms
Adding an Expression Transitive Closure
– Supports high performance subtype testing
These tables can all be maintained by software without manual input

Questions?
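
Illustrative sketch (not part of the original webinar)
The look-up-or-add behaviour described in the Expression Repository slides might be sketched as follows. This is a minimal sketch under stated assumptions, not the demonstrated implementation: it uses Python with SQLite rather than the Access database shown, the function names canonical_form, open_repository and get_or_add are hypothetical, SHA-1 is assumed only because its hex digest happens to fit the String (40) hash column, and refinement sorting is simplified to flat, ids-only expressions without groups or nesting.

```python
import hashlib
import sqlite3
import uuid
from datetime import datetime, timezone


def canonical_form(expression: str) -> str:
    """Remove whitespace and sort top-level refinements.

    Simplifying assumption: ids only, no terms, no attribute groups or nesting.
    The expression is NOT transformed to normal form, as the slides require.
    """
    focus, _, refinements = expression.partition(":")
    focus = "".join(focus.split())
    parts = sorted("".join(r.split()) for r in refinements.split(",") if r.strip())
    return focus if not parts else f"{focus}:{','.join(parts)}"


def open_repository(path: str = ":memory:") -> sqlite3.Connection:
    """Create the repository table, mirroring the example schema columns."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Expressions ("
        "uid TEXT PRIMARY KEY, hash TEXT NOT NULL, "
        "expression TEXT NOT NULL, createdTime TEXT NOT NULL)"
    )
    # The fixed-length hash is indexed instead of the variable-length text.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_hash ON Expressions(hash)")
    return conn


def get_or_add(conn: sqlite3.Connection, expression: str) -> str:
    """Return the uid for an expression, adding a new entry if it is not found."""
    text = canonical_form(expression)
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()  # 40 hex characters
    # The hash is not necessarily unique, so the full text is compared as well.
    row = conn.execute(
        "SELECT uid FROM Expressions WHERE hash = ? AND expression = ?",
        (digest, text),
    ).fetchone()
    if row:
        return row[0]
    uid = str(uuid.uuid4())  # globally unique, which eases later repository merges
    conn.execute(
        "INSERT INTO Expressions VALUES (?, ?, ?, ?)",
        (uid, digest, text, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return uid


if __name__ == "__main__":
    repo = open_repository()
    first = get_or_add(repo, "31978002 : 272741003 = 7771000")
    second = get_or_add(repo, "31978002:272741003=7771000")
    print(first == second)
```

The returned uid is what a clinical record entry would store in its expressionUid column; entries are only ever added, never deleted, in line with the repository management slide.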