
Medical Coding in Clinical Data Management


In any clinical trial, recording and storing data in a controlled, consistent, and reproducible manner, so that it can be retrieved and analysed, is necessary for both regulatory compliance and the success of the study.

What is Medical Coding?

Medical coding is the classification of similar verbatim terms against a validated medical (or medication-based) dictionary, supplied by the customer or used under licence from the relevant licensing body (MSSO, Uppsala), in order to produce a statistically quantifiable count of all similar terms in a given database.
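As a minimal sketch of why coding makes terms countable, the example below groups several differently worded verbatims under one coded term and counts them. The records and the function name `count_by_coded_term` are hypothetical and are not taken from any dictionary release.

```python
from collections import Counter

# Hypothetical (verbatim, coded term) pairs; illustrative only.
coded_records = [
    ("headache", "Headache"),
    ("head ache", "Headache"),
    ("pain in head", "Headache"),
    ("felt sick", "Nausea"),
    ("nausea", "Nausea"),
]


def count_by_coded_term(records):
    """Count records per coded term, regardless of how the verbatim was worded."""
    return Counter(coded for _verbatim, coded in records)


counts = count_by_coded_term(coded_records)
print(counts.most_common())
```

Five distinct verbatims collapse into two coded terms, which is the count that can then be summarized and analysed.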

Along with data entry, validation, data processing, reconciliation, external data load, and the many other clinical data management activities performed in Clinical Data Management Systems (CDMS), medical coding is performed to facilitate the summarizing and analysis of certain sets of data (e.g. Adverse Events, Medical History records, Concomitant Medications). To provide control and consistency, a variety of medical coding dictionaries may be used to process, analyse, and report collected data. Sponsors and medical monitors use the coded terms to review events and medications throughout the study as appropriate.

Study statisticians and medical writing groups use the coding reports to obtain the counts that are included in the corresponding sections of the TLFs (Tables, Listings & Figures) generated for the study. These are eventually reflected in the Clinical Study Report (CSR) created for regulatory submission.

With multiple versions of medical dictionaries released by the managing bodies every year, processes must be established for managing the release of multiple versions of the same dictionary, handling different dictionaries or versions that have been used, and integrating data coded with different dictionaries or versions.
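One way to support such a process is to store the dictionary name and version with every coded record, so that data coded under different versions can be detected before it is pooled. The sketch below assumes a hypothetical `CodedTerm` record and helper functions; it is not the interface of any particular coding tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedTerm:
    verbatim: str
    coded_term: str
    dictionary: str  # e.g. "MedDRA"
    version: str     # e.g. "20.0"


def versions_in_use(records):
    """Return the set of (dictionary, version) pairs present in the records."""
    return {(r.dictionary, r.version) for r in records}


def needs_recoding(records, target_version_by_dictionary):
    """List records coded under a version other than the agreed target version."""
    return [
        r for r in records
        if target_version_by_dictionary.get(r.dictionary) != r.version
    ]


# Hypothetical records coded under two different MedDRA versions.
records = [
    CodedTerm("head ache", "Headache", "MedDRA", "19.1"),
    CodedTerm("nausea", "Nausea", "MedDRA", "20.0"),
]
stale = needs_recoding(records, {"MedDRA": "20.0"})
print(versions_in_use(records))
print([r.verbatim for r in stale])
```

Records returned by `needs_recoding` would be candidates for review or recoding before integration; what happens to them is a study-level decision that this sketch does not cover.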


Standard Dictionaries used for Medical Coding

Two of the most commonly used dictionaries are:

Medical Dictionary for Regulatory Activities (MedDRA)

MedDRA: Coding of Adverse Events and Medical History events using this dictionary is required to group data for meaningful analysis. MedDRA is the ICH-developed and recommended dictionary for all medical events captured in clinical trials, including, but not limited to, AEs and medical history terms. MedDRA is multi-axial and provides multiple levels of terms and codes, so the coder needs a clear understanding of its structure to pick the correct code. Coders and reviewers of medical information must understand the flexibility of MedDRA as well as the implications that its storage and implementation can have on safety reporting.

The levels of terms used in MedDRA are as follows:

  • Lowest level term (LLT)
  • Preferred term (PT)
  • High level term (HLT)
  • High level group term (HLGT)
  • System Organ Class (SOC)

MSSO (Maintenance and Support Services Organization) is the organization responsible for publishing and maintaining MedDRA. MSSO releases two versions annually. Each upgrade mainly covers the retirement of terms, the addition of newly identified and approved terms, updates to SOC assignments, and improvements to the consistency of the available terminology. In MedDRA, a preferred term (PT) may be associated with multiple SOCs. However, each PT is associated with only one primary SOC.
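The hierarchy and the primary SOC rule can be sketched as a small data structure. The class names `SocPath` and `PreferredTerm` are hypothetical, and the Influenza entries are illustrative only; they should be verified against the licensed dictionary version in use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SocPath:
    hlt: str   # High level term
    hlgt: str  # High level group term
    soc: str   # System Organ Class


@dataclass(frozen=True)
class PreferredTerm:
    name: str
    primary: SocPath        # exactly one primary SOC path per PT
    secondary: tuple = ()   # zero or more secondary SOC paths

    def all_socs(self):
        return [self.primary.soc] + [p.soc for p in self.secondary]


# Illustrative entries; not copied from a dictionary release.
influenza = PreferredTerm(
    name="Influenza",
    primary=SocPath("Influenza viral infections",
                    "Viral infectious disorders",
                    "Infections and infestations"),
    secondary=(SocPath("Viral upper respiratory tract infections",
                       "Respiratory tract infections",
                       "Respiratory, thoracic and mediastinal disorders"),),
)

# Each LLT links to a single PT.
llt_to_pt = {"Flu": influenza, "Influenza": influenza}


def count_by_primary_soc(events):
    """Count each coded event once, under the primary SOC of its PT."""
    return Counter(pt.primary.soc for pt in events)


events = [llt_to_pt["Flu"], llt_to_pt["Influenza"]]
print(influenza.all_socs())
print(count_by_primary_soc(events))
```

Counting by primary SOC means an event appears only once in a summary, even when its PT also sits under a secondary SOC.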

At the time of writing, the latest available version of MedDRA is 20.0, which became available on 1st March 2017. Coding specialists can be certified via the MSSO Certified MedDRA Coder (CMC) exam.


World Health Organization Drug Dictionary (WHO Drug or WHODD)

The WHO Drug Dictionary works in a similar way to MedDRA, but is used to code medications. Verbatim terms (the medication names) are coded to a hierarchy of terms (coded term, preferred term, and up to ATC level 4), for example:


  • Alimentary tract and metabolism (1st level, anatomical main group)
  • Drugs used in diabetes (2nd level, therapeutic subgroup)
  • Blood glucose lowering drugs, excl. insulins (3rd level, pharmacological subgroup)
  • (4th level, chemical subgroup)


ATC assignment requires the indication and/or dose and/or route to be available. If there is insufficient information to code the ATC accurately, the coding specialist needs to obtain additional information from the site in order to select the correct code. As per standard practice, if additional information is not available and an ATC classification is required, the most common ATC classification is assumed. This should be agreed with the sponsor beforehand and documented.

If high level ATC classification (level 4) is to be performed for a project, any term that maps to more than one ATC code must be assigned manually by the coding specialist using the dose, route, and indication of the associated drug. Terms will be coded to the highest level of specificity possible.
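The decision flow for a drug with more than one possible ATC classification can be sketched as follows. This is only an illustration under stated assumptions: the function `select_atc`, the keyword lists, and the returned status strings are hypothetical, and matching on indication keywords stands in for the coding specialist's judgment of dose, route, and indication. The two level 4 codes are shown for acetylsalicylic acid purely as an example.

```python
def select_atc(candidates, indication, agreed_default=None):
    """Pick an ATC level 4 code for a drug with one or more candidate codes.

    `agreed_default` should be passed only after a site query has gone
    unanswered and a default has been agreed with the sponsor and documented.
    """
    if len(candidates) == 1:
        return candidates[0]["atc4"], "single ATC"
    if indication:
        text = indication.lower()
        matches = [c for c in candidates
                   if any(keyword in text for keyword in c["keywords"])]
        if len(matches) == 1:
            return matches[0]["atc4"], "matched on indication"
    if agreed_default is not None:
        return agreed_default, "sponsor-agreed default assumed"
    return None, "query site for indication, dose, or route"


# Hypothetical keyword lists; illustrative only.
aspirin_candidates = [
    {"atc4": "B01AC", "keywords": ("thrombosis", "cardiac", "stroke")},
    {"atc4": "N02BA", "keywords": ("pain", "headache", "fever")},
]

print(select_atc(aspirin_candidates, "Headache"))
print(select_atc(aspirin_candidates, None))
print(select_atc(aspirin_candidates, None, agreed_default="B01AC"))
```

An indication that matches more than one candidate is deliberately not resolved automatically; it falls through to the query or the agreed default.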

The World Health Organization (WHO) designed the WHO Drug Dictionary for medication coding. In 2005, the Uppsala Monitoring Centre (UMC) introduced the WHO Drug Dictionary Enhanced (WHO-DDE) Browser. WHO-DDE combines data from the original WHO Drug Dictionary (WHO-DD) with additional country-specific drug information. UMC is responsible for maintaining and publishing the dictionary. UMC releases the WHO Drug Dictionary on a quarterly basis, and organizations can use it under subscription. The next WHODrug Dictionary release is due in June 2017.

Other dictionaries available for medical coding are:

  • COSTART (Coding Symbols for a Thesaurus of Adverse Reaction Terms)
  • ICD (International Classification of Diseases)
  • WHO ART (WHO Adverse Reactions Terminology)


Medical Coding Tools and Methods

Medical coding is performed using dictionaries installed in a software application (the coding tool). Coding specialists use this tool to assign the appropriate codes to the terms. The main features of the tool, and the standard processes by which coding is carried out, are described below.

  1. Auto encoders: A programmatically assisted process for matching a reported term to a dictionary term. In this process, the tool runs a validation during which verbatim terms that have an exact match in the dictionary are auto-coded.
  2. Manual Coding: In the manual coding process, the coding specialist selects an appropriate dictionary entry for each reported term in the coding tool. The coding specialist should be able to raise queries and use the search feature of the tool for efficient coding. Coding tools and applications should be validated systems as per regulatory requirements, with audit trails in place. In manual coding processes, the coding reviewer uses the tool to review the coded terms for accuracy and consistency.
  3. Hybrid Approaches to Coding: This is the most common approach in coding set-up. The reported verbatim terms are first coded automatically by an auto-encoder, either to an exact match in the dictionary or to a term that has previously been coded (i.e., a synonym list). The terms that are not auto-encoded are then manually coded by the coding specialist.
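The hybrid approach can be sketched in a few lines. This is a simplified illustration, not the behaviour of any specific coding tool: the function names, the sample terms, and the case- and whitespace-insensitive matching rule are all assumptions.

```python
def normalise(text):
    """Lower-case and collapse whitespace (an assumed matching rule)."""
    return " ".join(text.lower().split())


def hybrid_code(verbatims, dictionary_terms, synonym_list):
    """Auto-code exact and synonym matches; queue everything else for manual coding."""
    exact = {normalise(term): term for term in dictionary_terms}
    synonyms = {normalise(k): v for k, v in synonym_list.items()}
    coded, manual_queue = {}, []
    for verbatim in verbatims:
        key = normalise(verbatim)
        if key in exact:
            coded[verbatim] = (exact[key], "auto: exact match")
        elif key in synonyms:
            coded[verbatim] = (synonyms[key], "auto: synonym list")
        else:
            manual_queue.append(verbatim)  # left for the coding specialist
    return coded, manual_queue


# Hypothetical dictionary terms and previously coded synonyms.
dictionary_terms = ["Headache", "Nausea"]
synonym_list = {"head ache": "Headache"}

coded, manual_queue = hybrid_code(
    ["HEADACHE", "Head  ache", "felt sick"], dictionary_terms, synonym_list
)
print(coded)
print(manual_queue)
```

Terms in `manual_queue` go to the coding specialist, and an approved manual decision could then be added to the synonym list so that the same verbatim auto-codes next time.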


Reporting of terms

For efficient and correct coding, the verbatim terms recorded by the site should be specific. The reporting conventions should be agreed with the site beforehand and documented. Some examples are listed below:

  • AE term reported should not be ambiguous i.e., the event recorded should have a clear meaning
  • Intensity or the severity of the condition should not be recorded along with AE term
  • Two or more terms should not be reported as one verbatim, e.g. ‘diarrhoea and vomiting’; such terms may be indicative of a single diagnosis
  • Adjectives should not be recorded as the initial words of the AE term
  • Symbols and abbreviations should be avoided while recording AE terms as these may get misinterpreted
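Some of these conventions can be screened for automatically before coding starts. The sketch below is a rough, assumption-laden check: the word list, the symbol set, and the rule that short all-capital tokens are abbreviations are hypothetical heuristics. It does not attempt to detect ambiguous terms or leading adjectives, which need human review.

```python
import re

SEVERITY_WORDS = {"mild", "moderate", "severe"}  # assumed list
SYMBOLS = set("↑↓<>+#@")                         # assumed list


def check_verbatim(term):
    """Return a list of convention issues found in one verbatim AE term."""
    issues = []
    words = re.findall(r"[A-Za-z]+", term.lower())
    if SEVERITY_WORDS & set(words):
        issues.append("severity recorded with term")
    if re.search(r"\band\b|&|,|/", term.lower()):
        issues.append("more than one term in a single verbatim")
    if SYMBOLS & set(term):
        issues.append("symbol used in term")
    if re.findall(r"\b[A-Z]{2,5}\b", term):
        issues.append("possible abbreviation")
    return issues


for verbatim in ["Headache", "Severe headache", "Diarrhoea and vomiting", "↑ BP"]:
    print(verbatim, "->", check_verbatim(verbatim))
```

A flagged term would normally lead to a query to the site rather than to any automatic correction.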

Following the recent MedDRA & WHODrug user group conference held in Bangalore on 6th February 2017, there are several points to consider from a coding perspective:

  • WHODD has been re-branded as WHODrug products
  • If WHODDE is mentioned in the file, it is an enhanced version; if codes are present within the PT term, the file is in B2 format
  • The first release of the full B3/C3 formats is available from 1st March 2017
  • In B3/C3 format, the truncation issue observed in B2/C format is resolved; the field length has been increased to 1500
  • Each dictionary version must be validated for clients, and sponsors should also hold the WHODrug versions that the CRO is working with; this is a mandatory step in dictionary validation
  • In B3/C3 format, the preferred names will be generic
  • In Q1, B2 formats will still be available; from Q2 onwards, only B3 formats will be available
  • MSSO provides free training webinars; see the MSSO website for details
  • A licensed subscriber can send up to 100 change requests to MSSO per month; these are approved or rejected at MSSO’s discretion
  • A Self Service Application (SSA) will be released in 2017


Learn more about how our Clinical Data Management team can support your clinical trial with our medical coding expertise and other services by submitting a Request for Information (RFI).
