This blog provides an overview of efficacy endpoints in oncology studies. We will focus on RECIST (Response Evaluation Criteria In Solid Tumors), a method of assessing how solid tumors change over the course of a study. We will cover the measurements taken for RECIST assessments (target lesions, non-target lesions, new lesions), the possible outcomes (complete response, partial response, stable disease, progressive disease) and what they mean, for example whether the patient recovered, stayed the same or got worse. We will also cover how visit windows and censoring can be handled, and how and why measurement methods are important. Finally, we will cover other endpoints relevant to oncology such as overall survival, objective response rate, best overall response, disease control at X weeks and quality of life.
Response to cancer treatment is not a simple question of whether the subject still has the disease. Endpoints are useful for tracking whether the cancer gets better or worse, and they provide a comparable measure when assessing the efficacy of different treatments.
RECIST stands for Response Evaluation Criteria In Solid Tumors. This method of measuring cancer is often used in clinical trials. It is simpler than some previously used techniques because it relies only on linear (one-dimensional) measurements rather than 2D measurements. It is unambiguous and can be used to generate a selection of endpoints, which makes it useful in clinical trials where different therapies are being compared. It is not widely used outside of clinical trials, where this level of accuracy of assessment is not needed.
Lesions are the solid tumors that are measured and assessed; they are grouped into different types. Target lesions are the tumors that are measured. They are selected at baseline and are usually the largest lesions in an organ; however, they must be suitable for repeated measurement, which includes criteria such as being easy to identify. The longest diameter is measured for each target lesion and the sum of longest diameters is calculated. Non-target lesions are all other lesions; their positions are noted and the state of each lesion is described. If extra lesions appear during a study these are documented as new lesions.
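As a minimal sketch of the sum of longest diameters, the Python below adds up the longest diameter of each target lesion at a visit and expresses the change from baseline as a percentage. The lesion names and measurements are made up for illustration, and the function name is hypothetical.

```python
def sum_of_longest_diameters(target_lesions_mm):
    """Return the sum of longest diameters (SLD) in mm for one visit.

    target_lesions_mm maps a lesion identifier to its longest diameter in mm.
    """
    return sum(target_lesions_mm.values())


# Illustrative (made-up) measurements for the same target lesions at two visits.
baseline = {"liver_1": 32.0, "lung_1": 18.0, "lung_2": 15.0}
week_8 = {"liver_1": 20.0, "lung_1": 12.0, "lung_2": 10.0}

baseline_sld = sum_of_longest_diameters(baseline)
week_8_sld = sum_of_longest_diameters(week_8)

# Percentage change from baseline; negative values mean the tumor burden shrank.
pct_change = 100.0 * (week_8_sld - baseline_sld) / baseline_sld
print(baseline_sld, week_8_sld, round(pct_change, 1))
```

Non-target and new lesions are not measured, so they do not enter this sum.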
RECIST has 4 responses to rate how a patient is doing: complete response, partial response, stable disease and progressive disease. Complete response is the best result, with all of the lesions disappearing. Progressive disease is the worst, with the sum of longest diameters of the target lesions increasing by at least 20% from the smallest sum recorded on the study. The patient would also be said to have progressed if the physician makes the subjective assessment that the non-target lesions have progressed, or if any new lesions are found. Partial response is when the sum of longest diameters has reduced by at least 30% from baseline; this is a positive response although not as good as complete response. Stable disease is when the sum of longest diameters does not meet the criteria for progressive disease or partial response.
When a response is being assigned to a visit, the non-target lesions and any new lesions are considered together with the result from the target lesions. Some studies call for a confirmed response, where the response must be seen at 2 consecutive visits before the patient can be assigned that response.
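The rules above can be sketched in Python as follows. This is a simplified illustration, not a validated implementation: the function names and the status codes ("CR", "PR", "SD", "PD", "NON-CR/NON-PD") are assumptions, the 20% increase is taken relative to the smallest sum of longest diameters seen so far (the nadir), the 30% decrease is taken relative to baseline, and confirmation of response is not handled.

```python
def target_response(sld, baseline_sld, nadir_sld):
    """Classify the target-lesion response from sums of longest diameters (mm)."""
    if sld == 0:
        return "CR"  # all target lesions have disappeared
    if nadir_sld == 0 or (sld - nadir_sld) / nadir_sld >= 0.20:
        return "PD"  # at least a 20% increase from the smallest sum on study
    if (baseline_sld - sld) / baseline_sld >= 0.30:
        return "PR"  # at least a 30% decrease from baseline
    return "SD"


def visit_response(target, non_target, new_lesions):
    """Combine target, non-target and new-lesion findings into one visit response.

    non_target is assumed to be one of "CR", "NON-CR/NON-PD" or "PD".
    """
    if new_lesions or target == "PD" or non_target == "PD":
        return "PD"
    if target == "CR":
        # A complete response needs the non-target lesions to have gone as well.
        return "CR" if non_target == "CR" else "PR"
    return target  # "PR" or "SD"


print(visit_response(target_response(40, 100, 40), "NON-CR/NON-PD", False))
```

Checking progression before partial response means a subject whose tumors shrank and then regrew is classed as progressed, even if the sum is still more than 30% below baseline.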
Methods of Measurements
Computed tomography (CT) is generally the preferred method of measurement as it is easily reproducible and the measurements can be independently reviewed. Spiral CT and MRI are often allowed if it is felt that this will not limit the outcome of the study. X-rays are generally not recommended due to lack of accuracy and repeatability. Ultrasound should not be used as the measurements may not be reproducible and cannot be independently reviewed. Once a method of measurement has been picked for a subject it should be maintained throughout the study so that measurements can be compared between visits. Some studies will require lesions from all subjects to be measured in the same way to ensure that all data is consistent and comparable.
All measuring techniques have limits of accuracy and this can make measuring small lesions difficult. The doctor may be sure that the lesion has gone and give a longest diameter of 0, or they may think that the tumor is present but cannot measure it accurately, in which case they would record it as <Xmm, where X is the default value for lesions too small to measure. X will then be used in RECIST calculations. Sometimes lesions are flagged as non-measurable because they are too big, for example too large to fit in the scan or for the instruments used to measure them. In some cases it is already clear that a patient has progressed (e.g. because of new lesions), so the doctor decides not to put the subject through the scanning process and flags those lesions as non-evaluable (NE) or too big.
When lesions have been irradiated they are no longer eligible to be used in RECIST calculations, as the size changes may be due to the radiation rather than the treatment in question. These lesions become non-evaluable, which may lead to the subject being censored.
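A small sketch of how recorded values could be turned into numbers for RECIST calculations is shown below. The text formats accepted here ("<5mm", "12.5 mm", "0", "NE") and the function name are assumptions for illustration; real studies define their own data collection conventions.

```python
def diameter_for_calculation(recorded):
    """Convert a recorded longest diameter to mm, or None if non-evaluable.

    "<Xmm" means the lesion is present but too small to measure, so the
    default value X itself is used in the calculations.
    """
    text = recorded.strip().lower().replace("mm", "").strip()
    if text in ("", "ne"):
        return None  # non-evaluable: no number can be used
    if text.startswith("<"):
        return float(text[1:])  # use the default value X
    return float(text)


print(diameter_for_calculation("<5mm"))     # too small to measure
print(diameter_for_calculation("12.5 mm"))  # ordinary measurement
print(diameter_for_calculation("NE"))       # non-evaluable
```

Returning None rather than 0 for non-evaluable lesions keeps a missing measurement from being mistaken for a lesion that has disappeared.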
If a subject comes in early for a visit and has a response recorded, this may bias the data, as that response would not have been available until the next scheduled visit if the subject had not come in early. Measurements should therefore be taken at consistent time points so that data is comparable between subjects. Windows are applied around visits so that as much data as possible can be used without biasing the results.
Measurements may not all be done on the same day for a particular visit. For instance, scans for target lesions in one organ may be done on one day and scans for another organ on a different day. If all the scans are done within the window for that visit, they can all be used as part of the overall visit calculation. However, the dates are used differently when assigning responses: progressive disease is recorded on the date of the first scan used as part of the progressive disease calculation, whereas complete or partial response is recorded on the date of the final scan.
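The windowing and date rules can be sketched as below. The window width (`window_days`) is a hypothetical parameter, since each study defines its own windows, and using the final scan date for stable disease as well is an assumption that the text does not cover.

```python
from datetime import date, timedelta


def scans_in_window(scan_dates, scheduled, window_days=7):
    """Return the scan dates that fall inside the window around a scheduled visit."""
    earliest = scheduled - timedelta(days=window_days)
    latest = scheduled + timedelta(days=window_days)
    return sorted(d for d in scan_dates if earliest <= d <= latest)


def response_date(response, scan_dates):
    """Pick the date a visit response is recorded on.

    Progression uses the first scan date; other responses use the final scan date.
    """
    if not scan_dates:
        return None
    return min(scan_dates) if response == "PD" else max(scan_dates)


scans = [date(2024, 2, 27), date(2024, 3, 4), date(2024, 3, 20)]
usable = scans_in_window(scans, scheduled=date(2024, 3, 1))
print(usable, response_date("PD", usable), response_date("PR", usable))
```

Taking the earliest date for progression and the latest for response is the conservative choice in both directions: it never credits a subject with a later progression or an earlier response than the scans support.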
Missing Data and Censoring
If only a small amount of data is missing at a visit there may be ways of dealing with it. In some studies, if a value is missing, the sum of longest diameters is calculated by scaling the data that is present. If the sum of longest diameters from the tumors that were measured already gives progression when the missing tumors are treated as having a length of 0, then progression can be recorded regardless of the missing data. Irradiated lesions are classed as missing, as noted before, so they can lead to censoring.
If too much data is missing then scaling would not be reliable and it may not be possible to be sure of progression, so that visit will be classed as non-evaluable. If a subject has a visit classed as non-evaluable, or misses a visit, and then progresses at the next visit, you cannot be sure when they progressed, so their data may be censored at their last evaluable scheduled visit. Similarly, if measurements are missed so that a complete visit response cannot be applied, the visit may be classed as non-evaluable and the subject will be censored at the last evaluable visit.
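One possible way to handle a partly missing visit is sketched below. The scaling formula (scale the observed sum by the baseline share of the lesions that were measured) and the one-third limit on missing lesions are assumptions chosen for illustration; the actual rules are defined per study.

```python
def scaled_sld(current_mm, baseline_mm, max_missing_fraction=1 / 3):
    """Estimate the sum of longest diameters when some lesions are missing.

    current_mm maps lesion id to a diameter in mm, or None if missing.
    Returns None (visit non-evaluable) if too many lesions are missing.
    """
    present = {k: v for k, v in current_mm.items() if v is not None}
    missing = len(baseline_mm) - len(present)
    if missing / len(baseline_mm) > max_missing_fraction:
        return None
    baseline_total = sum(baseline_mm.values())
    baseline_present = sum(baseline_mm[k] for k in present)
    return sum(present.values()) * baseline_total / baseline_present


def progressed_despite_missing(current_mm, nadir_sld):
    """True if the measured lesions alone already show a 20% increase from nadir."""
    observed = sum(v for v in current_mm.values() if v is not None)
    return nadir_sld > 0 and (observed - nadir_sld) / nadir_sld >= 0.20


baseline = {"a": 30.0, "b": 20.0, "c": 10.0}
print(scaled_sld({"a": 24.0, "b": 16.0, "c": None}, baseline))
print(scaled_sld({"a": 24.0, "b": None, "c": None}, baseline))
print(progressed_despite_missing({"a": 50.0, "b": 30.0, "c": None}, nadir_sld=60.0))
```

Treating missing lesions as 0 in the progression check is conservative: if progression is reached without them, any real measurement could only increase the sum further.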
RECIST version 1.0 was released in 2000 and version 1.1 was released in 2009. Some differences are due to measuring methods becoming more accurate. The maximum number of target lesions has been reduced (from 10 to 5 in total, and from 5 to 2 per organ), as fewer are needed to give a representative survey of tumor burden. Progressive disease now also requires an absolute increase of at least 5mm: with very small tumors being measurable, it is possible to have a 20% increase that is only 2-3mm, which may be due to measurement variance rather than progression. Guidance has been added, or measurability updated, for lymph nodes, bone lesions and other tumors which can behave differently to typical tumors.
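The effect of the 5mm rule can be illustrated with a short sketch. The function names are hypothetical and only the target-lesion part of the progression criteria is shown.

```python
def target_progression_v10(sld, nadir_sld):
    """Version 1.0 style check: at least a 20% increase from the smallest sum."""
    increase = sld - nadir_sld
    return increase > 0 and increase >= 0.20 * nadir_sld


def target_progression_v11(sld, nadir_sld):
    """Version 1.1 style check: 20% increase AND an absolute increase of 5mm."""
    increase = sld - nadir_sld
    return increase >= 0.20 * nadir_sld and increase >= 5.0


# A small tumor burden: 12mm growing to 15mm is a 25% increase but only 3mm.
print(target_progression_v10(15.0, 12.0), target_progression_v11(15.0, 12.0))
# A larger burden: 50mm growing to 62mm passes both checks.
print(target_progression_v10(62.0, 50.0), target_progression_v11(62.0, 50.0))
```

The first case is exactly the situation the absolute threshold is meant to catch: a change that is large in relative terms but within plausible measurement variance.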
Many oncology efficacy endpoints are based on the numbers or responses from RECIST measurements. Other endpoints use questionnaires answered by the subjects, safety data such as the number and seriousness of adverse events (AEs), or laboratory and biomarker values.
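Progression free survival (PFS) is the time from a defined starting point, typically randomization or the start of treatment rather than diagnosis, to RECIST progression being recorded (in many studies, death from any cause also counts as an event). Overall survival (OS) is the time from the same starting point to death from any cause. If PFS and OS can be shown to be correlated, PFS can be used to simplify or shorten studies; however, it is not always the case that increasing PFS for a subject will increase their OS.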
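A minimal sketch of deriving a PFS time with censoring from a subject's visit responses is shown below. It assumes the time origin is a protocol-defined start date, that a progression immediately following a non-evaluable (NE) visit is censored at the last evaluable visit, and that deaths are handled elsewhere; the function name and data layout are hypothetical.

```python
from datetime import date


def pfs_days(start, visits):
    """Return (days, event) for one subject.

    visits is a chronological list of (visit_date, response) tuples.
    event is True for a recorded progression and False if censored.
    """
    last_evaluable = start
    previous_ne = False
    for visit_date, response in visits:
        if response == "PD":
            if previous_ne:
                # Progression time is uncertain: censor at last evaluable visit.
                return (last_evaluable - start).days, False
            return (visit_date - start).days, True
        if response == "NE":
            previous_ne = True
        else:
            last_evaluable = visit_date
            previous_ne = False
    # No progression seen: censor at the last evaluable visit.
    return (last_evaluable - start).days, False


start = date(2024, 1, 1)
clean = [(date(2024, 2, 26), "SD"), (date(2024, 4, 22), "PR"), (date(2024, 6, 17), "PD")]
gap = [(date(2024, 2, 26), "SD"), (date(2024, 4, 22), "NE"), (date(2024, 6, 17), "PD")]
print(pfs_days(start, clean), pfs_days(start, gap))
```

The resulting (time, event) pairs are the usual input to a time-to-event analysis such as a Kaplan-Meier estimate.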
Disease control at X weeks is used to compare treatments at a preselected time point, e.g. take the RECIST response at 24 weeks and summarize the data to compare the treatments. As noted for PFS, if this can be shown to reflect survival it can be used to reduce trial times and give the patient and physician an earlier indication of efficacy.
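As a sketch, a disease control rate at a fixed time point can be summarized per treatment arm as the proportion of subjects whose response at that time point is complete response, partial response or stable disease. Counting non-evaluable (NE) subjects as not controlled is an assumption here, and the data is made up.

```python
DISEASE_CONTROL = {"CR", "PR", "SD"}


def disease_control_rate(responses):
    """Proportion of subjects with CR, PR or SD at the chosen time point."""
    if not responses:
        return None
    controlled = sum(1 for r in responses if r in DISEASE_CONTROL)
    return controlled / len(responses)


# Illustrative week-24 responses for two hypothetical treatment arms.
week_24 = {
    "arm_a": ["PR", "SD", "PD", "CR"],
    "arm_b": ["PD", "SD", "NE", "PD"],
}
rates = {arm: disease_control_rate(r) for arm, r in week_24.items()}
print(rates)
```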
Best overall response (BoR) is the best RECIST response the subject has had at any visit during the study. It is possible that their RECIST response has deteriorated between that time point and the end of the study. Best response per subject can then be summarized for comparison between treatments.
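Best overall response can be sketched as picking the highest-ranked response across a subject's visits. The ranking below, with non-evaluable (NE) below progressive disease, is an assumption, and confirmation requirements are ignored. The same per-subject result can then feed an objective response rate, taken here as the proportion of subjects whose best response is complete or partial response.

```python
# Higher rank means a better response; placing NE lowest is an assumption.
RANK = {"CR": 4, "PR": 3, "SD": 2, "PD": 1, "NE": 0}


def best_overall_response(visit_responses):
    """Return the best RECIST response seen at any visit ("NE" if there are none)."""
    return max(visit_responses, key=RANK.__getitem__, default="NE")


def objective_response_rate(best_responses):
    """Proportion of subjects whose best overall response is CR or PR."""
    responders = sum(1 for r in best_responses if r in ("CR", "PR"))
    return responders / len(best_responses)


subjects = {
    "001": ["SD", "PR", "PD"],  # responded, then deteriorated
    "002": ["SD", "SD"],
    "003": ["PR", "CR"],
    "004": ["PD"],
}
best = {sid: best_overall_response(v) for sid, v in subjects.items()}
print(best, objective_response_rate(list(best.values())))
```

Subject 001 shows the point made above: the best response is PR even though the subject had progressed by the last visit.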
Quality of Life
The subjects may be given questionnaires to assess their quality of life (QoL). One style of questionnaire asks the subject to rate particular aspects of their life over the past week on a scale of 1-5. There are standard items used for all cancer types, such as pain, impact on life and how the subject feels, and there are additional items specific to the relevant cancer, e.g. shortness of breath for lung cancer. Questionnaire data will then be summarized for each item, and a total score calculated and summarized.
Adverse events will be tracked for safety, and these may also be summarized for quality of life. For example, if a drug affects vision the subjects may not be allowed to drive, and some subjects may find this more limiting than other side effects. Something else that may be considered is the cost of health care. It is generally accepted that cancer drugs are not cheap; however, if the drugs benefit the subject such that they have fewer symptoms and so need less treatment or hospitalization, then the overall cost may be minimal or cheaper than the alternative. This may in turn free up resources to benefit others.
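The questionnaire summary described above could look something like this sketch. The item names and answers are made up, and real instruments have their own validated scoring rules, so a simple sum and mean are used only for illustration.

```python
from statistics import mean


def total_score(answers):
    """Total score for one subject: the sum of their 1-5 item ratings."""
    return sum(answers.values())


def item_means(all_answers):
    """Mean rating per item across all subjects."""
    items = all_answers[0].keys()
    return {item: mean(a[item] for a in all_answers) for item in items}


# Two hypothetical lung cancer subjects, each rating three items from 1 to 5.
questionnaires = [
    {"pain": 2, "impact_on_life": 3, "shortness_of_breath": 4},
    {"pain": 4, "impact_on_life": 3, "shortness_of_breath": 2},
]
totals = [total_score(q) for q in questionnaires]
print(totals, item_means(questionnaires))
```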
RECIST gives us a clear way to track and compare tumor burden for cancer patients. This allows us to calculate many endpoints, which is good for presenting results and comparing treatments. However, when discussing endpoints and treatment options it is worth considering all the different factors to find out what is important to the subject, e.g. length of life, reduced tumor burden or better quality of life, as the best treatment option for one person may not be the best for others.