In Rheumatoid Arthritis (RA) clinical trials, treatment response is often assessed via ACR20, the American College of Rheumatology (ACR) composite responder criterion. This binary criterion combines several indices of disease activity to capture symptom reduction. It equals 1 if an improvement of at least 20% between the baseline and the post-baseline measurement is observed in both the tender and swollen joint counts and in at least three of five other indicators (C-Reactive Protein, patient global assessment of disease activity, physician global assessment of disease activity, patient pain scale, Health Assessment Questionnaire Disability Index), and 0 otherwise.
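The responder rule can be written out as a small function. The following Python sketch is illustrative only: the dictionary keys, the helper names and the handling of a zero baseline are assumptions made for this example, and it assumes that lower values mean less disease activity on every measure.

```python
# Hypothetical field names for the five "other" indicators listed above.
OTHER_INDICATORS = ("crp", "patient_global", "physician_global", "pain", "haq_di")
JOINT_COUNTS = ("tender_joints", "swollen_joints")


def pct_improvement(baseline, post):
    """Percentage reduction from baseline (lower scores = less disease activity)."""
    if baseline <= 0:
        return 0.0  # assumption: no measurable improvement from a zero baseline
    return 100.0 * (baseline - post) / baseline


def acr20(baseline, post, threshold=20.0):
    """Return 1 if the ACR20 rule described above is met, 0 otherwise."""
    joints_ok = all(
        pct_improvement(baseline[k], post[k]) >= threshold for k in JOINT_COUNTS
    )
    n_improved = sum(
        pct_improvement(baseline[k], post[k]) >= threshold for k in OTHER_INDICATORS
    )
    return int(joints_ok and n_improved >= 3)


# Made-up subject: both joint counts and three of five indicators improve by >= 20%.
baseline = {"tender_joints": 20, "swollen_joints": 15, "crp": 10.0,
            "patient_global": 60, "physician_global": 55, "pain": 70, "haq_di": 1.5}
week_24 = {"tender_joints": 14, "swollen_joints": 10, "crp": 9.5,
           "patient_global": 40, "physician_global": 50, "pain": 50, "haq_di": 1.0}
responder = acr20(baseline, week_24)
```

In this made-up example the subject counts as a responder because patient global assessment, pain and HAQ-DI each improve by more than 20%, while CRP and physician global assessment do not.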
When the similarity of a new biosimilar to a reference product is being established, equivalence is often evaluated by assessing whether the two-sided confidence interval for the difference in the proportions of ACR20 responders at a given time-point lies within a pre-specified equivalence margin (e.g. -12% to 12%). In this context, equivalence between treatments across time-points is often overlooked, even though both regulators and clinicians have raised it as an important issue.
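As an illustration of the single time-point assessment, here is a minimal Python sketch. It assumes a simple Wald (Normal approximation) interval for the difference in proportions, which is only one of several possible interval methods and is not necessarily the one used in any given trial. The function names and the responder counts are hypothetical.

```python
from statistics import NormalDist


def risk_difference_ci(x_test, n_test, x_ref, n_ref, level=0.95):
    """Wald confidence interval for the difference in responder proportions."""
    p_test, p_ref = x_test / n_test, x_ref / n_ref
    diff = p_test - p_ref
    se = (p_test * (1 - p_test) / n_test + p_ref * (1 - p_ref) / n_ref) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


def is_equivalent(lower, upper, margin=0.12):
    """Equivalence is concluded when the whole interval lies inside (-margin, margin)."""
    return -margin < lower and upper < margin


# Hypothetical counts: 130/200 responders on the biosimilar, 126/200 on the reference.
diff, lower, upper = risk_difference_ci(130, 200, 126, 200)
equivalent = is_equivalent(lower, upper, margin=0.12)
```

With the same observed proportions but only 100 subjects per arm, the interval widens and no longer fits inside the margin, which is why sample size drives the power to detect equivalence.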
In this blog post we discuss alternative methods for assessing equivalence across time-points and present results from a small simulation study.
An exponential time-response model
In a previous blog post on treatment response curves we introduced the exponential model suggested by Reeve et al. [1] to describe the ACR20 time-response pattern in subjects with RA receiving a biologic treatment in addition to Methotrexate (MTX). The model has three parameters: β, which describes the slope of the curve; α, which describes the reference treatment ‘peak effect’ (the horizontal asymptote of the reference curve); and θ, which describes the additional effect of the new treatment. Such a model can easily be fitted in SAS using PROC NLMIXED, as follows:
proc nlmixed data = <dataset>;
parms theta = <theta> beta = <beta> alpha = <alpha>; /* initial values */
pred = (alpha + theta*treatment)*(1 - exp(-beta*week)); /* predicted response probability */
predict pred out = nlmixout; /* save the predicted probabilities */
model acr20n ~ binary(pred);
run;
In Figure 1 two potential patterns are displayed, with the actual values of the model parameters reported for the sake of illustration.
Figure 1. ACR20 response pattern according to Reeve’s model. Left panel displays a parallel curve pattern (identical slope between treatments), right panel displays a crossed curves pattern (different slopes between treatments).
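To make the shape of the model concrete, the short Python sketch below evaluates the response probability (α + θ·treatment)·(1 − exp(−β·week)) for both arms. The parameter values are hypothetical and chosen only for illustration; they are not the values behind Figure 1.

```python
import math


def acr20_probability(week, treatment, alpha, beta, theta):
    """Expected ACR20 response rate at a given week under the exponential model.

    treatment is 0 for the reference product and 1 for the new treatment.
    """
    return (alpha + theta * treatment) * (1.0 - math.exp(-beta * week))


# Hypothetical parameter values, for illustration only.
params = {"alpha": 0.60, "beta": 0.40, "theta": -0.05}
weeks = [0, 2, 4, 8, 12, 16, 24]
reference_curve = [acr20_probability(w, 0, **params) for w in weeks]
test_curve = [acr20_probability(w, 1, **params) for w in weeks]
```

Both curves start at zero and flatten out towards α for the reference and α + θ for the new treatment. Because β is shared between arms in this parameterisation, the sketch corresponds to the parallel curves pattern.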
In the same blog post we also described a method, proposed in two published papers [2-3], to assess equivalence across time-points. This method, called ‘2-norm’, takes the squared differences between treatments along the fitted exponential curves and uses the square root of their sum as a measure of ‘distance’ between treatments. The result can then be compared with an equivalence margin obtained from historical data, i.e. via meta-analysis.
The main limitation of the 2-norm approach is that it doesn’t allow a direct comparison with the efficacy metric of interest, i.e. the response rate and its difference between treatments. To circumvent this limitation we investigated two different approaches: a weighted mean approach and a Generalized Estimating Equations (GEE) approach.
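A minimal Python sketch of a 2-norm calculation is given below. It approximates the sum of squared differences by a trapezoidal integral over a weekly grid, which is one possible discretisation. The curve parameters are hypothetical and the function names are introduced here only for illustration.

```python
import math


def fitted_curve(weeks, treatment, alpha, beta, theta):
    """Response probabilities from the exponential model at the given weeks."""
    return [(alpha + theta * treatment) * (1.0 - math.exp(-beta * w)) for w in weeks]


def two_norm(weeks, curve_test, curve_ref):
    """Square root of the trapezoidal integral of the squared curve difference."""
    sq_diff = [(t - r) ** 2 for t, r in zip(curve_test, curve_ref)]
    area = sum(
        (weeks[i + 1] - weeks[i]) * (sq_diff[i] + sq_diff[i + 1]) / 2.0
        for i in range(len(weeks) - 1)
    )
    return math.sqrt(area)


weeks = list(range(0, 25))  # weekly grid from 0 to 24 weeks
ref = fitted_curve(weeks, 0, alpha=0.60, beta=0.40, theta=-0.05)  # hypothetical values
test = fitted_curve(weeks, 1, alpha=0.60, beta=0.40, theta=-0.05)
distance = two_norm(weeks, test, ref)
```

The resulting distance depends on the length of the time window as well as on the size of the treatment difference, so it does not sit on the same scale as a difference in response rates.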
The alternative methods
a) Weighted mean approach
The weighted mean approach involves estimating the difference in response rates at each time-point, together with its standard error, and then averaging these differences using the inverse of the standard error as the weight. To make inference on this weighted mean, we estimate a weighted standard error as follows:
where N is the number of time-points, d_i is the difference at the i-th time-point, d̄ is the weighted mean, w_i is the weight for the i-th difference, V1 is the sum of the weights and V2 is the sum of the squared weights. A confidence interval at the prescribed confidence level can then be obtained using the Normal approximation.
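The calculation can be sketched in Python as follows. The weighted variance and standard error used here are an assumption: they follow the standard reliability-weights formulas built from the quantities defined above (sum of weights V1, sum of squared weights V2) and may differ from the exact expression used in the study. The function name and the input values are hypothetical.

```python
from statistics import NormalDist


def weighted_mean_ci(diffs, std_errors, level=0.95):
    """Weighted mean of per-time-point differences with a Normal-approximation CI.

    Assumed formulas (not confirmed by the source):
      weighted variance   s2 = sum(w_i * (d_i - d_bar)**2) / (V1 - V2 / V1)
      SE of weighted mean se = sqrt(s2 * V2) / V1
    """
    weights = [1.0 / se for se in std_errors]  # inverse standard error as weight
    v1 = sum(weights)
    v2 = sum(w * w for w in weights)
    d_bar = sum(w * d for w, d in zip(weights, diffs)) / v1
    s2 = sum(w * (d - d_bar) ** 2 for w, d in zip(weights, diffs)) / (v1 - v2 / v1)
    se = (s2 * v2) ** 0.5 / v1
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d_bar, se, d_bar - z * se, d_bar + z * se


# Hypothetical differences in ACR20 response rates at three time-points.
d_bar, se, lower, upper = weighted_mean_ci([0.00, 0.02, 0.04], [0.05, 0.05, 0.05])
```

With equal weights, as in this example, the weighted mean reduces to the ordinary mean and the assumed standard error reduces to the usual standard error of a mean, which is a convenient sanity check. At least two time-points with non-zero standard errors are needed.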
b) GEE approach
This approach relies on estimating a GEE model for binary data using an identity link to obtain the difference in proportions across all relevant time-points alongside its confidence interval. This can be done using the following example code:
proc genmod data = <dataset> descending;
class treatment(ref = 'Control') subjid week;
model acr20n = treatment week treatment*week / dist = bin link = identity alpha = 0.05 cl;
lsmeans treatment / ilink diff cl; /* difference in proportions with confidence limits */
repeated subject = subjid / type = exch; /* exchangeable working correlation */
run;
Starting from the exponential model of Reeve et al. described earlier, we simulated ACR20 data using different values of the model parameters and an overall sample size ranging from 50 to 250, in order to assess the power of the two alternative approaches to detect equivalence.
In Figure 2 we report a 3D plot showing the values of the 2-norm (obtained using the trapezoidal rule) for a variety of parameter values. The shape of the surfaces suggests that 2-norm values are affected more by the slope β (Panel A) and the differential treatment effect θ (Panel B) than by the reference drug effect α. It also highlights, as mentioned above, that these values cannot be mapped to any useful scale in terms of equivalence margins, which makes them very difficult to interpret.
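A data-generating step of this kind might look like the Python sketch below. It is a simplified illustration, not the simulation code used for the study: visits are drawn independently within each subject (no within-subject correlation), subjects are allocated 1:1 by alternation, and the parameter values and visit schedule are hypothetical.

```python
import math
import random


def simulate_trial(n_total, alpha, beta, theta, weeks, seed=None):
    """Simulate long-format ACR20 data as (subject, treatment, week, response) rows.

    Assumption: responses at different visits are independent Bernoulli draws.
    """
    rng = random.Random(seed)
    rows = []
    for subject in range(n_total):
        treatment = subject % 2  # alternate arms to obtain a 1:1 allocation
        for week in weeks:
            p = (alpha + theta * treatment) * (1.0 - math.exp(-beta * week))
            p = min(max(p, 0.0), 1.0)  # keep the probability inside [0, 1]
            rows.append((subject, treatment, week, int(rng.random() < p)))
    return rows


visits = [0, 2, 4, 8, 12, 16, 24]  # hypothetical visit schedule (weeks)
data = simulate_trial(200, alpha=0.60, beta=0.40, theta=-0.05, weeks=visits, seed=2016)
```

Each simulated dataset can then be passed to the analysis method under study, and the proportion of datasets in which equivalence is declared gives the estimated power.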
Figure 2. Estimated 2-norm values under different parameter values for Reeve’s model. Panel A is based on a fixed θ parameter (= -0.05) and panel B on a fixed β parameter (= 0.40). A sample size n of 200 was considered here for the sake of illustration.
In Figure 3 we report the power curves of the weighted mean and GEE approaches for detecting equivalence between treatments in the parallel curves setting, for given values of the model parameters. The figure was obtained using 500 simulated datasets and a -10% to 10% equivalence margin. The GEE approach slightly outperforms the weighted mean approach, though the two curves follow very similar trends.
Figure 3. Power comparison between the Weighted Mean and GEE approach (range considered: 0 to 24 weeks). A 1:1 allocation ratio was considered for the sake of simplicity.
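The way such a power curve is built can be sketched in Python for the weighted mean approach. This is a rough illustration under several assumptions that are not taken from the study: visits are simulated independently, the weighted standard error uses the standard reliability-weights formula, time-points with a zero standard error are skipped, and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_counts(n_per_arm, alpha, beta, theta, weeks, rng):
    """Responder counts by (arm, week); visits drawn independently (assumption)."""
    counts = {}
    for arm in (0, 1):
        for week in weeks:
            p = (alpha + theta * arm) * (1.0 - math.exp(-beta * week))
            counts[arm, week] = sum(rng.random() < p for _ in range(n_per_arm))
    return counts


def weighted_mean_ci(diffs, ses, level=0.95):
    """CI for the inverse-SE weighted mean (assumed reliability-weights variance)."""
    w = [1.0 / s for s in ses]
    v1, v2 = sum(w), sum(x * x for x in w)
    d_bar = sum(wi * di for wi, di in zip(w, diffs)) / v1
    s2 = sum(wi * (di - d_bar) ** 2 for wi, di in zip(w, diffs)) / (v1 - v2 / v1)
    half = NormalDist().inv_cdf(0.5 + level / 2) * math.sqrt(s2 * v2) / v1
    return d_bar - half, d_bar + half


def estimate_power(n_per_arm, alpha, beta, theta, weeks, margin=0.10,
                   n_sims=500, seed=1):
    """Share of simulated trials whose CI lies inside (-margin, margin)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        counts = simulate_counts(n_per_arm, alpha, beta, theta, weeks, rng)
        diffs, ses = [], []
        for week in weeks:
            p1, p0 = counts[1, week] / n_per_arm, counts[0, week] / n_per_arm
            se = math.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / n_per_arm)
            if se > 0:  # skip time-points where the SE cannot be used as a weight
                diffs.append(p1 - p0)
                ses.append(se)
        if len(diffs) < 2:
            continue  # not enough usable time-points; counted as a failure
        lower, upper = weighted_mean_ci(diffs, ses)
        hits += int(-margin < lower and upper < margin)
    return hits / n_sims


visits = [2, 4, 8, 12, 16, 24]  # hypothetical post-baseline visits (weeks)
power = estimate_power(100, alpha=0.60, beta=0.40, theta=0.0, weeks=visits, n_sims=200)
```

Repeating the call over a grid of sample sizes gives one power curve. A GEE-based curve would follow the same loop with the interval taken from the fitted model instead.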
The proposed methods are valid alternatives for assessing treatment equivalence over a range of time-points (e.g. they can be used to assess equivalence once the two treatments have reached their steady state, or equivalence at early time-points), and they yield a measure that can be directly compared with clinically meaningful equivalence margins. The GEE method performs slightly better in terms of power and has the advantage of explicitly taking into account the correlated nature of the ACR20 values over time. Further investigation is needed to assess the potential of both methods when data are missing due to drop-outs.
This article was originally presented as a Quanticate poster by our statistical consultancy group at the annual PSI ‘Promoting Statistical Insight and Collaboration in Drug Development’ conference in Berlin, Germany in May 2016.
- Reeve R, Pang L, Ferguson B, O’Kelly M, Berry S and Xiao W. Rheumatoid Arthritis Disease Progression Modeling. Ther Innov Regul Sci 2013, 47(6): 641-650
- Choe J, Prodanovic N, Niebrzydowski, Staykov I, Dokoupilova E, Baranauskaite A, Yatsyshyn R, Mekic M, Porawska W, Ciferska H, Jedrychowicz-Rosiak K, Zielinska A, Choi J, Rho YH and Smolen JS. A randomized, double-blind, phase III study comparing SB2, an infliximab biosimilar, to the infliximab reference product Remicade in patients with moderate to severe rheumatoid arthritis despite methotrexate therapy. Ann Rheum Dis 2015.
- Emery P, Vencovský J, Sylwestrzak A, Leszczyński P, Porawska W, Baranauskaite A, Tseluyko V, Zhdan VM, Stasiuk B, Milasiene R, Barrera Rodriguez AA, Cheong SY and Ghil J. A phase III randomised, double-blind, parallel-group study comparing SB4 with etanercept reference product in patients with active rheumatoid arthritis despite methotrexate therapy. Ann Rheum Dis 2015.