Forum: help
Monitor Forum | Start New ThreadRE: Simulation Study for Left censored data [ Reply ] By: Ott Toomet on 2021-05-08 16:55 | [forum:48942] |
Sorry, could not check this for a week... My reading from this result is that the log-likelihood is wrong. The warnings hint that it did not converge well (landed in a non-concave region) and the parameter estimates look weird too. But I am not sure what kind of data did you feed into it. It is also possible that the log-likelihood is not wrong but has just a shape that is hard for the optimizers to figure out, but in my practice this is much rarer than having it coded wrong. You can test the results with no censoring--do you get correct results? The log-likelihood is pretty simple in that case. And then start adding a little bit censored observations, the results should be stable, although the estimates get more and more imprecise. But the estimated values should not drift somewhere else. |
RE: Simulation Study for Left censored data [ Reply ] By: Clara Gru on 2021-04-27 09:29 | [forum:48934] |
-------------------------------------------- Maximum Likelihood estimation Newton-Raphson maximisation, 26 iterations Return code 8: successive function values within relative tolerance limit (reltol) Log-Likelihood: 14560.28 2 free parameters Estimates: Estimate Std. error t value Pr(> t) theta1 5.730e-02 1.278e-03 44.83 <2e-16 *** lambda1 1.492e+05 NA NA NA --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 -------------------------------------------- theta1 lambda1 5.730118e-02 1.491954e+05 theta1 lambda1 theta1 1.633591e-06 -3.267183e-06 lambda1 -3.267183e-06 -5.497493e-01 [1] 0.056 221982.448 [1] -0.444 221980.448 [1] 1.974000e-01 1.649563e+21 Warning messages: 1: In sqrt(diag(vc)) : NaNs produced 2: In sqrt(diag(vc)) : NaNs produced |
RE: Simulation Study for Left censored data [ Reply ] By: Clara Gru on 2021-04-27 09:27 | [forum:48933] |
Already fixed the code sir Ott for the left censoring part. However, I'm still having a hard time translating the loglikelihood function. I get an error seen below. What can you suggest I should do? Thank you. ---------------------- cg <- function(n, nsim, cens, theta1, lambda1) { #start of cg ## n: number of obs ## cens: percentage of censored obs ## theta<-c();lambda<-c() for(i in 1:nsim) { # for nsim fxn <- function(n, theta1, lambda1) { ## Burr-V RNG U <- runif(n, 0, 1) y <- atan(log(lambda1/(U^(-1/theta1) - 1))) y } library(maxLik) ## Generate Burr-V random sample with left censoring x <- fxn(n, theta1, lambda1) # orig data y <- sort(x, decreasing = TRUE) r <- n-(n*cens) # Creating Event and Censoring Indicators # TRUE is event, FALSE is censored if(r > 0) delta <- y > y[r] # indicates the obs is not censored else delta <- rep(TRUE, n) yNC <- y[y > y[r]] # non-censored y values y[!delta] <- y[r] #cat("x:",x,"\n") #CHECKING VALUES #cat("y:",y,"\n") #cat("y[r]:",y[r],"\n") #cat("delta:",delta,"\n") #cat("yNC:",yNC,"\n") #cat("y[!delta]:",y[!delta],"\n") LL <- function(params) { theta1 <- params[1] lambda1 <- params[2] if(theta1 <= 0 | lambda1 <= 0) { return(NA) } tm1 <- (!delta)*r*-theta1*log(1 + lambda1*exp(-tan(y[r+1]))) # only present if censored tm2 <- (delta)*(n-r)*log(theta1*lambda1) # only present if not censored tm3 <- (delta)*(theta1 + 1)*log(1 + (lambda1)*(exp(-tan(y)))) # only present if not censored loglike <- tm1 + tm2 - tm3 return(loglike) } est <- maxLik(LL, start=c(theta1=theta1, lambda1=lambda1)) theta<-c(theta,est$estimate[1]) lambda<-c(lambda,est$estimate[2]) i <- i + 1 } # for nsim # first r obs are censored #cat("First", r, "obs out of", n, "are censored\n") print(summary(est)) print(coef(est)) print(vcov(est)) means<-c(mean(theta),mean(lambda)) vars<-c(var(theta),var(lambda)) vies<-means-c(theta1,lambda1) eqms<-(vars+vies)^2 result<-new.env() result$means<-round(means,3) #result$vars<-round(vars,3) result$vies<-round(vies,3) result$eqms<-round(eqms,5) result$lambda<-lambda result$theta<-theta print(result$means)#Average Relative Estimate print(result$vies) #Bias print(result$eqms) #MSE } #end of cg set.seed(123); cg(150,1000,0.1,0.5,2.0) |
RE: Simulation Study for Left censored data [ Reply ] By: Ott Toomet on 2021-04-27 05:02 | [forum:48932] |
Shoot, you are probably right. For some reason I only thought about left-censoring, coded the data in this way, and obviously the right-censoring formula did not work. Hope you can fix it! Ott |
RE: Simulation Study for Left censored data [ Reply ] By: Clara Gru on 2021-04-27 04:08 | [forum:48931] 709724_FULLTEXT01.pdf (97) downloads |
Hello again, sir Ott! For the first problem, I think you're referring to the right censoring likelihood function formula sir. I referred to the attached paper that the first term for left censoring is ${F(y_r+1)}^r$. Also, when I check the values of y and delta, it shows the censored values higher (or equal) to the highest uncensored one (type II-right censored). It should be lower (or equal) to the lowest uncensored value for it to be left-censored. I strongly believe my wrong code greatly contributed to this error and for that, I apologize. Nevertheless, your code sir would work well on right-censored data. Thank you so much sir for taking the time to help me. |
RE: Simulation Study for Left censored data [ Reply ] By: Ott Toomet on 2021-04-25 04:12 | [forum:48923] |
Hi Clara, I think I figured it out. There were/are two major problems: * the formula (3) in the paper is misleading--the first term should not be ${F(y_r+1)}^r$ (the cdf) but ${1 - F(y_r+1)}^r$, the corresponding survival function. The following formulas are carry the same problem further. * There was some mess with summing and multiplying with r or n-r. I removed all summing and products, the loglik now returns a vector of observation-specific likelihoods. As a bonus, you can now also use BHHH. Smaller issues: * cleaned the code a lot for my taste, it took a while to understand it. * your defaults: 150 obs and only 15 non-censored observations gives enormous standard errors (but it works). Here are a few examples: > cg(n=150, nc=0.1) First 135 obs out of 150 are censored -------------------------------------------- Maximum Likelihood estimation Newton-Raphson maximisation, 7 iterations Return code 1: gradient close to zero (gradtol) Log-Likelihood: -134.3832 2 free parameters Estimates: Estimate Std. error t value Pr(> t) theta 0.7836 0.2171 3.609 0.000308 *** lambda 0.7822 0.7266 1.076 0.281718 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 -------------------------------------------- > cg(n=1500, nc=0.9) First 150 obs out of 1500 are censored -------------------------------------------- Maximum Likelihood estimation Newton-Raphson maximisation, 3 iterations Return code 8: successive function values within relative tolerance limit (reltol) Log-Likelihood: -4662.835 2 free parameters Estimates: Estimate Std. error t value Pr(> t) theta 0.54193 0.02169 24.98 <2e-16 *** lambda 1.74247 0.15157 11.50 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 -------------------------------------------- Cheers, Ott ----------- cg <- function(n=150, nc=0.5, ...) { ## n: number of obs ## nc: percentage of non-censored obs ## fxn <- function(n, theta, lambda) { ## Burr-V RNG U <- runif(n, 0, 1) y <- atan(log(lambda/(U^(-1/theta) - 1))) y } library(maxLik) ## Define parameters theta <- 0.5; lambda <- 2.0 m <- n*nc # not censored obs ## Generate Burr-V random sample with left censoring x <- fxn(n, theta, lambda) # orig data y <- sort(x, decreasing = TRUE) # sort x decreasing r <- n - m # first r obs are censored cat("First", r, "obs out of", n, "are censored\n") if(r > 0) delta <- y < y[r] # indicates the obs is not censored else delta <- rep(TRUE, n) yNC <- y[y < y[r]] # non-censored y values y[!delta] <- y[r] LL <- function(params) { theta <- params[1] lambda <- params[2] if(theta <= 0 | lambda <= 0) { return(NA) } tm1 <- (!delta)*theta*log(1 + lambda*exp(-tan(y[r + 1]))) tm1 <- (!delta)*log(1 - (1 + lambda*exp(-tan(y[r + 1])))^(-theta)) # only present if censored tm2 <- delta*log(theta*lambda) # only present if not censored tm3 <- delta*(theta + 1)*log(1 + lambda*exp(-tan(y))) # only present if not censored loglike <- tm1 + tm2 - tm3 return(loglike) } est <- maxLik(LL, start=c(theta=theta, lambda=lambda), ...) summary(est) } |
RE: Simulation Study for Left censored data [ Reply ] By: Clara Gru on 2021-04-22 03:03 | [forum:48920] |
Hello Ott! Thanks for replying. This is the paper's link https://www.wseas.org/multimedia/journals/mathematics/2013/025706-152.pdf. Also attached one. With regards to 'nsim', I thank you for the comment, completely overlooked that. |
RE: Simulation Study for Left censored data [ Reply ] By: Ott Toomet on 2021-04-21 15:44 | [forum:48917] |
Hi Clara, can you tell me more? In particular it would be helpful to see the paper you are replicating. Otherwise I do not understand what you are doing. One thing that puzzles me is what is your `nsim`? You replicate your random experiment nsim times, fine, but then you never store nor use those results in any way and only look at the last one? |
Simulation Study for Left censored data [ Reply ] By: Clara Gru on 2021-04-21 04:48 | [forum:48913] |
Hello! I was working on replicating the simulation results of a paper but I can seem to be able to. The codes are below, I want to be able to get average relative estimate for theta to be approx 1.0118. Thank you for the big help! n=150; theta <- 0.5; yada <- 2.0; m <- n*0.1 #Define parameters nsim=1000 for(i in 1:nsim) { # for nsim fxn <- function(n, theta, yada) { U <- runif(n, 0, 1) y <- atan(-log((yada)^(-1)*((U)^(-1/theta)-1))) return(y) } ##Generation of random samples for fxn with left censoring x <- fxn(n,theta, yada) #orig data t <- sort(x, decreasing = TRUE) #sort x decreasing r <- n-m t[(r+1):n] = t[r] delta <- ifelse (seq(1:n)<(r+1),1,0) library(maxLik) ##MLE #Simplifying vector y <- t[1:r] LL <- function(params){ theta <- params[1] yada <- params[2] if (theta<=0){ return(NA) } if (yada<=0){ return(NA) } tm1 <- r*theta*(1-delta)*log(1+(yada)*(exp(-1*tan(y[r])))) tm2 <- (n-r)*delta*log(theta*yada) tm3 <- (theta+1)*delta*sum(log(1+(yada)*(exp(-1*tan(y))))) loglike <- - tm1 + tm2 - tm3 #full fxn return(sum(loglike)) } #checkLL <- sum(loglike) #Computing for MLE est <- maxLik(LL,start=c(theta,yada)) i <- i + 1 } #for nsim summary(est) |