SCM

Forum: help

Monitor Forum | Start New Thread Start New Thread
RE: Simulation Study for Left censored data [ Reply ]
By: Ott Toomet on 2021-05-08 16:55
[forum:48942]
Sorry, could not check this for a week...

My reading from this result is that the log-likelihood is wrong. The warnings hint that it did not converge well (landed in a non-concave region) and the parameter estimates look weird too. But I am not sure what kind of data did you feed into it.

It is also possible that the log-likelihood is not wrong but has just a shape that is hard for the optimizers to figure out, but in my practice this is much rarer than having it coded wrong. You can test the results with no censoring--do you get correct results? The log-likelihood is pretty simple in that case. And then start adding a little bit censored observations, the results should be stable, although the estimates get more and more imprecise. But the estimated values should not drift somewhere else.

RE: Simulation Study for Left censored data [ Reply ]
By: Clara Gru on 2021-04-27 09:29
[forum:48934]
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 26 iterations
Return code 8: successive function values within relative tolerance limit (reltol)
Log-Likelihood: 14560.28
2 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
theta1 5.730e-02 1.278e-03 44.83 <2e-16 ***
lambda1 1.492e+05 NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
--------------------------------------------
theta1 lambda1
5.730118e-02 1.491954e+05
theta1 lambda1
theta1 1.633591e-06 -3.267183e-06
lambda1 -3.267183e-06 -5.497493e-01
[1] 0.056 221982.448
[1] -0.444 221980.448
[1] 1.974000e-01 1.649563e+21

Warning messages:
1: In sqrt(diag(vc)) : NaNs produced
2: In sqrt(diag(vc)) : NaNs produced

RE: Simulation Study for Left censored data [ Reply ]
By: Clara Gru on 2021-04-27 09:27
[forum:48933]
Already fixed the code sir Ott for the left censoring part. However, I'm still having a hard time translating the loglikelihood function. I get an error seen below. What can you suggest I should do? Thank you.

----------------------
cg <- function(n, nsim, cens, theta1, lambda1)
{ #start of cg
## n: number of obs
## cens: percentage of censored obs
##

theta<-c();lambda<-c()
for(i in 1:nsim) { # for nsim
fxn <- function(n, theta1, lambda1) {
## Burr-V RNG
U <- runif(n, 0, 1)
y <- atan(log(lambda1/(U^(-1/theta1) - 1)))
y
}
library(maxLik)

## Generate Burr-V random sample with left censoring
x <- fxn(n, theta1, lambda1) # orig data
y <- sort(x, decreasing = TRUE)
r <- n-(n*cens)

# Creating Event and Censoring Indicators
# TRUE is event, FALSE is censored
if(r > 0)
delta <- y > y[r]
# indicates the obs is not censored
else
delta <- rep(TRUE, n)
yNC <- y[y > y[r]]
# non-censored y values
y[!delta] <- y[r]

#cat("x:",x,"\n") #CHECKING VALUES
#cat("y:",y,"\n")
#cat("y[r]:",y[r],"\n")
#cat("delta:",delta,"\n")
#cat("yNC:",yNC,"\n")
#cat("y[!delta]:",y[!delta],"\n")

LL <- function(params) {
theta1 <- params[1]
lambda1 <- params[2]
if(theta1 <= 0 | lambda1 <= 0) {
return(NA)
}
tm1 <- (!delta)*r*-theta1*log(1 + lambda1*exp(-tan(y[r+1])))
# only present if censored
tm2 <- (delta)*(n-r)*log(theta1*lambda1)
# only present if not censored
tm3 <- (delta)*(theta1 + 1)*log(1 + (lambda1)*(exp(-tan(y))))
# only present if not censored
loglike <- tm1 + tm2 - tm3

return(loglike)
}
est <- maxLik(LL,
start=c(theta1=theta1, lambda1=lambda1))
theta<-c(theta,est$estimate[1])
lambda<-c(lambda,est$estimate[2])

i <- i + 1

} # for nsim

# first r obs are censored
#cat("First", r, "obs out of", n, "are censored\n")
print(summary(est))
print(coef(est))
print(vcov(est))

means<-c(mean(theta),mean(lambda))
vars<-c(var(theta),var(lambda))
vies<-means-c(theta1,lambda1)
eqms<-(vars+vies)^2
result<-new.env()
result$means<-round(means,3)

#result$vars<-round(vars,3)
result$vies<-round(vies,3)
result$eqms<-round(eqms,5)
result$lambda<-lambda
result$theta<-theta


print(result$means)#Average Relative Estimate
print(result$vies) #Bias
print(result$eqms) #MSE

} #end of cg

set.seed(123); cg(150,1000,0.1,0.5,2.0)

RE: Simulation Study for Left censored data [ Reply ]
By: Ott Toomet on 2021-04-27 05:02
[forum:48932]
Shoot, you are probably right. For some reason I only thought about left-censoring, coded the data in this way, and obviously the right-censoring formula did not work.

Hope you can fix it!
Ott

RE: Simulation Study for Left censored data [ Reply ]
By: Clara Gru on 2021-04-27 04:08
[forum:48931]

709724_FULLTEXT01.pdf (97) downloads
Hello again, sir Ott!

For the first problem, I think you're referring to the right censoring likelihood function formula sir. I referred to the attached paper that the first term for left censoring is ${F(y_r+1)}^r$.

Also, when I check the values of y and delta, it shows the censored values higher (or equal) to the highest uncensored one (type II-right censored). It should be lower (or equal) to the lowest uncensored value for it to be left-censored. I strongly believe my wrong code greatly contributed to this error and for that, I apologize.

Nevertheless, your code sir would work well on right-censored data. Thank you so much sir for taking the time to help me.

RE: Simulation Study for Left censored data [ Reply ]
By: Ott Toomet on 2021-04-25 04:12
[forum:48923]
Hi Clara,
I think I figured it out. There were/are two major problems:

* the formula (3) in the paper is misleading--the first term should not be ${F(y_r+1)}^r$ (the cdf) but ${1 - F(y_r+1)}^r$, the corresponding survival function. The following formulas are carry the same problem further.
* There was some mess with summing and multiplying with r or n-r. I removed all summing and products, the loglik now returns a vector of observation-specific likelihoods. As a bonus, you can now also use BHHH.

Smaller issues:
* cleaned the code a lot for my taste, it took a while to understand it.
* your defaults: 150 obs and only 15 non-censored observations gives enormous standard errors (but it works).

Here are a few examples:

> cg(n=150, nc=0.1)
First 135 obs out of 150 are censored
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 7 iterations
Return code 1: gradient close to zero (gradtol)
Log-Likelihood: -134.3832
2 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
theta 0.7836 0.2171 3.609 0.000308 ***
lambda 0.7822 0.7266 1.076 0.281718
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--------------------------------------------

> cg(n=1500, nc=0.9)
First 150 obs out of 1500 are censored
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 3 iterations
Return code 8: successive function values within relative tolerance limit (reltol)
Log-Likelihood: -4662.835
2 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
theta 0.54193 0.02169 24.98 <2e-16 ***
lambda 1.74247 0.15157 11.50 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
--------------------------------------------



Cheers,
Ott
-----------

cg <- function(n=150, nc=0.5, ...) {
## n: number of obs
## nc: percentage of non-censored obs
##
fxn <- function(n, theta, lambda) {
## Burr-V RNG
U <- runif(n, 0, 1)
y <- atan(log(lambda/(U^(-1/theta) - 1)))
y
}
library(maxLik)
## Define parameters
theta <- 0.5; lambda <- 2.0
m <- n*nc
# not censored obs
## Generate Burr-V random sample with left censoring
x <- fxn(n, theta, lambda) # orig data
y <- sort(x, decreasing = TRUE) # sort x decreasing
r <- n - m
# first r obs are censored
cat("First", r, "obs out of", n, "are censored\n")
if(r > 0)
delta <- y < y[r]
# indicates the obs is not censored
else
delta <- rep(TRUE, n)
yNC <- y[y < y[r]]
# non-censored y values
y[!delta] <- y[r]
LL <- function(params) {
theta <- params[1]
lambda <- params[2]
if(theta <= 0 | lambda <= 0) {
return(NA)
}
tm1 <- (!delta)*theta*log(1 + lambda*exp(-tan(y[r + 1])))
tm1 <- (!delta)*log(1 - (1 + lambda*exp(-tan(y[r + 1])))^(-theta))
# only present if censored
tm2 <- delta*log(theta*lambda)
# only present if not censored
tm3 <- delta*(theta + 1)*log(1 + lambda*exp(-tan(y)))
# only present if not censored
loglike <- tm1 + tm2 - tm3

return(loglike)
}
est <- maxLik(LL,
start=c(theta=theta, lambda=lambda),
...)
summary(est)
}

RE: Simulation Study for Left censored data [ Reply ]
By: Clara Gru on 2021-04-22 03:03
[forum:48920]
Hello Ott! Thanks for replying. This is the paper's link https://www.wseas.org/multimedia/journals/mathematics/2013/025706-152.pdf. Also attached one.

With regards to 'nsim', I thank you for the comment, completely overlooked that.

RE: Simulation Study for Left censored data [ Reply ]
By: Ott Toomet on 2021-04-21 15:44
[forum:48917]
Hi Clara,
can you tell me more? In particular it would be helpful to see the paper you are replicating. Otherwise I do not understand what you are doing.

One thing that puzzles me is what is your `nsim`? You replicate your random experiment nsim times, fine, but then you never store nor use those results in any way and only look at the last one?

Simulation Study for Left censored data [ Reply ]
By: Clara Gru on 2021-04-21 04:48
[forum:48913]
Hello! I was working on replicating the simulation results of a paper but I can seem to be able to. The codes are below, I want to be able to get average relative estimate for theta to be approx 1.0118. Thank you for the big help!

n=150; theta <- 0.5; yada <- 2.0; m <- n*0.1 #Define parameters
nsim=1000

for(i in 1:nsim) { # for nsim

fxn <- function(n, theta, yada) {
U <- runif(n, 0, 1)
y <- atan(-log((yada)^(-1)*((U)^(-1/theta)-1)))
return(y)
}

##Generation of random samples for fxn with left censoring
x <- fxn(n,theta, yada) #orig data
t <- sort(x, decreasing = TRUE) #sort x decreasing
r <- n-m
t[(r+1):n] = t[r]
delta <- ifelse (seq(1:n)<(r+1),1,0)


library(maxLik)
##MLE

#Simplifying vector
y <- t[1:r]
LL <- function(params){
theta <- params[1]
yada <- params[2]

if (theta<=0){
return(NA)
}
if (yada<=0){
return(NA)
}

tm1 <- r*theta*(1-delta)*log(1+(yada)*(exp(-1*tan(y[r]))))
tm2 <- (n-r)*delta*log(theta*yada)
tm3 <- (theta+1)*delta*sum(log(1+(yada)*(exp(-1*tan(y)))))
loglike <- - tm1 + tm2 - tm3 #full fxn
return(sum(loglike))
}
#checkLL <- sum(loglike)

#Computing for MLE
est <- maxLik(LL,start=c(theta,yada))

i <- i + 1
} #for nsim

summary(est)

Thanks to:
Vienna University of Economics and Business Powered By FusionForge