Forum: open-discussion

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-16 11:47

[forum:42962]

The following code illustrates my suggestion for the logLik() method:

maxLik2 <- function( logLik, ...) {
  result <- maxLik::maxLik(logLik, ...)
  result$logLikFunc <- logLik
  return( result )
}

testfn <- function( n ) {
  x <- rnorm( n )
  loglik <- function( par) {
    sum( dnorm(x, mean=par[1], sd = par[2], log=TRUE))
  }
  maxLik2(loglik, start=c(0,1))
}

m <- testfn( 200 )
m

logLik2 <- function( object, newParam = NULL, ... ) {
  if( is.null( newParam ) ) {
    return( logLik( object ) )
  } else {
    return( object$logLikFunc( newParam ) )
  }
}

logLik2( m, newParam = c( 1, 2 ) )

muVal <- seq( -1, 1, 0.1 )
sdVal <- seq( 0.5, 3, 0.1 )
llVal <- matrix( NA, nrow = length( muVal ), ncol = length( sdVal ),
                 dimnames = list( muVal, sdVal ) )

for( i in seq_along( muVal ) ) {
  for( j in seq_along( sdVal ) ) {
    llVal[i,j] <- logLik2( m, newParam = c( muVal[i], sdVal[j] ) )
  }
}
persp( x = muVal, y = sdVal, z = llVal, theta = 30 )

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-16 11:13

[forum:42961]

R 'environments' are still sometimes a mystery for me. It seems to me that R stores the environment including the values of the objects when storing a function. Hence, it seems that it is sufficient if maxLik() stores the log-likelihood function and we do not need to include additional code for storing the environment:

maxLik2 <- function( logLik, ...) {
  result <- maxLik::maxLik(logLik, ...)
  result$logLikFunc <- logLik
  return( result )
}

testfn <- function( n, k ) {
  x <- rnorm( n )
  loglik <- function(mu) sum(dnorm(x * k, mean=mu, log=TRUE))
  maxLik2(loglik, start=0)
}

kk <- 4
m <- testfn( 200, kk )
m
m$logLikFunc(coef(m))
m$logLikFunc(coef(m)+1)

ls(environment(m$logLikFunc))
environment(m$logLikFunc)$x

However, this seems to be too simple to be true. Am I missing anything?

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-16 11:04

[forum:42960]

You are right that many projects on R-Forge may not be active. I am not sure how R-Forge determines the 'activity' of projects; in my experience, forum posts have a large impact on the activity level, while commits to the SVN repository have only a minor impact. And we had a lot of forum posts...

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-16 10:20

[forum:42959]

5) I still think that most users expect logLik() to return a log-likelihood value and that we should not break with this 'tradition'. I also think that most users are not really interested in obtaining the log-likelihood function itself; rather, the function is needed for calculating log-likelihood values at specific (non-optimal) parameter values. Hence, I suggest implementing something like logLik( object, newParam = NULL, ... ), where the argument 'newParam' (or 'newCoef', 'coef', 'param', 'par', ... ?) specifies the parameter values for which the log-likelihood value should be calculated. Of course, the stored log-likelihood function is used internally to calculate the log-likelihood value for the specified parameter values. I think that most users (and the developers of plotting functions) would find this user interface very useful. The few users who are actually interested in obtaining the log-likelihood function can use the 'getter' function maxLikFunc() or similar, but we should predominantly 'advertise' the use of logLik( object, newParam = NULL, ... ). What do you think?

6) Branches are in the subfolder 'branches'. We currently have only one branch of the maxLik package in this subfolder (see, e.g. [1]). The use of Subversion (including branching and merging, see Chapter 4) is extensively described in the freely available book "Version Control with Subversion." [2]

[1] https://r-forge.r-project.org/scm/viewvc.php/?root=maxlik

[2] http://svnbook.red-bean.com/

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-15 13:25

[forum:42953]

We should check this, but I think that environments are not that large because the data is not actually stored in the environment. Rather, the environment contains pointers to where the data are actually stored. If I am correct, the size of the environment itself is simply enough space to store the names and the pointers for each object "in the environment".

As with other variables in R, R keeps track of whether multiple environments have pointers to the same location and makes copies only when one of them changes. So the contents in separate environments will act as if all the data were contained in each one separately, even though that might not actually be the case if several environments have "copies" of the same data but never change their copies.
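A small sketch of the behaviour described above, using only base R. The names `big`, `e1` and `e2` are made up for the illustration, and the example only shows the observable semantics (each environment behaves as if it held its own copy); it does not measure memory use.

```r
# Two environments bound to the same vector (hypothetical names).
big <- rnorm(1e5)
e1 <- new.env()
e2 <- new.env()
assign("x", big, envir = e1)
assign("x", big, envir = e2)

# Changing the value seen through e1 leaves e2 and the original untouched.
e1$x[1] <- 0
identical(e2$x, big)   # e2 still sees the original values
identical(e1$x, big)   # e1 now sees the modified values
```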

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-15 13:16

[forum:42952]

Regarding usage. Several R-forge projects belong to me and haven't been touched in years, having all been moved to github. I wonder how many of the projects on R-forge are in that category. I see that this project was the most active project in R-forge last week. I'm not sure that is a good sign for R-forge either.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-15 13:13

[forum:42951]

4) Indeed. I misremembered. Since I've never named that argument, it didn't really matter for what I had been doing.

5) I don't think this would be confusing. If you don't use the new argument, the default will give you exactly what you have always gotten. If you use the argument, then presumably you know why you are doing so. This is similar to the value argument in functions like grep (there it is a logical) which changes the format of the object returned. Of course, one could prefer another solution for other reasons, but I don't see confusion as one of them.

6) I made my first commit. I'm not sure that constitutes relearning svn, but it seems to have worked. I don't see a branch interface for svn in RStudio. I'm not sure if that is because it is not supported, because I don't know where to look, or because you aren't using any branches. Do you have a document outlining your svn usage practices?
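For reference on point 5, this is the grep() behaviour being compared to: the logical `value` argument switches the return value from indices to the matching strings. The vector `fruits` is made up for the example.

```r
fruits <- c("apple", "banana", "cherry")

grep("an", fruits)                # 2 (index of the match)
grep("an", fruits, value = TRUE)  # "banana" (the matching element itself)
```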

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Ott Toomet on 2016-02-15 03:30

[forum:42948]

Good thoughts, Arne. So you are kind of suggesting that we should only create tools for including the data, but not include it ourselves. I have to admit I don't really understand. Perhaps I will attempt to implement it, and then decide what the best approach is. In any case, we have to implement the infrastructure that developers can use if they wish.

BTW, we use underscore: maxControl parameters use that to separate the name parts.

Randall--I made you a developer. Feel free to use your new rights ;-) I try to do something during the coming week myself as well.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-14 09:05

[forum:42945]

A few further thoughts on this interesting discussion:

As some environments / objects passed to maxLik() may be very large ('big data'), it may be problematic to store the environment / objects in the object returned by maxLik() by default. Should the environment / objects be stored only if the user sets an argument, say 'storeEnv', to TRUE? Or should the default behaviour depend on the size of the environment / object?

I think that we should consider 3 different uses of maxLik:

a) A user directly uses maxLik() (in the current environment): the environment / objects do not need to be stored, right?

b) A user uses maxLik() within a function: I think that storing the environment / objects should be the responsibility of the user.

c) maxLik() is used (in a function) within an R package: I think that storing the environment / objects should be the responsibility of the package developer.

Hence, we do not need to care about storing the environment / objects. We can give instructions / examples in the documentation how to do this. What's your opinion?

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-14 08:46

[forum:42944]

4/5) I agree that the names of the functions should be on the one hand not too long and on the other hand 'descriptive.' In the maxLik package (and in Ott's and my other packages), we already extensively use (lower) camelCase (e.g. maxLik, logLik, ...) and never use underscore. Therefore, it would be consistent to continue using (lower) camelCase.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-14 08:38

[forum:42943]

Indeed, thanks for starting this interesting discussion!

4) Please note that the first argument of maxLik() is 'logLik' (not 'loglik').

5) I am afraid that 'logLik(ml, value = "function")' could confuse many users, because (I think) logLik() currently always returns the log-likelihood value. I suggest something like logLikFunction(), logLikFunc(), or logLikFn().

6) I think you only have to re-learn SVN.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-14 08:29

[forum:42942]

1) Sounds good. Therefore, we should provide 'getter' functions and encourage users to use them instead of accessing maxLik objects as if they were normal lists.

2 & 3) I agree. If maxLik switches to S4/R6, packages that are based on S3 and use maxLik may break.

4) I agree.

5d) I guess that 'install.packages(..., repos="http://R-Forge.R-project.org")' is the equivalent of install_github(). It can install the (development) version from R-Forge even if the user does not have any development tools installed.

PS) I haven't tried yet to respond by e-mail.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-14 08:17

[forum:42941]

R-Forge is part of the 'R Project' and hosted at the Vienna University of Economics and Business. R-Forge hosts almost 2,000 projects (thus, several thousand R packages) and has more than 10,000 users. Hence, it seems to be quite popular among R users. Anyway, I agree that there were not many changes (improvements) in the past 5 years so that the advantages of github are likely increasing over time.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-14 03:49

[forum:42939]

My initial reaction is that this is a good compromise. The extra arguments get automatically folded into the environment, so for simple uses, things should be pretty much automatic.

Users who are nesting multiple environments and using that to feed data to the log-likelihood function are on their own to make sure that the same data is added to the environment of the function.

Another way to implement that would be to give your loglikfunction() extractor an argument (default value of new.env()) that contains any additional data. The main advantage of that is that it ensures documentation of this use case and makes this idea more visible. (People are more likely to notice a function argument than to read the fine print in the details section of the documentation.)
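A rough sketch of what such an extractor could look like, under stated assumptions: `loglikfunction()` and the `loglik` list component are names used in this thread, not an existing maxLik API, and `fit` is a plain list standing in for a fitted object. The extra data are copied into a fresh environment whose parent is the function's original environment, so the environment the user passes in is not modified.

```r
# Hypothetical extractor (not part of maxLik).
loglikfunction <- function(object, env = new.env()) {
  fn <- object$loglik
  # Fresh environment on top of the function's original environment,
  # so anything the function could see before is still reachable.
  env2 <- new.env(parent = environment(fn))
  for (nm in ls(env, all.names = TRUE)) {
    assign(nm, get(nm, envir = env), envir = env2)
  }
  environment(fn) <- env2
  fn
}

# Mock fitted object: the stored function refers to 'x', which it cannot see.
fit <- list(loglik = function(mu) sum(dnorm(x, mean = mu, log = TRUE)))

# The user supplies the missing data through the 'env' argument.
extra <- new.env()
extra$x <- c(-1, 0, 1)
ll <- loglikfunction(fit, env = extra)
ll(0)
```

With the default (empty) environment the function is returned with the same visible bindings as before, so simple uses need no extra argument.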

Still trying to come up with a good name for this. loglikfunction() is pretty long, but at least it is relatively clear.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Ott Toomet on 2016-02-14 02:59

[forum:42938]

I thought about it and I suggest the following:
a) implement the function with the environment in the way you initially suggested.
b) implement a getter for the environment in such a way that the user can add more data there if needed.
So my code would look something like this:

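testfn <- function() {
  x <- rnorm(100)
  loglik <- function(mu) dnorm(x, mean=mu, log=TRUE)
  a <- maxLik(loglik, start=0)
  # loglikfunction() is the proposed getter, not an existing function
  assign("x", x, environment(loglikfunction(a)))
  a  # return the fitted object rather than the value of assign()
}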
m <- testfn()

I think we will go too far by automatically taking everything from the parent environment. I am also not sure how it would work, as there are several layers of wrappers around the actual function. But now the user can decide, and add only the necessary variables. Or pass those as arguments ;-)

Thoughts?
Ott

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-13 14:05

[forum:42933]

Regarding my last draft solution, it is an interesting question whether or not one should clone env. Cloning freezes the values in the environment to what they were when maxLik() was called, but does not do so (the way I have done this) for any of the ancestors of that environment. So it would still be possible to concoct examples where loglik() behaves differently later than it does at the moment maxLik() was executed.

If we don't clone, then any changes (like reusing one's favorite variable names in the global environment, from which one may have called maxLik() interactively) will change the values produced by ml$loglik().

This might need some careful documentation -- or perhaps a logical argument determining whether or not cloning happens.
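To make the trade-off concrete, here is a small base-R sketch of the difference between keeping the original environment and cloning it. The cloning lines mirror the approach in the draft maxLik2(); the names `env`, `k`, `f_live` and `f_frozen` are made up for the illustration.

```r
# A function whose result depends on 'k' in its enclosing environment.
env <- new.env()
env$k <- 2
f <- function(x) k * x
environment(f) <- env

f_live <- f      # keeps pointing at the original environment

# Shallow clone of env (the ancestors of env are shared, not cloned).
env2 <- as.environment(as.list(env, all.names = TRUE))
parent.env(env2) <- parent.env(env)
f_frozen <- f
environment(f_frozen) <- env2

env$k <- 10      # later change in the original environment

f_live(1)        # 10: follows the change
f_frozen(1)      # 2: still uses the value from when the clone was made
```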

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-13 06:46

[forum:42932]

I think something like the following might do the trick

maxLik2 <- function(loglik, ..., env = parent.frame()) {
  orig.call <- match.call()
  fn <- loglik
  dots <- list(...)
  # clone of env
  env2 <- as.environment(as.list(env, all.names=TRUE))
  parent.env(env2) <- parent.env(env)

  result <- maxLik::maxLik(loglik, ...)
  for (n in intersect(names(dots), names(formals(fn)))) {
    formals(fn)[[n]] <- NULL
    assign(n, dots[[n]], env2)
  }
  environment(fn) <- env2
  result$loglik <- fn
  class(result) <- c("maxLik2", class(result))
  result
}

testfn <- function() {
  x <- rnorm(100)
  loglik <- function(mu) sum(dnorm(x, mean=mu, log=TRUE))
  maxLik2(loglik, start=0)
}
m <- testfn()
m$loglik(coef(m))
## [1] -132.6203
logLik(m)
## [1] -132.6203

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-13 04:35

[forum:42931]

1) I'll have to think more about this way of doing things and how it might work to extract the log-likelihood function in such a case. I can see where you might want to avoid pre-processing the data each time loglik() is called by the optimizer. But I think it is cleaner to still pass that to loglik -- perhaps as an environment or a list if there are lots of things of different types.

model <- function(formula, data, ...) {
  aux <- {
    # compute X, Y, and various auxiliary information for loglik
  }
  loglik <- function(param, aux) {
    # ...
  }

  result <- maxLik(loglik)
}

Would that not work? (I haven't looked at your actual code.) I wonder how it would compare in terms of efficiency.

4/5) If stats::logLik() already defines the API, that is a good reason to stick with what they do (and perhaps to create an object of the same class and set the df attribute). But that might not preclude having an argument that causes logLik() to return the function rather than the maximum value of the function.

My only design principle regarding the name was that if loglik was a good name for the function going in, then it must be a good name for the function coming out. For very vanilla arguments (like x, object, fn) it makes sense to have a more specific name for the getter. fn() does not seem to me like a good name for a getter. For the underlying optimizers, perhaps objective()? or objectiveFn()? [But if you can avoid a two-word name, you don't have to decide between camelCase, under_score, etc., for which there is little rhyme or reason in the R code base.]

6) username = rpruim

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Ott Toomet on 2016-02-13 02:42

[forum:42929]

1) Sure, just adding this to the function does not break anything. I
still prefer to do it in a good way so as not to break anything later...

| My first reaction to your example is that if we can’t support that,
| I’m not so disappointed, since I think (a) it is a bad way to code,
|
| 1) I suppose one could have an optional enclosing environment
| argument. Have to think about this more.

Maybe that is a bad way to code but I have done several of my other
projects (in particular sampleSelection) in this way. Broadly, the
structure looks like:
model <- function(formula, data, ...) {
  loglik <- function(param) {
    # ...
  }
  # compute, X,Y, and various auxiliary information for loglik,
  # for instance, where in parameter vector we have certain components.
  result <- maxLik(loglik)
}
I find it a convenient way to pass a substantial amount of information
over to the loglik. Agree that it is not terribly important to have
this information stored, but it might be nice if one could also do
loglik plots of sampleSelection and other similar models.

| 4) I disagree. The name should be loglik because that’s what you call
| the argument to the maxLik() function. Of course, you could give it
| another name in the underlying maximizers and rename it inside
| maxLik() — I would agree that this is a good idea.

| 5) I agree regarding having methods to extract this information (like
| coef()). logLik() is already in use, but it could be reimplemented
| with an argument so that it returns either a value (as now) or the
| function (new behavior). I think I like that idea rather than
| inventing a new name.
|
| logLik(ml) # a number
| logLik(ml, value = "function") # or maybe logLik(ml, type = 2) or some other way of distinguishing between a value and a function.

logLik is a function in 'stats'. The documentation states that:

Returns an object of class ‘logLik’. This is a number with at
least one attribute, ‘"df"’ (*d*egrees of *f*reedom), giving the
number of (estimated) parameters in the model.

(now I don't think we add 'df' but that is a different story...)
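For comparison, this is how the 'logLik' class and its 'df' attribute look for a model from 'stats', together with a hand-built object of the same class. The second part is only a sketch: the value -132.6 and df = 2 are made-up numbers, not output from maxLik.

```r
# logLik() on an lm fit returns a classed number with a 'df' attribute.
fit <- lm(dist ~ speed, data = cars)
ll <- logLik(fit)
class(ll)        # "logLik"
attr(ll, "df")   # 3: intercept, slope, and residual variance

# A hand-built object of the same class (illustrative values only).
ll2 <- structure(-132.6, df = 2L, class = "logLik")
AIC(ll2)         # uses 'df': -2 * (-132.6) + 2 * 2 = 269.2
```

Setting 'df' is what lets generic tools such as AIC() work on the returned value.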

Do you always prefer to have getters with the same name as the
corresponding arguments? I don't object, but this sounds like a more
general design guideline and we should look at all the
names/parameters from that viewpoint. For instance, maxNR and other
optimizers have function argument called 'fn'.

| 6) Regarding contributing… I guess it doesn’t hurt to make me a
| contributor, but I’ve not really thought carefully about how I’d like
| to and be able to get involved given my other projects. I’m more
| likely to get involved if the project moves to github because (a) it
| is familiar to me, and (b) it is an easy matter to propose a small
| change in a pull request. But I can certainly re-learn to use R-forge
| (I used it years ago, but have never missed it since leaving it), and
| the svn integration in RStudio means that most of my workflow wouldn’t
| need to change once I get all the connections set up.

Actually just such discussions are of big help :-) If you tell me
your username I will also make you a developer.

about github/r-forge: my impression is that r-forge is in the exact
same spot where it was 5 years ago. And given github's popularity,
there is probably little incentive to make it more powerful.

For some reason the forum is moderated despite being stated in the conf page
that it isn't.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-12 17:39

[forum:42928]

My first reaction to your example is that if we can’t support that, I’m not so disappointed, since I think (a) it is a bad way to code, and (b) we can document why this doesn’t work. Of course, if we can make it work, that would be fine, too.

I think your example would be better coded as

testfn <- function() {
  loglik <- function(mu, x) dnorm(x, mean=mu, log=TRUE)
  a <- maxLik(loglik, start=0, x = rnorm(100))
}
m <- testfn()

Or better still as

testfn <- function(x) {
  loglik <- function(mu, x) dnorm(x, mean=mu, log=TRUE)
  a <- maxLik(loglik, start=0, x = x)
}
m <- testfn(rnorm(100))

1) I suppose one could have an optional enclosing environment argument. Have to think about this more.

2) I’m fine with the objective function being also stored in the various optimizers. Seems like a good idea.

3) I’m not exactly sure what you mean, but at a minimum there should be a unit test something like this

example_x <- rnorm(100)
loglik <- function(mu, x) dnorm(x, mean=mu, log=TRUE)
ml <- maxLik( loglik, start = 0, x = example_x)
testthat::expect_equivalent( loglik(1, example_x), ml$loglik(1)) # could test multiple values where I have 1

testing 4 or 5 random values is probably sufficient to convince yourself that the functions are the same.

4) I disagree. The name should be loglik because that’s what you call the argument to the maxLik() function. Of course, you could give it another name in the underlying maximizers and rename it inside maxLik() — I would agree that this is a good idea.

5) I agree regarding having methods to extract this information (like coef()). logLik() is already in use, but it could be reimplemented with an argument so that it returns either a value (as now) or the function (new behavior). I think I like that idea rather than inventing a new name.

logLik(ml) # a number
logLik(ml, value = "function") # or maybe logLik(ml, type = 2) or some other way of distinguishing between a value and a function.

6) Regarding contributing… I guess it doesn’t hurt to make me a contributor, but I’ve not really thought carefully about how I’d like to and be able to get involved given my other projects. I’m more likely to get involved if the project moves to github because (a) it is familiar to me, and (b) it is an easy matter to propose a small change in a pull request. But I can certainly re-learn to use R-forge (I used it years ago, but have never missed it since leaving it), and the svn integration in RStudio means that most of my workflow wouldn’t need to change once I get all the connections set up.

I’m glad my original post has started an interesting conversation.

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-12 17:36

[forum:42927]

1) Adding additional information to the maxLik object should not break any existing code since the existing code won’t use that information. Even changing the structure of the underlying object wouldn’t break code that only uses the provided methods to access information — as long as the public API doesn’t change — but this cannot be enforced. If users access maxLik objects as if they were normal lists, then things will break if you change the underlying structure.

2 & 3) These are always tough decisions. Advice I received a while back seems good: If something is broke, better to just bite the bullet and fix it sooner rather than later since the pressure to do so will likely just build and the cost of the change will only increase as well. But if the current things are not broken or deficient, then by all means don’t break the API for existing code. What I’m proposing is additive and should not really matter for issues of backward compatibility unless we run into something in the implementation that would not work without an API change.

I think a switch to R6 could likely also be done without breaking the API — although you might find good reasons to want to change the API as that change happens. But I don’t have a comprehensive understanding of the API currently, so I might not be considering some important things.

4) I’d be fine with this, especially if you feel there is a large user base that won’t use it. On the other hand, importing ggplot2 or lattice or base graphics isn’t so bad since lattice and base graphics ship with R and all three are very well supported. If it moves beyond these to things like gridExtra or latticeExtra or various other extra packages, then I would lean toward the separation.

Another advantage of separation is that the checking and submission process is separated, and you only have to worry about the part that is changing.

5) I’ll leave it to the current developers to hash out R-forge vs github. I’ll just say that having moved from cvs to svn to git and from R-forge to github, I have never once pined for R-forge and the move to github was the best development decision I ever made. But it has been a long time since I used R-forge, and perhaps R-forge has gotten better over the years.

Regarding 5d) I don’t find R CMD INSTALL to be comparable to install_github(). The latter can be used by any user (even on an RStudio server). The former requires the user to go to the command line, a place many users don’t even know exists. Similarly, downloading a tar.gz and installing from that is not nearly as clean as using install_github().

—rjp

PS. I tried to post this by replying to the email I received, but that didn't work. Is this a known issue with R-forge? My email bounced back with "maxlik-open-discussion wasn't found at r-forge.wu-wien.ac.at."

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Ott Toomet on 2016-02-12 03:09

[forum:42926]

Hey Randall,
played a bit with your code to think about the issues... It seems to work for what it is designed to do, but not in this case:

testfn <- function() {
  x <- rnorm(100)
  loglik <- function(mu) dnorm(x, mean=mu, log=TRUE)
  a <- maxLik(loglik, start=0)
}
m <- testfn()

Now you can access m$loglik afterwards, but that one has no access to 'x' and hence it won't work. I don't really see any other solution than manually assigning 'x' to the environment of 'a' in 'testfn', but I have to test/think more about it. This is sort of outside maxLik's scope but probably corresponds to quite a lot of its use (in particular, its use inside other packages).

1) It would be nice to have such an option, so that the author of testfn can assign 'x' to the 'loglik' environment.

2) What would be the optimal place to assign the objective function? I would like to have it available for all the optimizers (not just maxLik), hence it should be done somewhere inside the 'max*' routines.

3) What sort of tests should be included?

4) I would not call it 'loglik' as the function may be something else than log-likelihood.

5) I would not document its name (of the list component) but just the getter function. What is a good name? 'maxFunction'? I would use 'maxValue' for the corresponding value at the estimated maximum, so 'maxFunction' fits well. 'maxLik' may also add duplicate names to the getter, such as 'loglikFunction'.
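A minimal sketch of what point 5 could look like as S3 getters. Everything here is hypothetical: 'maxFunction' and 'maxValue' are the names floated above, while the class "mockMax" and the list components 'objectiveFn', 'maximum' and 'estimate' are invented for the example and say nothing about how maxLik objects are actually laid out.

```r
# Hypothetical getters: users call these instead of reading list components.
maxFunction <- function(object, ...) UseMethod("maxFunction")
maxValue    <- function(object, ...) UseMethod("maxValue")

maxFunction.mockMax <- function(object, ...) object$objectiveFn
maxValue.mockMax    <- function(object, ...) object$maximum

# Mock result standing in for what an optimizer would return.
f <- function(theta) -(theta - 2)^2
res <- structure(list(objectiveFn = f, maximum = f(2), estimate = 2),
                 class = "mockMax")

maxValue(res)                    # 0
maxFunction(res)(res$estimate)   # 0
```

Because only the getters are documented, the component names could change later without breaking code that sticks to the getters.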

Anything else? Comments?

Randall, I can give you developer access if you wish :-)

Best, Ott

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Ott Toomet on 2016-02-12 01:53

[forum:42925]

I agree that r-forge works well. However, github is far more popular. Using r-forge, which is obscure for (most?) potential maxLik users, requires separate registration, and has an outdated look, may be an obstacle to using the package. But I guess those who can contribute code can also sort out the svn vs git stuff. I haven't used github much and I suspect there are many more features there that are not present in r-forge, and the gap is probably widening at quite a substantial pace. Whether all that matters is an empirical question.

Who is hosting r-forge? I think if we stay with svn we should also think about some sort of backups for the database dumps.

Otherwise, I feel a bit uneasy about the fact that all collaboration seems to be converging to github. It just feels weird if one firm controls most of the codebase on this planet...

But anyway, the main obstacle is to find some time to think and design good code :-)

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Arne Henningsen on 2016-02-11 21:16

[forum:42919]

Dear Randall and Ott

1) contribution of code
I would be happy if Randall contributes to the maxLik package :-)
If the changes that Randall proposes would *not* break backward compatibility of the maxLik function / packages, I think that it is no problem to implement them before March.

2) backward compatibility
I totally agree with Ott that we should keep backward compatibility as much as possible, because many people have implemented ML estimations with the maxLik package, e.g. in public R packages (e.g. on CRAN), private R packages, and other private or publicly available code. If the returned object includes further objects (e.g. the log-likelihood function, arguments, ...), the maxLik function / package should remain backward-compatible, right? Which changes do you suggest that would make maxLik not backward-compatible?

3) maxLik2
I think that we should avoid creating a new package unless this is absolutely necessary, because (given the popularity of the current maxLik package) we would need to maintain two packages (maxLik and maxLik2) for quite some years. However, if breaking backward compatibility has major advantages (e.g. switching to S4 or R6 classes?), creating a new package may anyway be the best solution.

4) functions for plotting
As functions for plotting may depend on several packages, I agree with Ott that these functions should be in a new package (perhaps "maxLikPlot" is more informative than "maxPlot") in order to keep maxLik's dependencies as limited as possible.

5) github vs. R-Forge / subversion
I think that R-Forge / subversion works fine and you can (almost) do the same things as on github, e.g.
a) Subversion facilitates collaboration well and also provides ways to back out of mistakes
b) other users can easily create a branch / fork using subversion and later one can decide whether the changes should be merged to the (main) development version
c) you can use a "tracker" (instead of a forum) to track issues on R-Forge and I think that this works quite well
d) you can easily install the current development version by "R CMD INSTALL"
e) you can use "devtools" (e.g. together with RStudio)
f) RStudio has an integrated subversion client
I am not principally against migrating the maxLik package to github but it seems to me that github has only limited advantages and also a few disadvantages so that I am not convinced that we should 'change a running system'.

Best,
Arne

RE: suggest storing the value of loglik in the object returned from maxLik()
By: Randall Pruim on 2016-02-11 05:47

[forum:42912]

github doesn't do the package building, but using devtools, it is easy to send your package to winbuilder, where it is built under the current release and developmental versions of R. For the package maintainer, it is a single command:

build_win()

Using that is independent of choosing to use github, but the entire RStudio/github/devtools suite makes for a very nice package development environment.

Similarly, github does not do travis CI itself, but travis CI knows how to interact with github (after you do some configuration) so that whenever things change on github, travis CI knows to pull your project and rebuild and test your package.

This gets to your point of "everyone knows github". Because lots of people use github, lots of people create tools like this that work with github.

By the way, your package is already available (in read-only form) via github at https://github.com/cran/maxLik, and you could fork that to get things started very easily in github. It would even start out with 20 commits (from 20 CRAN submissions) already in place.
