R packages by trevorhastie

glmnet - Lasso and Elastic-Net Regularized Generalized Linear Models

Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.

Last updated 2 years ago

fortran cpp

15.15 score 82 stars 736 dependents 22k scripts 177k downloads
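
The path-wise cyclical coordinate descent mentioned in the glmnet description can be illustrated with a minimal sketch for the Gaussian lasso only. This is plain Python, not glmnet's own compiled code. The function names (`soft_threshold`, `lasso_cd`, `lasso_path`), the (1/2n) scaling of the squared-error loss, and the omission of an intercept and of standardization are all assumptions made for illustration.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, beta=None, n_sweeps=200, tol=1e-10):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclical coordinate descent.

    X is a list of n rows, each a list of p floats. Hypothetical helper:
    no intercept and no standardization, unlike a production solver.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p if beta is None else list(beta)
    # Residual for the starting coefficients: r = y - X beta.
    r = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    # Per-column scale (1/n) * x_j' x_j.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):  # cycle through the coordinates in order
            if col_sq[j] == 0.0:
                continue
            # (1/n) * x_j' (partial residual that leaves out coordinate j)
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):  # keep the residual in sync
                    r[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def lasso_path(X, y, lambdas):
    """Fit along a decreasing lambda grid, warm-starting from the previous fit."""
    path, beta = [], None
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_cd(X, y, lam, beta)
        path.append((lam, beta))
    return path
```

Calling `lasso_path(X, y, [1.0, 0.1, 0.01])` returns one `(lambda, coefficients)` pair per grid value, largest penalty first. Warm starts are what make computing the whole path cheap relative to solving each problem from scratch.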

gam - Generalized Additive Models

Functions for fitting and working with generalized additive models, as described in chapter 7 of "Statistical Models in S" (Chambers and Hastie (eds), 1991), and "Generalized Additive Models" (Hastie and Tibshirani, 1990).

Last updated 7 months ago

fortran openblas

9.50 score 4 stars 61 dependents 2.2k scripts 20k downloads

lars - Least Angle Regression, Lasso and Forward Stagewise

Efficient procedures for fitting an entire lasso sequence with the cost of a single least squares fit. Least angle regression and infinitesimal forward stagewise regression are related to the lasso, as described in Efron, Hastie, Johnstone and Tibshirani (2004), "Least Angle Regression", Annals of Statistics.

Last updated 3 years ago

fortran

7.98 score 6 stars 78 dependents 700 scripts 9.7k downloads

mda - Mixture and Flexible Discriminant Analysis

Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO, and vector-response smoothing splines. Hastie, Tibshirani and Friedman (2009) "Elements of Statistical Learning (second edition, chap 12)" Springer, New York.

Last updated 5 months ago

fortran

7.60 score 3 stars 17 dependents 428 scripts 12k downloads

ISLR - Data for an Introduction to Statistical Learning with Applications in R

We provide the collection of data-sets used in the book 'An Introduction to Statistical Learning with Applications in R'.

Last updated 4 years ago

7.58 score 4 stars 2 dependents 10k scripts 25k downloads

softImpute - Matrix Completion via Iterative Soft-Thresholded SVD

Iterative methods for matrix completion that use nuclear-norm regularization. There are two main approaches. The first uses iterative soft-thresholded SVDs to impute the missing values. The second uses alternating least squares. Both have an 'EM' flavor, in that at each iteration the matrix is completed with the current estimate. For large matrices there is a special sparse-matrix class named "Incomplete" that efficiently handles all computations. The package includes procedures for centering and scaling rows, columns or both, and for computing low-rank SVDs on large sparse centered matrices (i.e. principal components).

Last updated 4 years ago

fortran

7.47 score 10 stars 22 dependents 253 scripts 1.8k downloads
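
The soft-thresholded SVD iteration in the softImpute description can be written compactly. The notation below is an assumption added for illustration and does not appear in the listing: X is the partially observed matrix, Ω the set of observed entries, P_Ω the projection that keeps observed entries and zeroes the rest, P_Ω^⊥ its complement, and λ the nuclear-norm penalty.

```latex
\begin{aligned}
\min_{Z}\;& \tfrac{1}{2}\,\lVert P_\Omega(X) - P_\Omega(Z) \rVert_F^2 + \lambda \lVert Z \rVert_* \\
Z^{(k+1)} &= S_\lambda\!\bigl(P_\Omega(X) + P_\Omega^{\perp}(Z^{(k)})\bigr) \\
S_\lambda(W) &= U \,\operatorname{diag}\bigl((d_1-\lambda)_+,\dots,(d_r-\lambda)_+\bigr)\, V^{\top},
\qquad W = U \operatorname{diag}(d_1,\dots,d_r)\, V^{\top}
\end{aligned}
```

The second line is the 'EM' flavor: the missing entries are filled in from the current estimate, and the completed matrix is then shrunk toward low rank by soft-thresholding its singular values.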

adelie - Group Lasso and Elastic Net Solver for Generalized Linear Models

Extremely efficient procedures for fitting the entire group lasso and group elastic net regularization path for GLMs, multinomial, the Cox model and multi-task Gaussian models. Similar to the R package 'glmnet' in scope of models, and in computational speed. This package provides R bindings to the C++ code underlying the corresponding Python package 'adelie'. These bindings offer a general purpose group elastic net solver, a wide range of matrix classes that can exploit special structure to allow large-scale inputs, and an assortment of generalized linear model classes for fitting various types of data. The package is an implementation of Yang, J. and Hastie, T. (2024) <doi:10.48550/arXiv.2405.08631>.

Last updated 30 days ago

cpp openmp

5.78 score 6 stars 3 scripts 418 downloads

ISLR2 - Introduction to Statistical Learning, Second Edition

We provide the collection of data-sets used in the book 'An Introduction to Statistical Learning with Applications in R, Second Edition'. These include many data-sets that we used in the first edition (some with minor changes), and some new data-sets.

Last updated 2 years ago

5.49 score 2 stars 2.2k scripts 14k downloads

gamsel - Fit Regularization Path for Generalized Additive Models

Using overlap grouped-lasso penalties, 'gamsel' selects whether a term in a 'gam' is nonzero, linear, or a non-linear spline (up to a specified max df per variable). It fits the entire regularization path on a grid of values for the overall penalty lambda, both for gaussian and binomial families. See <doi:10.48550/arXiv.1506.03850> for more details.

Last updated 6 months ago

openblas

3.97 score 2 stars 31 scripts 276 downloads

svmpath - The SVM Path Algorithm

Computes the entire regularization path for the two-class SVM classifier with essentially the same cost as a single SVM fit.

Last updated 5 years ago

2.85 score 2 dependents 39 scripts 359 downloads

ProDenICA - Product Density Estimation for ICA using Tilted Gaussian Density Estimates

A direct and flexible method for estimating an ICA model. This approach estimates the densities for each component directly via a tilted Gaussian. The tilt functions are estimated via a GAM Poisson model. Details can be found in "Elements of Statistical Learning (2nd Edition)" in Section 14.7.4.

Last updated 3 years ago

2.23 score 2 stars 21 scripts 170 downloads

sparsenet - Fit Sparse Linear Regression Models via Nonconvex Optimization

Efficient procedure for fitting regularization paths between L1 and L0, using the MC+ penalty of Zhang, C.H. (2010) <doi:10.1214/09-AOS729>. Implements the methodology described in Mazumder, Friedman and Hastie (2011) <doi:10.1198/jasa.2011.tm09738>. Sparsenet computes the regularization surface over both the family parameter and the tuning parameter by coordinate descent.

Last updated 4 months ago

fortran

2.08 score 2 stars 1 dependent 20 scripts 590 downloads
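
How the MC+ family in the sparsenet description sits "between L1 and L0" can be seen in its one-dimensional threshold function. The sketch below is a minimal illustration, assuming a single standardized predictor and gamma > 1. The function name `mcplus_threshold` is hypothetical and is not part of the sparsenet package.

```python
def mcplus_threshold(b, lam, gamma):
    """Univariate MC+ threshold applied to a least-squares coefficient b.

    lam is the tuning parameter and gamma > 1 the family parameter.
    Hypothetical helper for illustration; assumes a standardized predictor.
    """
    if gamma <= 1:
        raise ValueError("gamma must exceed 1")
    a = abs(b)
    if a <= lam:
        # Small coefficients are set exactly to zero.
        return 0.0
    if a <= lam * gamma:
        # Middle zone: shrink, but less aggressively than the lasso does.
        sign = 1.0 if b > 0 else -1.0
        return sign * (a - lam) / (1.0 - 1.0 / gamma)
    # Large coefficients are left unshrunk.
    return float(b)
```

As gamma grows, the middle zone widens and the rule approaches soft thresholding (the lasso). As gamma approaches 1 from above, the middle zone vanishes and the rule approaches hard thresholding (best subset). A coordinate descent solver would apply such a threshold one coefficient at a time over a grid of (gamma, lam) pairs.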