Deep learning based genome scan with various ready-to-use models and self-defined models.

DeepGenomeScan(genotype, ...)

## Default S3 method:
DeepGenomeScan(
  genotype, 
  env,
  method = "mlp",
  preProcess = NULL,
  ...,
  weights = NULL,
  metric = ifelse(is.factor(y), "Accuracy", "RMSE"),
  maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE"), FALSE, TRUE),
  trControl = trainControl(),
  tuneGrid = NULL,
  tuneLength = ifelse(trControl$method == "none", 1, 3),
  seed = 123)

## S3 method for class 'formula'
DeepGenomeScan(form, data, ..., weights, subset, na.action = na.fail, contrasts = NULL, seed = 123)

## S3 method for class 'recipe'
DeepGenomeScan(
  genotype,
  data,
  method = "mlp",
  ...,
  metric = ifelse(is.factor(y_dat), "Accuracy", "RMSE"),
  maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE"), FALSE, TRUE),
  trControl = trainControl(),
  tuneGrid = NULL,
  tuneLength = ifelse(trControl$method == "none", 1, 3),
  seed = 123)

Arguments

genotype

The genotype matrix, or other omics-based variations matrix. This is a matrix where samples are in rows and variations/features are in columns. This could be a simple matrix, data frame or other type (e.g. sparse matrix) but must have column names (see Details below). Preprocessing using the preProcess argument only supports matrices or data frames. When using the recipe method, _"genotype"_ should be an unprepared recipe object that describes the model terms (i.e. outcome, predictors, etc., e.g. the h2o model in our example) as well as any pre-processing that should be done to the data. This is an alternative approach to specifying the model. Note that, when using the recipe method, any arguments passed to preProcess will be ignored. See the links and example below for more details using recipes.
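As a minimal sketch of the expected layout, the base R code below builds a small simulated genotype matrix with samples in rows and variants in columns, and sets the column names that the description above requires. The dimensions, the 0/1/2 allele-count coding, and the `ind`/`snp` naming scheme are illustrative assumptions, not requirements stated by the package.

```r
set.seed(123)

n_samples <- 20  # illustrative size only
n_snps <- 50     # illustrative size only

# Simulated allele counts (0, 1, 2); this is made-up data, not a real dataset.
genotype <- matrix(
  sample(0:2, n_samples * n_snps, replace = TRUE),
  nrow = n_samples,
  ncol = n_snps,
  dimnames = list(
    paste0("ind", seq_len(n_samples)),  # samples in rows
    paste0("snp", seq_len(n_snps))      # variants in columns, with names
  )
)

dim(genotype)
head(colnames(genotype))

# Column names must be present before the matrix is passed on.
stopifnot(!is.null(colnames(genotype)))
```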

env

Environmental or trait data: environmental variables, geographic coordinates, or the latent geographic genetic structure variables produced by KLFDAPC. For GWAS, this should be the trait or phenotypic variable. The data can be a numeric or factor vector with one value per sample.
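The type of this vector drives several defaults in the usage signatures. The sketch below re-evaluates the default expressions for metric, maximize and tuneLength in plain R, so the effect of a numeric versus a factor response is visible. The helper `resolve_defaults()` and its `tr_method` argument are hypothetical names introduced here for illustration; they are not part of the package, and `tr_method` merely stands in for `trControl$method`.

```r
# Hypothetical helper: mirrors the default expressions from the usage section.
resolve_defaults <- function(y, tr_method = "boot") {
  metric <- ifelse(is.factor(y), "Accuracy", "RMSE")
  maximize <- ifelse(metric %in% c("RMSE", "logLoss", "MAE"), FALSE, TRUE)
  tune_length <- ifelse(tr_method == "none", 1, 3)
  list(metric = metric, maximize = maximize, tuneLength = tune_length)
}

# Made-up example responses, one value per sample.
env_numeric <- c(12.1, 15.3, 9.8, 14.0)
env_factor <- factor(c("north", "south", "north", "south"))

str(resolve_defaults(env_numeric))                      # regression defaults
str(resolve_defaults(env_factor))                       # classification defaults
str(resolve_defaults(env_numeric, tr_method = "none"))  # no resampling
```

A numeric response therefore yields an error metric that is minimised, while a factor response yields an accuracy metric that is maximised.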

method

A ready-to-use or user-defined model. Either a string specifying which model to use, or a list of functions defining a custom model; see the example tutorials.

...

Arguments passed to the defined models (such as "mlpWeightDecayML"). Errors will occur if values for tuning parameters are passed here.

Details

Deep learning based genome scan with various ready-to-use models and self-defined models. Users can choose among the models available in this package, or construct their own models and compile them within the DeepGenomeScan framework.

Value

model_train

The optimal model, trained with the specified model and resampling method.
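A call using the default method might look like the following sketch. It relies only on arguments listed in the usage section, runs on simulated data, and is guarded so that it does nothing when the required packages are missing. The assumptions are that `trainControl()` comes from caret, that the "mlp" model needs RSNNS installed, and that 5-fold cross-validation is a reasonable resampling choice; none of these is confirmed by this page, and the printed structure of the result is not specified here.

```r
set.seed(123)

# Simulated inputs: 40 samples, 30 named variants, one numeric response.
genotype <- matrix(
  sample(0:2, 40 * 30, replace = TRUE),
  nrow = 40,
  dimnames = list(NULL, paste0("snp", 1:30))
)
env <- rnorm(40)

# Assumed dependencies; the call is skipped if any of them is unavailable.
pkgs <- c("DeepGenomeScan", "caret", "RSNNS")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  # Assumed resampling setup: 5-fold cross-validation.
  ctrl <- caret::trainControl(method = "cv", number = 5)
  model_train <- DeepGenomeScan::DeepGenomeScan(
    genotype, env,
    method = "mlp",
    metric = "RMSE",
    trControl = ctrl,
    tuneLength = 3,
    seed = 123
  )
  print(model_train)
} else {
  message("Required packages are not installed; skipping the model fit.")
}
```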


References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., ... & Kudlur, M. (2016). Tensorflow: A system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16) (pp. 265-283).

Bergmeir, C. N., & Benítez Sánchez, J. M. (2012). Neural networks in R using the Stuttgart neural network simulator: RSNNS. American Statistical Association.

Beck, M. W. (2018). NeuralNetTools: Visualization and analysis tools for neural networks. Journal of statistical software, 85(11), 1.

Candel, A., Parmar, V., LeDell, E., & Arora, A. (2016). Deep learning with H2O. H2O. ai Inc.

Deane-Mayer, Z. A., & Knowles, J. E. (2016). caretEnsemble: ensembles of caret models. R package version, 2(0).

Gulli, A., & Pal, S. (2017). Deep learning with Keras. Packt Publishing Ltd.

Klima, G. (2016). FCNN4R: Fast Compressed Neural Networks for R. R package version 0.6, 2.

Kuhn, M. (2015). Caret: classification and regression training. ascl, ascl-1505.

Qin, X. 2020. DeepGenomeScan. v0.5.5.

Qin, X., Gaggiotti, EO., Chiang, C. 2020. Detecting natural selection via deep learning. In submission.

Author

qinxinghu@gmail.com

Note

https://github.com/xinghuq/DeepGenomeScan/