variantspark: A 'Sparklyr' Extension for 'VariantSpark'
This is a 'sparklyr' extension that integrates 'VariantSpark' with R. 'VariantSpark' is 
  a framework based on 'Scala' and 'Spark' for analyzing genome datasets, 
  see <https://bioinformatics.csiro.au/>. It was tested on datasets with 3000 samples, 
  each containing 80 million features, in both unsupervised clustering approaches 
  and supervised applications such as classification and regression. Genome datasets
  are usually written in VCF (Variant Call Format), a text file format used 
  in bioinformatics for storing gene sequence variations. 'VariantSpark' is therefore well 
  suited to genome research, because it can read VCF files, run analyses, and return 
  the output in a 'Spark' data frame.
Linking:
Please use the canonical form
https://CRAN.R-project.org/package=variantspark
to link to this page.