rgsrs

Tidy R client for the FDA Global Substance Registration System (GSRS) REST API. Retrieve substance metadata, synonyms, cross-reference codes, and chemical structure information for any of the 170 000+ registered substances.

No API key required.

Installation

# install.packages("pak")
pak::pak("c1au6i0/rgsrs")

Functions

Function                     Description
gsrs_search()                Free-text or Lucene-syntax search
gsrs_substance()             Substance metadata by UNII (vectorised)
gsrs_names()                 All registered synonyms for a UNII
gsrs_codes()                 Cross-reference codes for a UNII, with optional code_system filter
gsrs_unii_from_name()        Resolve a substance name to its UNII
gsrs_structure()             Chemical structure data (SMILES, formula, MW, InChI, …) by UNII
gsrs_structure_search()      Substructure / similarity / exact search by SMILES
gsrs_chem_info()             Chemical structure info from any identifier: name, CAS, UNII, InChIKey, or SMILES
gsrs_hierarchy()             Parent/child relationship tree for a UNII
gsrs_all()                   Convenience wrapper: substance + names + codes + structure + hierarchy
gsrs_browse()                Page through the full GSRS substance catalogue
gsrs_vocabularies()          Retrieve controlled vocabulary terms
write_dataframes_to_excel()  Write a named list of data frames to .xlsx

Usage

library(rgsrs)

# Search
gsrs_search("aspirin", top = 5)

# Fetch by UNII (aspirin = R16CO5Y76E)
gsrs_substance("R16CO5Y76E")

# All synonyms
gsrs_names("R16CO5Y76E")

# Cross-references – all systems
gsrs_codes("R16CO5Y76E")

# Cross-references – CAS only
gsrs_codes("R16CO5Y76E", code_system = "CAS")

# Name → UNII
gsrs_unii_from_name(c("aspirin", "ibuprofen"))

# Chemical structure info by name
gsrs_chem_info(c("aspirin", "ibuprofen"), type = "name")

# Chemical structure info by CAS number
gsrs_chem_info(c("50-78-2", "15687-27-1"), type = "cas")

# Chemical structure info by UNII
gsrs_chem_info("R16CO5Y76E", type = "unii")

# Chemical structure info by InChIKey
gsrs_chem_info("BSYNRYMUTXBXSQ-UHFFFAOYSA-N", type = "inchikey")

# Chemical structure info by SMILES (exact match)
gsrs_chem_info("CC(=O)Oc1ccccc1C(=O)O", type = "smiles")

# Everything at once
out <- gsrs_all("R16CO5Y76E")
str(out, max.level = 1)

All functions accept a verbose argument (default TRUE) and return tidy data frames with a query column tracking the input identifier. Failed lookups return NULL with a warning rather than throwing an error.
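
Because a failed lookup returns NULL rather than stopping, batch code can drop the NULLs before combining results. The sketch below shows one way to do that in base R. It runs offline: fake_lookup() is a hypothetical stand-in for a package call such as gsrs_substance(), and its name column is illustrative only, not the package's actual output.

```r
# Hypothetical stand-in for a package call such as gsrs_substance():
# returns a one-row data frame with a `query` column, or NULL with a warning.
fake_lookup <- function(id) {
  known <- c(R16CO5Y76E = "ASPIRIN")  # illustrative mock data
  if (!id %in% names(known)) {
    warning("No record found for: ", id, call. = FALSE)
    return(NULL)
  }
  data.frame(query = id, name = unname(known[id]), stringsAsFactors = FALSE)
}

ids <- c("R16CO5Y76E", "NOT-A-UNII")

# Run each lookup, silencing the expected warnings for this demo.
results <- suppressWarnings(lapply(ids, fake_lookup))

# Drop NULLs (failed lookups), then row-bind what is left.
ok <- Filter(Negate(is.null), results)
combined <- do.call(rbind, ok)

# The `query` column records which inputs succeeded; the rest failed.
failed <- setdiff(ids, combined$query)
```

The same pattern should carry over to real calls by swapping fake_lookup() for the package function, assuming each call returns either a data frame or NULL as described above.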

License

MIT © 2026 Claudio Zanettini