Here are the notes regarding the ADRs of strong and moderate cytochrome P450 (CYP) substrates used in the data in the accompanying notebook (post version or Jupyter notebook version). The dataset currently contains ADRs for CYP3A4, 2D6, 2C19, 2C9, 1A2, 2B6, 2E1 and 2C8 substrates.
For columns “drug_name” and “cyp_strength_of_evidence”
note:
“cyp_strength_of_evidence” column has been renamed as “cyp_type_and_cyp_strength_of_evidence” in the cyp_substrates_adrs.csv file to include all the different CYP enzyme substrates
“cyp_strength_of_evidence” column is only used in cyps3a4_substrates.csv as it’s a file specifically for CYP3A4 substrates
All drug names and CYP strengths of evidence are based on the Flockhart cytochrome P450 drug-drug interaction table (Flockhart et al. 2021), which is available at https://drug-interactions.medicine.iu.edu/MainTable.aspx.
The strengths of evidence that a drug is metabolised by a certain CYP enzyme are categorised as below (as quoted from above web link):
- Strong Evidence (denoted with s_ followed by CYP name e.g. s_3A4) - the enzyme is majorly responsible for drug metabolism.
- Moderate Evidence (denoted with m_ followed by CYP name e.g. m_3A4) - the enzyme plays a significant but not exclusive role in drug metabolism or the supporting literature is not extensive.
Data sources for column “drug_class”
This information can be found in many different national drug formularies, drug reference textbooks e.g. Martindale, American society of health-system pharmacists’ (ASHP) drug information (DI) monographs, PubChem, ChEMBL, FDA, Micromedex etc. or online drug resources such as Drugs.com. For the particular small dataset used in the notebook, and also the later expanded dataset on all CYP substrates, the ADR reference sources listed below also contain information on therapeutic drug classes for all drugs.
Data sources for ADRs
1st-line: Drugs.com
using the health professional version for ADRs which usually contains references from pharmaceutical manufacturers’ medicines information data sheets, ASHP DI monographs or journal paper references
reason for choosing this web data source is that it is open to public (without login registrations or paywalls) and contains the last updated date along with all references used for the ADRs information on each drug
2nd-line as separate data checks:
New Zealand formulary (nzf) - mainly adapted from the British national formulary (BNF) with drug information tailored to New Zealand (NZ) populations (influenced by government funding restrictions), and likely only available to NZ residents only due to licensing restrictions, other national formularies should contain very similar drug information
electronic medicines compendium (emc) - United Kingdom (UK)-based drug reference; equivalents of UK/Europe-based pharmaceutical manufacturers’ medicines data sheets (note: there may be limited numbers of free access each month, and may require sign ups to access drug information)
Drugs@FDA - United States (US)-based drug reference from the Food and Drug Administration (FDA)
drugs.com_uk_di - UK drug information section in Drugs.com (equivalent to pharmaceutical manufacturers’ medicines information data sheets)
pharmaceutical manufacturers’ medicines information data sheets which are referenced as “data_sheet” in the data for each drug with notes (marked with asterisks)
reason to include a 2nd-line ADRs source is to ensure that all ADRs recorded in this dataset is clinically relevant in real-life cases (although mainly in clinical trials; but this is where postmarketing reports come into place, and also I’m attempting to use my previous on-site clinical work experience when curating the dataset). It also acts as an information accuracy check from a different, independent, trustworthy drug information source that is often referred to and used by healthcare professionals around the globe
two main types of occurrences/frequencies used:
^^ - common > 10%,
^ - less common 1% to 10%,
(not going to include other ones with lower incidences e.g. less common at 0.1% to 1% or rare for less than 0.1% etc.)
Notes for ADRs
About similar ADR terms
all minor ADRs are removed as the current focus will be more on ADRs that’ll affect qualities of lives or are potentially more life-threatening (e.g. dry skin, acne, chills, flatulence and abdominal discomforts are removed)
nausea and vomiting applies to many drugs so won’t be included (almost every drug will have these ADRs, they can be alleviated with electrolytes replacements and anti-nausea meds or other non-med options; rash on the other hand can sometimes be serious and life-threatening e.g. Stevens-Johnson syndrome)
similar or overlapping adverse effects will be removed to keep only one adverse effect for the same drug e.g. adverse skin reactions, rash, urticaria - rash and urticaria will be removed as adverse skin reactions encompass both symptoms
for ADR terms with similar meanings, e.g. pyrexia/fever - fever is used instead (only one will be used); attempts have been made to use just one ADR term to represent various other similar-meaning ADRs to minimise duplications
deduplications of ADR data have been attempted e.g. for fatigue, malaise, lethary or asthenia, I’ve decided to use fatigue only to cover all of these similar-meaning terms; the only exception is when the same term is reported as an ADR and also during postmarketing period, then both terms are kept (there may still be some duplicated terms that I’ve missed… but this can be sorted during the data preprocessing step if needed)
Regarding ADRs and categories
ADRs mentioned in common ADRs category and repeated in the less common one will have the ADR recorded in the higher incidence rate (at > 10%) only
some ADRs can be dose-related or formulations-related e.g. injection site irritations or allergic reactions caused by excipients/fillers, since my very initial (and naive) aim is to investigate the relationships between ADRs and drugs via computational tools e.g. any patterns between ADRs and drugs, so these types of ADRs will not be recorded in the dataset; all ADRs are currently not formulation-specific (i.e. not differentiated between solid oral dosage forms such as tablets or capsules, suspensions/liquids, subcutaneous, intravenous or intramuscular injections or topical formulations)
Some of the ADRs recorded for some of the drugs may apply to younger or paediatrics populations only - this may need to be taken care of when using these data to build machine learning models, specific drug notes with astericks in the dataset should be in place as reminders, otherwise most of the ADRs in the datset should be mainly applicable to adult and/or paediatric populations (individual drug information data sheet should hopefully provide more details about this)
this current ADRs dataset has not directly considered any pharmacogenetic/pharmacogenomic factors but this may be useful and interesting to look into in the future for relevant drug projects
For postmarketing reports
some postmarketing adverse effects are for different age populations e.g. paediatric patients of up to 12 years of age or elderly people - for now all of them are labelled as “(pm)” to denote postmarketing reports and are not differentiated in age groups
postmarketing reports are limited to the experiences of certain unknown population sizes using the drug, therefore it is not possible to directly extrapolate the frequencies of these reports, and also not possible to confirm a direct causal relationship between the drug and its postmarketing reports (a potential downside of the ADRs data, but with more reportings in place, there’s a larger possibility to observe common ADRs traits or trends during data analysis that may help to decode possible causes or mechanisms in the future)
not all postmarketing reports are included for each drug in the dataset, but only ones that are relevant clinically or mentioned in the 1st-line or in both 1st-line and 2nd-line data sources (this can be expanded further in the future if needed e.g. for investigations on postmarketing ADRs reports only)
More on psychiatric medicines
serotonergic drugs tend to induce serotonin syndrome (e.g. tremor, ataxia, restlestness, shivering, sweating, fever, tachycardia, tachypnoea, confusion, agitation, coma) especially if used in combinations concomitantly (monotherapy tends to have smaller risk)
typical ADRs of antipsychotics drugs can include extrapyramidal symptoms (parkinsonian symptoms e.g. tremors, dystonias or dyskinesia (abnormal muscular spasms), akathisia (restlessness) and tardive dyskinesia (most significant antipsychotic ADR causing abnormal involuntary movements of face, jaw and tongue; common in first generation antipsychotics)), constipation, sexual dysfunction, cardiovascular adverse effects, hyperglycaemia, problem with body temperature regulations and neuroleptic malignant syndrome (usually rare but potentially can be fatal)
Other bits and pieces
one other thing I’d like to mention is that all the ADRs recorded in this repository are not merely a copy-and-paste action from the drug reference sources, they also include or are integrated with the ones I’ve encountered from my previous pharmacist work experience (in the hope to better reflect bedside or clinical ADRs, and this is also why collecting data has been taking a long time…)
one last thing to mention is that all the ADRs in the dataset are in US spellings, but this note and also any other notebooks or associated posts will be written in my more familiar UK/NZ-based spellings (in case anyone’s wondering)
Notes for specific selected drugs
note: list of drugs not in alphabetical order; quick links for each drug in table of contents on the right
hydrocortisone
- A moderate CYP3A4 substrate with no reported ADR frequencies at all for its ADRs as they are entirely dependent on the dosage and duration of use (ADRs tend to be unnoticeable at appropriate low doses for short durations)
terfenadine
- A strong CYP3A4 substrate that is actually withdrawn from the market in 1990s due to QT prolongations
lercanidipine
- A moderate CYP3A4 substrate that has nil reported ADRs of more than 1% but has a few postmarketing reports recorded
telaprevir
- A moderate CYP3A4 substrate that is usually administered within a combination therapy (e.g. along with peginterferon alfa and ribavirin)
quinine
- A moderate CYP3A4 substrate that has all of its ADRs reported without frequencies. The most common ADRs are presented as a cluster of symptoms (known as cinchonism) and can occur during overdoses (usually very toxic) and also normal doses. These symptoms include “…tinnitus, hearing impairment, headache, nausea, vomiting, abdominal pain, diarrhoea, visual disturbances (including blindness), arrhythmias (which can have a very rapid onset), convulsions (which can be intractable), and rashes.” (as quoted from NZ formulary v150 - 01 Dec 2024)
ribociclib
- A moderate CYP3A4 substrate that has a listed ADR of on-treatment deaths, which were found to be associated with patients also taking letrozole or fulvestrant at the same time and/or in patients with underlying malignancy
nortriptyline
- A strong CYP2D6 substrate with no frequencies recorded for all of its ADRs - the recorded ADRs in files will be mainly based on nzf which has stated these ADRs as common ones with varying risk and extent, a good rule of thumb is to be aware of the well-known tricyclic antidepressant-related ADRs e.g. antimuscarinic effects (dry mouth, blurred vision, constipation, urinary retention)
perhexiline
- A strong CYP2D6 substrate with no records of medicines information in Drugs@FDA or Drugs.com so the main source of drug information is nzf and its pharmaceutical manufacturer’s medicines data sheet. The ADRs may be present for the first two to four weeks of treatment only
escitalopram
- A moderate CYP2D6 substrate that is also known to induce selective serotonin reuptake inhibitor-related hyponatraemia (through inappropriate antidiuretic hormone secretion) when certain risk factors are also present at the same time e.g. diuretic use, female gender, low body weight, geriatric populations along with low baseline sodium level
lidocaine
- A moderate CYP2D6 substrate that doesn’t have ADR frequencies recorded - ADRs for amide local anaesthetics apply instead (mainly central nervous and cardiovascular system-related) and toxicity is commonly dose-dependent (e.g. increased risk due to high plasma concentration); hypersensitivity reactions may sometimes be due to preservatives used (e.g. parabens)
metoclopramide
cyclophosphamide
- A moderate CYP2C19 substrate that is also a cytotoxic drug in the class of aklyating agents. One of its notable ADRs is urothelial toxicity (presented as haemorrhagic cystitis) caused by acrolein (its urinary metabolite via hepatic metabolism). Ways to alleviate this is to increase fluid intake for at least 24 to 48 hours after intravenous injection or use mesna. High dose of cyclophosphamide may also cause cardiotoxicity
r-mephobarbital
- A strong CYP2C19 substrate, manufactured previously as mephobarbital (also known as methylphenobarbital), is currently not commonly prescribed (discontinued from 2011-12). One of the likely reasons could be due to its metabolism by CYP enzymes to phenobarbital, which is also available as an antiepileptic drug. Other possible reasons are its higher risk from overdose and drug dependency, so often barbiturate-based drug is replaced with other antiepileptics or non-antiepileptics instead depending on indications. Its ADRs is based on this medicines data sheet link from its last pharmaceutical manufacturer (Lundbeck Inc.)
flurbiprofen
- A moderate CYP2C9 substrate and also one of the non-steroidal anti-inflammatory drugs known to be associated with cardiovascular, gastrointestinal (cyclo-oxygenase-2 selective inhibitors may have lower risk) and renal-related risks, therefore most of the ADRs and postmarketing reports recorded will focus on these physiological influences mainly
glimepiride
- A moderate CYP2C9 substrate which is also one of the sulfonylureas used for blood glucose control in type-2 diabetics. One of its postmarketing reports of “disulfiram-like reaction” is about the unpleasant effects (e.g. facial flushing, headache, palpitations, tachycardia) experienced after alcohol consumption (note: disulfiram is another drug often used for treating alcohol dependence)
glyburide
- A moderate CYP2C9 substrate which is also known as glibenclamide in some of the countries around the world
phenytoin
- A moderate CYP2C9 substrate that may cause severe cutaneous adverse reactions (SCAR), e.g. drug reaction with eosinophilia and systemic symptoms, Stevens-Johnson syndrome, toxic epidermal necrolysis, erythema multiforme, and acute generalised exanthematous pustulosis, which may require closer monitoring and drug discontinuation if required. Its occurrence is more likely in a genetic population with a particular human leukocyte antigen (HLA) allele present e.g. HLA-B^*15:02
warfarin
- A moderate CYP2C9 substrate that doesn’t have any ADR frequencies recorded as it’s a well-known vitamin K antagonist, often used as an oral anticoagulant, with a prominent ADR of causing fatal or non-fatal organ or tissue haemorrhage. The best strategy to prevent any haemorrhagic events is to monitor via international normalised ratio (INR) testing
caffeine
- A strong CYP1A2 substrate which is normally quite well tolerated orally; the ADRs recorded in the dataset here is mostly based on caffeine citrate 20mg/mL injection, which is often used in premature infants and may not be applicable to other populations
clozpine
- A strong CYP1A2 substrate that is one of the second generation or atypical antipsychotics with many ADRs requiring close monitoring e.g. frequencies of bowel motions (contipation is one of the common ADRs that is potentially fatal), white cell counts and cardiac toxicity-related symptoms and/or cardiovascular-related lab marker changes; people with pre-existing diabetes or liver diseases are also prone to ADRs related to these diseases while on clozapine
melatonin
- A strong CYP1A2 substrate that has no ADRs with frequencies above 10% as it’s normally quite well tolerated at appropriate dose for short-term use (e.g. 3 months)
pomalidomide
- A moderate CYP1A2 substrate which is also an immunomodulating drug that has antineoplastic property to treat relapsed multiple myeloma and is structurally related to thalidomide
tacrine
- A strong CYP1A2 substrate that has been withdrawn from the US market in May 2012. An archived LiverTox reference has explained in details about the highly common hepatoxicity caused by tacrine which eventually led to the market withdrawal of this drug (while there are also other anticholinesterase inhibitors available for use). For exact drug information references used by Drugs.com for the ADRs of tacrine, here’s a list of references (which included the old pharmaceutical manufacturer’s data sheet)
theophylline
- A moderate CYP1A2 substrate with ADRs that are more likely to occur in overdoses (therapeutic drug monitoring is recommended as therapeutic and toxic doses may be very close to each other)
bupropion
- A strong CYP2B6 substrate that is also used for smoking cessation
efavirenz
- A strong CYP2B6 substrate that has a very common ADR of rash that often occurs in the first 2 weeks of treatment, it could be minor if there are no blistering, desquamations or any other widespread signs and the drug can be continued with symptoms disappearing usually after about a month; alternatively, if the symptoms are severe, the drug should be discontinued. Central nervous system-type ADRs will usually slowly improve after continuous use of drug, but if there are pre-existing psychiatric conditions, the risk of psychiatric ADRs may need to be further considered before initiating or continuing the drug
isoflurane
- A strong CYP2E1 substrate which is also a type of inhalational anaesthetic agent (in volatile liquid form) that is normally administered through vaporiser and requires highly trained staff for administration and resuscitation if needed
cerivastatin
- A strong CYP2C8 substrate that has been withdrawn from the market since 2001 due to serious adverse effect of rhabdomyolysis leading to kidney failure and subsequently reported deaths (reference or alternative link)
enzalutamide
- A moderate CYP2C8 substrate with a known postmarketing ADR report of posterior reversible encephalopathy syndrome, which include symptoms such as seizure, headache, confusion, other visual and neurological disturbances with or without associated hypertension (more details here or alternative link)
rosiglitazone
- A moderate CYP2C8 substrate that has an ADR of increasing the risk of congestive heart failure and heart attack reported from postmarketing use and also one of its earlier clinical trials. It was withdrawn for medical use in several countries back in 2011 but US continued its use with restrictions and further removed the restriction (reference) when the drug was found not to increase cardiovascular risk in another clinical trial. Currently, the cardiovascular risk is still recommended to be considered before use
tucatinib
- A moderate CYP2C8 substrate that has been tested along with trastuzumab and capecitabine in clinical trials, therefore its ADRs are based on these combination of treatments, and not solely on tucatinib. One of its ADRs, palmar plantar erythrodysesthesia syndrome is also known as hand-foot syndrome or chemotherapy-induced acral erythema, with symptoms such as tingling, numbness and redness of palms and soles (reference)
Abbreviations used
note: not in alphabetical order
- ws = withdrawal symptoms
- ADH = antidiuretic hormone
- pm = postmarketing reports
- CNS = central nervous system
- CFTR = cystic fibrosis transmembrane regulator
- c_diff = Clostridioides/Clostridium difficile
- ZE = Zollinger-Ellison
- MTOR = mammalian target of rapamycin (protein kinase)
- AST = aspartate transaminase/aminotransferase
- ALT = alanine transaminase/aminotransferase
- ALP = alkaline phosphatase
- GGT = gamma-glutamyltransferase
- RTI = respiratory tract infection
- UTI = urinary tract infection
- LDH = lactate dehydrogenase
- dd = dose and duration-dependent
- pm_HIV_pit = postmarketing reports for HIV protease inhibitor therapy
- pm_hep_cyto = postmarketing reports in cancer patients where drug was taken with hepatotoxic/cytotoxic chemotherapy and antibiotics
- BUN = blood urea nitrogen
- NMS = neuroleptic malignant syndrome
- ECG = electrocardiogram
- CPK = creatine phosphokinase
- INR = international normalised ratio
- LFT = liver funtion tests
- DRESS = drug reaction with eosinophilia and systemic symptoms
- WBC = white blood cells
- RBC = red blood cells
- IFIS = intraoperative floppy iris syndrome
- LVEF = left ventricular ejection fraction
- AGEP = acute generalized exanthematous pustulosis
- COX-2 = cyclo-oxygenase-2
- NSAIDs = non-steroidal anti-inflammatory drugs
- SIADH = syndrome of inappropriate antidiuretic hormone secretion
- G6PD = glucose-6-phosphate dehydrogenase
- AV = atrioventricular
- SCAR = severe cutaneous adverse reactions
- GI = gastrointestinal
- EEG = electroencephalogram
- od = overdose
- NNRTIs = non-nucleoside reverse transcriptase inhibitors
- HDL = high-density lipoprotein
- PRES = posterior reversible encephalopathy syndrome
- CVS = cardiovascular
- TSH = thyroid stimulating hormone
- GERD = gastroesophageal reflux disease (also known as “GORD” due to alternative spelling of “gastro-oesophageal”)
- AF = atrial fibrillation