Keywords:
Gene signature stability; independent gene filtering; machine learning with RNA-seq; neonatal sepsis; transcriptomic sepsis biomarkers
Abstract:
Machine learning (ML) algorithms are powerful tools that are increasingly used for sepsis biomarker discovery in RNA-Seq data. RNA-Seq datasets contain multiple sources and types of noise (operator, technical and non-systematic) that may bias ML classification. Normalisation and independent gene filtering approaches described in RNA-Seq workflows account for some of this variability, but they are typically targeted at differential expression analysis rather than ML applications.