2  Illustrated data sources and flow

ON LiMiT collects data from multiple sources, including biological samples, questionnaires, and electronic tools. Data originates from participants, healthcare professionals, and bioanalysts, and is stored securely on GenomeDK servers for subsequent analysis. The diagram below shows ON LiMiT in context, together with its key users and data providers:

flowchart TD

Bioanalyst(["<b>Bioanalyst</b><br><i>[person]</i><br>working with the analysis of physical samples."])
Healthprof(["<b>Healthcare professional</b><br><i>[person]</i><br>working with the participants in the study."])
Participant(["<b>Participant</b><br><i>[person]</i><br>participating in the study."])
Datamanager(["<b>Data Manager</b><br><i>[person]</i><br>working with retrieval and storage of data."])
Researcher(["<b>Researcher</b><br><i>[person]</i><br>interested in doing additional analysis of existing data."])

onlimit[["<b>ON LiMiT</b><br><i>[system]</i><br>the ON LiMiT system will comprise REDCap and GenomeDK.<br>REDCap receives raw data; GenomeDK stores cleaned data."]]
DS[["<b>Statistics Denmark</b><br><i>[system]</i><br>Closed analysis tool where data from ON LiMiT can be uploaded."]]
DP[["<b>Data Collectors</b><br><i>[systems]</i><br>Monsenso, MyFood24,<br>Food preferences app,<br>SENS, Libre CGM,<br>Clinic: biological samples."]]

onlimit:::system
DP:::external
DS:::external
Bioanalyst --> DP
Healthprof --> DP
Participant --> DP
Datamanager --> DP
DP --> onlimit
Researcher --> onlimit
onlimit --> DS
Researcher --> DS

classDef system stroke-width:4pt
classDef external fill:lightgrey
Figure 2.1: C4 Context diagram showing a basic overview of ON LiMiT, its anticipated users, and data providers.
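
The relationships in Figure 2.1 can also be written down as a small directed graph, which makes it possible to check programmatically who feeds data into a given system. The following is a minimal sketch and not part of the ON LiMiT tooling: the names `CONTEXT_EDGES` and `upstream_of` are hypothetical, and the edges are simply copied from the diagram using its node IDs.

```python
# Edges copied from the context diagram (Figure 2.1), keyed by the diagram's node IDs.
# CONTEXT_EDGES and upstream_of are illustrative names, not part of any ON LiMiT software.
CONTEXT_EDGES = {
    "Bioanalyst": ["DP"],
    "Healthprof": ["DP"],
    "Participant": ["DP"],
    "Datamanager": ["DP"],
    "DP": ["onlimit"],
    "Researcher": ["onlimit", "DS"],
    "onlimit": ["DS"],
}


def upstream_of(target, edges=CONTEXT_EDGES):
    """Return, sorted, every node with a direct or indirect path into `target`."""
    # Build the reversed graph so we can walk from the target back to its sources.
    reverse = {}
    for src, dsts in edges.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in reverse.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)


print(upstream_of("DP"))
# ['Bioanalyst', 'Datamanager', 'Healthprof', 'Participant']
```

Calling `upstream_of("onlimit")` additionally returns `DP` and `Researcher`, reflecting that the data collectors and researchers both connect directly to ON LiMiT in the diagram.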

2.1 Technical infrastructure

The C4 Container diagram below shows the main components of the ON LiMiT data infrastructure, including data sources, storage, and analysis tools. It provides an overview of how the data is collected, stored, and made available for analysis.

flowchart TD
Data[("Data<br>[Software system]<br>all electronic data generated by staff and participants using various tools" )]
Samples(["Biological samples<br>[objects]<br>collected from study participants and stored on ice" ])
Redcap[("REDCap<br>[database]<br>database mainly used for collecting data from staff about participants" )]
Biobank[["Biobank<br>[facility]<br>storage facility for biological samples" ]]
Sprout[["Seedcase Sprout<br>[software]<br>software package aiding in documenting and storing data" ]]
Genome[("GenomeDK<br>[system]<br>HPC facility where processed data will be stored" )]
Analysis[["Analysis Area on GenomeDK<br>[system]<br>area on GenomeDK where data can be made available to researchers" ]]
DS[["Statistics Denmark<br>[system]<br>closed analysis tool where data from ON LiMiT can be uploaded" ]]
Flower[["Seedcase Flower<br>[software]<br>software package aiding with creating an overview of the ON LiMiT dataset" ]]
Website[("Website<br>[system]<br>website showing the data dictionary for ON LiMiT data" )]
Researcher(["Researcher<br>[person]<br>interested in doing additional analysis of existing data" ])

subgraph i1 [<b>ON LiMiT environment</b>]
    Redcap --> Sprout
    Biobank ~~~ Sprout
    subgraph i2 [ ]
        Sprout --> Genome
        Genome --> Flower
        Genome -.-> Analysis
    end
    Genome --> DS
    Flower --> Website
end

Data:::external --> Sprout
Data --> Redcap
i1:::system
i2:::hidden
Samples:::external --> Redcap
Samples --> Biobank
Researcher:::external
Researcher --> DS
Researcher --> Analysis
Researcher --> Website

classDef system fill:none
classDef external fill:lightgrey
classDef hidden fill:none, stroke-width:0
Figure 2.2: C4 Container diagram showing the main components of ON LiMiT data infrastructure, including data sources, storage, and analysis tools.
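
To trace how a given input travels through the components in Figure 2.2, the arrows can be encoded as an adjacency list and searched breadth-first. This is only an illustrative sketch under stated assumptions: `CONTAINER_EDGES` and `shortest_path` are hypothetical names, the edges are transcribed from the diagram using its node IDs, and the invisible layout link between `Biobank` and `Sprout` is left out because it does not represent a data flow.

```python
from collections import deque

# Edges transcribed from the container diagram (Figure 2.2).
# The layout-only link "Biobank ~~~ Sprout" is intentionally omitted.
# CONTAINER_EDGES and shortest_path are illustrative names only.
CONTAINER_EDGES = {
    "Data": ["Sprout", "Redcap"],
    "Samples": ["Redcap", "Biobank"],
    "Redcap": ["Sprout"],
    "Sprout": ["Genome"],
    "Genome": ["Flower", "Analysis", "DS"],
    "Flower": ["Website"],
    "Researcher": ["DS", "Analysis", "Website"],
}


def shortest_path(start, goal, edges=CONTAINER_EDGES):
    """Breadth-first search: return the shortest node list from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in edges.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


print(" -> ".join(shortest_path("Data", "Website")))
# Data -> Sprout -> Genome -> Flower -> Website

print(shortest_path("Biobank", "Genome"))
# None
```

The second call returns `None` because, in the diagram, nothing flows onward from the biobank into the data pipeline.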