This vignette presents a complete workflow in fingerPro,
including data verification, exploratory analysis, tracer selection,
unmixing, visualization, and validation of the results.
The example dataset example_geochemical_3s_raw.csv,
included in the package, is used to illustrate the workflow
step by step.
Load the example dataset included in the package:
data <- read_database(
  system.file("extdata", "example_geochemical_3s_raw.csv", package = "fingerPro")
)

| ID | samples | Ba | Nb | Zr | Sr | Rb | Pb | Zn | Fe | Mn | Cr | Ti | Ca | Al | P | Si | Mg | V |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Source1 | 272.77 | 10.47 | 186.48 | 360.84 | 62.25 | 12.08 | 47.43 | 20105.14 | 259.01 | 90.34 | 2876.70 | 185988.2 | 35149.08 | 1104.07 | 161458.6 | 3944.15 | 56.67 |
| 2 | Source1 | 342.37 | 12.08 | 226.51 | 392.19 | 78.22 | 14.92 | 62.26 | 22804.77 | 250.86 | 78.39 | 3389.78 | 158492.0 | 41484.38 | 1064.15 | 169675.8 | 3992.01 | 59.63 |
| 3 | Source1 | 351.12 | 10.43 | 178.56 | 522.67 | 77.19 | 14.87 | 71.18 | 21169.07 | 305.97 | 61.64 | 3340.13 | 176925.6 | 39449.94 | 1314.66 | 168952.0 | 3840.61 | 42.11 |
| 4 | Source1 | 302.87 | 11.51 | 157.54 | 490.00 | 79.21 | 13.50 | 67.41 | 23004.56 | 396.77 | 80.32 | 3183.65 | 171179.3 | 41774.51 | 1116.09 | 165760.3 | 3507.03 | 61.27 |
| 5 | Source1 | 306.89 | 10.94 | 224.24 | 439.45 | 53.82 | 16.29 | 44.33 | 18263.02 | 324.41 | 66.40 | 2915.43 | 198378.5 | 32408.88 | 1111.35 | 157717.9 | 3545.03 | 41.31 |
| 6 | Source1 | 389.35 | 10.69 | 170.48 | 449.07 | 84.29 | 17.56 | 66.89 | 24718.21 | 395.48 | 69.44 | 3241.24 | 168063.6 | 44404.34 | 1286.99 | 173154.1 | 3834.79 | 69.14 |
Before selecting tracers and running the unmixing model, explore your data.
If the number of tracers is large, the output may span multiple pages. For additional options, such as navigating between pages (page =), customizing colors (colors =), or adjusting the layout of the plots (n_row =, n_col =), consult the function documentation with help("box_plot").
The individual tracer analysis can be explored visually with ternary diagrams.
This step is especially useful for cases with three sources.
If the number of tracers is large, the output may span multiple pages. To see additional pages, pass the page argument to the function (for example, page = 2).
Tracer selection is a key step in fingerPro. The process
combines pre-screening, tracer ranking, and the exploration of
consistent tracer combinations using the CTS method.
The CTS workflow starts by exploring all possible minimal tracer
combinations using the function CTS_explore:
| seed_id | tracers | w1 | w2 | w3 | percent_physical | sd_w1 | sd_w2 | sd_w3 | max_sd_wi |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Cr P | 0.4352013 | 0.2851417 | 0.2796570 | 99.4 | 0.0820186 | 0.0877820 | 0.0395752 | 0.0877820 |
| 2 | Cr Mg | 0.3957113 | 0.2746657 | 0.3296230 | 96.5 | 0.1166657 | 0.0952257 | 0.1137979 | 0.1166657 |
| 3 | Cr V | 0.4085380 | 0.2780684 | 0.3133936 | 93.0 | 0.1670452 | 0.0885184 | 0.1433845 | 0.1670452 |
| 4 | Zr P | 0.6511841 | 0.0465297 | 0.3022862 | 63.0 | 0.1707133 | 0.1707169 | 0.0413458 | 0.1707169 |
| 5 | Rb Cr | 0.6628971 | 0.3455455 | -0.0084426 | 48.6 | 0.1781801 | 0.0930122 | 0.1407672 | 0.1781801 |
| 6 | Zr Cr | 0.2836381 | 0.2449346 | 0.4714274 | 85.2 | 0.2042065 | 0.0938298 | 0.1817926 | 0.2042065 |
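To give a sense of the size of the search space, the sketch below enumerates candidate tracer pairs in base R. It assumes that a minimal combination for three sources contains two tracers, as the two-tracer seeds in the table suggest. It only lists the pairs and does not reproduce what CTS_explore computes for each one.

```r
# Tracer names taken from the header of the example dataset
tracers <- c("Ba", "Nb", "Zr", "Sr", "Rb", "Pb", "Zn", "Fe", "Mn",
             "Cr", "Ti", "Ca", "Al", "P", "Si", "Mg", "V")

n_sources <- 3

# Assumption: a minimal combination uses n_sources - 1 tracers
pair_matrix <- combn(tracers, n_sources - 1)

# One string per candidate pair, e.g. "Ba Nb"
pairs <- apply(pair_matrix, 2, paste, collapse = " ")

length(pairs)        # 136 candidate pairs for 17 tracers
"Cr P" %in% pairs    # TRUE: the first seed in the table is one of them
```

The table above shows only the first rows of the CTS_explore output, so most of these pairs are not displayed.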
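The user must select one of these combinations (a seed) to
extend into a final tracer subset using the function
CTS_select. Select a seed based on the following
criteria: a high percentage of physically feasible solutions (percent_physical) and a low dispersion of the apportionments (sd_w1, sd_w2, sd_w3, summarized by max_sd_wi).
Combinations with low dispersion indicate a higher discriminant capacity of the selected tracers.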
In practice, the user inspects the output table and selects one row
(seed) that provides a good balance between feasibility and low
dispersion. This selected seed is then used as input in the
CTS_select function.
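The balance between feasibility and dispersion can be screened programmatically. The following is a minimal base R sketch that re-types the six rows shown in the table above, keeps the seeds above a feasibility cut-off, and ranks them by max_sd_wi. The cut-off min_physical is a hypothetical value chosen for illustration, not a package default, and the final choice still needs the user's judgment.

```r
# Rows copied from the CTS_explore table shown above
seeds <- data.frame(
  seed_id          = 1:6,
  tracers          = c("Cr P", "Cr Mg", "Cr V", "Zr P", "Rb Cr", "Zr Cr"),
  percent_physical = c(99.4, 96.5, 93.0, 63.0, 48.6, 85.2),
  max_sd_wi        = c(0.0877820, 0.1166657, 0.1670452,
                       0.1707169, 0.1781801, 0.2042065)
)

# Hypothetical feasibility cut-off (percent), for illustration only
min_physical <- 90

# Keep feasible seeds, then rank them from lowest to highest dispersion
feasible <- seeds[seeds$percent_physical >= min_physical, ]
ranked   <- feasible[order(feasible$max_sd_wi), ]

print(ranked)
best_seed <- ranked$seed_id[1]
best_seed
```

With these rows, the seed ranked first is seed 1 (Cr P), which is the seed passed to CTS_select in this example.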
| ID | samples | Cr | P | Mg | V |
|---|---|---|---|---|---|
| 1 | Source1 | 90.34 | 1104.07 | 3944.15 | 56.67 |
| 2 | Source1 | 78.39 | 1064.15 | 3992.01 | 59.63 |
| 3 | Source1 | 61.64 | 1314.66 | 3840.61 | 42.11 |
| 4 | Source1 | 80.32 | 1116.09 | 3507.03 | 61.27 |
| 5 | Source1 | 66.40 | 1111.35 | 3545.03 | 41.31 |
| 6 | Source1 | 69.44 | 1286.99 | 3834.79 | 69.14 |
At this stage, selected_data contains the tracer subset
that will be used in the unmixing model.
The selected tracer subset can now be used to estimate source apportionments.
A quick run can be obtained with the default settings:
| ID | Source1 | Source2 | Source3 | GOF |
|---|---|---|---|---|
| Mixture (60) | 0.3923894 | 0.2849740 | 0.3226366 | 0.9806114 |
| Mixture (60) | 0.4212065 | 0.2826654 | 0.2961281 | 0.9785353 |
| Mixture (60) | 0.4212065 | 0.2826654 | 0.2961281 | 0.9785353 |
| Mixture (60) | 0.4848043 | 0.2830437 | 0.2321519 | 0.9726158 |
| Mixture (60) | 0.3570352 | 0.3287364 | 0.3142284 | 0.9809761 |
| Mixture (60) | 0.5087059 | 0.2138281 | 0.2774660 | 0.9748148 |
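Each row of the output is one solution of the model. As a quick sanity check, the sketch below re-types the six rows displayed above in base R, verifies that the source contributions in each row sum to one within rounding, and summarizes each source by its mean and standard deviation. It works only on the displayed rows, so the summary is illustrative and is not the result for the full output.

```r
# First rows of the unmix output, copied from the table above
results <- data.frame(
  Source1 = c(0.3923894, 0.4212065, 0.4212065, 0.4848043, 0.3570352, 0.5087059),
  Source2 = c(0.2849740, 0.2826654, 0.2826654, 0.2830437, 0.3287364, 0.2138281),
  Source3 = c(0.3226366, 0.2961281, 0.2961281, 0.2321519, 0.3142284, 0.2774660),
  GOF     = c(0.9806114, 0.9785353, 0.9785353, 0.9726158, 0.9809761, 0.9748148)
)

source_cols <- c("Source1", "Source2", "Source3")

# Contributions in each row should add up to 1 (up to rounding of the printout)
row_totals <- rowSums(results[, source_cols])
sums_ok <- all(abs(row_totals - 1) < 1e-6)
sums_ok

# Mean and standard deviation of each source over the displayed rows
summary_table <- data.frame(
  mean = sapply(results[, source_cols], mean),
  sd   = sapply(results[, source_cols], sd)
)
print(summary_table)
```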
Advanced analyses can be performed by adjusting arguments such as
iter, variability, lvp,
constrained, and resolution. These options
allow the user to tailor the model settings to the characteristics of
the dataset.
For a full description of the available arguments, consult the function documentation with help("unmix").
The source apportionment results can be displayed using density plots or violin plots.
Finally, the apportionment solution can be checked for mathematical consistency.
The validate_results function allows the user to assess
the mathematical consistency of a given set of source apportionments.
The apportionments can come from the fingerPro model or
from any other model, and are evaluated against the tracer dataset used
for unmixing.
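The user must provide the tracer dataset used for unmixing (here, selected_data) and a vector of proposed apportionments with one value per source.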
The function computes the normalized error between the observed tracer values in the mixture and the values predicted from the proposed apportionments.
Low normalized error values indicate that the solution is consistent with the selected tracers, whereas high values may suggest inconsistencies or that the proposed apportionment is not supported by the data.

apportionments <- c(0.435, 0.285, 0.280)
normalized_error <- validate_results(selected_data, apportionments)

| tracer | normalized_error |
|---|---|
| Cr | 0.0000321 |
| P | 0.0002210 |
| Mg | 0.0198135 |
| V | 0.0108719 |
In this example, all four normalized errors are below 0.02, so the proposed solution is consistent with the selected tracers.
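The idea behind this check can be sketched in base R. The example below predicts each tracer value in the mixture as the apportionment-weighted sum of the source means, and divides the absolute difference from the observed value by the range of the source means. All numbers are hypothetical toy values, and the normalization by the source range is an assumption made for illustration; the exact formula used by validate_results may differ, so consult its documentation.

```r
# Hypothetical mean tracer values per source (rows) and tracer (columns)
source_means <- matrix(
  c(10, 20, 30,      # tracer A in Source1, Source2, Source3
    100, 50, 150),   # tracer B in Source1, Source2, Source3
  nrow = 3,
  dimnames = list(c("Source1", "Source2", "Source3"), c("A", "B"))
)

# Hypothetical observed tracer values in the mixture
observed <- c(A = 17.5, B = 95)

# Proposed apportionments, one per source, summing to 1
w <- c(0.5, 0.3, 0.2)

# Predicted mixture value of each tracer: weighted sum of the source means
predicted <- colSums(w * source_means)

# Assumed normalization: range of the source means for each tracer
tracer_range <- apply(source_means, 2, function(x) max(x) - min(x))

normalized_error <- abs(observed - predicted) / tracer_range
normalized_error
```

In this toy case tracer B is reproduced exactly, while tracer A deviates by 0.5 units over a source range of 20, giving a normalized error of 0.025.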
This workflow should be repeated independently for each mixture, since optimum tracer selection depends on the combined information from the sources and the specific mixture under study.
Users who are not familiar with R Markdown can copy the code below into an .R script and run it step by step in R or RStudio.
###################################
###### 0. Install and set wd
###################################
install.packages("fingerPro") # one time
setwd("C:/your/file/directory") # your own working directory (wd)
###################################
###### 1. Load and verify the data
###################################
library(fingerPro)
data <- read_database(system.file("extdata", "example_geochemical_3s_raw.csv", package = "fingerPro")) # Input example dataset
###################################
###### 2. Exploratory analysis
###################################
###### Box plots
box_plot(data)
box_plot(data, page = 1) # Visualise a specific page (e.g. page 1)
box_plot(data, page = 2) # Visualise a specific page (e.g. page 2)
box_plot(data, page = 3) # Visualise a specific page (e.g. page 3)
box_plot(data, n_row = 3, n_col = 6) # Visualise all tracers
# Save results as a PNG image
png("output_boxplot_all.png", width = 30, height = 15, units = "cm", res = 300) # to save .png results
box_plot(data, n_row = 3, n_col = 6) # Visualise all tracers
dev.off()
# Check 'help' for more information
help("box_plot")
###### Correlation analysis
correlation_plot(data)
correlation_plot(data, columns = c(1:8)) # correlation plot of n tracers (e.g. 1 to 8)
# Save results as a PNG image
png("output_correlationplot_tracers1-8.png", width = 25, height = 15, units = "cm", res = 300) # to save .png results
correlation_plot(data, columns = c(1:8)) # correlation plot of n tracers (e.g. 1 to 8)
dev.off()
# Check 'help' for more information
help("correlation_plot")
###### Linear Discriminant Analysis (LDA)
LDA_plot(data)
# Save results as a PNG image
png("output_LDA.png", width = 15, height = 12, units = "cm", res = 300) # to save .png results
LDA_plot(data)
dev.off()
###### Principal Component Analysis (PCA)
PCA_plot(data)
# Save results as a PNG image
png("output_PCA.png", width = 15, height = 12, units = "cm", res = 300) # to save .png results
PCA_plot(data)
dev.off()
###### Individual tracer analysis and ternary diagrams
output_ternary <- ternary_diagram(data)
ternary_diagram(data, page = 1) # Visualise a specific page (e.g. page 1)
ternary_diagram(data, page = 2) # Visualise a specific page (e.g. page 2)
ternary_diagram(data, page = 3) # Visualise a specific page (e.g. page 3)
ternary_diagram(data, rows = 4, cols = 5) # Visualise all tracers
# e.g. Save ternary_diagram results as a PNG image
png("output_ternary_all.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
output_ternary_all <- ternary_diagram(data, rows = 4, cols = 5) # Visualise all tracers
dev.off()
# Check 'help' for more information
help("ternary_diagram")
###### Range test
data_rangetest <- range_test(data)
write.csv(data_rangetest, "output_rangetest.csv")
###################################
###### 3. Tracer selection
###################################
###### CTS_explore
tracers_seeds <- CTS_explore(data, iter = 1000)
write.csv(tracers_seeds, "output_CTS_explore_tracers_seeds.csv")
# Check 'help' for more information
help("CTS_explore")
###### CTS_select
selected_data <- CTS_select(data, tracers_seeds, seed_id = 1, error_threshold = 0.05) # (e.g. Seed 1 selected with an error of 5% (0.05))
write.csv(selected_data, "output_CTS_select_selected_data.csv")
# Check 'help' for more information
help("CTS_select")
###################################
###### 4. Unmix
###################################
output_unmix <- unmix(selected_data)
write.csv(output_unmix, "output_unmix.csv")
# Check 'help' for more information
help("unmix")
plot_results(output_unmix, violin = FALSE) # Density plot
plot_results(output_unmix, violin = TRUE) # Violin plot
# save density plot
png("output_unmix_densityplot.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
plot_results(output_unmix, violin = FALSE) # Density plot
dev.off()
# save violin plot
png("output_unmix_violinplot.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
plot_results(output_unmix, violin = TRUE) # Violin plot
dev.off()
###################################
###### 5. Validate results
###################################
apportionments <- c(0.435, 0.285, 0.280) # proposed apportionments, one value per source
normalized_error <- validate_results(selected_data, apportionments = apportionments, error_threshold = 0.05)
write.csv(normalized_error, "output_validate_results_normalized_error.csv")
# Check 'help' for more information
help("validate_results")