1. Load and verify the data

install.packages("fingerPro")

library(fingerPro)

Load the example dataset included in the package:

data <- read_database(
  system.file("extdata", "example_geochemical_3s_raw.csv", package = "fingerPro")
)

Preview: example_geochemical_3s_raw.csv
ID	samples	Ba	Nb	Zr	Sr	Rb	Pb	Zn	Fe	Mn	Cr	Ti	Ca	Al	P	Si	Mg	V
1	Source1	272.77	10.47	186.48	360.84	62.25	12.08	47.43	20105.14	259.01	90.34	2876.70	185988.2	35149.08	1104.07	161458.6	3944.15	56.67
2	Source1	342.37	12.08	226.51	392.19	78.22	14.92	62.26	22804.77	250.86	78.39	3389.78	158492.0	41484.38	1064.15	169675.8	3992.01	59.63
3	Source1	351.12	10.43	178.56	522.67	77.19	14.87	71.18	21169.07	305.97	61.64	3340.13	176925.6	39449.94	1314.66	168952.0	3840.61	42.11
4	Source1	302.87	11.51	157.54	490.00	79.21	13.50	67.41	23004.56	396.77	80.32	3183.65	171179.3	41774.51	1116.09	165760.3	3507.03	61.27
5	Source1	306.89	10.94	224.24	439.45	53.82	16.29	44.33	18263.02	324.41	66.40	2915.43	198378.5	32408.88	1111.35	157717.9	3545.03	41.31
6	Source1	389.35	10.69	170.48	449.07	84.29	17.56	66.89	24718.21	395.48	69.44	3241.24	168063.6	44404.34	1286.99	173154.1	3834.79	69.14
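
Before moving on, it can help to confirm that the loaded table has the expected shape. The following is a minimal sketch in base R, assuming the object returned by read_database behaves like a data frame with the ID and samples columns shown in the preview. The object name data_preview is hypothetical, and only three rows and three tracers from the preview are reproduced here.

```r
# Toy data frame mirroring the first three preview rows and three tracers.
# With the real package, the object would come from read_database() instead.
data_preview <- data.frame(
  ID      = 1:3,
  samples = c("Source1", "Source1", "Source1"),
  Ba      = c(272.77, 342.37, 351.12),
  Nb      = c(10.47, 12.08, 10.43),
  Zr      = c(186.48, 226.51, 178.56)
)

# Assumption: every column other than ID and samples is a tracer.
tracer_cols <- setdiff(names(data_preview), c("ID", "samples"))

# Basic sanity checks before any analysis.
checks <- c(
  has_id_columns  = all(c("ID", "samples") %in% names(data_preview)),
  tracers_numeric = all(vapply(data_preview[tracer_cols], is.numeric, logical(1))),
  no_missing      = !anyNA(data_preview[tracer_cols]),
  unique_ids      = !anyDuplicated(data_preview$ID)
)
print(checks)

# Number of samples per group (sources and mixture).
print(table(data_preview$samples))
```

If any check is FALSE, the input file is worth inspecting before continuing to the exploratory analysis.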

2. Exploratory analysis

Before selecting tracers and running the unmixing model, explore your data.

Boxplots

box_plot(data)

If the number of tracers is large, the output may span multiple pages. For additional options, such as navigating between pages (page =), customizing colors (colors =), or adjusting the layout of the plots (n_row =, n_col =), consult the function documentation:

help("box_plot")

box_plot(data, page = 2)

box_plot(data, page = 3)

Correlation analysis

correlation_plot(data)

Linear Discriminant Analysis (LDA)

LDA_plot(data)

Principal Component Analysis (PCA)

PCA_plot(data)

Individual tracer analysis and ternary diagrams

The individual tracer analysis can be explored visually with ternary diagrams.

This step is especially useful for cases with three sources.

ternary_diagram(data)

If the number of tracers is large, the output may span multiple pages. To view additional pages, add the page argument to the function call.

ternary_diagram(data, page = 2)

ternary_diagram(data, page = 3)

Range test

The range test identifies tracers whose mixture values fall outside the range defined by the sources.

range_test(data)
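
To make the idea concrete, here is a minimal base R sketch of a range check. It assumes a simple minimum-maximum criterion over all source samples; range_test may apply a different rule, so treat this as an illustration of the concept and not as the package's implementation. The toy values and the object names (toy, in_range) are hypothetical.

```r
# Hypothetical values for two tracers; not taken from the example dataset.
toy <- data.frame(
  samples = c("Source1", "Source1", "Source2", "Source2",
              "Source3", "Source3", "Mixture"),
  Cr = c(90, 62, 40, 55, 70, 85, 66),
  Zn = c(47, 71, 50, 60, 55, 65, 80)
)

is_mixture <- toy$samples == "Mixture"
tracers <- setdiff(names(toy), "samples")

# For each tracer, test whether the mixture value lies inside the
# [min, max] interval spanned by all source samples (assumed criterion).
in_range <- vapply(tracers, function(tr) {
  src <- toy[[tr]][!is_mixture]
  mix <- toy[[tr]][is_mixture]
  all(mix >= min(src) & mix <= max(src))
}, logical(1))

print(in_range)
```

In this toy case the mixture value for Zn lies above every source sample, so Zn would be flagged as out of range.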

3. Tracer selection

Tracer selection is a key step in fingerPro. The process combines pre-screening, tracer ranking, and the exploration of consistent tracer combinations using the CTS method.

CTS_explore

The CTS workflow starts by exploring all possible minimal tracer combinations using the function CTS_explore:

tracers_seeds <- CTS_explore(data, iter = 1000)

Preview: Minimal tracer combinations
seed_id	tracers	w1	w2	w3	percent_physical	sd_w1	sd_w2	sd_w3	max_sd_wi
1	Cr P	0.4352013	0.2851417	0.2796570	99.4	0.0820186	0.0877820	0.0395752	0.0877820
2	Cr Mg	0.3957113	0.2746657	0.3296230	96.5	0.1166657	0.0952257	0.1137979	0.1166657
3	Cr V	0.4085380	0.2780684	0.3133936	93.0	0.1670452	0.0885184	0.1433845	0.1670452
4	Zr P	0.6511841	0.0465297	0.3022862	63.0	0.1707133	0.1707169	0.0413458	0.1707169
5	Rb Cr	0.6628971	0.3455455	-0.0084426	48.6	0.1781801	0.0930122	0.1407672	0.1781801
6	Zr Cr	0.2836381	0.2449346	0.4714274	85.2	0.2042065	0.0938298	0.1817926	0.2042065

The user must select one of these combinations (a seed), which the function CTS_select then extends into a final tracer subset. Select a seed based on the following criteria:

A high percentage of physically feasible solutions (i.e. 0 < wi < 1)
Low dispersion across sources (i.e. low variability in the estimated contributions)

Combinations with low dispersion indicate a higher discriminant capacity of the selected tracers.

In practice, the user inspects the output table and selects one row (seed) that provides a good balance between feasibility and low dispersion. This selected seed is then used as input in the CTS_select function.

In this example, the first ranked combination (row 1) is selected as the seed.
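
The selection criteria above can also be applied programmatically. The sketch below rebuilds the preview table by hand in base R and ranks the seeds. The 90 % feasibility cutoff and the object names (seeds, candidates, best_seed) are illustrative assumptions, not package defaults.

```r
# Values copied from the preview table of minimal tracer combinations.
seeds <- data.frame(
  seed_id = 1:6,
  tracers = c("Cr P", "Cr Mg", "Cr V", "Zr P", "Rb Cr", "Zr Cr"),
  w1 = c(0.4352013, 0.3957113, 0.4085380, 0.6511841, 0.6628971, 0.2836381),
  w2 = c(0.2851417, 0.2746657, 0.2780684, 0.0465297, 0.3455455, 0.2449346),
  w3 = c(0.2796570, 0.3296230, 0.3133936, 0.3022862, -0.0084426, 0.4714274),
  percent_physical = c(99.4, 96.5, 93.0, 63.0, 48.6, 85.2),
  max_sd_wi = c(0.0877820, 0.1166657, 0.1670452, 0.1707169, 0.1781801, 0.2042065)
)

# Illustrative cutoff (an assumption): keep seeds with at least 90 %
# physically feasible solutions.
min_physical <- 90
candidates <- seeds[seeds$percent_physical >= min_physical, ]

# Among those, prefer the lowest dispersion across sources.
candidates <- candidates[order(candidates$max_sd_wi), ]
print(candidates[, c("seed_id", "tracers", "percent_physical", "max_sd_wi")])

best_seed <- candidates$seed_id[1]
print(best_seed)
```

With this cutoff the ranking agrees with the manual choice of seed 1. A stricter or looser cutoff is a judgment call that depends on the dataset.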

CTS_select

selected_data <- CTS_select(data, tracers_seeds, seed_id = 1, error_threshold = 0.05)

Preview: dataset after CTS_select
ID	samples	Cr	P	Mg	V
1	Source1	90.34	1104.07	3944.15	56.67
2	Source1	78.39	1064.15	3992.01	59.63
3	Source1	61.64	1314.66	3840.61	42.11
4	Source1	80.32	1116.09	3507.03	61.27
5	Source1	66.40	1111.35	3545.03	41.31
6	Source1	69.44	1286.99	3834.79	69.14

At this stage, selected_data contains the tracer subset that will be used in the unmixing model.

4. Unmix and visualize the results

The selected tracer subset can now be used to estimate source apportionments.

A quick run can be obtained with the default settings:

output_unmix <- unmix(selected_data)

Preview: unmixing results
ID	Source1	Source2	Source3	GOF
Mixture (60)	0.3923894	0.2849740	0.3226366	0.9806114
Mixture (60)	0.4212065	0.2826654	0.2961281	0.9785353
Mixture (60)	0.4212065	0.2826654	0.2961281	0.9785353
Mixture (60)	0.4848043	0.2830437	0.2321519	0.9726158
Mixture (60)	0.3570352	0.3287364	0.3142284	0.9809761
Mixture (60)	0.5087059	0.2138281	0.2774660	0.9748148
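
A quick way to read such a table is to summarize it per source. The sketch below uses base R on the six preview rows only, assuming the unmixing output can be treated as a data frame with the columns shown above. The object names (results, summary_table) are hypothetical.

```r
# The six preview rows of the unmixing output, typed in by hand.
results <- data.frame(
  Source1 = c(0.3923894, 0.4212065, 0.4212065, 0.4848043, 0.3570352, 0.5087059),
  Source2 = c(0.2849740, 0.2826654, 0.2826654, 0.2830437, 0.3287364, 0.2138281),
  Source3 = c(0.3226366, 0.2961281, 0.2961281, 0.2321519, 0.3142284, 0.2774660),
  GOF     = c(0.9806114, 0.9785353, 0.9785353, 0.9726158, 0.9809761, 0.9748148)
)
source_cols <- c("Source1", "Source2", "Source3")

# Each row should be a complete apportionment: contributions sum to 1
# (up to rounding in the printed values).
row_totals <- rowSums(results[source_cols])
sums_to_one <- all(abs(row_totals - 1) < 1e-6)
print(sums_to_one)

# Mean and standard deviation of each source contribution.
summary_table <- data.frame(
  source = source_cols,
  mean   = colMeans(results[source_cols]),
  sd     = apply(results[source_cols], 2, sd),
  row.names = NULL
)
print(summary_table)

# Average goodness of fit across the listed solutions.
print(mean(results$GOF))
```

On a full output the same summary would cover all rows, not just the six shown in the preview.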

Advanced analyses can be performed by adjusting arguments such as iter, variability, lvp, constrained, and resolution. These options allow the user to tailor the model settings to the characteristics of the dataset.

For a full description of the available arguments, consult the function documentation:

help("unmix")

The source apportionment results can be displayed using density plots or violin plots.

Density plots

plot_results(output_unmix, violin = FALSE)

Violin plots

plot_results(output_unmix, violin = TRUE)

These plots help visualize the distribution of source contributions and the variability in the model results.

5. Validate the results

Finally, the apportionment solution can be checked for mathematical consistency.

The validate_results function performs this check for a given set of source apportionments. The apportionments can come from the fingerPro model or from any other model, and are evaluated against the tracer dataset used for unmixing.

The user must provide:

The dataset used for tracer selection and unmixing
The estimated source apportionments

The function computes the normalized error between the observed tracer values in the mixture and the values predicted from the proposed apportionments.

Low normalized error values indicate that the solution is consistent with the selected tracers, whereas high values may suggest inconsistencies or that the proposed apportionment is not supported by the data.

apportionments <- c(0.435, 0.285, 0.280)
normalized_error <- validate_results(selected_data, apportionments)

Preview: normalized error values from validate_results
tracer	normalized_error
Cr	0.0000321
P	0.0002210
Mg	0.0198135
V	0.0108719

In this example, all four normalized errors are below 0.02, which indicates that the proposed solution is consistent with the selected tracers.
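
The idea behind this check can be sketched in a few lines of base R. The example assumes a linear mixing model on source means and normalizes the misfit by the spread of the source means. The normalization used by validate_results may differ, and the source means and mixture values below are hypothetical, so this is only an illustration of the principle.

```r
# Hypothetical source means (rows = sources, columns = tracers);
# these are NOT taken from the example dataset.
source_means <- rbind(
  Source1 = c(Cr = 80, P = 1100),
  Source2 = c(Cr = 50, P = 900),
  Source3 = c(Cr = 65, P = 1300)
)
mixture <- c(Cr = 67, P = 1110)  # hypothetical observed mixture values
apportionments <- c(0.435, 0.285, 0.280)

# Apportionments must describe a complete mixture.
stopifnot(abs(sum(apportionments) - 1) < 1e-8)

# Linear mixing: predicted value = sum over sources of weight * source mean.
predicted <- drop(apportionments %*% source_means)

# Assumed normalization: absolute misfit divided by the range of the
# source means for that tracer.
source_range <- apply(source_means, 2, function(x) max(x) - min(x))
normalized_error <- abs(predicted - mixture) / source_range
print(round(normalized_error, 4))
```

A tracer whose normalized error is large relative to the others is the one the proposed apportionment reproduces least well.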

Final remarks

This workflow should be repeated independently for each mixture, since optimum tracer selection depends on the combined information from the sources and the specific mixture under study.

.R Script for beginner users

If you are new to R and not familiar with R Markdown, copy the code below into an .R script and run it step by step in R or RStudio.

###################################    
###### 0. Install and set wd
###################################
install.packages("fingerPro") # one time
setwd("C:/your/file/directory") # your own working directory (wd)


###################################    
###### 1. Load and verify the data
###################################      
library(fingerPro)
data <- read_database(system.file("extdata", "example_geochemical_3s_raw.csv", package = "fingerPro")) # Input example dataset


###################################    
###### 2. Exploratory analysis
###################################    


###### Box plots

box_plot(data)

box_plot(data, page = 1) # Visualise a specific page (e.g. page 1)
box_plot(data, page = 2) # Visualise a specific page (e.g. page 2)
box_plot(data, page = 3) # Visualise a specific page (e.g. page 3)
box_plot(data, n_row = 3, n_col = 6) # Visualise all tracers

# Save results as a PNG image
png("output_boxplot_all.png", width = 30, height = 15, units = "cm", res = 300) # to save .png results
box_plot(data, n_row = 3, n_col = 6) # Visualise all tracers
dev.off()

# Check 'help' for more information 
help("box_plot")


###### Correlation analysis

correlation_plot(data)

correlation_plot(data, columns = c(1:8)) # correlation plot of a subset of tracers (e.g. 1 to 8)

# Save results as a PNG image
png("output_correlationplot_tracers1-8.png", width = 25, height = 15, units = "cm", res = 300) # to save .png results
correlation_plot(data, columns = c(1:8)) # correlation plot of a subset of tracers (e.g. 1 to 8)
dev.off()

# Check 'help' for more information 
help("correlation_plot")


###### Linear Discriminant Analysis (LDA)

LDA_plot(data)

# Save results as a PNG image
png("output_LDA.png", width = 15, height = 12, units = "cm", res = 300) # to save .png results
LDA_plot(data)
dev.off()


###### Principal Component Analysis (PCA)

PCA_plot(data)

# Save results as a PNG image
png("output_PCA.png", width = 15, height = 12, units = "cm", res = 300) # to save .png results
PCA_plot(data)
dev.off()


###### Individual tracer analysis and ternary diagrams

output_ternary <- ternary_diagram(data)

ternary_diagram(data, page = 1) # Visualise a specific page (e.g. page 1)
ternary_diagram(data, page = 2) # Visualise a specific page (e.g. page 2)
ternary_diagram(data, page = 3) # Visualise a specific page (e.g. page 3)
ternary_diagram(data, rows = 4, cols = 5)  # Visualise all tracers

# e.g. Save ternary_diagram results as a PNG image
png("output_ternary_all.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
output_ternary_all <- ternary_diagram(data, rows = 4, cols = 5)  # Visualise all tracers
dev.off()

# Check 'help' for more information 
help("ternary_diagram")


###### Range test
data_rangetest <- range_test(data)
write.csv(data_rangetest, "output_rangetest.csv")


###################################    
###### 3. Tracer selection
###################################    


###### CTS_explore

tracers_seeds <- CTS_explore(data, iter = 1000)
write.csv(tracers_seeds, "output_CTS_explore_tracers_seeds.csv")

# Check 'help' for more information 
help("CTS_explore")


###### CTS_select

selected_data <- CTS_select(data, tracers_seeds, seed_id = 1, error_threshold = 0.05) # (e.g. Seed 1 selected with an error of 5% (0.05))
write.csv(selected_data, "output_CTS_select_selected_data.csv")

# Check 'help' for more information 
help("CTS_select")

###################################    
###### 4. Unmix
###################################    

output_unmix <- unmix(selected_data)
write.csv(output_unmix, "output_unmix.csv")

# Check 'help' for more information 
help("unmix")

plot_results(output_unmix, violin = FALSE) # Density plot
plot_results(output_unmix, violin = TRUE) # Violin plot

# save density plot
png("output_unmix_densityplot.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
plot_results(output_unmix, violin = FALSE) # Density plot
dev.off()

# save violin plot
png("output_unmix_violinplot.png", width = 18, height = 12, units = "cm", res = 300) # to save .png results
plot_results(output_unmix, violin = TRUE) # Violin plot
dev.off()


###################################    
###### 5. Validate results
###################################    

apportionments <- c(0.435, 0.285, 0.280) # proposed source apportionments to validate
normalized_error <- validate_results(selected_data, apportionments = apportionments, error_threshold = 0.05)
write.csv(normalized_error, "output_validate_results_normalized_error.csv")

# Check 'help' for more information 
help("validate_results")
