fingerPro is a flexible framework for sediment source fingerprinting that integrates data exploration, tracer selection, and unmixing to estimate, visualize, and validate source apportionments.

This vignette is intended for users who want to start working with their own databases. It explains how to organize the analysis, how to prepare a valid input file, and how to validate the structure of the dataset before running the workflow.

A key practical idea

In fingerPro, each mixture must be analysed independently. Optimum tracer selection depends on the combined information from both the sources and the mixture. Therefore, tracer selection must be performed separately for each mixture.

For this reason, it is strongly recommended to organize the analysis using one folder per mixture. Each folder should contain the input database, together with all figures and output files generated during the analysis.
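The layout above can be sketched in base R. This is only an illustration: the mixture names are hypothetical, and a temporary directory stands in for the real project root.

```r
# Hypothetical mixture names; replace with your own labels.
mixtures <- c("mixture_A", "mixture_B")

# A temporary directory stands in for the real project root (assumption).
project_root <- file.path(tempdir(), "fingerprinting_project")

# Create one folder per mixture to hold its input database and outputs.
mixture_dirs <- file.path(project_root, mixtures)
for (d in mixture_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Confirm that every folder now exists.
dir.exists(mixture_dirs)
```

Each folder can then serve as the working directory for the analysis of its mixture.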

Using different sets of optimum tracers for different mixtures is not a limitation of the method. Instead, it reflects the adaptation of the model to the specific characteristics of each dataset. Therefore, comparisons between results obtained for different mixtures remain valid even when different tracer sets have been selected.

Installation

Install from CRAN:

install.packages("fingerPro")

Or from a local file:

install.packages("FingerPro_2.1.tar.gz", repos = NULL, type = "source")

Load package

library(fingerPro)

Organizing your project folder

When working with your own .csv file, set the working directory to the folder containing the input database:

setwd("C:/your/project/folder")

Reading and validating your data

To read and validate your own input database, place the .csv file in your project folder and use read_database():

data <- read_database("my_input_database.csv")

Preparing your own database

Before starting, it is important to prepare your input database following the structure of the example datasets provided in the package.

A valid database should include:

an ID column with unique values
a samples column identifying the different sources and the mixture
the corresponding tracer variables

In all cases, the mixture must be placed at the end of the dataset. If multiple mixture samples are available, they must share the same name in the samples column but have different ID values.
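The structural rules above can be checked by hand before reading the file. The sketch below uses base R only. The function check_structure() and the toy data are hypothetical and are not part of fingerPro; they only restate the rules listed in this section.

```r
# Hypothetical pre-check of the structure described above (base R only).
check_structure <- function(db, mixture_name) {
  if (!all(c("ID", "samples") %in% names(db))) {
    return("missing ID or samples column")
  }
  problems <- character(0)
  # ID values must be unique.
  if (anyDuplicated(db$ID) > 0) {
    problems <- c(problems, "ID values are not unique")
  }
  # All mixture rows must sit at the end of the dataset.
  is_mix <- db$samples == mixture_name
  if (!any(is_mix)) {
    problems <- c(problems, "mixture name not found in samples column")
  } else if (any(!is_mix[min(which(is_mix)):nrow(db)])) {
    problems <- c(problems, "mixture rows are not at the end of the dataset")
  }
  problems
}

# Toy dataset with two sources and one mixture (invented values).
toy <- data.frame(
  ID = 1:5,
  samples = c("Source1", "Source1", "Source2", "Source2", "Mixture"),
  Ba = c(272.8, 342.4, 351.1, 302.9, 273.8),
  Nb = c(10.5, 12.1, 10.4, 11.5, 9.4)
)

check_structure(toy, "Mixture")               # valid layout
check_structure(toy[c(5, 1:4), ], "Mixture")  # mixture moved to the top
```

An empty result means none of the three rules was violated. This does not replace the validation performed when the database is read.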

To retain conservative tracers for subsequent analyses, basic data cleaning is recommended beforehand:

replace BDL (below detection limit) values with a small positive number
exclude tracers whose mixture value and at least one source value are BDL or zero
optionally, remove tracers with predominantly BDL values
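The first two cleaning rules might look as follows in base R. This is a sketch under stated assumptions: BDL entries are assumed to be encoded as NA, the replacement value 0.001 is an arbitrary placeholder, and the toy values are invented.

```r
# Toy raw dataset; NA marks a BDL entry (assumed encoding).
raw <- data.frame(
  ID = 1:5,
  samples = c("Source1", "Source1", "Source2", "Source2", "Mixture"),
  Ba = c(272.8, NA, 351.1, 302.9, 273.8),   # BDL in one source sample only
  Cd = c(0.2, NA, 0, 0.3, NA),              # BDL in the mixture and in sources
  Zr = c(186.5, 226.5, 178.6, 157.5, 185.7)
)
small_value <- 0.001                         # arbitrary placeholder (assumption)

tracers <- setdiff(names(raw), c("ID", "samples"))
is_mix <- raw$samples == "Mixture"

# TRUE where a value is BDL (NA) or zero.
bdl <- is.na(raw[tracers]) | raw[tracers] == 0

# Exclude tracers that are BDL or zero in the mixture and in at least one source.
drop <- colSums(bdl[is_mix, , drop = FALSE]) > 0 &
  colSums(bdl[!is_mix, , drop = FALSE]) > 0
kept <- tracers[!drop]
clean <- raw[c("ID", "samples", kept)]

# Replace the remaining BDL entries with the small positive number.
for (tr in kept) {
  clean[[tr]][is.na(clean[[tr]])] <- small_value
}
clean
```

Here Cd is excluded, while the single BDL entry in Ba is replaced. A suitable replacement value depends on the detection limit of each tracer.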

Supported input formats

Raw dataset | Scalar tracers

This format contains individual measurements for scalar tracers.

Required structure:

ID: unique identifier for each sample
samples: identifies source and mixture samples
tracer1, tracer2, ...: tracer values

Raw dataset | Isotopic tracers

This format contains individual measurements for isotopic tracers.

Required structure:

ID: unique identifier for each sample
samples: identifies source and mixture samples
ratio1, ratio2, ...: isotopic ratios
cont_ratio1, cont_ratio2, ...: corresponding contents, prefixed with cont_
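The pairing between ratio columns and their cont_ columns can be checked with a few lines of base R. This is a hypothetical check, not a fingerPro function; the column names are borrowed from the isotopic example dataset.

```r
# Column names of a raw isotopic dataset (taken from the example preview).
cols <- c("ID", "samples", "C24", "C26", "cont_C24", "cont_C26")

tracer_cols <- setdiff(cols, c("ID", "samples"))
cont_cols <- grep("^cont_", tracer_cols, value = TRUE)
ratio_cols <- setdiff(tracer_cols, cont_cols)

# Ratios without a content column, and content columns without a ratio.
missing_cont <- setdiff(paste0("cont_", ratio_cols), cont_cols)
orphan_cont <- setdiff(cont_cols, paste0("cont_", ratio_cols))

length(missing_cont) == 0 && length(orphan_cont) == 0
```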

Averaged dataset | Scalar tracers

This format contains statistical summaries of scalar tracers.

Required structure:

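ID: unique identifier for each row
samples: identifies the sources and the mixture
mean_tracer1, mean_tracer2, ...: mean of each tracer, prefixed with mean_
sd_tracer1, sd_tracer2, ...: standard deviation of each tracer, prefixed with sd_
n: number of measurements, placed in the last column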
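To illustrate how the averaged layout relates to the raw one, the sketch below summarizes a toy raw dataset into mean_, sd_ and n columns using base R. It is not a fingerPro function, and the values are invented. A group with a single sample is given a standard deviation of 0, which mirrors the mixture row of the example datasets.

```r
# Toy raw dataset with two sources and one mixture (invented values).
raw <- data.frame(
  ID = 1:5,
  samples = c("Source1", "Source1", "Source2", "Source2", "Mixture"),
  Ba = c(272.8, 342.4, 351.1, 302.9, 273.8),
  Nb = c(10.5, 12.1, 10.4, 11.5, 9.4)
)
tracers <- setdiff(names(raw), c("ID", "samples"))

# Keep the original group order so the mixture stays at the end.
groups <- unique(raw$samples)
by_group <- split(raw[tracers], factor(raw$samples, levels = groups))

# Per-group means and standard deviations, one row per group.
means <- do.call(rbind, lapply(by_group, colMeans))
sds <- do.call(rbind, lapply(by_group, function(g) vapply(g, sd, numeric(1))))
sds[is.na(sds)] <- 0   # sd() of a single sample is NA; store 0 instead

colnames(means) <- paste0("mean_", tracers)
colnames(sds) <- paste0("sd_", tracers)

# Assemble the averaged table with n in the last column.
averaged <- data.frame(
  ID = seq_along(groups),
  samples = groups,
  means,
  sds,
  n = vapply(by_group, nrow, integer(1))
)
rownames(averaged) <- NULL
averaged
```

The sequential ID values are an assumption of this sketch; any unique identifiers would satisfy the structure described above.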

Averaged dataset | Isotopic tracers

This format contains statistical summaries of isotopic tracers.

Required structure:

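ID: unique identifier for each row
samples: identifies the sources and the mixture
mean_ratio1, mean_ratio2, ...: mean of each isotopic ratio, prefixed with mean_
mean_cont_ratio1, mean_cont_ratio2, ...: mean of each corresponding content, prefixed with mean_cont_
sd_ratio1, sd_ratio2, ...: standard deviation of each isotopic ratio, prefixed with sd_
sd_cont_ratio1, sd_cont_ratio2, ...: standard deviation of each corresponding content, prefixed with sd_cont_
n: number of measurements, placed in the last column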

Example datasets

The package includes four example datasets:

example_geochemical_3s_raw.csv
Raw dataset for 3 sources and 1 mixture with 17 scalar tracers (geochemical elements).

example_isotopic_3s_raw.csv
Raw dataset for 3 sources and 1 mixture with 5 isotopic tracers (ratios and their corresponding contents).

example_geochemical_3s_mean.csv
Averaged dataset (mean, standard deviation, and number of samples) for 3 sources and 1 mixture with 17 scalar tracers (geochemical elements). In this case, the mixture has a standard deviation equal to 0; if replicates of the mixture are available, the corresponding standard deviation can be included.

example_isotopic_3s_mean.csv
Averaged dataset (mean, standard deviation, and number of samples) for 3 sources and 1 mixture with 5 isotopic tracers (ratios and their corresponding contents). In this case, the mixture has a standard deviation equal to 0; if replicates of the mixture are available, the corresponding standard deviation can be included.

Preview Example datasets

Preview: example_geochemical_3s_raw.csv
ID	samples	Ba	Nb	Zr	Sr	Rb	Pb	Zn	Fe	Mn	Cr	Ti	Ca	Al	P	Si	Mg	V
1	Source1	272.77	10.47	186.48	360.84	62.25	12.08	47.43	20105.14	259.01	90.34	2876.70	185988.2	35149.08	1104.07	161458.6	3944.15	56.67
2	Source1	342.37	12.08	226.51	392.19	78.22	14.92	62.26	22804.77	250.86	78.39	3389.78	158492.0	41484.38	1064.15	169675.8	3992.01	59.63
3	Source1	351.12	10.43	178.56	522.67	77.19	14.87	71.18	21169.07	305.97	61.64	3340.13	176925.6	39449.94	1314.66	168952.0	3840.61	42.11
4	Source1	302.87	11.51	157.54	490.00	79.21	13.50	67.41	23004.56	396.77	80.32	3183.65	171179.3	41774.51	1116.09	165760.3	3507.03	61.27
5	Source1	306.89	10.94	224.24	439.45	53.82	16.29	44.33	18263.02	324.41	66.40	2915.43	198378.5	32408.88	1111.35	157717.9	3545.03	41.31
6	Source1	389.35	10.69	170.48	449.07	84.29	17.56	66.89	24718.21	395.48	69.44	3241.24	168063.6	44404.34	1286.99	173154.1	3834.79	69.14

Preview: example_geochemical_3s_mean.csv
ID	samples	mean_Ba	mean_Nb	mean_Zr	mean_Sr	mean_Rb	mean_Pb	mean_Zn	mean_Fe	mean_Mn	mean_Cr	mean_Ti	mean_Ca	mean_Al	mean_P	mean_Si	mean_Mg	mean_V	sd_Ba	sd_Nb	sd_Zr	sd_Sr	sd_Rb	sd_Pb	sd_Zn	sd_Fe	sd_Mn	sd_Cr	sd_Ti	sd_Ca	sd_Al	sd_P	sd_Si	sd_Mg	sd_V	n
1	Source1	296.46	10.92	197.81	422.34	74.14	15.20	57.01	21948.90	305.21	72.34	3241.96	164329.0	39170.14	1112.08	172621.3	3558.43	61.36	42.03	1.03	42.42	73.31	9.77	2.39	8.20	2626.95	61.79	11.71	263.32	20916.40	4085.08	100.21	12783.58	732.85	12.84	35
2	Source2	332.46	10.69	237.24	496.58	69.47	14.47	51.93	20835.67	296.34	52.47	3299.07	158932.7	38445.32	1079.00	181702.4	3969.15	56.39	31.07	1.00	52.29	119.39	8.75	2.02	4.12	1757.66	81.21	11.60	219.91	22196.89	3073.72	105.47	14286.61	633.26	12.45	12
3	Source3	366.61	10.42	151.55	591.25	85.51	13.43	58.54	23019.85	257.60	68.18	2976.51	171863.8	43268.12	763.28	165152.9	5146.52	77.20	49.06	0.61	46.48	174.03	18.79	2.87	11.67	3195.24	50.52	10.83	172.94	15376.03	6091.50	71.13	7128.13	1442.22	18.57	12
4	Mixture	273.83	9.40	185.66	1239.79	72.43	12.08	54.04	19534.44	256.61	65.51	2753.99	186789.7	33339.04	1005.10	142091.7	4194.71	64.94	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	1

Preview: example_isotopic_3s_raw.csv
ID	samples	C24	C26	C28	C30	C32	cont_C24	cont_C26	cont_C28	cont_C30	cont_C32
1	Source1	0.9790	1.3842	0.7150	1.5571	1.7612	39.28	16.24	34.04	48.8	17.27
1	Source1	1.1900	1.3853	0.6010	1.5555	1.6894	39.28	16.24	34.04	48.8	17.27
1	Source1	1.0374	1.4054	0.4485	1.5706	1.7412	39.28	16.24	34.04	48.8	17.27
1	Source1	1.0264	1.3651	0.5883	1.5622	1.7710	39.28	16.24	34.04	48.8	17.27
1	Source1	1.1166	1.4106	0.4989	1.5491	1.7353	39.28	16.24	34.04	48.8	17.27
1	Source1	1.0598	1.4475	0.5110	1.5516	1.7198	39.28	16.24	34.04	48.8	17.27

Preview: example_isotopic_3s_mean.csv
ID	samples	mean_C24	mean_C26	mean_C28	mean_C30	mean_C32	mean_cont_C24	mean_cont_C26	mean_cont_C28	mean_cont_C30	mean_cont_C32	sd_C24	sd_C26	sd_C28	sd_C30	sd_C32	n
1	Source1	1.0618	1.3980	0.5871	1.5621	1.7487	39.28	16.24	34.04	48.80	17.27	0.0956	0.0240	0.0880	0.0119	0.0291	10
2	Source2	0.7751	1.1479	0.7092	1.1841	1.2714	30.99	34.47	24.65	17.99	11.86	0.0374	0.0362	0.0363	0.0807	0.0633	10
3	Source3	1.4113	1.9233	0.9569	1.1602	0.5516	12.60	42.34	37.37	29.81	20.35	0.0253	0.0969	0.0188	0.0210	0.0918	10
4	Mixture	1.1205	1.5043	0.6918	1.4238	1.3531	31.78	24.59	33.93	40.97	0.00	0.0000	0.0000	0.0000	0.0000	0.0000	1
