Reference

Last updated on 2026-03-10

Glossary


Accessible
A FAIR principle stating that data and metadata should be retrievable using standardized, open protocols, with clear information about authentication or authorization where required.
Automation bias
The tendency to over‑trust outputs from automated systems and discount contradictory evidence, even when the automation is wrong.
Data provenance
Information describing the origin of data, including how it was collected, processed, filtered, and transformed.
Downstream reuse
The use of data, models, or catalogues by others after publication.
Errors of commission (automation bias)
Failures caused by blindly following an automated recommendation without independent verification.
Errors of omission (automation bias)
Failures to notice or act when an automated system produces no warning, leading users to assume everything is correct.
FAIR
An acronym for Findable, Accessible, Interoperable, Reusable. FAIR refers to a set of guiding principles for scientific data management and stewardship, focused on enabling reuse by both humans and machines.
Findable
A FAIR principle stating that data and metadata should have globally unique, persistent identifiers and be indexed in searchable resources so they can be discovered.
Interoperable
A FAIR principle stating that data and metadata should use shared formats, vocabularies, and standards that allow integration with other datasets and tools.
Machine‑actionable
A property of data or metadata that allows computational systems to automatically find, access, and use research outputs with minimal human intervention.
Machine‑learning model
A computational model trained on data to perform tasks such as classification, regression, or prediction.
Metadata
Structured information that describes data, models, or workflows, such as units, formats, coordinate systems, data sources, and processing history.
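As an illustrative sketch, metadata like this is often recorded as a structured, machine-readable document; the field names below are hypothetical examples, not drawn from any particular standard.

```python
# Illustrative metadata record; all field names and values are
# hypothetical examples, not from any specific metadata standard.
import json

metadata = {
    "units": "Jy",
    "format": "FITS",
    "coordinate_system": "ICRS",
    "data_source": "example survey, data release 3",
    "processing_history": [
        "calibrated with pipeline v2.1",
        "outliers above 5 sigma removed",
    ],
}

# Serialize so both humans and machines can read it.
print(json.dumps(metadata, indent=2))
```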
Model artefact
Any output of an ML workflow that may be reused, including trained models, derived features, labels, catalogues, or evaluation results.
Model scope
The range of data, conditions, or scientific contexts for which a model was designed and evaluated.
Overclaiming
Making scientific claims that extend beyond what the evidence supports, often by implying generality from limited ML performance results.
Partial reproducibility
A situation in which some, but not all, components of a study can be reproduced. Partial reproducibility is common and preferable to undocumented irreproducibility.
Performance claim
An interpretation or scientific conclusion drawn from a performance metric. Overclaiming occurs when performance is assumed to imply general validity or physical understanding.
Performance metric
A quantitative measure used to evaluate a model, such as accuracy, precision, recall, RMSE, or AUC.
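Two of the metrics named above can be computed by hand in a few lines; this sketch uses made-up values purely for illustration.

```python
# Minimal sketch of two common performance metrics,
# computed by hand on illustrative (made-up) values.
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def rmse(y_true, y_pred):
    """Root-mean-square error for regression outputs."""
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / len(y_true))

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
print(rmse([2.0, 4.0], [1.0, 5.0]))          # 1.0
```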
Repeatability
The ability of the original researcher to rerun their own analysis under the same conditions and obtain the same results. This is the minimum requirement for any computational result.
Replicability
The ability to obtain consistent scientific conclusions using new data or an independent method while addressing the same scientific question. Replication tests robustness, not computational correctness.
Reproducibility
The ability to obtain the same results using the same data, code, methods, and analysis conditions. In ML‑based astronomy, this usually means that another researcher can rerun the original workflow and obtain identical outputs.
Reproducible workflow
A documented sequence of data processing, analysis, and modeling steps that can be rerun to reproduce results.
Reusable
A FAIR principle stating that data and metadata should be richly described, include provenance, and have clear usage licenses so they can be reused in future research.
Scientific responsibility (in ML)
The obligation to clearly state assumptions, limitations, and intended use of ML‑based results so others can interpret and reuse them appropriately.
Scope creep
The gradual reuse of a model beyond its original intended domain without explicit re‑evaluation or validation.
Test data
A held‑out dataset used to evaluate model performance after training and model selection are complete.
Training data
The dataset used to fit a machine‑learning model. The properties of the training data constrain what the model can learn and where it can be applied.
Validation data
A dataset used during model development to tune parameters or select models, but not used to train them directly.
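The three dataset roles above can be sketched as a single partitioning step; this assumes a simple random shuffle is appropriate for the data, and the 70/15/15 fractions are illustrative, not a recommendation.

```python
# Sketch of a train/validation/test split, assuming a random shuffle
# is appropriate for the data. Fractions are illustrative only.
import random

def split_dataset(data, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle and partition data into training, validation, and test sets."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # held out until evaluation
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Fixing the random seed here is one small ingredient of repeatability: rerunning the script reproduces the same split.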