Initial Orbit Determination with digest2¶
This tutorial demonstrates how to use the digest2 Python package for NEO orbit classification from short-arc astrometric tracklets.¶
digest2 is a fast orbit classifier that assigns pseudo-probability scores (0–100) to astrometric tracklets for each of 14 orbit classes (NEO, Main Belt, Mars Crosser, etc.). It is the primary tool used by the Minor Planet Center to decide which tracklets are posted to the NEO Confirmation Page (NEOCP) for follow-up.
This can be useful if you want to:
- Quickly classify newly observed tracklets before submitting to the MPC
- Prioritize follow-up observations of potential NEO discoveries
- Understand how the MPC evaluates NEOCP candidates
- Incorporate orbit classification into automated survey pipelines
References:
- Keys et al. 2019, "The digest2 NEO Classification Code"
- Veres et al. 2023, "Improvement of Digest2 NEO Classification Code utilizing the Astrometry Data Exchange Standard"
- Veres et al. 2025, "Improving the discovery of near-Earth objects with machine-learning methods"
Environment Setup¶
To run this notebook, you need a Python environment (>= 3.9) with digest2 and its dependencies installed. We recommend creating a dedicated conda environment and registering it as a Jupyter kernel:
# Create and activate environment
conda create -n digest2-notebook-env python=3.12 -y
conda activate digest2-notebook-env
# Install digest2 and notebook dependencies
pip install digest2 requests ipykernel
# Register as a Jupyter kernel
python -m ipykernel install --user --name digest2-notebook-env --display-name "Python (digest2-notebook-env)"
Then open this notebook in Jupyter and select Kernel > Change Kernel > Python (digest2-notebook-env).
If you are working from a local clone of the repository and want to install that, replace pip install digest2 with:
cd mpc-public/digest2
pip install '.[test]' requests ipykernel
The digest2 Python package wraps a C scoring engine. Pre-built wheels are available on PyPI for common platforms (Linux, macOS, Windows), so most users get a ready-to-use binary with pip install digest2. If no wheel matches your platform or Python version, pip will automatically compile the C code from the source distribution - this requires a local C compiler (e.g., gcc or clang) but no external C libraries such as libxml2.
Import¶
from digest2 import Digest2, Observation, classify, ClassificationResult
from digest2.observation import parse_mpc80_file, parse_ades_xml
from dataclasses import fields
import requests
import tempfile
import os
import atexit
Setup: Download Sample Data¶
digest2 requires an observatory codes file (digest2.obscodes) to look up parallax constants for each observatory.
- We download this from the MPC's Observatory Codes API in the flat-file format.
- Because the code needs to know where this file is, we can either store its location in the environment variable DIGEST2_OBSCODES, or pass the path at execution time. In this notebook we'll typically pass in obscodes_path.
We also create sample observation files in both MPC 80-column format and ADES XML format, and store them in the same temp-directory.
# Create a temporary directory for our working files
tmpdir = tempfile.mkdtemp(prefix="digest2_tutorial_")
atexit.register(lambda: __import__('shutil').rmtree(tmpdir, ignore_errors=True))
# Download observatory codes from the MPC obscodes API (required by digest2)
# We request the flat-file format ("ObsCodes.html") which digest2 can parse directly.
# See: https://docs.minorplanetcenter.net/mpc-ops-docs/apis/obscodes/
response = requests.get(
"https://data.minorplanetcenter.net/api/obscodes",
json={"format": "ObsCodes.html"}
)
response.raise_for_status()
obscodes_path = os.path.join(tmpdir, "digest2.obscodes")
with open(obscodes_path, "w") as f:
f.write(response.text)
print(f"Observatory codes downloaded to: {obscodes_path}")
print(f" ({len(response.text.splitlines())} lines)")
Observatory codes downloaded to: /var/folders/67/j23cbc8x5r3b_1cy48v0rf4m0000gq/T/digest2_tutorial_zhr46igv/digest2.obscodes (2679 lines)
# Sample MPC 80-column observation file: 3 observations of 2016 SK99 from G96 (Mt. Lemmon)
sample_obs_content = """ K16S99K 1C2022 12 25.38496508 32 36.283+17 10 35.94 21.98GV G96
K16S99K 1C2022 12 25.39527308 32 35.635+17 10 37.27 21.72GV G96
K16S99K 1C2022 12 25.40040208 32 35.473+17 10 37.38 21.31GV G96
"""
sample_obs_path = os.path.join(tmpdir, "sample.obs")
with open(sample_obs_path, "w") as f:
f.write(sample_obs_content)
# Sample MPC 80-column observation file containing multiple tracklets
multiple_obs_content = \
""" K16S99K 1C2022 12 25.38496508 32 36.283+17 10 35.94 21.98GV G96
K16S99K 1C2022 12 25.39527308 32 35.635+17 10 37.27 21.72GV G96
K16S99K 1C2022 12 25.40040208 32 35.473+17 10 37.38 21.31GV G96
K17R88L 1C2023 11 24.28496518 31 46.283+26 20 35.94 20.38GV G96
K17R88L 1C2023 11 24.29527318 31 45.635+26 20 37.27 20.52GV G96
K17R88L 1C2023 11 24.30040218 31 45.473+26 20 37.38 20.71GV G96
"""
multiple_obs_path = os.path.join(tmpdir, "multiple.obs")
with open(multiple_obs_path, "w") as f:
f.write(multiple_obs_content)
# Sample ADES XML observation file (same object, richer metadata)
sample_xml_content = """<?xml version="1.0" encoding="UTF-8"?>
<ades version="2017">
<optical>
<provID>2016 SK99</provID>
<trkSub>C8QY322</trkSub>
<mode>CCD</mode>
<stn>G96</stn>
<obsTime>2022-12-25T09:14:20.991Z</obsTime>
<ra>128.151180</ra>
<dec>17.176650</dec>
<rmsRA>0.247</rmsRA>
<rmsDec>0.301</rmsDec>
<astCat>Gaia2</astCat>
<mag>21.98</mag>
<band>G</band>
</optical>
<optical>
<provID>2016 SK99</provID>
<trkSub>C8QY322</trkSub>
<mode>CCD</mode>
<stn>G96</stn>
<obsTime>2022-12-25T09:29:11.606Z</obsTime>
<ra>128.148480</ra>
<dec>17.177020</dec>
<rmsRA>0.431</rmsRA>
<rmsDec>0.438</rmsDec>
<astCat>Gaia2</astCat>
<mag>21.72</mag>
<band>G</band>
</optical>
<optical>
<provID>2016 SK99</provID>
<trkSub>C8QY322</trkSub>
<mode>CCD</mode>
<stn>G96</stn>
<obsTime>2022-12-25T09:36:34.723Z</obsTime>
<ra>128.147805</ra>
<dec>17.177050</dec>
<rmsRA>0.561</rmsRA>
<rmsDec>0.573</rmsDec>
<astCat>Gaia2</astCat>
<mag>21.31</mag>
<band>G</band>
</optical>
</ades>
"""
sample_xml_path = os.path.join(tmpdir, "sample.xml")
with open(sample_xml_path, "w") as f:
f.write(sample_xml_content)
print(f"Sample .obs file: {sample_obs_path}")
print(f"Multiple .obs file: {multiple_obs_path}")
print(f"Sample .xml file: {sample_xml_path}")
Sample .obs file: /var/folders/67/j23cbc8x5r3b_1cy48v0rf4m0000gq/T/digest2_tutorial_zhr46igv/sample.obs Multiple .obs file: /var/folders/67/j23cbc8x5r3b_1cy48v0rf4m0000gq/T/digest2_tutorial_zhr46igv/multiple.obs Sample .xml file: /var/folders/67/j23cbc8x5r3b_1cy48v0rf4m0000gq/T/digest2_tutorial_zhr46igv/sample.xml
Basic Usage (1): Classifying a File¶
Perhaps the simplest way to use digest2 is to call classify on an observation file.
N.B. In the subsequent section we examine the returned results in more detail.
results = classify(sample_obs_path, obscodes_path=obscodes_path)
r = results[0]
print(f"{type(results)=}")
print(f"{isinstance(r,ClassificationResult)=}")
print(f"{r.designation=}")
print(f"Tracklet RMS: {r.rms:.2f} arcsec")
print()
print("Attribute names:")
for field in fields(r):
print(f"\t{field.name} {type(field)=}")
print()
print("NoID scores (pseudo-probability for each orbit class):")
for cls, val in r.noid.items():
print(f" {cls:4s}: {val:5.1f}")
type(results)=<class 'list'> isinstance(r,ClassificationResult)=True r.designation='K16S99K' Tracklet RMS: 0.73 arcsec Attribute names: raw type(field)=<class 'dataclasses.Field'> noid type(field)=<class 'dataclasses.Field'> rms type(field)=<class 'dataclasses.Field'> rms_prime type(field)=<class 'dataclasses.Field'> designation type(field)=<class 'dataclasses.Field'> NoID scores (pseudo-probability for each orbit class): Int : 13.5 NEO : 13.3 N22 : 6.2 N18 : 0.9 MC : 13.3 Hun : 0.0 Pho : 0.0 MB1 : 72.3 Pal : 0.0 Han : 0.0 MB2 : 0.3 MB3 : 0.0 Hil : 0.0 JTr : 0.0 JFC : 0.4
Understanding the Results¶
Each result is a ClassificationResult dataclass with the following attributes:
| Attribute | Type | Description |
|---|---|---|
| designation | str | Object designation from the input file |
| rms | float | Great-circle RMS fit of the tracklet (arcseconds) |
| rms_prime | float | Adjusted RMS used internally by the scorer |
| noid | Scores | NoID scores: pseudo-probabilities assuming the object is unidentified (0–100) |
| raw | Scores | Raw scores: pseudo-probabilities using the total population (0–100) |
| top_class | str | The class with the highest NoID score |
The Scores object supports both attribute access (result.noid.NEO) and dict-style access (result.noid["NEO"]), as well as iteration via .items().
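This dual access pattern can be illustrated with a minimal stand-in class (illustrative only, not the package's actual Scores implementation):

```python
class ScoresSketch:
    """Minimal stand-in for the Scores interface described above
    (illustrative only, not digest2's actual implementation)."""
    def __init__(self, **values):
        self.__dict__.update(values)         # attribute access: s.NEO
    def __getitem__(self, key):              # dict-style access: s["NEO"]
        return self.__dict__[key]
    def items(self):                         # iteration: for cls, val in s.items()
        return self.__dict__.items()

s = ScoresSketch(NEO=95.6, MB1=0.0)
print(s.NEO)             # -> 95.6
print(s["NEO"])          # -> 95.6
print(dict(s.items()))   # -> {'NEO': 95.6, 'MB1': 0.0}
```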
The Raw scores represent the pseudo-probability that a given tracklet belongs to each class, using the total modeled population and making no allowance for which objects in that class have already been discovered.
The NoID scores represent the probability that an unidentified tracklet belongs to each class, accounting for objects already discovered. These are the operationally relevant scores.
The 14 orbit classes are:
| Abbr | Class | Description |
|---|---|---|
| Int | MPC Interest | q<1.3 OR e>0.5 OR i>=40 OR Q>10 |
| NEO | Near-Earth Object | q < 1.3 AU |
| N22 | Large NEO | NEO with H <= 22 |
| N18 | Very Large NEO | NEO with H <= 18 |
| MC | Mars Crosser | |
| Hun | Hungaria | |
| Pho | Phocaea | |
| MB1 | Inner Main Belt | |
| Pal | Pallas family | |
| Han | Hansa family | |
| MB2 | Middle Main Belt | |
| MB3 | Outer Main Belt | |
| Hil | Hilda group | |
| JTr | Jupiter Trojan | |
| JFC | Jupiter Family Comet |
Raw vs NoID Scores¶
Let's compare both score types to see the difference. Raw scores use the total population; NoID scores use only the undiscovered population.
print(f"{'Class':>4s} {'Raw':>6s} {'NoID':>6s}")
print("-" * 22)
for cls in r.raw:
raw_val = r.raw[cls]
noid_val = r.noid[cls]
if round(raw_val) > 0 or round(noid_val) > 0:
print(f"{cls:>4s} {raw_val:6.1f} {noid_val:6.1f}")
Class Raw NoID ---------------------- Int 9.5 13.5 NEO 9.2 13.3 N22 3.3 6.2 N18 0.9 0.9 MC 17.3 13.3 MB1 66.1 72.3 MB2 6.7 0.3
Under the hood ...¶
When you call classify with a filepath as input, under the hood it uses the Digest2 class as a context manager (a with statement) so that C resources are automatically released.
This loads the population model once and can then classify many tracklets efficiently.
The evaluation is then performed using the classify_file method.
As such, the above call to classify is the same as the call below:
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
results = d2.classify_file(sample_obs_path)
r = results[0]
print(f"{type(results)=}")
print(f"{type(r)=}")
print(f"{r.designation=}")
print(f"Tracklet RMS: {r.rms:.2f} arcsec")
print()
print("Attribute names:")
for field in fields(r):
print(f"\t{field.name}")
print()
print("NoID scores (pseudo-probability for each orbit class):")
for cls, val in r.noid.items():
print(f" {cls:4s}: {val:5.1f}")
type(results)=<class 'list'> type(r)=<class 'digest2.result.ClassificationResult'> r.designation='K16S99K' Tracklet RMS: 0.73 arcsec Attribute names: raw noid rms rms_prime designation NoID scores (pseudo-probability for each orbit class): Int : 13.5 NEO : 13.3 N22 : 6.2 N18 : 0.9 MC : 13.3 Hun : 0.0 Pho : 0.0 MB1 : 72.3 Pal : 0.0 Han : 0.0 MB2 : 0.3 MB3 : 0.0 Hil : 0.0 JTr : 0.0 JFC : 0.4
Basic Usage (2): Classifying a File containing Multiple Tracklets¶
If the supplied file contains multiple tracklets (multiple sets of observations, each with different designations/trksubs), then multiple sets of results will be returned, one for each tracklet.
results = classify(multiple_obs_path, obscodes_path=obscodes_path)
# Loop over all returned results
for r in results:
print(f"\n\n{r.designation=}")
print(f"Tracklet RMS: {r.rms:.2f} arcsec")
print()
print("NoID scores (pseudo-probability for each orbit class):")
for cls, val in r.noid.items():
if round(val) > 0:
print(f" {cls:4s}: {val:5.1f}")
r.designation='K16S99K' Tracklet RMS: 0.73 arcsec NoID scores (pseudo-probability for each orbit class): Int : 13.5 NEO : 13.3 N22 : 6.2 N18 : 0.9 MC : 13.3 MB1 : 72.3 r.designation='K17R88L' Tracklet RMS: 0.69 arcsec NoID scores (pseudo-probability for each orbit class): Int : 100.0 NEO : 100.0 N22 : 64.7
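The tracklet splitting relies on the designation field: in the 80-column format, the (packed) designation occupies the first 12 columns of each line, so observations can be grouped by that prefix. Below is a simplified sketch of that idea in plain Python (the real parser, parse_mpc80_file, handles the full format):

```python
from collections import defaultdict

def group_by_designation(lines):
    """Group 80-column observation lines by the designation field
    (columns 1-12). Simplified sketch, not the full parser."""
    tracklets = defaultdict(list)
    for line in lines:
        if line.strip():
            tracklets[line[:12].strip()].append(line)
    return dict(tracklets)

lines = [
    "     K16S99K  1C2022 12 25.38496508 32 36.283+17 10 35.94         21.98GV     G96",
    "     K16S99K  1C2022 12 25.39527308 32 35.635+17 10 37.27         21.72GV     G96",
    "     K17R88L  1C2023 11 24.28496518 31 46.283+26 20 35.94         20.38GV     G96",
]
groups = group_by_designation(lines)
print(list(groups))            # -> ['K16S99K', 'K17R88L']
print(len(groups["K16S99K"]))  # -> 2
```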
Basic Usage (3): Programmatic Observations¶
Instead of reading from a file, you can construct Observation objects directly in Python. This is useful when integrating digest2 into an automated pipeline.
Each observation requires:
- mjd: Modified Julian Date
- ra: Right Ascension in degrees
- dec: Declination in degrees
- obscode: MPC 3-character observatory code
Optional fields include mag (magnitude), band (photometric band), rms_ra, and rms_dec (astrometric uncertainties in arcseconds).
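If you have ISO-8601 UTC timestamps (as found in ADES obsTime fields), the Modified Julian Date can be derived with only the standard library. The helper below is our own, not part of digest2; it uses the definition MJD = JD - 2400000.5, i.e. days elapsed since 1858-11-17 00:00 UTC:

```python
from datetime import datetime, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)

def iso_to_mjd(iso_utc: str) -> float:
    """Convert an ISO-8601 UTC timestamp (e.g. an ADES obsTime) to MJD."""
    t = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    return (t - MJD_EPOCH).total_seconds() / 86400.0

# The first obsTime in the ADES sample above:
print(f"{iso_to_mjd('2022-12-25T09:14:20.991Z'):.6f}")  # -> 59938.384965
```

This reproduces the mjd values used in the Observation objects throughout this notebook.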
For quick, one-off classification, the classify() convenience function handles initialization and cleanup automatically. It is polymorphic — it accepts a filepath, a single tracklet, or a batch of tracklets.
When we supply a single list of Observations to the classify function, it treats them as a single tracklet, i.e. it assumes they are all observations of the same object.
# Create observations
obs = [
Observation(mjd=59938.384965, ra=128.15118, dec=17.17665,
mag=22.22, band="G", obscode="G96"),
Observation(mjd=59938.395273, ra=128.14899, dec=17.17702,
mag=21.96, band="G", obscode="G96"),
Observation(mjd=59938.400402, ra=128.14780, dec=17.17717,
mag=21.55, band="G", obscode="G96"),
]
# Call `classify`
result = classify(obs, obscodes_path=obscodes_path, repeatable=True)
# Print results
print(f"RMS: {result.rms:.2f} arcsec")
print()
for cls, val in result.noid.items():
if round(val) > 0:
print(f" {cls:4s}: {val:5.1f}")
RMS: 0.11 arcsec Int : 96.1 NEO : 95.6 N22 : 29.0 N18 : 5.1 MC : 4.4 MB2 : 3.0 JFC : 0.7
Under the hood ...¶
When you call classify with a list of Observations as input, under the hood it uses the Digest2 class as a context manager, and the evaluation is performed using the classify_tracklet method.
As such, the above call to classify is the same as the call below:
# Call `classify_tracklet`
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
result = d2.classify_tracklet(obs)
print(f"RMS: {result.rms:.2f} arcsec")
print()
for cls, val in result.noid.items():
if round(val) > 0:
print(f" {cls:4s}: {val:5.1f}")
RMS: 0.11 arcsec Int : 96.1 NEO : 95.6 N22 : 29.0 N18 : 5.1 MC : 4.4 MB2 : 3.0 JFC : 0.7
Basic Usage (4): Multiple Tracklets¶
If we supply lists-of-lists-of-Observations to classify, then this data is interpreted as being multiple 'tracklets', each of which will be given its own digest2 score.
# Two different tracklets
tracklet_1 = [
Observation(mjd=59938.384965, ra=128.15118, dec=17.17665,
mag=22.22, band="G", obscode="G96"),
Observation(mjd=59938.395273, ra=128.14899, dec=17.17702,
mag=21.96, band="G", obscode="G96"),
Observation(mjd=59938.400402, ra=128.14780, dec=17.17717,
mag=21.55, band="G", obscode="G96"),
]
tracklet_2 = [
Observation(mjd=59938.384965, ra=130.0, dec=20.0,
mag=20.0, obscode="G96"),
Observation(mjd=59938.395273, ra=130.01, dec=20.01,
mag=20.0, obscode="G96"),
]
# Call classify with 2-tracklet input
batch_results = classify([tracklet_1, tracklet_2], obscodes_path=obscodes_path, repeatable=True)
# Print results
for i, r in enumerate(batch_results):
if r is not None:
neo_score = r.noid.NEO
print(f"Tracklet {i+1}: {r.noid.NEO=:.1f}")
Tracklet 1: r.noid.NEO=95.6 Tracklet 2: r.noid.NEO=100.0
Under the hood ...¶
When lists-of-lists-of-Observations are passed to classify, under the hood classify_batch is called.
Hence the call below is the same as the above two-tracklet call to classify.
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
batch_results = d2.classify_batch([tracklet_1, tracklet_2])
for i, r in enumerate(batch_results):
if r is not None:
neo_score = r.noid.NEO
print(f"Tracklet {i+1}: {r.noid.NEO=:.1f}")
Tracklet 1: r.noid.NEO=95.6 Tracklet 2: r.noid.NEO=100.0
Input Formats: 80-Column vs ADES XML¶
The MPC uses two observation formats:
- MPC 80-column format (.obs): Legacy fixed-width format
- ADES XML format (.xml): Modern format with per-observation uncertainties
The classify_file() method auto-detects the format from the file extension. Let's classify the same object in both formats and compare.
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
results_obs = d2.classify_file(sample_obs_path)
results_xml = d2.classify_file(sample_xml_path)
r_obs = results_obs[0]
r_xml = results_xml[0]
print(f"{'':>4s} {'80-col':>8s} {'XML':>8s}")
print("-" * 26)
for cls in r_obs.noid:
v_obs = r_obs.noid[cls]
v_xml = r_xml.noid[cls]
if round(v_obs) > 0 or round(v_xml) > 0:
print(f"{cls:>4s} {v_obs:8.1f} {v_xml:8.1f}")
print(f"\n80-col designation: {r_obs.designation}")
print(f"XML designation: {r_xml.designation}")
80-col XML -------------------------- Int 13.5 13.5 NEO 13.3 13.3 N22 6.2 6.2 N18 0.9 0.9 MC 13.3 13.3 MB1 72.3 72.5 80-col designation: K16S99K XML designation: C8QY322
80-col & XML reading using classify¶
As the classify function calls classify_file, it too automatically processes both 80-column and XML files.
# 80-col : XML Comparison using classify
for filepath in [sample_obs_path,sample_xml_path]:
r = classify(filepath, obscodes_path=obscodes_path, repeatable=True)[0]
label = "XML :" if "xml" in filepath else "80-col:"
print(f"{label}{r.designation=},{r.noid.MB1=:.1f}")
80-col:r.designation='K16S99K',r.noid.MB1=72.3 XML :r.designation='C8QY322',r.noid.MB1=72.5
The scores are very similar between formats. Small differences can arise because:
- The ADES XML format includes per-observation astrometric uncertainties (rmsRA, rmsDec)
- The 80-column format relies on observatory-level default errors
- Designations may differ (80-column uses the packed provisional designation, XML uses trkSub)
Configuration: Observatory-Level Error Models¶
Basic, Per-Observatory, Error Configuration¶
digest2 applies a baseline assumed error to all observations from a given observatory location, used in the absence of any other error information.
This observatory-specific error, errorFromConfig, is defined via the MPC.config file:
- Default error: If an MPC.config file is not supplied, then all observatories default to errorFromConfig = 1.0 arcsec.
- Per-site errors via MPC.config: Observatory-specific errors, calibrated from historical data, can be supplied to set a different errorFromConfig for each specified observatory code (any observatory code not specified in MPC.config defaults to 1.0 arcsec as described above).
An example of setting a smaller-than-default uncertainty for the G96 (Mt. Lemmon) observatory is given in the cells below.
Effects of smaller uncertainties¶
The assumed astrometric uncertainty for each observatory significantly affects the scores.
Let's compare scores with and without the MPC.config file.
The bundled MPC.config specifies, for example, that observatory G96 (Mt. Lemmon) has a calibrated error of 0.29 arcsec — much better than the 1.0 arcsec default.
In the results from the cell below we see that with the default 1.0 arcsec error, the tracklet scores as predominantly main belt (MB1 ~ 85). With the calibrated G96 error of 0.29 arcsec, the position constraints are tighter, and the NEO score increases significantly. This demonstrates why per-site error calibration matters for accurate classification.
At the MPC, a NEO score of 65 or above triggers posting to the NEOCP.
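That decision rule is easy to express as a filter over classification results. The sketch below is our own illustration (the real MPC pipeline applies additional checks beyond the raw score); the stand-in _Fake class merely mimics objects exposing .designation and .noid:

```python
NEOCP_THRESHOLD = 65.0

def neocp_candidates(results, threshold=NEOCP_THRESHOLD):
    """Select (designation, NEO score) pairs whose NoID NEO score meets
    the NEOCP posting threshold. Works with any objects exposing
    .designation and .noid["NEO"]."""
    return [(r.designation, r.noid["NEO"])
            for r in results if r.noid["NEO"] >= threshold]

# Stand-in results mimicking the two-tracklet example earlier in this notebook:
class _Fake:
    def __init__(self, designation, neo):
        self.designation, self.noid = designation, {"NEO": neo}

results = [_Fake("K16S99K", 13.3), _Fake("K17R88L", 100.0)]
print(neocp_candidates(results))  # -> [('K17R88L', 100.0)]
```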
# Create an empty config file (uses default 1.0 arcsec for all observatories)
empty_config_path = os.path.join(tmpdir, "empty.cfg")
with open(empty_config_path, "w") as f:
f.write("# No per-site errors\n")
# Classify using the empty config file created above
r_default = classify(sample_obs_path, config_path=empty_config_path, obscodes_path=obscodes_path, repeatable=True)[0]
# Classify using the version of MPC.config that comes with digest2 (auto-discovered, includes G96=0.29 arcsec)
r_mpc = classify(sample_obs_path, obscodes_path=obscodes_path, repeatable=True)[0]
# Print out results
header_default = 'Default (1.0")'
header_mpc = 'MPC.config (0.29")'
print("Comparison: Effect of observatory error model")
print(f"{'Class':>4s} {header_default:>15s} {header_mpc:>18s}")
print("-" * 42)
for cls in r_default.noid:
v_def = r_default.noid[cls]
v_mpc = r_mpc.noid[cls]
if round(v_def) > 0 or round(v_mpc) > 0:
print(f"{cls:>4s} {v_def:15.1f} {v_mpc:18.1f}")
Comparison: Effect of observatory error model Class Default (1.0") MPC.config (0.29") ------------------------------------------ Int 2.3 13.5 NEO 2.2 13.3 N22 1.2 6.2 N18 0.2 0.9 MC 1.9 13.3 MB1 85.3 72.3 MB2 2.5 0.3
Custom Configuration Files¶
You can create a custom config file to set observatory errors for your own site. The format is simple: one obserrXXX=Y.YY line per observatory, where XXX is the MPC observatory code and Y.YY is the error in arcseconds.
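Such a file can also be generated programmatically. The small helper below is our own (write_digest2_config is a hypothetical name, not part of the package); it just emits one obserrXXX=Y.YY line per site:

```python
def write_digest2_config(path, site_errors):
    """Write a digest2 config file with one obserrXXX=Y.YY line per
    observatory. `site_errors` maps MPC observatory codes to assumed
    astrometric errors in arcseconds."""
    with open(path, "w") as f:
        for code, err in site_errors.items():
            f.write(f"obserr{code}={err:.2f}\n")

write_digest2_config("my_sites.cfg", {"G96": 0.29, "703": 0.75})
print(open("my_sites.cfg").read())
# obserrG96=0.29
# obserr703=0.75
```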
We see in the results of the cell below that tighter uncertainties constrain the range of possible orbits, which can sharpen the classification (e.g. increasing NEO probability for objects on NEO-like trajectories). Looser uncertainties allow more orbit solutions, generally pushing scores toward the most common population (Main Belt).
# Create a custom config with a very tight error for G96
tight_config_path = os.path.join(tmpdir, "tight.cfg")
with open(tight_config_path, "w") as f:
f.write("obserrG96=0.10\n") # Very precise astrometry
# Create a custom config with a very loose error for G96
loose_config_path = os.path.join(tmpdir, "loose.cfg")
with open(loose_config_path, "w") as f:
f.write("obserrG96=2.0\n") # Poor astrometry
r_tight = classify(sample_obs_path, config_path=tight_config_path, obscodes_path=obscodes_path, repeatable=True)[0]
r_loose = classify(sample_obs_path, config_path=loose_config_path, obscodes_path=obscodes_path, repeatable=True)[0]
h_tight = 'Tight (0.1")'
h_default = 'Default (1.0")'
h_loose = 'Loose (2.0")'
print(f"{'Class':>4s} {h_tight:>13s} {h_default:>15s} {h_loose:>13s}")
print("-" * 52)
for cls in r_tight.noid:
v_t = r_tight.noid[cls]
v_d = r_default.noid[cls]
v_l = r_loose.noid[cls]
if round(v_t) > 0 or round(v_d) > 0 or round(v_l) > 0:
print(f"{cls:>4s} {v_t:13.1f} {v_d:15.1f} {v_l:13.1f}")
Class Tight (0.1") Default (1.0") Loose (2.0") ---------------------------------------------------- Int 13.5 2.3 0.8 NEO 13.3 2.2 0.7 N22 6.2 1.2 0.4 N18 0.9 0.2 0.0 MC 13.3 1.9 0.7 MB1 72.3 85.3 67.2 MB2 0.3 2.5 26.1 MB3 0.0 0.0 4.5
Advanced, Per-Observation, Error Configuration¶
WARNING: The behavior of the per-observation error model is highly non-trivial
Per-observation errors via ADES: When using ADES format, or when constructing observations programmatically, you can provide per-observation astrometric uncertainties (rms_ra, rms_dec). If these individual, per-observation uncertainties are supplied in the ADES observation file, they can be used as the errors in digest2, allowing a different error to be associated with each observation.
Set is_ades=True to tell digest2 to use the per-observation RMS values rather than the configured site error.
However, by default, the observatory-level error specification from the MPC.config file, errorFromConfig, will frequently override the detailed observation-level errors supplied in the ADES file.
In particular:
- Error Floor
  - The value of errorFromConfig from the config file (whether the default 1.0", or an explicitly provided value) is used as a floor, so per-observation errors will only be used if they are higher than the errorFromConfig value.
  - I.e. if the supplied per-observation RMS is smaller than the value in the config file, the supplied per-observation RMS will be ignored and the errorFromConfig value used instead, even if is_ades=True.
- Error Ceiling
  - digest2 will also, by default, impose a ceiling on the RMS value of 5 * errorFromConfig. This is done in updateRMSValues.
  - I.e. if the supplied per-observation RMS is larger than 5 * errorFromConfig, then by default the supplied per-observation RMS will be ignored and a value of 5 * errorFromConfig used instead, even if is_ades=True.
- Ceiling Removal
  - A noThreshold variable can be supplied to remove the ceiling. By default noThreshold=False, and the threshold (ceiling) is applied.
  - I.e. if we supply a noThreshold=True argument, then the threshold (ceiling) will not be applied.
Thus, in practice, if you want digest2 to always use the exact RMS values supplied in the ADES file, you would need to:
- Set is_ades=True to tell digest2 to use the per-observation RMS values.
- Define default error values at (or close to) zero in the MPC.config file.
- Set noThreshold=True to remove any ceiling.
N.B. The main practical reason for using errorFromConfig as a floor is to prevent unrealistically small errors in ADES files from accidentally skewing the digest2 score.
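The floor/ceiling logic described above can be sketched as a pure function. This is our own illustration of the documented rule, not the package's code; the C implementation in updateRMSValues may differ in detail:

```python
def effective_rms(per_obs_rms, error_from_config, is_ades=True, no_threshold=False):
    """Sketch of the floor/ceiling rule described above (hypothetical
    helper, not part of the digest2 API). All values in arcseconds."""
    if not is_ades or per_obs_rms is None:
        # Per-observation RMS not used at all: fall back to the site error
        return error_from_config
    # Floor: never go below the configured observatory error
    rms = max(per_obs_rms, error_from_config)
    # Ceiling: unless no_threshold, cap at 5x the configured error
    if not no_threshold:
        rms = min(rms, 5 * error_from_config)
    return rms

print(effective_rms(0.147, 1.0))                      # floor applies -> 1.0
print(effective_rms(2.147, 0.3))                      # ceiling applies -> 1.5
print(effective_rms(2.147, 0.3, no_threshold=True))   # -> 2.147
```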
In the cell(s) below we demonstrate the effect of passing per-observation uncertainties in various scenarios, and demonstrate that the effect depends upon (a) the values supplied in the config file, and (b) the value of the noThreshold boolean.
# Observations WITHOUT per-observation uncertainties
obs_no_rms = [
Observation(mjd=59938.384965, ra=128.15118, dec=17.17665, mag=22.22, band="G", obscode="G96"),
Observation(mjd=59938.395273, ra=128.14899, dec=17.17702, mag=21.96, band="G", obscode="G96"),
Observation(mjd=59938.400402, ra=128.14780, dec=17.17717, mag=21.55, band="G", obscode="G96"),
]
# Same observations WITH LARGE per-observation uncertainties (from ADES)
# - These uncertainties are larger than the default 1 arc-sec uncertainty that is assumed when the config file is empty
obs_with_large_rms = [
Observation(mjd=59938.384965, ra=128.15118, dec=17.17665, mag=22.22, band="G", obscode="G96", rms_ra=2.147, rms_dec=2.101),
Observation(mjd=59938.395273, ra=128.14899, dec=17.17702, mag=21.96, band="G", obscode="G96", rms_ra=2.131, rms_dec=2.138),
Observation(mjd=59938.400402, ra=128.14780, dec=17.17717, mag=21.55, band="G", obscode="G96", rms_ra=2.161, rms_dec=2.173),
]
# Same observations WITH SMALL per-observation uncertainties (from ADES)
# - These uncertainties are smaller than the default 1 arc-sec uncertainty that is assumed when the config file is empty
obs_with_small_rms = [
Observation(mjd=59938.384965, ra=128.15118, dec=17.17665, mag=22.22, band="G", obscode="G96", rms_ra=0.147, rms_dec=0.101),
Observation(mjd=59938.395273, ra=128.14899, dec=17.17702, mag=21.96, band="G", obscode="G96", rms_ra=0.131, rms_dec=0.138),
Observation(mjd=59938.400402, ra=128.14780, dec=17.17717, mag=21.55, band="G", obscode="G96", rms_ra=0.161, rms_dec=0.173),
]
# As above, create an empty config file (uses default 1.0 arcsec for all observatories)
empty_config_path = os.path.join(tmpdir, "empty.cfg")
with open(empty_config_path, "w") as f:
f.write("# No per-site errors\n")
# Create a custom config with an inconceivably tight error for G96
absurd_config_path = os.path.join(tmpdir, "absurd.cfg")
with open(absurd_config_path, "w") as f:
f.write("obserrG96=0.001\n") # Unbelievably precise astrometry
# Classify with & without the provided rms, comparing against the DEFAULT (1") config
with Digest2(config_path=empty_config_path, obscodes_path=obscodes_path, repeatable=True) as d2:
r_no_rms = d2.classify_tracklet(obs_no_rms) # empty config file causes digest2 to use default 1.0 arcsec for all observatories
r_with_large_rms = d2.classify_tracklet(obs_with_large_rms, is_ades=True)
r_with_small_rms = d2.classify_tracklet(obs_with_small_rms, is_ades=True)
print('\033[91m' + 'Empty Config File => Default 1" Uncertainty' + '\033[0m')
print('\033[91m' + 'When the supplied per-observation RMS is **smaller** than the value in the config file, the supplied per-observation RMS is ignored.' + '\033[0m')
print()
print(f"{'Class':>4s} {'No per-obs RMS':>15s} {'Large per-obs RMS':>17s} {'Small per-obs RMS':>17s}")
print("-" * 60)
for cls in r_no_rms.noid:
v1 = r_no_rms.noid[cls]
v2 = r_with_large_rms.noid[cls]
v3 = r_with_small_rms.noid[cls]
if round(v1) > 0 or round(v2) > 0:
print(f"{cls:>4s} {v1:15.1f} {v2:17.1f} {v3:17.1f}")
# Classify with & without the provided rms, this time using the *absurd_config_path* as the "floor"
# - We set `no_threshold=True` so that the LARGE per-obs rms is used
with Digest2(config_path=absurd_config_path, obscodes_path=obscodes_path, repeatable=True, no_threshold=True) as d2:
r_with_large_rms = d2.classify_tracklet(obs_with_large_rms, is_ades=True)
r_with_small_rms = d2.classify_tracklet(obs_with_small_rms, is_ades=True)
print('\033[91m' + 'Using an absurdly small RMS value in the config files allows very small per-observation RMS values to take effect.' + '\033[0m')
print('\033[91m' + 'NB The printed `No per-obs RMS` value uses the same 1" config as above.' + '\033[0m')
print()
print(f"{'Class':>4s} {'No per-obs RMS':>15s} {'Large per-obs RMS':>17s} {'Small per-obs RMS':>17s}")
print("-" * 60)
for cls in r_no_rms.noid:
v1 = r_no_rms.noid[cls]
v2 = r_with_large_rms.noid[cls]
v3 = r_with_small_rms.noid[cls]
if round(v1) > 0 or round(v2) > 0:
print(f"{cls:>4s} {v1:15.1f} {v2:17.1f} {v3:17.1f}")
# Classify with & without the provided rms, this time using the *absurd_config_path* as the "floor"
# - We set `no_threshold=False` so that the LARGE per-obs rms is IGNORED (5*config-value will be used)
with Digest2(config_path=absurd_config_path, obscodes_path=obscodes_path, repeatable=True, no_threshold=False) as d2:
r_with_large_rms = d2.classify_tracklet(obs_with_large_rms, is_ades=True)
r_with_small_rms = d2.classify_tracklet(obs_with_small_rms, is_ades=True)
print('\033[91m' + 'Using an absurdly small RMS value in the config files allows very small per-observation RMS values to take effect.' + '\033[0m')
print('\033[91m' + 'But if `no_threshold=False`, large per-obs RMS values will be capped at a ceiling value.' + '\033[0m')
print()
print(f"{'Class':>4s} {'No per-obs RMS':>15s} {'Large per-obs RMS':>17s} {'Small per-obs RMS':>17s}")
print("-" * 60)
for cls in r_no_rms.noid:
v1 = r_no_rms.noid[cls]
v2 = r_with_large_rms.noid[cls]
v3 = r_with_small_rms.noid[cls]
if round(v1) > 0 or round(v2) > 0:
print(f"{cls:>4s} {v1:15.1f} {v2:17.1f} {v3:17.1f}")
Empty Config File => Default 1" Uncertainty When the supplied per-observation RMS is **smaller** than the value in the config file, the supplied per-observation RMS is ignored. Class No per-obs RMS Large per-obs RMS Small per-obs RMS ------------------------------------------------------------ Int 2.2 0.6 2.2 NEO 2.2 0.6 2.2 N22 1.0 0.3 1.0 MC 1.9 0.6 1.9 MB1 94.1 62.2 94.1 MB2 2.4 19.9 2.4 MB3 0.1 15.8 0.1 Using an absurdly small RMS value in the config files allows very small per-observation RMS values to take effect. NB The printed `No per-obs RMS` value uses the same 1" config as above. Class No per-obs RMS Large per-obs RMS Small per-obs RMS ------------------------------------------------------------ Int 2.2 0.6 95.9 NEO 2.2 0.6 95.8 N22 1.0 0.3 27.0 MC 1.9 0.6 4.2 MB1 94.1 62.2 0.0 MB2 2.4 19.9 3.0 MB3 0.1 15.8 0.0 Using an absurdly small RMS value in the config files allows very small per-observation RMS values to take effect. But if `no_threshold=False`, large per-obs RMS values will be capped at a ceiling value. Class No per-obs RMS Large per-obs RMS Small per-obs RMS ------------------------------------------------------------ Int 2.2 15.7 95.9 NEO 2.2 15.6 95.8 N22 1.0 5.4 27.0 N18 0.1 1.0 3.7 MC 1.9 0.8 4.2 MB1 94.1 83.4 0.0 MB2 2.4 0.5 3.0
Class Filtering¶
If you only care about specific orbit classes, use the classes parameter to compute only those scores. This can be useful for focused analyses.
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
result = d2.classify_tracklet(obs, classes=["NEO", "MC", "MB1"])
print("Filtered classification (NEO, MC, MB1 only):")
print(f" NEO: {result.noid.NEO:.1f}")
print(f" MC: {result.noid.MC:.1f}")
print(f" MB1: {result.noid.MB1:.1f}")
Filtered classification (NEO, MC, MB1 only): NEO: 95.6 MC: 4.4 MB1: 0.0
Note that when filtering, scores are renormalized over the selected classes only, so the values will differ from the unfiltered case.
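For intuition only, a simple proportional rescaling over the kept classes looks like the sketch below. This is an assumption for illustration: digest2 recomputes scores over the restricted class set internally, which need not match a naive rescaling of the unfiltered values:

```python
def renormalize(scores, keep):
    """Rescale a subset of pseudo-probability scores so they sum to 100.
    Illustrative assumption only; not how digest2 computes filtered scores."""
    subset = {k: scores[k] for k in keep}
    total = sum(subset.values())
    return {k: 100.0 * v / total for k, v in subset.items()}

full = {"NEO": 50.0, "MC": 25.0, "MB1": 25.0}
print(renormalize(full, ["NEO", "MC"]))  # NEO and MC rescaled to sum to 100
```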
Parsing Observation Files¶
The parsing functions can be used independently, which is useful for inspecting observations before classification.
# Parse MPC 80-column file
tracklets_80 = parse_mpc80_file(sample_obs_path)
for desig, obs_list in tracklets_80.items():
print(f"Designation: '{desig.strip()}' ({len(obs_list)} observations)")
for o in obs_list:
print(f" MJD={o.mjd:.6f} RA={o.ra:.5f} Dec={o.dec:.5f} "
f"Mag={o.mag:.2f} Site={o.obscode}")
Designation: 'K16S99K' (3 observations) MJD=59938.384965 RA=128.15118 Dec=17.17665 Mag=22.22 Site=G96 MJD=59938.395273 RA=128.14848 Dec=17.17702 Mag=21.96 Site=G96 MJD=59938.400402 RA=128.14780 Dec=17.17705 Mag=21.55 Site=G96
# Parse ADES XML file
tracklets_xml = parse_ades_xml(sample_xml_path)
for desig, obs_list in tracklets_xml.items():
print(f"Designation: '{desig}' ({len(obs_list)} observations)")
for o in obs_list:
print(f" MJD={o.mjd:.6f} RA={o.ra:.6f} Dec={o.dec:.6f} "
f"Mag={o.mag:.2f} Site={o.obscode} "
f"rmsRA={o.rms_ra:.3f} rmsDec={o.rms_dec:.3f}")
Designation: 'C8QY322' (3 observations) MJD=59938.384965 RA=128.151180 Dec=17.176650 Mag=22.22 Site=G96 rmsRA=0.247 rmsDec=0.301 MJD=59938.395273 RA=128.148480 Dec=17.177020 Mag=21.96 Site=G96 rmsRA=0.431 rmsDec=0.438 MJD=59938.400402 RA=128.147805 Dec=17.177050 Mag=21.55 Site=G96 rmsRA=0.561 rmsDec=0.573
Note that ADES XML provides per-observation astrometric uncertainties (rmsRA, rmsDec), which the 80-column format does not.
Error Handling¶
digest2 raises standard Python exceptions for common error cases.
# Error: too few observations (need at least 2)
try:
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
d2.classify_tracklet([Observation(mjd=59938.0, ra=128.0, dec=17.0, obscode="G96")])
except (RuntimeError, ValueError) as e:
print(f"{type(e).__name__}: {e}")
# Error: invalid orbit class
try:
with Digest2(obscodes_path=obscodes_path, repeatable=True) as d2:
d2.classify_tracklet(obs, classes=["INVALID"])
except ValueError as e:
print(f"ValueError: {e}")
# Error: using a closed instance
try:
d2 = Digest2(obscodes_path=obscodes_path)
d2.close()
d2.classify_tracklet(obs)
except RuntimeError as e:
print(f"RuntimeError: {e}")
ValueError: At least 2 observations required ValueError: Unknown class abbreviation: INVALID RuntimeError: Digest2 instance has been closed
Summary¶
The digest2 Python package provides fast orbit classification for short-arc astrometric tracklets. Key points:
- Install: pip install digest2
- Basic usage: the Digest2 class as a context manager, or the classify() one-shot function
- Input formats: MPC 80-column (.obs) and ADES XML (.xml) files, or programmatic Observation objects
- Configuration: per-site observatory errors via MPC.config (bundled) or custom config files
- Uncertainties matter: smaller assumed errors produce tighter orbital constraints and sharper classification
- NEO threshold: objects with a NoID NEO score ≥ 65 are posted to the MPC's NEO Confirmation Page
For more information:
- digest2 documentation on GitHub
- NEO Confirmation Page
- Keys et al. 2019 — Algorithm description
For questions or feedback, contact the MPC via the Jira Helpdesk.