Limitations — Field Risk Atlas

This document is the long-form version of the README disclaimer. Every limitation here is a load-bearing constraint: ignoring any of them leads to material misuse of the tool's output.

1. What this tool is not

Not a real estate appraisal

The risk score is a screening artifact. It is not a substitute for an appraisal by a state-certified or state-licensed appraiser per FCA Regulation 12 CFR 614.4260, and it was not produced under USPAP. Any use that requires an appraiser's signature is out of scope.

Not a private-data tool

Every input is public. The repo will not ingest, reference, or commit any proprietary data, internal organizational records, or commercially licensed sources. Parcel aggregators such as ParcelQuest, Regrid, and Acres GIS fall in that last category and are incompatible with the repo's open-source posture. If you build derivative work that pairs this tool with private data, that derivative work needs its own disclosure controls.

2. Snapshot tool, not a live system

The model produces a snapshot keyed on the gsp_status_as_of field per parcel. v1's snapshot is May 2026; the regulatory state encoded in the GSP status crosswalk is:

Tule, Tulare Lake → Probationary
Kaweah, Kern, Chowchilla → Returned to DWR
Pleasant Valley → Inadequate

Refresh cadence is documented in methodology §3. A scoring run with stale inputs is fine for backtesting but should not be presented as the current state.
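One way to keep a stale run from being presented as current is to derive every caption from the snapshot date itself. The sketch below is illustrative only: the function name, the 180-day threshold, and the day-of-month used in the example are assumptions, not part of the repo. Only the gsp_status_as_of field name comes from this document.

```python
from datetime import date

# Hypothetical threshold: how old a snapshot may be before it is labeled stale.
MAX_SNAPSHOT_AGE_DAYS = 180


def snapshot_label(gsp_status_as_of: date, today: date) -> str:
    """Build a caption that always carries the snapshot date and its age."""
    age_days = (today - gsp_status_as_of).days
    if age_days < 0:
        raise ValueError("gsp_status_as_of is in the future")
    status = "stale" if age_days > MAX_SNAPSHOT_AGE_DAYS else "current"
    return (
        f"GSP status as of {gsp_status_as_of.isoformat()} "
        f"({status}, {age_days} days old)"
    )


# The v1 snapshot is May 2026; the exact day used here is a placeholder.
print(snapshot_label(date(2026, 5, 1), today=date(2026, 5, 31)))
print(snapshot_label(date(2026, 5, 1), today=date(2027, 5, 1)))
```

A backtest can ignore the stale flag, but anything shown to a reader should keep the date in the caption.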

The USDM column (usdm_d2_weeks_52w) was zero across all 9 v1 counties in the May-2026 snapshot, because California was in a wet period after the 2020–22 drought. This is accurate to the snapshot date, but it means the USDM component (weight 5) does not discriminate between parcels in v1. Drought conditions will shift as the snapshot date moves forward, so refresh this column before any forward-looking presentation.
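A component that takes the same value everywhere cannot change the rank order of parcels, whatever its weight. A minimal sketch of a check that flags such columns follows. The helper name, the other_component column, and the sample values are made up for illustration; only usdm_d2_weeks_52w and its all-zero value come from this document.

```python
def non_discriminating_columns(rows: list[dict], columns: list[str]) -> list[str]:
    """Return the columns that take a single value across every row."""
    flagged = []
    for col in columns:
        distinct_values = {row[col] for row in rows}
        if len(distinct_values) <= 1:
            flagged.append(col)
    return flagged


# Illustrative rows: the USDM column is zero everywhere, as in the v1 snapshot.
rows = [
    {"county": "Fresno", "usdm_d2_weeks_52w": 0, "other_component": 3},
    {"county": "Kern", "usdm_d2_weeks_52w": 0, "other_component": 1},
    {"county": "Sonoma", "usdm_d2_weeks_52w": 0, "other_component": 2},
]

print(non_discriminating_columns(rows, ["usdm_d2_weeks_52w", "other_component"]))
```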

3. Data quality caveats

Per-county parcel data heterogeneity

The 9 counties use 9 different REST endpoints with 9 different APN field conventions. Notable wrinkles:

Tulare's APN field is integer-encoded (drops leading zeros); PARCELID is used instead.
Madera's GIS server has a broken intermediate cert chain; ingest tolerates it for that one source.
Kern's parcel data carries an attribution disclaimer (Kern County Assessor's Office, Mapping Section; Kern Council of Governments; MCAG; City of Bakersfield; City of Shafter). Raw Kern parcel data is never committed to the repo.
San Joaquin's sjmap.org MapServer is on an older ArcGIS Server build; it doesn't serve f=geojson and caps resultRecordCount silently — ingest pulls Esri JSON and paginates by OBJECTID instead.
109 parcels (~0.04%) have collisions on parcel_id in Fresno + Kings — different physical land sharing an APN string. The overlay collapses to the largest geometry per parcel_id.
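The collapse rule in the last item can be sketched as follows. This is an illustration under assumptions, not the repo's overlay code: the area_m2 field and the sample records are hypothetical, and a real implementation would compute area from the geometry.

```python
def collapse_to_largest(parcels: list[dict]) -> list[dict]:
    """Keep one record per parcel_id: the one with the largest area."""
    largest: dict[str, dict] = {}
    for parcel in parcels:
        current = largest.get(parcel["parcel_id"])
        if current is None or parcel["area_m2"] > current["area_m2"]:
            largest[parcel["parcel_id"]] = parcel
    return list(largest.values())


# Two physically different pieces of land sharing one APN string.
parcels = [
    {"parcel_id": "001-020-03", "county": "Fresno", "area_m2": 1200.0},
    {"parcel_id": "001-020-03", "county": "Fresno", "area_m2": 64000.0},
    {"parcel_id": "002-110-17", "county": "Kings", "area_m2": 8100.0},
]

for kept in collapse_to_largest(parcels):
    print(kept["parcel_id"], kept["area_m2"])
```

The smaller geometry is dropped rather than merged, so for the affected parcels the score describes only the largest piece of land under that parcel_id.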

DWR i07 well-completion reports have known issues

DWR's own metadata documents missing/duplicate records and incorrect values. Aggregation to PLSS section median is robust to outliers; absolute well counts are not. The dominant_well_depth_ft column is a median over the parcel's PLSS section, not a parcel-specific drilling depth.
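To see why the section median tolerates bad records while counts do not, consider a small sketch. The field names, section key, and depth values below are invented for illustration and do not come from the i07 dataset.

```python
from statistics import mean, median


def section_median_depth(wells: list[dict]) -> dict[str, float]:
    """Group well records by PLSS section and take the median depth of each."""
    depths_by_section: dict[str, list[float]] = {}
    for well in wells:
        depths_by_section.setdefault(well["plss_section"], []).append(well["depth_ft"])
    return {sec: median(depths) for sec, depths in depths_by_section.items()}


# One section with three plausible depths and one mis-keyed value (3100).
wells = [
    {"plss_section": "T20S R24E S14", "depth_ft": 300},
    {"plss_section": "T20S R24E S14", "depth_ft": 310},
    {"plss_section": "T20S R24E S14", "depth_ft": 320},
    {"plss_section": "T20S R24E S14", "depth_ft": 3100},
]

depths = [w["depth_ft"] for w in wells]
print("median:", section_median_depth(wells)["T20S R24E S14"])
print("mean:", mean(depths))
print("count:", len(wells))
```

The median stays near the plausible depths while the mean is pulled far off by the single bad value. The count carries every missing or duplicate record straight through, which is why absolute well counts should not be read off this data.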

PLSS rancho-grant areas have no T/R/S

~6.5% of v1-bbox PLSS polygons (concentrated in Sonoma) are Mexican Land Grant ranchos that predate the PLSS and therefore carry no township/range/section. Parcels falling on these polygons won't have well-depth statistics; that's correct behavior, not a bug. ~28% of all parcels are missing well-depth coverage as a result.

Land IQ vs CDL provenance

Land IQ accounts for 66.7% of ag-parcel crop classifications; CDL fills the rest (mostly with Grassland/Pasture → annual_low). Land IQ is the higher-fidelity source; CDL is a 30m-grid centroid-sample fallback. The crop_source audit column records which source classified each parcel.
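The source-priority rule can be sketched as a simple fallback that records its own provenance. The function name, the "land_iq" and "cdl" labels, and the sample class names other than annual_low are assumptions for illustration; the repo's actual crop_source values may differ.

```python
from typing import Optional


def classify_crop(land_iq_class: Optional[str], cdl_class: Optional[str]) -> dict:
    """Prefer Land IQ; fall back to CDL; record which source was used."""
    if land_iq_class is not None:
        return {"crop_class": land_iq_class, "crop_source": "land_iq"}
    if cdl_class is not None:
        return {"crop_class": cdl_class, "crop_source": "cdl"}
    return {"crop_class": None, "crop_source": None}


# Land IQ wins whenever it has a classification for the parcel.
print(classify_crop("permanent_high", "annual_low"))
# With no Land IQ coverage, the CDL centroid sample fills in.
print(classify_crop(None, "annual_low"))
```

Filtering on crop_source lets a reader separate the higher-fidelity Land IQ classifications from the CDL fallback before drawing conclusions.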

Water tier crosswalk is partially filled

Of 505 districts in the v1 footprint, 49 have explicit ASFMRA Tier 1–4 assignments (covering 95% of ag-acres) and 175 are auto-marked n/a (mutual water companies, mobile-home parks, urban M&I, etc.). 281 long-tail districts remain blank, defaulting to white-area; collectively they cover 3.1% of ag-acres.
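The counts above reconcile, and the default for blank districts can be sketched in a few lines. The district keys and tier labels in the sample crosswalk are hypothetical; only the counts and the white-area default come from this document.

```python
# Counts taken from the paragraph above.
EXPLICIT_TIER = 49
AUTO_NA = 175
BLANK = 281
TOTAL_DISTRICTS = 505
assert EXPLICIT_TIER + AUTO_NA + BLANK == TOTAL_DISTRICTS

DEFAULT_TIER = "white-area"


def resolve_tier(district_id: str, crosswalk: dict[str, str]) -> str:
    """Return the crosswalk entry, or the white-area default when blank or absent."""
    tier = crosswalk.get(district_id, "")
    return tier if tier else DEFAULT_TIER


# Hypothetical crosswalk rows: one tiered, one auto-marked n/a, one blank.
crosswalk = {"district_a": "tier_2", "district_b": "n/a", "district_c": ""}

for district in ("district_a", "district_b", "district_c", "district_missing"):
    print(district, "->", resolve_tier(district, crosswalk))
```

Because a blank entry and a true white-area parcel resolve to the same value, the two cannot be told apart downstream without going back to the crosswalk.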

4. Methodology limitations

Static weights, no learning

The weights are pinned by build-plan judgment. They are not calibrated against historical outcomes data, and they are not adjusted for cross-component correlation. For example, critical-overdraft basins almost always have High priority, so those two components partly double-count the same signal. A serious calibration exercise would replace the pinned weights with regression-fit weights against an outcomes panel.
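A toy version of a static weighted score makes the double-counting concrete. Everything here is a sketch under assumptions: the component names and the two weights of 20 are invented, and only the USDM weight of 5 is stated in this document. The real weights are in the methodology.

```python
# Hypothetical weights; only the USDM weight of 5 appears in this document.
WEIGHTS = {"basin_priority": 20, "critical_overdraft": 20, "usdm": 5}


def risk_score(components: dict[str, float]) -> float:
    """Static weighted sum over components normalized to the 0..1 range."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# If critical overdraft and High priority almost always occur together,
# the pair behaves like one component carrying the sum of both weights.
overdrafted = {"basin_priority": 1.0, "critical_overdraft": 1.0, "usdm": 0.0}
not_overdrafted = {"basin_priority": 0.0, "critical_overdraft": 0.0, "usdm": 0.0}

print(risk_score(overdrafted))
print(risk_score(not_overdrafted))
```

Regression-fit weights would shrink one of the two correlated weights instead of letting both count in full.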

Single time-point GSP-status visualization

The map shows a single (May-2026) snapshot. A 2020 → 2026 time slider would require a manually curated GSP-status timeline (SWRCB order effective dates per basin), which is not in v1. Presenting the map as a "trajectory" visualization would be misleading.

Nine counties, not statewide

v1 covers Sonoma, San Joaquin, Stanislaus, Merced, Madera, Fresno, Tulare, Kings, and Kern: the full San Joaquin Valley SGMA-stress footprint plus a Sonoma vineyard contrast. The architecture is statewide-capable, but extending it requires a per-county parcel ingest module. Inferences about counties outside this footprint (Sacramento Valley, Salinas Valley, etc.) are out of scope.

5. Use boundaries

Appropriate uses:

Field-level and regional water-risk analysis — which county / basin / crop-class concentrations are most exposed to SGMA stress
Field prioritization — flag parcels for closer review by a state-licensed appraiser or domain expert
Methodology research — a transparent and auditable model with documented data sources
Public discourse on SGMA's water-allocation and basin-stress implications

Inappropriate uses:

Any decision relying on the score in isolation, without ground-truth context the model doesn't capture
Any presentation that strips the snapshot date from the score
Any use that pairs the public score output with private data without separate disclosure controls

6. Where to read the source

Methodology rationale: methodology
Source URLs, licenses, vintages: sources
Code: github.com/mcmillinanalytics/field-risk-atlas

If you find a methodology gap or data-quality issue not documented here, file a GitHub issue against the repo.