schema: refactor data/threat models and refresh bundles #777
P3tra-WP wants to merge 1 commit into CycloneDX:2.0-dev-threatmodeling from
Pull request overview
This PR refactors the data and threat modeling schemas to improve consistency, reuse, and explicit linkage across models. The changes consolidate data classification logic into a shared data model, add new threat model references (vulnerabilityRef and ibmRiskAtlas), fix missing risk model definitions, and reorganize the blueprint schema structure.
Changes:
- Enhanced threat modeling with vulnerabilityRef and ibmRiskAtlasReference support
- Refactored dataClassification to support enum strings, custom strings, or detailed objects with comprehensive metadata
- Consolidated dataObject and dataCategory definitions into the shared cyclonedx-data-2.0 schema
- Reorganized blueprint schema by moving actor into $defs and adding accessControlType alias
- Regenerated bundled schemas to reflect all structural changes
Reviewed changes
Copilot reviewed 4 out of 8 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cyclonedx-threat-2.0.schema.json | Added vulnerabilityRef and ibmRiskAtlasReference to threatScenario |
| cyclonedx-risk-2.0.schema.json | Added missing likelihoodFactor definition |
| cyclonedx-data-2.0.schema.json | Refactored dataClassification with detailed metadata, added dataCategory and dataObject definitions |
| cyclonedx-blueprint-2.0.schema.json | Moved actor to $defs, updated references to use shared data model definitions, removed duplicate definitions |
| cyclonedx-api-2.0-bundled.min.schema.json | Regenerated bundled schema incorporating all changes |
        "restricted": "Highly restricted information with limited access and special protection measures"
      }
    },
    {
Remove the second object.
    "description": "Detailed data classification metadata.",
    "additionalProperties": false,
    "properties": {
      "level": {
This is where the oneOf should go, IMO, so that existing levels or custom levels can be specified and still take advantage of what the object provides.
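As a rough sketch of this suggestion, the oneOf could sit on `level` itself, accepting either a predefined level or a custom one. Only `restricted` is visible in the diff, so the enum is abbreviated here. The custom branch's `name` and `description` fields are an assumption borrowed from the later comment about custom options, not something the PR defines.

```json
{
  "$comment": "Hypothetical sketch only. The enum is abbreviated and the custom branch's field names are assumptions.",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "level": {
      "oneOf": [
        {
          "type": "string",
          "enum": ["restricted"]
        },
        {
          "type": "object",
          "additionalProperties": false,
          "required": ["name"],
          "properties": {
            "name": { "type": "string", "minLength": 1 },
            "description": { "type": "string" }
          }
        }
      ]
    }
  }
}
```

Because one branch is a string and the other an object, no value can match both, so oneOf is safe here.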
        "custom": "Custom handling requirement"
      }
    },
    {
The second option should not be a string, rather an object with a name and description.
        }
      ]
    },
    "retention": {
The retention property assumes persisted data. IMO, there is also a need to handle transient data, such as PHI in a short-lived JWT that expires in an hour.
Petra mentioned TTL, which could be used alongside retention. We could also think about a simple integer field paired with an explicit time unit, e.g. nanoseconds or hours.
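One possible shape for the integer-plus-unit idea, purely as a sketch: the property it would live under (for example `ttl`), the field names `value` and `unit`, and the unit list beyond nanoseconds and hours are all assumptions, not part of the PR.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$comment": "Hypothetical sketch of a TTL object. Field names and the unit list are assumptions.",
  "type": "object",
  "additionalProperties": false,
  "required": ["value", "unit"],
  "properties": {
    "value": {
      "type": "integer",
      "minimum": 0
    },
    "unit": {
      "type": "string",
      "enum": ["nanoseconds", "milliseconds", "seconds", "minutes", "hours", "days"]
    }
  }
}
```

Under this sketch, the JWT case above would be expressed as `{ "value": 1, "unit": "hours" }`.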
      ]
    },
    "disposal": {
      "oneOf": [
Change disposal to use the behavior taxonomy. See Steve for a prototype of this in action.
    "credentials",
    "safety",
    "operational",
    "custom"
Biometric, etc. should be supported.
      },
      "description": "Categories of data (PII, PHI, PCI, etc.)"
    },
    "schema": {
For schema and format, work with Steve. We're doing something similar with externalReferences. Related to #185.
    "weakness": {
      "$ref": "#/$defs/weaknessReference"
    },
    "ibmRiskAtlas": {
Rename this so that it is not IBM-specific.
        }
      }
    },
    "ibmRiskAtlasReference": {
Rename this to make it more universal.
    "ibmRiskAtlasReference": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
Add a new field to represent "IBM Risk Atlas" and any other custom taxonomy.
Take a look at the vulnerability schema, which supports CWE and multiple ratings, including CVSS, SSVC, and OWASP Risk Rating.
In the case of ASVS and similar, these can also be handled in CDXA.
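To make the renaming requests in this thread concrete, here is a minimal sketch of a taxonomy-neutral reference in which the taxonomy is named by a field rather than baked into the definition name. The definition name `taxonomyReference` and the fields `taxonomy`, `id`, and `url` are hypothetical and do not come from the PR or from the existing vulnerability schema.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$comment": "Hypothetical sketch. 'taxonomyReference' and all of its field names are assumptions.",
  "$defs": {
    "taxonomyReference": {
      "type": "object",
      "additionalProperties": false,
      "required": ["taxonomy", "id"],
      "properties": {
        "taxonomy": {
          "type": "string",
          "minLength": 1,
          "description": "Name of the taxonomy, for example IBM Risk Atlas or a custom one."
        },
        "id": {
          "type": "string",
          "minLength": 1,
          "description": "Identifier of the entry within that taxonomy."
        },
        "url": {
          "type": "string",
          "description": "Optional link to the entry."
        }
      }
    }
  }
}
```

With a shape like this, IBM Risk Atlas becomes one value of `taxonomy` among others, and threatScenario could reference the generic definition instead of an IBM-specific one.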
Title:
Refactor data/threat modeling schemas and regenerate bundles
Description:
This PR updates the data and threat modeling schemas to improve consistency, reuse, and explicit linkage across models, and regenerates bundled schemas.
What changed
- Data classification refactor:
  - Moved detailed dataClassification into the shared data model.
  - dataClassification now supports: enum string, custom string, or detailed object.
  - Detailed object uses dataCategory for dataTypes.
- Data objects and categories:
  - dataObject and dataCategory moved to cyclonedx-data-2.0.schema.json.
  - dataSet.dataObjects now references shared dataObject definitions.
- Flow metadata consolidation:
  - Removed flow.dataFormat and flow.classification; flows reference dataObjects for these details.
- Threat model enhancements:
  - Added vulnerabilityRef on threatScenario to link threats to vulnerabilities.
  - Added ibmRiskAtlas reference object.
  - CAPEC references already supported via attackPattern / attackPatternReference.
- Risk model fixups:
  - Added missing likelihoodFactor definition.
- Blueprint schema fixups:
  - Moved actor into $defs and added accessControlType alias to authorizationType.
  - Enforced dataObject classification via oneOf (inline vs ref).
- Bundled outputs regenerated:
  - cyclonedx-2.0-bundled.schema.json
  - cyclonedx-2.0-bundled.min.schema.json
  - cyclonedx-api-2.0-bundled.schema.json
  - cyclonedx-api-2.0-bundled.min.schema.json
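The three accepted forms of dataClassification described in this list (enum string, custom string, detailed object) can be sketched as the schema fragment below. This is an illustration under assumptions and has not been checked against the PR's actual schema: the definition names are hypothetical, the enum shows only the one level visible in the diff, and the detailed object is reduced to `level` and `dataTypes`.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$comment": "Hypothetical sketch, not the PR's schema. Definition names and the abbreviated enum are assumptions.",
  "$defs": {
    "classificationLevel": {
      "type": "string",
      "enum": ["restricted"]
    },
    "dataClassification": {
      "anyOf": [
        { "$ref": "#/$defs/classificationLevel" },
        { "type": "string", "minLength": 1 },
        {
          "type": "object",
          "additionalProperties": false,
          "required": ["level"],
          "properties": {
            "level": { "$ref": "#/$defs/classificationLevel" },
            "dataTypes": {
              "type": "array",
              "items": { "type": "string" }
            }
          }
        }
      ]
    }
  }
}
```

The sketch uses anyOf rather than oneOf for a reason: if both string branches sat under a single oneOf, a predefined value such as `"restricted"` would match the enum branch and the free-string branch at once, and oneOf rejects any value that matches more than one branch.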
Notes
Bundler warns about missing 2020-12 meta-schema in AJV (existing behavior).
Testing
Bundled schemas regenerated via bundle-schemas.js.