Generate dummy TSV data with randomized FASTA files based on a JSON schema.
- Generate TSV data from JSON schemas
- Randomize FASTA file names and headers
- Add field constraints to limit values
- Optionally inject validation errors for testing
- Use unique-names-generator for readable random names
- Hierarchical validation: Ensures provinces/states are children of their respective countries
npm install
npm start

Then open http://localhost:3000
This project is configured to deploy on Netlify:
- Connect your GitHub repository to Netlify
- Use these build settings:
  - Build command: (leave empty or echo 'No build needed')
  - Publish directory: .
  - Functions directory: .netlify/functions
The configuration is in netlify.toml.
- Select a JSON schema from the dropdown
- Enter a submission name (e.g., "MPOX", "COVID")
- Set the number of rows to generate
- Optionally add field constraints
- Optionally enable validation error injection
- Click "Generate & Download"
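
The steps above can be pictured with a minimal sketch of the row-generation step. This is not the code in app.js: the schema shape (JSON Schema-style `properties` with optional `enum`), the function names `pickValue` and `generateTsv`, and the placeholder value format are all assumptions made for illustration. Validation error injection and FASTA handling are left out.

```javascript
// Hypothetical schema shape: JSON Schema-style "properties" with an optional "enum".
// The real files in schemas/ may be structured differently.
const schema = {
  properties: {
    sample_id: { type: "string" },
    organism: { type: "string", enum: ["Mpox virus", "SARS-CoV-2"] },
  },
};

// Pick one value for a field: a user constraint list wins over the schema enum;
// fields with neither get a placeholder derived from the field name and row number.
function pickValue(name, spec, constraints, rowIndex, random) {
  const allowed = constraints[name] || spec.enum;
  if (allowed && allowed.length > 0) {
    return allowed[Math.floor(random() * allowed.length)];
  }
  return `${name}_${rowIndex + 1}`;
}

// Build TSV text: one header line followed by `rowCount` data lines.
// `random` is injectable so the output can be made deterministic in tests.
function generateTsv(schema, rowCount, constraints = {}, random = Math.random) {
  const fields = Object.keys(schema.properties);
  const lines = [fields.join("\t")];
  for (let i = 0; i < rowCount; i++) {
    const cells = fields.map((name) =>
      pickValue(name, schema.properties[name], constraints, i, random)
    );
    lines.push(cells.join("\t"));
  }
  return lines.join("\n");
}
```

Calling `generateTsv(schema, 50, { organism: ["Mpox virus"] })` would, under these assumptions, produce 50 rows whose organism column is limited to the constrained value.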
- index.html - Main application interface
- app.js - JavaScript logic
- server.js - Local development server
- .netlify/functions/api.js - Netlify serverless function
- schemas/ - JSON schema files
- fastas/ - FASTA files for randomization
- africa_hierarchical_enriched.json - Hierarchical geographic data for country-province validation
The generator automatically validates that geo_loc_name_state_province_territory values are children of the selected geo_loc_name_country. This uses the africa_hierarchical_enriched.json file to build a country-to-provinces mapping at runtime.
When a country is selected, the generator will:
- Filter the available provinces to only those belonging to that country
- Randomly select from the valid provinces for that country
- Fall back to the full province list if no mapping is found
This applies to all geographic field pairs:
- geo_loc_name_country → geo_loc_name_state_province_territory
- host_residence_geo_loc_name_country → host_residence_geo_loc_name_state_province_territory
- location_of_exposure_geo_loc_name_country → location_of_exposure_geo_loc_name_state_province_territory
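
The lookup and fallback described above could be sketched roughly as follows. The layout of africa_hierarchical_enriched.json is not documented here, so the `hierarchy` array below is a hypothetical stand-in, and the helper names `buildCountryMap`, `pickProvince`, and `fillProvinces` are illustrative rather than taken from app.js.

```javascript
// Hypothetical stand-in for africa_hierarchical_enriched.json; the real file may differ.
const hierarchy = [
  { country: "Kenya", provinces: ["Nairobi", "Mombasa"] },
  { country: "Ghana", provinces: ["Ashanti", "Greater Accra"] },
];

// The three country/province field pairs listed above.
const GEO_FIELD_PAIRS = [
  ["geo_loc_name_country", "geo_loc_name_state_province_territory"],
  ["host_residence_geo_loc_name_country",
   "host_residence_geo_loc_name_state_province_territory"],
  ["location_of_exposure_geo_loc_name_country",
   "location_of_exposure_geo_loc_name_state_province_territory"],
];

// Build the country-to-provinces lookup once at startup.
function buildCountryMap(entries) {
  const map = new Map();
  for (const { country, provinces } of entries) {
    map.set(country, provinces);
  }
  return map;
}

// Pick a province that belongs to `country`; if the country has no mapping,
// fall back to the full province list.
function pickProvince(country, countryMap, allProvinces, random = Math.random) {
  const mapped = countryMap.get(country);
  const pool = mapped && mapped.length > 0 ? mapped : allProvinces;
  return pool[Math.floor(random() * pool.length)];
}

// For every pair whose country field is set, overwrite the province field
// with a value consistent with that country. Returns a new row object.
function fillProvinces(row, countryMap, allProvinces, random = Math.random) {
  const out = { ...row };
  for (const [countryField, provinceField] of GEO_FIELD_PAIRS) {
    if (out[countryField] !== undefined) {
      out[provinceField] = pickProvince(out[countryField], countryMap, allProvinces, random);
    }
  }
  return out;
}
```

Passing the random source as a parameter is a design choice for this sketch only; it keeps the selection testable without changing the filter-then-fallback logic.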
The original Python tool is still available:
python generate_dummy_tsv.py mpox.json 50 mpox.zip --spread 10 --tsv-name mpox.tsv