31 commits
b888155
adding graduation_rates_ipeds data
balit-raibot Jan 3, 2026
c646483
resolving comments
balit-raibot Jan 3, 2026
787d577
Merge branch 'master' into GradRates_IPEDS
balit-raibot Jan 3, 2026
3dfb390
Merge branch 'master' into GradRates_IPEDS
balit-raibot Jan 27, 2026
25e0bd5
reorganizing folder
smarthg-gi Jan 29, 2026
64f92a5
Merge branch 'GradRates_IPEDS' of https://github.com/balit-raibot/dat…
smarthg-gi Jan 29, 2026
378ffae
Reorganizing files
smarthg-gi Jan 30, 2026
c41c343
Update README.md
smarthg-gi Jan 30, 2026
8b07050
adding files
smarthg-gi Feb 5, 2026
f6da264
Merge branch 'GradRates_IPEDS' of https://github.com/balit-raibot/dat…
smarthg-gi Feb 5, 2026
ae7fdf0
updating schema file
smarthg-gi Feb 5, 2026
d6899f5
Updating readme
smarthg-gi Feb 5, 2026
1deb20f
Updating files
smarthg-gi Feb 16, 2026
32a9aee
Merge branch 'master' into GradRates_IPEDS
smarthg-gi Feb 16, 2026
da74d65
Moving files to correct directory
smarthg-gi Feb 16, 2026
836edae
removing stat_vars.mcf file
smarthg-gi Feb 16, 2026
9f91140
Updating directory structure
smarthg-gi Feb 16, 2026
5ce5ce0
Removing schema mcf file
smarthg-gi Feb 16, 2026
df5dbca
Update README.md
smarthg-gi Feb 16, 2026
9b92062
Update manifest.json
smarthg-gi Feb 16, 2026
934845b
renaming directory structure
smarthg-gi Feb 16, 2026
305a070
Merge branch 'GradRates_IPEDS' of https://github.com/balit-raibot/dat…
smarthg-gi Feb 16, 2026
16e0bc1
Merge branch 'master' into GradRates_IPEDS
smarthg-gi Feb 17, 2026
1afd7e5
Merge branch 'master' into GradRates_IPEDS
balit-raibot Feb 24, 2026
8ae31b3
Merge branch 'master' into GradRates_IPEDS
balit-raibot Feb 26, 2026
b454c91
made courseCompletionTime as a constraint property
balit-raibot Feb 26, 2026
bfc03dd
reprocessed test_data with new property
balit-raibot Feb 26, 2026
8788569
Merge branch 'master' into GradRates_IPEDS
balit-raibot Mar 18, 2026
822c8b2
Updating latest changes in pvmap and respective files
smarthg-gi Mar 25, 2026
8c12e1e
updating files
smarthg-gi Mar 25, 2026
2d729b6
updating files
smarthg-gi Mar 26, 2026
73 changes: 73 additions & 0 deletions statvar_imports/ipeds/ipeds_graduationrates_national/README.md
@@ -0,0 +1,73 @@
# IPEDS GraduationRates National Dataset
## Overview
This dataset contains national-level graduation rate statistics for students who started as full-time, first-time (FTFT) degree or certificate-seeking undergraduates.
Specifically, it provides graduation rates at three different time intervals: 100%, 150%, and 200% of the "normal time" to completion.
It captures these key metrics:
- 100% Graduation Rate: Students finishing within the standard program length
- 150% Graduation Rate: The standard reporting benchmark (e.g., 6 years for a Bachelor's)
- 200% Graduation Rate: The extended benchmark (e.g., 8 years for a Bachelor's)

The cohort year identifies the group of students who first entered an institution or started a degree in that year. For example, for cohort years 2018-2022, the data gives the graduation rates in 2022 for the students who enrolled in 2018.
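
As a minimal sketch of how a cohort label becomes observation fields: the pvmap in this import maps a label such as `cohort years 2010 and 2014` to `observationDate` 2014 and `observationPeriod` P4Y. The helper below reproduces that pattern; the function name and regular expression are hypothetical and are not part of the import scripts.

```python
import re

# Hypothetical helper (not part of the import scripts): turn a table-title
# fragment such as "cohort years 2010 and 2014" into the observation fields
# that the pvmap of this import assigns to it.
COHORT_RE = re.compile(r"cohort years (\d{4}) and (\d{4})")


def cohort_to_observation(label: str) -> dict:
    """Use the later year as the date and the gap between years as the period."""
    match = COHORT_RE.search(label)
    if match is None:
        raise ValueError(f"no cohort years found in: {label!r}")
    first, second = (int(year) for year in match.groups())
    return {
        "observationDate": str(second),
        "observationPeriod": f"P{second - first}Y",
    }


print(cohort_to_observation("United States, cohort years 2010 and 2014"))
```

A helper like this could be used to generate pvmap rows for newer cohort years instead of adding them by hand.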

- Type of place: Country
- Years: 2009-2024
## Data Source
**Source URL:**
https://nces.ed.gov/ipeds/search/

**Provenance Description:**
The data comes from the U.S. Department of Education, National Center for Education Statistics (NCES). Specifically, it is drawn from the Integrated Postsecondary Education Data System (IPEDS), a comprehensive system of interrelated surveys that gathers institution-level data from colleges, universities, and technical/vocational schools across the United States.

## Refresh Type
Semi-Automatic Refresh

To refresh the data, the import is semi-automated: a manual download step places the data in a GCS path before processing.

## Data Publish Frequency
Release frequency: Annual.
Provisional data is released in early fall (Sep-Oct).

## How To Download Input Data
To download the data, use the source link above, which leads to the IPEDS Data Explorer, a search tool provided by NCES. Filter for the Graduation Rates tables as follows:
- Open the source link to reach the Data Explorer.
- Under the 'Surveys' dropdown, select 'Graduation Rates 200% (GR200)'.
- By default, the results show data for the latest year.
- To fetch data for specific years, or for all years, select them from the 'Data Year' dropdown.
- Open the table, then select the 'Excel' option in the page header to download the data in .xlsx format.
- The downloaded data is now available for processing.
- Move the data to the path: **gs://unresolved_mcf/IPEDS/graduation_rates_national/input_files/**
- Process the data using the stat_var_processor script with the GCS bucket path as input, as shown in the section below.

## Processing Instructions
To process the IPEDS Graduation Rate data and generate statistical variables, use the following command from the "data" directory:

**For Data Run**
```bash
python ../../tools/statvar_importer/stat_var_processor.py \
--input_data=gs://unresolved_mcf/IPEDS/graduation_rates_national/input_files/*.csv \
--pv_map=graduation_rates_ipeds_pvmap.csv \
--output_path=output/graduation_rates_ipeds_output \
--config_file=graduation_rates_ipeds_metadata.csv \
--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf
```

This generates the following output files:
- output.csv
- output_stat_vars_schema.mcf
- output_stat_vars.mcf
- output.tmcf
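
A small sketch for confirming that a run produced every expected file. The suffixes are inferred from the file names passed to the lint command in this README, so treat them as an assumption about the processor's naming. The function names are hypothetical.

```python
from pathlib import Path

# Suffixes inferred from the file names passed to the lint command in this
# README. This is an assumption about how the processor names its outputs.
OUTPUT_SUFFIXES = (".csv", ".tmcf", "_stat_vars.mcf", "_stat_vars_schema.mcf")


def expected_outputs(output_path: str) -> list[Path]:
    """Build the expected output paths from the --output_path prefix."""
    return [Path(output_path + suffix) for suffix in OUTPUT_SUFFIXES]


def missing_outputs(output_path: str) -> list[Path]:
    """Return the expected outputs that do not exist on disk."""
    return [path for path in expected_outputs(output_path) if not path.exists()]


if __name__ == "__main__":
    for path in missing_outputs("output/graduation_rates_ipeds_output"):
        print(f"missing: {path}")
```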

**For Data Quality Checks and Validation**
Validate the generated files with the `lint` command of the Data Commons import tool (Java):

```bash
java -jar datacommons-import-tool-0.1-jar-with-dependencies.jar lint graduation_rates_ipeds_output_stat_vars_schema.mcf graduation_rates_ipeds_output.csv graduation_rates_ipeds_output.tmcf graduation_rates_ipeds_output_stat_vars.mcf
```

This generates the following output files:
- report.json
- summary_report.csv
- summary_report.html

Review the report files for errors and warnings raised by the lint run on the generated output files.
@@ -0,0 +1,3 @@
parameter,value
output_columns,"observationDate,observationPeriod,value,unit,observationAbout,variableMeasured"
dc_api_root,https://api.datacommons.org
@@ -0,0 +1,37 @@
key,p1,v1,p2,v2,p3,v3,p4,v4,p5,v5
National Center for Education Statistics,statType,measuredValue,populationType,Student,observationAbout,country/USA,,,,
Total,value,{Number},institutionType,"""""",statType,measuredValue,measuredProperty,graduationRate,unit,Percent
Overall,value,{Number},institutionType,"""""",statType,measuredValue,measuredProperty,graduationRate,unit,Percent
All institutions,value,{Number},institutionType,"""""",statType,measuredValue,measuredProperty,graduationRate,unit,Percent
Public,value,{Number},institutionType,PubliclyOwnedInstitute,statType,measuredValue,measuredProperty,graduationRate,unit,Percent
not-for-profit,value,{Number},institutionType,PrivatelyOwnedNotForProfitInstitute,statType,measuredValue,measuredProperty,graduationRate,unit,Percent
nonprofit,value,{Number},institutionType,PrivatelyOwnedNotForProfitInstitute,statType,measuredValue,measuredProperty,graduationRate,unit,Percent
for-profit,value,{Number},institutionType,PrivatelyOwnedForProfitInstitute,statType,measuredValue,measuredProperty,graduationRate,unit,Percent
attending 4-year,educationalAttainment,BachelorsDegree,#Header,educationalAttainment,,,,,,
attending 2-year,educationalAttainment,AssociateDegreeOrCertificate,#Header,educationalAttainment,,,,,,
attending 2-year,educationalAttainment,AssociateDegreeOrCertificate,#Header,educationalAttainment,,,,,,
less-than- 2-year,educationalAttainment,PostSecondaryCertificate,#Header,educationalAttainment,,,,,,
less-than-2-year,educationalAttainment,PostSecondaryCertificate,#Header,educationalAttainment,,,,,,
attending less-than-,educationalAttainment,PostSecondaryCertificate,#Header,educationalAttainment,,,,,,
within 100%,courseCompletionTime,CourseCompletedWithin100PercentOfNormalTime,,,,,,,,
within 150%,courseCompletionTime,CourseCompletedWithin150PercentOfNormalTime,,,,,,,,
within 200%,courseCompletionTime,CourseCompletedWithin200PercentOfNormalTime,,,,,,,,
Within 100 percent,courseCompletionTime,CourseCompletedWithin100PercentOfNormalTime,,,,,,,,
Within 150 percent,courseCompletionTime,CourseCompletedWithin150PercentOfNormalTime,,,,,,,,
Within 200 percent,courseCompletionTime,CourseCompletedWithin200PercentOfNormalTime,,,,,,,,
cohort years 2000 and 2004,observationDate,2004,observationPeriod,P4Y,,,,,,
cohort years 2001 and 2005,observationDate,2005,observationPeriod,P4Y,,,,,,
cohort years 2002 and 2006,observationDate,2006,observationPeriod,P4Y,,,,,,
cohort years 2003 and 2007,observationDate,2007,observationPeriod,P4Y,,,,,,
cohort years 2004 and 2008,observationDate,2008,observationPeriod,P4Y,,,,,,
cohort years 2005 and 2009,observationDate,2009,observationPeriod,P4Y,,,,,,
cohort years 2006 and 2010,observationDate,2010,observationPeriod,P4Y,,,,,,
cohort years 2007 and 2011,observationDate,2011,observationPeriod,P4Y,,,,,,
cohort years 2008 and 2012,observationDate,2012,observationPeriod,P4Y,,,,,,
cohort years 2009 and 2013,observationDate,2013,observationPeriod,P4Y,,,,,,
cohort years 2010 and 2014,observationDate,2014,observationPeriod,P4Y,,,,,,
cohort years 2011 and 2015,observationDate,2015,observationPeriod,P4Y,,,,,,
cohort years 2012 and 2016,observationDate,2016,observationPeriod,P4Y,,,,,,
cohort years 2013 and 2017,observationDate,2017,observationPeriod,P4Y,,,,,,
cohort years 2014 and 2018,observationDate,2018,observationPeriod,P4Y,,,,,,
cohort years 2015 and 2019,observationDate,2019,observationPeriod,P4Y,,,,,,
21 changes: 21 additions & 0 deletions statvar_imports/ipeds/ipeds_graduationrates_national/manifest.json
@@ -0,0 +1,21 @@
{
"import_specifications": [
{
"import_name": "IPEDS_GraduationRates_National",
"curator_emails": ["support@datacommons.org"],
"provenance_url": "https://nces.ed.gov/ipeds/search",
"provenance_description": "",
"scripts": ["../../tools/statvar_importer/stat_var_processor.py --input_data=gs://unresolved_mcf/IPEDS/graduation_rates_national/input_files/*.csv --pv_map=graduation_rates_ipeds_pvmap.csv --config_file=graduation_rates_ipeds_metadata.csv --output_path=output/graduation_rates_ipeds_output --existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf"],
"import_inputs": [
{
"template_mcf": "output/graduation_rates_ipeds_output.tmcf",
"cleaned_csv": "output/graduation_rates_ipeds_output.csv"
}
],
"cron_schedule": "0 0 15 8 *"
}
]
}
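
A literal line break inside a JSON string makes the file unparseable by strict parsers, and it is easy to introduce one in a long `scripts` entry. The sketch below shows one way to check a manifest before committing. `REQUIRED_KEYS` is a hypothetical list chosen from the keys used in this manifest, not an official schema.

```python
import json

# Hypothetical list, taken from the keys that appear in this manifest.
REQUIRED_KEYS = ("import_name", "scripts", "import_inputs", "cron_schedule")


def check_manifest(text: str) -> list[str]:
    """Return a list of problems found in manifest text (empty if none)."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]
    problems = []
    for index, spec in enumerate(manifest.get("import_specifications", [])):
        for key in REQUIRED_KEYS:
            if key not in spec:
                problems.append(f"import_specifications[{index}] is missing {key!r}")
    return problems


good = (
    '{"import_specifications": [{"import_name": "X", "scripts": ["a.py"], '
    '"import_inputs": [], "cron_schedule": "0 0 15 8 *"}]}'
)
# Same manifest, but with a raw newline inside the scripts string.
bad = good.replace("a.py", "a.py\n")

print(check_manifest(good))
print(check_manifest(bad))
```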


@@ -0,0 +1,29 @@
National Center for Education Statistics,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6
"Table 10. Graduation rates within 100, 150, and 200 percent of normal program completion time at Title IV institutions among the students who started as full-time, first-time degree/certificate-seeking undergraduate students, by control of institution, degree or certificate sought, and level of institution: United States, cohort years 2010 and 2014",,,,,,
,,,,,,
,,,,,,
,,,,,,
,,,,,,
,,,,,,
Degree or certificate sought and level of institution,,,,,Private ,
,,All institutions,Public,,Nonprofit,For-profit
,,,,,,
Bachelor’s or equivalent degree-seeking students attending 4-year institutions and completing bachelor’s or equivalent degree (cohort year 2010),,,,,,
,,,,,,
Within 100 percent of normal program completion time,,40.9,35.7,,53.8,13
Within 150 percent of normal program completion time,,60,58.9,,65.9,20.3
Within 200 percent of normal program completion time,,62,61.5,,67,21.3
,,,,,,
Degree- or certificate-seeking students attending 2-year institutions and completing a degree or certificate (cohort year 2014),,,,,,
,,,,,,
Within 100 percent of normal program completion time,,18.5,14.5,,26.8,41
Within 150 percent of normal program completion time,,33.1,26.7,,62.2,63.5
Within 200 percent of normal program completion time,,37.8,32.3,,63.7,64.4
,,,,,,
Degree- or certificate-seeking students attending less-than-2-year institutions and completing a degree or certificate (cohort year 2014),,,,,,
,,,,,,
Within 100 percent of normal program completion time,,45.8,64.5,,58.3,42.4
Within 150 percent of normal program completion time,,69.3,73.6,,72.6,68.5
Within 200 percent of normal program completion time,,70.2,74.5,,73.1,69.5
"NOTE: Title IV institutions are those with a written agreement with the U.S. Department of Education that allows the institution to participate in any of the Title IV federal student financial assistance programs. United States includes the 50 states and the District of Columbia. The four U.S. service academies that are not Title IV eligible are included in the Integrated Postsecondary Education Data System (IPEDS) universe because they are federally funded and open to the public and are included in this table. The rates in this table reflect graduation rates at institutions regardless of the length of programs, unless otherwise indicated. The graduation rate was calculated as required for disclosure and reporting purposes under the Student Right-to-Know Act. This rate was calculated as the total number of completers within 100, 150, or 200 percent of normal time (e.g. “normal” program completion time for a bachelor’s degree would be 4 years) divided by the adjusted cohort (revised cohort minus any allowable exclusions). The revised cohort is the number of students entering the institution as full-time, first-time degree- or certificate-seeking undergraduates in the reference year. Allowable exclusions include those students who died or were totally and permanently disabled; students who left school to serve in the armed forces (or have been called up to active duty); those who left to serve with a foreign aid service of the federal government, such as the Peace Corps; and those who left to serve on official church missions. Definitions for terms used in this table may be found in the collection year’s archived downloadable glossary located at https://nces.ed.gov/ipeds/use-the-data/annual-survey-forms-packages-archived?year=2018.",,,,,,
"SOURCE: U.S. Department of Education, National Center for Education Statistics, IPEDS, Winter 2018–19, 200 Percent Graduation Rates component (final data).",,,,,,
@@ -0,0 +1,37 @@
observationDate,observationPeriod,value,unit,observationAbout,variableMeasured
2014,P4Y,40.9,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentBachelorsDegree
2014,P4Y,35.7,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PubliclyOwnedInstitute
2014,P4Y,53.8,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,13,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedForProfitInstitute
2014,P4Y,60,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentBachelorsDegree
2014,P4Y,58.9,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PubliclyOwnedInstitute
2014,P4Y,65.9,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,20.3,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedForProfitInstitute
2014,P4Y,62,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentBachelorsDegree
2014,P4Y,61.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PubliclyOwnedInstitute
2014,P4Y,67,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,21.3,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentBachelorsDegree_PrivatelyOwnedForProfitInstitute
2014,P4Y,18.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate
2014,P4Y,14.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PubliclyOwnedInstitute
2014,P4Y,26.8,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,41,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedForProfitInstitute
2014,P4Y,33.1,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate
2014,P4Y,26.7,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PubliclyOwnedInstitute
2014,P4Y,62.2,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,63.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedForProfitInstitute
2014,P4Y,37.8,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate
2014,P4Y,32.3,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PubliclyOwnedInstitute
2014,P4Y,63.7,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,64.4,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentAssociateDegreeOrCertificate_PrivatelyOwnedForProfitInstitute
2014,P4Y,45.8,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate
2014,P4Y,64.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PubliclyOwnedInstitute
2014,P4Y,58.3,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,42.4,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin100PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedForProfitInstitute
2014,P4Y,69.3,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate
2014,P4Y,73.6,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PubliclyOwnedInstitute
2014,P4Y,72.6,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,68.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin150PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedForProfitInstitute
2014,P4Y,70.2,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate
2014,P4Y,74.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PubliclyOwnedInstitute
2014,P4Y,73.1,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedNotForProfitInstitute
2014,P4Y,69.5,Percent,country/USA,dcid:GraduationRate_Student_CourseCompletedWithin200PercentOfNormalTime_EducationalAttainmentPostSecondaryCertificate_PrivatelyOwnedForProfitInstitute
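
Because completers within 150 percent of normal time include those within 100 percent, the rates in a group should normally not decrease from the 100 to the 150 to the 200 percent window. The sketch below is a hypothetical sanity check on an output CSV with the columns shown above. It is not part of the import tooling, and the grouping by variable name is an assumption based on the dcids in this test output.

```python
import csv
import io
from collections import defaultdict

WINDOWS = ("100", "150", "200")


def non_monotonic_groups(csv_text: str) -> list[str]:
    """List groups whose rates decrease as the completion window widens."""
    rates = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["variableMeasured"]
        for window in WINDOWS:
            token = f"Within{window}Percent"
            if token in name:
                # Group rows that differ only in the completion window.
                group = (row["observationDate"], name.replace(token, "WithinNPercent"))
                rates[group][window] = float(row["value"])
    flagged = []
    for (date, name), by_window in rates.items():
        values = [by_window[w] for w in WINDOWS if w in by_window]
        if values != sorted(values):
            flagged.append(f"{date} {name}")
    return flagged


prefix = "dcid:GraduationRate_Student_CourseCompleted"
suffix = "OfNormalTime_EducationalAttainmentBachelorsDegree"
sample = (
    "observationDate,observationPeriod,value,unit,observationAbout,variableMeasured\n"
    f"2014,P4Y,40.9,Percent,country/USA,{prefix}Within100Percent{suffix}\n"
    f"2014,P4Y,60,Percent,country/USA,{prefix}Within150Percent{suffix}\n"
    f"2014,P4Y,62,Percent,country/USA,{prefix}Within200Percent{suffix}\n"
)
print(non_monotonic_groups(sample))
```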
@@ -0,0 +1,8 @@
Node: E:graduation_rates_ipeds_output->E0
observationDate: C:graduation_rates_ipeds_output->observationDate
observationPeriod: C:graduation_rates_ipeds_output->observationPeriod
value: C:graduation_rates_ipeds_output->value
unit: C:graduation_rates_ipeds_output->unit
observationAbout: C:graduation_rates_ipeds_output->observationAbout
variableMeasured: C:graduation_rates_ipeds_output->variableMeasured
typeOf: dcs:StatVarObservation