Commit 4f262eb

mohanchen and abacus_fixer authored
add output-specification in doc (deepmodeling#7030)
Co-authored-by: abacus_fixer <mohanchen@pku.eud.cn>
1 parent 342de19 commit 4f262eb

2 files changed

Lines changed: 301 additions & 0 deletions

docs/advanced/output_files/index.rst

Lines changed: 1 addition & 0 deletions
@@ -5,5 +5,6 @@ Detailed Introduction of the Output Files
 .. toctree::
    :maxdepth: 1
 
+   output-specification
    running_scf.log

Lines changed: 300 additions & 0 deletions
@@ -0,0 +1,300 @@
# ABACUS Output File Specification

## 1. Background and Motivation

ABACUS is an integrated software package designed to provide a cohesive user experience. To achieve this goal, **we are establishing unified output file standards** that all developers should follow.

Recent versions of ABACUS have been standardizing all output file naming conventions. This work is ongoing; if an actual file name differs from this document, please refer to the latest documentation.

All output file naming conventions can be found in the online documentation (with corresponding 3.10-LTS file names): [ABACUS Input Documentation](https://abacus.deepmodeling.com/en/latest/advanced/input_files/input-main.html)

## 2. File Naming Conventions

### 2.1 Basic Rules

**Rule 1:** All ABACUS output files are stored in the `OUT.{suffix}/` directory.

**Rule 2:** File extensions by category:
| Extension | Description |
|-----------|-------------|
| `.txt` | Text file |
| `.dat` | Binary file |
| `.csr` | Sparse matrix format |
| `.cube` | 3D spatial data format |

**Rule 3:** For special output quantities (e.g., wavefunctions), add `_pw` or `_nao` to distinguish between plane wave basis and numerical atomic orbital basis.

**Rule 4:** File names are lowercase. Physical quantities appear at the beginning:

| Prefix | Physical Quantity |
|--------|-------------------|
| `chg` | Charge density |
| `pot` | Potential |
| `eig` | Eigenvalue / Energy level |
| `wf` | Wavefunction |
| `dm` | Density matrix |
| `h` | Hamiltonian matrix H |
| `s` | Overlap matrix S |
| `t` | Kinetic energy operator |
| `r` | Position operator or Bravais lattice vector R |
| `k` | k-point in Brillouin zone |
| `xyz` | Three spatial directions |
| `ini` | Initial state (before electronic iteration) |

**Rule 5:** Suffixes following the physical quantity:

| Suffix | Meaning |
|--------|---------|
| `s1`, `s2`, `s3`, `s4` | Spin channel (1, 2 for collinear; 1, 2, 3, 4 for non-collinear with SOC) |
| `s12` | Non-collinear spin calculation |
| `k#` | k-point index (e.g., `k1`, `k2`) |
| `g#` | Ionic step index for relax/md (e.g., `g1`, `g2`) |

**Important:**
- All index numbers start from 1 (not 0)
- For Gamma-only algorithm in LCAO, no `k` index is included
- Overlap matrix `s` does not distinguish spin, so only one matrix is output

### 2.2 Examples

| File Name | Interpretation |
|-----------|----------------|
| `chgs1.cube` | Charge density, spin 1 |
| `chgs2.cube` | Charge density, spin 2 |
| `chgs3.cube` | Charge density, spin 3 (non-collinear with SOC) |
| `pots1.cube` | Local potential, spin 1 |
| `eig_occ.txt` | Eigenvalues and occupations |
| `doss1g1_nao.txt` | DOS, spin 1, geometry step 1, NAO basis |
| `wf_pw.dat` | Wavefunction, plane wave basis |
| `sr.csr` | Overlap matrix in real space (no spin index) |

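The rules and examples above can be read as a small composition recipe: quantity prefix, then optional spin, k-point and ionic-step indices, then an optional basis tag, then the extension. The sketch below illustrates that reading. `build_output_name` is a hypothetical helper, not an ABACUS function, and the position of the `k#` suffix between `s#` and `g#` is an assumption based only on the order in which Rule 5 lists the suffixes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of ABACUS): composes an output file name
// from the pieces described in Rules 2-5. An index of 0 means "omit".
std::string build_output_name(const std::string& quantity, // e.g. "chg", "dos", "wf"
                              int spin,                    // 1-based spin channel, 0 = none
                              int kpoint,                  // 1-based k-point index, 0 = none
                              int geom_step,               // 1-based ionic step, 0 = none
                              const std::string& basis,    // "pw", "nao", or "" for none
                              const std::string& ext)      // "txt", "dat", "csr", "cube"
{
    // Indices start from 1, so negative values are never valid.
    assert(spin >= 0 && kpoint >= 0 && geom_step >= 0);

    std::string name = quantity;
    if (spin > 0)       name += "s" + std::to_string(spin);
    if (kpoint > 0)     name += "k" + std::to_string(kpoint); // assumed position
    if (geom_step > 0)  name += "g" + std::to_string(geom_step);
    if (!basis.empty()) name += "_" + basis;
    return name + "." + ext;
}
```

With these assumptions, `build_output_name("dos", 1, 0, 1, "nao", "txt")` reproduces the `doss1g1_nao.txt` entry from the table, and passing `kpoint = 0` corresponds to the Gamma-only case where no `k` index is written.
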
### 2.3 Common Output Files

| File Name | Description |
|-----------|-------------|
| `running_scf.log` | SCF iteration log |
| `dos.txt` | Density of states |
| `eig_occ.txt` | Eigenvalues and occupations |
| `mulliken.txt` | Mulliken population analysis |
| `band.txt` | Band structure |
| `chgs1.cube`, `chgs2.cube` | Charge density (spin 1, spin 2) |
| `chg.cube` | Total charge density |
| `taus1.cube`, `taus2.cube` | Kinetic energy density (tau) |
| `pots1.cube`, `pots2.cube` | Local potential |

## 3. File Format Standards

### 3.1 Header Section with Comments

**Every output file should include `#` comment lines** to explain:
- Data meaning
- Units
- Source module
- Key parameters

```
# <description of file content>
# Module: <source module name>
# Units: <unit information>
<value> # <description>
# <column headers with units>
```

**Example:**
```
1 # ionic step
8207 # number of points
# Module: DOS calculation
# Units: energy in eV, DOS in states/eV
# energy elec_states sum_states states_smear sum_states
```

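One practical benefit of the `#` convention is that downstream tools can drop comments with a single rule, whether a line is a pure comment or a value followed by a trailing description. A minimal sketch of such a reader-side helper follows. `strip_comment` is a hypothetical name introduced here for illustration and is not part of ABACUS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reader-side helper: returns the part of a line before the
// first '#', with surrounding whitespace removed. Pure comment lines yield "".
std::string strip_comment(const std::string& line)
{
    const std::string ws = " \t\r\n";
    // find() returns npos when there is no '#', so substr keeps the whole line.
    const std::string data = line.substr(0, line.find('#'));
    assert(data.find('#') == std::string::npos); // nothing after '#' survives

    const std::size_t first = data.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = data.find_last_not_of(ws);
    return data.substr(first, last - first + 1);
}
```

Applied to the example header, `strip_comment("8207 # number of points")` leaves `8207`, while `strip_comment("# Module: DOS calculation")` leaves an empty string that a reader can skip.
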
### 3.2 Data Section

| Requirement | Description |
|-------------|-------------|
| **Separator** | Use spaces for column alignment |
| **Precision** | Controlled by input parameter (see Section 3.4) |
| **Units** | Always specify units in header or column names |
| **Comments** | Use `#` for comment lines |

### 3.3 Compact Data Layout

**Avoid a sparse layout with a single value per line.** Instead, output 6-8 values per line with proper alignment:

**Bad (sparse):**
```
1.234567
2.345678
3.456789
4.567890
```

**Good (compact):**
```
1.234567 2.345678 3.456789 4.567890 5.678901 6.789012
7.890123 8.901234
```

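The compact layout can be produced with a short loop that breaks the line after every N-th value and once more after the final value. The sketch below is one way to do this. `format_compact` and its parameters are hypothetical names, and it uses `std::fixed` so that columns line up, which means `precision` counts digits after the decimal point here rather than significant digits.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats values in fixed notation, `per_line` values
// per row, each right-aligned in a field of `width` characters.
std::string format_compact(const std::vector<double>& values,
                           int per_line, int width, int precision)
{
    assert(per_line > 0);
    std::ostringstream os;
    os << std::fixed << std::setprecision(precision);
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        os << std::setw(width) << values[i];
        const bool end_of_row = (i + 1) % static_cast<std::size_t>(per_line) == 0;
        const bool last_value = (i + 1 == values.size());
        if (end_of_row || last_value) os << '\n'; // exactly one newline per row
    }
    return os.str();
}
```

Calling it with eight values and `per_line = 6` gives one full row followed by a row of two, as in the "Good" layout above. An empty input produces an empty string rather than a stray blank line.
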
### 3.4 Precision Control via Input Parameters

Output precision should be controllable via input parameters, similar to `out_chg` and `out_pot`:

**Example (from `out_pot`):**
```
out_pot 1 8
```
- First integer: output type (1 = output total local potential)
- Second integer: precision (number of significant digits, default 8)

**Implementation pattern:**
```cpp
// In input parameters
int out_type = 0;      // output type
int out_precision = 8; // precision

// In output function (std::setprecision is declared in <iomanip>)
ofs << std::setprecision(out_precision) << value;
```

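To illustrate the two-field convention, the sketch below parses the value part of a line such as `out_pot 1 8` into a type and a precision, falling back to the default of 8 when the second integer is missing. `OutSetting` and `parse_out_setting` are hypothetical names, this is not how the ABACUS input reader is implemented, and treating type `0` as "disabled" is an assumption.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for a two-field output switch such as `out_pot 1 8`.
struct OutSetting
{
    int type = 0;      // assumed: 0 means the output is disabled
    int precision = 8; // default precision stated in the text above
};

// Parses the value part of the input line (e.g. "1 8" or "1").
// A missing or non-positive precision falls back to `default_precision`.
OutSetting parse_out_setting(const std::string& value, int default_precision = 8)
{
    assert(default_precision > 0);
    OutSetting setting;
    setting.precision = default_precision;

    std::istringstream iss(value);
    if (!(iss >> setting.type))
    {
        setting.type = 0; // no readable type: keep the output disabled
        return setting;
    }
    int precision = 0;
    if (iss >> precision && precision > 0) setting.precision = precision;
    return setting;
}
```

Under these assumptions, `parse_out_setting("1")` yields type 1 with precision 8, and `parse_out_setting("2 10")` yields type 2 with precision 10.
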
## 4. Output Volume Control

### 4.1 Reduce File Size

Current integration tests have output files with tens of thousands of lines. Recommendations:

| Issue | Solution |
|-------|----------|
| Too many output files | Consolidate related data into fewer files |
| Files too large | Reduce output frequency, use compact format |
| Redundant information | Avoid repeating header information in every block |

### 4.2 Balance Test Coverage and Efficiency

- Output only essential data for integration tests
- Use `out_level` parameter to control verbosity
- Consider binary format for large datasets

## 5. Naming Conventions for Physical Quantities

### 5.1 Standard Names (Keep Short: 3-8 Characters)

| Physical Quantity | Recommended Name | Unit |
|-------------------|------------------|------|
| Total energy | `etot` | eV |
| Kinetic energy | `ekin` | eV |
| Potential energy | `epot` | eV |
| Force | `force` | eV/Angstrom |
| Stress | `stress` | kBar |
| Charge | `chg` | e |
| Magnetization | `mag` | μB |
| Band index | `n` | - |
| k-point | `kpt` | - |
| Spin | `spin` | - |
| Occupation | `occ` | - |
| Energy level | `eig` | eV |

### 5.2 Naming Style

- Use `snake_case` (lowercase with underscores)
- **Keep names short (3-8 characters)**
- Avoid abbreviations unless widely understood

## 6. Developer Checklist

Before adding a new output file or modifying an existing one, verify:

- [ ] **File name is short (3-8 characters)**
- [ ] File name follows naming conventions (Section 2)
- [ ] File is stored in `OUT.{suffix}/` directory
- [ ] File numbering starts from 1 (not 0)
- [ ] No redundant keywords (e.g., `data`) in filename
- [ ] Use correct extension (`.txt`, `.dat`, `.csr`, `.cube`)
- [ ] Add `_pw` or `_nao` for basis-specific outputs
- [ ] Header section includes `#` comment lines with:
  - [ ] Data meaning
  - [ ] Units
  - [ ] Source module
- [ ] Column headers include units
- [ ] Data uses compact format (6-8 values per line, not single value per line)
- [ ] Output precision is controllable via input parameter
- [ ] File size is reasonable (avoid tens of thousands of lines)
- [ ] Physical quantities use standard names (Section 5)
- [ ] Documentation is updated in `docs/` directory

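Some of the file-name items in this checklist are mechanical enough to check automatically. The sketch below covers only three of them (lowercase name, one of the four listed extensions, no redundant `data` keyword). `is_valid_output_name` and `check_examples` are hypothetical names, not existing ABACUS utilities, and the length rule is deliberately left out because the text does not say whether suffixes and extensions count toward the 3-8 characters.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical lint helper covering a subset of the checklist:
// lowercase name, known extension, and no redundant "data" keyword.
bool is_valid_output_name(const std::string& name)
{
    const std::size_t dot = name.rfind('.');
    if (dot == std::string::npos || dot == 0) return false; // no stem or no extension
    const std::string stem = name.substr(0, dot);
    const std::string ext = name.substr(dot);

    const bool known_ext =
        ext == ".txt" || ext == ".dat" || ext == ".csr" || ext == ".cube";
    const bool lowercase = std::none_of(name.begin(), name.end(),
        [](unsigned char c) { return std::isupper(c) != 0; });
    const bool no_data_keyword = stem.find("data") == std::string::npos;
    return known_ext && lowercase && no_data_keyword;
}

// Usage sketch with names taken from the examples in Section 2.
void check_examples()
{
    assert(is_valid_output_name("chgs1.cube"));
    assert(is_valid_output_name("doss1g1_nao.txt"));
    assert(!is_valid_output_name("chg_data.txt")); // redundant keyword
}
```

Note that a log file such as `running_scf.log` is rejected by this sketch because `.log` is not among the four extensions listed in Rule 2. A real check would need an explicit exception for log files.
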
## 7. Code Implementation Guidelines

### 7.1 Output Function Template

```cpp
#include <fstream>
#include <iomanip>
#include <string>

// ionic_step, num_data, energy and dos are assumed to be members of OutputMyData.
void OutputMyData::write(const std::string& filename, int precision)
{
    std::ofstream ofs(filename);

    // Header section with comments
    ofs << "# Density of states" << std::endl;
    ofs << "# Module: DOS" << std::endl;
    ofs << "# Units: energy in eV, DOS in states/eV" << std::endl;
    ofs << ionic_step << " # ionic step" << std::endl;
    ofs << num_data << " # number of data points" << std::endl;
    // Column labels for one (energy, dos) pair; each data line holds three such pairs
    ofs << "#";
    ofs << std::setw(15) << "energy(eV)";
    ofs << std::setw(15) << "dos";
    ofs << std::endl;

    // Data section - compact format (6 values per line)
    ofs << std::setprecision(precision);
    for (int i = 0; i < num_data; ++i)
    {
        ofs << std::setw(15) << energy[i];
        ofs << std::setw(15) << dos[i];
        if ((i + 1) % 3 == 0) ofs << std::endl; // 3 pairs = 6 values per line
    }
    // Terminate the last line if it holds fewer than 3 pairs
    if (num_data % 3 != 0) ofs << std::endl;

    ofs.close();
}
```

### 7.2 Key Points

1. **Keep file names short (3-8 characters)**
2. **Add `#` comment lines for data meaning, units, and source module**
3. **Use compact format: 6-8 values per line**
4. **Make precision controllable via input parameter**
5. **Use `std::setw()` for column alignment**

## 8. Existing Good Examples

| File | Location | Strengths |
|------|----------|-----------|
| `chgs1.cube` | `tests/03_NAO_multik/scf_out_chg_tau/OUT.autotest/` | Short name, spin index convention |
| `dos.txt` | `tests/03_NAO_multik/scf_out_dos_spin2/OUT.autotest/` | Clear header with metadata |
| `eig_occ.txt` | Same as above | Clear block structure for spin/k-point |
| `mulliken.txt` | `tests/03_NAO_multik/scf_out_mul/OUT.autotest/` | Clear atom separators |

## 9. Review Process

For new output formats:

1. Check if similar output already exists - reuse format if possible
2. Follow this specification
3. Add documentation in `docs/advanced/output_files/`
4. Submit PR with output sample for review

## 10. Summary

| Aspect | Requirement |
|--------|-------------|
| Directory | All outputs in `OUT.{suffix}/` |
| File name | **Short (3-8 chars)**, lowercase, physical quantity prefix |
| Extensions | `.txt`, `.dat`, `.csr`, `.cube` |
| Basis suffix | `_pw` or `_nao` for basis-specific outputs |
| Spin index | `s1`, `s2`, `s3`, `s4` (SOC), `s12` (non-collinear) |
| Numbering | Start from 1, not 0 |
| Header | `#` comments with meaning, units, source module |
| Data | **Compact: 6-8 values per line**, space-separated |
| Precision | Controllable via input parameter |
| Volume | Reasonable file size, avoid excessive output |

**Remember: ABACUS outputs should present a unified, professional interface to users.**
