We have provided several useful validation tools for structures determined by X-ray crystallography. These checks should be performed before you submit your files to pdb_extract for annotation, but should not serve as a replacement for complete validation.
Three popular programs (sfcheck, refmac, phenix.model_vs_data) are used to check that your structure factors and coordinates match. These checks are static: no rounds of refinement are carried out, and only statistics are generated. Information relevant to these calculations (such as TLS, NCS, coordinates, ADP, occupancy) in your PDB/mmCIF file is used directly for validation.
For neutron and X-ray hybrid methods, you need to input the structure factor file in mmCIF format. The mmCIF file must have two data blocks, the first for X-ray and the second for neutron. If your file is in MTZ format, you need to use the tool to convert it to mmCIF format first. The semi-automatic conversion is preferred.
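Before uploading a hybrid structure factor file, it can help to confirm locally that it really contains two data blocks. The following is a minimal sketch in Python, not part of SF-Tool. The function name list_data_blocks and the block names in the example are hypothetical, and the scan is a simple line check that does not handle multi-line text fields whose content happens to start with "data_".

```python
def list_data_blocks(cif_text: str) -> list[str]:
    """Return the names of the data blocks in an mmCIF text, in file order."""
    names = []
    for line in cif_text.splitlines():
        # A data block header is a line starting with "data_" (case-insensitive).
        if line[:5].lower() == "data_":
            names.append(line.split()[0][5:])
    return names


# Hypothetical two-block file: first block X-ray, second block neutron.
example = """data_xray_sf
_cell.length_a 50.0
data_neutron_sf
_cell.length_a 50.0
"""

blocks = list_data_blocks(example)
print(blocks)
print("two data blocks" if len(blocks) == 2 else "unexpected number of data blocks")
```

The check only counts blocks; it cannot tell which block holds X-ray data and which holds neutron data, so the order still has to be verified by hand.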
HOW TO RUN
If your structure factor file is not in mmCIF format, this tool will convert it to mmCIF format for processing.
The sfcheck tool will return the following report:
Sf-convert will automatically convert your structure factor file to any of the following formats: mmCIF, MTZ, CNS/CNX, XPLOR, SHELX, TNT, HKL2000, SCALEPACK, D*Trek, SAINT, or generic OTHER format.
You can download the desktop version of sf-convert here.
To convert a structure factor file using the online version, input your Structure Factor File into the SF-Tool form. You will also need to provide your coordinate file if you wish to convert to the MTZ format, since that format also stores the unit cell parameters (a, b, c, alpha, beta, gamma), symmetry, and resolution information.
TO CONVERT YOUR STRUCTURE FACTOR FILES:
If you wish to convert TNT, SHELX, or OTHER SF files to another format, you must indicate whether the data have been calculated as amplitudes (F) or intensities (I).
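The F or I choice matters because the two quantities are on different scales: an intensity is the square of the corresponding amplitude. As an illustration only, the sketch below applies the standard relation I = F^2 with first-order error propagation, SigI = 2 * |F| * SigF. The function name is hypothetical, and this is not necessarily the calculation sf-convert performs internally.

```python
def amplitude_to_intensity(f: float, sig_f: float) -> tuple[float, float]:
    """Convert an amplitude and its sigma to an intensity and its sigma.

    Uses I = F^2 and first-order error propagation, SigI = 2 * |F| * SigF.
    """
    intensity = f * f
    sig_i = 2.0 * abs(f) * sig_f
    return intensity, sig_i


print(amplitude_to_intensity(10.0, 0.5))
```

Labelling amplitudes as intensities (or the reverse) would therefore distort every value in the converted file, which is why the tool asks for this input when the format does not state it.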
Most reflection/structure factor data are organized into (at minimum) 5 columns (see examples). The default column assignments used by most programs are:
H K L F(or I) SigF(or SigI or status) ...and a floating Free_R flag column
with at least one space in between each column. These column labels might not be displayed, depending on the program's particular format, but would be displayed in an mtz_dump output.
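The default layout above can be read with a few lines of Python. This is a rough sketch under stated assumptions: the Reflection type and parse_reflection function are hypothetical names, the fifth column is assumed to be a numeric sigma (not a status code), and the optional sixth column is assumed to be a numeric Free_R flag.

```python
from typing import NamedTuple, Optional


class Reflection(NamedTuple):
    h: int
    k: int
    l: int
    value: float               # F or I
    sigma: float               # SigF or SigI (assumed numeric here)
    free_flag: Optional[int]   # Free_R flag, if the column is present


def parse_reflection(line: str) -> Reflection:
    """Parse one whitespace-separated reflection record."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}")
    h, k, l = (int(x) for x in fields[:3])
    value, sigma = float(fields[3]), float(fields[4])
    # The Free_R flag column is optional; accept "1" as well as "1.0".
    free_flag = int(float(fields[5])) if len(fields) > 5 else None
    return Reflection(h, k, l, value, sigma, free_flag)


print(parse_reflection("1 0 -2 123.4 5.6 1"))
print(parse_reflection("  3   4   5   88.0   2.1"))
```

Because the split is on any run of whitespace, the sketch accepts both single-space and column-aligned files, matching the "at least one space" rule.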
An automatic conversion is used to convert one standardized SF format to another. If you are using one of the accepted formats (column labels, etc.), and you have made no manual changes to the structure factor file format yourself, then this method will work for you. Our tool is programmed to recognize which columns correspond to each parameter (depending on the program's format) and "automatically" converts the data to the new format accordingly.
However, if you used a novel program whose SF file format does not match defined criteria, or you have manually changed the column labels, then your file format will not be recognized by sf-convert. In this case, a semi-automatic conversion is necessary which will require some additional input from the user. A table will be displayed where you must match the appropriate SF parameters with your column labels. This tool will then perform a complete DATA TYPE conversion from CNS or MTZ format to the mmCIF format, rather than only converting the essential data as in the automatic conversion.
During conversion, you can also set aside 5, 8, or 10% of the reflection data for cross-validation (Free_R factor). This can be either a new Free_R selection or a repopulation of the reflection list. To do this, write the percentage as a whole number in the provided box. If you want to leave the current Free_R flags untouched, leave this box empty.
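To illustrate what setting aside a percentage of reflections means, the sketch below marks a random subset of a reflection list as the cross-validation set. It is an assumption-laden example, not the selection procedure sf-convert uses: the function name, the fixed random seed, and the convention that 1 marks a free reflection and 0 a working reflection are all hypothetical (flag conventions differ between programs).

```python
import random


def assign_free_flags(n_reflections: int, percent: int, seed: int = 0) -> list[int]:
    """Return one flag per reflection: 1 = free (test) set, 0 = working set."""
    if percent not in (5, 8, 10):
        raise ValueError("percent must be 5, 8, or 10")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_free = round(n_reflections * percent / 100)
    free_indices = set(rng.sample(range(n_reflections), n_free))
    return [1 if i in free_indices else 0 for i in range(n_reflections)]


flags = assign_free_flags(1000, 10)
print(sum(flags), "of", len(flags), "reflections flagged as free")
```

Whatever selection is used, the same free set should be kept throughout refinement, which is why the tool leaves existing Free_R flags untouched when the box is empty.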
Questions, comments, and suggestions should be sent to deposit@deposit.rcsb.org.