STAR File

This module defines functions for parsing STAR (Self-defining Text Archiving and Retrieval) files. These include metadata files from cryo-EM image analysis programs such as RELION and XMIPP, as well as the Crystallographic Information File (CIF) format used for much of the PDB.

class StarDict(parsingDict, prog, title='unnamed', indices=None)
getDict()
getTitle()
numDataBlocks()
pop(index)

Pops the dataBlock with the given index from the list of dataBlocks in the StarDict.

printData()
search(substr, return_indices=False)
setTitle(value)
class StarDataBlock(starDict, key, indices=None)
getDict()
getLoop(index)
getTitle()
numEntries()
numLoops()
pop(index)

Pops the loop with the given index from the list of loops in the dataBlock.

printData()
search(substr, return_indices=False)
setTitle(title)
class StarLoop(dataBlock, key, indices=None)
getData(key)
getDict()
getTitle()
numFields()
numRows()
printData()
search(substr, return_indices=False)
setTitle(title)
parseSTAR(filename, **kwargs)

Returns a dictionary containing data parsed from a STAR file.

Parameters:
  • filename (str) – a filename. The .star extension can be omitted.
  • start (int, None) – line number at which to start parsing. Default is None, meaning start at the beginning.
  • stop (int, None) – line number at which to stop parsing. Default is None, meaning don’t stop.
  • shlex (bool) – whether to use shlex for splitting lines so as to preserve quoted substrings. Default is False.
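
The effect of the shlex option can be illustrated with the standard library alone. The following is a minimal sketch, not the code used by parseSTAR; the helper name split_star_line and the sample line are made up for illustration.

```python
import shlex


def split_star_line(line, use_shlex=False):
    """Split one line of STAR data into tokens (hypothetical helper)."""
    if use_shlex:
        # shlex.split keeps a quoted substring together as a single token
        # and drops the surrounding quotes.
        return shlex.split(line)
    # Plain whitespace splitting breaks a quoted substring into pieces.
    return line.split()


line = "1 'my micrograph.mrc' 0.5"
print(split_star_line(line))                  # quoted name split in two
print(split_star_line(line, use_shlex=True))  # quoted name kept whole
```

Plain splitting is the cheaper choice when no field contains spaces, which may be why the option is off by default.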
writeSTAR(filename, starDict, **kwargs)

Writes a STAR file from a dictionary containing data such as that parsed from a RELION STAR file.

Parameters:
  • filename (str) – a filename. The .star extension can be omitted.
  • starDict (dict) – a dictionary in STAR format. This should have nested entries starting with data blocks, then loops/tables, then field names, and finally data.

Keyword arguments can be given, including the program style to follow (prog).
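
The nesting described for starDict (data blocks, then loops/tables, then field names, then data) might look like the following. This is a sketch under assumptions: the key names "fields" and "data", the integer loop key, and the function format_star are hypothetical and are not taken from this module; only the nesting order comes from the description above.

```python
# Hypothetical layout: data block -> loop -> field names and row data.
star_like = {
    "particles": {                      # data block
        0: {                            # loop/table
            "fields": ["_rlnCoordinateX", "_rlnCoordinateY"],
            "data": [
                {"_rlnCoordinateX": "10.0", "_rlnCoordinateY": "20.0"},
                {"_rlnCoordinateX": "30.5", "_rlnCoordinateY": "40.5"},
            ],
        }
    }
}


def format_star(nested):
    """Render the assumed nested layout as STAR-style text."""
    lines = []
    for block_name, loops in nested.items():
        lines.append("data_" + block_name)
        lines.append("")
        for loop in loops.values():
            lines.append("loop_")
            lines.extend(loop["fields"])
            for row in loop["data"]:
                # One text row per entry, values in field order.
                lines.append(" ".join(row[f] for f in loop["fields"]))
            lines.append("")
    return "\n".join(lines)


text = format_star(star_like)
print(text)
```

The exact output style of writeSTAR depends on the prog setting, which this sketch does not model.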

parseImagesFromSTAR(particlesSTAR, **kwargs)

Parses particle images using data from a STAR file containing information about them.

Parameters:
  • particlesSTAR (str) – a filename for a STAR file.
  • block_indices (list, ndarray) – indices for data blocks containing rows corresponding to images of interest. The indexing scheme is similar to that for numpy arrays. Default behavior is to use all data blocks about images.
  • row_indices (list, ndarray) – indices for rows corresponding to images of interest. The indexing scheme is similar to that for numpy arrays. row_indices should be a 1D or 2D array-like. 2D row_indices should contain an entry for each relevant loop. If a 1D array-like is given, the same row indices will be applied to all loops. Default behavior is to use all rows about images.
  • particle_indices (list, ndarray) – indices for particles regardless of STAR structure. Default is to take all particles. Please note: this acts after block_indices and row_indices.
  • saveImageArrays (bool) – whether to save the numpy array for each image to file. Default is False.
  • saveDirectory (str) – directory where numpy image arrays are saved. Default is None, which means save to the current working directory.
  • rotateImages (bool) – whether to apply in-plane translations and rotations using the provided psi and origin data. Default is True.
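
The interplay of row_indices and particle_indices described above can be sketched in plain Python. This is an illustration of the stated semantics only, not the implementation in parseImagesFromSTAR; the function select_rows and the list-of-lists representation of loops are assumptions.

```python
def select_rows(loops, row_indices=None, particle_indices=None):
    """Select rows per loop, then particles from the flattened result.

    loops is a list with one inner list of rows per loop (assumed layout).
    """
    if row_indices is None:
        # Default: use every row of every loop.
        per_loop = [list(range(len(rows))) for rows in loops]
    elif row_indices and isinstance(row_indices[0], (list, tuple)):
        # 2D case: one entry of row indices per loop.
        if len(row_indices) != len(loops):
            raise ValueError("2D row_indices needs one entry per loop")
        per_loop = [list(idx) for idx in row_indices]
    else:
        # 1D case: the same row indices are applied to all loops.
        per_loop = [list(row_indices) for _ in loops]

    selected = [rows[i] for rows, idx in zip(loops, per_loop) for i in idx]

    # particle_indices acts after the row selection, ignoring loop structure.
    if particle_indices is not None:
        selected = [selected[i] for i in particle_indices]
    return selected


loops = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
print(select_rows(loops, row_indices=[0, 2]))
print(select_rows(loops, row_indices=[[0], [1, 2]]))
print(select_rows(loops, row_indices=[0, 2], particle_indices=[1, 3]))
```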
parseSTARSection(lines, key, report=True)

Parses the section of data in lines from a STAR file that corresponds to a key (the part before the dot). The section can be a loop or a data block.

Returns data encapsulated in a list and the associated fields.

Parameters:
  • report (bool) – whether to report warnings about not finding data. Default is True.
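
To make "the part before the dot" concrete, here is a minimal sketch that collects a loop-style section for one key. It is not the parseSTARSection implementation: the function name sketch_parse_section, the assumption that the key includes the leading underscore, and the sample lines are all hypothetical, and only loop-style sections are handled.

```python
def sketch_parse_section(lines, key):
    """Collect fields and rows for names of the form key.field (loop style)."""
    fields, data = [], []
    in_section = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(key + "."):
            # The field name is the part after the dot.
            fields.append(stripped.split(".", 1)[1].split()[0])
            in_section = True
        elif in_section:
            # A blank line, comment, new field name, or new loop ends the section.
            if not stripped or stripped.startswith(("#", "_", "loop_")):
                break
            data.append(dict(zip(fields, stripped.split())))
    return data, fields


lines = [
    "loop_",
    "_atom_site.id",
    "_atom_site.type_symbol",
    "1 N",
    "2 C",
    "#",
    "_cell.length_a 10.0",
]
data, fields = sketch_parse_section(lines, "_atom_site")
print(fields)
print(data)
```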