annotations.precomputed
- ngsidekick.annotations.precomputed.write_precomputed_annotations(df, coord_space, annotation_type, properties=(), relationships=(), output_dir='annotations', write_sharded=True, *, write_by_id=True, write_by_relationship=True, write_by_spatial_chunk=True, num_spatial_levels=7, target_chunk_limit=10000, shuffle_before_assigning_spatial_levels=True, description='')
Export the data from a pandas DataFrame into neuroglancer’s precomputed annotations format as described in the neuroglancer spec.
A progress bar is shown when writing each portion of the export (annotation ID index, related ID indexes), but there may be a significant amount of preprocessing time that occurs before the actual writing begins.
Note
Internally, the data will be copied during processing and again during writing, incurring significant RAM usage for large datasets. To save at least some RAM, you can wrap your dataframe in a TableHandle and then delete your own reference to the dataframe before calling this function. The TableHandle’s reference will be deleted internally as soon as possible (after the data is transformed for writing, before this function returns).
- Parameters:
df (DataFrame | TableHandle) – DataFrame or TableHandle. The index of the DataFrame is used as the annotation ID, so it must be unique. The required columns depend on the annotation_type and the coordinate space. For example, assuming coord_space.names == ['x', 'y', 'z'], provide the following columns:
- For point annotations, provide ['x', 'y', 'z'].
- For line annotations or axis_aligned_bounding_box annotations, provide ['xa', 'ya', 'za', 'xb', 'yb', 'zb'].
- For ellipsoid annotations, provide ['x', 'y', 'z', 'rx', 'ry', 'rz'] for the center point and radii.
You may also provide additional columns to use as annotation properties, in which case their column names should be listed in the 'properties' argument. (See below.)
If you provide a TableHandle, the handle's reference will be unset before this function returns, deleting your data if you didn't retain a reference to it yourself. (If you do retain a reference, it defeats the point of using a TableHandle in the first place.)
coord_space (CoordinateSpace | str | list[str] | dict[str, list]) – neuroglancer.coordinate_space.CoordinateSpace or equivalent. The coordinate space of the annotations. Among other things, this determines which input columns represent the annotation geometry. For convenience, a few different formats are accepted for the coordinate space, assuming a default scale of 1 nm if no scales/units are provided.
Examples (all equivalent):
>>> coord_space = "xyz"
>>> coord_space = ['x', 'y', 'z']
>>> coord_space = {"names": ['x', 'y', 'z']}
>>> coord_space = {
...     "names": ['x', 'y', 'z'],
...     "units": ['nm', 'nm', 'nm'],
...     "scales": [1, 1, 1]
... }
>>> coord_space = CoordinateSpace(
...     names=['x', 'y', 'z'],
...     scales=[1.0, 1.0, 1.0],
...     units=['nm', 'nm', 'nm']
... )
annotation_type (Literal['point', 'line', 'ellipsoid', 'axis_aligned_bounding_box']) – The type of annotation to export. Note that the columns you provide in the DataFrame depend on the annotation type.
properties (list[str] | list[AnnotationPropertySpec] | dict[str, AnnotationPropertySpec] | list[dict]) – If your dataframe contains columns for annotation properties, list the names of those columns here.
Categorical columns will be automatically converted to integers with associated enum labels.
To provide an rgb or rgba property such as 'mycolor', provide separate columns in your dataframe named 'mycolor_r', 'mycolor_g', 'mycolor_b' (and 'mycolor_a'), and then include 'mycolor' in the properties list here.
The full property spec for each property will be inferred from the column dtype, but if you want to explicitly override any property specs yourself, you can pass a list of AnnotationPropertySpec objects here instead of just listing column names.
Property names must start with a lowercase letter and may contain only letters, numbers, and underscores.
relationships (list[str]) – If your annotations have related segment IDs, such relationships can be provided in the columns of your DataFrame. Each relationship should be listed in a single column, whose values are lists of segment IDs. In the special case where each annotation has exactly one related segment, the column may have dtype=np.uint64 instead of containing lists.
output_dir (str) – The directory into which the exported annotations will be written. Subdirectories will be created for the "annotation ID index" and each "related object ID index" as needed.
write_sharded (bool) – Whether to write the output as sharded files. The sharded format is preferable for most use cases. Without sharding, every annotation results in a separate file in the annotation ID index. Similarly, every related ID results in a separate file in the related ID index.
write_by_id (bool) – Whether to write the annotations to the "Annotation ID Index". If False, skip writing.
write_by_relationship (bool) – Whether to write the relationships to the "Related Object ID Index". If False, skip writing.
write_by_spatial_chunk (bool) – Whether to write the spatial index.
num_spatial_levels (int) – The maximum number of spatial index levels to write. If not all levels are needed (because all annotations fit within the first N levels), then the actual number of levels written will be less than this value.
target_chunk_limit (int) – For the spatial index, this is how many annotations we aim to place in each chunk (regardless of the level). If there are more annotations than fit within the specified num_spatial_levels while (approximately) adhering to the target_chunk_limit at each level, then the extra annotations will be assigned to the last level.
Note
Instead of specifying a valid limit here, you can disable subsampling in neuroglancer by setting this to the special value of 0. In our implementation, this is only valid when num_spatial_levels=1.
shuffle_before_assigning_spatial_levels (bool) – Whether to shuffle the annotations before assigning spatial levels. By default, we shuffle the annotations to avoid any bias in the spatial assignment, which is what the neuroglancer spec recommends. However, in some use cases a bias may be desirable (e.g. deliberately preferring to show larger annotations when zoomed out). So if this is False, the annotations will be assigned to spatial levels in the order they appear in the input dataframe, with earlier annotations assigned to coarser spatial levels.
description (str) – A description of the annotation collection.
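The column rules described for df, coord_space and properties can be restated as a small pre-flight check. The sketch below is not part of ngsidekick: normalize_coord_space, geometry_columns, color_columns and is_valid_property_name are hypothetical helpers. It assumes that the 'a'/'b' and 'r' naming shown for ['x', 'y', 'z'] generalizes to other axis names, that "letters" means ASCII letters, and it does not handle CoordinateSpace objects.

```python
import re

# Hypothetical helpers (not part of ngsidekick) restating the documented rules.
_PROPERTY_NAME = re.compile(r"[a-z][A-Za-z0-9_]*")


def normalize_coord_space(coord_space):
    """Expand the str/list/dict shorthands into names, units and scales."""
    if isinstance(coord_space, dict):
        names = list(coord_space["names"])
        units = list(coord_space.get("units", ["nm"] * len(names)))
        scales = list(coord_space.get("scales", [1] * len(names)))
    else:
        # A string such as "xyz" or a list such as ['x', 'y', 'z'].
        names = list(coord_space)
        units = ["nm"] * len(names)   # documented default unit
        scales = [1] * len(names)     # documented default scale
    return {"names": names, "units": units, "scales": scales}


def geometry_columns(annotation_type, coord_space="xyz"):
    """Return the geometry columns the DataFrame is expected to contain."""
    axes = normalize_coord_space(coord_space)["names"]
    if annotation_type == "point":
        return list(axes)
    if annotation_type in ("line", "axis_aligned_bounding_box"):
        # Two endpoints (or corners), suffixed 'a' and 'b'.
        return [f"{ax}a" for ax in axes] + [f"{ax}b" for ax in axes]
    if annotation_type == "ellipsoid":
        # Center point followed by one radius per axis.
        return list(axes) + [f"r{ax}" for ax in axes]
    raise ValueError(f"unknown annotation_type: {annotation_type!r}")


def color_columns(prop_name, with_alpha=False):
    """Return the per-channel columns backing an rgb or rgba property."""
    channels = "rgba" if with_alpha else "rgb"
    return [f"{prop_name}_{c}" for c in channels]


def is_valid_property_name(name):
    """Lowercase first letter, then only letters, digits and underscores."""
    return _PROPERTY_NAME.fullmatch(name) is not None


expected = geometry_columns("ellipsoid", ["x", "y", "z"]) + color_columns("mycolor")
```

Comparing a list like expected against df.columns before calling write_precomputed_annotations() is one way to catch a missing column early, before the preprocessing step starts.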
- class ngsidekick.annotations.precomputed.TableHandle(df=None)
Bases: object
A wrapper for a pandas DataFrame that can be provided to transfer ownership of the DataFrame to write_precomputed_annotations(), which will delete the handle's reference to the DataFrame as soon as possible to save RAM.
Example:
>>> handle = TableHandle(df)
>>> del df  # Delete your own reference to the original data
>>> write_precomputed_annotations(handle, 'xyz', 'point')
- Parameters:
df (DataFrame | None)
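To see why handing over a handle can free memory earlier than passing the DataFrame directly, the ownership-transfer pattern can be sketched in plain Python. Everything here is hypothetical and only illustrates the idea: Table stands in for a large DataFrame, DemoHandle mimics a handle with a df attribute, and consume stands in for the exporting function. It is not ngsidekick's implementation.

```python
import gc
import weakref


class Table:
    """Stand-in for a large DataFrame (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows


class DemoHandle:
    """Hypothetical handle that mimics the ownership-transfer idea."""

    def __init__(self, df=None):
        self.df = df


def consume(handle):
    """Transform the data, then drop every reference to the original early."""
    table = handle.df
    transformed = [r * 2 for r in table.rows]  # stands in for preprocessing
    handle.df = None  # the handle no longer keeps the table alive
    del table         # neither does this function
    # A long-running write step would run here without the original table.
    return transformed


table = Table([1, 2, 3])
probe = weakref.ref(table)  # lets us observe when the table is freed
handle = DemoHandle(table)
del table                   # drop the caller's own reference

result = consume(handle)
gc.collect()                # not needed on CPython, harmless elsewhere
freed = probe() is None     # True once no references remain
```

If the caller had kept its own table variable, the object would stay alive for the whole call regardless of what the handle does, which is why the documentation asks you to delete your own reference first.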
Precomputed Annotations
This module provides tools for working with precomputed annotation formats.