lamindb.models

Auxiliary models & database library.

Registry basics

class lamindb.models.BaseSQLRecord(*args, **kwargs)

Base SQL metadata record.

It provides methods to SQLRecord and all its subclasses, but doesn’t come with the additional branch and space fields.

classmethod filter(*queries, **expressions)

Query records.

Parameters:
  • queries – One or multiple Q objects.

  • expressions – Fields and values passed as Django query expressions.

Return type:

QuerySet

Examples

>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
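The keyword expressions follow Django's field__lookup convention. As a rough illustration of how such an expression decomposes into a field and a lookup, the sketch below applies a few lookups to plain dictionaries. It is not how LaminDB or Django implement filtering (real queries are translated to SQL and support relation traversal); match and filter_rows are hypothetical helper names.

```python
# Hypothetical illustration only: emulates a few Django-style lookups in memory.
LOOKUPS = {
    "exact": lambda value, arg: value == arg,
    "startswith": lambda value, arg: str(value).startswith(arg),
    "icontains": lambda value, arg: arg.lower() in str(value).lower(),
}


def match(row: dict, expression: str, arg) -> bool:
    """Check one `field__lookup=arg` expression against a row."""
    field, _, lookup = expression.partition("__")
    # a bare field name (no "__lookup" suffix) means an exact match
    return LOOKUPS[lookup or "exact"](row[field], arg)


def filter_rows(rows: list, **expressions) -> list:
    """Keep the rows that satisfy all expressions (they are AND-combined)."""
    return [
        row
        for row in rows
        if all(match(row, expr, arg) for expr, arg in expressions.items())
    ]


rows = [{"name": "my label"}, {"name": "other label"}]
result = filter_rows(rows, name__startswith="my")
```

Here result contains only the row named "my label", mirroring what the name__startswith="my" expression selects in the example above.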
classmethod get(idlike=None, **expressions)

Get a single record.

Parameters:
  • idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.

  • expressions – Fields and values passed as Django query expressions.

Raises:

lamindb.errors.ObjectDoesNotExist – In case no matching record is found.

Return type:

SQLRecord

Examples

record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
classmethod to_dataframe(include=None, features=False, limit=100)

Evaluate and convert to pd.DataFrame.

By default, maps simple fields and foreign keys onto DataFrame columns.

Guide: Query & search registries

Parameters:
  • include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).

  • features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.

  • limit (int, default: 100) – Maximum number of rows to display. If None, includes all results.

  • order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.

Return type:

DataFrame

Examples

Include the name of the creator:

ln.Record.to_dataframe(include="created_by__name")

Include features:

ln.Artifact.to_dataframe(include="features")

Include selected features:

ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
classmethod search(string, *, field=None, limit=20, case_sensitive=False)

Search.

Parameters:
  • string (str) – The input string to match against the field ontology values.

  • field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.

  • limit (int | None, default: 20) – Maximum amount of top results to return.

  • case_sensitive (bool, default: False) – Whether the match is case sensitive.

Return type:

QuerySet

Returns:

A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.

See also

filter() lookup()

Examples

records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
classmethod lookup(field=None, return_field=None)

Return an auto-complete object for a field.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.

  • keep – When multiple records are found for a lookup, how to return the records. - "first": return the first record. - "last": return the last record. - False: return all records.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.

See also

search()

Examples

Look up via auto-complete on attribute access (.):

import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt

Look up via auto-complete in dictionary:

lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']

Look up via a specific field:

lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745

Return a specific field value instead of the full record:

lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
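To make the idea of an auto-complete object concrete, here is a minimal sketch of how field values could be turned into attribute names on a NamedTuple. The normalization rule (lowercase, non-alphanumeric characters replaced by underscores) is an assumption inferred from the ADGB-DT to adgb_dt example above; make_lookup is a hypothetical helper, not part of LaminDB.

```python
import re
from collections import namedtuple


def make_lookup(values: list):
    """Build a namedtuple whose attribute names are normalized field values."""
    # assumption: lowercase and replace runs of non-alphanumerics with "_"
    keys = [re.sub(r"[^0-9a-z]+", "_", value.lower()) for value in values]
    Lookup = namedtuple("Lookup", keys)
    return Lookup(*values)


lookup = make_lookup(["ADGB-DT", "BRCA2"])
symbol = lookup.adgb_dt  # attribute access is what enables tab completion
```

In the real API, the attribute returns the full record (or return_field, if set) rather than the plain string used in this sketch.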
classmethod connect(instance)

Query a non-default LaminDB instance.

Parameters:

instance (str | None) – An instance identifier of form “account_handle/instance_name”.

Return type:

QuerySet

Examples

ln.Record.connect("account_handle/instance_name").search("label7", field="name")
save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

TypeVar(T, bound= SQLRecord)

describe()

Describe record including relations.

Parameters:

return_str (bool, default: False) – Return a string instead of printing.

Return type:

None | str

delete(permanent=None)

Delete.

Parameters:

permanent (bool | None, default: None) – For consistency, False raises an error, as soft delete is impossible.

Returns:

When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.

class lamindb.models.SQLRecord(*args, **kwargs)

An object that maps to a row in a SQL table in the database.

For the inherited SQLRecord class method definitions, see BaseSQLRecord.

Every SQLRecord is a data model that comes with a registry in form of a SQL table in your database.

Sub-classing SQLRecord creates a new registry while instantiating a SQLRecord creates a new object.

Example:

from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()

SQLRecord’s metaclass is Registry.

SQLRecord inherits from Django’s Model class. Why does LaminDB call it SQLRecord and not Model? The term SQLRecord can’t lead to confusion with statistical, machine learning or biological models.

is_locked: bool

Whether the object is locked for edits.

branch: Branch

The branch.

space: Space

The space.

restore()

Restore from trash onto the main branch.

Does not restore descendant objects if the object is HasType with is_type = True.

Return type:

None

delete(permanent=None, **kwargs)

Delete object.

If object is HasType with is_type = True, deletes all descendant objects, too.

Parameters:

permanent (bool | None, default: None) – Whether to permanently delete the object (skips trash). If None, performs soft delete if the object is not already in the trash.

Returns:

When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.

Examples

For any SQLRecord object sqlrecord, call:

sqlrecord.delete()
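The interplay of soft delete, permanent delete, and restore() can be pictured with a small in-memory model. This is a sketch under assumptions: ToyRecord and its location attribute are hypothetical, and treating a second delete() on a trashed object as permanent is inferred from the description of permanent=None above, not confirmed behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyRecord:
    """Hypothetical stand-in for a record; `location` mimics its branch."""

    name: str
    location: str = "main"  # one of "main", "trash", "deleted"

    def delete(self, permanent: Optional[bool] = None) -> None:
        if permanent is None:
            # assumption: an object already in the trash is removed for good
            permanent = self.location == "trash"
        self.location = "deleted" if permanent else "trash"

    def restore(self) -> None:
        # only trashed objects can come back onto the main branch
        if self.location == "trash":
            self.location = "main"


record = ToyRecord("my experiment")
record.delete()   # soft delete: moves to trash
record.restore()  # back on main
```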
class lamindb.models.Registry(name, bases, attrs, **kwargs)

Metaclass for SQLRecord.

Each Registry object is a SQLRecord class and corresponds to a table in the metadata SQL database.

You work with Registry objects whenever you use class methods of SQLRecord.

You call any subclass of SQLRecord a “registry” and its objects “records”. A SQLRecord object corresponds to a row in the SQL table.

If you want to create a new registry, you sub-class SQLRecord.

Example:

from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()

Note: Registry inherits from Django’s ModelBase.

class lamindb.models.QuerySet(model=None, query=None, using=None, hints=None)

Sets of records returned by queries.

Implements additional filtering capabilities.

See also

django QuerySet

Examples

>>> ln.ULabel(name="my label").save()
>>> queryset = ln.ULabel.filter(name="my label")
>>> queryset # an instance of QuerySet
get(idlike=None, **expressions)

Query a single record. Raises an error if no record or more than one record matches.

Return type:

SQLRecord

filter(*queries, **expressions)

Query a set of records.

Return type:

QuerySet

Mixins for registries

class lamindb.models.IsVersioned(*args, **kwargs)

Base class for versioned models.

property stem_uid: str

Universal id characterizing the version family.

The full uid of a record is obtained via concatenating the stem uid and version information:

stem_uid = random_base62(n_char)  # a random base62 sequence of length 12 (transform) or 16 (artifact, collection)
version_uid = "0000"  # an auto-incrementing 4-digit base62 number
uid = f"{stem_uid}{version_uid}"  # concatenate the stem_uid & version_uid
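The snippet above leaves random_base62 undefined. A runnable sketch of the same scheme follows; the alphabet order and the increment_base62 helper are assumptions made for illustration and may differ from what LaminDB uses internally.

```python
import secrets
import string

# assumption: illustrative alphabet order, LaminDB's own may differ
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def random_base62(n_char: int) -> str:
    """Return a random base62 string of length n_char."""
    return "".join(secrets.choice(BASE62) for _ in range(n_char))


def increment_base62(version_uid: str) -> str:
    """Add one to a fixed-width base62 number, e.g. '0000' -> '0001'."""
    number = 0
    for char in version_uid:
        number = number * 62 + BASE62.index(char)
    number += 1
    digits = []
    for _ in range(len(version_uid)):
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))


stem_uid = random_base62(16)  # 16 characters, as for artifacts
uid = f"{stem_uid}0000"       # first version
next_uid = f"{stem_uid}{increment_base62(uid[-4:])}"  # next version, same family
```

Both uid and next_uid share the same stem_uid, which is what groups them into one version family.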
property version: str

The version of an object.

Defines version of an object within a family of objects characterized by the same stem_uid.

Returns .version_tag if set, otherwise the last 4 characters of the uid.

property versions: QuerySet

Lists all records of the same version family.

Example:

artifact.versions.to_dataframe()       # all versions of the artifact in a dataframe
artifact.versions.get(is_latest=True)  # the latest version of the artifact
class lamindb.models.HasType(*args, **kwargs)

Mixin for registries that have a hierarchical type assigned.

Such registries have a .type foreign key pointing to themselves.

A type hence allows hierarchically grouping records under types.

For instance, using the example of ln.Record:

experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment1 = ln.Record(name="Experiment 1", type=experiment_type).save()
experiment2 = ln.Record(name="Experiment 2", type=experiment_type).save()
query_types()

Query types of a record recursively.

While .type retrieves the type, this method retrieves all super types of that type:

# Create type hierarchy (shown for ln.Record; any HasType registry works)
type1 = ln.Record(name="Type1", is_type=True).save()
type2 = ln.Record(name="Type2", is_type=True, type=type1).save()
type3 = ln.Record(name="Type3", is_type=True, type=type2).save()

# Create a record with type3
record = ln.Record(name="Record3", type=type3).save()

# Query super types
super_types = record.query_types()
assert super_types[0] == type3
assert super_types[1] == type2
assert super_types[2] == type1
Return type:

SQLRecordList
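The recursion behind query_types() amounts to walking the .type foreign key until it is empty. The sketch below shows that traversal on plain Python objects; Node and this free-standing query_types function are hypothetical stand-ins, and the actual method may resolve the hierarchy in the database instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a record with a self-referential `type`."""

    name: str
    type: Optional["Node"] = None


def query_types(record: Node) -> list:
    """Collect the type of a record and all of its super types, nearest first."""
    types = []
    current = record.type
    while current is not None:
        types.append(current)
        current = current.type
    return types


type1 = Node("Type1")
type2 = Node("Type2", type=type1)
type3 = Node("Type3", type=type2)
record = Node("Record3", type=type3)
super_types = query_types(record)
```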

class lamindb.models.HasParents

Base class for hierarchical registries (ontologies).

view_parents(field=None, with_children=False, distance=5)

View parents in an ontology.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – Field to display on graph

  • with_children (bool, default: False) – Whether to also show children.

  • distance (int, default: 5) – Maximum distance still shown.

Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).

Examples

>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_parents()
>>> record.view_parents(with_children=True)
view_children(field=None, distance=5)

View children in an ontology.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – Field to display on graph

  • distance (int, default: 5) – Maximum distance still shown.

Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).

Examples

>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_children()
query_parents()

Query parents in an ontology.

Return type:

QuerySet

query_children()

Query children in an ontology.

Return type:

QuerySet

class lamindb.models.CanCurate

Base class providing SQLRecord-based validation.

inspect(field=None, *, mute=False, organism=None, source=None, from_source=True, strict_source=False)

Inspect if values are mappable to a field.

Being mappable means that an exact match exists.

Parameters:
  • values (list[str] | Series | array) – Values that will be checked against the field.

  • field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.

  • mute (bool, default: False) – Whether to mute logging.

  • organism (str | SQLRecord | None, default: None) – An Organism name or record.

  • source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to inspect against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry. - If False, validation will include all records in the registry, ignoring the specified source. - If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.

Return type:

bionty.base.dev.InspectResult

See also

validate()

Example:

import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
validate(field=None, *, mute=False, organism=None, source=None, strict_source=False)

Validate values against existing values of a string field.

Note that this is strict validation: it only asserts exact matches.

Parameters:
  • values (list[str] | Series | array) – Values that will be validated against the field.

  • field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.

  • mute (bool, default: False) – Whether to mute logging.

  • organism (str | SQLRecord | None, default: None) – An Organism name or record.

  • source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry. - If False, validation will include all records in the registry, ignoring the specified source. - If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.

Return type:

ndarray

Returns:

A vector of booleans indicating if an element is validated.

See also

inspect()

Example:

import bionty as bt

bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True,  True, False, False])
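Since validation only asserts exact matches, its core logic can be sketched as a membership test against the values already in the registry. The functions below are hypothetical pure-Python stand-ins that ignore organism, source, and public-ontology lookups.

```python
def validate(values: list, registry_values: set) -> list:
    """Return one boolean per value: True if it exactly matches a registry value."""
    return [value in registry_values for value in values]


def inspect(values: list, registry_values: set) -> dict:
    """Split values into validated and non-validated, preserving input order."""
    flags = validate(values, registry_values)
    return {
        "validated": [v for v, ok in zip(values, flags) if ok],
        "non_validated": [v for v, ok in zip(values, flags) if not ok],
    }


saved_symbols = {"A1CF", "A1BG", "BRCA2"}  # what was saved in the example above
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
flags = validate(gene_symbols, saved_symbols)
result = inspect(gene_symbols, saved_symbols)
```

Note that FANCD1 fails validation even though it is a synonym of BRCA2: mapping synonyms is the job of standardize(), not validate().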
from_values(field=None, create=False, organism=None, source=None, standardize=True, from_source=True, mute=False)

Bulk create validated records by parsing values for an identifier (such as a name or an id).

Parameters:
  • values (list[str] | Series | array) – A list of values for an identifier, e.g. ["name1", "name2"].

  • field (str | DeferredAttribute | None, default: None) – A SQLRecord field to look up, e.g., bt.CellMarker.name.

  • create (bool, default: False) – Whether to create records if they don’t exist.

  • organism (SQLRecord | str | None, default: None) – A bionty.Organism name or record.

  • source (SQLRecord | None, default: None) – A bionty.Source record to validate against and to create records from.

  • standardize (bool, default: True) – Whether to standardize synonyms in the values.

  • from_source (bool, default: True) – Whether to create records from public source.

  • mute (bool, default: False) – Whether to mute logging.

Return type:

SQLRecordList

Returns:

A list of validated records. For bionty registries, also returns knowledge-coupled records.

Notes

For more info, see tutorial: Manage biological ontologies.

Example:

import bionty as bt
import lamindb as ln

# Bulk creating from non-validated values logs warnings and returns an empty list
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"])
assert len(ulabels) == 0

# With create=True, records are created for values that don't exist yet
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], create=True).save()
assert len(ulabels) == 3

# Bulk create records from a public reference
bt.CellType.from_values(["T cell", "B cell"]).save()
standardize(field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, from_source=True, keep='first', synonyms_field='synonyms', organism=None, source=None, strict_source=False)

Maps input synonyms to standardized names.

Parameters:
  • values (Iterable) – Identifiers that will be standardized.

  • field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. Defaults to field.

  • return_mapper (bool, default: False) – If True, returns {input_value: standardized_name}.

  • case_sensitive (bool, default: False) – Whether the mapping is case sensitive.

  • mute (bool, default: False) – Whether to mute logging.

  • from_source (bool, default: True) – Whether to standardize from public source. Defaults to True for BioRecord registries.

  • keep (Literal['first', 'last', False], default: 'first') –

    When a synonym maps to multiple names, determines which of the mapped names to return (analogous to the keep argument of pd.DataFrame.duplicated): - "first": returns the first mapped standardized name - "last": returns the last mapped standardized name - False: returns all mapped standardized names.

    When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.

    When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.

  • synonyms_field (str, default: 'synonyms') – A field containing the concatenated synonyms.

  • organism (str | SQLRecord | None, default: None) – An Organism name or record.

  • source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.

  • strict_source (bool, default: False) – Determines the validation behavior against records in the registry. - If False, validation will include all records in the registry, ignoring the specified source. - If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.

Return type:

list[str] | dict[str, str]

Returns:

If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.

See also

add_synonym()

Add synonyms.

remove_synonym()

Remove synonyms.

Example:

import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
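Conceptually, standardization builds a synonym-to-name mapper and passes unmapped values through unchanged. The following sketch reproduces the result above with plain dictionaries; the records table (with FANCD1 as a synonym of BRCA2) and both helper functions are hypothetical, and only the keep="first" behavior is modeled.

```python
def build_mapper(records: dict, case_sensitive: bool = False) -> dict:
    """Map each synonym to its standardized name.

    `records` maps a name to its pipe-delimited synonyms string.
    """
    mapper = {}
    for name, synonyms in records.items():
        for synonym in filter(None, synonyms.split("|")):
            key = synonym if case_sensitive else synonym.lower()
            mapper.setdefault(key, name)  # first mapping wins (keep="first")
    return mapper


def standardize(values: list, records: dict, case_sensitive: bool = False) -> list:
    """Replace known synonyms by standardized names; keep other values as-is."""
    mapper = build_mapper(records, case_sensitive)
    return [
        mapper.get(value if case_sensitive else value.lower(), value)
        for value in values
    ]


records = {"A1CF": "", "A1BG": "", "BRCA2": "FANCD1"}  # hypothetical registry content
standardized = standardize(["A1CF", "A1BG", "FANCD1", "FANCD20"], records)
mapper = build_mapper(records)  # comparable to return_mapper=True
```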
add_synonym(synonym, force=False, save=None)

Add synonyms to a record.

Parameters:
  • synonym (str | list[str] | Series | array) – The synonyms to add to the record.

  • force (bool, default: False) – Whether to add synonyms even if they are already synonyms of other records.

  • save (bool | None, default: None) – Whether to save the record to the database.

See also

remove_synonym()

Remove synonyms.

Example:

import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# add a synonym
record.add_synonym("T cells")
record.synonyms
#> "T cells|T-cell|T-lymphocyte|T lymphocyte"
remove_synonym(synonym)

Remove synonyms from a record.

Parameters:

synonym (str | list[str] | Series | array) – The synonym values to remove.

See also

add_synonym()

Add synonyms

Example:

import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# remove a synonym
record.remove_synonym("T-cell")
record.synonyms
#> "T lymphocyte|T-lymphocyte"
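The synonyms field shown in these examples is a single pipe-delimited string. A minimal sketch of adding to and removing from such a string is given below; the helper names are hypothetical, and the ordering of the result in LaminDB may differ from the insertion order used here.

```python
def split_synonyms(synonyms: str) -> list:
    """Split a pipe-delimited synonyms string, dropping empty entries."""
    return [s for s in synonyms.split("|") if s]


def add_synonym(synonyms: str, new: list) -> str:
    """Append synonyms that are not already present."""
    current = split_synonyms(synonyms)
    for synonym in new:
        if synonym not in current:
            current.append(synonym)
    return "|".join(current)


def remove_synonym(synonyms: str, old: list) -> str:
    """Drop the given synonyms, keeping the order of the rest."""
    return "|".join(s for s in split_synonyms(synonyms) if s not in old)


synonyms = "T-cell|T lymphocyte|T-lymphocyte"
added = add_synonym(synonyms, ["T cells"])
removed = remove_synonym(synonyms, ["T-cell"])
```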
set_abbr(value)

Set value for abbr field and add to synonyms.

Parameters:

value (str) – A value for an abbreviation.

See also

add_synonym()

Example:

import bionty as bt

# save an experimental factor record
scrna = bt.ExperimentalFactor.from_source(name="single-cell RNA sequencing").save()
assert scrna.abbr is None
assert scrna.synonyms == "single-cell RNA-seq|single-cell transcriptome sequencing|scRNA-seq|single cell RNA sequencing"

# set abbreviation
scrna.set_abbr("scRNA")
assert scrna.abbr == "scRNA"
# synonyms are updated
assert scrna.synonyms == "scRNA|single-cell RNA-seq|single cell RNA sequencing|single-cell transcriptome sequencing|scRNA-seq"
class lamindb.models.TracksRun(*args, **kwargs)

Base class tracking latest run, creating user, and created_at timestamp.

created_by_id
created_by: User

Creator of record.

run_id
run: Run | None

Run that created record.

class lamindb.models.TracksUpdates(*args, **kwargs)

Base class tracking previous runs and updated_at timestamp.

Managers

class lamindb.models.FeatureManager(sqlrecord)

Feature manager.

property slots: dict[str, Schema]

Features by schema slot.

Example:

artifact.features.slots
#> {'var': <Schema: var>, 'obs': <Schema: obs>}
describe(return_str=False)

Pretty print features.

This is what artifact.describe() calls under the hood.

Return type:

str | None

get_values(external_only=False)

Get features as a dictionary.

Includes annotation with internal and external feature values.

Parameters:

external_only (bool, default: False) – If True, only return external feature annotations.

Return type:

dict[str, Any]

add_values(values, feature_field=FieldAttr(Feature.name), schema=None)

Add values for features.

Parameters:
  • values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.

  • feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.

  • schema (Schema, default: None) – Schema to validate against.

Return type:

None

set_values(values, feature_field=FieldAttr(Feature.name), schema=None)

Set values for features.

Like add_values, but first removes all existing external feature annotations.

Parameters:
  • values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.

  • feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.

  • schema (Schema, default: None) – Schema to validate against.

Return type:

None
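The difference between add_values and set_values can be illustrated with a dictionary that maps each feature name to a set of values. This is a sketch under assumptions: the functions are hypothetical stand-ins that return new dictionaries instead of writing to the database, and accumulating several values per feature is assumed rather than taken from the text above.

```python
def add_values(annotations: dict, values: dict) -> dict:
    """Return annotations extended by values, skipping None values."""
    updated = {feature: set(vals) for feature, vals in annotations.items()}
    for feature, value in values.items():
        if value is None:
            continue  # None values are skipped
        updated.setdefault(feature, set()).add(value)
    return updated


def set_values(annotations: dict, values: dict) -> dict:
    """Like add_values, but discard all existing annotations first."""
    return add_values({}, values)


existing = {"temperature": {21.6}}
new_values = {"temperature": 22.0, "experiment": "exp1", "species": None}
added = add_values(existing, new_values)
replaced = set_values(existing, new_values)
```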

remove_values(feature=None, *, value=None)

Remove values for features.

Parameters:
  • feature (str | Feature | list[str | Feature], default: None) – Indicate one or several features for which to remove values. If None, values for all external features will be removed.

  • value (Any | None, default: None) – An optional value to restrict removal to a single value.

Return type:

None

class lamindb.models.LabelManager(sqlrecord)

Label manager.

This allows managing untyped labels (ULabel) and arbitrary typed labels (e.g., CellLine) and associating labels with features.

describe(return_str=True)

Describe the labels.

Return type:

str

add(records, feature=None)

Add one or several labels and associate them with a feature.

Return type:

None

get(feature, mute=False, flat_names=False)

Get labels given a feature.

Parameters:
  • feature (Feature) – Feature under which labels are grouped.

  • mute (bool, default: False) – Show no logging.

  • flat_names (bool, default: False) – Flatten list to names rather than returning records.

Return type:

QuerySet | dict[str, QuerySet] | list

add_from(data, transfer_logs=None)

Add labels from an artifact or collection to another artifact or collection.

Return type:

None

Examples

import lamindb as ln
import pandas as pd

artifact1 = ln.Artifact(pd.DataFrame(index=[0, 1])).save()
artifact2 = ln.Artifact(pd.DataFrame(index=[2, 3])).save()
records = ln.Record.from_values(["Label1", "Label2"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
artifact1.records.set(labels)
artifact2.labels.add_from(artifact1)
class lamindb.models.QueryManager(*args, **kwargs)

Manage queries through fields.

Examples

Populate the .parents ManyToMany relationship (a QueryManager):

ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
label1 = ln.Record.get(name="Label1")
label1.parents.set(labels)

Convert all linked parents to a DataFrame:

label1.parents.to_dataframe()
to_list(field=None)

Populate a list.

to_dataframe(**kwargs)

Convert to DataFrame.

For **kwargs, see lamindb.models.QuerySet.to_dataframe().

search(string, **kwargs)

Search.

Parameters:
  • string (str) – The input string to match against the field ontology values.

  • field – The field or fields to search. Search all string fields by default.

  • limit – Maximum amount of top results to return.

  • case_sensitive – Whether the match is case sensitive.

Returns:

A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.

See also

filter() lookup()

Examples

records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
lookup(field=None, **kwargs)

Return an auto-complete object for a field.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.

  • return_field – The field to return. If None, returns the whole record.

  • keep – When multiple records are found for a lookup, how to return the records. - "first": return the first record. - "last": return the last record. - False: return all records.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.

See also

search()

Examples

Look up via auto-complete on attribute access (.):

import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt

Look up via auto-complete in dictionary:

lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']

Look up via a specific field:

lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745

Return a specific field value instead of the full record:

lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
get_queryset()
class lamindb.models.RelatedManager(*args, **kwargs)

Manager for many-to-many and reverse foreign key relationships.

Provides relationship manipulation methods.

Examples

Populate the .parents ManyToMany relationship (a RelatedManager):

ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
label1 = ln.Record.get(name="Label1")
label1.parents.set(labels)

Convert all linked parents to a DataFrame:

label1.parents.to_dataframe()

Remove a parent label:

label2 = ln.Record.get(name="Label2")
label1.parents.remove(label2)

Clear all parent labels:

label1.parents.clear()
add(*objs, bulk=True)

Add objects to the relationship.

Return type:

None

set(objs, *, bulk=True, clear=False)

Set the relationship to the specified objects.

Return type:

None

remove(*objs, bulk=True)

Remove objects from the relationship.

Return type:

None

clear()

Remove all objects from the relationship.

Return type:

None

Annotations of objects

Artifact, run, and collection annotations can be conditioned on features. Besides linking categorical data, you can also link simple data types by virtue of the JsonValue model.

class lamindb.models.JsonValue(*args, **kwargs)

JSON values for annotating artifacts and runs.

Categorical values are stored in their respective registries: ULabel, CellType, etc.

Unlike for ULabel, in JsonValue, values are grouped by features and not by an ontological hierarchy.

value: Any

The JSON-like value.

hash: str

Value hash.

feature: Feature | None

The dimension metadata.

runs: Run

Runs annotated with this feature value.

artifacts: Artifact

Artifacts annotated with this feature value.

classmethod get_or_create(feature, value)

Annotating artifacts.

class lamindb.models.ArtifactArtifact(created_at, created_by, run, id, artifact, value, feature)
artifact: Artifact
value: Artifact
feature: Feature | None
class lamindb.models.ArtifactJsonValue(created_at, created_by, run, id, artifact, jsonvalue)
artifact: Artifact
jsonvalue: JsonValue
class lamindb.models.ArtifactProject(created_at, created_by, run, id, artifact, project, feature)
artifact: Artifact
project: Project
feature: Feature | None
class lamindb.models.ArtifactRecord(created_at, created_by, run, id, artifact, record, feature)
artifact: Artifact
record: Record
feature: Feature
class lamindb.models.ArtifactReference(created_at, created_by, run, id, artifact, reference, feature)
artifact: Artifact
reference: Reference
feature: Feature | None
class lamindb.models.ArtifactRun(created_at, created_by, id, artifact, run, feature)
artifact: Artifact
feature: Feature | None
class lamindb.models.ArtifactSchema(created_at, created_by, run, id, artifact, schema, slot, feature_ref_is_semantic)
slot: str | None


feature_ref_is_semantic: bool | None


artifact: Artifact
schema: Schema
class lamindb.models.ArtifactULabel(created_at, created_by, run, id, artifact, ulabel, feature)
artifact: Artifact
ulabel: ULabel
feature: Feature | None
class lamindb.models.ArtifactUser(created_at, created_by, run, id, artifact, user, feature)
artifact: Artifact
user: User
feature: Feature | None

Annotating collections.

class lamindb.models.CollectionArtifact(created_at, created_by, run, id, collection, artifact)
collection: Collection
artifact: Artifact
class lamindb.models.CollectionProject(created_at, created_by, run, id, collection, project)
collection: Collection
project: Project
class lamindb.models.CollectionReference(created_at, created_by, run, id, collection, reference)
collection: Collection
reference: Reference
class lamindb.models.CollectionULabel(created_at, created_by, run, id, collection, ulabel, feature)
collection: Collection
ulabel: ULabel
feature: Feature | None
class lamindb.models.CollectionRecord(created_at, created_by, run, id, collection, record, feature)
collection: Collection
record: Record
feature: Feature

Annotating runs.

class lamindb.models.RunJsonValue(id, run, jsonvalue, created_at, created_by)
created_at: datetime

Time of creation of record.

run: Run
jsonvalue: JsonValue
created_by: User

Creator of record.

class lamindb.models.RunProject(id, run, project, created_at, created_by)
created_at: datetime

Time of creation of record.

run: Run
project: Project
created_by: User

Creator of record.

class lamindb.models.RunULabel(id, run, ulabel, created_at, created_by)
created_at: datetime

Time of creation of record.

run: Run
ulabel: ULabel
created_by: User

Creator of record.

class lamindb.models.RunRecord(id, run, record, feature, created_at, created_by)
created_at: datetime

Time of creation of record.

run: Run
record: Record
feature: Feature
created_by: User

Annotating transforms.

class lamindb.models.TransformProject(created_at, created_by, run, id, transform, project)
transform: Transform
project: Project
class lamindb.models.TransformReference(created_at, created_by, run, id, transform, reference)
transform: Transform
reference: Reference
class lamindb.models.TransformULabel(created_at, created_by, run, id, transform, ulabel)
transform: Transform
ulabel: ULabel

Building relationships among transforms.

class lamindb.models.TransformTransform(id, successor, predecessor, config, created_at, created_by)
config: dict | None

created_at: datetime

Time of creation of record.

successor: Transform
predecessor: Transform
created_by: User

Annotating features, blocks, and ulabels with projects.

class lamindb.models.FeatureProject(created_at, created_by, run, id, feature, project)
feature: Feature
project: Project
class lamindb.models.BlockProject(created_at, created_by, run, id, block, project)
block: Block
project: Project
class lamindb.models.ULabelProject(created_at, created_by, run, id, ulabel, project)
ulabel: ULabel
project: Project
class lamindb.models.SchemaProject(created_at, created_by, run, id, schema, project)
schema: Schema
project: Project
class lamindb.models.ProjectRecord(created_at, created_by, run, id, project, feature, record)
project: Project
feature: Feature | None
record: Record

Building schemas.

class lamindb.models.SchemaComponent(created_at, created_by, run, id, composite, component, slot)
slot: str | None

composite: Schema
component: Schema
class lamindb.models.SchemaFeature(id, schema, feature)
schema: Schema
feature: Feature

Annotating references with records.

class lamindb.models.ReferenceRecord(created_at, created_by, run, id, reference, feature, record)
reference: Reference
feature: Feature | None
record: Record

Record values

Record values work almost exactly like artifact and run annotations, with the exception that JSON values are stored in RecordJson on a per-record basis and not in JsonValue.
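The distinction can be pictured with a small, library-free sketch. The two dataclasses below are hypothetical stand-ins, not lamindb's models, and the sketch assumes that several artifact or run links may point at the same JsonValue row while every record keeps its own RecordJson row:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class JsonValueRow:
    """Hypothetical stand-in for a JSON value row that links point at."""

    feature: str
    value: Any


@dataclass(frozen=True)
class RecordJsonRow:
    """Hypothetical stand-in for a RecordJson row: the value sits on the row itself."""

    record: str
    feature: str
    value: Any


# Artifact and run annotations: each link references a value row
# (assumption: the same row can be referenced by several links).
temperature = JsonValueRow(feature="temperature", value=37)
artifact_links = [("artifact-1", temperature), ("artifact-2", temperature)]

# Record values: one row per record, even when the value is identical.
record_rows = [
    RecordJsonRow(record="record-1", feature="temperature", value=37),
    RecordJsonRow(record="record-2", feature="temperature", value=37),
]

# Count the distinct value rows behind the artifact links.
shared_value_rows = {value_row for _, value_row in artifact_links}
```

In this toy layout the two artifact links resolve to a single value row, whereas the two records hold two separate rows.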

class lamindb.models.RecordArtifact(id, record, feature, value)
record: Record
feature: Feature
value: Artifact
class lamindb.models.RecordCollection(id, record, feature, value)
record: Record
feature: Feature
value: Collection
class lamindb.models.RecordJson(id, record, feature, value)
value: Any

record: Record
feature: Feature
class lamindb.models.RecordProject(id, record, feature, value)
record: Record
feature: Feature
value: Project
class lamindb.models.RecordRecord(id, record, feature, value)
record: Record
feature: Feature
value: Record
class lamindb.models.RecordReference(id, record, feature, value)
record: Record
feature: Feature
value: Reference
class lamindb.models.RecordRun(id, record, feature, value)
record: Record
feature: Feature
value: Run
class lamindb.models.RecordTransform(id, record, feature, value)
record: Record
feature: Feature
value: Transform
class lamindb.models.RecordULabel(id, record, feature, value)
record: Record
feature: Feature
value: ULabel
class lamindb.models.RecordUser(id, record, feature, value)
record: Record
feature: Feature
value: User
class lamindb.models.TransformRecord(run, id, transform, record, feature, created_at, created_by)
transform: Transform
record: Record
feature: Feature

Blocks

class lamindb.models.BaseBlock(*args, **kwargs)
created_by_id
created_by: User

Creator of block.

class lamindb.models.Block(*args, **kwargs)

A root block for every registry that can appear at the top of the registry root block in the GUI.

key: str

The key for which we want to create a block.

projects: Project

Projects that annotate this block.

class lamindb.models.ArtifactBlock(*args, **kwargs)

An unstructured notes block that can be attached to an artifact.

artifact: Artifact

The artifact to which the block is attached.

class lamindb.models.BranchBlock(*args, **kwargs)

An unstructured notes block that can be attached to a branch.

branch: Branch

The branch to which the block is attached.

class lamindb.models.CollectionBlock(*args, **kwargs)

An unstructured notes block that can be attached to a collection.

collection: Collection

The collection to which the block is attached.

class lamindb.models.FeatureBlock(*args, **kwargs)

An unstructured notes block that can be attached to a feature.

feature: Feature

The feature to which the block is attached.

class lamindb.models.ProjectBlock(*args, **kwargs)

An unstructured notes block that can be attached to a project.

project: Project

The project to which the block is attached.

class lamindb.models.RecordBlock(*args, **kwargs)

An unstructured notes block that can be attached to a record.

record: Record

The record to which the block is attached.

class lamindb.models.RunBlock(*args, **kwargs)

An unstructured notes block that can be attached to a run.

run: Run

The run to which the block is attached.

class lamindb.models.SchemaBlock(*args, **kwargs)

An unstructured notes block that can be attached to a schema.

schema: Schema

The schema to which the block is attached.

class lamindb.models.SpaceBlock(*args, **kwargs)

An unstructured notes block that can be attached to a space.

space: Space

The space to which the block is attached.

class lamindb.models.TransformBlock(*args, **kwargs)

An unstructured notes block that can be attached to a transform.

transform: Transform

The transform to which the block is attached.

line_number: int | None

The line number in the source code to which the block belongs.

class lamindb.models.ULabelBlock(*args, **kwargs)

An unstructured notes block that can be attached to a ulabel.

ulabel: ULabel

The ulabel to which the block is attached.

Utils

class lamindb.models.LazyArtifact(suffix, overwrite_versions, **kwargs)

Lazy artifact for streaming to auto-generated internal paths.

Use this to stream to a lamindb auto-generated internal path and register that path as an artifact (see Artifact).

This object creates a real artifact on .save() with the provided arguments.

Parameters:
  • suffix (str) – The suffix for the auto-generated internal path.

  • overwrite_versions (bool) – Whether to overwrite versions.

  • **kwargs – Keyword arguments for the artifact to be created.

Examples

Create a lazy artifact, write to the path and save to get a real artifact:

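import lamindb as ln
import numpy as np
import zarr
lazy = ln.Artifact.from_lazy(suffix=".zarr", overwrite_versions=True, key="mydata.zarr")
zarr.open(lazy.path, mode="w")["test"] = np.array(["test"])  # stream to the path
artifact = lazy.save()  # registers the written path as a real artifact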
property path: UPath
save(upload=None, **kwargs)
Return type:

Artifact

class lamindb.models.InspectResult(validated_df, validated, nonvalidated, frac_validated, n_empty, n_unique)

Result of inspect.

Returned by calls such as inspect().

property df: DataFrame

A DataFrame indexed by values with a boolean __validated__ column.

property validated: list[str]

List of items that validate() successfully validated.

property non_validated: list[str]

List of items that validate() could not validate.

This list can be used to remove any non-validated values such as genes that do not map against the specified source.

property frac_validated: float

Fraction of items that were validated.

property n_empty: int

Number of empty items.

property n_unique: int

Number of unique items.
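How these counters relate to one another can be illustrated without the library. The helper below is a hypothetical sketch, not lamindb's implementation. It assumes that empty items are None or empty strings and that the fraction is taken over unique non-empty values; the real inspect() may count differently:

```python
def inspect_values(values, reference):
    """Toy inspection: split values into validated and non-validated ones."""
    # Assumption: None and "" count as empty items.
    non_empty = [v for v in values if v not in (None, "")]
    # De-duplicate while preserving the original order.
    unique = list(dict.fromkeys(non_empty))
    validated = [v for v in unique if v in reference]
    non_validated = [v for v in unique if v not in reference]
    return {
        "validated": validated,
        "non_validated": non_validated,
        "frac_validated": len(validated) / len(unique) if unique else 0.0,
        "n_empty": len(values) - len(non_empty),
        "n_unique": len(unique),
    }


result = inspect_values(
    ["CD8A", "CD4", "", "CD8A", "FOO"],
    reference={"CD8A", "CD4"},
)
```

Here `result["non_validated"]` holds the values one would typically drop or fix before validating again.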

property synonyms_mapper: dict

Synonyms mapper dictionary.

Such a dictionary maps the given values to their standardized names and can be used to rename values accordingly.

Examples

>>> markers = pd.DataFrame(index=["KI67","CCR7"])
>>> synonyms_mapper = bt.CellMarker.standardize(markers.index, return_mapper=True)

{'KI67': 'Ki67', 'CCR7': 'Ccr7'}
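Applying such a mapper is a plain dictionary lookup. The following sketch reuses the mapping from the example output and adds a hypothetical value, "CD3", that has no entry in the mapper and is therefore left unchanged:

```python
# Mapping copied from the example output above.
synonyms_mapper = {"KI67": "Ki67", "CCR7": "Ccr7"}

# "CD3" is a made-up value without an entry in the mapper.
values = ["KI67", "CCR7", "CD3"]

# Rename mapped values and fall back to the original value otherwise.
renamed = [synonyms_mapper.get(value, value) for value in values]
```

With a pandas index, the same dictionary could be passed to a rename or map call instead of the list comprehension.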

class lamindb.models.ValidateFields
class lamindb.models.SchemaOptionals(schema)

Manage and access optional features in a schema.

get_uids()

Get the uids of the optional features.

Does not need an additional query to the database, while get() does.

Return type:

list[str]

get()

Get the optional features.

Return type:

QuerySet

set(features)

Set the optional features (overwrites whichever features are currently optional).

Return type:

None

remove(features)

Make one or multiple features required by removing them from the set of optional features.

Return type:

None

add(features)

Make one or multiple features optional by adding them to the set of optional features.

Return type:

None
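The set, add, and remove semantics described above can be modeled with a small in-memory class. `OptionalFeatures` is a hypothetical stand-in that tracks uids in a list; it does not touch a database, and the membership check in `add()` is an assumption rather than documented behavior of SchemaOptionals:

```python
class OptionalFeatures:
    """Toy model of a schema's optional-feature set (hypothetical class)."""

    def __init__(self, schema_feature_uids):
        self._schema_feature_uids = list(schema_feature_uids)
        self._optional_uids = []

    def get_uids(self):
        """Return the uids of the optional features."""
        return list(self._optional_uids)

    def set(self, uids):
        """Overwrite whichever features are currently optional."""
        self._optional_uids = []
        self.add(uids)

    def add(self, uids):
        """Make features optional by adding them to the optional set."""
        for uid in uids:
            # Assumption: only features of the schema can become optional.
            if uid not in self._schema_feature_uids:
                raise ValueError(f"{uid!r} is not a feature of this schema")
            if uid not in self._optional_uids:
                self._optional_uids.append(uid)

    def remove(self, uids):
        """Make features required by removing them from the optional set."""
        self._optional_uids = [u for u in self._optional_uids if u not in uids]


optionals = OptionalFeatures(["uid_a", "uid_b", "uid_c"])
optionals.set(["uid_a"])      # optional: uid_a
optionals.add(["uid_b"])      # optional: uid_a, uid_b
optionals.remove(["uid_a"])   # uid_a is required again
after_remove = optionals.get_uids()
optionals.set(["uid_c"])      # overwrites the previous optional set
after_set = optionals.get_uids()
```

The point of the sketch is the difference between `set()`, which replaces the optional set, and `add()` or `remove()`, which adjust it incrementally.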

class lamindb.models.query_set.BiontyDB(query_db, module_name)

Namespace for Bionty registries (Gene, CellType, Disease, etc.).

class lamindb.models.query_set.PertdbDB(query_db, module_name)

Namespace for PertDB registries (Biologic, Compound, etc.).