lamindb.models¶
Auxiliary models & database library.
Registry basics¶
- class lamindb.models.BaseSQLRecord(*args, **kwargs)¶
Base SQL metadata record.
It provides methods to SQLRecord and all its subclasses, but doesn’t come with the additional branch and space fields.
- classmethod filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple Q objects.
expressions – Fields and values passed as Django query expressions.
- Return type:
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
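The keyword expressions accepted by filter() follow Django's field__lookup naming convention. As a rough illustration of how such a keyword splits into a field and a lookup, here is a toy filter over plain dictionaries. The names toy_filter and LOOKUPS are hypothetical, and this is not how Django or LaminDB implement queries (real expressions can also traverse relations, which this sketch ignores).

```python
from typing import Any

# Toy lookup table covering a few of the lookup names Django supports.
LOOKUPS = {
    "exact": lambda actual, expected: actual == expected,
    "startswith": lambda actual, expected: str(actual).startswith(expected),
    "icontains": lambda actual, expected: str(expected).lower() in str(actual).lower(),
}


def toy_filter(rows: list[dict[str, Any]], **expressions: Any) -> list[dict[str, Any]]:
    """Keep the rows that satisfy every `field__lookup=value` expression."""
    matches = []
    for row in rows:
        keep = True
        for key, expected in expressions.items():
            # "name__startswith" -> field "name", lookup "startswith";
            # a bare "name" falls back to an exact match.
            field, _, lookup = key.partition("__")
            if not LOOKUPS[lookup or "exact"](row[field], expected):
                keep = False
                break
        if keep:
            matches.append(row)
    return matches


projects = [{"name": "my label"}, {"name": "other label"}]
result = toy_filter(projects, name__startswith="my")
```

All expressions are combined with a logical AND here, which mirrors how multiple keyword arguments narrow a Django query.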
- classmethod get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.
expressions – Fields and values passed as Django query expressions.
- Raises:
lamindb.errors.ObjectDoesNotExist – In case no matching record is found.
- Return type:
See also
Guide: Query & search registries
Django documentation: Queries
Examples
record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
- classmethod to_dataframe(include=None, features=False, limit=100)¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.
limit (int | None, default: 20) – Maximum amount of top results to return.
case_sensitive (bool, default: False) – Whether the match is case sensitive.
- Return type:
- Returns:
A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.
See also
filter()
lookup()
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- classmethod lookup(field=None, return_field=None)¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.
return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.
keep – When multiple records are found for a lookup, how to return the records.
- "first": return the first record.
- "last": return the last record.
- False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
search()
Examples
Look up via auto-complete on .:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
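To make the idea of an auto-complete object more concrete, the following sketch builds a NamedTuple whose attribute names are derived from field values, plus a plain dictionary keyed by the original values. Everything here (toy_lookup, to_identifier, the normalization rule, the placeholder Ensembl id) is an assumption for illustration; the library's actual normalization and return object may differ.

```python
import re
from collections import namedtuple
from typing import Optional


def to_identifier(value: str) -> str:
    """Turn a field value such as "ADGB-DT" into an attribute name such as "adgb_dt"."""
    name = re.sub(r"\W+", "_", value).strip("_").lower()
    # Attribute names cannot start with a digit, so prefix those (assumed convention).
    return f"n{name}" if name[:1].isdigit() else name


def toy_lookup(records: list[dict], field: str, return_field: Optional[str] = None):
    """Build an auto-complete tuple plus a plain dict keyed by the original values."""
    by_value = {
        rec[field]: (rec if return_field is None else rec[return_field])
        for rec in records
    }
    Lookup = namedtuple("Lookup", [to_identifier(v) for v in by_value])
    return Lookup(*by_value.values()), by_value


# Placeholder record; the Ensembl id below is not a real identifier.
genes = [{"symbol": "ADGB-DT", "ensembl_gene_id": "ENSG-PLACEHOLDER"}]
lookup, lookup_dict = toy_lookup(genes, field="symbol")
symbols, _ = toy_lookup(genes, field="ensembl_gene_id", return_field="symbol")
```

Here lookup.adgb_dt returns the whole record, while symbols.ensg_placeholder returns only the symbol, which mirrors the role of return_field described above.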
- classmethod connect(instance)¶
Query a non-default LaminDB instance.
- Parameters:
instance (str | None) – An instance identifier of form “account_handle/instance_name”.
- Return type:
Examples
ln.Record.connect("account_handle/instance_name").search("label7", field="name")
- save(*args, **kwargs)¶
Save.
Always saves to the default database.
- Return type:
TypeVar(T, bound= SQLRecord)
- describe()¶
Describe record including relations.
- Parameters:
return_str (bool, default: False) – Return a string instead of printing.
- Return type:
None | str
- delete(permanent=None)¶
Delete.
- Parameters:
permanent (bool | None, default: None) – For consistency, False raises an error, as soft delete is impossible.
- Returns:
When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.
- class lamindb.models.SQLRecord(*args, **kwargs)¶
An object that maps to a row in a SQL table in the database.
For the inherited SQLRecord class method definitions, see BaseSQLRecord.
Every SQLRecord is a data model that comes with a registry in form of a SQL table in your database.
Sub-classing SQLRecord creates a new registry while instantiating a SQLRecord creates a new object.
Example:
from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()
SQLRecord’s metaclass is Registry.
SQLRecord inherits from Django’s Model class. Why does LaminDB call it SQLRecord and not Model? The term SQLRecord can’t lead to confusion with statistical, machine learning or biological models.
- is_locked: bool¶
Whether the object is locked for edits.
- restore()¶
Restore from trash onto the main branch.
Does not restore descendant objects if the object is HasType with is_type = True.
- Return type:
None
- delete(permanent=None, **kwargs)¶
Delete object.
If object is HasType with is_type = True, deletes all descendant objects, too.
- Parameters:
permanent (bool | None, default: None) – Whether to permanently delete the object (skips trash). If None, performs soft delete if the object is not already in the trash.
- Returns:
When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.
Examples
For any SQLRecord object sqlrecord, call:
sqlrecord.delete()
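The interplay between soft delete, permanent delete, and restore() can be pictured with a small in-memory model. This is only a sketch: ToyRecord and its in_trash and exists flags are hypothetical, and it assumes that calling delete() with permanent=None on an object already in the trash removes it permanently, which the parameter description suggests but does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyRecord:
    """In-memory stand-in for a record; both flags are hypothetical."""

    name: str
    in_trash: bool = False
    exists: bool = True

    def delete(self, permanent: Optional[bool] = None) -> None:
        # permanent=True skips the trash; permanent=None soft-deletes first
        # and (assumed here) hard-deletes once the record is already trashed.
        if permanent or (permanent is None and self.in_trash):
            self.exists = False
        else:
            self.in_trash = True

    def restore(self) -> None:
        # Move the record out of the trash again.
        self.in_trash = False


record = ToyRecord("my experiment")
record.delete()   # soft delete: moved to trash, still exists
record.restore()  # back out of the trash
```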
- class lamindb.models.Registry(name, bases, attrs, **kwargs)¶
Metaclass for SQLRecord.
Each Registry object is a SQLRecord class and corresponds to a table in the metadata SQL database.
You work with Registry objects whenever you use class methods of SQLRecord.
You call any subclass of SQLRecord a “registry” and their objects “records”. A SQLRecord object corresponds to a row in the SQL table.
If you want to create a new registry, you sub-class SQLRecord.
Example:
from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()
Note: Registry inherits from Django’s ModelBase.
- class lamindb.models.QuerySet(model=None, query=None, using=None, hints=None)¶
Sets of records returned by queries.
Implements additional filtering capabilities.
See also
Examples
>>> ULabel(name="my label").save()
>>> queryset = ULabel.filter(name="my label")
>>> queryset  # an instance of QuerySet
- get(idlike=None, **expressions)¶
Query a single record. Raises error if there are more or none.
- Return type:
Mixins for registries¶
- class lamindb.models.IsVersioned(*args, **kwargs)¶
Base class for versioned models.
- property stem_uid: str¶
Universal id characterizing the version family.
The full uid of a record is obtained via concatenating the stem uid and version information:
stem_uid = random_base62(n_char)  # a random base62 sequence of length 12 (transform) or 16 (artifact, collection)
version_uid = "0000"  # an auto-incrementing 4-digit base62 number
uid = f"{stem_uid}{version_uid}"  # concatenate the stem_uid & version_uid
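The composition above can be tried out with a self-contained sketch. The alphabet order and the helper encode_base62 are assumptions made for illustration (the library's own base62 alphabet and helpers may differ); only the overall scheme of a random stem followed by a 4-character version suffix comes from the description above.

```python
import secrets
import string

# Assumed alphabet order; the library's actual base62 alphabet may differ.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def random_base62(n_char: int) -> str:
    """Return a random base62 string of length n_char."""
    return "".join(secrets.choice(BASE62) for _ in range(n_char))


def encode_base62(number: int, width: int = 4) -> str:
    """Encode a non-negative integer as a zero-padded base62 string."""
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits)).rjust(width, BASE62[0])


stem_uid = random_base62(16)                 # 16 characters, as for artifacts
first_uid = stem_uid + encode_base62(0)      # first version in the family
second_uid = stem_uid + encode_base62(1)     # next version shares the stem
```

Two versions of the same object then differ only in their last four characters, which is what makes the stem usable as a family identifier.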
- property version: str¶
The version of an object.
Defines version of an object within a family of objects characterized by the same stem_uid.
Returns .version_tag if set, otherwise the last 4 characters of the uid.
- class lamindb.models.HasType(*args, **kwargs)¶
Mixin for registries that have a hierarchical type assigned.
Such registries have a .type foreign key pointing to themselves.
A type hence allows hierarchically grouping records under types.
For instance, using the example of ln.Record:
experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment1 = ln.Record(name="Experiment 1", type=experiment_type).save()
experiment2 = ln.Record(name="Experiment 2", type=experiment_type).save()
- query_types()¶
Query types of a record recursively.
While .type retrieves the type, this method retrieves all super types of that type:
# Create type hierarchy
type1 = model_class(name="Type1", is_type=True).save()
type2 = model_class(name="Type2", is_type=True, type=type1).save()
type3 = model_class(name="Type3", is_type=True, type=type2).save()

# Create a record with type3
record = model_class(name=f"{model_name}3", type=type3).save()

# Query super types
super_types = record.query_types()
assert super_types[0] == type3
assert super_types[1] == type2
assert super_types[2] == type1
- Return type:
SQLRecordList
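The traversal described for query_types() amounts to walking up a chain of self-referencing .type links. A minimal in-memory sketch, assuming a hypothetical ToyRecord class in place of a real registry (no database involved, and the real method returns a SQLRecordList rather than a plain list):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class ToyRecord:
    """Hypothetical stand-in for a HasType registry record."""

    name: str
    is_type: bool = False
    type: Optional["ToyRecord"] = None  # self-referencing link to the parent type

    def query_types(self) -> "list[ToyRecord]":
        """Collect the direct type first, then each super type up to the root."""
        super_types = []
        current = self.type
        while current is not None:
            super_types.append(current)
            current = current.type
        return super_types


type1 = ToyRecord("Type1", is_type=True)
type2 = ToyRecord("Type2", is_type=True, type=type1)
type3 = ToyRecord("Type3", is_type=True, type=type2)
record = ToyRecord("Record3", type=type3)
super_types = record.query_types()
```

The resulting order matches the assertions in the example above: the closest type comes first and the root type last.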
- class lamindb.models.HasParents¶
Base class for hierarchical registries (ontologies).
- view_parents(field=None, with_children=False, distance=5)¶
View parents in an ontology.
- Parameters:
field (str | DeferredAttribute | None, default: None) – Field to display on graph.
with_children (bool, default: False) – Whether to also show children.
distance (int, default: 5) – Maximum distance still shown.
Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
Examples
>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_parents()
>>> record.view_parents(with_children=True)
- view_children(field=None, distance=5)¶
View children in an ontology.
- Parameters:
field (str | DeferredAttribute | None, default: None) – Field to display on graph.
distance (int, default: 5) – Maximum distance still shown.
Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
Examples
>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_children()
>>> record.view_parents(with_children=True)
- class lamindb.models.CanCurate¶
Base class providing SQLRecord-based validation.
- inspect(field=None, *, mute=False, organism=None, source=None, from_source=True, strict_source=False)¶
Inspect if values are mappable to a field.
Being mappable means that an exact match exists.
- Parameters:
values (list[str] | Series | array) – Values that will be checked against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute (bool, default: False) – Whether to mute logging.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to inspect against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
Note: this parameter won’t affect validation against public sources.
- Return type:
bionty.base.dev.InspectResult
See also
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
- validate(field=None, *, mute=False, organism=None, source=None, strict_source=False)¶
Validate values against existing values of a string field.
Note that this is strict validation: it only asserts exact matches.
- Parameters:
values (list[str] | Series | array) – Values that will be validated against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute (bool, default: False) – Whether to mute logging.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
Note: this parameter won’t affect validation against public sources.
- Return type:
ndarray
- Returns:
A vector of booleans indicating if an element is validated.
See also
Example:
import bionty as bt

bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True, True, False, False])
- from_values(field=None, create=False, organism=None, source=None, standardize=True, from_source=True, mute=False)¶
Bulk create validated records by parsing values for an identifier (such as a name or an id).
- Parameters:
values (list[str] | Series | array) – A list of values for an identifier, e.g. ["name1", "name2"].
field (str | DeferredAttribute | None, default: None) – A SQLRecord field to look up, e.g., bt.CellMarker.name.
create (bool, default: False) – Whether to create records if they don’t exist.
organism (SQLRecord | str | None, default: None) – A bionty.Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record to validate against to create records for.
standardize (bool, default: True) – Whether to standardize synonyms in the values.
from_source (bool, default: True) – Whether to create records from public source.
mute (bool, default: False) – Whether to mute logging.
- Return type:
SQLRecordList
- Returns:
A list of validated records. For bionty registries, also returns knowledge-coupled records.
Notes
For more info, see tutorial: Manage biological ontologies.
Example:
import bionty as bt
import lamindb as ln

# Bulk create from non-validated values logs warnings & returns an empty list
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"])
assert len(ulabels) == 0

# Bulk create records from validated values returns the corresponding existing records
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], create=True).save()
assert len(ulabels) == 3

# Bulk create records from public reference
bt.CellType.from_values(["T cell", "B cell"]).save()
- standardize(field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, from_source=True, keep='first', synonyms_field='synonyms', organism=None, source=None, strict_source=False)¶
Maps input synonyms to standardized names.
- Parameters:
values (Iterable) – Identifiers that will be standardized.
field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.
return_field (str | DeferredAttribute | None, default: None) – The field to return. Defaults to field.
return_mapper (bool, default: False) – If True, returns {input_value: standardized_name}.
case_sensitive (bool, default: False) – Whether the mapping is case sensitive.
mute (bool, default: False) – Whether to mute logging.
from_source (bool, default: True) – Whether to standardize from public source. Defaults to True for BioRecord registries.
keep (Literal['first', 'last', False], default: 'first') – When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated:
- "first": returns the first mapped standardized name
- "last": returns the last mapped standardized name
- False: returns all mapped standardized names.
When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.
When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.
synonyms_field (str, default: 'synonyms') – A field containing the concatenated synonyms.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
Note: this parameter won’t affect validation against public sources.
- Return type:
list[str] | dict[str, str]
- Returns:
If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
See also
add_synonym() – Add synonyms.
remove_synonym() – Remove synonyms.
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
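The effect of keep is easiest to see on a small in-memory table. The sketch below is a toy re-implementation over (name, synonyms) pairs, not the library's algorithm; toy_standardize, the ambiguous records NameA/NameB, and the assumption that BRCA2 carries only the synonym FANCD1 are all illustrative.

```python
def toy_standardize(values, records, keep="first", case_sensitive=False):
    """Map synonyms to standardized names.

    `records` is a list of (name, synonyms) pairs, where synonyms is a
    pipe-separated string like the `synonyms` field shown in this reference.
    """

    def norm(text):
        return text if case_sensitive else text.lower()

    names = {norm(name): name for name, _ in records}
    by_synonym = {}
    for name, synonyms in records:
        for synonym in filter(None, synonyms.split("|")):
            by_synonym.setdefault(norm(synonym), []).append(name)

    result = []
    for value in values:
        key = norm(value)
        if key in names:
            result.append(names[key])   # already a standardized name
        elif key in by_synonym:
            hits = by_synonym[key]
            if keep == "first":
                result.append(hits[0])
            elif keep == "last":
                result.append(hits[-1])
            else:                       # keep=False: nested list for duplicates
                result.append(hits if len(hits) > 1 else hits[0])
        else:
            result.append(value)        # unmappable values pass through unchanged
    return result


genes = [("A1CF", ""), ("A1BG", ""), ("BRCA2", "FANCD1")]
standardized = toy_standardize(["A1CF", "A1BG", "FANCD1", "FANCD20"], genes)

# Hypothetical records sharing one synonym, to show the effect of `keep`.
ambiguous = [("NameA", "alias"), ("NameB", "alias")]
all_hits = toy_standardize(["alias"], ambiguous, keep=False)
```

With keep=False the ambiguous synonym yields a nested list, which corresponds to the note above about nested lists in case of duplicates.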
- add_synonym(synonym, force=False, save=None)¶
Add synonyms to a record.
- Parameters:
synonym (str | list[str] | Series | array) – The synonyms to add to the record.
force (bool, default: False) – Whether to add synonyms even if they are already synonyms of other records.
save (bool | None, default: None) – Whether to save the record to the database.
See also
remove_synonym() – Remove synonyms.
Example:
import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# add a synonym
record.add_synonym("T cells")
record.synonyms
#> "T cells|T-cell|T-lymphocyte|T lymphocyte"
- remove_synonym(synonym)¶
Remove synonyms from a record.
- Parameters:
synonym (str | list[str] | Series | array) – The synonym values to remove.
See also
add_synonym() – Add synonyms.
Example:
import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# remove a synonym
record.remove_synonym("T-cell")
record.synonyms
#> "T lymphocyte|T-lymphocyte"
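As the examples suggest, synonyms are kept as one pipe-separated string. A minimal sketch of adding to and removing from such a string, assuming hypothetical helpers toy_add_synonym and toy_remove_synonym; note that this sketch appends new synonyms at the end, whereas the ordering produced by the library (see the add_synonym() example) can differ.

```python
def toy_add_synonym(synonyms: str, new: str) -> str:
    """Append `new` to a pipe-separated synonyms string unless already present."""
    items = [s for s in synonyms.split("|") if s]
    if new not in items:
        items.append(new)
    return "|".join(items)


def toy_remove_synonym(synonyms: str, old: str) -> str:
    """Drop `old` from a pipe-separated synonyms string."""
    return "|".join(s for s in synonyms.split("|") if s and s != old)


start = "T-cell|T lymphocyte|T-lymphocyte"
added = toy_add_synonym(start, "T cells")
removed = toy_remove_synonym(start, "T-cell")
```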
- set_abbr(value)¶
Set value for abbr field and add to synonyms.
- Parameters:
value (str) – A value for an abbreviation.
See also
Example:
import bionty as bt

# save an experimental factor record
scrna = bt.ExperimentalFactor.from_source(name="single-cell RNA sequencing").save()
assert scrna.abbr is None
assert scrna.synonyms == "single-cell RNA-seq|single-cell transcriptome sequencing|scRNA-seq|single cell RNA sequencing"

# set abbreviation
scrna.set_abbr("scRNA")
assert scrna.abbr == "scRNA"

# synonyms are updated
assert scrna.synonyms == "scRNA|single-cell RNA-seq|single cell RNA sequencing|single-cell transcriptome sequencing|scRNA-seq"
- class lamindb.models.TracksRun(*args, **kwargs)¶
Base class tracking latest run, creating user, and created_at timestamp.
- created_by_id¶
- run_id¶
- class lamindb.models.TracksUpdates(*args, **kwargs)¶
Base class tracking previous runs and updated_at timestamp.
Managers¶
- class lamindb.models.FeatureManager(sqlrecord)¶
Feature manager.
- property slots: dict[str, Schema]¶
Features by schema slot.
Example:
artifact.features.slots
#> {'var': <Schema: var>, 'obs': <Schema: obs>}
- describe(return_str=False)¶
Pretty print features.
This is what artifact.describe() calls under the hood.
- Return type:
str | None
- get_values(external_only=False)¶
Get features as a dictionary.
Includes annotation with internal and external feature values.
- Parameters:
external_only (bool, default: False) – If True, only return external feature annotations.
- Return type:
dict[str, Any]
- add_values(values, feature_field=FieldAttr(Feature.name), schema=None)¶
Add values for features.
- Parameters:
values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.
feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.
schema (Schema, default: None) – Schema to validate against.
- Return type:
None
- set_values(values, feature_field=FieldAttr(Feature.name), schema=None)¶
Set values for features.
Like add_values, but first removes all existing external feature annotations.
- Parameters:
values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.
feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.
schema (Schema, default: None) – Schema to validate against.
- Return type:
None
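The difference between add_values and set_values can be sketched with plain dictionaries. This toy model assumes that a feature can accumulate several values under add (an assumption, not confirmed by the text above) and that set starts from an empty annotation state; toy_add_values and toy_set_values are hypothetical names, and no validation against a schema is modeled.

```python
from typing import Any


def toy_add_values(annotations: dict[str, set], values: dict[str, Any]) -> dict[str, set]:
    """Return a copy of `annotations` with `values` added; None values are skipped."""
    updated = {feature: set(vals) for feature, vals in annotations.items()}
    for feature, value in values.items():
        if value is None:
            continue
        updated.setdefault(feature, set()).add(value)
    return updated


def toy_set_values(annotations: dict[str, set], values: dict[str, Any]) -> dict[str, set]:
    """Like toy_add_values, but discards all existing annotations first."""
    return toy_add_values({}, values)


existing = {"temperature": {21.6}}
after_add = toy_add_values(existing, {"species": "human", "temperature": 22.0, "note": None})
after_set = toy_set_values(existing, {"species": "human"})
```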
- remove_values(feature=None, *, value=None)¶
Remove values for features.
- Parameters:
- Return type:
None
- class lamindb.models.LabelManager(sqlrecord)¶
Label manager.
This allows managing untyped labels (ULabel) and arbitrary typed labels (e.g., CellLine) and associating labels with features.
- describe(return_str=True)¶
Describe the labels.
- Return type:
str
- add(records, feature=None)¶
Add one or several labels and associate them with a feature.
- get(feature, mute=False, flat_names=False)¶
Get labels given a feature.
- add_from(data, transfer_logs=None)¶
Add labels from an artifact or collection to another artifact or collection.
- Return type:
None
Examples
import lamindb as ln
import pandas as pd

artifact1 = ln.Artifact(pd.DataFrame(index=[0, 1])).save()
artifact2 = ln.Artifact(pd.DataFrame(index=[2, 3])).save()
records = ln.Record.from_values(["Label1", "Label2"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
artifact1.records.set(labels)
artifact2.labels.add_from(artifact1)
- class lamindb.models.QueryManager(*args, **kwargs)¶
Manage queries through fields.
Examples
Populate the .parents ManyToMany relationship (a QueryManager):
ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
label1 = ln.Record.get(name="Label1")
label1.parents.set(labels)
Convert all linked parents to a DataFrame:
label1.parents.to_dataframe()
- to_list(field=None)¶
Populate a list.
- to_dataframe(**kwargs)¶
Convert to DataFrame.
For **kwargs, see lamindb.models.QuerySet.to_dataframe().
- search(string, **kwargs)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field – The field or fields to search. Search all string fields by default.
limit – Maximum amount of top results to return.
case_sensitive – Whether the match is case sensitive.
- Returns:
A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.
See also
filter()
lookup()
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- lookup(field=None, **kwargs)¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.
return_field – The field to return. If None, returns the whole record.
keep – When multiple records are found for a lookup, how to return the records.
- "first": return the first record.
- "last": return the last record.
- False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
search()
Examples
Look up via auto-complete on .:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- get_queryset()¶
- class lamindb.models.RelatedManager(*args, **kwargs)¶
Manager for many-to-many and reverse foreign key relationships.
Provides relationship manipulation methods.
See also
Examples
Populate the .parents ManyToMany relationship (a RelatedManager):
ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
label1 = ln.Record.get(name="Label1")
label1.parents.set(labels)
Convert all linked parents to a DataFrame:
label1.parents.to_dataframe()
Remove a parent label:
label1.parents.remove(label2)
Clear all parent labels:
label1.parents.clear()
- add(*objs, bulk=True)¶
Add objects to the relationship.
- Return type:
None
- set(objs, *, bulk=True, clear=False)¶
Set the relationship to the specified objects.
- Return type:
None
- remove(*objs, bulk=True)¶
Remove objects from the relationship.
- Return type:
None
- clear()¶
Remove all objects from the relationship.
- Return type:
None
Annotations of objects¶
Artifact, run, and collection annotations can be conditioned on features.
Besides linking categorical data, you can also link simple data types
by virtue of the JsonValue model.
- class lamindb.models.JsonValue(*args, **kwargs)¶
JSON values for annotating artifacts and runs.
Categorical values are stored in their respective registries:
ULabel, CellType, etc.
Unlike for ULabel, in JsonValue, values are grouped by features and not by an ontological hierarchy.
- value: Any¶
The JSON-like value.
- hash: str¶
Value hash.
- classmethod get_or_create(feature, value)¶
Annotating artifacts.
- class lamindb.models.ArtifactArtifact(created_at, created_by, run, id, artifact, value, feature)¶
- class lamindb.models.ArtifactJsonValue(created_at, created_by, run, id, artifact, jsonvalue)¶
- class lamindb.models.ArtifactProject(created_at, created_by, run, id, artifact, project, feature)¶
- class lamindb.models.ArtifactRecord(created_at, created_by, run, id, artifact, record, feature)¶
- class lamindb.models.ArtifactReference(created_at, created_by, run, id, artifact, reference, feature)¶
- class lamindb.models.ArtifactRun(created_at, created_by, id, artifact, run, feature)¶
- class lamindb.models.ArtifactSchema(created_at, created_by, run, id, artifact, schema, slot, feature_ref_is_semantic)¶
- slot: str | None¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
- feature_ref_is_semantic: bool | None¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
- class lamindb.models.ArtifactULabel(created_at, created_by, run, id, artifact, ulabel, feature)¶
- class lamindb.models.ArtifactUser(created_at, created_by, run, id, artifact, user, feature)¶
Annotating collections.
- class lamindb.models.CollectionArtifact(created_at, created_by, run, id, collection, artifact)¶
- collection: Collection¶
- class lamindb.models.CollectionProject(created_at, created_by, run, id, collection, project)¶
- collection: Collection¶
- class lamindb.models.CollectionReference(created_at, created_by, run, id, collection, reference)¶
- collection: Collection¶
- class lamindb.models.CollectionULabel(created_at, created_by, run, id, collection, ulabel, feature)¶
- collection: Collection¶
- class lamindb.models.CollectionRecord(created_at, created_by, run, id, collection, record, feature)¶
- collection: Collection¶
Annotating runs.
- class lamindb.models.RunJsonValue(id, run, jsonvalue, created_at, created_by)¶
- created_at: datetime¶
Time of creation of record.
- class lamindb.models.RunProject(id, run, project, created_at, created_by)¶
- created_at: datetime¶
Time of creation of record.
- class lamindb.models.RunULabel(id, run, ulabel, created_at, created_by)¶
- created_at: datetime¶
Time of creation of record.
- class lamindb.models.RunRecord(id, run, record, feature, created_at, created_by)¶
- created_at: datetime¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
Annotating transforms.
- class lamindb.models.TransformProject(created_at, created_by, run, id, transform, project)¶
- class lamindb.models.TransformReference(created_at, created_by, run, id, transform, reference)¶
- class lamindb.models.TransformULabel(created_at, created_by, run, id, transform, ulabel)¶
Building relationships among transforms.
- class lamindb.models.TransformTransform(id, successor, predecessor, config, created_at, created_by)¶
- config: dict | None¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
- created_at: datetime¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
Annotating features, blocks, and ulabels with projects.
- class lamindb.models.FeatureProject(created_at, created_by, run, id, feature, project)¶
- class lamindb.models.ULabelProject(created_at, created_by, run, id, ulabel, project)¶
- class lamindb.models.SchemaProject(created_at, created_by, run, id, schema, project)¶
- class lamindb.models.ProjectRecord(created_at, created_by, run, id, project, feature, record)¶
Building schemas.
- class lamindb.models.SchemaComponent(created_at, created_by, run, id, composite, component, slot)¶
- slot: str | None¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
Annotating references with records.
Record values¶
Record values work almost exactly like artifact and run annotations,
with the exception that JSON values are stored in RecordJson on a per-record basis
and not in JsonValue.
- class lamindb.models.RecordArtifact(id, record, feature, value)¶
- class lamindb.models.RecordCollection(id, record, feature, value)¶
- value: Collection¶
- class lamindb.models.RecordJson(id, record, feature, value)¶
- value: Any¶
A wrapper for a deferred-loading field. When the value is read from this object the first time, the query is executed.
- class lamindb.models.RecordProject(id, record, feature, value)¶
- class lamindb.models.RecordRecord(id, record, feature, value)¶
- class lamindb.models.RecordReference(id, record, feature, value)¶
- class lamindb.models.RecordRun(id, record, feature, value)¶
- class lamindb.models.RecordTransform(id, record, feature, value)¶
- class lamindb.models.RecordULabel(id, record, feature, value)¶
- class lamindb.models.RecordUser(id, record, feature, value)¶
Blocks¶
- class lamindb.models.Block(*args, **kwargs)¶
A root block for every registry that can appear at the top of the registry root block in the GUI.
- key: str¶
The key for which we want to create a block.
- projects: Project¶
Projects that annotate this block.
- class lamindb.models.ArtifactBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to an artifact.
- class lamindb.models.BranchBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a branch.
- class lamindb.models.CollectionBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a collection.
- collection: Collection¶
The collection to which the block is attached.
- class lamindb.models.FeatureBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a feature.
- class lamindb.models.ProjectBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a project.
- class lamindb.models.RecordBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a record.
- class lamindb.models.RunBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a run.
- class lamindb.models.SchemaBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a schema.
- class lamindb.models.SpaceBlock(*args, **kwargs)¶
An unstructured notes block that can be attached to a space.
Utils¶
- class lamindb.models.LazyArtifact(suffix, overwrite_versions, **kwargs)¶
Lazy artifact for streaming to auto-generated internal paths.
This is needed when it is desirable to stream to a lamindb auto-generated internal path and register the path as an artifact (see Artifact).
This object creates a real artifact on .save() with the provided arguments.
- Parameters:
suffix (str) – The suffix for the auto-generated internal path.
overwrite_versions (bool) – Whether to overwrite versions.
**kwargs – Keyword arguments for the artifact to be created.
Examples
Create a lazy artifact, write to the path and save to get a real artifact:
import lamindb as ln
import numpy as np
import zarr

lazy = ln.Artifact.from_lazy(suffix=".zarr", overwrite_versions=True, key="mydata.zarr")
zarr.open(lazy.path, mode="w")["test"] = np.array(["test"])  # stream to the path
artifact = lazy.save()
- class lamindb.models.InspectResult(validated_df, validated, nonvalidated, frac_validated, n_empty, n_unique)¶
Result of inspect.
An InspectResult object is returned by calls such as inspect().
- property df: DataFrame¶
A DataFrame indexed by values with a boolean __validated__ column.
- property validated: list[str]¶
List of items that validate() validated successfully.
- property non_validated: list[str]¶
List of items that validate() could not validate.
This list can be used to remove any non-validated values such as genes that do not map against the specified source.
- property frac_validated: float¶
Fraction of items that were validated.
- property n_empty: int¶
Number of empty items.
- property n_unique: int¶
Number of unique items.
- property synonyms_mapper: dict¶
Synonyms mapper dictionary.
Such a dictionary maps the actual values to their synonyms which can be used to rename values accordingly.
Examples
>>> markers = pd.DataFrame(index=["KI67", "CCR7"])
>>> synonyms_mapper = bt.CellMarker.standardize(markers.index, return_mapper=True)
>>> synonyms_mapper
{'KI67': 'Ki67', 'CCR7': 'Ccr7'}
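To make the fields above concrete, here is a toy inspection over plain Python lists. `ToyInspectResult` and `toy_inspect` are hypothetical names, and the exact definitions used here are assumptions rather than lamindb's implementation: empty strings and `None` count as empty, and the fraction is taken over unique non-empty values. The last lines show one way a mapper like the one in the example above could be used to rename values.

```python
from dataclasses import dataclass


@dataclass
class ToyInspectResult:
    """Hypothetical stand-in for the fields listed above, not lamindb's class."""

    validated: list
    non_validated: list
    frac_validated: float
    n_empty: int
    n_unique: int


def toy_inspect(values, reference):
    # Assumption: empty strings and None count as empty items.
    non_empty = [v for v in values if v not in ("", None)]
    unique = list(dict.fromkeys(non_empty))  # de-duplicate, keep first-seen order
    validated = [v for v in unique if v in reference]
    non_validated = [v for v in unique if v not in reference]
    frac = len(validated) / len(unique) if unique else 0.0
    return ToyInspectResult(
        validated=validated,
        non_validated=non_validated,
        frac_validated=frac,
        n_empty=len(values) - len(non_empty),
        n_unique=len(unique),
    )


values = ["Ki67", "CCR7", "", "Ki67", "KI67"]
reference = {"Ki67", "Ccr7"}
result = toy_inspect(values, reference)

# Rename non-empty values with the mapper; unmapped values stay unchanged.
synonyms_mapper = {"KI67": "Ki67", "CCR7": "Ccr7"}
standardized = [synonyms_mapper.get(v, v) for v in values if v]
```

After renaming, every non-empty value in this toy input is present in the reference set, which is the typical reason to apply a mapper before validating again.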
- class lamindb.models.ValidateFields¶
- class lamindb.models.SchemaOptionals(schema)¶
Manage and access optional features in a schema.
- get_uids()¶
Get the uids of the optional features.
Does not need an additional query to the database, while get() does.
- Return type:
list[str]
- set(features)¶
Set the optional features (overwrites whichever features are currently optional).
- Return type:
None
- remove(features)¶
Make one or multiple features required by removing them from the set of optional features.
- Return type:
None
- add(features)¶
Make one or multiple features optional by adding them to the set of optional features.
- Return type:
None
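The difference between set(), add(), and remove() can be sketched with a toy model that tracks the uids of optional features. `ToyOptionals` is a hypothetical class written for this illustration. It operates on plain uid strings, whereas the real methods take features, and it does not touch a database.

```python
class ToyOptionals:
    """Hypothetical model of the set/add/remove semantics, not lamindb's implementation."""

    def __init__(self):
        self._uids = []  # uids of features currently marked optional

    def get_uids(self):
        return list(self._uids)

    def set(self, uids):
        # Overwrite whatever is currently optional.
        self._uids = list(dict.fromkeys(uids))

    def add(self, uids):
        # Mark additional features optional, skipping ones already present.
        for uid in uids:
            if uid not in self._uids:
                self._uids.append(uid)

    def remove(self, uids):
        # Removed features are no longer optional, so they become required.
        drop = frozenset(uids)
        self._uids = [uid for uid in self._uids if uid not in drop]


optionals = ToyOptionals()
optionals.set(["feat_a", "feat_b"])
optionals.add(["feat_c", "feat_a"])  # feat_a is already optional, so it is not duplicated
optionals.remove(["feat_b"])         # feat_b becomes required again
current = optionals.get_uids()
```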
- class lamindb.models.query_set.BiontyDB(query_db, module_name)¶
Namespace for Bionty registries (Gene, CellType, Disease, etc.).
- class lamindb.models.query_set.PertdbDB(query_db, module_name)¶
Namespace for PertDB registries (Biologic, Compound, etc.).