Entity collections

To group multiple entities together, mammos_entity provides the class EntityCollection.

import mammos_entity as me

EntityCollection basics

Entities can be passed as keyword arguments when creating the collection. In addition, the collection can have a description:

collection = me.EntityCollection(
    description="Some random test data.\n\nDescriptions can have multiple lines.",
    Tc=me.Tc([10, 100], "K"),
    Ms=me.Ms([50, 60], "A/m"),
)
collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([50., 60.]), unit='A / m'),
)

EntityCollection is not limited to storing Entity objects. It also accepts mammos_units.Quantity objects (the objects returned by Entity.quantity or Entity.q) and other data (lists, tuples, NumPy arrays, etc.). We refer to all of these objects as entity-likes. (Implementation detail: elements are not validated when they are added. Some later operations may fail or produce surprising results if unsuitable elements have been passed as entity-likes.)

Use Entity objects whenever you can. This is not always possible, however, because the ontology does not cover everything.

Accessing elements

We can access the entities in the collection in two different ways. First, we can access an entity by name using attribute access:

collection.Tc
CurieTemperature(value=[ 10. 100.], unit=K)

This method is limited to entity names that are valid Python identifiers. Furthermore, EntityCollection has a number of methods. If an entity has the same name as one of these methods, attribute access returns the method, not the entity. Therefore, we can also access entities using item access:

collection["Ms"]
SpontaneousMagnetization(value=[50. 60.], unit=A / m)
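
The name clash between elements and methods can be illustrated with a small stand-in class. This is only a minimal sketch of one common way to combine attribute and item access in Python. TinyCollection and its to_yaml placeholder are hypothetical and are not meant to reflect how EntityCollection is actually implemented.

```python
class TinyCollection:
    """Hypothetical stand-in, NOT the mammos_entity implementation."""

    def __init__(self, **elements):
        # Elements live in a private dict, separate from the class's methods.
        self._elements = dict(elements)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup failed,
        # so a method with the same name always takes precedence.
        elements = self.__dict__.get("_elements", {})
        if name in elements:
            return elements[name]
        raise AttributeError(name)

    def __getitem__(self, name):
        # Item access looks only at stored elements and never sees methods.
        return self._elements[name]

    def to_yaml(self):
        # Placeholder method used to provoke the name clash.
        return "pretend YAML"


toy = TinyCollection(Tc=[10, 100], to_yaml="an element named like a method")
print(toy.Tc)                 # [10, 100]
print(callable(toy.to_yaml))  # True: attribute access returns the method
print(toy["to_yaml"])         # an element named like a method
```

In this sketch, item access is the only way to reach the element called to_yaml, which mirrors the behaviour described above.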

In addition, the collection carries a description, which we can access as follows (we print it because our description spans multiple lines):

print(collection.description)
Some random test data.

Descriptions can have multiple lines.

Defining an entity with the name description is not allowed.

Adding or overwriting elements

Additional entities can be added at any later point by assigning a new attribute on the collection:

collection.A = [8e-12, 9e-12]
collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([50., 60.]), unit='A / m'),
    A=[8e-12, 9e-12],
)

Likewise, we can add entities using:

collection["B_ext"] = me.B(1, "T")

Both methods are generally equivalent. If you need an entity with the same name as one of the methods of EntityCollection, only the latter way of adding it works. You should avoid reusing method names if you can.

Our collection now carries the following elements:

collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([50., 60.]), unit='A / m'),
    A=[8e-12, 9e-12],
    B_ext=Entity(ontology_label='MagneticFluxDensity', value=1.0, unit='T'),
)

If an entity with the given name already exists, it is overwritten. Both attribute assignment and item assignment can be used for this.

First, we replace the entity Ms with a quantity Ms:

collection.Ms = me.Ms([400, 500], "kA/m").q
collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=<Quantity [400., 500.] kA / m>,
    A=[8e-12, 9e-12],
    B_ext=Entity(ontology_label='MagneticFluxDensity', value=1.0, unit='T'),
)

Second, we overwrite B_ext with a new entity:

collection["B_ext"] = me.B([1, 1.2], "T")
collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=<Quantity [400., 500.] kA / m>,
    A=[8e-12, 9e-12],
    B_ext=Entity(ontology_label='MagneticFluxDensity', value=array([1. , 1.2]), unit='T'),
)

Note

  • Entities are immutable. To change an entity in a collection, replace it with a new entity using the mechanism shown above.

  • You can use the same mechanism to add an entity from one collection to another.

Checking if an element is in the collection

To check whether an entity-like with a given name exists in the collection, use:

"Ms" in collection
True
"Js" in collection
False

Iterating over all entities in the collection

We can iterate over all entity-likes in the collection. Each iteration step yields a tuple (name, entity_like).

In the following example we print name, entity_like and type of entity_like for each element in the collection:

for name, entity_like in collection:
    print(f"{name}\t{entity_like} of type '{type(entity_like).__name__}'")
Tc	CurieTemperature(value=[ 10. 100.], unit=K) of type 'Entity'
Ms	[400. 500.] kA / m of type 'Quantity'
A	[8e-12, 9e-12] of type 'list'
B_ext	MagneticFluxDensity(value=[1.  1.2], unit=T) of type 'Entity'

Removing elements

We can remove elements from the collection using del:

del collection.B_ext  # equivalent alternative: del collection["B_ext"]
collection
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=<Quantity [400., 500.] kA / m>,
    A=[8e-12, 9e-12],
)

Saving to file

EntityCollections can be stored as YAML, CSV or HDF5 files:

collection.to_yaml("example.yaml")

More details are provided in the respective notebooks.

Conversion to and from dataframe

If all entities in the collection are one-dimensional and have the same length, the collection can be converted to a pandas dataframe:

data = collection.to_dataframe()
data
      Tc     Ms             A
0   10.0  400.0  8.000000e-12
1  100.0  500.0  9.000000e-12

By default, only the name of the entity in the collection is used as header.

Units can optionally be included in the column headers. The ontology information, however, is always lost in the dataframe:

collection.to_dataframe(include_units=True)
   Tc (K)  Ms (kA / m)             A
0    10.0        400.0  8.000000e-12
1   100.0        500.0  9.000000e-12

It is also possible to convert a dataframe back to an EntityCollection. The dataframe does not carry enough metadata (ontology information is missing, units are not always present). Therefore, the additional metadata has to be provided as a dictionary. When starting from an EntityCollection the metadata dictionary can be created as follows:

metadata = collection.metadata()
metadata
{'description': 'Some random test data.\n\nDescriptions can have multiple lines.',
 'Tc': {'ontology_label': 'CurieTemperature', 'unit': 'K', 'description': ''},
 'Ms': {'unit': 'kA / m'},
 'A': {}}

Ignoring the special key description, each key corresponds to one entity-like in the collection. The values are dictionaries whose keys depend on the type of the entity-like:

  • for entities it has keys ontology_label, unit and description

  • for quantities it has key unit

  • otherwise it is empty
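
The rules in this list can be turned into a small check that reports what each metadata entry describes. The following is a minimal sketch in plain Python. It reuses the metadata dictionary printed above, and the helper kind_of is a hypothetical name introduced only for this illustration, not part of mammos_entity.

```python
# Metadata dictionary copied from the output shown above.
metadata = {
    "description": "Some random test data.\n\nDescriptions can have multiple lines.",
    "Tc": {"ontology_label": "CurieTemperature", "unit": "K", "description": ""},
    "Ms": {"unit": "kA / m"},
    "A": {},
}


def kind_of(entry: dict) -> str:
    """Classify one metadata entry by the keys it contains (hypothetical helper)."""
    if "ontology_label" in entry:
        return "entity"
    if "unit" in entry:
        return "quantity"
    return "other"


# Skip the special top-level key "description", which is not an element.
kinds = {
    name: kind_of(entry) for name, entry in metadata.items() if name != "description"
}
print(kinds)  # {'Tc': 'entity', 'Ms': 'quantity', 'A': 'other'}
```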

We can now create a new entity collection using the dataframe data and the metadata dictionary:

me.EntityCollection.from_dataframe(data, metadata)
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=<Quantity [400., 500.] kA / m>,
    A=array([8.e-12, 9.e-12]),
)

Metadata lookup is done by column name. Therefore, only dataframes without units in the column headers are supported (because keys in the metadata dictionary do not contain units). If you have a dataframe with incompatible headers, e.g. headers that include units, you first need to align the column names with the metadata keys.
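
One possible way to align headers is to strip a trailing unit in parentheses from each column name. The sketch below works on plain header strings in the format produced by to_dataframe(include_units=True) above. The helper strip_unit_suffix is a hypothetical name, and the regular expression assumes that units never contain nested parentheses.

```python
import re


def strip_unit_suffix(column: str) -> str:
    """Remove a trailing ' (unit)' part from a column header, if present."""
    return re.sub(r"\s*\([^()]*\)\s*$", "", column)


# Headers as shown in the dataframe with units above.
headers = ["Tc (K)", "Ms (kA / m)", "A"]
rename_map = {old: strip_unit_suffix(old) for old in headers}
print(rename_map)  # {'Tc (K)': 'Tc', 'Ms (kA / m)': 'Ms', 'A': 'A'}
```

A mapping like rename_map can be passed to pandas via DataFrame.rename(columns=rename_map) before calling from_dataframe.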

We can modify the dataframe and/or metadata dictionary before creating the EntityCollection. As an example, we add the missing ontology information for Ms to the metadata and scale data for the A column by 2:

data["A"] *= 2
metadata["Ms"]["ontology_label"] = "SpontaneousMagnetization"
me.EntityCollection.from_dataframe(data, metadata)
EntityCollection(
    description='Some random test data.\n\nDescriptions can have multiple lines.',
    Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400., 500.]), unit='kA / m'),
    A=array([1.6e-11, 1.8e-11]),
)

Conversion to a dataframe can be useful, for example, to combine two EntityCollections in advanced ways. More details are given in this tutorial.

Nested entity collections

In addition to storing entity-like data (entities, quantities, raw numbers), EntityCollection can also contain other (nested) entity collections:

sample = me.EntityCollection(
    edge_length=me.Entity("Length", 1, "mm"),
    material_properties=collection,
)
sample
EntityCollection(
    description='',
    edge_length=Entity(ontology_label='Length', value=1.0, unit='mm'),
    material_properties=EntityCollection(
        description='Some random test data.\n\nDescriptions can have multiple lines.',
        Tc=Entity(ontology_label='CurieTemperature', value=array([ 10., 100.]), unit='K'),
        Ms=<Quantity [400., 500.] kA / m>,
        A=[8e-12, 9e-12],
    ),
)

Nested collections can only be saved to YAML and HDF5 and cannot be converted to a pandas dataframe.