Useful operations#
This notebook contains a collection of common use cases. Each section is self-contained.
Converting unformatted files to a mammos-entity format#
Users may wish to update other files to the mammos-entity formats in order to make use of the additional functionality.
Details of the file formats are explained in the EntityCollection API reference.
Converting your “raw” data into the mammos formats involves three main steps:
1. Load your file into Python (e.g. with pandas).
2. Create an Entity or Quantity for each column (by assigning the correct ontology term and/or units).
3. Export the result with one of the EntityCollection.to_<csv|hdf5|yaml> methods.
First let’s create a file so we can see an example of how to do the conversion. We will create the following structure:
1 10.0 1.6083568305976572 -16778187.088808443
1 9.0 1.6083393931987826 -15498304.121589921
...
This file is quite basic: there are no headers, no units, and no ontology information, and the columns are separated by a single space.
Only the person who wrote the file knows what each column contains.
In this example, the first column is the configuration type, the second column is the value of \(\mu_0 H_{\mathsf{ext}}\) in Tesla, the third column is the magnetic polarization in Tesla and the last column is the energy density in J/m\(^3\).
from pathlib import Path
import mammos_entity as me
import mammos_units as u
import pandas as pd
Path("example.dat").write_text("""\
1 10.0 1.6083568305976572 -16778187.088808443
1 9.0 1.6083393931987826 -15498304.121589921
1 8.0 1.6083184361075116 -14218436.37373519
1 7.0 1.608292941666901 -12938587.029585946
1 6.0 1.6082614950059932 -11658760.230932372
""")
224
We can use pandas to read the file into python:
df = pd.read_csv("example.dat", sep=" ", names=["configuration_type", "mu0_Hext", "J", "energy_density"])
df
| | configuration_type | mu0_Hext | J | energy_density |
|---|---|---|---|---|
| 0 | 1 | 10.0 | 1.608357 | -1.677819e+07 |
| 1 | 1 | 9.0 | 1.608339 | -1.549830e+07 |
| 2 | 1 | 8.0 | 1.608318 | -1.421844e+07 |
| 3 | 1 | 7.0 | 1.608293 | -1.293859e+07 |
| 4 | 1 | 6.0 | 1.608261 | -1.165876e+07 |
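Raw files are often aligned with a varying number of spaces or tabs, in which case `sep=" "` no longer splits the columns correctly. The following is a minimal sketch of reading such a file with a regular-expression separator. The irregular spacing in `raw` is made up for illustration; the numbers are the first two rows of the example file.

```python
import io

import pandas as pd

# Two rows of the example data, with irregular spacing (hypothetical layout).
raw = (
    "1  10.0   1.6083568305976572 -16778187.088808443\n"
    "1   9.0   1.6083393931987826 -15498304.121589921\n"
)
columns = ["configuration_type", "mu0_Hext", "J", "energy_density"]

# sep=r"\s+" treats any run of whitespace as a single separator.
df_ws = pd.read_csv(io.StringIO(raw), sep=r"\s+", names=columns)
print(df_ws)
```

Passing `names` tells pandas that the file has no header row, so the first line is read as data.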
To rewrite this in a mammos format, we need to associate each column with an entity, a quantity, or another Python object. This is also the time to do any data manipulation (such as changing units).
In this example we:
- Convert configuration type to a numpy array.
- Convert magnetic flux density (\(\mu_0 H_{\mathsf{ext}}\)) to the external magnetic field Entity using mammos_units for unit conversion.
- Convert magnetic polarization to the corresponding entity.
- Convert energy density to the corresponding entity.
We store everything in an EntityCollection.
# conversion from T to A/m requires using the `magnetic_flux_field` equivalency
Hext = (df["mu0_Hext"].to_numpy() * u.T).to("A/m", equivalencies=u.magnetic_flux_field())
collection = me.EntityCollection(
configuration_type=df["configuration_type"].to_numpy(),
H=me.Entity("ExternalMagneticField", Hext), # Hext is a Quantity, so we don't have to pass units
J=me.Entity("MagneticPolarisation", df["J"], "T"),
energy_density=me.Entity("EnergyDensity", df["energy_density"], "J/m^3"),
)
collection
EntityCollection(
description='',
configuration_type=array([1, 1, 1, 1, 1]),
H=Entity(ontology_label='ExternalMagneticField', value=array([7957747.15026276, 7161972.43523649, 6366197.72021021,
5570423.00518393, 4774648.29015766]), unit='A / m'),
J=Entity(ontology_label='MagneticPolarisation', value=array([1.60835683, 1.60833939, 1.60831844, 1.60829294, 1.6082615 ]), unit='T'),
energy_density=Entity(ontology_label='EnergyDensity', value=array([-16778187.08880844, -15498304.12158992, -14218436.37373519,
-12938587.02958595, -11658760.23093237]), unit='J / m3'),
)
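The conversion from tesla to A/m used above amounts to dividing by the vacuum permeability, \(H = B / \mu_0\). The sketch below reproduces that arithmetic in plain Python so the numbers in the output can be checked by hand. It assumes the classical value \(\mu_0 = 4\pi \times 10^{-7}\) T·m/A, which differs very slightly from the constant used by the units library, so the results agree only to within a small relative tolerance.

```python
import math

# Classical value of the vacuum permeability in T*m/A (assumption for this sketch).
MU0 = 4 * math.pi * 1e-7


def tesla_to_ampere_per_metre(b_tesla):
    """Convert a flux density mu0*H in tesla to a field strength H in A/m."""
    return b_tesla / MU0


h_values = [tesla_to_ampere_per_metre(b) for b in [10.0, 9.0, 8.0, 7.0, 6.0]]

# Compare with the first value of H in the collection shown above.
print(math.isclose(h_values[0], 7957747.15026276, rel_tol=1e-8))
```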
We can now write the collection in one of the mammos-entity formats, e.g. as CSV:
collection.to_csv("example.csv")
Looking at the file produced we can see the data is now in the correct format with the ontology information included:
print(Path("example.csv").read_text())
# mammos csv v3
,ExternalMagneticField,MagneticPolarisation,EnergyDensity
,,,
,https://w3id.org/emmo/domain/magnetic-materials#EMMO_da08f0d3-fe19-58bc-8fb6-ecc8992d5eb3,https://w3id.org/emmo#EMMO_74a096dd_cc83_4c7e_b704_0541620ff18d,https://w3id.org/emmo/domain/magnetic-materials#EMMO_56258d3a-f2ee-554e-af99-499dd8620457
,A / m,T,J / m3
configuration_type,H,J,energy_density
1,7957747.150262763,1.6083568305976572,-16778187.088808443
1,7161972.435236487,1.6083393931987826,-15498304.12158992
1,6366197.72021021,1.6083184361075116,-14218436.37373519
1,5570423.005183934,1.608292941666901,-12938587.029585946
1,4774648.290157658,1.6082614950059932,-11658760.230932372
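To make the layout of this file more concrete, here is a minimal sketch that parses it with the standard library only. It assumes the structure visible in the output above: one comment line, a row of ontology labels, a row that is empty in this example, a row of IRIs, a row of units, a row of column names, and then the data. This is an illustration of the format, not the reader provided by mammos-entity.

```python
import csv

text = """\
# mammos csv v3
,ExternalMagneticField,MagneticPolarisation,EnergyDensity
,,,
,https://w3id.org/emmo/domain/magnetic-materials#EMMO_da08f0d3-fe19-58bc-8fb6-ecc8992d5eb3,https://w3id.org/emmo#EMMO_74a096dd_cc83_4c7e_b704_0541620ff18d,https://w3id.org/emmo/domain/magnetic-materials#EMMO_56258d3a-f2ee-554e-af99-499dd8620457
,A / m,T,J / m3
configuration_type,H,J,energy_density
1,7957747.150262763,1.6083568305976572,-16778187.088808443
1,7161972.435236487,1.6083393931987826,-15498304.12158992
"""

# Skip the leading comment line, then split every remaining line on commas.
rows = list(csv.reader(text.splitlines()[1:]))
labels, _unused, iris, units, names = rows[:5]

# Map each column name to its ontology label and unit (empty string if absent).
labels_by_column = dict(zip(names, labels))
units_by_column = dict(zip(names, units))

# Collect the numeric data column by column.
data = [[float(value) for value in row] for row in rows[5:]]
columns = {name: [row[i] for row in data] for i, name in enumerate(names)}

print(units_by_column)
print(columns["H"])
```

The column configuration_type has no ontology label and no unit, which is why the first field of each metadata row is empty.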
Combining two EntityCollection objects#
import mammos_entity as me
import mammos_units as u
import pandas as pd
Appending to a collection#
We first create two different collections:
ec_1 = me.EntityCollection(x=[0, 5, 10] * u.mm, Tc=me.Tc([50, 53, 56], "K"))
ec_1
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
)
ec_2 = me.EntityCollection(x=[0, 10, 20] * u.mm, Ms=me.Ms([4e5, 6e5, 7e5], "A/m"))
ec_2
EntityCollection(
description='',
x=<Quantity [ 0., 10., 20.] mm>,
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
Adding an entity from one collection to another can be achieved by simply adding an attribute:
ec_1.Ms = ec_2.Ms
ec_1
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
More complex operations on entity collections often require a temporary conversion to a pandas dataframe. As an example, we show how to combine two entity collections using different pandas functions.
Combining with pandas – index-based#
First we create two collections:
ec_1 = me.EntityCollection(x=[0, 5, 10] * u.mm, Tc=me.Tc([50, 53, 56], "K"))
ec_1
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
)
ec_2 = me.EntityCollection(Ms=me.Ms([4e5, 6e5, 7e5], "A/m"))
ec_2
EntityCollection(
description='',
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
We can convert the collections to pandas dataframes and in addition extract the metadata:
data_1 = ec_1.to_dataframe(include_units=False)
metadata_1 = ec_1.metadata()
data_1
| | x | Tc |
|---|---|---|
| 0 | 0.0 | 50.0 |
| 1 | 5.0 | 53.0 |
| 2 | 10.0 | 56.0 |
metadata_1
{'description': '',
'x': {'unit': 'mm'},
'Tc': {'ontology_label': 'CurieTemperature', 'unit': 'K', 'description': ''}}
data_2 = ec_2.to_dataframe(include_units=False)
metadata_2 = ec_2.metadata()
data_2
| | Ms |
|---|---|
| 0 | 400000.0 |
| 1 | 600000.0 |
| 2 | 700000.0 |
metadata_2
{'description': '',
'Ms': {'ontology_label': 'SpontaneousMagnetization',
'unit': 'A / m',
'description': ''}}
We can now combine the columns. Pandas offers multiple different ways:
data_1.join(data_2)
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0.0 | 50.0 | 400000.0 |
| 1 | 5.0 | 53.0 | 600000.0 |
| 2 | 10.0 | 56.0 | 700000.0 |
pd.concat((data_1, data_2), axis=1)
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0.0 | 50.0 | 400000.0 |
| 1 | 5.0 | 53.0 | 600000.0 |
| 2 | 10.0 | 56.0 | 700000.0 |
data_combined = pd.merge(data_1, data_2, left_index=True, right_index=True)
data_combined
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0.0 | 50.0 | 400000.0 |
| 1 | 5.0 | 53.0 | 600000.0 |
| 2 | 10.0 | 56.0 | 700000.0 |
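The three calls give the same table here because both dataframes share the same index. Their defaults differ, however, which matters as soon as the indexes do not match. The sketch below rebuilds the two dataframes by hand (without mammos-entity) and compares the row counts when the second dataframe is one row shorter; the shortened dataframe is a made-up case for illustration.

```python
import pandas as pd

data_1 = pd.DataFrame({"x": [0.0, 5.0, 10.0], "Tc": [50.0, 53.0, 56.0]})
data_2 = pd.DataFrame({"Ms": [4e5, 6e5, 7e5]})

# With identical indexes all three approaches agree.
joined = data_1.join(data_2)
concatenated = pd.concat((data_1, data_2), axis=1)
merged = pd.merge(data_1, data_2, left_index=True, right_index=True)
all_equal = joined.equals(concatenated) and joined.equals(merged)

# Hypothetical case: the second dataframe lacks the last row.
shorter = data_2.iloc[:2]
n_join = len(data_1.join(shorter))  # left join: keeps all rows of data_1
n_concat = len(pd.concat((data_1, shorter), axis=1))  # outer join on the index
n_merge = len(pd.merge(data_1, shorter, left_index=True, right_index=True))  # inner join

print(all_equal, n_join, n_concat, n_merge)
```

`join` and `concat` keep the unmatched row and fill Ms with NaN, whereas `merge` drops it unless `how` is changed.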
To convert this dataframe back to an entity collection (e.g. to subsequently write it to a file with ontology information) we need to also merge the metadata. We can e.g. add the missing Ms key to metadata_1:
metadata_1["Ms"] = metadata_2["Ms"]
and can subsequently create a new collection:
me.EntityCollection.from_dataframe(data_combined, metadata_1)
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
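Assigning to metadata_1 changes that dictionary in place, and because the dataframes were created with include_units=False, nothing in the data itself would reveal two columns of the same name with different units. A minimal sketch of a more defensive alternative follows. It assumes metadata dictionaries shaped like the ones printed above, and `merge_metadata` is a hypothetical helper, not part of mammos-entity.

```python
import copy


def merge_metadata(first, second):
    """Return a new metadata dict with the entries of both inputs (hypothetical helper)."""
    combined = copy.deepcopy(first)
    for key, value in second.items():
        if key == "description":
            continue  # keep the description of the first collection
        if key in combined and combined[key] != value:
            raise ValueError(f"Conflicting metadata for {key!r}: {combined[key]} vs {value}")
        combined[key] = copy.deepcopy(value)
    return combined


metadata_1 = {
    "description": "",
    "x": {"unit": "mm"},
    "Tc": {"ontology_label": "CurieTemperature", "unit": "K", "description": ""},
}
metadata_2 = {
    "description": "",
    "Ms": {"ontology_label": "SpontaneousMagnetization", "unit": "A / m", "description": ""},
}

metadata_combined = merge_metadata(metadata_1, metadata_2)
print(list(metadata_combined))

# A shared column with a different unit is reported instead of silently merged.
try:
    merge_metadata(metadata_1, {"x": {"unit": "m"}})
except ValueError as error:
    print(error)
```

The returned dictionary can be passed to from_dataframe in the same way as the modified metadata_1.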
Combining with pandas – merge based on specific column(s)#
This example shows how to merge two EntityCollections that have a shared entity. We use the shared entity as key to merge on.
First, we create two collections:
ec_1 = me.EntityCollection(x=[0, 5, 10] * u.mm, Tc=me.Tc([50, 53, 56], "K"))
ec_1
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
)
ec_2 = me.EntityCollection(x=[0, 10, 20] * u.mm, Ms=me.Ms([4e5, 6e5, 7e5], "A/m"))
ec_2
EntityCollection(
description='',
x=<Quantity [ 0., 10., 20.] mm>,
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
Both collections have the entity-like x, for which a subset of values are identical.
We convert the collections to pandas dataframes and in addition extract the metadata:
data_1 = ec_1.to_dataframe(include_units=False)
metadata_1 = ec_1.metadata()
data_1
| | x | Tc |
|---|---|---|
| 0 | 0.0 | 50.0 |
| 1 | 5.0 | 53.0 |
| 2 | 10.0 | 56.0 |
metadata_1
{'description': '',
'x': {'unit': 'mm'},
'Tc': {'ontology_label': 'CurieTemperature', 'unit': 'K', 'description': ''}}
data_2 = ec_2.to_dataframe(include_units=False)
metadata_2 = ec_2.metadata()
data_2
| | x | Ms |
|---|---|---|
| 0 | 0.0 | 400000.0 |
| 1 | 10.0 | 600000.0 |
| 2 | 20.0 | 700000.0 |
metadata_2
{'description': '',
'x': {'unit': 'mm'},
'Ms': {'ontology_label': 'SpontaneousMagnetization',
'unit': 'A / m',
'description': ''}}
We can now merge the two dataframes. By default pandas will merge on all columns with identical names and only keep data present in both dataframes:
data_combined = pd.merge(data_1, data_2)
data_combined
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0.0 | 50.0 | 400000.0 |
| 1 | 10.0 | 56.0 | 600000.0 |
To convert this dataframe back to an entity collection (e.g. to subsequently write it to a file with ontology information) we need to also merge the metadata. We can e.g. add the missing Ms key to metadata_1:
metadata_1["Ms"] = metadata_2["Ms"]
We can now create a new collection using the new dataframe and the updated metadata:
ec_combined = me.EntityCollection.from_dataframe(data_combined, metadata_1)
ec_combined
EntityCollection(
description='',
x=<Quantity [ 0., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 56.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000.]), unit='A / m'),
)
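Relying on the default key selection can give surprising results if the dataframes later gain another column with a shared name. The sketch below spells out the same merge explicitly, using the `on`, `how` and `validate` parameters of `pd.merge`. The dataframes are rebuilt by hand so that the example runs without mammos-entity.

```python
import pandas as pd

data_1 = pd.DataFrame({"x": [0.0, 5.0, 10.0], "Tc": [50.0, 53.0, 56.0]})
data_2 = pd.DataFrame({"x": [0.0, 10.0, 20.0], "Ms": [4e5, 6e5, 7e5]})

# Merge on x only, keep rows present in both dataframes, and raise an error
# if a value of x occurs more than once on either side.
data_combined = pd.merge(data_1, data_2, on="x", how="inner", validate="one_to_one")
print(data_combined)
```

Keys are compared for exact equality, so floating-point positions that differ by rounding noise would not be matched. It is also worth checking that the key column has the same unit in both metadata dictionaries before merging, because the units are no longer part of the dataframes.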
Pandas’ merge function is very powerful. In the following we show how to keep all rows present in the first dataframe. For more details, refer to the pandas documentation.
We can use how="left" to let pandas keep all rows present in the left dataframe. Missing data in the right dataframe will be filled with NaNs:
data_combined = pd.merge(data_1, data_2, how="left")
data_combined
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0.0 | 50.0 | 400000.0 |
| 1 | 5.0 | 53.0 | NaN |
| 2 | 10.0 | 56.0 | 600000.0 |
me.EntityCollection.from_dataframe(data_combined, metadata_1)
EntityCollection(
description='',
x=<Quantity [ 0., 5., 10.] mm>,
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 53., 56.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., nan, 600000.]), unit='A / m'),
)
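After a left merge it can be useful to know which rows had no partner in the right dataframe. A minimal sketch using the `indicator` parameter of `pd.merge` is shown below; the dataframes are again rebuilt by hand so that the example is self-contained.

```python
import pandas as pd

data_1 = pd.DataFrame({"x": [0.0, 5.0, 10.0], "Tc": [50.0, 53.0, 56.0]})
data_2 = pd.DataFrame({"x": [0.0, 10.0, 20.0], "Ms": [4e5, 6e5, 7e5]})

# indicator=True adds a column "_merge" with the values
# "both", "left_only" or "right_only" for every row.
with_indicator = pd.merge(data_1, data_2, how="left", indicator=True)
unmatched = with_indicator[with_indicator["_merge"] == "left_only"]
print(unmatched["x"].tolist())

# Remove the helper column before converting back to an EntityCollection.
data_combined = with_indicator.drop(columns="_merge")
print(list(data_combined.columns))
```

The helper column has no entry in the metadata, which is why it is dropped before the dataframe is converted back.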
Appending rows to an EntityCollection (values to all individual entities)#
In this example we show how to add an additional value to each entity in a collection. All entities in our collection have the same length and are one-dimensional, so we can understand this as adding a row to our entity table.
First, we create two collections with the same entities:
ec_1 = me.EntityCollection(x=[0, 5, 10], Tc=me.Tc([50, 51, 52], "K"), Ms=me.Ms([4e5, 6e5, 7e5], "A/m"))
ec_1
EntityCollection(
description='',
x=[0, 5, 10],
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 51., 52.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000.]), unit='A / m'),
)
ec_2 = me.EntityCollection(x=[15], Tc=me.Tc([53], "K"), Ms=me.Ms([8e5], "A/m"))
ec_2
EntityCollection(
description='',
x=[15],
Tc=Entity(ontology_label='CurieTemperature', value=array([53.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([800000.]), unit='A / m'),
)
We convert both collections to dataframes and extract the metadata (which is identical for both collections so we only need it once):
data_1 = ec_1.to_dataframe()
data_2 = ec_2.to_dataframe()
metadata = ec_1.metadata()
data_1
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0 | 50.0 | 400000.0 |
| 1 | 5 | 51.0 | 600000.0 |
| 2 | 10 | 52.0 | 700000.0 |
data_2
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 15 | 53.0 | 800000.0 |
We can now combine the tables using:
combined = pd.concat((data_1, data_2), ignore_index=True)
combined
| | x | Tc | Ms |
|---|---|---|---|
| 0 | 0 | 50.0 | 400000.0 |
| 1 | 5 | 51.0 | 600000.0 |
| 2 | 10 | 52.0 | 700000.0 |
| 3 | 15 | 53.0 | 800000.0 |
The columns have not changed, so the metadata can stay unchanged and we can convert the result back into an EntityCollection:
me.EntityCollection.from_dataframe(combined, metadata)
EntityCollection(
description='',
x=array([ 0, 5, 10, 15]),
Tc=Entity(ontology_label='CurieTemperature', value=array([50., 51., 52., 53.]), unit='K'),
Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([400000., 600000., 700000., 800000.]), unit='A / m'),
)
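`pd.concat` does not complain when the column names of the two dataframes differ; it keeps the union of the columns and fills the gaps with NaN. The sketch below adds a check before concatenating. `append_rows` is a hypothetical helper introduced for this illustration, and the dataframes are built by hand so that the example runs without mammos-entity.

```python
import pandas as pd


def append_rows(first, second):
    """Append the rows of second to first, requiring identical column names (hypothetical helper)."""
    if set(first.columns) != set(second.columns):
        raise ValueError(f"Column mismatch: {list(first.columns)} vs {list(second.columns)}")
    # Reorder the columns of second to match first, then renumber the index.
    return pd.concat((first, second[list(first.columns)]), ignore_index=True)


data_1 = pd.DataFrame({"x": [0, 5, 10], "Tc": [50.0, 51.0, 52.0], "Ms": [4e5, 6e5, 7e5]})
data_2 = pd.DataFrame({"x": [15], "Tc": [53.0], "Ms": [8e5]})

combined = append_rows(data_1, data_2)
print(combined)
```

The check only compares column names. Whether both collections use the same units and ontology labels still has to be verified from their metadata.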