Writing and reading CSV#

mammos_entity can write and read an EntityCollection in CSV format, with a custom metadata block at the top of the file. The file format is described in detail in the mammos_entity.EntityCollection.to_csv() API reference. Because CSV only supports tabular data, an EntityCollection can only be saved to CSV if its data is tabular, i.e. all contained entity-likes are one-dimensional and have the same length.
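The tabular requirement can be illustrated with a small check on plain Python lists. This is only a sketch of the rule stated above, not the check that mammos_entity performs internally; the helper name is_tabular is hypothetical.

```python
from collections.abc import Sequence


def is_tabular(columns):
    """Return True if every column is a flat sequence and all lengths match.

    Hypothetical helper illustrating the rule; it is not part of mammos_entity.
    """
    lengths = set()
    for values in columns.values():
        # A bare string or a scalar is not a column of values.
        if isinstance(values, (str, bytes)) or not isinstance(values, Sequence):
            return False
        # Nested sequences would make the column more than one-dimensional.
        if any(isinstance(v, Sequence) and not isinstance(v, (str, bytes)) for v in values):
            return False
        lengths.add(len(values))
    # Zero or one distinct length means all columns have the same length.
    return len(lengths) <= 1


ok = is_tabular({"Ms": [600, 650, 700], "comments": ["a", "b", "c"]})
bad_length = is_tabular({"Ms": [600, 650, 700], "T": [1, 2]})
bad_shape = is_tabular({"Ms": [[600, 650], [700, 750]]})
```

Here ok is True, while bad_length and bad_shape are both False: the first fails because the columns differ in length, the second because the column is two-dimensional.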

from pathlib import Path

import mammos_entity as me
import mammos_units as u

Writing#

We create some artificial data that we can write to a file:

collection = me.EntityCollection(
    "Example description.\nThe description can have multiple lines.\n\nLines can also be empty.",
    Ms=me.Ms([600, 650, 700], "kA/m", description="Evaluated using UppASD with 70000 Monte Carlo steps."),
    T=me.T([1, 2, 3], "K"),
    theta_angle=[0, 0.5, 0.7] * u.rad,
    demag_factor=me.Entity("DemagnetizingFactor", [1 / 3, 1 / 3, 1 / 3]),
    comments=["Some comment", "Some other comment", "A third comment"],
)
collection
EntityCollection(
    description='Example description.\nThe description can have multiple lines.\n\nLines can also be empty.',
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([600., 650., 700.]), unit='kA / m', description='Evaluated using UppASD with 70000 Monte Carlo steps.'),
    T=Entity(ontology_label='ThermodynamicTemperature', value=array([1., 2., 3.]), unit='K'),
    theta_angle=<Quantity [0. , 0.5, 0.7] rad>,
    demag_factor=Entity(ontology_label='DemagnetizingFactor', value=array([0.33333333, 0.33333333, 0.33333333])),
    comments=['Some comment', 'Some other comment', 'A third comment'],
)

We can write the data to a CSV file as shown in the following cell:

collection.to_csv("example.csv")

This produces the following file, which contains the EntityCollection description along with the Entity and unit information for each column of the CSV.

print(Path("example.csv").read_text())
# mammos csv v3
#----------------------------------------
# Example description.
# The description can have multiple lines.
# 
# Lines can also be empty.
#----------------------------------------
SpontaneousMagnetization,ThermodynamicTemperature,,DemagnetizingFactor,
Evaluated using UppASD with 70000 Monte Carlo steps.,,,,
https://w3id.org/emmo/domain/magnetic-materials#EMMO_032731f8-874d-5efb-9c9d-6dafaa17ef25,https://w3id.org/emmo#EMMO_affe07e4_e9bc_4852_86c6_69e26182a17f,,https://w3id.org/emmo/domain/magnetic-materials#EMMO_0f2b5cc9-d00a-5030-8448-99ba6b7dfd1e,
kA / m,K,rad,,
Ms,T,theta_angle,demag_factor,comments
600.0,1.0,0.0,0.3333333333333333,Some comment
650.0,2.0,0.5,0.3333333333333333,Some other comment
700.0,3.0,0.7,0.3333333333333333,A third comment
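To make the layout of the file above concrete, the following sketch splits such a file into its parts using only the standard library. It assumes the layout seen in the example output (a leading comment block with the description between two dashed lines, then four metadata rows in the order ontology label, description, IRI, unit, then the column names). The function name parse_mammos_csv, the shortened sample text, and the example.org IRIs are placeholders for illustration; the authoritative format description is the to_csv() API reference, and me.from_csv remains the intended way to read these files.

```python
import csv
import io

# Shortened sample in the layout shown above; the IRIs are placeholders.
SAMPLE = """\
# mammos csv v3
#----------------------------------------
# Example description.
#
# Second paragraph.
#----------------------------------------
SpontaneousMagnetization,ThermodynamicTemperature,
Evaluated using UppASD.,,
https://example.org/Ms,https://example.org/T,
kA / m,K,
Ms,T,comments
600.0,1.0,Some comment
650.0,2.0,Some other comment
"""


def parse_mammos_csv(text):
    """Split the text into description, per-column metadata, and data rows."""
    lines = text.splitlines()
    # Leading lines starting with "#" form the comment block.
    n_comment = 0
    while n_comment < len(lines) and lines[n_comment].startswith("#"):
        n_comment += 1
    comment = lines[:n_comment]
    # The description sits between the two dashed separator lines.
    separators = [i for i, line in enumerate(comment) if line.startswith("#---")]
    description = "\n".join(
        line[2:] for line in comment[separators[0] + 1 : separators[1]]
    )
    # Everything after the comment block is ordinary CSV.
    rows = list(csv.reader(io.StringIO("\n".join(lines[n_comment:]))))
    labels, descriptions, iris, units, names = rows[:5]
    columns = {
        name: {"ontology_label": label, "description": desc, "iri": iri, "unit": unit}
        for name, label, desc, iri, unit in zip(names, labels, descriptions, iris, units)
    }
    return description, columns, rows[5:]


description, columns, data = parse_mammos_csv(SAMPLE)
```

Columns that are not entities, such as comments, simply have empty strings in the metadata rows. All data values are returned as strings here; converting them to numbers is left out to keep the sketch short.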

Reading#

We can read it back in to recreate the original EntityCollection:

content = me.from_csv("example.csv")
content
EntityCollection(
    description='Example description.\nThe description can have multiple lines.\n\nLines can also be empty.',
    Ms=Entity(ontology_label='SpontaneousMagnetization', value=array([600., 650., 700.]), unit='kA / m', description='Evaluated using UppASD with 70000 Monte Carlo steps.'),
    T=Entity(ontology_label='ThermodynamicTemperature', value=array([1., 2., 3.]), unit='K'),
    theta_angle=<Quantity [0. , 0.5, 0.7] rad>,
    demag_factor=Entity(ontology_label='DemagnetizingFactor', value=array([0.33333333, 0.33333333, 0.33333333])),
    comments=array(['Some comment', 'Some other comment', 'A third comment'],
          dtype=object),
)

As shown in the EntityCollection notebook, we can access the individual elements of the collection:

content.Ms
SpontaneousMagnetization(value=[600. 650. 700.], unit=kA / m, description='Evaluated using UppASD with 70000 Monte Carlo steps.')
content.theta_angle
\[[0,~0.5,~0.7] \; \mathrm{rad}\]

We can also get a pandas dataframe of the data we have read. Here, we choose to include units in the column names:

content.to_dataframe(include_units=True)
   Ms (kA / m)  T (K)  theta_angle (rad)  demag_factor            comments
0        600.0    1.0                0.0      0.333333        Some comment
1        650.0    2.0                0.5      0.333333  Some other comment
2        700.0    3.0                0.7      0.333333     A third comment

Reading with pandas#

We recommend reading these files with mammos_entity. The data can also be read directly with other packages such as pandas, but then only the values can be loaded conveniently, not the ontology or unit information. The header argument of pandas.read_csv gives the zero-based line number of the row that contains the column names; pandas uses that row as column headers and reads everything after it as data. Note that the number of lines preceding this row depends on the length of the collection description, so the value has to be adjusted for each file. Reading with mammos_entity and then converting to a dataframe is therefore preferable.

import pandas as pd

pd.read_csv("example.csv", header=11)
      Ms    T  theta_angle  demag_factor            comments
0  600.0  1.0          0.0      0.333333        Some comment
1  650.0  2.0          0.5      0.333333  Some other comment
2  700.0  3.0          0.7      0.333333     A third comment
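Since the position of the column-name row depends on the description length, it can be computed rather than hard-coded. The sketch below counts the leading comment lines and adds the four metadata rows seen in the example file. The function name find_header_row and the default of four metadata rows are assumptions based on that example, not part of mammos_entity or pandas, and the sample IRIs are placeholders. It also assumes that the file contains no completely blank lines before the header, because pandas skips blank lines by default.

```python
# Sample with the same comment block as example.csv; the IRIs are placeholders.
SAMPLE = """\
# mammos csv v3
#----------------------------------------
# Example description.
# The description can have multiple lines.
#
# Lines can also be empty.
#----------------------------------------
SpontaneousMagnetization,ThermodynamicTemperature
Evaluated using UppASD with 70000 Monte Carlo steps.,
https://example.org/Ms,https://example.org/T
kA / m,K
Ms,T
600.0,1.0
"""


def find_header_row(text, n_metadata_rows=4):
    """Return the zero-based line number of the row with the column names.

    Assumes a leading block of "#" lines followed by n_metadata_rows rows
    (ontology label, description, IRI, unit) before the column names.
    """
    n_comment = 0
    for line in text.splitlines():
        if not line.startswith("#"):
            break
        n_comment += 1
    return n_comment + n_metadata_rows


header_row = find_header_row(SAMPLE)
header_line = SAMPLE.splitlines()[header_row]
```

For this sample header_row is 11, matching the value used above. With the real file, the result could be passed on as pd.read_csv("example.csv", header=find_header_row(Path("example.csv").read_text())).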